Jämförande undersökning av Gemini Ultra och GPT-4 med avseende på integrering inom en matematiskt pedagogisk verksamhet
KTH, School of Electrical Engineering and Computer Science (EECS).
2024 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Alternative title: Comparative Evaluation of Gemini Ultra and GPT-4 for Mathematical Pedagogical Integration (English)
Abstract [en]

This study investigates the potential of two state-of-the-art AI-based language models, OpenAI’s GPT-4 and Google’s Gemini Ultra, to improve math performance among Swedish students. In light of the latest PISA 2022 results, which show a decline in mathematical performance, the need for innovative and effective educational tools is more evident than ever. The study focuses on the implementation of these language models within Mattecoach.se, a digital platform offering math assistance, and evaluates their ability to deliver pedagogically relevant and mathematically accurate answers. By integrating AI into the education sector, the study explores opportunities to relieve teachers and create a more adaptive and responsive learning environment. To assess the mathematical competence of the language models, responses were generated for 136 different math questions from national exams at the lower- and upper-secondary school levels. With the assistance of employees at Mattecoach.se, these responses were evaluated to determine both the mathematical accuracy and the pedagogical adequacy of the language models. The results indicate that GPT-4 performed better in terms of mathematical accuracy, with 79% correct answers, while Gemini Ultra achieved only 57% correct answers. The inability to consistently produce correct answers is reflected in the operational feedback: employees see less value in using AI if the answers may not be reliable.
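
The thesis does not publish its evaluation scripts, but a minimal sketch of the kind of tally the abstract describes might look as follows, assuming each model's response to every exam question has been graded by a human evaluator for mathematical correctness and pedagogical adequacy. All names and data below are illustrative assumptions, not material from the thesis.

```python
from collections import defaultdict

# One record per (model, question): was the answer mathematically correct,
# and was it pedagogically adequate? Data here is illustrative only.
graded = [
    ("GPT-4", 1, True, True),
    ("GPT-4", 2, False, True),
    ("Gemini Ultra", 1, True, False),
    ("Gemini Ultra", 2, False, True),
    # ... one entry per model for each of the 136 exam questions
]

totals = defaultdict(int)
correct = defaultdict(int)
adequate = defaultdict(int)

for model, _question_id, is_correct, is_adequate in graded:
    totals[model] += 1
    correct[model] += is_correct      # bool counts as 0/1
    adequate[model] += is_adequate

for model, n in totals.items():
    print(f"{model}: {100 * correct[model] / n:.0f}% mathematically correct, "
          f"{100 * adequate[model] / n:.0f}% pedagogically adequate (n={n})")
```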

Place, publisher, year, edition, pages
2024, p. 10
Series
TRITA-EECS-EX ; 2024:416
Keywords [en]
Artificial Intelligence (AI), Educational Technology, Pedagogy, GPT-4, Gemini Ultra, AI in Education, Digital Learning Tools, Interactive Learning Environments, Personalized Learning, Online Learning Platforms, Human-Computer Interaction in Education, Computational Pedagogy, AI-Enhanced Learning, Mathematics Education, Semi-Automated Responses
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-351223
OAI: oai:DiVA.org:kth-351223
DiVA id: diva2:1886725
Available from: 2024-09-19. Created: 2024-08-03. Last updated: 2024-09-19. Bibliographically approved.

Open Access in DiVA
fulltext (PDF, 337 kB)
