Evaluating ChatGPT’s Ability to Compose Music Using the MIDI File Format
KTH, School of Electrical Engineering and Computer Science (EECS).
2023 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

This thesis examines the ability of the artificial intelligence (AI) model ChatGPT 3.5-turbo to compose valuable music in a digital format (MIDI) from natural language prompts. Using a multifaceted quantitative approach, the study combines objective musical metrics with subjective user evaluations. The proof-of-concept system developed for this research used ChatGPT to generate MIDI files, which were compared against human-composed music from the Lakh MIDI dataset. Objective measures, including pitch-class distributions, inter-onset interval (IOI), pitch range, average pitch interval, pitch count, note length, and transition matrices, enabled a comprehensive comparison. The findings show that while the AI model’s output demonstrated stylistic consistency and a certain level of musical texture, it was less complex and less varied than human compositions. Subjective evaluations, gathered through a feedback survey, showed moderate to low satisfaction with the AI-generated music. Users with more musical experience were less satisfied with the compositions, which suggests a correlation between musical experience and perception of the AI-generated music. Despite its limitations, ChatGPT is capable of generating valuable music from natural language prompts. However, improvements are needed to better mimic the complexity and variance of human compositions before it can be applied in music production.
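The objective measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the note representation `(onset, duration, pitch)`, the function name `melody_metrics`, and the exact metric definitions (for example, treating pitch count as the number of distinct pitches and building the transition matrix over pitch classes) are assumptions chosen for illustration, applied to a single monophonic note sequence.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical note representation (an assumption, not from the thesis):
# (onset in seconds, duration in seconds, MIDI pitch number 0-127).
Note = Tuple[float, float, int]


def _mean(values: List[float]) -> float:
    """Arithmetic mean, returning 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def melody_metrics(notes: List[Note]) -> Dict[str, object]:
    """Compute simple objective metrics for a monophonic note sequence."""
    if not notes:
        raise ValueError("need at least one note")

    # Order notes by onset so that consecutive pairs are meaningful.
    ordered = sorted(notes, key=lambda n: n[0])
    onsets = [n[0] for n in ordered]
    durations = [n[1] for n in ordered]
    pitches = [n[2] for n in ordered]

    # Pitch-class distribution: share of notes on each of the 12 pitch classes.
    pc_counts = Counter(p % 12 for p in pitches)
    pc_distribution = [pc_counts.get(pc, 0) / len(pitches) for pc in range(12)]

    # 12x12 matrix counting transitions between consecutive pitch classes.
    transitions = [[0] * 12 for _ in range(12)]
    for a, b in zip(pitches, pitches[1:]):
        transitions[a % 12][b % 12] += 1

    # Absolute semitone steps and onset-to-onset gaps between consecutive notes.
    intervals = [float(abs(b - a)) for a, b in zip(pitches, pitches[1:])]
    iois = [b - a for a, b in zip(onsets, onsets[1:])]

    return {
        "pitch_class_distribution": pc_distribution,
        "pitch_class_transitions": transitions,
        "pitch_range": max(pitches) - min(pitches),
        "pitch_count": len(set(pitches)),
        "average_pitch_interval": _mean(intervals),
        "average_ioi": _mean(iois),
        "average_note_length": _mean(durations),
    }


if __name__ == "__main__":
    # A made-up four-note phrase (C4, D4, E4, C4), used only as example input.
    example = [(0.0, 0.5, 60), (0.5, 0.5, 62), (1.0, 1.0, 64), (2.0, 0.5, 60)]
    for name, value in melody_metrics(example).items():
        print(name, value)
```

Comparing generated and human-composed music would then mean computing these values for each file in both sets and comparing the resulting distributions. How the thesis aggregates them is not stated in this record.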

Abstract [sv]

Denna avhandling undersöker förmågan hos artificiell intelligensmodell (AI), ChatGPT 3.5-turbo, att komponera värdefull musik i ett digitalt format (MIDI) från naturligt språk. Studien implementerar en mångfacetterad kvantitativ metodansats, där objektiva musikaliska mått kombineras med subjektiva användarutvärderingar. Det konceptbevis-system som utvecklades för denna forskning genererade MIDI-filer med hjälp av ChatGPT, vilka sedan analyserades och jämfördes mot människokomponerad musik från Lakh MIDI-datasetet. Objektiva mått, inklusive pitch-klassdistributioner, Inter-Onset-Interval (IOI), pitchomfång, genomsnittliga pitchintervaller, pitchräkningar, notlängd och övergångsmatriser, möjliggjorde en omfattande jämförelse. Resultaten visade att medan AI-modellens kompositioner visade stilistisk konsekvens och en viss nivå av musikalisk textur, uppvisade de mindre komplexitet och variation jämfört med människans kompositioner. Subjektiva utvärderingar, härledda från en återkopplingsundersökning, avslöjade måttlig till låg tillfredsställelse med AI-genererad musik. Resultaten antydde att användare med högre musikalisk erfarenhet var mindre nöjda med kompositionerna, vilket indikerar ett samband mellan musikalisk erfarenhet och uppfattningen om AI-genererad musik. Trots sina begränsningar visar ChatGPT förmågan att generera värdefull musik från naturliga språkpåminnelser. Men förbättringar behövs för att bättre efterlikna komplexiteten och variansen som finns i människokompositioner för att göra den tillämplig inom musikproduktion.

Place, publisher, year, edition, pages
2023. p. 45
Series
TRITA-EECS-EX ; 2023:303
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-330851
OAI: oai:DiVA.org:kth-330851
DiVA, id: diva2:1779208
Available from: 2023-08-01. Created: 2023-07-03. Last updated: 2023-08-01. Bibliographically approved.

Open Access in DiVA

fulltext (933 kB), 1079 downloads
File information
File name: FULLTEXT01.pdf
File size: 933 kB
Checksum SHA-512:
e955405a1e04d1f53258863b980770e99f0015d2f8635c3606140648052d6583048b2bffd37703eb43dde264a5d094bb661f2bd9a7842f269c91d931b98f5956
Type: fulltext
Mimetype: application/pdf

Total: 1080 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1011 hits