The Chordinator: Modeling Music Harmony by Implementing Transformer Networks and Token Strategies
Univ Lille, CNRS, UMR 9189, Centrale Lille, CRIStAL, F-59000 Lille, France.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Speech, Music and Hearing, TMH. ORCID iD: 0000-0003-2549-6367
2024 (English). In: Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2024 / [ed] Johnson, C., Rebelo, S. M., Santos, I., Springer Nature, 2024, Vol. 14633, p. 52-66. Conference paper, Published paper (Refereed)
Abstract [en]

This paper compares two tokenization strategies for modeling chord progressions with a transformer encoder architecture trained on a large dataset of chord progressions in a variety of styles. The first strategy treats every distinct chord as a unique element, which results in a vocabulary of 5202 independent tokens. The second strategy expresses each chord as a dynamic tuple describing root, nature (e.g., major, minor, diminished), and extensions (e.g., additions or alterations), producing a vocabulary of 59 chord-related tokens plus 75 tokens for style, bars, form, and format. In the second approach, MIDI embeddings are added to the positional embedding layer of the transformer as an array of eight values representing the notes that form each chord. We also propose a trigram analysis of the dataset to compare the generated chord progressions with the training data, revealing common progressions and the extent to which a generated sequence duplicates training material. We evaluate the progressions generated by the two models using HITS@k metrics and a human evaluation in which 10 participants rated the plausibility of the progressions as potential music compositions. The second model reported lower validation loss, better metrics, and more musical consistency in the generated progressions.
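As a rough illustration of the second tokenization strategy, a chord symbol can be split into root, nature, and extension tokens, keeping the vocabulary far smaller than assigning every distinct chord symbol its own token. The decomposition below is our own sketch; the paper defines its own token inventory and chord grammar, and all names here are hypothetical.

```python
import re

# Hypothetical sketch of the second tokenization strategy: decompose a
# chord symbol into root / nature / extension tokens instead of giving
# every distinct chord symbol its own vocabulary entry.
NATURES = {"": "maj", "m": "min", "dim": "dim", "aug": "aug"}

def tokenize_chord(symbol: str) -> list[str]:
    """Split e.g. 'C#m7b5' into ['root:C#', 'nature:min', 'ext:7', 'ext:b5']."""
    m = re.match(r"^([A-G][#b]?)(dim|aug|m(?!aj))?(.*)$", symbol)
    if m is None:
        raise ValueError(f"unparseable chord symbol: {symbol!r}")
    root, nature, rest = m.groups()
    tokens = [f"root:{root}", f"nature:{NATURES[nature or '']}"]
    # Very loose extension grammar for illustration; the paper defines its own.
    tokens += [f"ext:{e}" for e in re.findall(r"maj7|sus\d|add\d+|[b#]?\d+", rest)]
    return tokens

print(tokenize_chord("C#m7b5"))  # ['root:C#', 'nature:min', 'ext:7', 'ext:b5']
print(tokenize_chord("Fmaj7"))   # ['root:F', 'nature:maj', 'ext:maj7']
```

Under a scheme like this, a small shared vocabulary covers chord symbols that the first strategy would each treat as a separate token.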
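The injection of the eight-value MIDI arrays into the positional embedding layer could plausibly look like the following minimal sketch, assuming the MIDI note numbers are linearly projected to the model dimension and summed with the token and positional embeddings. The class name, the projection, and the zero-padding convention for absent chord tones are our assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class ChordInputEmbedding(nn.Module):
    """Token + positional embeddings, augmented with a projected 8-value
    MIDI vector per step (our guess at how the paper injects chord notes
    into the positional embedding layer)."""

    def __init__(self, vocab_size: int, d_model: int, max_len: int = 512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        # Project the 8 MIDI note numbers (zero-padded when a chord has
        # fewer tones) into the model dimension so they can be summed in.
        self.midi_proj = nn.Linear(8, d_model)

    def forward(self, tokens: torch.Tensor, midi: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq); midi: (batch, seq, 8) as float MIDI numbers
        positions = torch.arange(tokens.size(1), device=tokens.device)
        return self.tok(tokens) + self.pos(positions) + self.midi_proj(midi)

# 134 = 59 chord tokens + 75 style/bars/form/format tokens from the abstract.
emb = ChordInputEmbedding(vocab_size=134, d_model=256)
tokens = torch.randint(0, 134, (2, 16))
midi = torch.zeros(2, 16, 8)     # e.g. C major triad: [60, 64, 67, 0, ...]
print(emb(tokens, midi).shape)   # torch.Size([2, 16, 256])
```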

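The trigram analysis mentioned in the abstract can be approximated as follows. This is our sketch, not the authors' code; measuring the fraction of generated trigrams that also occur in the training set is one plausible reading of "the extent to which a sequence is duplicated".

```python
from collections import Counter

def trigrams(progression: list[str]) -> list[tuple[str, str, str]]:
    """All overlapping chord trigrams in a progression."""
    return list(zip(progression, progression[1:], progression[2:]))

def trigram_overlap(generated: list[list[str]],
                    training: list[list[str]]) -> float:
    """Fraction of generated trigrams that also occur in the training set.

    A value near 1.0 suggests the model largely reproduces progressions
    seen in training; lower values suggest more novelty.
    """
    train_counts = Counter(t for prog in training for t in trigrams(prog))
    gen = [t for prog in generated for t in trigrams(prog)]
    if not gen:
        return 0.0
    return sum(t in train_counts for t in gen) / len(gen)

# Toy example with chord symbols as opaque tokens.
train = [["C", "Am", "F", "G", "C"]]
gen = [["C", "Am", "F", "A7"]]
print(trigram_overlap(gen, train))  # 0.5: ('C','Am','F') seen, ('Am','F','A7') not
```
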
Place, publisher, year, edition, pages
Springer Nature, 2024. Vol. 14633, p. 52-66.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 14633
Keywords [en]
Chord progressions, Transformer Neural Networks, Music Generation
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-347155
DOI: 10.1007/978-3-031-56992-0_4
ISI: 001212363900004
Scopus ID: 2-s2.0-85190658013
OAI: oai:DiVA.org:kth-347155
DiVA, id: diva2:1864821
Conference
13th International Conference on Artificial Intelligence in Music, Sound, Art and Design (EvoMUSART), held as part of the EvoStar Conference, April 3-5, 2024, Aberystwyth, Wales
Note

QC 20240604

Part of ISBN 978-3-031-56991-3; 978-3-031-56992-0

Available from: 2024-06-04. Created: 2024-06-04. Last updated: 2025-02-07. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Dalmazzo, David; Sturm, Bob
