Single-pass Hierarchical Text Classification with Large Language Models
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-4310-0867
University of Oslo, Norway.
Braive AS, Norway.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-2748-8929
2024 (English). In: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 5412-5421. Conference paper, Published paper (Refereed)
Abstract [en]

Many text classification tasks have an inherent hierarchical structure among their classes, which traditional classification paradigms often overlook. This study introduces novel approaches for hierarchical text classification using Large Language Models (LLMs), exploiting taxonomies to improve accuracy and traceability in a zero-shot setting. We propose two hierarchical classification methods, namely (i) single-path and (ii) path-traversal, both of which leverage the hierarchical structure inherent in the target classes (e.g., a bird is a type of animal that belongs to a species) and improve on naïve hierarchical text classification from the literature. We implement them as prompts for generative models such as OpenAI GPTs and benchmark them against discriminative language models (BERT and RoBERTa). We measure classification performance (precision, recall, and F1-score) against computational efficiency (time and cost). In evaluations on two diverse datasets, namely ComFaSyn, containing mental health patients' diary entries, and DBpedia, containing structured information extracted from Wikipedia, we observed that our methods, without any fine-tuning or few-shot examples, achieve results comparable to flat classification and existing methods from the literature, with minimal increases in the prompts and processing time.
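
The abstract does not reproduce the prompts themselves. As a rough illustration only of how a taxonomy might be exposed to a generative model in one zero-shot prompt, the Python sketch below enumerates every root-to-leaf path of a small tree and lists them as answer options. The toy taxonomy, the prompt wording, and the helper names (`enumerate_paths`, `build_prompt`, `parse_path`) are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' single-path or path-traversal method.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (not from the paper): parent -> children.
TAXONOMY: Dict[str, List[str]] = {
    "Animal": ["Bird", "Mammal"],
    "Bird": ["Sparrow", "Eagle"],
    "Mammal": ["Dog"],
}


def enumerate_paths(taxonomy: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Return every root-to-leaf path as a list of labels."""
    children = taxonomy.get(root, [])
    if not children:
        return [[root]]  # a node without children is a leaf
    paths: List[List[str]] = []
    for child in children:
        for tail in enumerate_paths(taxonomy, child):
            paths.append([root] + tail)
    return paths


def build_prompt(text: str, paths: List[List[str]]) -> str:
    """Build one zero-shot prompt that lists all full paths as options."""
    options = "\n".join("- " + " > ".join(p) for p in paths)
    return (
        "Classify the text into exactly one of the following category paths.\n"
        f"{options}\n\n"
        f"Text: {text}\n"
        "Answer with the full path only."
    )


def parse_path(answer: str, paths: List[List[str]]) -> Optional[List[str]]:
    """Map a model reply back to a known path, or None if it matches none."""
    candidate = [part.strip() for part in answer.split(">")]
    return candidate if candidate in paths else None


paths = enumerate_paths(TAXONOMY, "Animal")
prompt = build_prompt("A small brown bird at the feeder.", paths)
```

Sending `prompt` to a model is left out on purpose, since the client and model are deployment choices. Checking the reply against the enumerated paths with `parse_path` keeps every prediction traceable to a valid branch of the taxonomy.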

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. p. 5412-5421
Keywords [en]
Hierarchical text classification, Large Language Models (LLMs), zero-shot classification
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-360563
DOI: 10.1109/BigData62323.2024.10825412
Scopus ID: 2-s2.0-85218008858
OAI: oai:DiVA.org:kth-360563
DiVA, id: diva2:1940629
Conference
2024 IEEE International Conference on Big Data, BigData 2024, Washington, United States of America, Dec 15 2024 - Dec 18 2024
Note

Part of ISBN 9798350362480

QC 20250226

Available from: 2025-02-26 Created: 2025-02-26 Last updated: 2025-02-26 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Schmidt, Fabian; Payberah, Amir H.; Vlassov, Vladimir
