High-accuracy prediction of mental health scores from English BERT embeddings trained on LLM-generated synthetic self-reports: a synthetic-only method development study
2026 (English). In: Frontiers in Digital Health, E-ISSN 2673-253X, Vol. 7, article id 1694464. Article in journal (Refereed). Published.
Abstract [en]
Objective: To assess whether synthetic-only first-person clinical self-reports generated by a large language model (LLM) can support accurate prediction of standardized mental-health scores, enabling a privacy-preserving path for method development and rapid prototyping when real clinical text is unavailable.

Methods: We prompted an LLM (Gemini 2.5; July 2025 snapshot) to produce English-language first-person narratives paired with target scores for three instruments: PHQ-9 (including suicidal ideation), LSAS, and PCL-5. No real patients or clinical notes were used. Narratives and labels were created synthetically and manually screened for coherence and label alignment. Each narrative was embedded using bert-base-uncased (mean-pooled 768-d vectors). We trained linear and regularized linear models (Linear, Ridge, Lasso) and ensemble models (Random Forest, Gradient Boosting) for regression, and Logistic Regression/Random Forest for suicidal-ideation classification. Evaluation used 5-fold cross-validation (PHQ-9/SI) and 80/20 held-out splits (LSAS/PCL-5). Metrics: MSE, R², and MAE; classification metrics are reported for SI.

Results: Within the synthetic distribution, models fit the label–text signal strongly (e.g., PHQ-9 Ridge: MSE (Formula presented.), R² (Formula presented.); LSAS Gradient Boosting test: MSE (Formula presented.), R² (Formula presented.); PCL-5 Ridge test: MSE (Formula presented.), R² (Formula presented.)).

Conclusions: LLM-generated self-reports encode a score-aligned signal that standard ML models can learn, indicating utility for privacy-preserving, synthetic-only prototyping. This is not a clinical tool: the results do not imply generalization to real patient text. We clarify terminology (synthetic text vs. real text) and provide a roadmap for external validation, bias/fidelity assessment, and scope-limited deployment considerations before any clinical use.
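The pipeline described in the Methods (mean-pooled BERT token embeddings fed to a ridge regressor, scored with 5-fold cross-validation) can be sketched as below. This is a minimal illustration, not the authors' code: the 768-d vectors stand in for bert-base-uncased last-hidden-state outputs (in practice obtained via a library such as Hugging Face Transformers), the ridge fit uses the standard closed form, and all data here is synthetic placeholder data.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool per-token vectors (seq_len, 768) into one 768-d sentence vector."""
    return token_embeddings.mean(axis=0)

def ridge_fit(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: w = (X^T X + alpha*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def kfold_mse(X: np.ndarray, y: np.ndarray, k: int = 5, alpha: float = 1.0) -> float:
    """5-fold cross-validated mean squared error, as used for PHQ-9 in the paper."""
    idx = np.arange(len(y))
    np.random.default_rng(0).shuffle(idx)
    folds = np.array_split(idx, k)
    mses = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], alpha)
        pred = X[test] @ w
        mses.append(float(np.mean((y[test] - pred) ** 2)))
    return float(np.mean(mses))

if __name__ == "__main__":
    # Placeholder "embeddings": 200 narratives, 768-d, with a linear score signal.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 768))
    w_true = rng.normal(size=768) * 0.05
    y = X @ w_true + rng.normal(scale=0.1, size=200)
    print(f"5-fold CV MSE: {kfold_mse(X, y):.4f}")
```

In practice one would swap the placeholder matrix for real mean-pooled BERT vectors and compare Ridge against the other regressors named in the abstract; the cross-validation loop stays the same.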
Place, publisher, year, edition, pages
Frontiers Media SA, 2026. Vol. 7, article id 1694464.
Keywords [en]
BERT, digital mental health, large language models, LSAS, natural language processing, PCL-5, PHQ-9, privacy-preserving evaluation
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:kth:diva-376520
DOI: 10.3389/fdgth.2025.1694464
ISI: 001667048700001
PubMedID: 41586203
Scopus ID: 2-s2.0-105028571312
OAI: oai:DiVA.org:kth-376520
DiVA, id: diva2:2036661
Note
QC 20260209
Available from: 2026-02-09. Created: 2026-02-09. Last updated: 2026-02-09. Bibliographically approved.