Detection of schizophrenia spectrum disorder and major depression disorder using automated speech analysis
| Main Authors: | Inka C. Hiß, Jarek Krajewski, Ulrich Canzler, Steffen Leonhardt, Christoph Weiss, Benjamin Clemens, Ute Habel |
|---|---|
| Format: | Article (Journal) |
| Language: | English |
| Published: | Oct.-Dec. 2025 |
| In: | IEEE transactions on affective computing, Year: 2025, Volume: 16, Issue: 4, Pages: 2988-2999 |
| ISSN: | 1949-3045 |
| DOI: | 10.1109/TAFFC.2025.3564531 |
| Online Access: | Publisher, license required, full text: https://doi.org/10.1109/TAFFC.2025.3564531 Publisher, license required, full text: https://ieeexplore.ieee.org/document/11002678 |
| Author Notes: | Inka C. Hiß, Jarek Krajewski, Ulrich Canzler, Steffen Leonhardt, Christoph Weiss, Benjamin Clemens, and Ute Habel |
| Summary: | Objective biomarkers for differential diagnosis in psychiatry are still scarce. Voice atypicalities characterize two prominent, often co-occurring psychiatric disorders: schizophrenia-spectrum disorders (SSD) and major depressive disorders (MDD). Given that voice recordings can be easily obtained, advanced speech analysis might facilitate the development of diagnostic biomarkers for SSD and MDD. Speech was recorded from a transdiagnostic sample comprising 47 SSD patients, 62 MDD patients, and 41 healthy controls (HC), during three different tasks: a semi-structured interview, a reading task and an empathy task. We evaluated the discriminative power of standardized speech parameters and compared the performance of the three tasks. The extended Geneva Acoustic Minimalistic Parameter Set (eGeMAPS) was extracted using openSMILE and fed into random forest (RF) algorithms with 10-fold cross-validation. Model performances were evaluated using accuracy, F1-score, precision, and recall. Importance of specific predictors was assessed using Gini importance. In this three-class problem, a simple 1-minute video task reached best results with 57% accuracy. The acoustic parameters revealed distinct vocal profiles associated with each disorder. Considering the chance probability of 33%, our results show that automated speech analysis could predict diagnostic classes with good to high accuracy. |
| Item Description: | Published online: 12 May 2025. Viewed on 8 January 2026 |
| Physical Description: | Online Resource |
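
The summary compares the reported 57% accuracy with a chance probability of 33%. As a small worked example of that arithmetic, the sketch below derives the uniform chance level from the three diagnostic classes and, as an additional reference point that the summary does not report, the accuracy of always predicting the largest group. The group sizes and the 57% figure come from the summary; the variable names are illustrative, and the majority-class baseline assumes the classifier was evaluated on the unbalanced sample, which the record does not state.

```python
from fractions import Fraction

# Group sizes as reported in the summary.
group_sizes = {"SSD": 47, "MDD": 62, "HC": 41}
total = sum(group_sizes.values())

# Uniform guessing over three classes (the 33% figure in the summary).
uniform_chance = Fraction(1, len(group_sizes))

# Always predicting the largest group. This baseline is an assumption
# added for comparison; it is not taken from the summary.
majority_label = max(group_sizes, key=group_sizes.get)
majority_baseline = Fraction(group_sizes[majority_label], total)

# Best accuracy reported in the summary.
reported_accuracy = Fraction(57, 100)

print(f"sample size: {total}")
print(f"uniform chance: {float(uniform_chance):.1%}")
print(f"majority baseline ({majority_label}): {float(majority_baseline):.1%}")
print(f"gain over uniform chance: {float(reported_accuracy - uniform_chance):.1%}")
```

Exact fractions are used so that the baselines are not affected by floating-point rounding before they are formatted for display.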