Research and Design of an AI Screening Model for Depression Based on Speech Features

TACS

Technology and Application of Computer Science

ISSN: 2998-8926, 2998-8934

Art and Design

10.61369/TACS.2025010025

Article

https://artdesignp.com/journal/TACS/2/1/10.61369/TACS.2025010025

张宵, 白雪俊, 唐琳, 张乐伊, 苏雪, 王金社, 沈宇星, 陈彦华

2025

2025-01-14

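To address the low screening efficiency of traditional depression scales [2], this study proposes an automatic screening model based on speech features. Speech samples were collected from 200 clinical patients and healthy individuals. After preprocessing and feature extraction, a deep learning model was built that combines LSTM temporal modeling with an attention mechanism. In testing, the model reached an accuracy of 84.62% and an F1 score of 0.86, and it outperformed traditional scales in efficiency and consistency.

Keywords: depression; speech features; LSTM-Attention mechanism; deep learning; mental health screening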
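The attention step that follows the LSTM can be illustrated with a minimal sketch. It assumes dot-product scoring against a single query vector, which the abstract does not specify, and it skips the LSTM itself: `hidden_states` stands in for the per-frame outputs an LSTM would produce. The names `attention_pool`, `hidden_states`, and `query` are hypothetical and the numbers are toy values, not data from the study.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse a sequence of hidden states into one context vector.

    hidden_states: list of T vectors (one per time step), each of length D.
    query: vector of length D; learned in a real model, fixed here (assumption).
    Returns (context, weights): the weighted sum and the T attention weights.
    """
    # Score each time step by its dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # Weighted sum of the hidden states, computed per dimension.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy stand-in for LSTM outputs: 3 time steps, 2 hidden units (hypothetical values).
hidden_states = [[0.1, 0.0], [0.9, 0.2], [0.3, 0.8]]
query = [1.0, 0.5]
context, weights = attention_pool(hidden_states, query)
print(weights)
print(context)
```

In a full model the context vector would feed a classification layer, and the query (or a small scoring network) would be trained jointly with the LSTM. The weights indicate which frames contribute most to the decision.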

[1] World Health Organization. Depression and other common mental disorders: Global health estimates[R]. Switzerland: World Health Organization, 2022.
[2] World Health Organization. The ICD-10 classification of mental and behavioural disorders: Clinical descriptions and diagnostic guidelines[M]. Geneva: WHO, 1992.
[3] Kroenke K, Spitzer R L. The PHQ-9: A new depression diagnostic and severity measure[J]. Psychiatric Annals, 2002, 32(9): 509-515.
[4] Hamilton M. A rating scale for depression[J]. Journal of Neurology, Neurosurgery & Psychiatry, 1960, 23(1): 56-62.
[5] Donoho D L, Johnstone I M. Ideal spatial adaptation by wavelet shrinkage[J]. Biometrika, 1994, 81(3): 425-455.
[6] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[7] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
[8] Cooley J W, Tukey J W. An algorithm for the machine calculation of complex Fourier series[J]. Mathematics of Computation, 1965, 19(90): 297-301.
[9] Kingma D P, Ba J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[10] Vaswani A, et al. Attention is all you need[C]. Advances in Neural Information Processing Systems, 2017: 5998-6008.
[11] Cummins N, et al. A review of depression and suicide risk assessment using speech analysis[J]. Speech Communication, 2015, 71: 10-49.
[12] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 770-778.
[13] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J/OL]. arXiv:1810.04805, 2018.
[14] Valstar M, Schuller B, Smith K, et al. AVEC 2016: Depression, mood, and emotion recognition workshop and challenge[C]//Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge. Amsterdam, Netherlands: ACM, 2016: 3-10.
[15] Li X, Pang T, Liu Y, et al. Multimodal fusion for mental health assessment[J]. IEEE Transactions on Affective Computing, 2021, 12(3): 582-595.