Detecting Depression in Social Media with NLP Models Trained on Journal Entry Data
DOI:
https://doi.org/10.58445/rars.1632
Keywords:
text classification, depression, tweets
Abstract
Writing has long been recognized as a powerful means of expressing human emotions, serving as a reflective practice that allows individuals to process and articulate their inner experiences. With the rise of social media, however, the landscape of emotional expression has shifted. The transition from private journaling to public posting raises important questions about how effectively these platforms serve as emotional outlets and what they reveal about users' mental health, particularly for pervasive mood disorders such as depression, which affects over 18 million adults in the United States. Natural language processing (NLP) models have recently been noted as a promising tool for detecting underlying sentiment in text. This research explores how the Twitter posts of individuals suffering from depression compare with those of non-depressed users when analyzed using an NLP model trained on journal data classified by emotion. Two separate clustering approaches were used to reduce dimensionality in the training data and train machine learning models: one with spectral clustering and principal component analysis (PCA), and the other with the Natural Language Toolkit (NLTK) library. Both approaches, with accuracy over 99%, showed that tweets from depressed Twitter users are classified as more negative than those from non-depressed users. These findings suggest that the emotional content of social media posts by individuals with depression is consistently more negative, aligning with the patterns observed in their journal entries. Ultimately, this research highlights the evolving role of social media as a platform for emotional expression and its implications for mental health monitoring.
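The first approach named in the abstract pairs PCA with spectral clustering. A minimal sketch of that kind of pipeline, written with NumPy only, could look like the following. The toy texts, the bag-of-words features, the choice of two components, the RBF affinity with `sigma=1.0`, the two-cluster split, and all helper names are illustrative assumptions, not the study's data or implementation.

```python
import numpy as np

# Hypothetical toy texts for illustration only; not data from the study.
texts = [
    "sad alone tired hopeless",
    "sad alone tired empty",
    "sad alone hopeless empty",
    "happy friends great excited",
    "happy friends great weekend",
    "happy friends excited weekend",
]

def bag_of_words(docs):
    """Build a document-by-term count matrix from whitespace-split tokens."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.split():
            X[i, index[w]] += 1.0
    return X, vocab

def pca_project(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # center each feature column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def spectral_bipartition(Z, sigma=1.0):
    """Split points into two clusters using the sign of the Fiedler vector
    of the symmetric normalized graph Laplacian (assumed RBF affinity)."""
    sq_dist = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))  # pairwise affinities
    np.fill_diagonal(W, 0.0)                   # no self-loops
    degree = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    L_sym = np.eye(len(Z)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, eigvecs = np.linalg.eigh(L_sym)         # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                    # second-smallest eigenvalue
    return (fiedler > 0).astype(int)

X, vocab = bag_of_words(texts)
Z = pca_project(X, n_components=2)
labels = spectral_bipartition(Z)
print(labels)
```

On these toy texts the first three and the last three entries land in different clusters. Which group receives label 0 and which receives label 1 is arbitrary, because the sign of an eigenvector is not fixed.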
License
Copyright (c) 2024 Tvisha Choubey

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.