Public Sentiment NLP – Reputational Risk
This project analyzes public sentiment on X (formerly Twitter) about the release of the Epstein documents. Tweets were collected with Tweet-Harvest (Node.js), stored in Google BigQuery, processed through a multi-stage NLP pipeline, and scored with VADER.
Detailed Insights
Data Collection & Pipeline
Indonesian-language tweets matching the keywords 'epstain' and 'epstein' were scraped. The text was cleaned with regular expressions, tokenized, and stemmed with Sastrawi (an Indonesian stemming library). Each tweet was then labeled by its VADER compound score.
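The cleaning, tokenization, and labeling steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the regex patterns and the function names `clean_tweet`, `tokenize`, and `label_from_compound` are hypothetical, and the ±0.05 cutoff is the conventional VADER threshold, assumed here because the cutoffs used in the project are not stated. The Sastrawi step and the VADER scoring call are left out so the sketch needs only the standard library.

```python
import re

# Hypothetical cleaning patterns; the project's real regexes are not shown.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text: str) -> str:
    """Lowercase the text, then strip URLs, mentions, and non-letter characters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)  # also drops '#', digits, punctuation
    return " ".join(text.split())  # collapse repeated whitespace


def tokenize(text: str) -> list[str]:
    """Whitespace tokenization of the cleaned text."""
    return clean_tweet(text).split()


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a label (assumed +/-0.05 cutoffs)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample = "RT @user: Dokumen Epstein dirilis! https://t.co/abc #epstein"
    print(tokenize(sample))
    print(label_from_compound(-0.42))
```

In a full pipeline, the compound score passed to `label_from_compound` would come from the sentiment scorer rather than being supplied by hand.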
Sentiment Findings
Negative sentiment dominates at ~49.6%. The frequency of 'war' (176) points to a strong geopolitical narrative, while 'child' (92) reflects public anger. Together, these suggest a highly emotionally charged discourse.
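Word counts like those above can be obtained by tallying tokens across all tweets. The sketch below assumes each tweet has already been reduced to a list of tokens; the function name `top_words`, the stopword set, and the toy input are illustrative and are not project data.

```python
from collections import Counter
from typing import Iterable


def top_words(
    token_lists: Iterable[list[str]],
    n: int = 10,
    stopwords: frozenset = frozenset(),
) -> list[tuple[str, int]]:
    """Count tokens across all tweets and return the n most frequent."""
    counts = Counter(
        token
        for tokens in token_lists
        for token in tokens
        if token not in stopwords
    )
    return counts.most_common(n)


if __name__ == "__main__":
    # Toy input for illustration only.
    tweets = [["war", "child", "war"], ["war", "the", "child"]]
    print(top_words(tweets, n=2, stopwords=frozenset({"the"})))
```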
Recommendations
Increase reporting transparency to build public trust. Narratives should balance legal facts with geopolitical context.
Tech Stack
Tweet-Harvest (Node.js), Google BigQuery, Regex, Sastrawi, VADER
Key Results
- 2,262 tweets analyzed
- 49.6% Negative, 25.6% Positive
- Top words: evidence (318), war (197)
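Percentage shares like those listed above follow from the per-tweet labels. A minimal sketch, assuming each tweet carries one string label; the function name `sentiment_shares` and the toy labels are hypothetical and do not reproduce the project's data.

```python
from collections import Counter


def sentiment_shares(labels: list[str]) -> dict[str, float]:
    """Return each label's share of all tweets as a percentage (1 decimal)."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: round(100 * count / total, 1) for label, count in counts.items()}


if __name__ == "__main__":
    # Toy labels for illustration only.
    print(sentiment_shares(["negative", "negative", "positive", "neutral"]))
```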