Longitudinal analysis of sentiment and emotion in news media headlines using automated labelling with Transformer language models. David Rozado, Ruth Hughes, Jamin Halberstadt. PLOS One, October 18, 2022. https://doi.org/10.1371/journal.pone.0276367
Abstract: This work describes a chronological (2000–2019) analysis of sentiment and emotion in 23 million headlines from 47 news media outlets popular in the United States. We use Transformer language models fine-tuned for detection of sentiment (positive, negative) and Ekman’s six basic emotions (anger, disgust, fear, joy, sadness, surprise) plus neutral to automatically label the headlines. Results show an increase of sentiment negativity in headlines across written news media since the year 2000. Headlines from right-leaning news media have been, on average, consistently more negative than headlines from left-leaning outlets over the entire studied time period. The chronological analysis of headlines emotionality shows a growing proportion of headlines denoting anger, fear, disgust and sadness and a decrease in the prevalence of emotionally neutral headlines across the studied outlets over the 2000–2019 interval. The prevalence of headlines denoting anger appears to be higher, on average, in right-leaning news outlets than in left-leaning news media.
Discussion
The results of this work show an increase of sentiment negativity in headlines across news media outlets popular in the United States since at least the year 2000. The sentiment of headlines in right-leaning news outlets has been, on average, more negative than the sentiment of headlines in left-leaning news outlets for the entirety of the 2000–2019 studied time interval. Also, since at least the year 2008, there has been a substantial increase in the prevalence of headlines denoting anger across popular news media outlets. Here as well, right-leaning news media appear, on average, to have used a higher proportion of anger-denoting headlines than left-leaning news outlets. The prevalence of headlines denoting fear and sadness has also increased overall during the 2000–2019 interval. Within the same temporal period, the proportion of headlines with neutral emotional valence has markedly decreased across the entire news media ideological spectrum.
The higher prevalence of negativity and anger in right-leaning news media is noteworthy. Perhaps this is due to right-leaning news media simply using more negative language than left-leaning news media to describe the same phenomena. Alternatively, the higher negativity and anger undertones in headlines from right-leaning news media could be driven by differences in topic coverage between both types of outlets. Clarifying the underlying reasons for the different sentiment and emotional undertones of headlines between left-leaning and right-leaning news media could be an avenue for relevant future research.
The structural break in the sentiment polarity and the emotional payload of headlines around 2010 is intriguing, although the short length of the time series under investigation (just 20 years of observations) makes its reliability uncertain. Due to the methodological limitations of our observational study, we can only speculate about its potential causes.
In the year 2009, social media giants Facebook and Twitter added the like and retweet buttons respectively to their platforms [33]. These features allowed those social media companies to collect information about how to capture users’ attention and maximize engagement through algorithmically determined personalized feeds. Information about which news articles diffused more profusely through social media percolated to news outlets by user-tracking systems such as browser cookies and social media virality metrics. In the early 2010s, media companies also began testing news media headlines across dozens of variations to determine the version that generated the highest click-through ratio [34]. Thus, a perverse incentive might have emerged in which news outlets, judging by the larger reach/popularity of their articles with negative/emotional headlines, started to drift towards increasing usage of negative sentiment/emotions in their headlines.
A limitation of this work is the frequent semantic overloading of the sentiment/emotion annotation task. The negative sentiment category, for instance, often conflates under the same umbrella notion of negativity text that describes suffering and/or being at the receiving end of mistreatment, as in “the Prime Minister has been a victim of defamation”, with text that denotes negative behavior or character traits, as in “the Prime Minister is selfish”. Thus, it is uncertain whether the increasingly prevalent headlines with negative connotations emphasize victimization, negative behavior/judgment or a mixture of the two.
An additional limitation of this work is the frequent ambiguity of the sentiment/emotion annotation task. The sentiment polarity and particularly the emotional payload of a text instance can be highly subjective, and intercoder agreement is generally low, especially for the latter, albeit above chance. For this reason, automated annotations for single instances of text can be noisy and thus unreliable. Yet, as shown in the simulation experiments (see S1 File for details), when aggregating the emotional payload over a large number of headlines, the average signal rises above the noise to provide a robust proxy of overall emotion in large text corpora. Reliable annotations at the individual headline level, however, would require more overdetermined emotional categories.
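The aggregation argument can be illustrated with a toy simulation. The sketch below is not the experiment reported in the S1 File; it assumes a hypothetical binary labeller ("negative" vs. "not negative") whose errors are symmetric and independent across headlines. The function name `simulate_observed_share`, the 70% per-headline accuracy and the two "true" shares of negative headlines (0.30 and 0.40) are all invented for illustration.

```python
import random


def simulate_observed_share(true_share, accuracy, n_headlines, seed=0):
    """Share of headlines a noisy labeller flags as negative.

    Assumption: each headline is labelled correctly with probability
    `accuracy`, independently of all others (symmetric label noise).
    """
    rng = random.Random(seed)
    flagged = 0
    for _ in range(n_headlines):
        is_negative = rng.random() < true_share   # ground truth for this headline
        is_correct = rng.random() < accuracy      # did the labeller get it right?
        label = is_negative if is_correct else not is_negative
        flagged += int(label)
    return flagged / n_headlines


if __name__ == "__main__":
    # Two hypothetical years with different true shares of negative headlines.
    for n in (100, 10_000, 100_000):
        early = simulate_observed_share(0.30, 0.7, n, seed=1)
        late = simulate_observed_share(0.40, 0.7, n, seed=2)
        print(f"n={n:>7}: early={early:.3f} late={late:.3f} diff={late - early:+.3f}")
```

Under these assumptions the expected observed share is `true_share * accuracy + (1 - true_share) * (1 - accuracy)`, i.e. 0.42 and 0.46 for the two hypothetical years. The gap between years is compressed by the label noise, but its direction is preserved, and the sampling error around each estimate shrinks as the number of headlines grows.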
The imbalanced nature of the emotion labels also represents a challenge for the classification analysis. For that reason, we used performance metrics that are recommended when handling imbalanced data such as confusion matrices, precision, recall and F1 scores. Usage of different algorithms such as decision trees is often recommended when working with imbalanced data. However, since Transformer models represent the state-of-the-art for NLP text classification, we circumscribed our analysis to their usage. Other techniques for dealing with imbalanced data such as oversampling the minority class or undersampling the majority class could have also been used. However, our relatively small number of human-annotated headlines (1124 for sentiment and 5353 for emotion) constrained our ability to trim the human-annotated data set.
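To see why per-class precision, recall and F1 are preferred over plain accuracy for imbalanced labels, consider the minimal sketch below. It is not the authors' evaluation code: the helper `per_class_metrics` and the toy data (90 "neutral" and 10 "anger" headlines) are hypothetical, and undefined ratios are reported as 0.0 by convention.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


if __name__ == "__main__":
    # Toy imbalanced data: a classifier that always answers "neutral".
    y_true = ["neutral"] * 90 + ["anger"] * 10
    y_pred = ["neutral"] * 100
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print("accuracy:", accuracy)
    for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
        print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case the always-"neutral" classifier reaches 90% accuracy while its recall and F1 for "anger" are zero, which is the kind of failure that accuracy alone would hide.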
Another limitation of this work is the potential biases of the human raters that annotated the sentiment and emotion of news media headlines. It is conceivable that our sample of human raters, recruited through Mechanical Turk, is not representative of the general US population. For instance, the distribution of socioeconomic status among raters active in Mechanical Turk might not match the distribution of the entire US population. The impact of such potential sample bias on headlines sentiment/emotion estimation is uncertain.
A final limitation of our work is the small number of outlets falling into the centrist political orientation category according to the AllSides Media Bias Chart v1.1. Such a small sample size limits the sample representativeness and constrains the external validity of the centrist outlets results reported here.
An important question raised by this work is whether the sentiment and emotionality embedded in news media headlines reflect a wider societal mood or if instead they just reflect the sentiment and emotionality prevalent among, or pushed by, those creating news content. Financial incentives to maximize click-through ratios could be at play in increasing the sentiment polarity and emotional charge of headlines over time. Conceivably, the temptation of shaping the sentiment and emotional undertones of news headlines to advance political agendas could also be playing a role. Deciphering these unknowns is beyond the scope of this article and could be a worthy goal for future research.
To conclude, we hope this work paves the way for further exploration of the potential impact on public consciousness of growing emotionality and sentiment negativity of news media content and whether such trends are conducive to sustaining public well-being. Thus, we hope that future research throws light on the potential psychological and social impact of public consumption of news media diets with increasingly negative sentiment and anger/fear/sadness undertones embedded within them.