Project Background and Summary

New York Times Headline Sentiment Analysis

Previous Research

This project builds on data gathered for the DACSS 602 course “Research Design.” For that course's project, our research group posed the question:

How did the sentiment of news reporting on the U.S. withdrawal from Afghanistan shift between the time the withdrawal was agreed to under former President Trump and the time it was carried out under President Biden?

Basic Research Design: Manual coding of a stratified, representative sample of 300 articles after reaching an appropriate inter-coder reliability rating.
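
To illustrate what an inter-coder reliability check involves, here is a minimal sketch that computes Cohen's kappa for two coders. It rests on assumptions: the reliability statistic actually used in the project is not stated here, and the function name and the example labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same items.

    Assumption: kappa stands in for whatever reliability measure
    the research group actually used.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(codes_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected chance agreement from each coder's label frequencies.
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical labels for four articles.
coder_1 = ["pos", "neg", "pos", "neg"]
coder_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(coder_1, coder_2))
```

Values near 1 indicate strong agreement beyond chance, and values near 0 indicate agreement no better than chance.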

Data Sample: We coded articles that mention Afghanistan from the time of the Doha Agreement (February 29, 2020) through September 30, 2021, following the Congressional testimonies conducted September 28-29, 2021 regarding the withdrawal. The articles were collected from the New York Times and the Wall Street Journal World and News sections.

Text Coding Method: We used NVivo 12 to code news articles covering the U.S. withdrawal from Afghanistan during the period leading up to and following the day the last of the U.S. forces left Afghanistan.

Methods/Text Coding Categories:

Outcomes:

The article source (NYT vs. WSJ) served as a moderator; the outcome was the change in media frames before and after the change in administration (Trump vs. Biden).

Current Project

I continued this line of research, with new data and a new focus, in the DACSS 697D course “Text as Data”.

This project examines how headlines differ between the print and online versions of New York Times articles related to the withdrawal of U.S. troops from Afghanistan. The analysis includes articles that mention “Afghanistan” from the time of the Doha Agreement (February 29, 2020) through September 30, 2021, following the Congressional testimonies conducted September 28-29, 2021.

I compiled the corpus from data obtained through the New York Times API. Scoring the headlines with three widely used sentiment and emotion lexicons showed no statistically significant differences between the print and online versions.
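
The general shape of a lexicon-based comparison can be sketched as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the tiny lexicon, the function names, and the choice of a permutation test are all hypothetical stand-ins for the real lexicons and the significance test used in the analysis.

```python
import random
from statistics import mean

# Toy lexicon with made-up scores; NOT one of the lexicons used in the project.
TOY_LEXICON = {"chaos": -1, "fear": -1, "deadly": -1, "peace": 1, "hope": 1, "safe": 1}

def score_headline(headline, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in one headline."""
    words = headline.lower().split()
    return sum(lexicon.get(w.strip(".,:;!?'\""), 0) for w in words)

def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in mean scores.

    Assumption: a permutation test is used here only as a simple,
    dependency-free example of a significance test.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign scores to the two groups at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical headlines, for illustration only.
print_scores = [score_headline(h) for h in ["Hope for peace talks", "Troops leave Kabul"]]
online_scores = [score_headline(h) for h in ["Chaos and fear at the airport", "Deadly day in Kabul"]]
print(permutation_p_value(print_scores, online_scores, n_perm=1000))
```

A large p-value from a test like this corresponds to a finding of no statistically significant difference between the two sets of headlines.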

Topic modeling and a co-occurrence matrix for each set of headlines revealed patterns in the types of words chosen for each audience.
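
As a rough illustration of what a co-occurrence matrix captures, the sketch below counts how often two words appear in the same headline. The function name, the short stopword list, and the example headlines are hypothetical, and the project's own tokenization and preprocessing may differ.

```python
from collections import Counter
from itertools import combinations

# Small illustrative stopword list; an assumption, not the project's list.
STOPWORDS = frozenset({"the", "a", "of", "in", "to", "and", "for", "as", "on"})

def cooccurrence_counts(headlines, stopwords=STOPWORDS):
    """Count unordered word pairs that appear together in a headline."""
    counts = Counter()
    for headline in headlines:
        # Lowercase, strip surrounding punctuation, and deduplicate within the headline.
        tokens = {w.strip(".,:;!?'\"").lower() for w in headline.split()}
        tokens = sorted(t for t in tokens if t and t not in stopwords)
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        for pair in combinations(tokens, 2):
            counts[pair] += 1
    return counts

# Hypothetical headlines, for illustration only.
example = ["Taliban seize Kabul", "Kabul falls to the Taliban"]
for pair, n in cooccurrence_counts(example).most_common(3):
    print(pair, n)
```

Building one such table for print headlines and another for online headlines makes it possible to compare which word pairings are characteristic of each set.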

Specifically, this preliminary analysis suggests that print headlines may carry fewer emotionally weighted words than online headlines.

Citations: