Adding on to my initial sentiment analysis with new methods and dictionaries.
This time I'll work through a different process than in my first two analyses, using an alternative sequence to compare three sentiment lexicons available through the “tidytext” package.
The dictionaries are “bing” and “nrc” (which I used previously) and “AFINN”.
Loading the libraries and the data for the expanded analysis:
#load packages: unnest_tokens() and get_sentiments() come from tidytext;
#the joins and pipes used throughout come from dplyr
library(tidytext)
library(dplyr)
#load data and make sure each file is a plain data frame
main_headlines <- read.csv("afghanistan_headlines_main.csv")
main_headlines <- as.data.frame(main_headlines)
print_headlines <- read.csv("afghanistan_headlines_print.csv")
print_headlines <- as.data.frame(print_headlines)
The bing lexicon sorts words into binary “positive” and “negative” categories.
First I’ll create tokens for the main and print headlines.
#create tokens without stop words for main headlines
tkn_l_main <- apply(main_headlines, 1, function(x) { data.frame(text=x, stringsAsFactors = FALSE) %>% unnest_tokens(word, text)})
main_news_tokens <- lapply(tkn_l_main, function(x) {anti_join(x, stop_words)})
str(main_news_tokens, list.len = 5)
List of 936
$ :'data.frame': 12 obs. of 1 variable:
..$ word: chr [1:12] "1" "7" "17" "2020" ...
$ :'data.frame': 12 obs. of 1 variable:
..$ word: chr [1:12] "2" "8" "30" "2020" ...
$ :'data.frame': 12 obs. of 1 variable:
..$ word: chr [1:12] "3" "6" "2" "2021" ...
$ :'data.frame': 11 obs. of 1 variable:
..$ word: chr [1:11] "4" "12" "20" "2020" ...
$ :'data.frame': 10 obs. of 1 variable:
..$ word: chr [1:10] "5" "9" "11" "2021" ...
[list output truncated]
main_news_tokens[[1]]
word
doc_id 1
date...2 7
date...3 17
date...4 2020
text...5 174
text...6 million
text...7 afghan
text...8 drone
text...9 program
text...10 riddled
text...11 u.s
text...12 report
#create tokens without stop words for print headlines
tkn_l_print <- apply(print_headlines, 1, function(x) { data.frame(text=x, stringsAsFactors = FALSE) %>% unnest_tokens(word, text)})
print_news_tokens <- lapply(tkn_l_print, function(x) {anti_join(x, stop_words)})
str(print_news_tokens, list.len = 5)
List of 936
$ :'data.frame': 11 obs. of 1 variable:
..$ word: chr [1:11] "1" "7" "17" "2020" ...
$ :'data.frame': 10 obs. of 1 variable:
..$ word: chr [1:10] "2" "8" "30" "2020" ...
$ :'data.frame': 10 obs. of 1 variable:
..$ word: chr [1:10] "3" "6" "2" "2021" ...
$ :'data.frame': 12 obs. of 1 variable:
..$ word: chr [1:12] "4" "12" "20" "2020" ...
$ :'data.frame': 7 obs. of 1 variable:
..$ word: chr [1:7] "5" "9" "11" "2021" ...
[list output truncated]
print_news_tokens[[1]]
word
doc_id 1
date...2 7
date...3 17
date...4 2020
text...5 174
text...6 million
text...7 drone
text...8 program
text...9 afghans
text...10 riddled
text...11 pentagon
Next I need a scoring function that turns each headline's lexicon matches into a single number, and then I can apply it to both headline data sets. (The numeric tokens from the doc_id and date columns won't match any lexicon word, so they drop out at the join.)
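The scoring helper itself doesn't appear above, so here is a minimal sketch of what compute_sentiment() presumably does, assuming it counts matched positive words minus matched negative words and returns NA when a headline contains no lexicon words at all (the source of the NA's in the summaries below):
compute_sentiment <- function(d) {
  #no lexicon matches for this headline: nothing to score
  if (nrow(d) == 0) {
    return(NA)
  }
  #the score is the count of positive matches minus the count of negative matches
  neg_score <- d %>% filter(sentiment == "negative") %>% nrow()
  pos_score <- d %>% filter(sentiment == "positive") %>% nrow()
  pos_score - neg_score
}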
sentiments_bing <- get_sentiments("bing")
#apply sentiment to main headlines
main_news_sentiment_bing <- sapply(main_news_tokens, function(x) { x %>% inner_join(sentiments_bing) %>% compute_sentiment()})
#apply sentiment to print headlines
print_news_sentiment_bing <- sapply(print_news_tokens, function(x) { x %>% inner_join(sentiments_bing) %>% compute_sentiment()})
The summaries of each show the number of NA's (headlines with no bing matches):
summary(main_news_sentiment_bing)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-4.0000 -1.0000 -1.0000 -0.5945 1.0000 2.0000 349
summary(print_news_sentiment_bing)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-4.0000 -1.0000 -1.0000 -0.6209 0.0000 3.0000 353
Now I can look at the first 10 headlines and their corresponding bing scores; even within these 10, the scores vary.
#head 10 main headlines with bing analysis scores
main_news_sentiment_bing_df <- data.frame(main_text=main_headlines$text, score = main_news_sentiment_bing)
head(main_news_sentiment_bing_df, 10)
main_text
1 $174 Million Afghan Drone Program Is Riddled With Problems, U.S. Report Says
2 ‘A Hail Mary’: Psychedelic Therapy Draws Veterans to Jungle Retreats
3 ‘Come On In, Boys’: A Wave of the Hand Sets Off Spain-Morocco Migrant Fight
4 ‘Covid Can’t Compete.’ In a Place Mired in War, the Virus Is an Afterthought.
5 ‘Everything Changed Overnight’: Afghan Reporters Face an Intolerant Regime
6 ‘Finally, I Am Safe’: U.S. Air Base Becomes Temporary Refuge for Afghans
7 ‘Find Him and Kill Him’: An Afghan Pilot’s Desperate Escape
8 ‘Football Is Like Food’: Afghan Female Soccer Players Find a Home in Italy
9 ‘Go Big’ on Coronavirus Stimulus, Trump Says, Pitching Checks for Americans
10 ‘Hospital Needs to Be Quarantined,’ but Works On in Country at War
score
1 NA
2 1
3 NA
4 -1
5 NA
6 1
7 -2
8 NA
9 1
10 NA
#head 10 print headlines with bing analysis scores
print_news_sentiment_bing_df <- data.frame(print_text=print_headlines$text, score = print_news_sentiment_bing)
head(print_news_sentiment_bing_df, 10)
print_text
1 $174 Million Drone Program for Afghans Is Riddled With Problems, Pentagon Says
2 Psychedelic Therapy In the Jungle Soothes The Pain for Veterans
3 Morocco Sends Spanish Outpost a Migrant Influx
4 ‘It’s a Lie’: Denial and Skepticism Permeate a Nation Embroiled in War
5 ‘Everything Changed’: Media Face Crackdown
6 ‘Finally, I Am Safe’: Thousands Find Temporary Refuge at U.S. Air Base
7 ‘Find Him and Kill Him’: A Pilot’s Desperate Escape From Kabul
8 Soccer Players Under Threat Escape to Italy
9 Plan Would Inject $1 Trillion Into Economy
10 As Pandemic Takes Toll on Afghan Doctors, Hospitals Still Tend to War Wounded
score
1 NA
2 -1
3 NA
4 -4
5 NA
6 1
7 -2
8 -1
9 NA
10 -1
As I saw in my first two analyses, the NRC lexicon uses 10 different sentiments: negative and positive, plus eight more specific emotions.
sentiments_nrc <- get_sentiments("nrc")
(unique_sentiments_nrc <- unique(sentiments_nrc$sentiment))
[1] "trust" "fear" "negative" "sadness"
[5] "anger" "surprise" "positive" "disgust"
[9] "joy" "anticipation"
Next I will again create a function to assign sentiment labels, this time mapping each of the eight emotion categories onto a binary “positive” or “negative” interpretation.
compute_pos_neg_sentiments_nrc <- function(the_sentiments_nrc) {
  #the mapped_sentiment vector is ordered to match unique()'s output above:
  #trust, fear, negative, sadness, anger, surprise, positive, disgust, joy, anticipation
  s <- unique(the_sentiments_nrc$sentiment)
  df_sentiments <- data.frame(sentiment = s,
                              mapped_sentiment = c("positive", "negative", "negative", "negative",
                                                   "negative", "positive", "positive", "negative",
                                                   "positive", "positive"))
  #join on the argument (not the global lexicon) so the rows line up one to one
  ss <- the_sentiments_nrc %>% inner_join(df_sentiments, by = "sentiment")
  the_sentiments_nrc$sentiment <- ss$mapped_sentiment
  the_sentiments_nrc
}
nrc_sentiments_pos_neg_scale <- compute_pos_neg_sentiments_nrc(sentiments_nrc)
Then I can apply the remapped lexicon to the headline data sets:
#calculating NRC sentiment for main headlines
main_news_sentiment_nrc <- sapply(main_news_tokens, function(x) { x %>% inner_join(nrc_sentiments_pos_neg_scale) %>% compute_sentiment()})
#calculating NRC sentiment for print headlines
print_news_sentiment_nrc <- sapply(print_news_tokens, function(x) { x %>% inner_join(nrc_sentiments_pos_neg_scale) %>% compute_sentiment()})
The summaries of each show the number of NA's is much smaller, since NRC matches many more words:
summary(main_news_sentiment_nrc)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-12.0000 -3.0000 -1.0000 -0.7417 1.0000 13.0000 150
summary(print_news_sentiment_nrc)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-12.0000 -3.0000 -1.0000 -0.4994 2.0000 13.0000 151
Now I can look at the first 10 headlines and the corresponding NRC scores. The scores vary here as well.
#data frame of main NRC sentiment
main_news_sentiment_nrc_df <- data.frame(main_text=main_headlines$text, score = main_news_sentiment_nrc)
head(main_news_sentiment_nrc_df, 10)
main_text
1 $174 Million Afghan Drone Program Is Riddled With Problems, U.S. Report Says
2 ‘A Hail Mary’: Psychedelic Therapy Draws Veterans to Jungle Retreats
3 ‘Come On In, Boys’: A Wave of the Hand Sets Off Spain-Morocco Migrant Fight
4 ‘Covid Can’t Compete.’ In a Place Mired in War, the Virus Is an Afterthought.
5 ‘Everything Changed Overnight’: Afghan Reporters Face an Intolerant Regime
6 ‘Finally, I Am Safe’: U.S. Air Base Becomes Temporary Refuge for Afghans
7 ‘Find Him and Kill Him’: An Afghan Pilot’s Desperate Escape
8 ‘Football Is Like Food’: Afghan Female Soccer Players Find a Home in Italy
9 ‘Go Big’ on Coronavirus Stimulus, Trump Says, Pitching Checks for Americans
10 ‘Hospital Needs to Be Quarantined,’ but Works On in Country at War
score
1 -2
2 0
3 -3
4 -3
5 -5
6 8
7 -4
8 6
9 1
10 -3
#data frame of print NRC sentiment
print_news_sentiment_nrc_df <- data.frame(print_text=print_headlines$text, score = print_news_sentiment_nrc)
head(print_news_sentiment_nrc_df, 10)
print_text
1 $174 Million Drone Program for Afghans Is Riddled With Problems, Pentagon Says
2 Psychedelic Therapy In the Jungle Soothes The Pain for Veterans
3 Morocco Sends Spanish Outpost a Migrant Influx
4 ‘It’s a Lie’: Denial and Skepticism Permeate a Nation Embroiled in War
5 ‘Everything Changed’: Media Face Crackdown
6 ‘Finally, I Am Safe’: Thousands Find Temporary Refuge at U.S. Air Base
7 ‘Find Him and Kill Him’: A Pilot’s Desperate Escape From Kabul
8 Soccer Players Under Threat Escape to Italy
9 Plan Would Inject $1 Trillion Into Economy
10 As Pandemic Takes Toll on Afghan Doctors, Hospitals Still Tend to War Wounded
score
1 -2
2 -4
3 -1
4 -7
5 NA
6 8
7 -4
8 -3
9 2
10 -5
The AFINN lexicon has valence ratings between -5 (negative) and +5 (positive).
sentiments_afinn <- get_sentiments("afinn")
#rename AFINN's numeric "value" column to "sentiment" to match the other lexicons
colnames(sentiments_afinn) <- c("word", "sentiment")
This time no separate scoring function is needed: I join the tokens to AFINN and sum the valence values directly, keeping NA for headlines with no matches.
#applying AFINN sentiment to main headlines
main_news_sentiment_afinn_df <- lapply(main_news_tokens, function(x) { x %>% inner_join(sentiments_afinn)})
main_news_sentiment_afinn <- sapply(main_news_sentiment_afinn_df, function(x) {
ifelse(nrow(x) > 0, sum(x$sentiment), NA)
})
#applying AFINN sentiment to print headlines
print_news_sentiment_afinn_df <- lapply(print_news_tokens, function(x) { x %>% inner_join(sentiments_afinn)})
print_news_sentiment_afinn <- sapply(print_news_sentiment_afinn_df, function(x) {
ifelse(nrow(x) > 0, sum(x$sentiment), NA)
})
The summaries of each show the number of NA's is similar to that of the bing lexicon.
summary(main_news_sentiment_afinn)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-10.000 -3.000 -2.000 -1.769 -1.000 5.000 368
summary(print_news_sentiment_afinn)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
-10.00 -3.00 -2.00 -1.52 -1.00 6.00 359
Now I can look at the first 10 headlines and the corresponding AFINN analysis scores. I can see that the scores vary a lot less than in the first two lexicons.
#data frame of AFINN main headlines
main_news_sentiment_afinn_df <- data.frame(main_text=main_headlines$text, score = main_news_sentiment_afinn)
head(main_news_sentiment_afinn_df, 10)
main_text
1 $174 Million Afghan Drone Program Is Riddled With Problems, U.S. Report Says
2 ‘A Hail Mary’: Psychedelic Therapy Draws Veterans to Jungle Retreats
3 ‘Come On In, Boys’: A Wave of the Hand Sets Off Spain-Morocco Migrant Fight
4 ‘Covid Can’t Compete.’ In a Place Mired in War, the Virus Is an Afterthought.
5 ‘Everything Changed Overnight’: Afghan Reporters Face an Intolerant Regime
6 ‘Finally, I Am Safe’: U.S. Air Base Becomes Temporary Refuge for Afghans
7 ‘Find Him and Kill Him’: An Afghan Pilot’s Desperate Escape
8 ‘Football Is Like Food’: Afghan Female Soccer Players Find a Home in Italy
9 ‘Go Big’ on Coronavirus Stimulus, Trump Says, Pitching Checks for Americans
10 ‘Hospital Needs to Be Quarantined,’ but Works On in Country at War
score
1 NA
2 2
3 -1
4 -2
5 NA
6 1
7 -7
8 NA
9 NA
10 -2
#data frame of AFINN print headlines
print_news_sentiment_afinn_df <- data.frame(print_text=print_headlines$text, score = print_news_sentiment_afinn)
head(print_news_sentiment_afinn_df, 10)
print_text
1 $174 Million Drone Program for Afghans Is Riddled With Problems, Pentagon Says
2 Psychedelic Therapy In the Jungle Soothes The Pain for Veterans
3 Morocco Sends Spanish Outpost a Migrant Influx
4 ‘It’s a Lie’: Denial and Skepticism Permeate a Nation Embroiled in War
5 ‘Everything Changed’: Media Face Crackdown
6 ‘Finally, I Am Safe’: Thousands Find Temporary Refuge at U.S. Air Base
7 ‘Find Him and Kill Him’: A Pilot’s Desperate Escape From Kabul
8 Soccer Players Under Threat Escape to Italy
9 Plan Would Inject $1 Trillion Into Economy
10 As Pandemic Takes Toll on Afghan Doctors, Hospitals Still Tend to War Wounded
score
1 NA
2 -2
3 NA
4 -4
5 NA
6 1
7 -7
8 -3
9 NA
10 -2
Having obtained three sentiment evaluations for each headline data set, next I will calculate their congruence.
By congruence I mean whether all of the lexicons agree on a positive or a negative result: the scores signal the same sentiment when they share the same sign, independently of each lexicon's scale of magnitude. If “NA” values are present, congruence is still computed as long as at least two non-“NA” values are available; otherwise the value is “NA”.
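The compute_congruence() helper used below is not defined above either; here is a minimal sketch consistent with that description (an assumed reconstruction, not necessarily the code that produced the output below):
compute_congruence <- function(x, y, z) {
  v <- c(sign(x), sign(y), sign(z))
  #with fewer than two non-NA scores there is nothing to compare
  if (sum(is.na(v)) >= 2) {
    return(NA)
  }
  v <- na.omit(v)
  #TRUE exactly when all remaining signs agree and none of them are zero
  abs(sum(v)) == length(v)
}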
Then I compute the final news sentiment from the sum of the three lexicon scores.
compute_final_sentiment <- function(x, y, z) {
  if (is.na(x) && is.na(y) && is.na(z)) {
    return(NA)
  }
  s <- sum(x, y, z, na.rm = TRUE)
  #positive sentiments have score strictly greater than zero
  #negative sentiments have score strictly less than zero
  #neutral sentiments have score equal to zero
  ifelse(s > 0, "positive", ifelse(s < 0, "negative", "neutral"))
}
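For example, compute_final_sentiment(1, -3, NA) sums the non-NA scores to -2 and returns “negative”, while compute_final_sentiment(NA, NA, NA) returns NA.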
Now I will put the sentiment results in new data frames and apply the analyses.
main_sentiments_results <- data.frame(main_text = main_headlines$text,
bing_score = main_news_sentiment_bing,
nrc_score = main_news_sentiment_nrc,
afinn_score = main_news_sentiment_afinn,
stringsAsFactors = FALSE)
print_sentiments_results <- data.frame(print_text = print_headlines$text,
bing_score = print_news_sentiment_bing,
nrc_score = print_news_sentiment_nrc,
afinn_score = print_news_sentiment_afinn,
stringsAsFactors = FALSE)
main_sentiments_results <- main_sentiments_results %>% rowwise() %>%
mutate(final_sentiment = compute_final_sentiment(bing_score, nrc_score, afinn_score),
congruence = compute_congruence(bing_score, nrc_score, afinn_score))
print_sentiments_results <- print_sentiments_results %>% rowwise() %>%
mutate(final_sentiment = compute_final_sentiment(bing_score, nrc_score, afinn_score),
congruence = compute_congruence(bing_score, nrc_score, afinn_score))
head(main_sentiments_results, 10)
# A tibble: 10 x 6
# Rowwise:
main_text bing_score nrc_score afinn_score final_sentiment
<chr> <int> <int> <dbl> <chr>
1 $174 Million Afgh~ NA -2 NA negative
2 ‘A Hail Mary’: Ps~ 1 0 2 negative
3 ‘Come On In, Boys~ NA -3 -1 negative
4 ‘Covid Can’t Comp~ -1 -3 -2 negative
5 ‘Everything Chang~ NA -5 NA negative
6 ‘Finally, I Am Sa~ 1 8 1 negative
7 ‘Find Him and Kil~ -2 -4 -7 negative
8 ‘Football Is Like~ NA 6 NA negative
9 ‘Go Big’ on Coron~ 1 1 NA negative
10 ‘Hospital Needs t~ NA -3 -2 negative
# ... with 1 more variable: congruence <lgl>
head(print_sentiments_results, 10)
# A tibble: 10 x 6
# Rowwise:
print_text bing_score nrc_score afinn_score final_sentiment
<chr> <int> <int> <dbl> <chr>
1 "$174 Million Dro~ NA -2 NA negative
2 "Psychedelic Ther~ -1 -4 -2 negative
3 "Morocco Sends Sp~ NA -1 NA negative
4 "‘It’s a Lie’: De~ -4 -7 -4 negative
5 "‘Everything Chan~ NA NA NA negative
6 "‘Finally, I Am S~ 1 8 1 negative
7 "‘Find Him and Ki~ -2 -4 -7 negative
8 "Soccer Players U~ -1 -3 -3 negative
9 "Plan Would Injec~ NA 2 NA negative
10 "As Pandemic Take~ -1 -5 -2 negative
# ... with 1 more variable: congruence <lgl>
It seems like I need to do more work on the congruence function, as I have all “NA” results. The final_sentiment column is also suspect: it reads “negative” even for rows whose scores sum positive (row 6, for instance), so the rowwise computation needs another look.
It will be useful to replace the numeric scores with the same {negative, neutral, positive} scale.
replace_score_with_sentiment <- function(v_score) {
  #map each numeric score to a label; NA scores stay NA
  ifelse(v_score > 0, "positive", ifelse(v_score < 0, "negative", "neutral"))
}
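A quick check: replace_score_with_sentiment(c(-2, 0, 3, NA)) returns “negative”, “neutral”, “positive”, NA.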
I'll apply this scale to the scores from all three lexicons in the results data frame for each headline set, converting those columns to factors.
#apply scale to main results
main_sentiments_results$bing_score <- replace_score_with_sentiment(main_sentiments_results$bing_score)
main_sentiments_results$nrc_score <- replace_score_with_sentiment(main_sentiments_results$nrc_score)
main_sentiments_results$afinn_score <- replace_score_with_sentiment(main_sentiments_results$afinn_score)
main_sentiments_results[,2:5] <- lapply(main_sentiments_results[,2:5], as.factor)
head(main_sentiments_results, 40)
# A tibble: 40 x 6
# Rowwise:
main_text bing_score nrc_score afinn_score final_sentiment
<chr> <fct> <fct> <fct> <fct>
1 $174 Million Afgh~ <NA> negative <NA> negative
2 ‘A Hail Mary’: Ps~ positive neutral positive negative
3 ‘Come On In, Boys~ <NA> negative negative negative
4 ‘Covid Can’t Comp~ negative negative negative negative
5 ‘Everything Chang~ <NA> negative <NA> negative
6 ‘Finally, I Am Sa~ positive positive positive negative
7 ‘Find Him and Kil~ negative negative negative negative
8 ‘Football Is Like~ <NA> positive <NA> negative
9 ‘Go Big’ on Coron~ positive positive <NA> negative
10 ‘Hospital Needs t~ <NA> negative negative negative
# ... with 30 more rows, and 1 more variable: congruence <lgl>
#apply scale to print results
print_sentiments_results$bing_score <- replace_score_with_sentiment(print_sentiments_results$bing_score)
print_sentiments_results$nrc_score <- replace_score_with_sentiment(print_sentiments_results$nrc_score)
print_sentiments_results$afinn_score <- replace_score_with_sentiment(print_sentiments_results$afinn_score)
print_sentiments_results[,2:5] <- lapply(print_sentiments_results[,2:5], as.factor)
head(print_sentiments_results, 40)
# A tibble: 40 x 6
# Rowwise:
print_text bing_score nrc_score afinn_score final_sentiment
<chr> <fct> <fct> <fct> <fct>
1 "$174 Million Dro~ <NA> negative <NA> negative
2 "Psychedelic Ther~ negative negative negative negative
3 "Morocco Sends Sp~ <NA> negative <NA> negative
4 "‘It’s a Lie’: De~ negative negative negative negative
5 "‘Everything Chan~ <NA> <NA> <NA> negative
6 "‘Finally, I Am S~ positive positive positive negative
7 "‘Find Him and Ki~ negative negative negative negative
8 "Soccer Players U~ negative negative negative negative
9 "Plan Would Injec~ <NA> positive <NA> negative
10 "As Pandemic Take~ negative negative negative negative
# ... with 30 more rows, and 1 more variable: congruence <lgl>
I'll join the overall sentiment results from both headline sets into one data frame and visualize them. Taking the “positive” or “negative” value held by the majority of the three evaluations, the dataset is overwhelmingly “negative” (100%).
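The join itself isn't shown above; here is a sketch of how the comparison frame could be assembled (the compare_results name is mine, and the column names follow the printout below):
compare_results <- data.frame(article = 1:nrow(main_sentiments_results),
                              print_bing = print_sentiments_results$bing_score,
                              print_nrc = print_sentiments_results$nrc_score,
                              print_afinn = print_sentiments_results$afinn_score,
                              print_final = print_sentiments_results$final_sentiment,
                              main_bing = main_sentiments_results$bing_score,
                              main_nrc = main_sentiments_results$nrc_score,
                              main_afinn = main_sentiments_results$afinn_score,
                              main_final = main_sentiments_results$final_sentiment)
head(compare_results)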
article print_bing print_nrc print_afinn print_final main_bing
1 1 <NA> negative <NA> negative <NA>
2 2 negative negative negative negative positive
3 3 <NA> negative <NA> negative <NA>
4 4 negative negative negative negative negative
5 5 <NA> <NA> <NA> negative <NA>
6 6 positive positive positive negative positive
main_nrc main_afinn main_final
1 negative <NA> negative
2 neutral positive negative
3 negative negative negative
4 negative negative negative
5 negative <NA> negative
6 positive positive negative
#load the per-date sentiment scores prepared for plotting
main_graph <- read.csv("main_graph.csv")
library(ggplot2)
main_plot <- main_graph %>%
  ggplot(aes(date, sentiment, fill = lexicon)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~lexicon, ncol = 1, scales = "free_y") +
  scale_fill_manual(values = c("#993333", "#336699", "#669900")) +
  theme_minimal()
main_plot
This research makes use of the NRC Word-Emotion Association Lexicon, created by Saif Mohammad and Peter Turney at the National Research Council Canada.
This research makes use of the Bing lexicon, first published in Minqing Hu and Bing Liu, “Mining and summarizing customer reviews,” Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004), 2004.
This research makes use of the AFINN lexicon: Nielsen, F. Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv preprint arXiv:1103.2903.