
Mean People Tweet

There is discussion from time to time on Kiwi Twitter about which public figures get treated worse on Twitter. Eric Crampton suggested that it would be easy to answer this question empirically, by analysing tweet sentiment. I wasn’t convinced, but I tried it. This post is about what I found.

First, we need some way of classifying sentiment. I’ve got lists of about 2000 positive and 5000 negative words, collected by Bing Liu. I’ve also got an L1-penalised logistic regression model that predicts positive/negative sentiment from the GloVe 42-billion-token word embeddings. I wrote about this last year; the racist tendencies the model picks up from being trained on general online text are probably an advantage here. Last year I used MonetDB for the database backend, but the connection package has vanished from CRAN, so I’ve fallen back on SQLite.

library(twitteR)
library(tokenizers)
# Bing Liu's opinion lexicons; lines starting with ";" are comments
pos_words<-scan("~/TEACHING/369/WORDS/positive-words.txt", blank.lines.skip=TRUE, comment.char=";", what="")
neg_words<-scan("~/TEACHING/369/WORDS/negative-words.txt", blank.lines.skip=TRUE, comment.char=";", what="")
# Twitter API credentials, kept out of the script
load("~/.ssh/twitter-bot-secrets.rda")
with(secrets, setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret))
## [1] "Using direct authentication"
library(glmnet)
# cross-validated L1-penalised logistic regression fit (cvmodel)
load("~/TEACHING/369/WORDS/fittedmodel.rda")
library(dplyr)
library(RSQLite)
# SQLite database holding the GloVe embeddings
ms<-src_sqlite("~/TEACHING/369/WORDS/glove")
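For context, cvmodel is a cv.glmnet fit. I haven’t reproduced the original training code here, but a model like it could be fit along the following lines, using the lexicon words themselves as the labelled examples. This is a sketch, not the original code: the glove table layout (word in column V1, embedding dimensions in the remaining columns) matches its use below, but everything else is an assumption.

# Sketch only, not the original training code: L1-penalised logistic
# regression of word polarity on the GloVe embeddings, using the
# lexicon words as labelled examples
glove_tbl <- tbl(ms, "glove")
lexicon <- copy_to(ms, tibble(V1 = c(pos_words, neg_words)),
                   name = "lexicon", overwrite = TRUE)
train <- inner_join(lexicon, glove_tbl, by = "V1") %>% collect()
train_y <- as.integer(train$V1 %in% pos_words)  # 1 = positive word
train_x <- as.matrix(select(train, -V1))
cvmodel <- cv.glmnet(train_x, train_y, family = "binomial", alpha = 1)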

Here I’m defining some sentiment scoring functions, based on counting known words and based on the model. The function make.sentiment makes sentiment scoring functions from the predictive model: it looks up the embeddings for each word and computes the predicted log odds of being positive. The idea of the threshold is that we might not want to allow positive tweets to compensate for negative ones when we’re looking at online nastiness.

# count how many of the words are in each lexicon
pos<-function(text){
    sum(!is.na(match(text, pos_words)))
}

neg<-function(text){
    sum(!is.na(match(text, neg_words)))
}

make.sentiment <- function(db, model, threshold=NULL){
    glove <- tbl(db, "glove")
    function(words){
        # look up the embedding for each word
        word_tbl <- copy_to(db, tibble(V1=words),
                            name="temp_words", overwrite=TRUE)
        word_x <- inner_join(word_tbl, glove, by="V1") %>%
            collect() %>%
            select(-V1) %>%
            as.matrix()
        # no words with known embeddings: score as neutral
        if (nrow(word_x)==0) return(0)
        # predicted log odds of positive sentiment, one per word
        sentiments <- predict(model$glmnet.fit, word_x,
                              s=model$lambda.min)
        if (is.null(threshold))
            mean(sentiments)
        else if (any(sentiments<threshold))
            # average only the words below the threshold, so positive
            # words can't compensate for negative ones
            mean(sentiments[sentiments<threshold])
        else threshold
    }
}

sentiment<-make.sentiment(ms,cvmodel)
negsentiment<-make.sentiment(ms,cvmodel,0.3)
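As a quick sanity check (illustrative only; the exact numbers depend on the fitted model), the plain score averages over all the words, while the thresholded score averages only the words scoring below 0.3:

# illustrative: output depends on the fitted model
sentiment(c("lovely", "awful"))      # mean over both words
negsentiment(c("lovely", "awful"))   # mean over words below the threshold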

We need to get some tweets. Golriz Ghahraman and David Seymour are young Members of Parliament who are deeply unpopular with some of their opponents. Eric and I are in there as controls. Westpac NZ is a bank. @dpfdpf is David Farrar, who does polling (including for the National Party), and who asked what his scores were like when I tweeted some early results. You can look up the rest if you care.

a<-searchTwitter("@golrizghahraman",n=100,resultType="recent")
d<-searchTwitter("@dbseymour",n=100,resultType="recent")
f<-searchTwitter("@EricCrampton",n=100,resultType="recent")
g<-searchTwitter("@tslumley",n=100,resultType="recent")
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 100 tweets were requested but the
## API can only return 98
h<-searchTwitter("@GerryBrownleeMP",n=100,resultType="recent")
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 100 tweets were requested but the
## API can only return 22
j<-searchTwitter("@JudithCollinsMP",n=100,resultType="recent")
k<-searchTwitter("@MaramaDavidson",n=100,resultType="recent")
l<-searchTwitter("@realDonaldTrump",n=100,resultType="recent")
m<-searchTwitter("@dpfdpf",n=100,resultType="recent")
n<-searchTwitter("@WestpacNZ",n=100,resultType="recent")

Looking at the data, it turns out that a lot of the mentions for some people come from retweets of their own tweets. I don’t think these should count, so filter_retweets gets rid of them.

filter_retweets<-function(tweets, handle){
    texts <- sapply(tweets, function(tweet) tweet$text)
    # retweets of someone's own tweets start "RT @handle:"
    drop <- grepl(paste0("^RT ", handle, ":"), texts)
    tweets[!drop]
}


# score each tweet: lexicon counts plus both model-based scores
sent<-function(tweets){
    sapply(tweets, function(tweet) {
        words <- tokenize_words(tweet$text)[[1]]
        c(pos=pos(words), neg=neg(words),
          mean=sentiment(words), nmn=negsentiment(words))
    })
}

Now, the results: pos and neg are the average number of positive and negative lexicon words per tweet, mean is the average model score, and nmn is the average thresholded score.

rowMeans(sent(filter_retweets(a, "@golrizghahraman")))
##        pos        neg       mean        nmn 
##  0.3414634  0.5609756  0.1702128 -1.7266530
rowMeans(sent(filter_retweets(d, "@dbseymour")))
##         pos         neg        mean         nmn 
##  0.40000000  0.63157895 -0.07195284 -1.60441181
rowMeans(sent(filter_retweets(f, "@EricCrampton")))
##        pos        neg       mean        nmn 
##  0.4302326  0.2906977  0.5398656 -1.1307189
rowMeans(sent(filter_retweets(g, "@tslumley")))
##        pos        neg       mean        nmn 
##  0.4342105  0.2894737  0.4234938 -1.3043339
rowMeans(sent(filter_retweets(h, "@GerryBrownleeMP")))
##        pos        neg       mean        nmn 
##  0.3636364  0.1818182  0.5950657 -1.0555124
rowMeans(sent(filter_retweets(j, "@JudithCollinsMP")))
##        pos        neg       mean        nmn 
##  0.2777778  0.5888889  0.3705713 -1.5172044
rowMeans(sent(filter_retweets(k, "@MaramaDavidson")))
##        pos        neg       mean        nmn 
##  0.3506494  0.2077922  0.7366782 -1.5067170
rowMeans(sent(filter_retweets(l, "@realDonaldTrump")))
##        pos        neg       mean        nmn 
##  0.3650794  0.2857143  0.1882573 -1.2162030
rowMeans(sent(filter_retweets(m, "@dpfdpf")))
##        pos        neg       mean        nmn 
##  0.3043478  0.8586957 -0.0289278 -1.7399734
rowMeans(sent(filter_retweets(n, "@WestpacNZ")))
##        pos        neg       mean        nmn 
##  0.6836735  0.2448980  1.0848726 -1.1189153
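To eyeball these side by side, the same scores can be stacked into one matrix. This is just a convenience sketch; the handle vector and the list of tweet sets simply collect the objects defined above.

# convenience sketch: one row of scores per handle
handles <- c("@golrizghahraman", "@dbseymour", "@EricCrampton",
             "@tslumley", "@GerryBrownleeMP", "@JudithCollinsMP",
             "@MaramaDavidson", "@realDonaldTrump", "@dpfdpf",
             "@WestpacNZ")
tweetsets <- list(a, d, f, g, h, j, k, l, m, n)
scores <- t(mapply(function(tw, h) rowMeans(sent(filter_retweets(tw, h))),
                   tweetsets, handles))
rownames(scores) <- handles
round(scores, 3)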

The results look a bit weird. Neither Eric nor I have a horrible Twitter experience, and it turns out that most of the negative-sentiment tweets mentioning us are about things we think are bad, and that our followers or the wider internet audience agree are bad.

It’s surprisingly hard to get around this problem. One moderately useful step is to remove tweets from people we follow.

searchNoFriends<-function(handle, n=100, ...){
    # drop tweets from accounts the user follows
    user <- getUser(handle)
    friends <- user$getFriends()
    tweets <- searchTwitter(handle, n=n, resultType="recent", ...)
    drop <- sapply(tweets, function(tw) tw$screenName) %in%
        sapply(friends, function(fr) fr$screenName)
    tweets[!drop]
}

a1<-searchNoFriends("@golrizghahraman",n=100)
d1<-searchNoFriends("@dbseymour",n=100)
f1<-searchNoFriends("@EricCrampton",n=100)
g1<-searchNoFriends("@tslumley",n=100)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 100 tweets were requested but the
## API can only return 98
h1<-searchNoFriends("@GerryBrownleeMP",n=100)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 100 tweets were requested but the
## API can only return 22
j1<-searchNoFriends("@JudithCollinsMP",n=100)
k1<-searchNoFriends("@MaramaDavidson",n=100)
l1<-searchNoFriends("@realDonaldTrump",n=100)
m1<-searchNoFriends("@dpfdpf",n=100)
n1<-searchNoFriends("@WestpacNZ",n=100)

These results should be a bit more representative: Eric and I get more positive ratings, and people with actually bad timelines get more negative ratings.

rowMeans(sent(filter_retweets(a1, "@golrizghahraman")))
##         pos         neg        mean         nmn 
##  0.27027027  0.56756757  0.08362896 -1.80462827
rowMeans(sent(filter_retweets(d1, "@dbseymour")))
##        pos        neg       mean        nmn 
##  0.4137931  0.5747126 -0.0552590 -1.5820234
rowMeans(sent(filter_retweets(f1, "@EricCrampton")))
##        pos        neg       mean        nmn 
##  0.3888889  0.3333333  0.4932705 -1.1519884
rowMeans(sent(filter_retweets(g1, "@tslumley")))
##        pos        neg       mean        nmn 
##  0.4193548  0.2580645  0.8457364 -1.0536356
rowMeans(sent(filter_retweets(h1, "@GerryBrownleeMP")))
##        pos        neg       mean        nmn 
##  0.3636364  0.1818182  0.5950657 -1.0555124
rowMeans(sent(filter_retweets(j1, "@JudithCollinsMP")))
##        pos        neg       mean        nmn 
##  0.3157895  0.5921053  0.2999593 -1.5357380
rowMeans(sent(filter_retweets(k1, "@MaramaDavidson")))
##        pos        neg       mean        nmn 
##  0.2608696  0.2173913  0.3185163 -1.7419573
rowMeans(sent(filter_retweets(l1, "@realDonaldTrump")))
##        pos        neg       mean        nmn 
##  0.4444444  0.4861111  0.3453925 -1.2743030
rowMeans(sent(filter_retweets(m1, "@dpfdpf")))
##         pos         neg        mean         nmn 
##  0.29545455  0.85227273  0.01217917 -1.70170687
rowMeans(sent(filter_retweets(n1, "@WestpacNZ")))
##        pos        neg       mean        nmn 
##  0.7826087  0.2898551  1.0975516 -1.2332805

Even this, though, has a big problem. It’s possible to tell moderately well from the text of a single tweet whether its sentiment is positive or negative, but it’s much harder to tell what it’s positive or negative about. The scores don’t distinguish people criticising the subject of the analysis from people supporting the subject by criticising their opponents. And that’s genuinely hard. A model trained on politically partisan invective would know which side was the likely target of a term like “feral snowflake” or “fascist manbaby”, but “menace to society” or “partisan hack” would still confuse it. In fact, I often needed to go back and read an entire Twitter conversation to tell who was being targeted by invective in a single tweet.
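You can see this target-blindness directly by scoring the insults themselves. Illustrative only, and the exact numbers depend on the fitted model, but both phrases should come out negative even though they’re aimed at opposite political sides:

# illustrative: both score as negative regardless of who the insult
# targets -- the bag-of-words model can't tell
sentiment(tokenize_words("feral snowflake")[[1]])
sentiment(tokenize_words("fascist manbaby")[[1]])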

On top of this, we’re looking at a single snapshot in time, and the results could well vary. And, of course, a lot of the more serious online hate isn’t public and easy to analyse: Members of Parliament may get it through Twitter direct messages and Facebook, and they certainly have email accounts. So even if we could come to a simple empirical decision from public tweets, it wouldn’t be decisive.

But it’s still interesting how difficult it is to come to an empirical conclusion – and not just for the obvious reason that a bag-of-words model on short texts is limited.