How to treat a Twitter troll

A study looking for a remedy for abusive Twitter users reveals their psychology, and explains what their vitriol has in common with gang signs

In this July 27, 2016, file photo, the Twitter symbol appears above a trading post on the floor of the New York Stock Exchange. (AP Photo/Richard Drew, File)

There’s a new study out that reads like the script of a high-brow nature film where a plummy-accented narrator explains the motives and aggressions of a species in its natural habitat. The habitat here is online: “Tweetment Effects on the Tweeted: Experimentally Reducing Racist Harassment,” from the current issue of the journal Political Behavior.

Kevin Munger, a PhD student in the department of politics at New York University, set out to see if there was a “treatment” for Twitter users who made racist comments. “The rise of online social interaction has brought with it new opportunities for individuals to express their prejudices and engage in verbal harassment,” he writes. So he started by seeking out the worst of the worst to test his hypotheses—in research parlance, a “hard case”—by targeting users who lobbed the word “n—-r” directly at another person. “In the racial context of the United States, this term is almost certainly the most intrinsically offensive,” he writes, and that leaves “no doubt” that people who aim that word at someone else know exactly what they’re doing. Then, because they are the largest and most relevant group directing online harassment at African-Americans—and in order to ensure the effect of his interventions didn’t vary with his subjects’ race or sex—Munger limited his sample to white men.

Once he found a tweet containing the N-word, he compared the user’s 1,000 most recent tweets to a custom “dictionary” containing a bloodcurdling array of racist and graphically sexual terms. An algorithm generated an “offensiveness score” for each user, and anyone who fell below the 75th percentile, as determined by a random sample of 450 Twitter users, was excluded from the study; in concrete terms, at least three per cent of a user’s tweets had to contain an offensive word. “It was basically to ensure that this is a certain type of person,” Munger says. He also manually reviewed tweets containing the N-word to weed out cases where the users involved were friends, for instance, or where the term was being used in a less vicious context.
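The paper doesn’t reproduce its code, but the filtering step is simple enough to sketch. What follows is a minimal illustration in Python, assuming each user is represented by a list of their 1,000 most recent tweets; the dictionary entries and the names offensiveness_score and eligible_subjects are placeholders for illustration, not Munger’s actual implementation.

import numpy as np

# Hypothetical stand-in for the study's custom dictionary of offensive terms.
OFFENSIVE_TERMS = {"slur_1", "slur_2"}  # placeholder entries only

def offensiveness_score(tweets):
    # Fraction of a user's recent tweets containing at least one dictionary term.
    if not tweets:
        return 0.0
    hits = sum(any(term in tweet.lower() for term in OFFENSIVE_TERMS)
               for tweet in tweets)
    return hits / len(tweets)

def eligible_subjects(candidates, reference_sample):
    # Keep only candidates at or above the 75th percentile of scores in a
    # reference sample (in the study, 450 randomly drawn Twitter users);
    # that cutoff worked out to roughly three per cent offensive tweets.
    reference_scores = [offensiveness_score(t) for t in reference_sample.values()]
    cutoff = np.percentile(reference_scores, 75)
    return {user: tweets for user, tweets in candidates.items()
            if offensiveness_score(tweets) >= cutoff}

Both candidates and reference_sample here are assumed to be mappings from a username to that user’s recent tweets; the real matching rules would be more careful than a simple substring check.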

Then, the crux of the experiment: Munger created sock-puppet Twitter accounts to respond to his subjects. Within 24 hours of someone in the study tweeting the N-word at someone else, one of the researcher’s shadow accounts would reply to the offending person: “Hey man, just remember that there are real people who are hurt when you harass them with that kind of language.”

Munger’s bots had four distinct identities: a white guy (as indicated by a stereotypically white name—Greg—and a Caucasian cartoon avatar) with fewer than 10 followers; a white guy with more than 500 followers; a black guy (with a name that read as black—Rasheed—and a dark-skinned avatar) with few followers; and a black guy with lots of followers. This was meant to test “in-group” versus “out-group” sanctioning: in essence, we all look for cues about what behaviour is acceptable in our social groups, and we pay more attention to those cues when they come from someone like us than from someone different. Munger also took great pains to ensure his bot accounts looked authentic, and indeed, only three of the 242 subjects he replied to accused him of using a fake account.
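In design terms, the four bots form a simple two-by-two grid: the race the account signals, crossed with its follower count. A rough sketch of the assignment step, assuming plain random allocation (the condition labels and the helper assign_condition are illustrative; only the sanctioning message is quoted from the study, and the paper’s actual randomization procedure may differ):

import itertools
import random

# The four bot identities: race signal crossed with follower count.
CONDITIONS = [
    {"bot_race": race, "bot_followers": followers}
    for race, followers in itertools.product(("white", "black"), ("low", "high"))
]

# The sanctioning message sent to subjects, as quoted in the study.
SANCTION = ("Hey man, just remember that there are real people who are hurt "
            "when you harass them with that kind of language.")

def assign_condition(subject_id, rng=random.Random(0)):
    # Illustrative random assignment of a subject to one of the four bots.
    return rng.choice(CONDITIONS)

print(assign_condition("subject_001"), SANCTION)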

MORE: The social-media status quo isn’t working

Munger hypothesized that the most effective messages to racist white users would come from someone white (like them) who had a lot of followers—or in the parlance of the study, “high status” in terms of perceived influence on Twitter. That proved true: while he saw some reduction in subsequent racist tweets from subjects across all conditions in the study, the only users who showed statistically significant changes in behaviour were those who got called out by white, high-follower Twitter accounts. The 50 subjects in that group used the N-word an estimated 186 fewer times in the month after treatment. That amounts to 0.3 fewer utterances of that word, per user, per day.

Munger measured the effects on racist tweeters’ behaviour one week, two weeks and one month out. Interestingly, he saw smaller-than-expected effects in the immediate aftermath of a user being called out, even if they later toned down their racist language. The study author believes “reactance” is what underlies that pattern. “If you feel like someone is trying to constrain your choices, you can react very negatively against that,” he says. “For some of the subjects—though they did ultimately change their behaviour to use less racist language—initially they were defensive, and in doing so, they yelled at bots a lot.” One particularly dedicated troll took an all-caps run at a bot twice, using language that would, frankly, render the sentences gibberish if they were reproduced here with the most vulgar language replaced with asterisks.

MORE: Web anonymity—once welcomed on social media—is now a huge liability

One of the most fascinating aspects of Munger’s study is how anonymity figured into it. He hypothesized that anonymous accounts would be less affected by “treatment” than those with a real name or photo attached, because he figured they just didn’t care what anyone thought. It turned out the opposite was true: only anonymous Twitter users reduced their racist abuse after being told off. “It was in fact the people who were using their real identities online who were not paying attention to what people told them to do, probably because they’re willing to put their real name associated with this sort of behaviour,” Munger says. “They’re pretty clearly committed to it.”

This can also be explained by the “social identity model of deindividuation effects,” or “SIDE theory,” Munger explains. That tells us that as people’s individual identities become less prominent—by, for instance, operating as a nameless, faceless account on Twitter or elsewhere online—their sense of group belonging becomes more important to them. That makes them more susceptible to messages about group norms or anything else that shores up their identity as part of a subculture.

And that’s another area where Munger’s paper feels like a deep-dive into the psyche and conduct of the darkest underbelly of Twitter. “The reason a lot of online groups use extreme language is because it’s useful for them to define their group identities this way. It’s a signal to outsiders,” he says. “Think about gang members who have their gang sign tattooed on their face…to a lesser extent, the use of extreme language online serves a similar function.”

Munger hasn’t tested this, but he strongly suspects that calling someone out with a more direct, scolding message such as “You’re being racist” wouldn’t be as effective. He carefully calibrated his sanctioning message to be low-key and collegial in tone, right down to the “Hey man” opening, which was meant to invoke the shared masculinity of the racist tweeter and the faux account calling them out.

Of course, Twitter being Twitter, even that phrasing became fodder for more obnoxiousness. “I think that might have been a mistake,” Munger says, chuckling. “Many of the subjects actually made fun of me for that kind of language.”