How Google’s Jigsaw Is Trying to Detoxify the Internet


By Rob Marvin, Associate Features Editor, PC Magazine

The internet can feel like a toxic place. Trolls descend on comment sections and social media threads to hurl hate speech and harassment, turning potentially enlightening discussions into ad hominem attacks and group pile-ons. Expressing an opinion online often doesn’t seem worth the resulting vitriol.

Massive social platforms—including Facebook, Twitter, and YouTube—admit they can’t adequately police these issues. They’re in an arms race with bots, trolls, and every other undesirable who slips through content filters. Humans are not physically capable of reading every single comment on the web; those who try often regret it.

Tech giants have experimented with various combinations of human moderation, AI algorithms, and filters to wade through the deluge of content flowing through their feeds each day. Jigsaw is trying to find a middle ground. The Alphabet subsidiary and tech incubator, formerly known as Google Ideas, is beginning to prove that machine learning (ML) fashioned into tools for human moderators can change the way we approach the internet’s toxicity problem.

Perspective is an API developed by Jigsaw and Google’s Counter Abuse Technology team. It uses ML to spot abuse and harassment online, and scores comments based on the perceived impact they might have on a conversation in a bid to make human moderators’ lives easier.
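To make the scoring idea concrete, here is a minimal sketch of what asking for a toxicity score and reading it back might look like. The endpoint URL, the `TOXICITY` attribute name, and the response field names are assumptions recalled from Perspective's public documentation, not details from this article, so check them against the current API reference. The sample response is hand-written and no network call is made.

```python
import json

# Assumed endpoint, recalled from Perspective's public docs; verify before use.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text: str, attribute: str = "TOXICITY") -> str:
    """Serialize the JSON body for a single-comment scoring request."""
    body = {
        "comment": {"text": text},
        "requestedAttributes": {attribute: {}},
    }
    return json.dumps(body)


def extract_score(response: dict, attribute: str = "TOXICITY") -> float:
    """Pull the summary score (a value between 0 and 1) out of a response."""
    return response["attributeScores"][attribute]["summaryScore"]["value"]


# Hand-written stand-in for a real response (the 0.92 is invented).
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.92, "type": "PROBABILITY"}}
    }
}

payload = build_request("You are an idiot.")
score = extract_score(sample_response)
print(payload)
print(score)
```

In a real integration the payload would be sent to the endpoint with an API key, and the returned score would be shown to a moderator rather than acted on automatically.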

Perspective Amidst the Shouting Matches

The open-source tech was first announced in 2017, though development on it started a few years earlier. Some of the first sites to experiment with Perspective have been news publications such as The New York Times and sites such as Wikipedia. But recently, Perspective has found a home on sites like Reddit and comment platform Disqus (which is used on

CJ Adams, product manager for Perspective, said the project wanted to examine how people’s voices are silenced online. Jigsaw wanted to explore how targeted abuse or a general atmosphere of harassment can create a chilling effect, discouraging people to the point where they feel it’s not worth the time or energy to add their voice to a discussion. How often have you seen a tweet, post, or comment and chosen not to respond because fighting trolls and getting Mad Online just isn’t worth the aggravation?

“It’s very easy to ruin an online conversation,” said Adams. “It’s easy to jump in, but one person being really mean or toxic could drive other voices out. Maybe 100 people read an article or start a debate, and often you end up with the loudest voices in the room being the only ones left, in an internet that’s optimized for likes and shares. So you kind of silence all these voices. Then what’s defining the debate is just the loudest voice in the room—the shouting match.”

Jigsaw and Google

It’s been a rough year for Jigsaw’s sister company, Google, which has grappled with data security issues, employee pushback on its involvement in projects for the Pentagon and China, and revelations over its handling of sexual harassment. Not to mention a contentious Congressional hearing in which CEO Sundar Pichai was grilled by lawmakers.

Over at Jigsaw, Alphabet’s altruistic incubator, things have been a bit less dramatic. The team has spent its time examining more technical forms of censorship, such as DNS poisoning with its Intra app and DDoS attacks with Project Shield. With Perspective, the goal is more abstract. Rather than using machine learning to determine what is or isn’t against a given set of rules, Perspective’s challenge is an intensely subjective one: classifying the emotional impact of language.

To do that, you need natural language processing (NLP), which breaks down a sentence to spot patterns. The Perspective team is confronting problems like confirmation bias, groupthink, and harassing behavior in an environment where technology has amplified their reach and made them harder to solve.

AI Is ‘Wrong and Dumb Sometimes’

Improving online conversations with machine learning isn’t a straightforward task. It’s still an emerging field of research. Algorithms can be biased, machine learning systems require endless refinement, and the hardest and most important problems are still largely unexplored.

The Conversation AI research group, which created Perspective, started by meeting with newspapers, publishers, and other sites hosting conversations. Some of the first sites to experiment with the technology were The New York Times, Wikipedia, The Guardian, and The Economist.

In 2017, the team opened up the initial Perspective demo via a public website as part of an alpha test, letting people type millions of vile, abusive comments into the site. It was kind of like Microsoft’s infamous failed Tay chatbot experiment, except instead of tricking the bot into replying with racist tweets, Jigsaw used the crowdsourced virulence as training data to feed its models, helping to identify and categorize different types of online abuse.

The initial public test run did not go smoothly. Wired’s “Trolls Across America,” which broke down toxicity in commenting across the country based on Perspective scoring, showed how the algorithm inadvertently discriminated against groups by race, gender identity, or sexual orientation.

Adams was candid about the fact that Perspective’s initial testing revealed major blind spots and algorithmic bias. Like Amazon’s scrapped recruiting tool, which trained on decades of flawed job data and developed an inherent bias against female applicants, the early Perspective models had glaring flaws because of the data on which they were trained.

“In the example of frequently targeted groups, if you looked at the distribution across the comments in the training data set, there were a vanishingly small number of comments that included the word ‘gay’ or ‘feminist’ and were using it in a positive way,” explained Adams. “Abusive comments use the words as insults. So the ML, looking at the patterns, would say, ‘Hey, the presence of this word is a pretty good predictor of whether or not this sentiment is toxic.’”

For example, the alpha algorithm might have mistakenly labeled statements like “I’m a proud gay man,” or, “I’m a feminist and transgender” with high toxicity scores. But the publicly transparent training process—while painful—was an invaluable lesson for Jigsaw in the consequences of unintended bias, Adams said.
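The failure Adams describes can be reproduced with a toy model. The sketch below is not Perspective's algorithm. It is a deliberately naive word-counting scorer trained on six invented comments, in which an identity term appears only in abusive examples. The training data, the `word_rate` and `score` helpers, and the 0.5 default for unseen words are all assumptions made for illustration.

```python
from collections import Counter

# Invented training set, skewed the way Adams describes: the identity term
# appears only in comments labeled toxic (1), never in benign ones (0).
TRAIN = [
    ("you stupid gay loser", 1),
    ("shut up gay idiot", 1),
    ("what a stupid take", 1),
    ("what a great article", 0),
    ("i agree with this point", 0),
    ("thanks for the thoughtful reply", 0),
]

toxic_counts, total_counts = Counter(), Counter()
for text, label in TRAIN:
    for word in set(text.split()):
        total_counts[word] += 1
        toxic_counts[word] += label


def word_rate(word: str) -> float:
    """Share of training comments containing `word` that were toxic."""
    if total_counts[word] == 0:
        return 0.5  # unseen word: treat as neutral
    return toxic_counts[word] / total_counts[word]


def score(text: str) -> float:
    """Average the per-word toxic rates across the comment."""
    rates = [word_rate(w) for w in text.lower().split()]
    return sum(rates) / len(rates) if rates else 0.5


with_term = score("i am a proud gay man")
without_term = score("i am a proud man")
print(word_rate("gay"), with_term, without_term)
```

Because the model has never seen the word used positively, a harmless self-description scores higher than the same sentence without it. The remedy lies in the training data, which needs benign uses of the term.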

When training machine-learning models on something as distressing and personal as online abuse and harassment, the existence of algorithmic bias also underscores why AI alone is not the solution. Social companies such as Facebook and YouTube have both touted their platforms’ AI content-moderation features only to backtrack amid scandal and course-correct by hiring thousands of human moderators.

Jigsaw’s tack is a hybrid of the two. Perspective isn’t AI algorithms making decisions in a vacuum; the API is integrated into community-management and content-moderation interfaces to serve as an assistive tool for human moderators. Perspective engineers describe moderating hate speech with and without ML using a haystack analogy: AI helps by automating the sorting process, whittling vast haystacks down while still giving humans the final say over whether a comment is considered abuse or harassment.

“It’s this new capability of ML,” said Adams. “People talk about how smart AI is, but they often don’t talk about all the ways it’s wrong and dumb sometimes. From the very beginning, we knew this was going to make a lot of mistakes, and so we said, ‘This tool is helpful for machine-assisted human moderation, but it is not ready to be making automatic decisions.’ But it can take the ‘needle in a haystack’ problem finding this toxic speech and get it down to a handful of hay.”
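The “handful of hay” idea amounts to a triage step: the model ranks comments and a person decides. The following is a minimal sketch of such a review queue, assuming comments have already been scored. The `Scored` type, the `review_queue` function, the threshold and limit values, and every score shown are hypothetical, chosen only to illustrate machine-assisted human moderation.

```python
from typing import NamedTuple


class Scored(NamedTuple):
    text: str
    toxicity: float  # model score in [0, 1]; the values below are invented


def review_queue(comments, threshold=0.7, limit=3):
    """Return the highest-scoring comments for a human to review.

    Nothing is removed automatically: the score only decides what a
    moderator looks at first. threshold and limit are hypothetical knobs.
    """
    flagged = [c for c in comments if c.toxicity >= threshold]
    flagged.sort(key=lambda c: c.toxicity, reverse=True)
    return flagged[:limit]


comments = [
    Scored("Thanks, this was a helpful read.", 0.02),
    Scored("You people are all morons.", 0.91),
    Scored("I disagree with the second point.", 0.08),
    Scored("Nobody wants you here, get lost.", 0.84),
    Scored("This policy is a disaster.", 0.35),
]

queue = review_queue(comments)
for c in queue:
    print(f"{c.toxicity:.2f}  {c.text}")
```

Keeping the final decision outside the function mirrors the design choice Adams describes: the model shrinks the haystack, and the moderator still picks out the needle.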

Read the source post in PC Magazine.