Twitter is betting on algorithms to punish trolls before you report them

Twitter announced a handful of new updates and tools on Wednesday that further its mission to make the platform less hellish to hang out on. And in classic Silicon Valley fashion, what's an announcement without a nod to a beloved algorithm?

Twitter's vice president of engineering, Ed Ho, listed a number of user-facing features, such as an expanded mute feature and more filtering options. He also described behind-the-scenes efforts the company is making to mitigate harassment before users need to report it, an effort that, as Ho noted, depends on an algorithm.

There's a delicate balance to strike when it comes to leaning on algorithms to combat abuse online. Lean too heavily and you risk overstepping, as with Facebook's censorship tools: machines miss the nuances of human behavior. But relying solely on humans is impossible; the flood of reported abuse is likely too much for any team to handle without algorithmic intervention to streamline the process.

And as researchers from the University of Washington recently pointed out in a new report, trolls can outsmart the machines.

"Machine learning systems are generally designed to yield the best performance in benign settings. But in real-world applications, these systems are susceptible to intelligent subversion or attacks," said Radha Poovendran, chair of the UW electrical engineering department and director of the Network Security Lab, according to a press release. "We wanted to demonstrate the importance of designing these machine learning tools in adversarial environments. Designing a system with a benign operating environment in mind and deploying it in adversarial environments can have devastating consequences."
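Poovendran's point is easy to see with a toy example. Below is a minimal, hypothetical sketch (not Twitter's actual system; the blocklist and function name are invented for illustration) of a naive keyword-based abuse filter, and the kind of trivial character substitution that lets a troll slip right past it:

```python
# Toy blocklist filter, invented for illustration -- not a real
# anti-abuse model. A single swapped character defeats it.

ABUSIVE_WORDS = {"troll", "idiot"}  # hypothetical blocklist

def flags_as_abusive(text: str) -> bool:
    """Return True if any blocklisted word appears in the text."""
    words = text.lower().split()
    return any(w.strip(".,!?") in ABUSIVE_WORDS for w in words)

print(flags_as_abusive("you are an idiot"))  # True: caught
print(flags_as_abusive("you are an id1ot"))  # False: evades the filter
```

A system designed only for this "benign" setting never sees the adversarial input coming, which is exactly the failure mode the UW researchers warn about.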

So it's important to take anti-harassment tools dependent on an algorithm with a grain of salt — but as Ho points out in the blog post, "Since these tools are new we will sometimes make mistakes, but know that we are actively working to improve and iterate on them everyday."

Twitter's latest round of tools to wield against the trolls:

Twitter wants to flag abusive accounts before you have to. Ho said in the blog post that the company will identify accounts that are "engaging in abusive behavior" and then limit the functionality of those accounts "for a set amount of time." Twitter will determine which accounts are violating its rules based on its algorithms.

If effective, this lightens the load for users who endure constant targeted harassment: instead of having to report each instance of abuse, Twitter might spot it and act first.

Users will also get more filtering options in their notifications. This sounds a lot like a more granular quality filter: you'll be able to remove notifications from accounts that have no profile photo (bye, eggs!) or that have unverified email addresses or phone numbers.

More filtering options (Twitter)

Another feature Twitter is rolling out is an expanded mute feature. Mic has reported on the ineffectiveness of the mute feature as a means of cleansing your feed of targeted abuse: not only does it require users to manually input every abusive word and phrase, but it's a feature trolls can easily work around. The updated mute feature will let users mute keywords, phrases and entire conversations from their home timeline, and set how long the mute lasts, whether for a day or forever. Up to you! It's a helpful feature for a more controlled user experience, but it's not a great standalone tool against unwanted attention.

More muting options (Twitter)
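As a rough illustration of how a keyword mute with an expiry might work, here's a minimal, hypothetical sketch in Python. The class and method names are invented; this is not Twitter's implementation:

```python
# Hypothetical per-user mute list with optional expiry, loosely
# matching the behavior described above (mute for a day, or forever).
import time

class MuteList:
    def __init__(self):
        # keyword -> expiry timestamp; None means muted forever
        self._muted = {}

    def mute(self, keyword, duration_seconds=None):
        expires = None if duration_seconds is None else time.time() + duration_seconds
        self._muted[keyword.lower()] = expires

    def is_muted(self, text):
        now = time.time()
        # Drop expired entries, then check for any remaining keyword.
        self._muted = {k: v for k, v in self._muted.items()
                       if v is None or v > now}
        lowered = text.lower()
        return any(k in lowered for k in self._muted)

ml = MuteList()
ml.mute("spoilers", duration_seconds=86_400)  # mute for one day
ml.mute("gamergate")                          # mute forever
print(ml.is_muted("No spoilers please"))      # True
```

Note that this sketch shares the weakness the article describes: the user has to supply every keyword, and a troll who misspells a muted word sails through.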

What many frequent targets of online harassment will likely find most exciting is the added transparency around the reporting process.

Game developer Brianna Wu, a regular target of Gamergate harassment, told Mic in February that she believed Twitter needed an appeal process for when Twitter's safety team mistakenly gives a pass to accounts clearly violating the Terms of Service.

She said that Twitter needs to develop a more transparent policy about what happens when its machine learning systems make a mistake. "There are things happening to me literally every day that are blatant violations of the ToS, and I know if Twitter just gives it a pass the first time, nothing is going to happen," she said.

More transparency around the reporting process (Twitter)

While Twitter isn't offering up transparency around an appeal process, it does plan to be more open regarding the status of reported content.

"You will be notified when we've received your report and informed if we take further action," Ho wrote in the blog post. "This will all be visible in your notifications tab on our app."

It's clear Twitter is mining its own platform to better understand what its users want. That was evident in its quick rollback of a feature that eliminated notifications when users are added to a new list: swift backlash led to the feature's reversal the same day it was announced. Twitter's product managers also tweaked its algorithms to ensure more inclusive recommended lists after a science and tech journalist pointed out that its science and tech list didn't include any women.

It's refreshing to see Twitter finally approaching harassment with a sense of urgency. It's just too bad that aggressiveness wasn't around when the company was looking for buyers.