Lisa Machnig


Combating hate speech with artificial intelligence

Algorithms that warn against fake news or punishable offenses - a far-off scenario or reality? We spoke to experts about what AI can do in terms of tackling hate speech online.



Fishing for (anti) compliments: neural networks capture hateful messages

For several years, Deutsche Telekom has been researching innovative technology trends such as AI. According to Tanja Hagemann, data scientist at Telekom Innovation Laboratories (T-Labs), there are several approaches when it comes to using AI to tackle online hate speech: “The first step is to determine whether the hateful content is written, spoken, or in the form of an image. The next step is to provide an automated response to these hateful messages.” But how does this work? With neural networks.

Neural networks are a form of machine learning. They work in a similar manner to nerve cell connections in our brain. Instead of neurons and synapses, data nodes and weighted connections communicate with each other. If a neural network receives data in written or spoken form, it adjusts its connections and learns to classify the information and, where necessary, identify the data as a hateful message.
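
How "adjusting connections" works can be shown with a deliberately tiny sketch: a single artificial neuron that learns one weight per word. The example texts, the vocabulary, and the names `to_inputs`, `predict`, and `train` are invented for illustration; real systems use far larger networks and data sets, and nothing here represents the T-Labs approach.

```python
import math

# Hypothetical toy data, invented for illustration: 1 = hateful, 0 = harmless.
EXAMPLES = [
    ("you are all worthless idiots", 1),
    ("get lost you worthless people", 1),
    ("thank you for the helpful answer", 0),
    ("have a nice day you all", 0),
]

# One input node per distinct word in the examples.
VOCAB = sorted({word for text, _ in EXAMPLES for word in text.split()})

def to_inputs(text):
    """Turn a text into input values: 1.0 if the word occurs, else 0.0."""
    words = set(text.split())
    return [1.0 if word in words else 0.0 for word in VOCAB]

def predict(weights, bias, inputs):
    """Weighted sum of the inputs, squashed into a score between 0 and 1."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-total))

def train(examples, epochs=500, learning_rate=0.1):
    """Nudge every connection weight a little after each example."""
    weights = [0.0] * len(VOCAB)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            inputs = to_inputs(text)
            error = predict(weights, bias, inputs) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

weights, bias = train(EXAMPLES)
```

After training, `predict(weights, bias, to_inputs(text))` returns a score close to 1 for the hateful training texts and close to 0 for the harmless ones. With only four examples, the neuron has of course learned nothing that would generalize.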

Learning without discrimination

You take a suitable system, supply it with data and, voilà, you have super anti-hate speech intelligence. Well, it’s not quite that simple, because whether a message is considered insulting varies from person to person. And if we cannot clearly identify a discriminatory message ourselves, AI won’t be able to either.

T-Labs has long been working with partners from science and the start-up world to research how the trustworthiness of AI systems can be guaranteed in the future. This includes, for example, the explainability of neural networks and the question of which processes ensure that an AI system is free from discrimination. “As the models can only understand what they have previously observed in data, they can – in certain situations – reproduce a very restricted perspective or opinion. When developing AI models, it is therefore essential that awareness is first generated about which values need to be taken into consideration by a model,” explains Tanja Hagemann.

There are many challenges to overcome, and some projects are already tackling them. Here are two practical examples that use AI to combat online hate:

AI case at the Fraunhofer Institute: Providing warnings against fake news 

Prof. Dr. Ulrich Schade from the Fraunhofer Institute for Communication, Information Processing, and Ergonomics (FKIE) is working on natural language processing (NLP), in which natural language is encoded and then processed by a computer using rules and algorithms. Prof. Schade is working on a tool that is aimed at warning against fake news. “Although the tool does not perform a fact check itself, it is able to estimate whether the data could be fake news based on the characteristics of the language, such as word choice and the occurrence of specific errors, as well as by looking at metadata such as when and how often something is posted.”

Supervised machine learning is used here. “‘Supervised’ means that the tool – on the basis of a huge number of examples – learns to differentiate between fake and real news. This means that examples of fake and correct news are needed in advance. Using these examples, the tool selects and learns characteristics that can be used as markers for the respective classification,” explains Prof. Schade.
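
Learning markers from labeled examples can be sketched in a few lines. The sketch below uses a naive Bayes word model, chosen here only because it is simple; the article does not say which algorithm the Fraunhofer tool uses. The headlines and the names `train`, `marker_score`, and `classify` are invented for illustration, and metadata such as posting time is not modeled.

```python
import math
from collections import Counter

# Hypothetical labeled headlines, invented for illustration.
TRAINING = [
    ("shocking secret cure they hide from you", "fake"),
    ("you will not believe this shocking trick", "fake"),
    ("miracle cure doctors hate exposed", "fake"),
    ("city council approves new budget", "real"),
    ("central bank leaves interest rate unchanged", "real"),
    ("council publishes annual budget report", "real"),
]

def train(examples):
    """Count how often each word occurs in each class."""
    word_counts = {"fake": Counter(), "real": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        word_counts[label].update(text.split())
        doc_counts[label] += 1
    vocab = set(word_counts["fake"]) | set(word_counts["real"])
    return word_counts, doc_counts, vocab

def marker_score(word, word_counts, vocab):
    """Log-odds of a word: positive leans 'fake', negative leans 'real'."""
    def prob(label):
        total = sum(word_counts[label].values())
        # Add-one smoothing so unseen words never give a zero probability.
        return (word_counts[label][word] + 1) / (total + len(vocab))
    return math.log(prob("fake") / prob("real"))

def classify(text, word_counts, doc_counts, vocab):
    """Sum the marker scores of all known words, plus the class prior."""
    score = math.log(doc_counts["fake"] / doc_counts["real"])
    score += sum(marker_score(word, word_counts, vocab)
                 for word in text.split() if word in vocab)
    return "fake" if score > 0 else "real"

word_counts, doc_counts, vocab = train(TRAINING)
```

Sorting the vocabulary by `marker_score` shows which words the model has picked up as markers for each class. The same code would work for any other pair of labels, provided suitable training examples exist.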

Could such a tool also be used to tackle hate speech? Yes, says Prof. Schade. “We are actually developing classifiers that observe a number of texts and divide the individual texts into various classes. One such classification is ‘fake’ vs. ‘real’ news, another is ‘comment with hate speech’ vs. ‘message without hate speech’.”

AI case at the Max Planck Institute: Automatically testing legal culpability 

Dr. Frederike Zufall, a legal expert at the Max Planck Institute for Research on Collective Goods, developed a model that determines whether specific Twitter postings could be deemed offenses under German criminal law. “The ideal target would be for our model to ultimately be able to make a prediction about a new, unknown comment as to whether it represents a criminal offense such as incitement to hatred as per § 130 (1) of the German Criminal Code (StGB).”

This system also uses supervised machine learning. The snag: “From the variety of possible expressions that could represent this type of criminal offense, it’s almost impossible for the model to generalize a pattern that could result in reliable predictions,” explains Dr. Frederike Zufall. The solution? Sub-decisions. The whole thing is broken down into many individual parts. An example of a sub-decision would be an evaluation as to whether a comment is aimed at a certain group of people. Each sub-decision is classified by the system and impacts the overall decision.
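
The idea of breaking one hard question into several narrow ones can be sketched as follows. Only the first sub-decision (is the comment aimed at a group?) comes from the article; the second one, the word lists, and the rule that all sub-decisions must be met are assumptions made for illustration. The keyword checks stand in for trained classifiers, and the result is not a legal assessment.

```python
import re
from typing import Callable, Dict

# Invented word lists, standing in for trained classifiers.
GROUP_TERMS = {"refugees", "immigrants", "pensioners"}
HOSTILE_TERMS = {"attack", "expel", "hunt"}

def words(comment: str) -> set:
    """Lowercase the comment and split it into plain words."""
    return set(re.findall(r"[a-z]+", comment.lower()))

def targets_group(comment: str) -> bool:
    """Sub-decision named in the article: is the comment aimed at a group?"""
    return bool(words(comment) & GROUP_TERMS)

def calls_for_hostility(comment: str) -> bool:
    """Hypothetical second sub-decision, added only for illustration."""
    return bool(words(comment) & HOSTILE_TERMS)

SUB_DECISIONS: Dict[str, Callable[[str], bool]] = {
    "targets_group": targets_group,
    "calls_for_hostility": calls_for_hostility,
}

def evaluate(comment: str) -> Dict[str, bool]:
    """Run every sub-decision, then derive the overall decision from them."""
    results = {name: check(comment) for name, check in SUB_DECISIONS.items()}
    # Assumed combination rule: flag only if every sub-decision is met.
    results["flag_for_review"] = all(results.values())
    return results
```

Because `evaluate` returns each sub-decision alongside the overall result, a human reviewer can see which part of the reasoning triggered a flag, for instance by calling `evaluate("We should expel all refugees!")`.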

Whether fake news, criminal offenses, or hateful comments – algorithm-based data processing offers a great advantage: it is fast. But does this mean AI will soon replace humans in the fight against hate speech?

#TAKEPART in fighting for a network without hate

No Hate Speech

Words must not become a weapon. Deutsche Telekom is fighting for a network without hate in which we treat one another respectfully.