What’s the intersection of social activism, political art and tech? It’s ShhorAI for Aindriya Barua

A high-end bot with a special focus on vernacular languages and the safety of marginalised communities, ShhorAI can identify and remove online hate speech in Hinglish

By Aditi Subramanian

Posted on March 13, 2024

Pranshu, a queer makeup artist, had nearly 22,000 followers on Instagram, but that did not help when the 16-year-old posted a reel wearing a saree on Deepavali day. The reel elicited over 4,000 hate comments, which is believed to have driven the self-taught artist to commit suicide in Ujjain, Madhya Pradesh.

Online bullying is nothing new for the LGBTQIA community, but how can this negativity be checked? The answer lies with Aindriya Barua (26), an Adivasi, queer, neurodivergent AI engineer and artist from rural Tripura, who self-describes as “a machine learning engineer with a paintbrush.”

Most offence-reporting systems are trained on major Western languages, so when someone uses Hinglish, for example, the system cannot identify and remove that content. That is where Barua’s AI-powered bot ShhorAI comes in. ShhorAI focuses especially on vernacular languages and the safety of marginalised communities.

Code-mixing and low-resource languages were the two main problems that Barua had to tackle in developing the AI bot. And there were some very real reasons that made this activist burn the midnight oil to crack them.

Aindriya Barua coding for ShhorAI

In December 2021, a group of coal mine workers from a village in Nagaland were returning home when they were allegedly shot at by the security forces. The Armed Forces Special Powers Act (AFSPA) was talked about yet again, and various groups protested, demanding that it be repealed.

When Barua’s post explaining the history of AFSPA and why it must be repealed went viral, the profile was flooded with hate messages from users who kept tagging others, who in turn posted more hateful comments. “I got five to seven notifications every minute. I was unable to take any action… Initially, I thought of taking screenshots of these messages in case it would come to use in future, but I gave up on this due to the sheer number of messages.”

During this period, comments with Hinglish slurs came from anonymous social media users who threatened to dox Barua and Barua’s family. While Barua deleted these comments and blocked some of the users, keeping up with the mass hate campaign was simply impossible.

Barua understood the vulnerabilities of online spaces on realising that it was not possible to report these comments, direct messages, and social media stories, as Instagram did not identify Hinglish slurs or consider them to be against its rules and guidelines. The hate campaign lasted for several days and ended only when anonymous users harassed Barua into posting a written apology.

The incident traumatised Barua and others who witnessed the hate campaign unfold. While they had seen and heard of such instances, experiencing it first-hand and Instagram’s inability to identify, take down, or censor hate speech made them try to think of solutions to combat the problem.

“Instagram, like many social media platforms, identifies and censors hate speech in English. The situation in India, however, is that hate speech is seldom in English; users use the English keyboard to type in their native languages with different spellings and pronunciations, known as code-mixing,” Barua says.

Code-mixing is complex enough to make it impossible for the existing tools to detect hate speech. Barua decided to solve this problem by building a bot capable of detecting hate speech of this nature. Hinglish hate speech was chosen as the starting point, as Barua had seen it being used against many.
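To illustrate why spelling variation defeats exact keyword matching, here is a minimal Python sketch. It is not ShhorAI’s method (the bot relies on a model trained on collected data, as described below); the function names, the blocklist, and the normalisation rules are all assumptions made for illustration. The mild Hindi word “bakwaas” (nonsense) stands in for an abusive term.

```python
import re

# Hypothetical blocklist with one canonical spelling of a stand-in word.
BLOCKLIST = {"bakwaas"}


def normalise(token: str) -> str:
    """Crude, illustrative normalisation of romanised Hindi spellings."""
    token = token.lower()
    token = re.sub(r"(.)\1+", r"\1", token)  # collapse repeated letters
    token = token.replace("w", "v")          # treat w and v as the same sound
    return token


def exact_match(text: str) -> bool:
    """Flag text only if a token matches a blocklist entry exactly."""
    return any(tok in BLOCKLIST for tok in text.lower().split())


def normalised_match(text: str) -> bool:
    """Flag text if a token matches a blocklist entry after normalisation."""
    normalised_blocklist = {normalise(word) for word in BLOCKLIST}
    return any(normalise(tok) in normalised_blocklist for tok in text.split())


if __name__ == "__main__":
    for comment in ["yeh bakwaas hai", "yeh bakwas hai", "yeh BAKVAAS hai"]:
        print(comment, exact_match(comment), normalised_match(comment))
```

Hand-written rules like these cover only the variants someone thought of in advance, which is one reason a model trained on a large set of real comments is the more plausible route.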

However, the tagged data available for Indian languages was too scarce for the project. This realisation made Barua start from scratch by manually building a vast repository of Hinglish comments to train the AI. To create the dataset, Barua mobilised volunteers online and contacted other marginalised artists and activists who had been victims of mass hate campaigns.

A scraper was built to automatically collect hate comments, while users who spammed hateful messages continued to do so — indirectly helping the project!

Having accumulated India’s most extensive Hinglish dataset, with over 40,000 comments, Barua trained the AI model on it. Barua also studied this repository to understand the ‘logic’ of hate speech despite its apparent irrationality, and mapped a pyramid of marginalisation that is hierarchical in nature. The conclusions noted the intersectionality of marginalisation, which Barua explains through an example.

“The character of hate speech spammed by anonymous users, in my opinion, is never discussed prior. They tend to look at the identity of the person towards whom the hate campaign is directed and type hate comments accordingly. The type of hate comments a dominant caste cishet woman receives, for instance, is different from the type of hate comments a Dalit, queer, and neurodivergent person receives. The more the intersectionality of marginalisation, the worse the hate comments,” Barua says. The research is one of its kind since it analyses online hate and threats of violence in India in detail.

Barua developed the AI bot and implemented it on Reddit to test its ability to detect hate speech in comments, pictures, and videos, including comments that contain emoticons. The bot responds to hate speech in real time by giving a user three warnings, after which the user is banned. Subreddit moderators can, however, configure the number of warnings.
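The warn-then-ban behaviour described above can be sketched in a few lines of Python. This is an illustrative assumption, not ShhorAI’s actual code or the Reddit API: the class name `WarningPolicy`, its method, and the reading that the ban falls on the first violation after the final warning are all hypothetical.

```python
from collections import defaultdict


class WarningPolicy:
    """Hypothetical per-user warning counter with a configurable limit."""

    def __init__(self, max_warnings: int = 3):
        # A moderator could set max_warnings to any positive number.
        self.max_warnings = max_warnings
        self.warnings = defaultdict(int)
        self.banned = set()

    def record_violation(self, user: str) -> str:
        """Call once each time a user's comment is flagged as hate speech."""
        if user in self.banned:
            return "banned"
        if self.warnings[user] < self.max_warnings:
            self.warnings[user] += 1
            return f"warning {self.warnings[user]}/{self.max_warnings}"
        # All warnings are used up, so this violation leads to a ban.
        self.banned.add(user)
        return "banned"


if __name__ == "__main__":
    policy = WarningPolicy()  # default of three warnings
    for _ in range(4):
        print(policy.record_violation("example_user"))
```

Keeping the limit as a constructor parameter mirrors the idea that each community sets its own tolerance; a stricter subreddit might pass `max_warnings=1`.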

Barua is currently in talks with other social media platforms, but the process has been complex. “It is important to note that several such platforms selectively detect hate speech and shadow-ban selective content. This becomes more pronounced in times of conflict and war, wherein extremist and conservative groups are often given space to spread hate with no consequences. In contrast, it becomes impossible for others to report it as hate speech. Meanwhile, the content shared by the latter, who often belong to marginalised communities, is shadow-banned, and are often given several warnings by these platforms.”

Barua hopes to create a more inclusive and safe space online for all through ShhorAI, which directly addresses the elephant in the room. While documenting the development of the bot on the project website (https://www.shhorai.com/), Barua also wishes to expand the research to other native languages.

Barua led India’s first all-girls cybersecurity team, which won third prize at Winja CTF, Nullcon 2018

Barua meeting Judith Ravin, US Consul General, and Political/Economic Chief Virsa Perkins

Art by Aindriya Barua for ShhorAI

The efforts to make online platforms a safe space for the LGBTQIA community have won Barua the second runner-up position at an international hackathon organised by the United Nations Population Fund on preventing and addressing gender-based violence. Barua was also invited for a detailed discussion on the intersection of social activism, political art, and AI as a means of development with the then US Consul General in Chennai, Judith Ravin, and Political/Economic Chief Virsa Perkins.

In fact, right from the beginning, Barua had devised ways to combine knowledge from different fields. Interested in technology — especially Natural Language Processing (NLP) — art, and politics, Barua produced innovative pieces using art and technology to advocate for social justice, be it feminist issues or discourses on contemporary news.

“I always fondly remember the way I realised my interest in coding. It all started in middle school when my Computer Science teacher introduced me to C… I spent countless afternoons in my youth fully absorbed in painting and learning to code,” Barua says. The first computer project that Barua took up was an attempt to help Barua’s parents, who were encountering daily challenges in irrigating their farms.

Barua engineered an IoT device featuring soil moisture sensors, which relayed data to code that automated the water taps. On the project’s successful completion, Barua discovered the joy of creating a real impact and made it a mission to create more such software capable of bringing about positive change. Having recognised software development as a life’s passion, Barua enrolled in the Bachelor of Technology in Computer Science and Engineering (BTech CSE) course at an engineering college in Kerala. In line with the staunch belief that innovation should be accessible to all, Barua developed an interest in FOSS (Free and Open Source Software) and shared projects on GitHub. However, the college failed to fulfil expectations, as the institution was highly restrictive for people of marginalised genders and castes.

The classrooms were dominated by cishet men from dominant castes, dictating the dynamics of the space wherein those from marginalised genders were neither given due recognition for the work they did nor given a space to voice their thoughts and opinions. Unable to express grievances, and feeling alienated from peers, Barua channelled all frustration into art and the newfound passion for NLP.

Barua identified a research gap in the field of NLP specific to Indian languages. Despite a broader trend of progress in AI and NLP, there was a dearth of research on NLP applied to Indian languages. This prompted Barua to write a peer-reviewed paper analysing contextual and non-contextual models for Hindi Named Entity Recognition (NER).

Around the same time, Barua started an Instagram page to share self-made artworks, primarily focusing on personal struggles caused by systemic means of oppression, including casteism, sexism and homophobia. Art became Barua’s tool to convey opinions on contemporary events, while technology became more accessible due to the ability to code. Both these engagements finally led Barua to Project ShhorAI.

About the author

Aditi Subramaniam is a freelance writer based in Bangalore. She is currently pursuing her Master’s degree in Sociology at the Delhi School of Economics and has a background in Journalism. She is interested in questions of ecology, gender and development.