Hi Friends,

Even as I launch this today ( my 80th Birthday ), I realize that there is yet so much to say and do. There is just no time to look back, no time to wonder, "Will anyone read these pages?"

With regards,
Hemen Parekh
27 June 2013

Now, as I approach my 90th birthday ( 27 June 2023 ), I invite you to visit my Digital Avatar ( www.hemenparekh.ai ) and continue chatting with me, even when I am no longer here physically

Monday, 11 November 2024

Half-way to halting Hate Speech ?

 


 

For the past few years, people have been getting less and less tolerant of each other. In India, this problem has grown worse with one election or another taking place around the year, when politicians engage in abusing one another using foul language.

About a year ago, this prompted me to send the following blog ( by email ) to our Cabinet Ministers / State Chief Ministers, in which I proposed a technology-based solution, as described below :

Ø  Hate Speech : Prevention Mechanism …………………….. 01 Dec 2023

Extract :

STEP  A :

We must prepare a list ( dataset ) of words / phrases denoting hate. This list will comprise words which have the following connotations :

“ Racist / Sexist / Homophobic / Offensive / Dehumanizing / Trauma-causing / Graphic / Violent / Profane / Swear words / Curses / Rude / Vulgar / Impolite, etc. “

As per ChatGPT ( and also BARD ), these words would fall under the following broad categories ( a small illustrative sketch of such a dataset follows the list ) :

1.     Racist Language:

 Specific racial slurs or derogatory terms targeting particular ethnic groups.

 2.     Sexist Language:

 Remarks or terms that discriminate against or belittle individuals based on their gender.

 3.     Homophobic or Transphobic Language:

 Derogatory comments or slurs aimed at individuals based on their sexual orientation or gender identity.

 4.     Offensive or Dehumanizing Language:

 Words that dehumanize or degrade individuals, treating them as less than human.

 5.     Trauma-Inducing Language:

 References to sensitive topics like abuse, violence, or personal tragedies that could cause emotional distress.

 6.     Graphic or Violent Language:

 Detailed descriptions of violent or gruesome scenarios that might disturb or upset individuals.

 7.     Profanity, Swear Words, and Curses:

 Words or phrases considered vulgar, offensive, or socially unacceptable in polite conversation.

 8.     Rude or Impolite Language:

 Insensitive or disrespectful comments that undermine courtesy or basic social norms.

 9.     Disinformation or Misinformation:

 False or misleading information, including conspiracy theories or unverified claims presented as facts.

 

10.  Threatening or Aggressive Language:

 Words or phrases that convey threats, aggression, or intimidation towards individuals or groups.
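
Here is a minimal sketch ( in Python ) of how such a dataset might be organised. The category names follow the list above, but the placeholder terms and the lookup function are purely hypothetical, not a real lexicon :

```python
# Illustrative sketch only: the category names follow STEP A above, but the
# placeholder terms and the lookup function are hypothetical, not a real lexicon.

HATE_LEXICON = {
    "racist":         {"<racial-slur-1>", "<racial-slur-2>"},
    "sexist":         {"<sexist-term-1>"},
    "homophobic":     {"<homophobic-slur-1>"},
    "dehumanizing":   {"<dehumanizing-term-1>"},
    "trauma":         {"<trauma-reference-1>"},
    "graphic":        {"<graphic-violence-term-1>"},
    "profanity":      {"<swear-word-1>", "<curse-word-1>"},
    "rude":           {"<impolite-term-1>"},
    "disinformation": {"<debunked-claim-1>"},
    "threatening":    {"<threat-phrase-1>"},
}

def categories_found(text: str) -> dict[str, int]:
    """Count how many entries from each category appear in a speech transcript."""
    words = text.lower().split()
    return {
        category: sum(words.count(term) for term in terms)
        for category, terms in HATE_LEXICON.items()
    }
```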

 

STEP B  :

Mandate all manufacturers of Mikes ( used in public gatherings ) to incorporate IoT-enabled SENSORS, having the following capabilities ( a rough sketch follows the list ) :

#  Capture the entire speech and convert it into text ( “ Speech to Text “ software is already commonplace )

#  Transmit ( over the internet ) the entire TEXT to a CENTRAL SERVER
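
A rough sketch of what the sensor-side logic might look like, using the widely available speech_recognition and requests packages. The central-server URL and device id are assumptions for illustration; an actual mike-embedded sensor would run equivalent firmware :

```python
# Sketch only: the central-server URL and device id are hypothetical placeholders;
# an actual mike-embedded sensor would run equivalent firmware.
import requests
import speech_recognition as sr

CENTRAL_SERVER_URL = "https://central-server.example.gov.in/speech"   # placeholder
DEVICE_ID = "mike-sensor-001"                                         # placeholder

def capture_and_transmit() -> None:
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source, phrase_time_limit=30)   # capture a chunk of speech
    try:
        text = recognizer.recognize_google(audio)                 # "Speech to Text"
    except sr.UnknownValueError:
        return                                                    # nothing intelligible heard
    # Transmit the transcribed TEXT to the CENTRAL SERVER over the internet
    requests.post(CENTRAL_SERVER_URL, json={"device_id": DEVICE_ID, "text": text}, timeout=10)

if __name__ == "__main__":
    while True:   # keep streaming transcript chunks for the duration of the speech
        capture_and_transmit()
```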

 

STEP C :

The Central Server will subject this text to an AI-based analysis.

Based on the presence of each “ Hate Word “, its frequency and its context of occurrence, AI will assign a rating to this text as follows ( a simplified sketch of this rating logic appears further below ) :

1 ……………. Distasteful

2 ……………. Disrespectful

3 ……………. Offensive

4 ……………. Profane / Vulgar

5 ……………. Inciting violence / Outright Hate

 

AI will :

#   Immediately publish its rating on the website >  www.HatePrevent.gov.in
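
A simplified sketch of this server-side rating logic. Here the “ AI “ is reduced to looking at per-category hate-word counts ( as produced by the STEP A sketch ) purely for illustration; the thresholds are invented, and a real system would use a trained classifier instead :

```python
# Sketch only: the thresholds mapping category counts to ratings are invented
# for illustration; a production system would use a trained classifier instead.

RATING_LABELS = {
    1: "Distasteful",
    2: "Disrespectful",
    3: "Offensive",
    4: "Profane / Vulgar",
    5: "Inciting violence / Outright Hate",
}

def rate_speech(counts: dict[str, int]) -> int:
    """Map per-category hate-word counts (as in the STEP A sketch) to a 1-5 rating."""
    if counts.get("threatening") or counts.get("graphic"):
        return 5
    if counts.get("profanity"):
        return 4
    if counts.get("racist") or counts.get("sexist") or counts.get("homophobic"):
        return 3
    if counts.get("dehumanizing") or counts.get("disinformation"):
        return 2
    if sum(counts.values()) > 0:
        return 1
    return 0  # nothing objectionable found

def publish_rating(device_id: str, rating: int) -> None:
    # Stand-in for publishing the rating immediately on the public website proposed above
    label = RATING_LABELS.get(rating, "Clean")
    print(f"Speech from {device_id} rated {rating} ({label})")
```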

 

STEP D :

In all cases rated “ 3 / 4 / 5 “, the Server will send “ instructions “ to the Mike-based SENSOR to go MUTE

Such instructions could be sent even as the speech is proceeding !
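
A sketch of this last step: if the rating reaches 3 or above, the server pushes a MUTE command back to the sensor while the speech is still going on. The broker address, topic name and payload are all assumptions for illustration ( paho-mqtt is a common IoT messaging library ) :

```python
# Sketch only: broker address, topic name and payload are hypothetical placeholders.
import paho.mqtt.publish as publish   # a common IoT messaging library

MQTT_BROKER = "broker.hateprevent.example.in"   # placeholder

def maybe_mute(device_id: str, rating: int) -> None:
    """For ratings 3 / 4 / 5, push a MUTE command to the mike-based sensor."""
    if rating >= 3:
        publish.single(
            topic=f"sensors/{device_id}/commands",
            payload="MUTE",
            hostname=MQTT_BROKER,
        )
```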

 

Sure, this entire suggestion may take a few years to get implemented, even if agreed to by all political parties / the Election Commission / the Supreme Court

 

I believe in harnessing technology ( Sensors – IoT – AI ), since humans are unlikely to constrain themselves on their own

 

 

Now, French AI startup Mistral AI has come up with a very similar suggestion, as follows :

MistralAI takes on OpenAI with new moderation API, tackling harmful content in 11 languages

{  Venture Beat   /  07 Nov 2024  }

Extract :

French artificial intelligence startup Mistral AI launched a new content moderation API on Thursday, marking its latest move to compete with OpenAI and other AI leaders while addressing growing concerns about AI safety and content filtering.

The new moderation service, powered by a fine-tuned version of Mistral’s Ministral 8B model, is designed to detect potentially harmful content across nine different categories, including sexual content, hate speech, violence, dangerous activities, and personally identifiable information. The API offers both raw text and conversational content analysis capabilities.

“Safety plays a key role in making AI useful,” Mistral’s team said in announcing the release. “At Mistral AI, we believe that system level guardrails are critical to protecting downstream deployments.”

The launch comes at a crucial time for the AI industry, as companies face mounting pressure to implement stronger safeguards around their technology. Just last month, Mistral joined other major AI companies in signing the UK AI Safety Summit accord, pledging to develop AI responsibly.

The moderation API is already being used in Mistral’s own Le Chat platform and supports 11 languages, including Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. This multilingual capability gives Mistral an edge over some competitors whose moderation tools primarily focus on English content.

“Over the past few months, we’ve seen growing enthusiasm across the industry and research community for new LLM-based moderation systems, which can help make moderation more scalable and robust across applications,” the company stated.

The company’s technical approach also shows sophistication beyond its years. By training its moderation model to understand conversational context rather than just analyzing isolated text, Mistral has created a system that can potentially catch subtle forms of harmful content that might slip through more basic filters.

The moderation API is available immediately through Mistral’s cloud platform, with pricing based on usage. The company says it will continue to improve the system’s accuracy and expand its capabilities based on customer feedback and evolving safety requirements.

Mistral’s move shows how quickly the AI landscape is changing. Just a year ago, the Paris-based startup didn’t exist. Now it’s helping shape how enterprises think about AI safety. In a field dominated by American tech giants, Mistral’s European perspective on privacy and security might prove to be its greatest advantage.
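
For readers curious how such a moderation API is actually called, here is a minimal sketch using Mistral's Python client. The method name ( classifiers.moderate ), the model name ( mistral-moderation-latest ) and the response fields reflect my reading of Mistral's public documentation at the time of writing, so treat them as assumptions to be verified against the current docs :

```python
# Sketch only: method name, model name and response fields reflect my reading of
# Mistral's public docs at the time of writing; verify against the current docs.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.classifiers.moderate(
    model="mistral-moderation-latest",
    inputs=["<speech transcript or post to be checked>"],
)

# Each result carries per-category flags (e.g. hate_and_discrimination, violence_and_threats)
for result in response.results:
    print(result.categories)
```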

In my 91 years, I have never heard politicians using as much “ abusive / corrosive / derogatory / insulting “ language as has been used in election rallies during the past 12 months

I urge the Election Commission to become much more pro-active ( a la T N Seshan ) and write to all political parties / the Central Government / State Governments and the Supreme Court, asking them to consider my suggestion

 

 With regards,

Hemen Parekh

www.My-Teacher.in

www.HemenParekh.ai  /  12 Nov 2024
