MAHED 2025 - Multimodal Detection of Hope and Hate Emotions in Arabic Content

Overview

MAHED 2025 is a shared task at ArabicNLP 2025, co-located with EMNLP 2025, focusing on the detection of hate speech, hope speech, and emotional expression in Arabic content across both textual and multimodal formats. This task aims to advance Arabic Natural Language Processing (NLP) through multi-task modeling and multimodal analysis. Participants may take part in one or more of the following three subtasks:

Text-based Hate and Hope Speech Classification
Emotion, Offensive, and Hate Detection (Multitask)
Multimodal Hateful Meme Detection

Motivation

Social media in the Arabic-speaking world exhibits a dynamic interplay between hateful and hopeful expressions. However, challenges like linguistic diversity, dialect variation, limited datasets, and the emergence of multimodal content (e.g., memes) make automatic detection complex.

MAHED 2025 addresses this by:

Tackling the dual nature of content (hate vs. hope)
Enabling multimodal (text + image) analysis
Integrating emotion as a critical contextual feature
Providing high-quality Arabic datasets for hate, hope, and emotion detection

Dataset Resources

The shared task will use the following annotated datasets:

Attribute	Subtask 1	Subtask 2	Subtask 3
Size	9,843	8,515	4,500
Labels	hope, hate, not_applicable	Emotion (12 labels); Offensive (yes, no); Hate (hate, not_hate)	hateful, non-hateful
Source	Sub-Task 1 source	Sub-Task 2 source	Sub-Task 3 source

Annotation: All content was collected from public social media, anonymized, and annotated by native speakers. Annotation agreement (Cohen's Kappa) is > 0.85.

Data Split: Training, development, and test sets provided with evaluation scripts.

Ethics: The dataset complies with ethical data-sharing standards.
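
As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of Cohen's Kappa for two annotators. The function name `cohen_kappa` and the toy annotations are assumptions made for this example; they are not taken from the MAHED data or its annotation tooling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of marginal rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy annotations, invented for illustration only.
annotator_1 = ["hate", "hope", "hope", "not_applicable", "hate", "hope"]
annotator_2 = ["hate", "hope", "not_applicable", "not_applicable", "hate", "hope"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the annotators agree on 5 of 6 items (observed agreement 0.83) while chance agreement is 0.33, which gives a kappa of 0.75.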

Task Description

MAHED 2025 consists of three interconnected subtasks:

Sub-task 1: Text-based Hate and Hope Speech Classification

Goal: Classify Arabic text as hate, hope, or not_applicable

Input: Arabic text (MSA or dialect)
Output: One label: 'hate', 'hope', or 'not_applicable'

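Examples:

Text: كل المهاجرين لصوص ومجرمون يجب طردهم فوراً (All immigrants are thieves and criminals who must be expelled immediately)

Label: hate

Text: معاً يمكننا بناء مستقبل أفضل لأطفالنا (Together we can build a better future for our children)

Label: hope

Text: اليوم هو يوم مشمس وجميل (Today is a beautiful, sunny day)

Label: not_applicable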

Sub-task 2: Emotion, Offensive Language, and Hate Detection

Goal: Identify the emotion, whether the text is offensive, and if offensive, whether it contains hate content.

Labels

Emotion (single label): neutral, anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust
Offensive: yes, no
Hate (if offensive = yes): hate (identity-targeted hate), not_hate (non-targeted or casual offensiveness)

Key Distinctions

Hate speech is always offensive, but not all offensive content is hate speech.
Hate refers to offensive content targeting a group/person based on identity (e.g., race, gender, religion).

Text	Emotion	Offensive	Hate
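كل المهاجرين لصوص ويجب طردهم (All immigrants are thieves and must be expelled)	anger	yes	hate
يا حمار ليش نسيت المفاتيح؟ (You donkey, why did you forget the keys?)	anger	yes	not_hate
أشعر بالحزن لأنني خسرت وظيفتي (I feel sad because I lost my job)	sadness	no	n/a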
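
The label hierarchy above (a hate label applies only when the text is offensive) can be checked before submitting predictions. The sketch below is one way to do so. The function name `validate_prediction` and the use of `None` for an empty hate label are assumptions for illustration; the official submission format may represent the empty case differently.

```python
# The 12 emotion labels listed for Sub-task 2.
EMOTIONS = {
    "neutral", "anger", "anticipation", "disgust", "fear", "joy",
    "love", "optimism", "pessimism", "sadness", "surprise", "trust",
}


def validate_prediction(emotion, offensive, hate=None):
    """Return a list of problems; an empty list means the labels are consistent."""
    problems = []
    if emotion not in EMOTIONS:
        problems.append(f"unknown emotion: {emotion!r}")
    if offensive not in {"yes", "no"}:
        problems.append(f"offensive must be 'yes' or 'no', got {offensive!r}")
    # A hate label is only meaningful for offensive text.
    if offensive == "yes" and hate not in {"hate", "not_hate"}:
        problems.append("offensive='yes' requires hate='hate' or 'not_hate'")
    if offensive == "no" and hate is not None:
        problems.append("hate must be empty when offensive='no'")
    return problems


# The three label combinations from the example table.
ok_hate = validate_prediction("anger", "yes", "hate")
ok_casual = validate_prediction("anger", "yes", "not_hate")
ok_clean = validate_prediction("sadness", "no")
# An inconsistent combination: hate label on non-offensive text.
bad = validate_prediction("sadness", "no", "hate")
print(ok_hate, ok_casual, ok_clean, bad)
```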

Sub-task 3: Multimodal Hate Speech Detection in Memes

Goal: Detect whether a meme (text + image) is hateful or not.

Input: Image and embedded Arabic text
Output: hateful, non-hateful

Examples:

Hateful Meme

Text: الشعب ده ما يستاهلش غير كده (These people deserve nothing more than this)

Image: Cartoon mocking a marginalized group

Label: hateful

Non-hateful Meme

Text: لما تصحى الصبح وتلاقي القهوة جاهزة (When you wake up in the morning and find coffee ready)

Image: Cheerful person with coffee

Label: non-hateful

Participants may engage in any or all sub-tasks. The evaluation uses F1-score, precision, and recall, with macro-averaged F1 as the primary metric.

Evaluation Metrics

Evaluation will be performed using:

Macro-averaged F1-score (primary metric on the leaderboard)
Accuracy, macro-averaged precision, and macro-averaged recall
Separate leaderboard per sub-task
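
To make the primary metric concrete, here is a minimal sketch of macro-averaged F1: compute F1 per label, then take the unweighted mean, so every class counts equally regardless of its frequency. The function name `macro_f1`, the toy labels, and the choice to score a label as 0.0 when it has no predictions or no gold instances are assumptions for this example; the official evaluation scripts may handle such edge cases differently.

```python
def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        # Undefined ratios are scored as 0.0 (an assumption of this sketch).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy Sub-task 1 labels, invented for illustration only.
LABELS = ["hate", "hope", "not_applicable"]
gold = ["hate", "hope", "hope", "not_applicable", "not_applicable", "not_applicable"]
pred = ["hate", "hope", "not_applicable", "not_applicable", "not_applicable", "hate"]
score = macro_f1(gold, pred, LABELS)
print(f"macro-F1 = {score:.3f}")
```

In this toy case each of the three labels has an F1 of 2/3, so the macro average is also 2/3 (about 0.667).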

Pilot Run Details

A pilot study with 2,000 text instances and 500 memes yielded:

Sub-task 1: F1 0.53 with BERT
Sub-task 2: F1 0.50 with AraBERT
Sub-task 3: F1 0.70 for multimodal hate memes

Feedback refined the dataset; details will be shared upon acceptance.

Timeline

Date	Event
June 1, 2025	Release of training, dev data and evaluation scripts
July 20, 2025	Final registration deadline and test set release
July 25, 2025	Test submission deadline via CodaBench
July 30, 2025	Final results released to participants
August 15, 2025	System description papers due
November 5-9, 2025	ArabicNLP 2025 Workshop in Suzhou, China

Registration & Submission

Step 1: Fill Out the Google Form

Complete the registration form to let us know you're interested.
⚠ Note: This step only registers your interest — it does not enroll you in the competition.

Step 2: Create a CodaBench Account

If you don't have a CodaBench account yet, go to https://www.codabench.org/ and sign up.
Use a valid email and create a password.
Confirm your account through the email verification link (if required).

Step 3: Visit the CodaBench Task Pages

Choose the tasks you want to compete in and officially join them.

Here are the links:

Task 1: https://www.codabench.org/competitions/9136/

Task 2: https://www.codabench.org/competitions/9166/

Task 3: https://www.codabench.org/competitions/9192/

Click the "Join" or "Participate" button on the task page once you're logged in.

Step 4: Model Development

After joining, follow the task instructions.
Get access to data and baseline code: https://github.com/marsadlab/MAHED2025Dataset
Develop your model and evaluate it in the CodaBench Evaluation Phase by submitting a prediction file.

Step 5: Prediction File and System Paper Submission

Submit your final predictions through the CodaBench platform.
Create an account on the OpenReview portal.
Submit your system description paper through the OpenReview submission portal for the MAHED Shared Task.
Ensure all submissions are completed before the specified deadlines.

⚠ Important! OpenReview's moderation policy for newly created profiles: New profiles without an institutional email go through a moderation process that can take up to two weeks. New profiles with an institutional email are activated automatically.

Remote registration for this task can be funded through available student funding.

📞 Support & Contact

MAHED 2025 WhatsApp Support Group

MAHED 2025 Gmail Contact

Organizing Team

Wajdi Zaghouani, Northwestern University in Qatar (wajdi.zaghouani@northwestern.edu)
Md. Rafiul Biswas, Hamad Bin Khalifa University, Doha, Qatar (mbiswas@hbku.edu.qa)
Mabrouka Bessghaier, Northwestern University in Qatar (mabrouka.bessghaier@northwestern.edu)
Shimaa Ibrahim, Northwestern University in Qatar (shimaa.ibrahim@northwestern.edu)
Georgios Mikros, Hamad Bin Khalifa University, Qatar (GMikros@hbku.edu.qa)
Firoj Alam, Qatar Computing Research Institute, HBKU (fialam@hbku.edu.qa)

References:

Wajdi Zaghouani and Md. Rafiul Biswas. 2025. EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic. arXiv preprint arXiv:2505.11959 [cs.CL].

Wajdi Zaghouani, Hamdy Mubarak, and Md. Rafiul Biswas. 2024. So Hateful! Building a Multi-Label Hate Speech Annotated Arabic Dataset. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 15044–15055, Torino, Italy.

Firoj Alam et al. 2024. Propaganda to Hate: A Multimodal Analysis of Arabic Memes with Multi-agent LLMs. In International Conference on Web Information Systems Engineering. Springer Nature Singapore, Singapore.
