Participate
Participation is open and free to individuals and teams from academia and industry. Enter one task or both, in any of the available language tracks.
How to Take Part
- Register. Sign up using the registration form and accept the dataset usage terms.
- Get the data. The dataset is hosted on Hugging Face. Start with the sample set (from 15 July 2026) to preview the format, then download the full training and development data (from 1 September 2026).
- Build your system. Develop a model for Task 1 (spoken), Task 2 (textual), or both, across whichever language tracks you choose.
- Evaluate locally. Use the official evaluation script to measure BERTScore F1 (plus BLEU and ROUGE) before you submit.
- Submit on CodaBench. During the evaluation phase (10 to 31 January 2027), submit your predictions to the official competition page.
- Write it up. Submit a system description paper for the SemEval 2027 proceedings and present your work at the workshop.
Submission
Submissions are planned through CodaBench. The competition link, the submission format, and the official evaluation script will be posted here when the evaluation phase opens. A starter kit with data loaders and baseline systems will be released with the data.
Rules
Full rules ship with the starter kit. The essentials:
- Eligibility: open and free to everyone, individuals and teams alike.
- Tasks and tracks: enter any subset: a single task, a single language track, or all of them. Systems are ranked per task and per track.
- Ranking: the planned official ranking metric is BERTScore F1.
- External resources: pretrained models and public data are generally permitted; document them in your paper. Any limits will be stated at release.
- Data license: CC BY-NC-SA 4.0, for non-commercial research use.
Frequently Asked Questions
Who can participate?
Anyone. Participation is open and free to individuals and teams from academia and industry worldwide. You register on the submission platform and accept the dataset usage terms.
Do I have to enter every task and track?
No. You may participate in any subset: a single task, a single language track, or all of them. Systems are ranked per task and per track.
How are submissions evaluated and ranked?
Systems generate open-ended answers. The planned official ranking metric is the F1 variant of BERTScore, which rewards semantically equivalent answers with different surface forms. BLEU and ROUGE are reported as auxiliary lexical-overlap metrics. LLM-based evaluation may be reported as supplementary analysis but is not the official ranking metric.
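To make the difference between semantic and lexical scoring concrete, here is a minimal sketch of the greedy-matching F1 that BERTScore is built on, next to a plain unigram-overlap F1. It is not the official evaluation script. The two-dimensional vectors in TOY_EMBEDDINGS and the helper names (embed, greedy_match_f1, unigram_overlap_f1) are hypothetical, chosen only for illustration. Real BERTScore uses contextual embeddings from a pretrained model and can add IDF weighting and baseline rescaling, and the unigram score is a simplified stand-in for lexical-overlap metrics, not BLEU or ROUGE themselves.

```python
import math
from collections import Counter

# Hypothetical 2-D "embeddings" for illustration only; real BERTScore
# takes contextual token embeddings from a pretrained language model.
TOY_EMBEDDINGS = {
    "red": (0.0, 1.0),
    "car": (1.0, 0.0),
    "automobile": (0.96, 0.28),  # deliberately close to "car"
}


def embed(text):
    """Map each whitespace-separated token to its toy vector."""
    return [TOY_EMBEDDINGS[token] for token in text.split()]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_match_f1(candidate, reference):
    """BERTScore-style F1: each token is matched to its most similar
    token on the other side, then precision and recall are averaged."""
    if not candidate or not reference:
        return 0.0
    precision = sum(max(cosine(c, r) for r in reference) for c in candidate) / len(candidate)
    recall = sum(max(cosine(r, c) for c in candidate) for r in reference) / len(reference)
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def unigram_overlap_f1(candidate, reference):
    """Exact-match unigram F1, a simplified lexical-overlap score."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference, candidate = "red car", "red automobile"
    print(round(greedy_match_f1(embed(candidate), embed(reference)), 2))  # 0.98
    print(unigram_overlap_f1(candidate, reference))  # 0.5
```

With these toy vectors, the paraphrase "red automobile" scores close to 1 under greedy embedding matching but only 0.5 under exact unigram overlap, which is the behavior the ranking metric is meant to reward.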
Can I use external data and pretrained models?
Detailed rules will ship with the starter kit. In line with typical SemEval practice, pretrained models and external resources are generally allowed provided they are publicly documented and you describe them in your system paper. Any restrictions will be stated clearly at data release.
How do I submit my predictions?
Submissions are planned through CodaBench. The competition link, submission format, and the official evaluation script will be posted on the Participate page when the evaluation phase opens.
Is there a paper to write?
Yes. Participants are encouraged to submit a system description paper, which will be published in the SemEval 2027 proceedings and presented at the workshop.
What does it cost, and how is the data licensed?
Participation is free. The shared-task dataset is planned for release under the CC BY-NC-SA 4.0 license for non-commercial research use.
How do I stay updated?
Join the mailing list (link coming soon) and watch the Home page news feed. All announcements, including data, baselines, and submission details, appear there first.
Contact
A participant mailing list is being set up and will be linked here; it will carry every announcement about data, baselines, and submission. In the meantime, contact the team through the organizers page and watch the home page for updates.