OASIS
OASIS is a large multimodal dataset of images paired with spoken and textual questions and answers from across the Middle East and North Africa (MENA) region. The shared-task data is either drawn from OASIS or created with the same framework.
About OASIS
OASIS pairs each image with a question and an open-ended answer, in both spoken and textual form. It was collected across 18 countries in the MENA region and spans everyday and cultural life, organized into 9 categories and 31 sub-categories. Questions are provided in English, Modern Standard Arabic, Egyptian Arabic, and Levantine Arabic, and every question is available as both audio and text.
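To make the description above concrete, here is a minimal sketch of how a single example could be represented in Python. The class name, field names, and placeholder values are assumptions made for illustration only; they are not the released schema, so check the actual data files for the real column names. Only the four language names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Language(Enum):
    """The four question languages named in the dataset description."""
    ENGLISH = "English"
    MSA = "Modern Standard Arabic"
    EGYPTIAN = "Egyptian Arabic"
    LEVANTINE = "Levantine Arabic"


@dataclass(frozen=True)
class OasisRecord:
    """Hypothetical record layout; every field name here is an assumption."""
    image_path: str           # path to the image
    language: Language        # language of the question
    question_text: str        # textual form of the question
    question_audio_path: str  # spoken form of the question
    answer_text: str          # open-ended answer
    country: str              # one of the 18 countries
    category: str             # one of the 9 categories
    sub_category: str         # one of the 31 sub-categories


def by_language(records, language):
    """Return only the records whose question is in the given language."""
    return [r for r in records if r.language is language]


# Placeholder values only; none of these come from the real dataset.
example = OasisRecord(
    image_path="images/placeholder.jpg",
    language=Language.EGYPTIAN,
    question_text="<question text>",
    question_audio_path="audio/placeholder.wav",
    answer_text="<answer text>",
    country="<country>",
    category="<category>",
    sub_category="<sub-category>",
)
egyptian_only = by_language([example], Language.EGYPTIAN)
```

Keeping the language as an enum rather than a free-form string makes it harder to mistype a variety name when filtering, though a plain string column may be closer to how the files are actually stored.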
License
The shared-task dataset is planned for release under the CC BY-NC-SA 4.0 license, for non-commercial research use with attribution and share-alike.
Citation
OASIS is described in the paper arXiv:2510.06371. If you use the dataset, please cite:
@article{alam2025everydaymmqa,
  title = {{OASIS}: A Multilingual and Multimodal Dataset for Culturally Grounded Spoken Visual QA},
  author = {Alam, Firoj and Shahroor, Ali Ezzat and Hasan, Md. Arid and Ali, Zien Sheikh and Bhatti, Hunzalah Hassan and Kmainasi, Mohamed Bayan and Chowdhury, Shammur Absar and Mousi, Basel and Dalvi, Fahim and Durrani, Nadir and Milic-Frayling, Natasa},
  journal = {arXiv preprint arXiv:2510.06371},
  year = {2025},
}