The Arabic Fact-Checking and Stance Detection Corpus

Overview

This is a novel Arabic corpus that unifies stance detection, stance rationale, relevant document retrieval and fact checking. The corpus contains 422 claims about the war in Syria and related Middle East political issues; each claim is labeled for 'Factuality', indicating whether it is True or False. The corpus also contains 3,042 articles retrieved for these claims; each claim-article pair is annotated for 'Stance', indicating whether the article agrees with, disagrees with, discusses, or is unrelated to the claim. The corpus also identifies which sentence(s) in each article correspond to the stance 'Rationale'. This is the first corpus to offer such a combination.
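
The annotation layers described above can be pictured as a small data model. The following is only a sketch: the class names, field names, and lowercase label strings are hypothetical and do not reflect the actual file format of the release, and the example record is a placeholder rather than corpus data.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Label sets follow the description above; the exact strings are assumptions.
FACTUALITY_LABELS = {"true", "false"}
STANCE_LABELS = {"agree", "disagree", "discuss", "unrelated"}


@dataclass
class Article:
    """One retrieved article, annotated relative to a single claim."""
    text: str
    stance: str
    rationale: List[str] = field(default_factory=list)  # rationale sentence(s)

    def __post_init__(self):
        if self.stance not in STANCE_LABELS:
            raise ValueError(f"unknown stance label: {self.stance!r}")


@dataclass
class Claim:
    """One claim with its factuality label and retrieved articles."""
    text: str
    factuality: str
    articles: List[Article] = field(default_factory=list)

    def __post_init__(self):
        if self.factuality not in FACTUALITY_LABELS:
            raise ValueError(f"unknown factuality label: {self.factuality!r}")


def stance_counts(claims: List[Claim]) -> Counter:
    """Count claim-article pairs per stance label."""
    return Counter(a.stance for c in claims for a in c.articles)


# Placeholder record for illustration only (not taken from the corpus).
example = Claim(
    text="placeholder claim",
    factuality="false",
    articles=[
        Article(text="placeholder article A", stance="disagree",
                rationale=["placeholder rationale sentence"]),
        Article(text="placeholder article B", stance="unrelated"),
    ],
)
print(stance_counts([example]))
```

Validating labels at construction time is a design choice of this sketch; it makes a mislabeled record fail early when the data is loaded into such a model.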

If you use the dataset in your research, kindly cite the following paper:

R. Baly, M. Mohtarami, J. Glass, L. Màrquez, A. Moschitti, and P. Nakov. "Integrating Stance Detection and Fact Checking in a Unified Corpus," Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), New Orleans, LA, USA, June 2018 (PDF)

@InProceedings{Baly:NAACL:2018,
title = {Integrating Stance Detection and Fact Checking in a Unified Corpus},
author = {Baly, Ramy and Mohtarami, Mitra and Glass, James and M{\`a}rquez, Llu{\'i}s and Moschitti, Alessandro and Nakov, Preslav},
booktitle = {Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics},
series = {NAACL-HLT~'18},
address = {New Orleans, LA, USA},
month = {June},
year = {2018}
}

This data is distributed under the Creative Commons Attribution-ShareAlike (CC BY-SA) license (link).