Welcome to HateCheck.ai.
HateCheck is a fully open-source resource for testing hate speech detection models, built by expert researchers and presented at top academic conferences.
Covering 11 languages, HateCheck provides targeted diagnostic insights into the performance of hate speech detection models. It offers 25+ functional tests in each language, and special tests for emoji-based hate in English—all selected based on an extensive review of existing research, as well as interviews with civil society organisations.
We created HateCheck to help improve the accuracy and fairness of hate speech detection models by highlighting their strengths and weaknesses.
We identified key challenges for existing hate speech detection models and incorporated them into HateCheck. For example, HateCheck tests model performance on counterspeech, which models often misclassify as hate. HateCheck can reveal these kinds of weaknesses, as well as biases—such as when models are worse at detecting hate directed at some protected groups (e.g. women) than at others (e.g. Muslims).
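As a rough illustration of this kind of diagnostic testing, the sketch below runs a deliberately naive stand-in classifier over a handful of HateCheck-style labelled test cases and breaks accuracy down by functionality and by target group. The example texts, the `toy_model` keyword classifier, and the helper names are all hypothetical, not HateCheck's actual data or code.

```python
# Hypothetical sketch: scoring a model on HateCheck-style test cases,
# grouped by functionality (e.g. counterspeech) and by target group.
from collections import defaultdict

# Illustrative test cases only -- not drawn from the real HateCheck dataset.
test_cases = [
    {"text": "I hate women.", "label": "hateful",
     "functionality": "derogation", "group": "women"},
    {"text": "Saying 'I hate women' is never acceptable.", "label": "non-hateful",
     "functionality": "counterspeech", "group": "women"},
    {"text": "I hate Muslims.", "label": "hateful",
     "functionality": "derogation", "group": "Muslims"},
    {"text": "Saying 'I hate Muslims' is unacceptable.", "label": "non-hateful",
     "functionality": "counterspeech", "group": "Muslims"},
]

def toy_model(text: str) -> str:
    # Naive keyword classifier: flags any text containing "hate" as hateful.
    # It therefore fails every counterspeech case, as many real models do.
    return "hateful" if "hate" in text.lower() else "non-hateful"

def accuracy_by(cases, key):
    # Compute per-category accuracy for the given grouping key.
    correct, total = defaultdict(int), defaultdict(int)
    for case in cases:
        total[case[key]] += 1
        if toy_model(case["text"]) == case["label"]:
            correct[case[key]] += 1
    return {k: correct[k] / total[k] for k in total}

print(accuracy_by(test_cases, "functionality"))
print(accuracy_by(test_cases, "group"))
```

Breaking results down this way is what surfaces both functionality-level weaknesses (here, the keyword model scores zero on counterspeech) and group-level biases, since each category's accuracy is reported separately rather than averaged away in a single overall score.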
HateCheck is a community project, and it is already being used in cutting-edge research. Let us know if you want to talk about your own projects!
Start Using HateCheck!
Read the HateCheck research papers and start testing hate speech detection models today.