SEPLN 2024 | Pol Pastells [po̞l pəsˈteʎs]

Last week I took part in the XL SEPLN (the conference of the Spanish Society for Natural Language Processing) in Valladolid. It was a nice conference. On Tuesday we presented the overview of the DETESTS-Dis task at IberLEF 2024 (you can already read the article), and on Wednesday I presented a paper.

You can find more information about the corpus on Hugging Face.

Both the shared task and the article were about the detection of racist stereotypes in Spanish. We have two corpora annotated for the presence or absence of a stereotype and for its implicitness, among other annotations. One corpus consists of comments on online news articles and the other of tweets.

The task, in which different teams competed, was to create a model that classifies texts according to whether they contain stereotypes and, if so, whether those stereotypes are implicit or explicit.
The article was based on the idea that annotators often need to see the context to know whether there is a stereotype or to classify it, yet context is usually omitted when creating models. Our idea was to incorporate different types of context (previous comments or the news title) into BERT-type models, using the [SEP] token to separate the text from its context. In the end, things didn’t work very well, so we presented negative results: at least in our case, passing context this way did not improve the models.
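To make the context-passing idea concrete, here is a minimal sketch of how a comment and its context can be joined into a single BERT-style input. It is an illustration, not necessarily the exact setup used in the paper: the function name `build_input` is hypothetical, and putting the comment first and the context second is an assumption. In practice a BERT tokenizer adds the special tokens itself when it is given a pair of texts, so the string is built by hand here only to show the resulting layout.

```python
from typing import Optional

# Special tokens used by BERT-type models.
CLS, SEP = "[CLS]", "[SEP]"


def build_input(comment: str, context: Optional[str] = None) -> str:
    """Join a comment and optional context into one BERT-style sequence.

    Hypothetical helper for illustration. Assumption: the comment goes
    first and the context (previous comment or news title) second.
    """
    parts = [CLS, comment.strip(), SEP]
    # Only append the context segment when there is something to add.
    if context and context.strip():
        parts += [context.strip(), SEP]
    return " ".join(parts)


# Without context: a single segment.
print(build_input("example comment"))
# With context: two segments separated by [SEP].
print(build_input("example comment", "example news title"))
```

The model then sees both segments in one sequence, and the classification is still made for the comment alone.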