Sentiment Analysis Workshop at SEPLN (TASS)

TASS is an experimental evaluation workshop for sentiment analysis and online reputation analysis focused on the Spanish language. Its aim is to provide a forum where the scientific and business communities can present and discuss the latest research and developments in sentiment analysis of social media, specifically in Spanish. The main objective is to promote the application of state-of-the-art algorithms and techniques to sentiment analysis of short opinion texts extracted from social media messages (specifically Twitter).

We are interested in evaluating how the different approaches to sentiment analysis and text classification in Spanish have evolved over the years. The traditional task of sentiment analysis at the global level, held in previous years, will therefore be repeated with the same corpus so that results can be compared. This corpus contains over 68 000 Twitter messages written in Spanish by about 150 well-known personalities and celebrities from the worlds of politics, economy, communication, mass media and culture between November 2011 and March 2012. Each tweet includes a unique identifier, the creation date, the user identifier and the content itself. Each tweet has been assigned a global polarity indicating whether its content expresses a positive, negative or neutral opinion. The corpus has been divided into two sets: training (about 10%) and test (about 90%). The training set will be released so that participants can train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results submitted by the different systems.
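As an illustration of the corpus description above, the following minimal Python sketch models a tweet record with the four fields listed and an optional global polarity. The class name, field names, label strings and helper functions are assumptions made for this example, not the official corpus format, and plain accuracy is used only as one plausible measure, since the evaluation metric is not stated here.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional

# Assumed label strings for the three global polarities named in the text.
POLARITIES = ("positive", "negative", "neutral")


@dataclass(frozen=True)
class Tweet:
    """Hypothetical record holding the fields the corpus description lists."""
    tweet_id: str
    created_at: datetime
    user_id: str
    content: str
    polarity: Optional[str] = None  # None means the tweet is untagged

    def __post_init__(self):
        # Reject labels outside the assumed polarity set.
        if self.polarity is not None and self.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {self.polarity!r}")


def untagged(tweets: List[Tweet]) -> List[Tweet]:
    """Return copies without polarity, as a test set released without tagging."""
    return [replace(t, polarity=None) for t in tweets]


def accuracy(gold: List[Tweet], predicted: List[str]) -> float:
    """Fraction of tweets whose predicted polarity matches the gold tag.

    Accuracy is an assumption of this sketch, not the stated official metric.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for g, p in zip(gold, predicted) if g.polarity == p)
    return hits / len(gold)
```

A participant's system would read the untagged test tweets, output one label per tweet, and the organizers would compare those labels against the withheld tags.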

In addition, at TASS we want to foster research on fine-grained polarity analysis at the aspect level (aspect-based sentiment analysis), one of the new market requirements for natural language processing in these areas. To this end, participants will be provided with a corpus tagged with a series of aspects, and systems must identify the polarity at the aspect level. Training and test sets will be provided for two corpora: the Social-TV corpus, used last year, and the new Politics corpus, collected this year. The new corpus contains messages from the 6 main political parties participating in the 2015 Andalusian parliamentary election, written by the official Twitter accounts of each party or by their most important candidates in each region. We are currently defining aspects that represent the main topics of political discussion, such as "economic crisis", "unemployment", "corruption", "role of women", "academic failure", "abortion", etc., and will assign each aspect the sentiment polarity of the opinion expressed in the tweet.
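To make the difference from global polarity concrete, the sketch below shows one way an aspect-tagged tweet could be represented, with a separate polarity for each aspect mentioned. The class and field names are hypothetical, and the tweet content is a placeholder rather than corpus data; the actual annotation format of the Social-TV and Politics corpora is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AspectOpinion:
    """Hypothetical pairing of one aspect with the polarity expressed about it."""
    aspect: str    # e.g. "unemployment" or "corruption"
    polarity: str  # e.g. "positive", "negative" or "neutral"


@dataclass
class AspectTaggedTweet:
    """Hypothetical tweet carrying zero or more aspect-level opinions."""
    tweet_id: str
    content: str
    opinions: List[AspectOpinion] = field(default_factory=list)


def polarity_by_aspect(tweet: AspectTaggedTweet) -> Dict[str, str]:
    """Map each tagged aspect to its polarity (a later duplicate overwrites)."""
    return {o.aspect: o.polarity for o in tweet.opinions}


# Placeholder example: a single tweet may be negative about one aspect
# and positive about another, which a single global label cannot express.
example = AspectTaggedTweet(
    tweet_id="0",
    content="(placeholder text, not taken from the corpus)",
    opinions=[
        AspectOpinion("unemployment", "negative"),
        AspectOpinion("role of women", "positive"),
    ],
)
```

Under this representation, a system's output for a tweet is a mapping from aspects to polarities rather than a single label.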


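Julio Villena Román (Daedalus, S.A.)
Miguel Ángel García Cumbreras (Universidad de Jaén)
Eugenio Martínez Cámara (Universidad de Jaén)
Alfonso Ureña López (Universidad de Jaén)
María Teresa Martín Valdivia (Universidad de Jaén)

David Vilares Calvo (Universidad de A Coruña)
Ferrán Pla (Universidad Politécnica de Valencia)
Lluís F. Hurtado (Universidad Politécnica de Valencia)
David Tomás (Universidad de Alicante)
Yoan Gutiérrez (Universidad de Alicante)
Manuel Montes (Instituto Nacional de Astrofísica, Óptica y Electrónica, INAOE)
Luis Villaseñor (Instituto Nacional de Astrofísica, Óptica y Electrónica, INAOE)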