HEART-NLP: Human-centered and Knowledge-driven NLP for Ensuring Quality, Integrity, and Transparency in a Trustworthy Digital Ecosystem → GPLSI

The advancement of knowledge-driven and human-centered NLP technologies offers a unique opportunity to strengthen business and societal structures by fostering trustworthy digital ecosystems. The results of this project have the potential to benefit society by providing tools and resources that ensure the quality, integrity, and transparency of language models and digital content, making it easier to access and generate trustworthy information. By addressing challenges such as disinformation, hate speech, and abusive behavior, while fostering the dissemination of valuable and reliable content, this project establishes an innovative framework that integrates quality metrics, FAIR principles, and privacy and security strategies into the development of NLP technologies, laying the groundwork for a responsible digital transformation that benefits both society and the market.

HYPOTHESIS: Natural language is the main means of interaction in human societies. When these societies move to the digital world, the interaction among their members generates digital content, which also consists mostly of natural language. Digital media and social networks have intensified this interaction, creating an ecosystem of spaces where content is created and consumed in increasing quantity and at increasing speed. However, this ubiquitous environment of interaction can be viewed from two opposing perspectives. On the one hand, it is a repository of pervasive content that undermines the quality and freedom of information. Digital media have become a space where hoaxes, hate speech, and abusive behavior proliferate, among other types of content that directly harm the users of this space in particular and society in general. Furthermore, many digital media distort information, making it harder for individuals to discern accurate and reliable content. On the other hand, the sharing of valuable, high-quality, and reliable information produces a digital collective intelligence that can be used to the advantage of society as a whole. Therefore, in this context, our main hypothesis is that knowledge-driven and human-centered Natural Language Processing technologies can foster a trustworthy digital ecosystem by ensuring the quality, integrity, and transparency of language models and digital content, enabling users to access reliable and ethical information.