Projects

The Language Processing and Information Systems Group is an active research group that participates in and leads European, national and regional projects. Each project involves group members with strong training in Natural Language Processing who actively collaborate with other research groups. This work has produced many resources, software tools and products that are used successfully in commercial products or as a basis for other projects. It has also generated numerous high-impact publications.

Active Projects

CRITERIA: Evaluation Criteria for Quality Corpora in Artificial Intelligence

In recent years, the development of artificial intelligence (AI) systems based on Large Language Models (LLMs) has transformed how we interact with textual data, as well as the possibilities for generating, understanding, and analyzing human language on a large scale. In the fields of Digital Humanities and Natural Language Processing (NLP), these models have demonstrated unprecedented performance in tasks such as text generation, machine translation, summarization, semantic classification, and automatic question answering. However, this success has been accompanied by a fundamental challenge that still lacks a standardized and systematic solution: the quality of the data used to train these systems.

SAFEWORDS: Ethical, Privacy-Preserving and Trustworthy Language Technologies within the HumanAIze Project

SAFEWORDS is a core component of the HumanAIze project, a coordinated research initiative aimed at developing the next generation of human-centred, trustworthy and multilingual large language models (LLMs) for Europe. Within this framework, SAFEWORDS provides the ethical, legal and data governance backbone that ensures all technological developments are fully aligned with European fundamental rights, democratic values and regulatory requirements.

HEART-NLP: Human-centered and Knowledge-driven NLP for Ensuring Quality, Integrity, and Transparency in a Trustworthy Digital Ecosystem

The advancement of knowledge-driven, human-centered NLP technologies offers a unique opportunity to strengthen business and societal structures by fostering trustworthy digital ecosystems. The results of this project have the potential to benefit society by providing tools and resources that ensure the quality, integrity, and transparency of language models and digital content, improving the ability to access and generate trustworthy information. The project addresses challenges such as disinformation, hate speech, and abusive behavior while fostering the dissemination of valuable and reliable content. To that end, it establishes an innovative framework that integrates quality metrics, FAIR principles, and privacy and security strategies into the development of NLP technologies, laying the groundwork for a responsible digital transformation that benefits both society and the market.

HIVEMIND: Human-centred collaboratIVE MultI-ageNt framework for accelerating software Development and maintenance

HIVEMIND transforms software engineering with a human-centric, AI-driven approach. By integrating an adaptive multi-agent framework, we enable seamless collaboration between developers and specialised LLM agents, streamlining software specification, code development, and maintenance. Validated across key industrial and societal sectors, HIVEMIND enhances efficiency, security, and adaptability, setting a new standard for responsible software development.

ML3T-SCI: Master in Large Language Models and Language Technologies: Scientific and Corporate Innovation orientations

ML3T-SCI is framed as a holistic program for training experts in Large Language Models and Language Technologies, offering two specializations: one oriented towards scientific research and the other focused on corporate innovation.

CIDEGENT: The limits and future of data-driven approaches: A comparative study of deep learning, knowledge-based and rule-based models and methods in Natural Language Processing

While it is widely accepted that Deep Learning (DL) techniques are superior to rule-based and knowledge-based ones, this 4-year project will seek to establish formally, through proper evaluations spanning different datasets, whether DL always performs better and, if not, in what circumstances it does not. In addition, the project will seek answers as to how to boost the performance of successful DL-based applications even further. The study will examine different NLP tasks and applications and, to the best of our knowledge, will be the first to establish the extent to which DL guarantees improvement over rule-based methods and whether combining DL methods with various techniques, models and resources would offer further improvements. Given the emergence and pervasive use of Large Language Models, the project will also experiment with them as an additional methodology and compare them with Deep Learning and rule-based methods. This GenT 4-year project funded by the Valencian...

Completed Projects

INTEGER: Intelligent Text Generation

The INTEGER project [RTI2018-094649-B-I00] proposes that the communicative objective can be treated as one more input to the automatic text generation process, rather than being fixed in advance in the system design, giving rise to more flexible solutions. Furthermore, through the use of recent learning techniques, the versatility of the system can be extended by integrating heterogeneous information sources (text, image, sound, etc.). INTEGER is funded by the State Program for Research, Development and Innovation Oriented to the Challenges of Society, within the framework of the State Plan for Scientific and Technical Research and Innovation 2017-2020. https://integer.gplsi.es/
