Projects

The Language Processing and Information Systems Group is an active research group that participates in and leads European, national and regional projects. Each project involves group members with strong training in Natural Language Processing who actively collaborate with other research groups. This effort has produced many resources, software tools and products that are used successfully in commercial products or as a basis for other projects, as well as numerous high-impact publications.

CIDEGENT: The limits and future of data-driven approaches: A comparative study of deep learning, knowledge-based and rule-based models and methods in Natural Language Processing

While it is widely accepted that Deep Learning (DL) techniques are superior to rule-based and knowledge-based ones, this 4-year project will seek to establish formally, through proper evaluations spanning different datasets, whether DL always performs better and, if not, in what circumstances it does not. In addition, the project will seek ways to boost the performance of successful DL-based applications even further. The study will examine different NLP tasks and applications and, to the best of our knowledge, will be the first to establish the extent to which DL guarantees improvement over rule-based methods and whether combining DL methods with various techniques, models and resources offers further improvements. Given the emergence and pervasive use of Large Language Models, the project will also experiment with them as an additional methodology and compare them with Deep Learning and rule-based methods. This GenT 4-year project funded by the Valencian...

INTEGER: Intelligent Text Generation

The INTEGER project proposes that the communicative objective can be treated as one more input to the automatic text generation process, rather than being fixed in advance in the system design, giving rise to more flexible solutions. Furthermore, through the use of recent learning techniques, the versatility of the system can be extended by integrating heterogeneous information sources (text, image, sound, etc.). INTEGER: Intelligent Text Generation [RTI2018-094649-B-I00] is a project funded by the State Program for Research, Development and Innovation Oriented to the Challenges of Society, within the framework of the State Plan for Scientific and Technical Research and Innovation 2017-2020. https://integer.gplsi.es/

CORTEX: COnscious natuRal TEXt generation

Artificial Intelligence (AI) is an essential enabling technology whose potential should be developed to serve human progress, while at the same time avoiding and combating the potential risks associated with its malicious use. AI has raised new challenges concerning human-machine interaction and communication. Humans communicate and understand reality through natural language and, as huge amounts of digital data are expressed in natural language, this has triggered the development of Human Language Technologies (HLT) and Natural Language Processing (NLP) methods (1), such as search engines, chatbots, translators or conversational agents, among others. These technologies have become essential everyday tools for a growing number of people around the globe.

COOLANG: COntent Oriented LANGuage technologies

Natural language is the main means of interaction in human societies. When these societies move to the digital world, the interaction between elements generates digital content, also mostly made up of natural language. Digital media and social networks have enhanced this interaction, creating an ecosystem of spaces where content is created and consumed in increasing quantity and speed. However, this ubiquitous environment of interaction has two opposing perspectives.

CLEARTEXT: Enhancing the modernization of public sector organizations by deploying Natural Language Processing to make their digital content CLEARER to those with cognitive disabilities.

People with cognitive disabilities have significant limitations in their intellectual functioning and/or may also lack the ability to adapt to everyday situations. They often have difficulty comprehending spoken and written language, which may include misinterpretation of literal meanings and difficulty understanding complex instructions. They are confused by idioms, figures of speech, abstractions, uncommon words, and lack of precision. Hence, we start from the hypothesis that the research, development and deployment of natural language processing technology can support the authoring of accessible content in Spanish for people with cognitive disabilities, with a view to widening their inclusion and empowerment in Europe.

Multi3Generation: Multi-task, Multilingual, Multi-modal Language Generation (CA18231)

Language generation (LG) is a crucial technology if machines are to communicate with humans seamlessly using human natural language. A great number of different tasks within Natural Language Processing (NLP) are language generation tasks, and being able to perform these tasks effectively requires (1) that machines be equipped with world knowledge, which can require multi-modal processing and reasoning (e.g. textual, visual and auditory inputs, or sensory data streams), and (2) the study of strong, novel Machine Learning (ML) methods (e.g. structured prediction, generative models), since virtually all state-of-the-art NLP models are learned from data.