An Active Ingredients Entity Recogniser System Based on Profiles

This paper describes a named entity recogniser for active ingredients. Our machine learning system, which is language and domain independent, employs unsupervised feature generation and weighting from the training data. The proposed automatic feature extraction process is based on generating a profile for the given entity without traditional knowledge resources (such as dictionaries). Our results (F1 87.3% [95% CI: 82.07–92.53]) show that unsupervised feature generation can achieve high performance on this task.

Analysing the Integration of Semantic Web Features for Document Planning across Genres

Language is usually studied and analysed from the perspective of different disciplines, generally on the premise that it constitutes a form of communication that pursues a specific objective. A discourse, in that sense, can be understood as a text constructed to express that objective. When a discourse is created, its production is tied to some textual genre, usually connected with pragmatic features, such as the intention of the writer or the audience to whom it is addressed, both of which condition the use of language.

Generating sets of related sentences from input seed features

The Semantic Web (SW) can provide Natural Language Generation (NLG) with technologies capable of facilitating access to structured Web data. This type of data can be useful to this research area, which aims to automatically produce human utterances, in several of its subtasks, such as content selection and content structuring. NLG has been widely applied in several fields, for instance the generation of recommendations (Lim-Cheng et al., 2014). However, generation systems are currently designed for very specific domains (Ramos-Soto et al., 2015) and pre-defined purposes (Ge et al., 2015).

Content Selection through Paraphrase Detection: Capturing different Semantic Realisations of the Same Idea

Summarisation can be seen as an instance of Natural Language Generation (NLG), where “what to say” corresponds to the identification of relevant information, and “how to say it” is associated with the final creation of the summary. When dealing with data coming from the Semantic Web (e.g., RDF triples), the challenge arises of how to produce a good summary. For instance, given the RDF properties from the infobox of a Wikipedia page, how could a summary expressed in natural language be generated?

Cross-document event ordering through temporal, lexical and distributional knowledge

In this paper we present a system that automatically builds ordered timelines of events from different written texts in English. The system deals with problems such as automatic event extraction, cross-document temporal relation extraction and cross-document event coreference resolution. Its main characteristic is the application of three different types of knowledge: temporal knowledge, lexical-semantic knowledge and distributional-semantic knowledge, in order to anchor and order the events in the timeline. It has been evaluated within the framework of SemEval 2015.

AMR Tools

With the aim of obtaining a complete representation of a text, an analysis of existing tools for generating sentence-level AMR has been carried out.

Progress report 26.05.2017

Progress report on the advances made during the first year of the project, covering four points: evolution, status of the modules, publications and website. The report is attached in presentation format.


The dissemination and outreach of the project are relevant enough to be included among the project's specific objectives, specifically in the last objective (OBJ6).

Activity E.2. Extrinsic evaluation

The aim of this activity is to define the specific scenario in which the RESCATA model will be extrinsically evaluated. This integration and the scenario will give rise to a concrete application that will serve to demonstrate the validity of the model while laying the groundwork for proposing a task that is currently of high interest to the research community.

Activity E.1. Intrinsic evaluation

The aim of this activity is to analyse and define a set of metrics that allow us to intrinsically evaluate the canonical representation and its inflections. The intention is to cover both qualitative and quantitative metrics capable of measuring the validity of what has been developed in the project; given the novelty of the proposal, it will be necessary to analyse which of the metrics used so far are applicable and which new, specific metrics would be needed.

