RESCATA | Representación Canónica y Transformaciones de los Textos Aplicado a las Tecnologías del Lenguaje Humano

Funding

The RESCATA project (Representación canónica y transformaciones de los textos aplicado a las Tecnologías del Lenguaje Humano; in English, Canonical Representation and Transformations of Texts Applied to Human Language Technologies), reference TIN2015-65100-R, is partially funded by the University of Alicante and the Government of Spain through the Programa Estatal de I+D+i Orientada a los Retos de la Sociedad (State Programme for R&D&I Oriented to the Challenges of Society) of the Ministry of Economy, Industry and Competitiveness.

An Active Ingredients Entity Recogniser System Based on Profiles.

Submitted by imoreno on Fri, 06/15/2018 - 13:02

This paper describes an active ingredients named entity recogniser. Our machine learning system, which is language and domain independent, employs unsupervised feature generation and weighting from the training data. The proposed automatic feature extraction process is based on generating a profile for the given entity without traditional knowledge resources (such as dictionaries). Our results (F1 87.3% [95% CI: 82.07–92.53]) show that unsupervised feature generation can achieve high performance on this task.

URL:

https://doi.org/10.1007/978-3-319-41754-7_25

Analysing the Integration of Semantic Web Features for Document Planning across Genres

Submitted by imoreno on Fri, 06/15/2018 - 12:47

Language is usually studied and analysed from different disciplines, generally on the premise that it constitutes a form of communication that pursues a specific objective. Discourse, in that sense, can be understood as a text constructed to express such an objective. When a discourse is created, its production is related to some textual genre, usually connected with pragmatic features such as the intention of the writer or the audience to whom it is addressed, both of which condition the use of language.

URL:

https://webnlg2016.sciencesconf.org/data/pages/13.pdf

Generating sets of related sentences from input seed features

Submitted by imoreno on Fri, 06/15/2018 - 12:41

The Semantic Web (SW) can provide Natural Language Generation (NLG) with technologies capable of facilitating access to structured Web data. This type of data can be useful to this research area, which aims to automatically produce human utterances, in its different subtasks, such as content selection or structuring. NLG has been widely applied to several fields, for instance to the generation of recommendations (Lim-Cheng et al., 2014). However, generation systems are currently designed for very specific domains (Ramos-Soto et al., 2015) and pre-defined purposes (Ge et al., 2015).

URL:

https://webnlg2016.sciencesconf.org/data/pages/01.pdf

Content Selection through Paraphrase Detection: Capturing different Semantic Realisations of the Same Idea

Submitted by imoreno on Fri, 06/15/2018 - 12:33

Summarisation can be seen as an instance of Natural Language Generation (NLG), where “what to say” corresponds to the identification of relevant information, and “how to say it” is associated with the final creation of the summary. When dealing with data coming from the Semantic Web (e.g., RDF triples), the challenge arises of how a good summary can be produced. For instance, given the RDF properties from the infobox of a Wikipedia page, how could a summary expressed in natural language text be generated?

URL:

http://www.aclweb.org/anthology/W16-3505

Cross-document event ordering through temporal, lexical and distributional knowledge

Submitted by imoreno on Fri, 06/15/2018 - 12:09

In this paper we present a system that automatically builds ordered timelines of events from different written texts in English. The system deals with problems such as automatic event extraction, cross-document temporal relation extraction and cross-document event coreference resolution. Its main characteristic is the application of three different types of knowledge: temporal knowledge, lexical-semantic knowledge and distributional-semantic knowledge, in order to anchor and order the events in the timeline. It has been evaluated within the framework of SemEval 2015.

URL:

https://doi.org/10.1016/j.knosys.2016.07.032

AMR Tools

Submitted by imoreno on Tue, 09/26/2017 - 15:52

With the aim of obtaining a complete representation of a text, existing tools for generating AMR (Abstract Meaning Representation) at sentence level have been analysed.

Follow-up 26.05.2017

Submitted by imoreno on Tue, 09/26/2017 - 15:35

Follow-up report on the progress made during the first year of the project, covering four points: evolution, status of the modules, publications and website. The report is attached in presentation format.

MODULE F: DISSEMINATION OF THE PROJECT AND OF THE RESEARCH RESULTS

Submitted by imoreno on Wed, 05/24/2017 - 16:41

Dissemination and outreach are relevant enough to be included among the specific objectives of the project, namely in the last objective (OBJ6).

Activity E.2. Extrinsic evaluation

Submitted by imoreno on Wed, 05/24/2017 - 16:39

The aim of this activity is to define the specific scenario in which the RESCATA model will be evaluated extrinsically. This integration and the scenario will lead to a concrete application that will demonstrate the validity of the model while laying the groundwork for proposing a task that is currently of high interest to the research community.

Activity E.1. Intrinsic evaluation

Submitted by imoreno on Wed, 05/24/2017 - 16:38

The aim of this activity is to analyse and define a set of metrics for intrinsically evaluating the canonical representation and its inflections (flexiones). The metrics are intended to be both qualitative and quantitative, and able to measure the validity of what is developed in the project. Given the novelty of the proposal, it will be necessary to analyse which of the metrics used so far are applicable and which new, specific metrics would be needed.