Generating sets of related sentences from input seed features

The Semantic Web (SW) can provide Natural Language Generation (NLG) with technologies capable to facilitate access to structured Web data. This type of data can be useful to this research area, which aims to automatically produce human utterances, in its different subtasks, such as in the content selection or its structure. NLG has been widely applied to several fields, for instance to the generation of recommendations (Lim-Cheng et al., 2014). However, generation systems are currently designed for very specific domains (Ramos-Soto et al., 2015) and pre-defined purposes (Ge et al., 2015). The use of SW’s technologies can facilitate the development of more flexible and domain independent systems, that could be adapted to the target audience or purposes, which would considerably advance the state of the art in NLG. The main objective of this paper is to propose a multidomain and multilingual statistical approach focused on the surface realisation stage using factored language models. Our proposed approach will be tested in the context of two different domains (fairy tales and movie reviews) and for the English and Spanish languages, in order to show its appropriateness to different non-related scenarios. The main novelty studied in this approach is the generation of related sentences (sentences with related topics) for different domains, with the aim to achieve cohesion between sentences and move forward towards the generation of coherent and cohesive texts. The approach can be flexible enough thanks to the use of an input seed feature that guides all the generation process. Within our scope, the seed feature can be seen as an abstract object that will determine how the sentence will be in terms of content. For example, this seed feature could be a phoneme, a property or a RDF triple from where the proposed approach could generate a sentence

Barros, Cristina
Lloret, Elena
Tipo de publicación: 
Acta de congreso
Nombre de la revista: 
Nombre del libro: 
2nd International Workshop on Natural Language Generation and the Semantic Web
Revisión por pares: 
Año de publicación: 
2 016