Combining Profiles and Local Information for Named Entity Classification: Adjustment of a Domain and Language Independent Approach

This paper presents a named entity classification system that employs the Random Forest machine learning algorithm. Our feature set includes local entity information and profiles, all of which are generated in an unsupervised manner. Performance on various languages (Spanish, Dutch and English) and domains (general and medical) demonstrates the flexibility, portability and adequacy of our approach regardless of the corpus. Moreover, these results support our hypothesis that named entity classification, without external knowledge or complex linguistic analysis, is language and domain independent, although performance is sometimes slightly lower than in previous work.

Moreno, Isabel
Romá-Ferri, M.T.
Moreda, Paloma
Publication type: 
Conference proceedings
Journal name: 
Book title: 
8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics
LTC 2017
Peer reviewed: 
Publication year: 
2017