PUBLICATIONS

See all abstracts

See all publications (without abstracts)

Keys:


BL  Blog
CI  International conference
CL  Book chapter
CN  National conference
II  Internal report
LI  Book
RV  Journal

URL     Document     Slides


Year 2001:

Key: CI  Ref: ICEIS'2001
Sergio Luján-Mora, Enrique Medina. Reducing Inconsistency in Data Warehouses. Proceedings of the 3rd International Conference on Enterprise Information Systems (ICEIS 2001), pp. 199-206. ICEIS Press, Setúbal (Portugal), July 7-10, 2001.

A data warehouse is a repository of data extracted from different and possibly heterogeneous sources (e.g., databases or files). One of the main problems in integrating databases into a common repository is the possible inconsistency of the values stored in them, i.e., the very same term may have different values due to misspellings, a permuted word order, spelling variants, and so on. In this paper, we present an automatic method for reducing the inconsistency found in existing databases and thus improving data quality. All the values that refer to the same term are clustered by measuring their degree of similarity. The clustered values can be assigned to a common value that, in principle, could substitute for the original values. Thus, the values are made uniform. The method we propose provides good results with a considerably low error rate.
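The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the method from the paper: the similarity measure (Python's `difflib.SequenceMatcher` applied after sorting the words), the greedy single-pass clustering, the 0.85 threshold, and the function names `normalize`, `similarity`, `cluster_values`, and `unify` are all assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    # Lowercase and sort the words so a permuted word order compares as equal.
    return " ".join(sorted(value.lower().split()))


def similarity(a: str, b: str) -> float:
    # Ratio in [0, 1]; 1.0 means the normalized strings are identical.
    # SequenceMatcher is a stand-in measure, not the one used in the paper.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_values(values, threshold=0.85):
    # Greedy single pass: a value joins the first cluster whose
    # representative (its first member) is similar enough; otherwise
    # it starts a new cluster. The threshold is an assumed value.
    clusters = []
    for value in values:
        for cluster in clusters:
            if similarity(value, cluster[0]) >= threshold:
                cluster.append(value)
                break
        else:
            clusters.append([value])
    return clusters


def unify(values, threshold=0.85):
    # Map every original value to the first member of its cluster,
    # which plays the role of the common value.
    mapping = {}
    for cluster in cluster_values(values, threshold):
        for value in cluster:
            mapping[value] = cluster[0]
    return mapping
```

Calling `unify` on a list of raw values returns a dictionary from each original value to the value chosen to represent its cluster. A real system would probably pick the representative more carefully (for example, the most frequent variant) and tune the threshold against a labeled sample.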
  






Page maintained by Sergio Luján Mora
Last updated: 19-Dec-2001