DrugSemantics Gold Standard

The DrugSemantics gold standard consists of 5 Summaries of Product Characteristics (SPCs) written in Spanish. The SPCs were retrieved from the Medicines Online Information Center (CIMA), which belongs to the Spanish Agency for Medicines and Health Products (AEMPS).

The corpus is annotated with 10 types of Named Entity (NE) related to pharmacotherapeutic care: Chemical Composition, Disease, Drug, Excipient, Food, Medicament, Pharmaceutical Form, Route, Therapeutic Action and Unit of Measurement. It contains 2241 NEs, 780 sentences and 226,729 tokens.

DrugSemantics was designed for developing and testing Spanish NE recognition tools in the pharmacotherapeutic domain.
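
As an illustration of how a gold standard like this one can be used to test an NE recognition tool, the following is a minimal sketch of exact-match scoring in Python. The `Entity` tuple and the `exact_match_scores` function are hypothetical names introduced here; they do not reflect the corpus file format or the authors' evaluation procedure, neither of which is described on this page. Only the ten entity type names are taken from the description above.

```python
from typing import Iterable, NamedTuple, Tuple

# The ten entity types listed in the corpus description.
ENTITY_TYPES = frozenset({
    "Chemical Composition", "Disease", "Drug", "Excipient", "Food",
    "Medicament", "Pharmaceutical Form", "Route", "Therapeutic Action",
    "Unit of Measurement",
})


class Entity(NamedTuple):
    """Hypothetical in-memory annotation (assumption, not the corpus format)."""
    start: int   # character offset where the mention begins
    end: int     # character offset where the mention ends
    label: str   # one of ENTITY_TYPES


def exact_match_scores(
    gold: Iterable[Entity], predicted: Iterable[Entity]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1).

    A prediction counts as correct only if its offsets and its label
    are both identical to those of a gold entity.
    """
    gold_set, pred_set = set(gold), set(predicted)

    # Reject labels outside the ten documented types.
    unknown = {e.label for e in gold_set | pred_set} - ENTITY_TYPES
    if unknown:
        raise ValueError(f"Unknown entity types: {sorted(unknown)}")

    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up annotations for illustration only; not taken from the corpus.
gold = [Entity(0, 11, "Drug"), Entity(20, 22, "Unit of Measurement")]
pred = [Entity(0, 11, "Drug"), Entity(30, 34, "Route")]
precision, recall, f1 = exact_match_scores(gold, pred)
```

Exact matching is the strictest of the common scoring schemes; a partial-overlap variant would need a different comparison than set intersection.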


The dataset is available for download.


Please feel free to send us your technical questions, requests and bug reports by email.

Resource languages: 
This corpus is licensed under an Attribution-NonCommercial 3.0 Unported licence.
Bibliographic reference: 

If you use this data, please cite the following publications:

Isabel Moreno, Ester Boldrini, Paloma Moreda, M. Teresa Romá-Ferri (2017), "DrugSemantics Gold Standard", Mendeley Data, v1.
Isabel Moreno, Ester Boldrini, Paloma Moreda, M. Teresa Romá-Ferri (2017), "DrugSemantics: A corpus for Named Entity Recognition in Spanish Summaries of Product Characteristics", Journal of Biomedical Informatics, Volume 72, August 2017, Pages 8-22, ISSN 1532-0464.
