Guidelines
Document collections
Registered participants can download the corpora from the CLEF website (the registration form and end-user agreement must be filled in first). The collections will be the same as those used for the main task.
Questions
The set of questions will be smaller than the one provided for the main task: it will
be made up of 20 questions for the workshop and 100 for the web exercise. This year the
test set will consist of two types of questions:
(1) factoid questions: fact-based questions, asking for the name of a person, a location,
the extent of something, the day on which something happened, etc.
(2) definition questions: questions like "What/Who is X?". This year definition questions
will concern not only people and organisations, but also objects, natural phenomena, legal procedures, etc.
Some questions may have no answer in the document collection; in this case the correct
response is a blank string with the docid "NIL".
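As an illustration of the NIL rule above, the following minimal Python sketch shows one way a system might fall back to the blank-string/"NIL" response when it finds no answer. The names Response and make_response are hypothetical and are not part of the task specification; the actual submission format is defined in the main task guidelines.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    """Hypothetical container for one system response (not an official format)."""
    answer: str  # answer string; blank when the collection holds no answer
    docid: str   # id of the supporting document; "NIL" when there is no answer


def make_response(answer: Optional[str], docid: Optional[str]) -> Response:
    """Return the response, falling back to the NIL response if nothing was found."""
    cleaned = (answer or "").strip()
    if not cleaned or not docid:
        # No answer in the document collection: blank string with docid "NIL".
        return Response(answer="", docid="NIL")
    return Response(answer=cleaned, docid=docid)
```

With this sketch, make_response(None, None) yields a blank answer with docid "NIL", while a found answer keeps its own document id.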
Question format
Test sets will be formatted as XML files (UTF-8 encoded). For more information, see the guidelines of the main QA task 2007.