CREST: Corpus of Recommendation Strength
CREST is a collection of clinical guidelines annotated with instances of recommendations, each labeled with its strength of importance as specified by the guideline's authors. Because the data is drawn from many disparate authors, a unified scheme for labelling importance is defined, together with a mapping to it for each guideline.
For a detailed description of the corpus, please see the paper by Read, Velldal, Cavazza & Georg presented at LREC 2016, A Corpus of Clinical Practice Guidelines Annotated with the Importance of Recommendations:
http://www.lrec-conf.org/proceedings/lrec2016/summaries/521.html
The data is available for download from the following link:
https://www.velldal.net/erik/data/crest.tgz
The archive `crest.tgz` contains the following:
- partitions.xml
  The assignment of guidelines used in experiments described in the paper above, with guideline identifiers assigned to either heldout or development partitions (together with development’s folds for cross-validation).
- primary/
  HTML acquired from www.guidelines.gov (named according to the guideline identifier).
- schemes.xml
  The recommendation strength schemes used by individual guidelines, which also contain attributes mapping to the unified scheme described in the paper above.
- xml/
  XML encoding of the recommendations section in guidelines, with explicit labels of importance removed from the text and instead indicated with XML attributes (named according to the guideline identifier).
- README
  The same information as found on this page.
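As a minimal sketch of how the archive contents listed above might be inspected after download, the following Python uses only the standard library. The function names `files_per_directory` and `element_tags` are hypothetical helpers, not part of the corpus distribution, and the sketch assumes a local copy of the archive. Because the XML element and attribute names are not documented on this page, the code makes no assumptions about them and only reports which tags occur.

```python
import tarfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import PurePosixPath


def files_per_directory(archive_path):
    """Count the regular files found under each directory of the archive."""
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                # Group by the containing directory ("." for top-level files).
                counts[str(PurePosixPath(member.name).parent)] += 1
    return counts


def element_tags(archive_path, suffix):
    """Count element tags in the first member whose name ends with `suffix`.

    This is only a way to discover the schema; it assumes nothing about
    which elements or attributes the corpus actually uses.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(suffix):
                root = ET.parse(archive.extractfile(member)).getroot()
                return Counter(element.tag for element in root.iter())
    raise FileNotFoundError(f"no archive member ends with {suffix!r}")
```

For example, with the archive saved locally as `crest.tgz` (an assumed path), `files_per_directory("crest.tgz")` gives a quick count of the files in `primary/` and `xml/`, and `element_tags("crest.tgz", "partitions.xml")` shows which elements appear in the partition file before any schema-specific parsing code is written.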
Please use the following citation when referencing the data:
@InProceedings{ReaVelCavGeo16,
author = {Jonathon Read and Erik Velldal and
Marc Cavazza and Gersende Georg},
title = {A Corpus of Clinical Practice Guidelines Annotated with the
Importance of Recommendations},
booktitle = {Proceedings of the Tenth International Conference on
Language Resources and Evaluation},
pages = {1724--1731},
year = {2016},
address = {Portorož, Slovenia}
}