The ACL RD-TEC 1.0: The Complete Resource For Download


The dataset is organized into several folders, each of which may contain sub-folders or archive files. The archive files contain tab-separated text files. In each of these text files, the first line starts with the character “#” and describes the content of the file.
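The file layout described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function name `read_rdtec_tsv` and the sample content are hypothetical, and it assumes only what is stated here, namely that fields are tab-separated and that the first line begins with “#” and describes the file.

```python
import csv
import io


def read_rdtec_tsv(stream):
    """Return (description, rows) for one tab-separated text stream.

    Assumes the first line starts with '#' and describes the file,
    and that every following non-empty line is a tab-separated record.
    """
    first = stream.readline().rstrip("\r\n")
    if not first.startswith("#"):
        raise ValueError("expected a first line starting with '#'")
    # Drop the leading '#' and surrounding whitespace from the description.
    description = first[1:].strip()
    # QUOTE_NONE: treat quote characters as ordinary text, not as CSV quoting.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]
    return description, rows


# Hypothetical sample content; the real column layout is given by each
# file's own '#' line.
sample = io.StringIO("# term\tannotation\nmachine translation\t1\nparse tree\t1\n")
description, rows = read_rdtec_tsv(sample)
```

For a file extracted from one of the archives, the same function can be called on a handle opened with `open(path, encoding="utf-8", newline="")`; the encoding is an assumption and should be checked against the files themselves.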

Below is the list of files and folders available for download. An additional description is provided for each category and file.

Name                              Description
annotation/                       Files that represent the set of manually annotated candidate terms.
annotation_guideline/             Annotation guidelines and relevant documentation.
annotator_agreement_test_files/   Additional annotations used for calculating the annotator agreement.
candidate_term/                   The set of all the extracted candidate terms.
cleansed_text/                    Raw text files in XML format, cleansed and segmented at the paragraph level.
external_resource/                Resources from ACL ARC.
licenses/                         Relevant license files.
misc/                             Additional helpful materials.
sepid_corpus/                     The pre-processed, segmented, tagged and indexed ACL ARC.

This page was last edited on 21 April 2017.

