The ACL RD-TEC 2.0

As the successor to the ACL RD-TEC 1.0, the ACL RD-TEC 2.0 comprises 300 unique abstracts from the ACL Anthology Corpus, manually annotated for the terms they contain.
These terms are tagged with several categories of computational linguistics concepts: technologies, systems, language resources, language resources (specific product), models, measure and measurement-related terms, and a class label for residuals (i.e., other); see the guidelines here.
In total, there are 471 annotated files, because 171 of the 300 abstracts are annotated by two annotators (myself and Anne Schumann).
The manually annotated corpora produced by each of the participating annotators can be browsed in the NoSkE engine at these links for Annotator 1 and for Annotator 2. To see the annotated terms in an abstract, click on the links provided in the A1 and A2 columns of the table below.

More information about the dataset can be found in the following publication:
QasemiZadeh and Schumann, The ACL RD-TEC 2.0: A Language Resource for Evaluating Term Extraction and Entity Recognition Methods, LREC 2016.

Obtaining the annotated corpus

The ACL RD-TEC has a permanent home at the LINDAT/CLARIN Repository of the Institute of Formal and Applied Linguistics, Charles University in Prague: http://hdl.handle.net/11372/LRT-1661.
The Git repository containing the data and some tools can also be browsed here.
You can also use the local NoSkE instances to collect data.
The corpus is also hosted by the Lindat KonText system at UFAL.

Summary:

Total number of abstracts annotated by at least one annotator: 300
Total number of files annotated by the first annotator: 282
Total number of files annotated by the second annotator: 189
Total number of files annotated by both annotators: 171
Average inter-annotator agreement on annotated boundaries: 0.8
Average inter-annotator agreement on annotated boundaries and their assigned class: 0.71
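
The counts in this summary are mutually consistent under inclusion-exclusion. The short check below is only an illustrative sketch; the variable names are hypothetical and the numbers are copied from the summary.

```python
# Counts copied from the summary (variable names are illustrative only).
annotated_by_a1 = 282    # files annotated by the first annotator
annotated_by_a2 = 189    # files annotated by the second annotator
annotated_by_both = 171  # files annotated by both annotators

# Inclusion-exclusion: abstracts annotated by at least one annotator.
unique_abstracts = annotated_by_a1 + annotated_by_a2 - annotated_by_both

# Total annotated files, counting doubly annotated abstracts twice.
total_annotated_files = annotated_by_a1 + annotated_by_a2

print(unique_abstracts)       # 300
print(total_annotated_files)  # 471
```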

Lists of abstracts:

In the following table, the annotators of each file are marked in the A1 and A2 columns (each mark links to the terms that the annotator identified in the abstract). IAAS and IAAA show the inter-annotator agreement on term boundaries and on term boundaries together with their semantic classes, respectively. Bear in mind that IAAS is an upper bound on IAAA.
Inter-annotator agreement computed per concept category can be seen here.
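
For readers who want to work with the table programmatically, the following is a minimal sketch. It assumes the rows have been saved as tab-separated text in the column order Year, ACL ID, Title, A1, A2, IAAS, IAAA; the function name `parse_row` and the `sample` list are hypothetical and not part of the distributed tools. Rows annotated by only one annotator have no agreement scores, so those fields are parsed as `None`.

```python
from statistics import mean

COLUMNS = 7  # Year, ACL ID, Title, A1, A2, IAAS, IAAA


def parse_row(line):
    """Parse one tab-separated table row; missing trailing cells become empty."""
    cells = line.rstrip("\n").split("\t")
    cells += [""] * (COLUMNS - len(cells))  # pad rows that lack scores
    year, acl_id, title, a1, a2, iaas, iaaa = cells[:COLUMNS]
    return {
        "year": int(year),
        "acl_id": acl_id,
        "title": title,
        "a1": a1 == "✓",
        "a2": a2 == "✓",
        "iaas": float(iaas) if iaas else None,
        "iaaa": float(iaaa) if iaaa else None,
    }


# A few rows copied from the table, for illustration only.
sample = [
    "1978\tT78‑1001\tTesting The Psychological Reality of a Representational Model\t✓\t✓\t0.87\t0.82",
    "1978\tT78‑1028\tFragments of a Theory of Human Plausible Reasoning\t✓\t✓\t0.86\t0.78",
    "1981\tP81‑1032\tDynamic Strategy Selection in Flexible Parsing\t\t✓",
    "1992\tM92‑1025\tGE NLTOOLSET: DESCRIPTION OF THE SYSTEM AS USED FOR MUC‑4\t✓",
]

rows = [parse_row(line) for line in sample]
both = [r for r in rows if r["a1"] and r["a2"]]

# IAAS bounds IAAA from above for every doubly annotated abstract.
assert all(r["iaaa"] <= r["iaas"] for r in both)

mean_iaas = mean(r["iaas"] for r in both)
mean_iaaa = mean(r["iaaa"] for r in both)
print(len(both), mean_iaas, mean_iaaa)
```

Applied to the full table, the same averaging over doubly annotated rows should give values close to the averages reported in the summary, although how those figures were originally computed is not stated here.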

Year	ACL ID	Title	A1	A2	IAAS	IAAA
1978	T78‑1001	Testing The Psychological Reality of a Representational Model	✓	✓	0.87	0.82
1978	T78‑1028	Fragments of a Theory of Human Plausible Reasoning	✓	✓	0.86	0.78
1978	T78‑1031	PATH‑BASED AND NODE‑BASED INFERENCE IN SEMANTIC NETWORKS	✓	✓	0.94	0.73
1980	C80‑1039	ON FROFF: A TEXT PROCESSING SYSTEM FOR ENGLISH TEXTS AND FIGURES	✓	✓	0.53	0.53
1980	C80‑1073	ATNS USED AS A PROCEDURAL DIALOG MODEL	✓	✓	0.75	0.5
1980	P80‑1004	Metaphor ‑ A Key to Extensible Semantic Analysis	✓	✓	0.69	0.21
1980	P80‑1019	Expanding the Horizons of Natural Language Interfaces	✓	✓	0.64	0.64
1980	P80‑1026	Flexible Parsing	✓	✓	0.6	0.53
1981	P81‑1032	Dynamic Strategy Selection in Flexible Parsing		✓
1981	P81‑1033	A Construction‑Specific Approach to Focused Interaction in Flexible Parsing		✓
1982	C82‑1054	AN IMPROVED LEFT‑CORNER PARSING ALGORITHM	✓	✓	1	1
1982	J82‑3002	An Efficient Easily Adaptable System for Interpreting Natural Language Queries	✓	✓	0.88	0.85
1982	P82‑1035	Scruffy Text Understanding: Design and Implementation of 'Tolerant' Understanders	✓	✓	0.51	0.51
1983	E83‑1021	AN APPROACH TO NATURAL LANGUAGE IN THE SI‑NETS PARADIGM		✓
1983	E83‑1029	NATURAL LANGUAGE INPUT FOR SCENE GENERATION		✓
1983	P83‑1003	Crossed Serial Dependencies: A low‑power parseable extension to GPSG		✓
1983	P83‑1004	Formal Constraints on Metarules		✓
1983	P83‑1021	PARSING AS DEDUCTION		✓
1984	P84‑1020	LIMITED DOMAIN SYSTEMS FOR LANGUAGE TEACHING	✓	✓	1	1
1984	P84‑1034	A PROPER TREATMENT OF SYNTAX AND SEMANTICS IN MACHINE TRANSLATION	✓	✓	0.93	0.93
1984	P84‑1047	Entity‑Oriented Parsing	✓	✓	0.8	0.55
1984	P84‑1064	A COMPUTATIONAL THEORY OF DISPOSITIONS	✓	✓	0.89	0.79
1984	P84‑1078	Controlling Lexical Substitution in Computer Text Generation	✓	✓	0.83	0.78
1985	E85‑1037	A PROBLEM SOLVING APPROACH TO GENERATING TEXT FROM SYSTEMIC GRAMMARS		✓
1985	E85‑1041	THE STRUCTURE OF COMMUNICATIVE CONTEXT OF DIALOGUE INTERACTION		✓
1985	P85‑1015	Parsing with Discontinuous Constituents		✓
1985	P85‑1019	Semantic Caseframe Parsing and Syntactic Generality		✓
1985	P85‑1024	A PRAGMATICS‑BASED APPROACH TO UNDERSTANDING INTERSENTENTIAL ELLIPSIS		✓
1986	C86‑1081	A LOGICAL FORMALISM FOR THE REPRESENTATION OF DETERMINERS	✓	✓	0.86	0.86
1986	C86‑1132	SYNTHESIZING WEATHER FORECASTS FROM FORMATTED DATA	✓	✓	0.88	0.88
1986	J86‑1002	THE CORRECTION OF ILL‑FORMED INPUT USING HISTORY‑BASED EXPECTATION WITH APPLICATIONS TO SPEECH UNDERSTANDING	✓	✓	0.74	0.64
1986	J86‑3001	Attention, Intentions, And The Structure Of Discourse	✓	✓	0.83	0.82
1986	J86‑4002	REFERENCE IDENTIFICATION AND REFERENCE IDENTIFICATION FAILURES	✓	✓	0.86	0.84
1986	P86‑1011	The Relationship Between Tree Adjoining Grammars And Head Grammars	✓	✓	0.92	0.73
1986	P86‑1038	A LOGICAL SEMANTICS FOR FEATURE STRUCTURES	✓	✓	0.84	0.66
1987	E87‑1037	A Comparison of Rule‑Invocation Strategies in Context‑Free Chart Parsing		✓
1987	E87‑1043	ITERATION, HABITUALITY AND VERB FORM SEMANTICS		✓
1987	J87‑1003	SIMULTANEOUS‑DISTRIBUTIVE COORDINATION AND CONTEXT‑FREENESS		✓
1987	J87‑3001	PROCESSING DICTIONARY DEFINITIONS WITH PHRASAL PATTERN HIERARCHIES		✓
1987	P87‑1022	A CENTERING APPROACH TO PRONOUNS		✓
1988	A88‑1001	The Multimedia Articulation of Answers in a Natural Language Database Query System	✓	✓	0.76	0.63
1988	A88‑1003	An Architecture for Anaphora Resolution	✓	✓	0.53	0.43
1988	C88‑1007	Machine Translation Using Isomorphic UCGs	✓	✓	0.59	0.56
1988	C88‑1044	On the Generation and Interpretation of Demonstrative Expressions	✓	✓	0.71	0.71
1988	C88‑1066	Parsing with Category Coocurrence Restrictions	✓	✓	0.87	0.69
1988	C88‑2086	Solving Some Persistent Presupposition Problems	✓	✓	0.57	0.57
1988	C88‑2130	Directing the Generation of Living Space Descriptions	✓	✓	1	0.67
1988	C88‑2132	Island Parsing and Bidirectional Charts	✓	✓	0.55	0.48
1988	C88‑2160	Interactive Translation : a new approach	✓	✓	0.43	0.43
1988	C88‑2162	NETL: A System for Representing and Using Real‑World Knowledge	✓	✓	0.83	0.45
1988	C88‑2166	COMPLEX: A Computational Lexicon for Natural Language Systems	✓	✓	0.7	0.64
1988	J88‑3002	MODELING THE USER IN NATURAL LANGUAGE SYSTEMS	✓	✓	0.9	0.86
1989	E89‑1006	TENSES AS ANAPHORA		✓
1989	E89‑1016	User studies and the design of Natural Language Systems		✓
1989	H89‑1027	THE MIT SUMMIT SPEECH RECOGNITION SYSTEM: A PROGRESS REPORT		✓
1989	H89‑1036	Lexicalized TAGs, Parsing and Lexicons		✓
1989	H89‑2019	A PROPOSAL FOR SLS EVALUATION		✓
1989	H89‑2028	A CSR‑NL INTERFACE SPECIFICATION: Version 1.5		✓
1989	J89‑4003	A FORMAL MODEL FOR CONTEXT‑FREE LANGUAGES AUGMENTED WITH REDUPLICATION		✓
1989	P89‑1008	CONVERSATIONALLY RELEVANT DESCRIPTIONS		✓
1990	C90‑1013	Generation for Dialogue Translation Using Typed Feature Structure Unification	✓	✓	0.91	0.67
1990	C90‑2032	Sentence disambiguation by document preference sets oriented	✓	✓	0.7	0.7
1990	C90‑3014	A phonological knowledge base system using unification‑based formalism: a case study of Korean phonology	✓	✓	0.67	0.33
1990	C90‑3045	Synchronous Tree‑Adjoining Grammars	✓	✓	0.71	0.67
1990	C90‑3046	Japanese Sentence Analysis as Argumentation	✓	✓	0.88	0.62
1990	C90‑3063	Automatic Processing of Large Corpora for the Resolution of Anaphora References	✓	✓	0.69	0.36
1990	C90‑3072	Spelling‑checking for Highly Inflective Languages	✓	✓	0.86	0.83
1990	H90‑1016	Toward a Real‑Time Spoken Language System Using Commercial Hardware	✓	✓	0.5	0.42
1990	H90‑1060	A New Paradigm for Speaker‑Independent Training and Speaker Adaptation	✓	✓	0.83	0.74
1990	J90‑3002	AN EDITOR FOR THE EXPLANATORY AND COMBINATORY DICTIONARY OF CONTEMPORARY FRENCH (DECFC)	✓	✓	0.88	0.83
1990	P90‑1014	Free Indexation: Combinatorial Analysis and A Compositional Algorithm	✓	✓	0.75	0.46
1991	E91‑1012	Non‑deterministic Recursive Ascent Parsing		✓
1991	E91‑1043	A BIDIRECTIONAL MODEL FOR NATURAL LANGUAGE PROCESSING		✓
1991	E91‑1050	A Language for the Statement of Binary Relations over Feature Structures		✓
1991	H91‑1010	New Results with the Lincoln Tied‑Mixture HMM CSR System		✓
1991	H91‑1067	Automatic Acquisition of Subcategorization Frames from Tagged Text		✓
1991	H91‑1077	A PROPOSAL FOR LEXICAL DISAMBIGUATION		✓
1991	P91‑1016	The Acquisition and Application of Context Sensitive Grammar for English		✓
1991	P91‑1025	Resolving Translation Mismatches With Information Flow		✓
1992	A92‑1026	Robust Processing of Real‑World Natural‑Language Texts	✓	✓	0.77	0.77
1992	A92‑1027	An Efficient Chart‑based Algorithm for Partial‑Parsing of Unrestricted Texts	✓	✓	0.84	0.8
1992	C92‑1052	Temporal Structure of Discourse	✓	✓	0.92	0.73
1992	C92‑1055	Syntactic Ambiguity Resolution Using A Discrimination and Robustness Oriented Adaptive Learning Algorithm	✓	✓	0.76	0.71
1992	C92‑2068	Quasi‑Destructive Graph Unification with Structure‑Sharing	✓	✓	0.5	0.5
1992	C92‑2115	A Similarity‑Driven Transfer System	✓	✓	1	0.91
1992	C92‑3165	Interactive Speech Understanding	✓	✓	0.67	0.67
1992	C92‑4199	Recognizing Unregistered Names for Mandarin Word Identification	✓	✓	0.76	0.7
1992	C92‑4207	Reconstructing Spatial Image from Natural Language Texts	✓	✓	0.58	0.48
1992	H92‑1003	Multi‑Site Data Collection for a Spoken Language Corpus: MADCOW	✓	✓	0.94	0.88
1992	H92‑1010	Spoken Language Processing in the Framework of Human‑Machine Communication at LIMSI	✓	✓	0.89	0
1992	H92‑1016	The MIT ATIS System: February 1992 Progress Report	✓	✓	0.8	0.75
1992	H92‑1017	Recent Improvements and Benchmark Results for Paramax ATIS System	✓	✓	0.67	0.59
1992	H92‑1026	Towards History‑based Grammars: Using Richer Models for Probabilistic Parsing	✓	✓	0.74	0.55
1992	H92‑1036	MAP Estimation of Continuous Density HMM: Theory and Applications	✓	✓	0.83	0.79
1992	H92‑1045	One Sense Per Discourse	✓	✓	0.83	0.77
1992	H92‑1060	A Relaxation Method for Understanding Spontaneous Speech Utterances	✓	✓	0.6	0.46
1992	H92‑1074	CSR Corpus Development	✓	✓	0.69	0.64
1992	H92‑1095	Language Understanding Research at Paramax	✓	✓	0.74	0.74
1992	M92‑1025	GE NLTOOLSET: DESCRIPTION OF THE SYSTEM AS USED FOR MUC‑4	✓
1993	E93‑1004	Talking About Trees		✓
1993	E93‑1013	LFG Semantics via Constraints		✓
1993	E93‑1020	A Computational Treatment of Sentence‑Final 'then'		✓
1993	E93‑1023	A Probabilistic Context‑free Grammar for Disambiguation in Morphological Parsing		✓
1993	E93‑1025	A Discourse Copying Algorithm for Ellipsis and Anaphora Resolution		✓
1993	E93‑1043	Coping With Derivation in a Morphological Component		✓
1993	H93‑1076	Speech and Text‑Image Processing in Documents		✓
1993	P93‑1014	A UNIFICATION‑BASED PARSER FOR RELATIONAL GRAMMAR		✓
1994	A94‑1007	Symmetric Pattern Matching Analysis for English Coordinate Structures	✓	✓	0.84	0.63
1994	A94‑1011	Exploiting Sophisticated Representations for Document Retrieval	✓	✓	0.77	0.66
1994	A94‑1017	Real‑Time Spoken Language Translation Using Associative Processors	✓	✓	0.79	0.68
1994	A94‑1026	Handling Japanese Homophone Errors in Revision Support System for Japanese Texts; REVISE	✓	✓	0.82	0.82
1994	C94‑1026	A Part‑of‑Speech‑Based Alignment Algorithm	✓	✓	0.86	0.74
1994	C94‑1030	AN EVALUATION TO DETECT AND CORRECT ERRONEOUS CHARACTERS WRONGLY SUBSTITUTED, DELETED AND INSERTED IN JAPANESE AND ENGLISH SENTENCES USING MARKOV MODELS	✓	✓	0.93	0.86
1994	C94‑1052	TGE: Tlinks Generation Environment.	✓	✓	0.84	0.71
1994	C94‑1061	CONCURRENT LEXICALIZED DEPENDENCY PARSING: THE ParseTalk MODEL	✓	✓	0.83	0.6
1994	C94‑1077	Emergent Parsing and Generation with Generalized	✓	✓	0.92	0.88
1994	C94‑1079	PRINCIPAR‑‑An Efficient, Broad‑coverage, Principle‑based Parser	✓	✓	0.74	0.5
1994	C94‑1080	CONCURRENT LEXICALIZED DEPENDENCY PARSING: A BEHAVIORAL VIEW ON ParseTalk EVENTS	✓	✓	0.84	0.71
1994	C94‑1082	LHIP: Extended DCGs for Configurable Robust Parsing	✓	✓	0.6	0.6
1994	C94‑2151	Non‑constituent coordination: Theory and practice	✓	✓	0.75	0.75
1994	H94‑1014	Language Modeling with Sentence‑Level Mixtures	✓	✓	0.86	0.81
1994	H94‑1034	Tagging Speech Repairs	✓	✓	0.75	0.7
1994	H94‑1084	Integrated Text and Image Understanding for Document Understanding	✓	✓	0.76	0.76
1995	E95‑1021	Tagging French ‑ comparing a statistical and a constraint‑based method		✓
1995	E95‑1033	ParseTalk about Sentence‑ and Text‑Level Anaphora		✓
1995	E95‑1036	Splitting the Reference Time: Temporal Anaphora and Quantification in DRT		✓
1995	P95‑1013	Compilation of HPSG to TAG		✓
1995	P95‑1025	Statistical Sense Disambiguation with Relatively Small Corpora Using Dictionary Definitions		✓
1995	P95‑1027	A Quantitative Evaluation of Linguistic Tests for the Automatic Prediction of Semantic Markedness		✓
1995	P95‑1034	Two‑Level, Many‑Paths Generation		✓
1995	P95‑1053	Conciseness through Aggregation in Text Generation		✓
1997	A97‑1015	The Domain Dependence of Parsing		✓
1997	A97‑1020	Reading more into Foreign Languages		✓
1997	A97‑1021	Large‑Scale Acquisition of LCS‑Based Lexicons for Foreign Language Tutoring		✓
1997	A97‑1027	Dutch Sublanguage Semantic Tagging combined with Mark‑Up Technology		✓
1997	A97‑1028	A Statistical Profile of the Named Entity Task		✓
1997	A97‑1034	Using SGML as a Basis for Data‑Intensive NLP		✓
1997	A97‑1042	Identifying Topics by Position		✓
1997	A97‑1050	Semi‑Automatic Acquisition of Domain‑Specific Translation Lexicons		✓
1997	A97‑1052	Corpus Data TP FP FN		✓
1997	P97‑1002	Fast Context‑Free Parsing Requires Fast Boolean Matrix Multiplication		✓
1997	P97‑1006	Document Classification Using a Finite Mixture Model		✓
1997	P97‑1015	Probing the lexicon in evaluating commercial MT systems		✓
1997	P97‑1017	Machine Transliteration		✓
1997	P97‑1026	Sentence Planning as Description Using Tree Adjoining Grammar		✓
1997	P97‑1040	Efficient Generation in Primitive Optimality Theory		✓
1997	P97‑1050	Efficient Construction of Underspecified Semantics under Massive Ambiguity		✓
1997	P97‑1052	On Interpreting F‑Structures as UDRSs		✓
1997	P97‑1058	Approximating Context‑Free Grammars with a Finite‑State Calculus		✓
1997	P97‑1071	Contrastive accent in a data‑to‑speech system		✓
1997	P97‑1072	Towards resolution of bridging descriptions		✓
1999	E99‑1005	Determinants of Adjective‑Noun Plausibility		✓
1999	E99‑1014	Full Text Parsing using Cascades of Rules: an Information Extraction Perspective		✓
1999	E99‑1015	An annotation scheme for discourse‑level argumentation in research articles		✓
1999	E99‑1023	Representing Text Chunks		✓
1999	E99‑1029	Parsing with an Extended Domain of Locality		✓
1999	E99‑1034	Finding content‑bearing terms using term similarities		✓
1999	E99‑1038	Focusing on focus: a formalization		✓
1999	P99‑1022	Dynamic Nonlocal Language Modeling via Hierarchical Topic‑Based Adaptation		✓
1999	P99‑1025	Construct Algebra: Analytical Dialog Management		✓
1999	P99‑1036	A Part of Speech Estimation Method for Japanese Unknown Words using a Statistical Model of Morphology and Context		✓
1999	P99‑1038	Two Accounts of Scope Availability and Semantic Underspecification		✓
1999	P99‑1058	A semantically‑derived subset of English for hardware verification		✓
1999	P99‑1062	Semantic Analysis of Japanese Noun Phrases : A New Approach to Dictionary‑Based Understanding		✓
1999	P99‑1068	Mining the Web for Bilingual Text		✓
1999	P99‑1080	A Pylonic Decision‑Tree Language Model with Optimal Question Selection		✓
2001	H01‑1001	Activity detection for information access to oral communication	✓	✓	0.74	0.43
2001	H01‑1017	Dialogue Interaction with the DARPA Communicator Infrastructure: The Development of Useful Software	✓	✓	0.62	0.62
2001	H01‑1040	Intelligent Access to Text: Integrating Information Extraction Technology into Text Browsers	✓	✓	0.83	0.67
2001	H01‑1041	Interlingua‑Based Broad‑Coverage Korean‑to‑English Translation in CCLING	✓	✓	0.84	0.82
2001	H01‑1042	Is That Your Final Answer?	✓	✓	0.95	0.95
2001	H01‑1049	Listen‑Communicate‑Show (LCS): Spoken Language Command of Agent‑based Remote Information Access	✓	✓	0.9	0.9
2001	H01‑1055	Natural Language Generation in Dialog Systems	✓	✓	1	1
2001	H01‑1058	On Combining Language Models : Oracle Approach	✓	✓	0.98	0.88
2001	H01‑1068	A Three‑Tiered Evaluation Approach for Interactive Spoken Dialogue Systems	✓	✓	0.67	0.67
2001	H01‑1070	Towards an Intelligent Multilingual Keyboard System	✓	✓	1	1
2001	N01‑1003	SPoT: A Trainable Sentence Planner	✓	✓	0.84	0.82
2001	P01‑1004	Low‑cost, High‑performance Translation Retrieval: Dumber is Better	✓	✓	0.91	0.91
2001	P01‑1007	Guided Parsing of Range Concatenation Languages	✓	✓	0.78	0.68
2001	P01‑1008	Extracting Paraphrases from a Parallel Corpus	✓	✓	0.95	0.95
2001	P01‑1009	Alternative Phrases and Natural Language Information Retrieval	✓	✓	0.69	0.64
2001	P01‑1047	Extending Lambek grammars: a logical account of minimalist grammars	✓	✓	0.93	0.93
2001	P01‑1056	Evaluating a Trainable Sentence Planner for a Spoken Dialogue System	✓	✓	0.93	0.93
2001	P01‑1070	Using Machine Learning Techniques to Interpret WH‑questions	✓	✓	0.8	0.67
2003	N03‑1001	Effective Utterance Classification with Unsupervised Phonotactic Models	✓	✓	0.94	0.87
2003	N03‑1004	In Question Answering, Two Heads Are Better Than One	✓	✓	0.76	0.76
2003	N03‑1012	Semantic Coherence Scoring Using an Ontology	✓	✓	0.96	0.92
2003	N03‑1017	Statistical Phrase‑Based Translation	✓	✓	1	1
2003	N03‑1018	A Generative Probabilistic OCR Model for NLP Applications	✓	✓	0.78	0.71
2003	N03‑1026	Statistical Sentence Condensation using Ambiguity Packing and Stochastic Disambiguation Methods for Lexical‑Functional Grammar	✓	✓	0.76	0.63
2003	N03‑1033	Feature‑Rich Part‑of‑Speech Tagging with a Cyclic Dependency Network	✓	✓	0.88	0.73
2003	N03‑2003	Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class‑Dependent Mixtures	✓	✓	0.92	0.92
2003	N03‑2006	Adaptation Using Out‑of‑Domain Corpus within EBMT	✓	✓	0.67	0.61
2003	N03‑2015	Unsupervised Learning of Morphology for English and Inuktitut	✓	✓	0.97	0.97
2003	N03‑2017	Word Alignment with Cohesion Constraint	✓	✓	1	0.73
2003	N03‑2025	Bootstrapping for Named Entity Tagging Using Concept‑based Seeds	✓	✓	0.84	0.78
2003	N03‑2036	A Phrase‑Based Unigram Model for Statistical Machine Translation	✓	✓	0.88	0.85
2003	N03‑3010	Cooperative Model Based Language Understanding in Dialogue	✓	✓	0.95	0.95
2003	N03‑4004	TAP‑XL: An Automated Analyst's Assistant	✓	✓	0.67	0.67
2003	N03‑4010	JAVELIN: A Flexible, Planner‑Based Architecture for Question Answering	✓	✓	1	1
2003	P03‑1002	Using Predicate‑Argument Structures for Information Extraction	✓	✓	0.88	0.71
2003	P03‑1005	Hierarchical Directed Acyclic Graph Kernel: Methods for Structured Natural Language Data	✓	✓	1	0.89
2003	P03‑1009	Clustering Polysemic Subcategorization Frame Distributions Semantically	✓	✓	0.64	0.64
2003	P03‑1022	A Machine Learning Approach to Pronoun Resolution in Spoken Dialogue	✓	✓	0.95	0.95
2003	P03‑1030	Optimizing Story Link Detection is not Equivalent to Optimizing New Event Detection	✓	✓	1	0.95
2003	P03‑1031	Corpus‑based Discourse Understanding in Spoken Dialogue Systems	✓	✓	0.83	0.76
2003	P03‑1033	Flexible Guidance Generation using User Model in Spoken Dialogue Systems	✓	✓	0.86	0.86
2003	P03‑1050	Unsupervised Learning of Arabic Stemming using a Parallel Corpus	✓	✓	0.96	0.96
2003	P03‑1051	Language Model Based Arabic Word Segmentation	✓	✓	0.74	0.72
2003	P03‑1058	Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study	✓	✓	0.81	0.81
2003	P03‑1068	Towards a Resource for Lexical Semantics: A Large German Corpus with Extensive Semantic Annotation	✓	✓	0.8	0.8
2003	P03‑1070	Towards a Model of Face‑to‑Face Grounding	✓	✓	0.97	0.97
2003	P03‑2036	Comparison between CFG filtering techniques for LTAG and HPSG	✓	✓	1	0.91
2004	C04‑1035	Classifying Ellipsis in Dialogue: A Machine Learning Approach	✓
2004	C04‑1036	Feature Vector Quality and Distributional Similarity	✓
2004	C04‑1068	Filtering Speaker‑Specific Words from Electronic Discussions	✓
2004	C04‑1080	Part of Speech Tagging in Context	✓
2004	C04‑1096	Generation of Relative Referring Expressions based on Perceptual Grouping	✓
2004	C04‑1103	Direct Orthographical Mapping for Machine Transliteration	✓
2004	C04‑1106	Lower and higher estimates of the number of "true analogies" between sentences contained in a large multilingual corpus	✓
2004	C04‑1112	A Lemma‑Based Approach to a Maximum Entropy Word Sense Disambiguation System for Dutch	✓
2004	C04‑1116	Term Aggregation: Mining Synonymous Expressions using Personal Stylistic Variations	✓
2004	C04‑1128	Detection of Question‑Answer Pairs in Email Conversations	✓
2004	C04‑1147	Fast Computation of Lexical Affinity Models	✓
2004	C04‑1192	Fine‑Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets	✓
2004	N04‑1022	Minimum Bayes‑Risk Decoding for Statistical Machine Translation	✓
2004	N04‑1024	Evaluating Multiple Aspects of Coherence in Student Essays	✓
2004	N04‑4028	Confidence Estimation for Information Extraction	✓
2004	P04‑2005	Automatic Acquisition of English Topic Signatures Based on a Second Language	✓
2004	P04‑2010	A Machine Learning Approach to German Pronoun Resolution	✓
2005	H05‑1005	Improving Multilingual Summarization: Using Redundancy in the Input to Correct MT errors	✓	✓	0.62	0.57
2005	H05‑1012	A Maximum Entropy Word Aligner for Arabic‑English Machine Translation	✓	✓	0.83	0.83
2005	H05‑1032	Bayesian Learning in Text Summarization		✓
2005	H05‑1064	Hidden‑Variable Models for Discriminative Reranking		✓
2005	H05‑1095	Translating with non‑contiguous phrases	✓	✓	0.96	0.92
2005	H05‑1101	Some Computational Complexity Results for Synchronous Context‑Free Grammars	✓	✓	0.75	0.57
2005	H05‑1115	Using Random Walks for Question‑focused Sentence Retrieval		✓
2005	H05‑1117	Automatically Evaluating Answers to Definition Questions	✓	✓	0.67	0.67
2005	H05‑2007	Pattern Visualization for Machine Translation Output	✓	✓	0.67	0.57
2005	I05‑2013	Automatic recognition of French expletive pronoun occurrences		✓
2005	I05‑2014	BLEU in characters: towards automatic MT evaluation in languages without word delimiters	✓	✓	0.73	0.73
2005	I05‑2021	Evaluating the Word Sense Disambiguation Performance of Statistical Machine Translation	✓	✓	0.77	0.43
2005	I05‑2043	Trend Survey on Japanese Natural Language Processing Studies over the Last Decade		✓
2005	I05‑2044	Two‑Phase Shift‑Reduce Deterministic Dependency Parser of Chinese		✓
2005	I05‑2048	Statistical Machine Translation Part I: Hands‑On Introduction	✓	✓	0.95	0.79
2005	I05‑3022	Chinese Word Segmentation in FTRD Beijing		✓
2005	I05‑4007	Cross‑lingual Conversion of Lexical Semantic Relations: Building Parallel Wordnets		✓
2005	I05‑4008	Taiwan Child Language Corpus: Data Collection and Annotation		✓
2005	I05‑4010	Harvesting the Bitexts of the Laws of Hong Kong From the Web	✓	✓	0.53	0.53
2005	I05‑5003	Using Machine Translation Evaluation Techniques to Determine Sentence‑level Semantic Equivalence	✓	✓	0.68	0.57
2005	I05‑5004	A Class‑oriented Approach to Building a Paraphrase Corpus		✓
2005	I05‑5008	Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation	✓	✓	0.7	0.59
2005	I05‑5009	Evaluating Contextual Dependency of Paraphrases using a Latent Variable Model		✓
2005	I05‑6010	Some remarks on the Annotation of Quantifying Noun Groups in Treebanks		✓
2005	I05‑6011	Annotating Honorifics Denoting Social Ranking of Referents	✓	✓	0.87	0.87
2005	J05‑1003	Discriminative Reranking for Natural Language Parsing	✓	✓	0.78	0.66
2005	J05‑4003	Improving Machine Translation Performance by Exploiting Non‑Parallel Corpora	✓	✓	0.67	0.62
2005	P05‑1010	Probabilistic CFG with latent annotations		✓
2005	P05‑1018	Modeling Local Coherence: An Entity‑based Approach		✓
2005	P05‑1028	Exploring and Exploiting the Limited Utility of Captions in Recognizing Intention in Information Graphics		✓
2005	P05‑1032	Scaling Phrase‑Based Statistical Machine Translation to Larger Corpora and Longer Phrases	✓	✓	0.85	0.64
2005	P05‑1034	Dependency Treelet Translation: Syntactically Informed Phrasal SMT	✓	✓	0.95	0.89
2005	P05‑1039	What to do when lexicalization fails: parsing German with suffix analysis and smoothing		✓
2005	P05‑1046	Unsupervised Learning of Field Segmentation Models for Information Extraction		✓
2005	P05‑1048	Word Sense Disambiguation vs. Statistical Machine Translation	✓	✓	0.86	0.67
2005	P05‑1053	Exploring Various Knowledge in Relation Extraction		✓
2005	P05‑1056	Using Conditional Random Fields For Sentence Boundary Detection In Speech		✓
2005	P05‑1057	Log‑linear Models for Word Alignment		✓
2005	P05‑1058	Alignment Model Adaptation for Domain‑Specific Word Alignment		✓
2005	P05‑1067	Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars	✓	✓	0.93	0.58
2005	P05‑1069	A Localized Prediction Model for Statistical Machine Translation	✓	✓	0.85	0.64
2005	P05‑1073	Joint Learning Improves Semantic Role Labeling		✓
2005	P05‑1074	Paraphrasing with Bilingual Parallel Corpora	✓	✓	0.91	0.77
2005	P05‑1076	Automatic Acquisition of Adjectival Subcategorization from Corpora		✓
2005	P05‑2008	Using Emoticons to reduce Dependency in Machine Learning Techniques for Sentiment Classification		✓
2005	P05‑2013	Automatic Induction of a CCG Grammar for Turkish		✓
2005	P05‑2016	Dependency‑Based Statistical Machine Translation	✓	✓	0.82	0.67
2005	P05‑3001	An Information‑State Approach to Collaborative Reference		✓
2005	P05‑3025	Interactively Exploring a Machine Translation Model	✓	✓	0.42	0.42
2005	P05‑3030	Organizing English Reading Materials for Vocabulary Learning		✓
2006	E06‑1004	Computational Complexity of Statistical Machine Translation	✓	✓	0.89	0.85
2006	E06‑1018	Word Sense Induction: Triplet‑Based Clustering and Automatic Evaluation	✓	✓	0.69	0.69
2006	E06‑1022	Addressee Identification in Face‑to‑Face Meetings	✓	✓	0.62	0.62
2006	E06‑1031	CDER: Efficient MT Evaluation Using Block Movements	✓	✓	0.87	0.71
2006	E06‑1035	Automatic Segmentation of Multiparty Dialogue	✓	✓	0.73	0.73
2006	E06‑1041	Structuring Knowledge for Reference Generation: A Clustering Algorithm	✓	✓	0.75	0.67
2006	N06‑2009	Answering the Question You Wish They Had Asked: The Impact of Paraphrasing for Question Answering	✓	✓	0.67	0.67
2006	N06‑2038	A Comparison of Tagging Strategies for Statistical Information Extraction	✓	✓	0.88	0.88
2006	N06‑4001	InfoMagnets: Making Sense of Corpus Data	✓	✓	0.42	0.42
2006	P06‑1013	Ensemble Methods for Unsupervised WSD	✓	✓	0.67	0.67
2006	P06‑1018	Polarized Unification Grammars	✓	✓	0.87	0.87
2006	P06‑1052	An Improved Redundancy Elimination Algorithm for Underspecified Representations	✓	✓	0.87	0.56
2006	P06‑2001	Using Machine Learning Techniques to Build a Comma Checker for Basque	✓	✓	0.93	0.93
2006	P06‑2012	Unsupervised Relation Disambiguation Using Spectral Clustering	✓	✓	0.86	0.86
2006	P06‑2059	Automatic Construction of Polarity‑tagged Corpus from HTML Documents	✓	✓	0.71	0.62
2006	P06‑2110	Word Vectors and Two Kinds of Similarity	✓	✓	0.92	0.67
2006	P06‑3007	Investigations on Event‑Based Summarization	✓	✓	0.56	0.47
2006	P06‑4007	FERRET: Interactive Question‑Answering for Real‑World Environments	✓	✓	0.86	0.86
2006	P06‑4011	Computational Analysis of Move Structures in Academic Abstracts	✓	✓	0.77	0.73
2006	P06‑4014	Re‑Usable Tools for Precision Machine Translation	✓	✓	0.77	0.55