Vertical ACL ARC 2.0
The ACL ARC 2.0 can be browsed online using the NoSketch Engine from this link. This version of the corpus is derived by processing (part-of-speech tagging and restructuring) the ParsCit output included in the official release of the ACL ARC 2.0. The vertical corpus and its accompanying registry file are available from this link (tar.gz format).
OpenNLP is used for segmenting sentences. Stanford CoreNLP is used for tokenisation and part-of-speech tagging.
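As a rough illustration of what "vertical" means here, the sketch below renders already-tokenised, already-tagged sentences as one token per line with XML-like structure tags. The column layout (word, then tag, separated by a tab), the `<s>` sentence tag, and the function name `to_vertical` are assumptions made for the example; the registry file shipped with the corpus is the authoritative description of its attributes and structures.

```python
from xml.sax.saxutils import quoteattr


def to_vertical(section_type, sentences):
    """Render one section as vertical text.

    `sentences` is a list of sentences, each a list of (word, tag) pairs.
    Assumed layout: one token per line, word and tag separated by a tab,
    with <section> and <s> tags on their own lines.
    """
    lines = [f"<section type={quoteattr(section_type)}>"]
    for sentence in sentences:
        lines.append("<s>")
        for word, tag in sentence:
            lines.append(f"{word}\t{tag}")
        lines.append("</s>")
    lines.append("</section>")
    return "\n".join(lines)


# Hypothetical input: one tagged sentence from an abstract.
example = to_vertical(
    "abstract",
    [[("We", "PRP"), ("learn", "VBP"), ("embeddings", "NNS"), (".", ".")]],
)
print(example)
```

In the real pipeline the sentence boundaries would come from OpenNLP and the tokens and tags from Stanford CoreNLP; here they are supplied by hand to keep the sketch self-contained.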
- Concordances of the word "embedding" in the abstract sections, using the CQL query:
[word="embedding"] within <section type="abstract"/>