Language Resources: See the left menu.
Some of these corpora can also be browsed online using the noSke corpus management system:
- Orwell's 1984 corpus, the Farsi (Persian) section of the MULTEXT-East project.
- The ACL Anthology Reference Corpus 2.0, segmented, part-of-speech tagged and cleaned.
- The ACL RD-TEC 2.0: Annotations of terms by Anne-Kathrin Schumann.
- The ACL RD-TEC 2.0: Annotations of terms by me.
- M-Grams: A corpus of music lyrics (special thanks to Carla Kennedy).