The Ontologies of Linguistic Annotation (OLiA) are a repository of linguistic data categories used for
The OLiA ontologies are being developed at the Applied Computational Linguistics (ACoLi) Lab at Goethe University Frankfurt, Germany. Earlier development took place in the context of the Collaborative Research Center "Linguistic Data Structures" (SFB 441/C2), in a collaborative effort of the universities of Tübingen, Hamburg, Potsdam and HU Berlin (2005-2008), and subsequently at the Collaborative Research Center "Information Structure" (SFB 632/D1) with participation of the University of Potsdam and the Humboldt University Berlin (since 2007). The original goal was to document and formalize linguistic categories for all language resources of the linguistic collaborative research centers existing at the time. Later on, different applications in corpus linguistics, natural language processing and the Semantic Web were developed.
Via its Sourceforge repository, OLiA provides Annotation Models for more than 75 different languages or language stages covering morphology, morphosyntax, phrase structure syntax, dependency syntax, aspects of semantics, as well as recent extensions to discourse, information structure and anaphora annotation. Additional OLiA annotation models externally hosted and/or provided include
OLiA is used in a number of projects and resources, including
This page enumerates the ontologies that are currently available. The ontologies are released under a Creative Commons Attribution licence CC-BY with reference to
Christian Chiarcos and Maria Sukhareva (2015). OLiA - Ontologies of Linguistic Annotation. SWJ (Semantic Web Journal) 6(4): 379-386.
As further reference, see our ontology-relevant publications and some remarks on the background of the OLiA ontologies. Besides the ontologies listed below, there are a number of experimental ontologies, including the OLiA Discourse Extensions, further annotation schemes, and the linking with GOLD and the ISO TC37/SC4 Data Category Registry. For enquiries with respect to these, please contact Christian Chiarcos.
The OLiA architecture is a set of modular OWL/DL ontologies: ontological models of annotation schemes (Annotation Models), an ontology of reference terms (Reference Model), and ontologies (Linking Models) that implement subClassOf relationships between the two.
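The modular architecture described above can be illustrated with a minimal Python sketch. It is not the OLiA implementation: the real models are OWL/DL files, and the tag, class names and dictionaries below are hypothetical stand-ins chosen for illustration. The sketch only shows how a tag from an Annotation Model could be resolved to Reference Model concepts by following subClassOf statements from a Linking Model.

```python
from collections import deque

# Hypothetical, hand-written data; the real models are OWL/DL ontologies.
# Annotation Model: a tag and the annotation-model class it instantiates.
ANNOTATION_MODEL = {"NN": "penn:CommonNoun"}

# Linking Model: subClassOf statements from annotation-model classes
# to Reference Model classes.
LINKING_MODEL = {"penn:CommonNoun": ["olia:CommonNoun"]}

# Reference Model: subClassOf hierarchy among reference concepts.
REFERENCE_MODEL = {
    "olia:CommonNoun": ["olia:Noun"],
    "olia:Noun": ["olia:MorphosyntacticCategory"],
}


def reference_concepts(tag):
    """Return all Reference Model concepts a tag maps to, nearest first."""
    start = ANNOTATION_MODEL.get(tag)
    if start is None:
        return []
    found, seen = [], set()
    queue = deque(LINKING_MODEL.get(start, []))
    while queue:
        concept = queue.popleft()
        if concept in seen:
            continue
        seen.add(concept)
        found.append(concept)
        # Climb the Reference Model hierarchy breadth-first.
        queue.extend(REFERENCE_MODEL.get(concept, []))
    return found


print(reference_concepts("NN"))
```

Keeping the three layers in separate structures mirrors the separation into separate ontologies: a new annotation scheme only needs its own Annotation Model and Linking Model, while the Reference Model stays untouched.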
For convenient viewing of the ontologies, we provide a partial static HTML export of the OLiA Reference Model and the OLiA Discourse Extensions. Note that these exports do not include Annotation and Linking Models.
For interactive browsing of the OLiA ontologies, we recommend Protégé, an ontology browser and editor (available in both a web and a Java edition; the latter requires local installation). To browse the ontologies, copy and paste the URLs given below.
Via our Sourceforge site, we provide a static data dump as well as access to our current developers' version in the SVN repository. The developers' version, which is also available via the official PURL, differs from the static dump mostly in the number of annotation schemes covered. Until the next version number (we are still at 0.x), OLiA development is strictly backward compatible, i.e., new concepts may be added, but existing concepts are never deleted, only marked as deprecated.
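The compatibility policy stated above (concepts may be added and deprecated, but never deleted) can be checked mechanically. The following is a minimal sketch under the assumption that the concept names of two versions have already been extracted into plain Python sets; the function name and the sample concept names are hypothetical and not part of OLiA.

```python
def compare_versions(old_concepts, new_concepts, deprecated=()):
    """Compare two sets of concept names under an add-only policy."""
    old, new, dep = set(old_concepts), set(new_concepts), set(deprecated)
    removed = old - new
    return {
        "added": sorted(new - old),
        "removed": sorted(removed),       # any entry here violates the policy
        "deprecated": sorted(dep & new),  # still present, only flagged
        "compatible": not removed,
    }


# Hypothetical example: one concept added, one deprecated, none deleted.
report = compare_versions(
    {"Noun", "Verb"},
    {"Noun", "Verb", "Adposition"},
    deprecated={"Verb"},
)
print(report["compatible"], report["added"], report["deprecated"])
```

A check of this kind could be run between the static dump and the developers' version before relying on newly added annotation schemes.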
Module |
phenomenon |
OWL/DL models |
OLiA Reference Model for morphosyntax, morphology and syntax |
morphosyntax, morphology and syntax |
|
OLiA Reference Model for discourse structure |
discourse structure, discourse relations |
|
OLiA Reference Model for information structure |
information structure, information status, coreference |
|
OLiA System Ontology |
basic annotation data structures |
|
OLiA Top-Level Ontology |
top-level concepts of the OLiA Reference Model for morphosyntax, morphology and syntax |
tagset / NLP tool |
phenomenon |
languages |
OWL/DL models |
SFB632 annotation standard (Dipper et al. 2008) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
> 30 typologically different languages, including many African languages |
|
EAGLES recommendations |
morphosyntax |
11 EU languages, incl. Romance, Germanic, Greek and Irish |
|
Connexor dependency parser |
morphosyntax, morphology, dependency syntax |
10 European languages, incl. Romance, Germanic and Uralic languages |
|
MULTEXT-East |
morphosyntax, morphology |
15 mostly Eastern European languages, incl. Slavic, Romance, Uralic languages and Persian |
Annotation Model (common specifications)(*), Linking Model(*); Annotation Model (all languages)(*), see project page and below for individual languages |
IL-POSTS tagset |
morphosyntax |
languages of the Indian subcontinent |
|
AnnCorra |
morphosyntax, chunks |
languages of the Indian subcontinent |
|
IIIT tagset |
morphosyntax |
languages of the Indian subcontinent |
|
PROIEL |
morphosyntax, dependency syntax |
Older Indo-European languages (Greek, Latin, Gothic, Classical Armenian, Old Church Slavonic, others) |
|
Universal Dependencies (POS) |
parts of speech |
various languages |
(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*), Linking Model |
Universal Dependencies (features) |
morphosyntax |
various languages |
(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*) |
Universal Dependencies (relations) |
dependency syntax |
various languages |
(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*), Linking Model |
tagset / NLP tool |
phenomenon |
OWL/DL models |
Brown corpus tagset |
morphosyntax |
|
Connexor dependency parser |
morphosyntax, morphology, dependency syntax |
|
EAGLES recommendations (English) |
morphosyntax |
|
GENIA corpus |
morphosyntax |
|
MULTEXT-East (English) |
morphosyntax |
Annotation Model(*), Linking Model(*) |
Penn Treebank |
morphosyntax |
|
Penn Treebank |
syntax |
|
QTag |
morphosyntax |
|
Stanford dependency parser |
dependency syntax |
|
Susanne corpus |
morphosyntax |
|
English UD POS |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
English UD features |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
English UD dependencies |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset / NLP tool |
phenomenon |
OWL/DL models |
Connexor dependency parser |
morphosyntax, morphology, dependency syntax |
|
EAGLES recommendations (German) |
morphosyntax |
|
Morphisto |
morphology |
|
STTS |
morphosyntax |
|
TIGER/NEGRA |
morphology |
|
TIGER/NEGRA |
constituent syntax |
|
TreeTagger Chunker |
chunk labels |
|
German UD POS |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
German UD features |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
German UD dependencies |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
RFTagger |
morphosyntax, morphology |
t.b.a |
tagset/NLP tool |
language |
phenomenon |
OWL/DL models |
EAGLES recommendations |
Danish, Dutch, Swedish (and several non-Germanic languages) |
morphosyntax; inflectional morphology |
|
Danish UD POS |
Danish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Danish UD features |
Danish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Danish UD dependencies |
Danish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Alpino |
Dutch |
morphosyntax (POS) |
|
Lassy |
Dutch |
morphosyntax (POS) |
|
Dutch UD POS |
Dutch |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Dutch UD features |
Dutch |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Dutch UD dependencies |
Dutch |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Norwegian UD POS |
Norwegian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Norwegian UD features |
Norwegian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Norwegian UD dependencies |
Norwegian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Mamba lexical categories |
Swedish |
morphosyntax (POS) |
|
Mamba dependencies |
Swedish |
dependency syntax |
|
Stockholm-Umeå Corpus (SUC 2.0) |
Swedish |
morphosyntax |
|
Swedish UD POS |
Swedish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Swedish UD features |
Swedish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Swedish UD dependencies |
Swedish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Connexor |
Dutch, Swedish, Danish, Norwegian |
morphosyntax, morphology, dependency syntax |
|
SFB632 annotation standard |
Dutch (among other languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
PPCME2 POS tags |
Middle English |
morphosyntax |
|
YCOE POS tags |
Old English |
morphosyntax |
|
MENOTA (incomplete) |
Old Norse |
morphosyntax |
|
T-CODEX |
Old High German |
morphosyntax, syntax, information structure |
|
PROIEL |
Gothic (and others) |
morphosyntax, dependency syntax |
|
Gothic UD POS |
Gothic |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Gothic UD features |
Gothic |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Gothic UD dependencies |
Gothic |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset / NLP tool |
phenomenon |
OWL/DL models |
Uppsala corpus tagset |
morphosyntax, morphology |
|
Russian TreeTagger |
morphosyntax |
|
MULTEXT-East for Russian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Russian UD POS |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Russian UD features |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Russian UD dependencies |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset / NLP tool |
language |
phenomenon |
OWL/DL models |
MULTEXT-East |
Bulgarian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Bulgarian UD POS |
Bulgarian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Bulgarian UD features |
Bulgarian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Bulgarian UD dependencies |
Bulgarian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Croatian UD POS |
Croatian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Croatian UD features |
Croatian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Croatian UD dependencies |
Croatian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Czech |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Czech UD POS |
Czech |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Czech UD features |
Czech |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Czech UD dependencies |
Czech |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Latvian UD POS |
Latvian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Latvian UD features |
Latvian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Latvian UD dependencies |
Latvian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Macedonian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
MULTEXT-East |
Polish |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Polish UD POS |
Polish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Polish UD features |
Polish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Polish UD dependencies |
Polish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Serbian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
MULTEXT-East |
Slovak |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
MULTEXT-East |
Slovene |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
MULTEXT-East |
Resian (Slovene spoken in Italy) |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Slovenian UD POS |
Slovene |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Slovenian UD features |
Slovene |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Slovenian UD dependencies |
Slovene |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Ukrainian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Ukrainian UD POS |
Ukrainian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Ukrainian UD features |
Ukrainian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Ukrainian UD dependencies |
Ukrainian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
PROIEL |
Old Church Slavonic (and others) |
morphosyntax, dependency syntax |
|
Old Church Slavonic UD POS |
Old Church Slavonic |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Old Church Slavonic UD features |
Old Church Slavonic |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Old Church Slavonic UD dependencies |
Old Church Slavonic |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset / NLP tool |
phenomenon |
OWL/DL models |
EAGLES recommendations |
morphosyntax |
|
French TreeTagger |
morphosyntax |
|
Le Monde corpus |
morphosyntax |
|
Connexor |
morphosyntax, morphology, dependency syntax |
|
SFB632 annotation standard |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) for Canadian French (among other languages, SFB 632, project D2) |
|
French UD POS |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
French UD features |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
French UD dependencies |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
PROIEL |
Latin (and others) |
morphosyntax, dependency syntax |
|
Latin UD POS |
Latin |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Latin UD features |
Latin |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Latin UD dependencies |
Latin |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
EAGLES recommendations |
Catalan, Portuguese, Spanish |
morphosyntax |
|
Connexor |
Spanish, Italian |
morphosyntax, morphology, dependency syntax |
|
PAROLE Spanish/Catalan |
Spanish, Catalan |
morphosyntax, inflectional morphology |
|
Catalan UD POS |
Catalan |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Catalan UD features |
Catalan |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Catalan UD dependencies |
Catalan |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Galician UD POS |
Galician |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Galician UD features |
Galician |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Galician UD dependencies |
Galician |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Italian UD POS |
Italian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Italian UD features |
Italian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Italian UD dependencies |
Italian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Portuguese UD POS |
Portuguese, Brazilian Portuguese |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Portuguese UD features |
Portuguese, Brazilian Portuguese |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Portuguese UD dependencies |
Portuguese, Brazilian Portuguese |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Spanish UD POS |
Spanish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Spanish UD features |
Spanish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Spanish UD dependencies |
Spanish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Romanian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Romanian UD POS |
Romanian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Romanian UD features |
Romanian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Romanian UD dependencies |
Romanian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
MULTEXT-East |
Estonian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
Estonian UD POS |
Estonian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Estonian UD features |
Estonian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Estonian UD dependencies |
Estonian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Connexor |
Finnish |
morphosyntax, morphology, dependency syntax |
|
Finnish UD POS |
Finnish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Finnish UD features |
Finnish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Finnish UD dependencies |
Finnish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
MULTEXT-East |
Hungarian |
morphosyntax, morphology |
Annotation Model(*), Linking Model(*) |
SFB632 annotation standard |
Hungarian (among other languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
Hungarian UD POS |
Hungarian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Hungarian UD features |
Hungarian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Hungarian UD dependencies |
Hungarian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Kazakh UD POS |
Kazakh |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Kazakh UD features |
Kazakh |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Kazakh UD dependencies |
Kazakh |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Turkish POS tagset |
Turkish |
morphosyntax |
|
Turkish UD POS |
Turkish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Turkish UD features |
Turkish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Turkish UD dependencies |
Turkish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
EAGLES recommendations |
Modern Greek, Irish (among other EU languages) |
morphosyntax |
|
SFB632 annotation standard |
Georgian, Modern Greek (among other languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
PROIEL |
Ancient Greek, Classical Armenian (and others) |
morphosyntax, dependency syntax |
|
Ancient Greek UD POS |
Ancient Greek |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Ancient Greek UD features |
Ancient Greek |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Ancient Greek UD dependencies |
Ancient Greek |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
EUSTagger |
Basque |
morphosyntax |
|
Basque UD POS |
Basque |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Basque UD features |
Basque |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Basque UD dependencies |
Basque |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Greek UD POS |
Greek |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Greek UD features |
Greek |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Greek UD dependencies |
Greek |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Irish UD POS |
Irish |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Irish UD features |
Irish |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Irish UD dependencies |
Irish |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
Urdu EMILLE tagset |
Urdu |
morphosyntax, inflectional morphology |
|
Urdu tagset |
Urdu |
morphosyntax |
|
IL-POSTS tagset |
Bangla, Hindi, Marathi, Sanskrit |
morphosyntax, inflectional morphology |
|
AnnCorra |
Bangla, Hindi |
morphosyntax, chunks |
|
IIIT tagset |
Hindi, Marathi |
morphosyntax |
|
Hindi UD POS |
Hindi |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Hindi UD features |
Hindi |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Hindi UD dependencies |
Hindi |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
SFB632 annotation standard |
Konkani (among other, unrelated languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
MULTEXT-East |
Farsi (Persian) |
morphosyntax |
Annotation Model(*), Linking Model(*) |
Persian UD POS |
Farsi (Persian) |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Persian UD features |
Farsi (Persian) |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Persian UD dependencies |
Farsi (Persian) |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
IL-POSTS tagset |
Kannada, Malayalam, Tamil, Telugu |
morphosyntax |
|
AnnCorra |
Telugu, Tamil |
morphosyntax, chunks |
|
IIIT tagset |
Telugu |
morphosyntax |
|
Tamil UD POS |
Tamil |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Tamil UD features |
Tamil |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Tamil UD dependencies |
Tamil |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
tagset |
language |
phenomenon |
OWL/DL models |
Dzongkha tagset |
Dzongkha |
morphosyntax |
|
SFB632 annotation standard |
Prinmi (among other, unrelated languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
Tübingen Tibetan Corpora |
Tibetan (Old Tibetan, Classical Tibetan, Balti, Ladakh) |
morphosyntax, morphology, syntax |
annotation scheme / corpus |
language |
phenomenon |
Annotation Model |
Penn Chinese Treebank |
Chinese |
morphosyntax |
|
Chinese UD POS |
Chinese |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Chinese UD features |
Chinese |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Chinese UD dependencies |
Chinese |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
SFB632 annotation standard |
Japanese (among other, unrelated languages) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
Japanese UD POS |
Japanese |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Japanese UD features |
Japanese |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Japanese UD dependencies |
Japanese |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Sejong Treebank Annotation Model |
Korean |
morphosyntax (POS) |
Annotation Model(*), Linking Model(*) |
Korean UD POS |
Korean |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Korean UD features |
Korean |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Korean UD dependencies |
Korean |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Vietnamese UD POS |
Vietnamese |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Vietnamese UD features |
Vietnamese |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Vietnamese UD dependencies |
Vietnamese |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
annotation scheme / corpus |
language |
phenomenon |
Annotation Model |
Amharic UD POS |
Amharic |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Amharic UD features |
Amharic |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Amharic UD dependencies |
Amharic |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Arabic tagset |
Arabic |
morphosyntax |
|
Arabic UD POS |
Arabic |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Arabic UD features |
Arabic |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Arabic UD dependencies |
Arabic |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
SFB632 annotation standard |
Chadic languages (including Guruntum, Tangale, Hausa) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
Coptic UD POS |
Coptic |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Coptic UD features |
Coptic |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Coptic UD dependencies |
Coptic |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Hausa Internet Corpus |
Hausa |
morphosyntax |
t.b.a |
Hebrew UD POS |
Hebrew |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Hebrew UD features |
Hebrew |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Hebrew UD dependencies |
Hebrew |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Electronic Text Corpus of Sumerian Royal Inscriptions (ETCSRI) |
Sumerian |
morphology |
annotation scheme / corpus |
language |
phenomenon |
Annotation Model |
SFB632 annotation standard |
Gur and Kwa languages (including Aja, Dagbani, Buli, Byali, Ditammari, Fon, Foodo, Konni, Nateni, Waamma, Yom) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
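SFB632 annotation standard |
Chadic languages (including Guruntum, Tangale, Hausa) |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|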
Hausa Internet Corpus |
Hausa |
morphosyntax |
t.b.a |
annotation scheme / corpus |
language |
phenomenon |
Annotation Model |
SFB632 annotation standard |
Teribe, Yucatec Maya, Mawng, Niue |
parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) |
|
Indonesian UD POS |
Indonesian |
parts of speech |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
Indonesian UD features |
Indonesian |
morphosyntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*) |
Indonesian UD dependencies |
Indonesian |
dependency syntax |
language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model |
terminological repository |
original url |
local url |
Linking Model |
ISO TC37/SC4 Data Category Registry |
t.b.a |
t.b.a |
|
GOLD |
t.b.a |
t.b.a |
OLiA also serves as a conceptual backbone for the ontological reconstruction, or LLOD edition, of legacy thesauri of linguistic terminology. This includes the Bibliography of Linguistic Literature (BLL) Thesaurus. The BLL is one of the most important sources of bibliographical information for general linguistics as well as English, German and Romance linguistics, and the thesaurus organizes the keywords that have been used for indexing linguistic literature since the 1970s.
terminological repository |
original url |
linking model |
BLL Thesaurus (SKOS) |
BLL Thesaurus (different formats available via content negotiation) |
none |
BLL Ontology (OWL) |
BLL Ontology (different formats available via content negotiation) |
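The table above notes that the BLL resources are available in different formats via content negotiation. As a minimal sketch of what that means on the client side, the snippet below builds an HTTP request whose Accept header names the desired RDF serialization. The URL is a placeholder, not the actual BLL address, and which media types the server really offers is an assumption; the request is only constructed here, not sent.

```python
import urllib.request

# Placeholder URL for illustration only; substitute the actual resource URL.
EXAMPLE_URL = "https://example.org/bll-thesaurus"


def build_request(url, media_type):
    """Build a GET request asking the server for one specific serialization."""
    request = urllib.request.Request(url)
    # The Accept header is how a client states its preferred format.
    request.add_header("Accept", media_type)
    return request


turtle_request = build_request(EXAMPLE_URL, "text/turtle")
rdfxml_request = build_request(EXAMPLE_URL, "application/rdf+xml")
print(turtle_request.get_header("Accept"))
print(rdfxml_request.get_header("Accept"))
```

Passing such a request to `urllib.request.urlopen` would perform the actual download; the same URL can then yield Turtle, RDF/XML or an HTML page depending on the header.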