OLiA ontologies

 

The Ontologies of Linguistic Annotation (OLiA) are a repository of linguistic data categories used for

They formalize application-specific terms (e.g., those of an annotation scheme) as OWL2/DL ontologies and provide a declarative linking to an application-independent Reference Model, which then serves as a mediator to different community-maintained terminology repositories such as GOLD and ISOcat. In this function, they will serve as a central hub for linguistic data categories within the emerging Linguistic Linked Open Data cloud. OLiA provides Annotation Models for linguistic annotations and NLP tools for more than 85 languages, together with their linking to a common Reference Model that provides their respective terms and concepts.

The OLiA ontologies are being developed at the Applied Computational Linguistics (ACoLi) Lab at the Goethe University Frankfurt, Germany. Earlier development took place in the context of the Collaborative Research Center "Linguistic Data Structures" (SFB 441/C2) in a collaborative effort of the universities of Tübingen, Hamburg, Potsdam and HU Berlin (2005-2008), and subsequently at the Collaborative Research Center "Information Structure" (SFB 632/D1) with participation of the University of Potsdam and the Humboldt University Berlin (since 2007). The original goal was to document and formalize linguistic categories for all language resources of the linguistic collaborative research centers existing at the time. Later on, different applications in corpus linguistics, natural language processing and the Semantic Web were developed.

Via its Sourceforge repository, OLiA provides 42 Annotation Models for more than 75 different languages or language stages covering morphology, morphosyntax, phrase structure syntax, dependency syntax, aspects of semantics, as well as recent extensions to discourse, information structure and anaphora annotation. Additional OLiA annotation models externally hosted and/or provided include

Below, links to external resources are marked with (*).

OLiA is used in a number of projects and resources, including

This page enumerates the ontologies that are currently available. The ontologies are released under a Creative Commons Attribution licence CC-BY (Reference Model: CC-BY-SA) with reference to

    Christian Chiarcos and Maria Sukhareva (2015). OLiA - Ontologies of Linguistic Annotation, SWJ (Semantic Web Journal) 6(4): 379-386.

As further reference, see our ontology-relevant publications and some remarks on the background of the OLiA ontologies. Besides the ontologies listed below, there are a number of experimental ontologies, including the OLiA Discourse Extensions, further annotation schemes, and the linking with GOLD and the ISO TC37/SC4 Data Category Registry. For enquiries with respect to these, please contact Christian Chiarcos.

The OLiA architecture is a set of modular OWL/DL ontologies with ontological models of annotation schemes (Annotation Models) on the one hand, an ontology of reference terms (Reference Model) on the other hand, and ontologies (Linking Models) that implement subClassOf relationships between them.
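The interplay of the three model types can be illustrated with a small, self-contained sketch. It does not load the actual OLiA files: the concept names (`penn:NN`, `stts:VVFIN`, `olia:CommonNoun`, etc.) and the dictionary layout are illustrative assumptions, chosen only to show how subClassOf statements in a Linking Model connect a tag-level concept to concepts of the Reference Model.

```python
from collections import deque

# Hypothetical Reference Model hierarchy: concept -> direct superclasses.
REFERENCE_MODEL = {
    "olia:CommonNoun": {"olia:Noun"},
    "olia:Noun": set(),
    "olia:FiniteVerb": {"olia:Verb"},
    "olia:Verb": set(),
}

# Hypothetical Linking Models: Annotation Model concept -> Reference Model
# concepts it is declared a subclass of.
LINKING_MODELS = {
    "penn:NN": {"olia:CommonNoun"},
    "penn:VB": {"olia:Verb"},
    "stts:NN": {"olia:CommonNoun"},
    "stts:VVFIN": {"olia:FiniteVerb"},
}


def reference_concepts(annotation_concept):
    """Collect all Reference Model concepts reachable via subClassOf."""
    seen = set()
    queue = deque(LINKING_MODELS.get(annotation_concept, ()))
    while queue:
        concept = queue.popleft()
        if concept in seen:
            continue
        seen.add(concept)
        # Follow the hierarchy inside the Reference Model itself.
        queue.extend(REFERENCE_MODEL.get(concept, ()))
    return seen


def shared_concepts(concept_a, concept_b):
    """Reference Model concepts that two annotation concepts have in common."""
    return reference_concepts(concept_a) & reference_concepts(concept_b)
```

In this toy setup, `shared_concepts("penn:VB", "stts:VVFIN")` yields the common superclass of both tags, which is the sense in which the Reference Model mediates between otherwise unrelated annotation schemes.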

For convenient viewing of the ontologies, we provide a partial static HTML export of the OLiA Reference Model and the OLiA Discourse Extensions. Note that these do not include Annotation and Linking Models.

For interactive browsing of the OLiA ontologies, we recommend Protégé, an ontology browser and editor (available both as a web and a Java edition; the latter requires local installation). To browse the ontologies, copy and paste the URLs given below.

Via our Sourceforge site, we provide a static data dump as well as access to our current developers' version in the SVN repository. The developers' version, which is also available via the official PURL, differs from the static dump mostly in the number of annotation schemes covered. Until the next version number (we are still at 0.x), OLiA development is strictly backward compatible, i.e., new concepts may be added, but existing concepts are never deleted, only marked as deprecated.
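The compatibility policy described above amounts to a simple invariant: every concept present in an older release must still be present in a newer one, possibly flagged as deprecated. A minimal sketch of such a check follows. The release contents and their representation (a mapping from concept name to a deprecation flag) are assumptions made for illustration and do not reflect how OLiA releases are actually validated.

```python
def removed_concepts(old_release, new_release):
    """Concepts that exist in old_release but are missing from new_release."""
    return sorted(set(old_release) - set(new_release))


def is_compatible(old_release, new_release):
    """True if no concept of the old release was deleted in the new one."""
    return not removed_concepts(old_release, new_release)


# Hypothetical releases: concept name -> deprecated?
v_old = {"olia:Noun": False, "olia:Verb": False}
v_new = {"olia:Noun": False, "olia:Verb": True, "olia:Adjective": False}

print(is_compatible(v_old, v_new))     # deprecating and adding is allowed
print(removed_concepts(v_new, v_old))  # concepts lost when going backwards
```

Deprecated concepts stay in the mapping, so marking a concept as deprecated never violates the invariant; only an actual deletion does.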

 

Overview

 

OLiA Reference Model and system ontologies

Module

phenomenon

OWL/DL models

OLiA Reference Model for morphosyntax, morphology and syntax

morphosyntax, morphology and syntax

http://purl.org/olia/olia.owl

OLiA Reference Model for discourse structure

discourse structure, discourse relations

see discourse extensions

OLiA Reference Model for information structure

information structure, information status, coreference

see discourse extensions

OLiA System Ontology

basic annotation data structures

http://purl.org/olia/system.owl

OLiA Top-Level Ontology

top-level concepts of the OLiA Reference Model for morphosyntax, morphology and syntax

http://purl.org/olia/olia-top.owl

 

Multilingual Annotation Models for morphological, morphosyntactic and syntactic annotation

tagset / NLP tool

phenomenon

languages

OWL/DL models

SFB632 annotation standard (Dipper et al. 2008)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

> 30 typologically different languages, including many African languages

Annotation Model, Linking Model

EAGLES recommendations
(Leech and Wilson 1996)

morphosyntax

11 EU languages, incl. Romance, Germanic, Greek and Irish

Annotation Model, Linking Model

Connexor dependency parser

morphosyntax, morphology, dependency syntax

10 European languages, incl. Romance, Germanic and Uralic languages

Annotation Model, Linking Model

MULTEXT-East

morphosyntax, morphology

15 mostly Eastern European languages, incl. Slavic, Romance, Uralic languages and Persian

Annotation Model (common specifications)(*), Linking Model(*); Annotation Model (all languages)(*), see project page and below for individual languages

IL-POSTS tagset
Baskaran et al. (2008)

morphosyntax

languages of the Indian subcontinent

Annotation Model, Linking Model

AnnCorra
Bharati et al. (2006)

morphosyntax, chunks

languages of the Indian subcontinent

Annotation Model, Linking Model

IIIT tagset
IIIT (2007)

morphosyntax

languages of the Indian subcontinent

Annotation Model, Linking Model

PROIEL

morphosyntax, dependency syntax

Older Indo-European languages (Greek, Latin, Gothic, Classical Armenian, Old Church Slavonic, others)

Annotation Model, Linking Model

Universal Dependencies (POS)

parts of speech

various languages

(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*), Linking Model

Universal Dependencies (features)

morphosyntax

various languages

(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*)

Universal Dependencies (relations)

dependency syntax

various languages

(for language-specific Annotation Model ABoxes see below) Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of English

tagset / NLP tool

phenomenon

OWL/DL models

Brown corpus tagset

morphosyntax

Annotation Model, Linking Model

Connexor dependency parser

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

EAGLES recommendations (English)
(Leech and Wilson 1996)

morphosyntax

Annotation Model, Linking Model

GENIA corpus

morphosyntax

Annotation Model, Linking Model

MULTEXT-East (English)

morphosyntax

Annotation Model(*), Linking Model(*)

Penn Treebank

morphosyntax

Annotation Model, Linking Model

 

syntax

Annotation Model, Linking Model

QTag

morphosyntax

Annotation Model, Linking Model

Stanford dependency parser

dependency syntax

Annotation Model, Linking Model

Susanne corpus

morphosyntax

Annotation Model, Linking Model

English UD POS

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

English UD features

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

English UD dependencies

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of German

tagset / NLP tool

phenomenon

OWL/DL models

Connexor dependency parser

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

EAGLES recommendations (German)
(Leech and Wilson 1996)

morphosyntax

Annotation Model, Linking Model

Morphisto

morphology

Annotation Model, Linking Model

STTS

morphosyntax

Annotation Model, Linking Model

TIGER/NEGRA

morphology

Annotation Model, Linking Model

 

constituent syntax

Annotation Model, Linking Model

TreeTagger Chunker

chunk labels

Annotation Model, Linking Model

German UD POS

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

German UD features

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

German UD dependencies

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

RFTagger

morphosyntax, morphology

t.b.a

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of other Germanic languages

tagset / NLP tool

language

phenomenon

OWL/DL models

EAGLES recommendations
(Leech and Wilson 1996)

Danish, Dutch, Swedish (and several non-Germanic languages)

morphosyntax; inflectional morphology

Annotation Model, Linking Model

Danish UD POS

Danish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Danish UD features

Danish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Danish UD dependencies

Danish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Alpino

Dutch

morphosyntax (POS)

Annotation Model, Linking Model

Lassy

Dutch

morphosyntax (POS)

Annotation Model, Linking Model

Dutch UD POS

Dutch

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Dutch UD features

Dutch

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Dutch UD dependencies

Dutch

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Norwegian UD POS

Norwegian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Norwegian UD features

Norwegian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Norwegian UD dependencies

Norwegian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Mamba lexical categories

Swedish

morphosyntax (POS)

Annotation Model, Linking Model

Mamba dependencies

Swedish

dependency syntax

Annotation Model, Linking Model

Stockholm-Umeå Corpus (SUC 2.0)

Swedish

morphosyntax

Annotation Model, Linking Model

Swedish UD POS

Swedish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Swedish UD features

Swedish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Swedish UD dependencies

Swedish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Connexor

Dutch, Swedish, Danish, Norwegian

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Dutch (among other languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

PPCME2 POS tags

Middle English

morphosyntax

Annotation Model, Linking Model

YCOE POS tags

Old English

morphosyntax

Annotation Model, Linking Model

MENOTA (incomplete)

Old Norse

morphosyntax

Annotation Model, Linking Model

T-CODEX

Old High German

morphosyntax, syntax, information structure

Annotation Model, Linking Model

PROIEL

Gothic (and others)

morphosyntax, dependency syntax

Annotation Model, Linking Model

Gothic UD POS

Gothic

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Gothic UD features

Gothic

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Gothic UD dependencies

Gothic

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of Russian

tagset / NLP tool

phenomenon

OWL/DL models

Uppsala corpus tagset

morphosyntax, morphology

Annotation Model, Linking Model

Russian TreeTagger
(Serge Sharoff)

morphosyntax

Annotation Model, Linking Model

MULTEXT-East for Russian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Russian UD POS

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Russian UD features

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Russian UD dependencies

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphosyntactic and syntactic annotation of other Slavic and Baltic languages

tagset / NLP tool

language

phenomenon

OWL/DL models

MULTEXT-East

Bulgarian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Bulgarian UD POS

Bulgarian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Bulgarian UD features

Bulgarian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Bulgarian UD dependencies

Bulgarian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Croatian UD POS

Croatian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Croatian UD features

Croatian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Croatian UD dependencies

Croatian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Czech

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Czech UD POS

Czech

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Czech UD features

Czech

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Czech UD dependencies

Czech

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Latvian UD POS

Latvian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Latvian UD features

Latvian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Latvian UD dependencies

Latvian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Macedonian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

MULTEXT-East

Polish

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Polish UD POS

Polish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Polish UD features

Polish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Polish UD dependencies

Polish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Serbian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

MULTEXT-East

Slovak

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

MULTEXT-East

Slovene

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

MULTEXT-East

Resian (Slovene spoken in Italy)

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Slovenian UD POS

Slovene

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Slovenian UD features

Slovene

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Slovenian UD dependencies

Slovene

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Ukrainian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Ukrainian UD POS

Ukrainian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Ukrainian UD features

Ukrainian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Ukrainian UD dependencies

Ukrainian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

PROIEL

Old Church Slavonic (and others)

morphosyntax, dependency syntax

Annotation Model, Linking Model

Old Church Slavonic UD POS

Old Church Slavonic

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Old Church Slavonic UD features

Old Church Slavonic

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Old Church Slavonic UD dependencies

Old Church Slavonic

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of French

tagset / NLP tool

phenomenon

OWL/DL models

EAGLES recommendations
(Leech and Wilson 1996)

morphosyntax

Annotation Model, Linking Model

French TreeTagger
(Achim Stein)

morphosyntax

Annotation Model

Le Monde corpus
(Abeillé et al. 2000)

morphosyntax

Annotation Model

Connexor

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure) for Canadian French (among other languages, SFB 632, project D2)

Annotation Model, Linking Model

French UD POS

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

French UD features

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

French UD dependencies

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of other Romance and Italic languages

tagset

language

phenomenon

OWL/DL models

PROIEL

Latin (and others)

morphosyntax, dependency syntax

Annotation Model, Linking Model

Latin UD POS

Latin

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Latin UD features

Latin

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Latin UD dependencies

Latin

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

EAGLES recommendations
(Leech and Wilson 1996)

Catalan, Portuguese, Spanish

morphosyntax

Annotation Model, Linking Model

Connexor

Spanish, Italian

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

PAROLE Spanish/Catalan
(http://nlp.lsi.upc.edu/freeling)

Spanish, Catalan

morphosyntax, inflectional morphology

Annotation Model

Catalan UD POS

Catalan

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Catalan UD features

Catalan

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Catalan UD dependencies

Catalan

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Galician UD POS

Galician

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Galician UD features

Galician

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Galician UD dependencies

Galician

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Italian UD POS

Italian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Italian UD features

Italian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Italian UD dependencies

Italian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Portuguese UD POS

Portuguese, Brazilian Portuguese

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Portuguese UD features

Portuguese, Brazilian Portuguese

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Portuguese UD dependencies

Portuguese, Brazilian Portuguese

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Spanish UD POS

Spanish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Spanish UD features

Spanish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Spanish UD dependencies

Spanish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Romanian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Romanian UD POS

Romanian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Romanian UD features

Romanian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Romanian UD dependencies

Romanian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphological, morphosyntactic and syntactic annotation of Uralic and Altaic languages

tagset

language

phenomenon

OWL/DL models

MULTEXT-East

Estonian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

Estonian UD POS

Estonian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Estonian UD features

Estonian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Estonian UD dependencies

Estonian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Connexor

Finnish

morphosyntax, morphology, dependency syntax

Annotation Model, Linking Model

Finnish UD POS

Finnish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Finnish UD features

Finnish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Finnish UD dependencies

Finnish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

MULTEXT-East

Hungarian

morphosyntax, morphology

Annotation Model(*), Linking Model(*)

SFB632 annotation standard
(Dipper et al. 2008)

Hungarian (among other languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

Hungarian UD POS

Hungarian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Hungarian UD features

Hungarian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Hungarian UD dependencies

Hungarian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Kazakh UD POS

Kazakh

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Kazakh UD features

Kazakh

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Kazakh UD dependencies

Kazakh

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Turkish POS tagset
(Oflazer et al. 2003)

Turkish

morphosyntax

Annotation Model

Turkish UD POS

Turkish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Turkish UD features

Turkish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Turkish UD dependencies

Turkish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphosyntactic annotation of other European languages

tagset

language

phenomenon

OWL/DL models

EAGLES recommendations
(Leech and Wilson 1996)

Modern Greek, Irish (among other EU languages)

morphosyntax

Annotation Model, Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Georgian, Modern Greek (among other languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

PROIEL

Ancient Greek, Classical Armenian (and others)

morphosyntax, dependency syntax

Annotation Model, Linking Model

Ancient Greek UD POS

Ancient Greek

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Ancient Greek UD features

Ancient Greek

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Ancient Greek UD dependencies

Ancient Greek

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

EUSTagger
Ezeiza et al. (1998)

Basque

morphosyntax

Annotation Model

Basque UD POS

Basque

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Basque UD features

Basque

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Basque UD dependencies

Basque

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Greek UD POS

Greek

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Greek UD features

Greek

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Greek UD dependencies

Greek

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Irish UD POS

Irish

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Irish UD features

Irish

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Irish UD dependencies

Irish

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphosyntactic annotation of Indo-Iranian languages

tagset

language

phenomenon

OWL/DL models

Urdu EMILLE tagset
Hardie (2003, 2004)

Urdu

morphosyntax, inflectional morphology

Annotation Model, Linking Model

Urdu tagset
Sajjad (2007)

Urdu

morphosyntax

Annotation Model, Linking Model

IL-POSTS tagset
Baskaran et al. (2008)

Bangla, Hindi, Marathi, Sanskrit

morphosyntax, inflectional morphology

Annotation Model, Linking Model

AnnCorra
Bharati et al. (2006)

Bangla, Hindi

morphosyntax, chunks

Annotation Model, Linking Model

IIIT tagset
IIIT (2007)

Hindi, Marathi

morphosyntax

Annotation Model, Linking Model

Hindi UD POS

Hindi

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Hindi UD features

Hindi

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Hindi UD dependencies

Hindi

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Konkani (among other, unrelated languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

MULTEXT-East

Farsi (Persian)

morphosyntax

Annotation Model(*), Linking Model(*)

Persian UD POS

Farsi (Persian)

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Persian UD features

Farsi (Persian)

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Persian UD dependencies

Farsi (Persian)

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the morphosyntactic annotation of Dravidian languages

tagset

language

phenomenon

OWL/DL models

IL-POSTS tagset
Baskaran et al. (2008)

Kannada, Malayalam, Tamil, Telugu

morphosyntax

Annotation Model, Linking Model

AnnCorra
Bharati et al. (2006)

Telugu, Tamil

morphosyntax, chunks

Annotation Model, Linking Model

IIIT tagset
IIIT (2007)

Telugu

morphosyntax

Annotation Model, Linking Model

Tamil UD POS

Tamil

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Tamil UD features

Tamil

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Tamil UD dependencies

Tamil

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Annotation Models for the morphological, morphosyntactic and syntactic annotation of Tibeto-Burman languages

tagset

language

phenomenon

OWL/DL models

Dzongkha tagset
(Chungku et al. 2010)

Dzongkha

morphosyntax

Annotation Model, Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Prinmi (among other, unrelated languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

Tübingen Tibetan Corpora
(Wagner & Zeisler 2004)

Tibetan (Old Tibetan, Classical Tibetan, Balti, Ladakhi)

morphosyntax, morphology, syntax

Annotation Model

 

Annotation Models for East Asian languages

annotation scheme / corpus

language

phenomenon

OWL/DL models

Penn Chinese Treebank
(Xia 2000)

Chinese

morphosyntax

Annotation Model

Chinese UD POS

Chinese

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Chinese UD features

Chinese

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Chinese UD dependencies

Chinese

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Japanese (among other, unrelated languages)
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

Japanese UD POS

Japanese

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Japanese UD features

Japanese

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Japanese UD dependencies

Japanese

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Sejong Treebank Annotation Model

Korean

morphosyntax (POS)

Annotation Model(*), Linking Model(*)

Korean UD POS

Korean

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Korean UD features

Korean

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Korean UD dependencies

Korean

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Vietnamese UD POS

Vietnamese

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Vietnamese UD features

Vietnamese

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Vietnamese UD dependencies

Vietnamese

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for Afroasiatic languages

annotation scheme / corpus

language

phenomenon

Annotation Model

Amharic UD POS

Amharic

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Amharic UD features

Amharic

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Amharic UD dependencies

Amharic

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Arabic tagset
(Khoja 2001)

Arabic

morphosyntax

Annotation Model

Arabic UD POS

Arabic

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Arabic UD features

Arabic

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Arabic UD dependencies

Arabic

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

SFB632 annotation standard
(Dipper et al. 2008)

Chadic languages (including Guruntum, Tangale, Hausa)
(SFB 632, project B2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

Coptic UD POS

Coptic

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Coptic UD features

Coptic

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Coptic UD dependencies

Coptic

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Hausa Internet Corpus
(Chiarcos et al. 2011)

Hausa

morphosyntax

t.b.a

Hebrew UD POS

Hebrew

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Hebrew UD features

Hebrew

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Hebrew UD dependencies

Hebrew

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

 

Annotation Models for the languages of Sub-Saharan Africa

annotation scheme / corpus

language

phenomenon

Annotation Model

SFB632 annotation standard
(Dipper et al. 2008)

Gur and Kwa languages (including Aja, Dagbani, Buli, Byali, Ditammari, Fon, Foodo, Konni, Nateni, Waamma, Yom)
(SFB 632, project B1)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

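SFB632 annotation standard
(Dipper et al. 2008)

Chadic languages (including Guruntum, Tangale, Hausa)
(SFB 632, project B2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model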

Hausa Internet Corpus
(Chiarcos et al. 2011)

Hausa

morphosyntax

t.b.a

 

Annotation Models for indigenous languages of the Americas, Australia and the Pacific

annotation scheme / corpus

language

phenomenon

Annotation Model

SFB632 annotation standard
(Dipper et al. 2008)

Teribe, Yucatec Maya, Mawng, Niue
(SFB 632, project D2)

parts of speech, glosses, chunk labels, grammatical functions (phonology, information structure)

Annotation Model, Linking Model

Indonesian UD POS

Indonesian

parts of speech

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model

Indonesian UD features

Indonesian

morphosyntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*)

Indonesian UD dependencies

Indonesian

dependency syntax

language-specific Annotation Model ABox(*), Annotation Model TBox(*), Linking Model
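
The tables above list up to three artifacts per annotation scheme: an Annotation Model ABox, an Annotation Model TBox, and a Linking Model. As a rough illustration of how these could interact, the following sketch resolves a corpus tag to a Reference Model concept. It is a simplified approximation using plain dictionaries rather than OWL files, and every identifier in it (the `ex:` and `ref:` names, the tag inventory, the function name) is a hypothetical placeholder, not an actual OLiA URI.

```python
# Simplified stand-ins for the three artifacts listed per annotation scheme.
# All identifiers are illustrative assumptions, not actual OLiA URIs.

# ABox: maps a tag string to the scheme class of the individual bearing that tag.
ABOX = {
    "NOUN": "ex:CommonNoun",
    "PROPN": "ex:ProperNoun",
    "VERB": "ex:Verb",
}

# TBox: class hierarchy inside the annotation scheme (class -> superclass).
TBOX = {
    "ex:CommonNoun": "ex:Noun",
    "ex:ProperNoun": "ex:Noun",
}

# Linking Model: scheme class -> Reference Model class.
LINKING = {
    "ex:Noun": "ref:Noun",
    "ex:Verb": "ref:Verb",
}


def reference_concept(tag):
    """Return the Reference Model class for a tag, or None if unlinked."""
    cls = ABOX.get(tag)
    # Walk up the scheme's own hierarchy until a linked class is found.
    while cls is not None:
        if cls in LINKING:
            return LINKING[cls]
        cls = TBOX.get(cls)
    return None


print(reference_concept("PROPN"))
```

In this sketch, a scheme without a Linking Model would correspond to an empty `LINKING` dictionary, in which case every lookup returns `None`.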

 

External Reference Models

terminological repository

original URL

local URL

Linking Model

ISO TC37/SC4 Data Category Registry

http://www.isocat.org

t.b.a

t.b.a

GOLD

http://linguistics-ontology.org

t.b.a

t.b.a

 

Other applications

BLL - Bibliography of Linguistic Literature Thesaurus

OLiA also serves as a conceptual backbone for the ontological reconstruction, or LLOD edition, of legacy thesauri of linguistic terminology. This includes the Bibliography of Linguistic Literature (BLL) Thesaurus. The BLL is one of the most important sources of bibliographical information for general linguistics as well as English, German and Romance linguistics, and its thesaurus organizes the keywords that have been used for indexing linguistic literature since the 1950s.

terminological repository

original URL

Linking Model

BLL Thesaurus (SKOS)

BLL Thesaurus (different formats available via content negotiation)

none

BLL Ontology (OWL)

BLL Ontology (different formats available via content negotiation)

bll-link.rdf
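
Since the BLL resources are described as available in different formats via content negotiation, a client would select a serialization through the HTTP `Accept` header. The sketch below only builds such a request and does not send it. The URI is a hypothetical placeholder (the actual BLL address is not given here), and the set of media types the server really supports is an assumption.

```python
from urllib.request import Request

# Hypothetical placeholder; substitute the actual BLL resource URI.
BLL_URI = "http://example.org/bll/thesaurus"

# Common RDF media types; which of these the server offers is an assumption.
RDF_MEDIA_TYPES = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "jsonld": "application/ld+json",
}


def build_request(uri, fmt):
    """Build a GET request asking for the given RDF serialization."""
    return Request(uri, headers={"Accept": RDF_MEDIA_TYPES[fmt]})


req = build_request(BLL_URI, "turtle")
print(req.get_header("Accept"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual retrieval once a real URI is substituted.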