CLARIN Tool Portal

The CLASSLA-Stanza model for morphosyntactic annotation of non-standard Croatian 2.1

3 resources

This model for morphosyntactic annotation of non-standard Croatian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the hr500k training corpus (http://hdl.handle.net/11356/1792) and the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1793), using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1790). To handle missing diacritics, the training corpora were augmented by repeating parts of them with diacritics removed. The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~92.49. Compared with the previous version, this model uses the new version of the Croatian word embeddings and is trained on a combination of two datasets (hr500k and ReLDI-NormTagNER-hr).
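The diacritic augmentation mentioned in this description could look roughly like the sketch below. It is an illustration only, not the procedure used to build the model: the function names, the sampled fraction, and the random sampling are assumptions, since the record does not say which parts of the corpora were repeated. Note that "đ" has no Unicode decomposition, so it needs an explicit mapping.

```python
import random
import unicodedata

# NFD normalization does not decompose the stroke letters, so map them explicitly.
EXTRA_MAP = str.maketrans({"đ": "d", "Đ": "D"})


def strip_diacritics(text: str) -> str:
    """Return text with diacritics removed (č -> c, š -> s, đ -> d)."""
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA_MAP))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def augment(sentences, fraction=0.5, seed=0):
    """Append diacritic-free copies of a random fraction of the sentences.

    Each sentence is a list of (form, tag) pairs. Only the word forms are
    changed; the gold tags are copied unchanged. `fraction` and `seed` are
    hypothetical parameters chosen for this sketch.
    """
    sentences = list(sentences)
    rng = random.Random(seed)
    sampled = rng.sample(sentences, int(len(sentences) * fraction))
    copies = [[(strip_diacritics(form), tag) for form, tag in sent] for sent in sampled]
    return sentences + copies
```

Keeping the tags unchanged is the point of the augmentation: the tagger sees the same annotation for "kuća" and "kuca" and learns to treat them alike.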

The CLASSLA-StanfordNLP model for morphosyntactic annotation of non-standard Slovenian 1.0

3 resources

This model for morphosyntactic annotation of non-standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the ssj500k training corpus (http://hdl.handle.net/11356/1210) and the Janes-Tag corpus (http://hdl.handle.net/11356/1238), using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204). To handle missing diacritics, the training corpora were augmented by repeating parts of them with diacritics removed. The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~96.14.

The CLASSLA-StanfordNLP model for morphosyntactic annotation of non-standard Croatian 1.0

3 resources

This model for morphosyntactic annotation of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1210), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241), the RAPUT corpus (https://www.aclweb.org/anthology/L16-1513/), and the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). To handle missing diacritics, the training corpora were augmented by repeating parts of them with diacritics removed. The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~95.11.

The CLASSLA-Stanza model for morphosyntactic annotation of spoken Slovenian 2.2

3 resources

This model for morphosyntactic annotation of spoken Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SST treebank of spoken Slovenian (https://github.com/UniversalDependencies/UD_Slovenian-SST) combined with the SUK training corpus (http://hdl.handle.net/11356/1959), using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1791) that were expanded with the MaCoCu-sl Slovene web corpus (http://hdl.handle.net/11356/1517). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~96.76.
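The three label types the models produce correspond to the UPOS, XPOS, and FEATS columns of the CoNLL-U format. As a minimal sketch, assuming the annotated output is available as CoNLL-U text, the labels of one token line can be read as follows. The `Token` type, the helper names, and the sample line are illustrative and are not taken from any of the corpora listed here.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str
    upos: str   # Universal POS tag
    xpos: str   # language-specific tag (MULTEXT-East for these models)
    feats: dict  # universal morphological features


def parse_feats(field: str) -> dict:
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_token_line(line: str) -> Token:
    """Parse one tab-separated CoNLL-U token line (10 columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    # Column order: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    return Token(form=cols[1], upos=cols[3], xpos=cols[4], feats=parse_feats(cols[5]))


# Illustrative line for the Slovenian noun "hiša" (house).
sample = "1\thiša\thiša\tNOUN\tNcfsn\tCase=Nom|Gender=Fem|Number=Sing\t0\troot\t_\t_"
token = parse_token_line(sample)
```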

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Slovenian 1.3

3 resources

This model for morphosyntactic annotation of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the ssj500k training corpus (http://hdl.handle.net/11356/1210) and using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~97.06. Compared with the previous version, the model now also includes the Sloleks inflectional lexicon.

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Bulgarian 1.1

3 resources

This model for morphosyntactic annotation of standard Bulgarian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the BulTreeBank training corpus (http://hdl.handle.net/11495/D93F-C6E9-65D9-2) and using the CoNLL2017 word embeddings (http://hdl.handle.net/11234/1-1989). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~96.8. Compared with the previous version, the pre-trained embeddings are limited to 250 thousand entries and adapted to the new code base.
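The record does not say how the embeddings were limited to 250 thousand entries. One plausible way, sketched here under the assumption that the vectors are stored in the word2vec text format (a "count dimension" header line followed by one vector per line, most frequent words first), is to keep the first entries and rewrite the header. The function name `truncate_embeddings` is hypothetical.

```python
def truncate_embeddings(lines, max_entries=250_000):
    """Yield a word2vec-style text file limited to the first max_entries vectors.

    `lines` is any iterable of text lines, such as an open file. The first
    line is assumed to be the '<count> <dimension>' header.
    """
    it = iter(lines)
    total, dim = (int(x) for x in next(it).split())
    kept = min(total, max_entries)
    # Rewrite the header so the declared count matches what is emitted.
    yield f"{kept} {dim}\n"
    for i, line in enumerate(it):
        if i >= kept:
            break
        yield line
```

Because the function works on line iterables, it can stream a large file without loading it into memory, for example by passing an open file handle and writing the yielded lines to a new file.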

The CLASSLA-Stanza model for morphosyntactic annotation of standard Croatian 2.1

3 resources

This model for morphosyntactic annotation of standard Croatian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the hr500k training corpus (http://hdl.handle.net/11356/1792) and using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1790). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~94.87. Compared with the previous version, this model was trained using the new version of the hr500k corpus and the new version of the Croatian word embeddings.

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Serbian 1.2

3 resources

This model for morphosyntactic annotation of standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the SETimes.SR training corpus (http://hdl.handle.net/11356/1200) and using the CLARIN.SI-embed.sr word embeddings (http://hdl.handle.net/11356/1206). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~95.2. Compared with the previous version, the pre-trained embeddings are limited to 250 thousand entries and adapted to the new code base.

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Croatian

3 resources

This model for morphosyntactic annotation of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183) and using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~94.1.

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Croatian 1.1

3 resources

This model for morphosyntactic annotation of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183) and using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~94.1. Compared with the previous version, the model now predicts the whole XPOS tag rather than its individual characters, as was the case in stanfordnlp; the character-level approach resulted in illegal XPOS tags (and slightly decreased performance).
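Why character-level prediction can yield illegal tags is easiest to see on a toy example. The sketch below is not the CLASSLA or stanfordnlp implementation; the three-tag tagset and all scores are made up for illustration. Choosing the best character at each position independently can assemble a string that is not in the tagset, while choosing among whole tags cannot.

```python
# Toy tagset, assumed for this sketch only.
LEGAL_TAGS = {"Ncmsn", "Ncfsn", "Ncfsa"}


def predict_per_position(position_scores):
    """Pick the highest-scoring character at each tag position independently."""
    return "".join(max(scores, key=scores.get) for scores in position_scores)


def predict_whole_tag(tag_scores):
    """Pick the highest-scoring complete tag; the result is always a known tag."""
    return max(tag_scores, key=tag_scores.get)


# Made-up scores: gender leans masculine, case leans accusative.
position_scores = [
    {"N": 1.0},
    {"c": 1.0},
    {"m": 0.6, "f": 0.4},
    {"s": 1.0},
    {"n": 0.45, "a": 0.55},
]
tag_scores = {"Ncmsn": 0.40, "Ncfsn": 0.25, "Ncfsa": 0.35}

per_position = predict_per_position(position_scores)  # "Ncmsa", not in LEGAL_TAGS
whole_tag = predict_whole_tag(tag_scores)             # "Ncmsn"
```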
