CLARIN Tool Portal

Active filters:

Tool task: Lemmatisation

66 record(s) found

Search results

EvaLatin 2020 models for UDPipe 2 (2020-08-31)

2 resources

POS tagger and lemmatizer models for EvaLatin2020 data (https://github.com/CIRCSE/LT4HALA). The model documentation, including performance figures, can be found at https://ufal.mff.cuni.cz/udpipe/2/models#evalatin20_models . To use these models, you need UDPipe 2.0 or later, which you can download from https://ufal.mff.cuni.cz/udpipe/2 .

The CLASSLA-StanfordNLP model for lemmatisation of non-standard Croatian 1.0

2 resources

The model for lemmatisation of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1210), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241), the RAPUT corpus (https://www.aclweb.org/anthology/L16-1513/) and the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), using the hrLex inflectional lexicon (http://hdl.handle.net/11356/1232). These corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed. The estimated F1 of the lemma annotations is ~97.54.
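
The diacritic augmentation described above can be illustrated with a minimal sketch. It is not the authors' actual procedure: the function names, the sampling fraction, the mapping of "đ" to "d", and the choice to leave lemmas unchanged are all assumptions made for illustration.

```python
import random
import unicodedata

# Assumption: "đ"/"Đ" have no Unicode decomposition, so they are mapped by hand.
# Mapping them to plain "d"/"D" is one common convention, not a documented one.
EXTRA_MAP = str.maketrans({"đ": "d", "Đ": "D"})


def strip_diacritics(text):
    """Remove diacritics, e.g. 'č' -> 'c', by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA_MAP))
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def augment(sentences, fraction=0.5, seed=0):
    """Return the corpus plus diacritic-free copies of a sample of sentences.

    Each sentence is a list of (form, lemma) pairs. Only the word forms are
    stripped; keeping the lemmas intact is an assumption of this sketch.
    """
    rng = random.Random(seed)
    sample_size = int(len(sentences) * fraction)
    sampled = rng.sample(sentences, sample_size)
    copies = [
        [(strip_diacritics(form), lemma) for form, lemma in sentence]
        for sentence in sampled
    ]
    return list(sentences) + copies


corpus = [
    [("kuće", "kuća"), ("su", "biti")],
    [("među", "među"), ("ljudima", "čovjek")],
]
augmented = augment(corpus, fraction=1.0)
```

With `fraction=1.0` every sentence is repeated once without diacritics, so the model sees both "kuće" and "kuce" paired with the lemma "kuća".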

The CLASSLA-StanfordNLP model for lemmatisation of non-standard Serbian 1.0

2 resources

The model for lemmatisation of non-standard Serbian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the SETimes.SR training corpus (http://hdl.handle.net/11356/1200), the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241), the hr500k training corpus (http://hdl.handle.net/11356/1210) and the RAPUT corpus (https://www.aclweb.org/anthology/L16-1513/), using the srLex inflectional lexicon (http://hdl.handle.net/11356/1233). These corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed. The estimated F1 of the lemma annotations is ~97.62.

WebLicht Lemmas DE

1 resource

WebLicht Easy Chain for Lemmatization (German). The pipeline makes use of WebLicht's TCF converter, the IMS tokenizer, and the IMS TreeTagger. WebLicht's Tundra can be used to visualize the result.

DARIAH DKPro-Wrapper: POS-Tagging und Lemmatization EN

1 resource

The DARIAH DKPro Wrapper is a wrapper for DKPro Core, a tool for linguistic annotation.

WebLicht Lemmas EN

1 resource

WebLicht Easy Chain for Lemmatization (English). The pipeline makes use of WebLicht's TCF converter, the Stanford tokenizer, the Jitar POS Tagger, and the lemmatizer service from MorphAdorner. WebLicht's Tundra can be used to visualize the result.

int-pie

2 resources

The PIE tagger with custom modifications by the Dutch Language Institute (INT).

Glem

1 resource

GLEM is a lemmatizer for Ancient Greek.

Frog

1 resource

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. It performs automatic linguistic enrichment such as part-of-speech tagging, lemmatisation, named entity recognition, shallow parsing, dependency parsing and morphological analysis. All NLP modules are based on TiMBL.

GaLAHaD

2 resources

GaLAHaD (Generating Linguistic Annotations for Historical Dutch) allows linguists to compare taggers, tag their own corpora, evaluate the results and export their tagged documents.


