CLARIN Tool Portal

WebStylo

2 resources

Web-based, open stylometry system based on multilevel text analysis. Runs the CLUTO and stylo (R) clustering methods. Built on a Natural Language Processing workflow engine (included in the distribution).

Toposław 2 (2016-05-31)

3 resources

Toposław 2 is an editor of multi-word unit inflection lexicons.

ENIAM

4 resources

ENIAM: Categorial Syntactic-Semantic Parser for Polish

Liner2.6 model NER NKJP

3 resources

The package contains a pre-trained Liner2 (https://github.com/CLARIN-PL/Liner2) model for recognizing named entities according to the NKJP guidelines. The model was trained on the NKJP corpus (http://nkjp.pl/) and evaluated in PolEval 2018 Task 2 (http://poleval.pl/tasks/). It took third place with the following results: Exact: 0.778, Overlap: 0.818, Final: 0.810.

References:
* NKJP corpus in TEI format: http://clip.ipipan.waw.pl/NationalCorpusOfPolish?action=AttachFile&do=view&target=NKJP-PodkorpusMilionowy-1.2.tar.gz
* PolEval 2018 Task 2 evaluation corpus: http://mozart.ipipan.waw.pl/~axw/poleval2018/
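The three scores reported for the Liner2.6 model appear to be related by the PolEval 2018 Task 2 ranking formula. The sketch below is a small arithmetic check, assuming the final score is a weighted sum of 0.8 times the overlap score plus 0.2 times the exact-match score; the function name `poleval_final` is hypothetical and is not part of Liner2 or of the PolEval tooling.

```python
def poleval_final(exact: float, overlap: float) -> float:
    """Combine exact-match and overlap scores.

    Assumption: the weighting 0.8 * overlap + 0.2 * exact, as described
    for PolEval 2018 Task 2. It is not stated in the listing itself.
    """
    return 0.8 * overlap + 0.2 * exact


# Scores reported in the listing for the Liner2.6 NKJP model.
score = poleval_final(exact=0.778, overlap=0.818)
print(f"{score:.3f}")  # prints 0.810
```

Under that assumption the computed value matches the reported Final score of 0.810.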

Universal Dependencies 2.10 models for UDPipe 2 (2022-07-11)

2 resources

Tokenizer, POS Tagger, Lemmatizer and Parser models for 123 treebanks of 69 languages of Universal Dependencies 2.10 Treebanks, created solely using UD 2.10 data (https://hdl.handle.net/11234/1-4758). The model documentation, including performance, can be found at https://ufal.mff.cuni.cz/udpipe/2/models#universal_dependencies_210_models . To use these models, you need UDPipe version 2.0, which you can download from https://ufal.mff.cuni.cz/udpipe/2 .

Tests for Word Embeddings

4 resources

Evaluation tools (WBST, HWBST, EWBST) for word embedding models, used to assess and compare the usefulness of different word embeddings.

CUBBITT Translation Models (en-pl) (v1.0)

3 resources

CUBBITT En-Pl translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on newstest2020 (BLEU): en->pl: 12.3, pl->en: 20.0 (evaluated using multeval: https://github.com/jhclark/multeval).

CorPipe 23 multilingual CorefUD 1.1 model (corpipe23-corefud1.1-231206)

2 resources

`corpipe23-corefud1.1-231206` is an `mT5-large`-based multilingual model for coreference resolution, usable in CorPipe 23 (https://github.com/ufal/crac2023-corpipe). It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (no _corpus id_ on input), so it can be used to predict coreference in any `mT5` language (for zero-shot evaluation, see the paper). Note, however, that empty nodes must already be present in the input; they are not predicted (the same setting as in the CRAC23 shared task).

Keyword Extractor

1 resource

Tool for extracting key phrases from text, using the TextRank algorithm.
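The listing does not describe how the Keyword Extractor implements TextRank. As a rough illustration of the general idea only, the sketch below ranks words by running a PageRank-style iteration over a word co-occurrence graph. The function name `textrank_keywords` and all parameter defaults are assumptions made for this example, not the tool's actual interface.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank words with a simplified, unweighted TextRank (illustrative only)."""
    # Build an undirected co-occurrence graph: two different words are
    # linked when they occur within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # PageRank-style update: each word receives a share of the score of
    # every neighbor, divided by that neighbor's degree.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for word in neighbors:
            rank = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            new_scores[word] = (1 - damping) + damping * rank
        scores = new_scores

    # Return the highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


tokens = "graph ranking graph nodes graph edges graph scores".split()
print(textrank_keywords(tokens, window=1, top_k=1))  # prints ['graph']
```

A full key-phrase extractor would typically also filter candidates by part of speech and merge adjacent top-ranked words into phrases; those steps are omitted here.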

Cinderella - tool for Clustering and Classifications of Texts in Polish

2 resources

System for clustering and classification of texts in Polish. Source code.
