CLARIN Tool Portal

698 record(s) found

Search results

Dependency tree extraction tool STARK 3.0

2 resources

STARK is a highly customizable tool designed for extracting different types of syntactic structures (trees) from parsed corpora (treebanks), aimed at corpus-driven linguistic investigations of syntactic and lexical phenomena of various kinds. It takes a treebank in the CONLL-U format as input and returns a list of all relevant dependency trees with frequency information and other useful statistics, such as the strength of association between the nodes of a tree, or its significance in comparison to another treebank. For installation, execution and the description of various user-defined parameter settings, see the official project page at: https://github.com/clarinsi/STARK. An online demo version of the tool is available at: https://orodja.cjvt.si/stark/. In comparison to v2, this version introduces several new features and improvements, such as the ability to extract very long trees, ignore irrelevant relations, process multi-root treebanks, or handle special operators when querying.

Use "Dependency tree extraction tool STARK 3.0"
Spellchecking app for Android (22.10)

2 resources

ENGLISH: This is an Android application which provides spell and grammar checking for Icelandic. The app is available on Google Play Store under the name "Réttritun". The source code is written in Kotlin and could be used as a base for Android app projects that need an Icelandic spell checking service. The app uses the spell checker service as impelmented by Miðeind ehf. in the Language Technology Program. See also: http://hdl.handle.net/20.500.12537/266 and http://hdl.handle.net/20.500.12537/270 ÍSLENSKA: Réttritun er Android app sem býður upp á málrýni fyrir íslensku. Appið er hægt að nálgast á Google Play Store. Kóðinn er skrifaður í Kotlin og gæti verið notaður sem grunnur fyrir önnur Android app verkefni sem vilja nýta málrýni fyrir íslensku. Appið notar málrýniþjónustu eins og þá sem Miðeind ehf. þróaði innan Máltækniáætlunarinnar. Sjá: http://hdl.handle.net/20.500.12537/266 and http://hdl.handle.net/20.500.12537/270

Use "Spellchecking app for Android (22.10)"
GreynirPackage v3.5.1

3 resources

GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text. More information at https://github.com/mideind/GreynirPackage and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta. Frekari upplýsingar má finna á https://github.com/mideind/GreynirPackage og ítarlega skjölun (á ensku) á https://greynir.is/doc/.

Use "GreynirPackage v3.5.1"
Tiro TTS web service (22.06)

2 resources

Tiro TTS is a text-to-speech (TTS) API web service that works with various TTS backends. By default, it expects a FastSpeech2+Melgan+Sequitur backend. See the https://github.com/cadia-lvl/fastspeech2 repository for more information on the backend. The service can accept either unnormalized text or an SSML document and respond with audio (MP3, Ogg Vorbis or raw 16 bit PCM) or speech marks, indicating the byte and time offset of each synthesized word in the request. The full API documentation in OpenAPI 2 format is available online at tts.tiro.is. The code for the service along with further information is on https://github.com/tiro-is/tiro-tts/releases/tag/M8. You should also check if a newer version is out (see README.md)

Use "Tiro TTS web service (22.06)"
RÚV-DI Speaker Diarization (20.09)

2 resources

These are a set of speaker diarization recipes which depend on the speech toolkit Kaldi. There are two types of recipes here. First are recipes used for decoding unseen audio. The second type of recipes are for training diarization models on the Rúv-di data. This tool also lists the DER for the Rúv-di dataset on most of the recipes. All DERs within this tool have no unscored collars and include overlapping speech

Use "RÚV-DI Speaker Diarization (20.09)"
GreynirT2T - En--Is NMT with Tensor2Tensor (1.0)

2 resources

A program library for training English-Icelandic neural machine translation systems, built on top of Tensor2Tensor and Tensorflow. Supports training with or without back-translated data. Forritasafn til að þjálfa þýðingarlíkön sem þýða milli íslensku og ensku. Uppsetningin er byggð á Tensor2Tensor og Tensorflow. Safnið styður þjálfun með og án bakþýðingargagna.

Use "GreynirT2T - En--Is NMT with Tensor2Tensor (1.0)"
NameTag 2

2 resources

NameTag 2 is a named entity recognition tool. It recognizes named entities (e.g., names, locations, etc.) and can recognize both flat and embedded (nested) entities. NameTag 2 can be used either as a commandline tool or by requesting the NameTag webservice. NameTag webservice can be found at: https://lindat.mff.cuni.cz/services/nametag/ NameTag commandline tool can be downloaded from NameTag GitHub repository, branch nametag2: git clone https://github.com/ufal/nametag -b nametag2 Latest models and documentation can be found at: https://ufal.mff.cuni.cz/nametag/2 This software subject to the terms of the Mozilla Public License, v. 2.0 (http://mozilla.org/MPL/2.0/). The associated models are distributed under CC BY-NC-SA license. Please cite as: Jana Straková, Milan Straka, Jan Hajič (2019): Neural Architectures for Nested NER through Linearization. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5326-5331, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-48-2 (https://aclweb.org/anthology/papers/P/P19/P19-1527/)

Use "NameTag 2"
Corpus extraction tool LIST 1.0

2 resources

The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that can be imported into Microsoft Excel or similar statistical processing software.

Use "Corpus extraction tool LIST 1.0"
Webrice extension (22.09)

2 resources

The Webrice plugin is a software add-on that gives access to people to listen to web pages instead of reading them. This chrome browser extension changes Icelandic text to speech. Webrice-viðbótin er hugbúnaðarforrit sem hjálpar notendum að velja texta og hlusta á hann í staðinn fyrir að lesa. Þessi Chrome-viðbót breytir íslenskum textan í tal.

Use "Webrice extension (22.09)"
Slovene Text Normalizator RSDO-DS2-NORM 1.0

2 resources

This Text Normalisator converts Slovene text from written-form into its spoken-form. Traditionally it is an essential preprocessing step before text-to-speech (TTS). As input it accepts text as a string, and returns a dictionary with fields "input_text", "normalised_text", "status" and "logs". Example: normalize_text("Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C.") {'input_text': 'Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C.', 'normalized_text': 'Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta tisoč devetsto štiriinpetdeset, je, da je temperatura trojne točke vode enaka nič celih nič ena stopinje Celzija.', 'status': 1, 'logs': [('1954', 'tisoč devetsto štiriinpetdeset'), ('0,01', 'nič celih nič ena'), ('°C', 'stopinje Celzija')]} For further details see README.md.

Use "Slovene Text Normalizator RSDO-DS2-NORM 1.0"

Result filters

Metadata provider

Language

Resource type

Type of tool

Tool task

Field of study

Availability

Organisation

Project

Keywords

Search results

Dependency tree extraction tool STARK 3.0

Spellchecking app for Android (22.10)

GreynirPackage v3.5.1

Tiro TTS web service (22.06)

RÚV-DI Speaker Diarization (20.09)

GreynirT2T - En--Is NMT with Tensor2Tensor (1.0)

NameTag 2

Corpus extraction tool LIST 1.0

Webrice extension (22.09)

Slovene Text Normalizator RSDO-DS2-NORM 1.0

Result filters

Metadata provider

Language

Resource type

Type of tool

Tool task

Field of study

Availability

Organisation

Project

Keywords

Search results

Session recording