Javanese Automatic Speech Recognition
Javanese Speech Recognition and its Dataset
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text.
By Wilson Wongso, Steven Limcorn and AI-Research.id team
June 1, 2021
Models
Name | Description | Author | Link |
---|---|---|---|
Wav2Vec2-Large-XLSR-Javanese | Fine-tuned facebook/wav2vec2-large-xlsr-53 on the OpenSLR High quality TTS data for Javanese. When using this model, make sure that your speech input is sampled at 16 kHz. | Cahya Wirawan | HuggingFace |
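
The 16 kHz requirement above means audio recorded at another rate has to be resampled before it reaches the model. The following is a minimal sketch of that step using naive linear interpolation in plain Python. The function name `resample_linear` and the constant `TARGET_RATE` are illustrative and not part of the model's API, and the method applies no anti-aliasing filter, so a real pipeline would more likely rely on a dedicated audio library.

```python
TARGET_RATE = 16_000  # sampling rate (Hz) the model description asks for


def resample_linear(samples, source_rate, target_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation.

    Illustrative helper only: it applies no anti-aliasing filter, so
    downsampling can introduce aliasing artifacts.
    """
    if source_rate <= 0 or target_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)

    n_out = int(len(samples) * target_rate / source_rate)
    step = source_rate / target_rate  # input samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# One second of silence recorded at 48 kHz becomes 16,000 samples.
one_second = [0.0] * 48_000
resampled = resample_linear(one_second, source_rate=48_000)
```

The result is a plain list of floats at the target rate, which can then be passed to whatever feature extraction step the chosen model requires.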
Datasets
Name | Description | Author | Link |
---|---|---|---|
OpenSLR | This data set contains transcribed audio data for Javanese (~185K utterances). The data set consists of wave files and a TSV file. The file utt_spk_text.tsv contains a FileID, a UserID, and the transcription of the audio in each file. | Oddur Kjartansson, Supheakmungkol Sarin, Knot Pipatsrisawat, Martin Jansche, and Linne Ha | HuggingFace |
VoxLingua107 | VoxLingua107 is a speech dataset for training spoken language identification models. The dataset consists of speech segments extracted from YouTube videos and post-processed. The Javanese dataset has 53 hours (5.0 GB). | Jörgen Valk, Tanel Alumäe | bark.phon.ioc.ee |
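
The OpenSLR entry describes utt_spk_text.tsv as holding a FileID, a UserID, and a transcription per utterance. As a rough sketch, such a file could be read as shown below. It assumes three tab-separated columns in that order with no header row, which the description suggests but does not spell out. The `Utterance` type, the `read_utterances` function, and the sample rows are made up for illustration.

```python
import csv
import io
from typing import NamedTuple


class Utterance(NamedTuple):
    file_id: str  # identifier of the audio file
    user_id: str  # identifier of the speaker
    text: str     # transcription of the audio


def read_utterances(tsv_file):
    """Parse an open text file of FileID<TAB>UserID<TAB>transcription rows."""
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        rows.append(Utterance(*(field.strip() for field in row)))
    return rows


# Made-up sample rows standing in for the real file contents.
sample = io.StringIO(
    "utt0001\tspk01\tsugeng enjing\n"
    "utt0002\tspk02\tmatur nuwun\n"
)
utterances = read_utterances(sample)
```

For the actual data, the same function can be given a file handle, for example `open("utt_spk_text.tsv", encoding="utf-8", newline="")`, provided the file follows the layout assumed here.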