Javanese Machine Translation

Models and Datasets for Javanese Machine Translation

Machine translation is the task of automatically converting text in one natural language into another while preserving the meaning of the input and producing fluent text in the output language.

By Wilson Wongso, Steven Limcorn and the AI-Research.id team

June 1, 2021

Models

Name | Description | Author | Link
OPUS-MT-MUL-EN | Machine translation from multiple languages to English. | Language Technology Research Group at the University of Helsinki | HuggingFace
OPUS-MT-EN-MUL | Machine translation from English to multiple languages. | Language Technology Research Group at the University of Helsinki | HuggingFace
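
The two models listed above can be tried with the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' own pipeline. It assumes the checkpoints are published under the Hub IDs `Helsinki-NLP/opus-mt-mul-en` and `Helsinki-NLP/opus-mt-en-mul`, and that `jav` is among the target language IDs listed on the English-to-multilingual model card. The helper names `add_target_token` and `translate` are hypothetical and exist only for this illustration.

```python
def add_target_token(sentences, target_lang="jav"):
    """Prefix each sentence with the >>lang<< token that multi-target
    OPUS-MT models use to select the output language.
    The default "jav" (Javanese) is an assumption; check the model card."""
    return [f">>{target_lang}<< {sentence}" for sentence in sentences]


def translate(sentences, model_name):
    """Translate a list of sentences with a MarianMT checkpoint."""
    # Imported here so the helper above stays usable without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Tokenize as one padded batch of PyTorch tensors.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example usage (downloads each checkpoint on first call):
# Javanese -> English
# translate(["Sugeng enjing."], "Helsinki-NLP/opus-mt-mul-en")
# English -> Javanese
# translate(add_target_token(["Good morning."]), "Helsinki-NLP/opus-mt-en-mul")
```

Only the English-to-multilingual direction needs the language token; the multilingual-to-English model takes the Javanese input as-is.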

Datasets

Name | Description | Author | Link
Ubuntu | A parallel corpus of Ubuntu localization files. | J. Tiedemann | HuggingFace
Tatoeba | This is a collection of translated sentences from Tatoeba. | J. Tiedemann | HuggingFace
QED | The QCRI Educational Domain Corpus (formerly QCRI AMARA Corpus) is an open multilingual collection of subtitles for educational videos and lectures collaboratively transcribed and translated over the AMARA web-based platform. | Qatar Computing Research Institute, Arabic Language Technologies Group | HuggingFace
The Universal Declaration of Human Rights (UDHR) | The Universal Declaration of Human Rights (UDHR) is a milestone document in the history of human rights. Drafted by representatives with different legal and cultural backgrounds from all regions of the world, it set out, for the first time, fundamental human rights to be universally protected. The Declaration was adopted by the UN General Assembly in Paris on 10 December 1948 during its 183rd plenary meeting. | UDHR & Joe Davison | HuggingFace