Sundanese Token Classification

Models and Datasets for Sundanese Token Classification

Token classification is similar to text classification, except that every token in the text receives its own prediction. A common application of this task is Named Entity Recognition (NER).

By Wilson Wongso, Steven Limcorn, and the AI-Research.id team

June 1, 2021

Datasets

Name | Description | Author | Link
WikiANN | WikiANN (sometimes called PAN-X) is a multilingual named entity recognition dataset consisting of Wikipedia articles annotated with LOC (location), PER (person), and ORG (organisation) tags in the IOB2 format. | Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, and Heng Ji; Afshin Rahimi, Yuan Li, and Trevor Cohn | HuggingFace
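
To make the IOB2 format mentioned in the WikiANN description more concrete, here is a minimal sketch of how per-token tags can be grouped back into entity spans. In IOB2, `B-X` marks the first token of an entity of type `X`, `I-X` marks a continuation of that entity, and `O` marks a token outside any entity. The function name `iob2_to_spans` and the example sentence are illustrative assumptions; the sentence is not taken from the dataset, and this is not the authors' own code.

```python
from typing import List, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB2-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration only.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    def flush() -> None:
        # Close the entity currently being built, if there is one.
        nonlocal current_tokens, current_type
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and tag[2:] == current_type:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity,
            # ends the current span; the stray token itself is ignored.
            flush()
    flush()
    return spans


# Made-up example sentence (an assumption, not a WikiANN record).
tokens = ["Ridwan", "Kamil", "lahir", "di", "Bandung"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(iob2_to_spans(tokens, tags))
# → [('Ridwan Kamil', 'PER'), ('Bandung', 'LOC')]
```

A token classification model produces one such tag per token, and a decoding step like this one turns the tag sequence into the named entities a reader usually cares about.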