true-case

True Case

True-case datasets built with custom augmentation, for training models that restore the correct capitalization of text.
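The page does not describe how the augmentation was done. As a rough illustration only, one common way to build true-case training pairs is to take correctly cased sentences and corrupt their casing, keeping the original as the target. The function names `corrupt_case` and `make_pairs`, the set of corruption modes, and the example sentence are all assumptions for this sketch and are not taken from the dataset's actual pipeline.

```python
import random


def corrupt_case(sentence, rng):
    """Return a copy of `sentence` with its casing corrupted.

    The four modes below are illustrative assumptions, not the
    dataset author's documented method.
    """
    mode = rng.choice(["lower", "upper", "title", "random"])
    if mode == "lower":
        return sentence.lower()
    if mode == "upper":
        return sentence.upper()
    if mode == "title":
        return sentence.title()
    # "random": flip the case of each character with probability 0.5.
    return "".join(c.swapcase() if rng.random() < 0.5 else c for c in sentence)


def make_pairs(sentences, seed=0):
    """Build (corrupted input, correctly cased target) pairs."""
    rng = random.Random(seed)  # seeded so the pairs are reproducible
    return [(corrupt_case(s, rng), s) for s in sentences]


# Hypothetical example sentence, not drawn from the dataset.
pairs = make_pairs(["Kuala Lumpur ialah ibu negara Malaysia."])
```

Each pair keeps the untouched sentence as the target, so a model trained on such pairs learns to map badly cased text back to its true case.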

download

  1. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-multisentences-news.tsv

  2. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-multisentences-wiki.tsv

  3. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-news.tsv

  4. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-short-news.tsv

  5. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-short-wiki.tsv

  6. https://f000.backblazeb2.com/file/malay-dataset/true-case/true-case-wiki.tsv

  7. https://f000.backblazeb2.com/file/malay-dataset/true-case/test-set-true-case.json
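A minimal sketch of how the first rows of one of the TSV files listed above might be previewed with the Python standard library. The helper names (`dataset_url`, `parse_tsv`, `preview`) are hypothetical, and the code assumes the files are UTF-8 encoded and tab-separated without quoting. The column layout is not documented on this page, so rows are returned as plain lists of fields.

```python
import csv
import io
import itertools
import urllib.request

# Base URL shared by every file in the download list.
BASE_URL = "https://f000.backblazeb2.com/file/malay-dataset/true-case/"


def dataset_url(filename):
    """Return the full download URL for a file name from the list."""
    return BASE_URL + filename


def parse_tsv(lines):
    """Yield each line of tab-separated text as a list of fields.

    QUOTE_NONE keeps quote characters inside sentences as literal text
    (an assumption about how the files were written).
    """
    yield from csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)


def preview(filename, n=5, timeout=30):
    """Stream a file and return only its first `n` rows."""
    with urllib.request.urlopen(dataset_url(filename), timeout=timeout) as resp:
        # Assumes UTF-8; decoding lazily avoids loading the whole file.
        lines = io.TextIOWrapper(resp, encoding="utf-8", newline="")
        return list(itertools.islice(parse_tsv(lines), n))
```

Calling `preview("true-case-short-news.tsv")` would fetch that file over the network and return its first five rows. The JSON test set could be read in a similar way with `json.load`, although its structure is also not described here.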

Citation

@misc{Malay-Dataset,
author = {Husein, Zolkepli},
title = {Malay-Dataset},
note = {We gather Bahasa Malaysia corpus! True Case Augmentation},
year = {2018},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/huseinzol05/malay-dataset/tree/master/truecase}}
}


By mesolitica
© Copyright 2020, mesolitica.