spelling-correction

Neuspell

Custom spelling-correction augmentation, built following the Neuspell approach; see https://github.com/neuspell/neuspell
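As a rough illustration of what character-level spelling augmentation can look like, here is a minimal sketch that turns a clean sentence into a (noisy, clean) pair. It is not Neuspell's code and not necessarily how this dataset was produced: the function names `noise_word` and `make_pair`, the four edit operations, the `rate` parameter, and the sample sentence are all assumptions made for this example.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def noise_word(word, rng):
    """Apply one random character edit to a word (hypothetical helper)."""
    # Leave very short words alone so an edit never empties them.
    if len(word) < 3:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "delete", "insert", "replace"])
    if op == "swap":
        # Transpose the characters at positions i and i + 1.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    # replace: overwrite the character at position i.
    return word[:i] + rng.choice(ALPHABET) + word[i + 1:]


def make_pair(sentence, rate=0.3, seed=0):
    """Return (noisy, clean); each word is corrupted with probability `rate`."""
    rng = random.Random(seed)
    noisy = [
        noise_word(w, rng) if rng.random() < rate else w
        for w in sentence.split()
    ]
    return " ".join(noisy), sentence


noisy, clean = make_pair("saya suka makan nasi goreng", rate=0.5, seed=1)
print(noisy, "->", clean)
```

Fixing the seed keeps the generated pairs reproducible, which helps when a noisy training set needs to be regenerated later.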

download

  1. spelling-correction-news.tsv, https://f000.backblazeb2.com/file/malay-dataset/spelling/spelling-correction-news.tsv

  2. spelling-correction-wiki.tsv, https://f000.backblazeb2.com/file/malay-dataset/spelling/spelling-correction-wiki.tsv

  3. test set, https://f000.backblazeb2.com/file/malay-dataset/spelling/testset-spelling-augmentation.json
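The files listed above can be fetched with the Python standard library. This is a minimal sketch: the helper names `file_url`, `download`, and `read_tsv_rows` are made up for this example, and the column layout of the TSV files is not documented here, so the reader simply returns raw tab-separated fields without assuming what each column means.

```python
import csv
import io
import urllib.request

BASE_URL = "https://f000.backblazeb2.com/file/malay-dataset/spelling/"
FILES = [
    "spelling-correction-news.tsv",
    "spelling-correction-wiki.tsv",
    "testset-spelling-augmentation.json",
]


def file_url(name):
    """Build the full download URL for one of the listed files."""
    return BASE_URL + name


def download(name, dest=None):
    """Download `name` into the current directory (or to `dest`)."""
    dest = dest or name
    urllib.request.urlretrieve(file_url(name), dest)
    return dest


def read_tsv_rows(text, limit=5):
    """Return the first `limit` rows of tab-separated text as lists of fields."""
    # QUOTE_NONE keeps stray quote characters in the text as literal data.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        rows.append(row)
        if len(rows) >= limit:
            break
    return rows
```

To use it, call `download(FILES[0])` and then pass the file's text to `read_tsv_rows` to check the columns before committing to a schema. The JSON test set can be opened with `json.load` once downloaded; its structure is likewise not described here and should be inspected first.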

By mesolitica
© Copyright 2020, mesolitica.