Explore Projects

Discover 5 open-source projects

Active filters (1):

Search: corpora


NLTK Data is a collection of datasets, models, and other resources for natural language processing in Python.

1.8K

Active

Python

Natural Language Processing


#nlp #linguistics #corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K

Archived

AI Voice & Speech

Databases

#speech-recognition #speech-synthesis #speech-processing

A Python library for augmenting Chinese text corpora using the EDA (Easy Data Augmentation) technique.

1.4K

Archived

Python

Text Augmentation

#chinese #data-augmentation #text-classification

An open-source library for automatic high-quality phrase mining from large text corpora.

1.2K

Archived

C++

Text Mining

#text-mining #phrase-extraction #lexicon-generation

A data repository of pre-trained NLP models and corpora for use in language processing projects.

1.0K

Archived

Python

LLM Frameworks

Databases


#nlp #corpora #pretrained-models
