Explore Projects

Discover 5 open source projects

Active filters (1):
Search: corpora
Showing 1-5 of 5 projects

nltk/nltk_data

NLTK Data is a collection of datasets, models, and other resources for natural language processing in Python.

1.8K
Active
Python
Natural Language Processing
#nlp #linguistics #corpora

coqui-ai/open-speech-corpora

A collection of open-source speech corpora for building speech recognition, synthesis, and other audio applications.

1.4K
Archived
AI Voice & Speech
Databases
#speech-recognition #speech-synthesis #speech-processing

zhanlaoban/EDA_NLP_for_Chinese

A Python library for easy data augmentation of Chinese text corpora using the EDA (Easy Data Augmentation) technique.

1.4K
Archived
Python
Text Augmentation
#chinese #data-augmentation #text-classification

shangjingbo1226/AutoPhrase

An open-source library for automatic high-quality phrase mining from large text corpora.

1.2K
Archived
C++
Text Mining
#text-mining #phrase-extraction #lexicon-generation

piskvorky/gensim-data

A data repository for pre-trained NLP models and corpora to use in language processing projects.

1.0K
Archived
Python
LLM Frameworks
Databases
#nlp #corpora #pretrained-models
