Inventing the future of AI

Training the second generation of AI to speak naturally.

Participate

Data Collection

Collecting data to train Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) systems in Large Language Models (LLMs) requires dedication and teamwork. Our team of professionals collects raw, unfiltered data for use in AI voice projects. Collection is currently under way with six different ethnic groups.


More Details

Participants 

Vetted for accent and diversity within each geographical location.

Corpus

Conversational in content and style.

Photo by <a href="https://unsplash.com/@salahdarwish?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Salah Darwish</a> on <a href="https://unsplash.com/photos/a-pair-of-black-shoes-TE40O_a3pWM?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>

Recording

48 kHz, 24-bit, mono, direct line-in recording.

Data

Minimum standard specification: 24-bit, 48 kHz, RIFF WAV.
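
As a rough illustration of the specification above, a delivered file could be checked with a short script like the one below. This is only a sketch using Python's standard `wave` module; the function name `check_wav_spec` and the constants are hypothetical, and the mono requirement is taken from the Recording description rather than from the Data line. Note that `wave` reads plain PCM files, and some Python versions reject 24-bit files saved in the extensible WAV format.

```python
import wave

# Assumed thresholds, taken from the stated minimum specification.
MIN_SAMPLE_RATE_HZ = 48_000   # 48 kHz
MIN_SAMPLE_WIDTH_BYTES = 3    # 24-bit samples are 3 bytes wide
EXPECTED_CHANNELS = 1         # mono, per the Recording description


def check_wav_spec(path):
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    # wave.open raises wave.Error if the file is not a readable RIFF WAV.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()

    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if width < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(f"bit depth {width * 8}-bit is below 24-bit")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels found, expected mono")
    return problems
```

Calling `check_wav_spec("session.wav")` on a hypothetical file named `session.wav` returns an empty list when the file meets the minimum, or a list of readable messages describing each shortfall.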

Where you can find us 


Social Media
