We're building a portfolio of technologies to power voice-enabled devices.
We're curating the best voice datasets in the world.
Over 150,000 unique voicemails, each from a unique speaker leaving a message around an anniversary or birthday.
3,000+ labels of each emotion type: happy, sad, disgust, angry, fear, surprise, and neutral.
1,000+ samples in 16 types of English accents all repeating the same sentence.
10,000+ gender labels (e.g. male or female) across many accents.
1,000+ voice samples in each class: iPhone and/or iMac computer.
1,000+ samples of users speaking naturally or abnormally (e.g. singing).
5,000+ samples labeled as voice events, music events, typing events, and silence events.
1,000+ samples labeled with fatigue level (e.g. awake or tired).
500+ voice data samples with self-reported PHQ-9 scores, a measure for depression.
300+ voice data samples with self-reported BPRS scores, a measure for psychosis.
2,000+ patients with Parkinson's disease, data collected by Sage Bionetworks (mPower study).
200+ Alzheimer's patients with MRI images, data collected by the Framingham Heart Study.
We can work alongside your team to unleash the power of voice computing.
Remove noise or augment voice datasets to be indicative of real-world use.
We can extract state-of-the-art feature embeddings for audio, text, and image data.
Work with our team to build deep learning models from your voice data.
Work with our team to source-separate multiple speakers.
Use our team to manually or automatically label datasets.
Work with our team to find new ways to monetize your voice data.
Use our API to clean, featurize, and model voice data on servers.
Schedule a call to discuss how we can add value to your organization through a subscription, data license, or a service contract.
We then write up a short strategic plan with a mission, goals, and success metrics for a risk-free 14-day trial.
We then move forward with this free 14-day trial to test the subscription, data license, or service contract on the defined success metrics.
We can then move forward with an engagement. Typically, engagements occur over 3, 6, 12, or 18 months.
We recently wrote a textbook to help train more developers in this field.
Check it out at this link.
Yes! We can customize service contracts to meet your needs. Here are some services that we can provide beyond our core product offerings:
1. Manual data labeling - $1/minute of audio per label
2. Automated data labeling - $0.12/minute of audio
3. Data cleaning - $100/hour
4. Data modeling - $150/hour
5. Research assistants (collect data) - $15/hour
6. Speaker diarization (up to 3 speakers) - $0.024/minute of audio processed
7. Hardware design - Varies based on scope
8. Manual transcription (100% accuracy) - $0.80/minute
9. Amazon Alexa skills / Google Assistant apps - $1,000/application
10. Custom software development - Varies based on scope
If you need something that is not on the list above, reach out to us. With more than 30 voice engineers in our lab, we can most likely help.
We can start projects immediately and scale up very quickly.
Just book a meeting on our calendar here and we'll handle the rest.
We commonly get asked how many and what types of features we extract to build machine learning models. Simply, there are 4 main types of features that we can extract:
1. Audio features - acoustic features extracted from an audio file (e.g. fundamental frequency)
2. Text features - language features extracted from a transcript (e.g. noun frequency)
3. Mixed features - features that combine audio features and text features, often as a ratio (e.g. speaking rate)
4. Meta features - features derived from machine learning models on any of the embeddings described above (e.g. age - 20s).
You can read more about each of these types of features in this SurveyLex FAQ document.
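As a rough illustration of how a mixed feature combines the two sources, here is a minimal sketch that computes speaking rate as words per second. The function names, the sample transcript, and the duration are hypothetical and chosen only for this example; this is not the ModLex or SurveyLex implementation.

```python
def word_count(transcript: str) -> int:
    """Text feature: number of whitespace-separated words in the transcript."""
    return len(transcript.split())


def speaking_rate(transcript: str, duration_seconds: float) -> float:
    """Mixed feature: words (from text) divided by audio duration (from audio)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count(transcript) / duration_seconds


# Hypothetical sample: a four-word transcript from a two-second recording.
rate = speaking_rate("happy birthday to you", 2.0)
print(f"speaking rate: {rate:.2f} words/second")
```

In practice the duration would be read from the audio file and the transcript would come from manual or automated transcription.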
We have an internal product, ModLex, which makes it quite easy for us to featurize data and optimize machine learning models automatically. This makes it much more affordable for you to build machine learning models.
Right now we’re training mostly simple classification models because many of the datasets that we have curated internally have relatively small sample sizes (e.g. 100-1000 labels per class). However, for larger datasets we build and optimize models with deep learning techniques (e.g. attention-based neural networks). Here is a list of all the model families that we currently test with the ModLex software:
- Naive Bayes (NB)
- Decision tree
- Support vector machines (SVM)
- Maximum entropy
- Gradient boost
- Logistic regression
- Hard voting
- K nearest neighbors (KNN)
- Random forest
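Most of the families above are standard classifiers; hard voting is the one that combines the others. A minimal sketch of the idea, assuming each model has already produced one predicted label per sample (the function name and the sample predictions are hypothetical, not ModLex code):

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority label per sample across several models.

    predictions_per_model holds one list of labels per model, all aligned
    to the same samples.
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(preds) != n_samples for preds in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical predictions from three models on three samples.
ensemble = hard_vote([
    ["happy", "sad", "neutral"],
    ["happy", "happy", "neutral"],
    ["sad", "sad", "neutral"],
])
print(ensemble)
```

Using an odd number of models avoids most ties in two-class problems; with more classes, a tie-breaking rule would need to be chosen explicitly.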
After running the ModLex software with one of the training scripts, we can then assess each model's accuracy from the .TXT output file. This process allows us to rapidly prototype many different feature embeddings and model types at once.
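To make the idea of an accuracy report concrete, here is a minimal sketch that scores two sets of predictions and writes the results to a text file. The file name, report layout, labels, and model names are assumptions made for illustration; the actual ModLex output format is not described here.

```python
import os
import tempfile


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def write_report(results, path):
    """Write one 'model<TAB>accuracy' line per model, best model first."""
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    with open(path, "w", encoding="utf-8") as handle:
        for model_name, score in ranked:
            handle.write(f"{model_name}\t{score:.3f}\n")


# Hypothetical held-out labels and predictions from two model types.
y_true = ["awake", "tired", "awake", "tired"]
predictions = {
    "naive_bayes": ["awake", "tired", "tired", "tired"],
    "random_forest": ["awake", "tired", "awake", "tired"],
}

results = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
report_path = os.path.join(tempfile.mkdtemp(), "accuracy_report.txt")
write_report(results, report_path)
```

Sorting the report by accuracy puts the strongest candidate at the top, which is convenient when many feature embeddings and model types are compared in one run.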