Data collection

Use our SurveyLex product to collect massive datasets through web browsers.

Data cleaning

Remove noise or augment voice datasets to be indicative of real-world use.

Featurization

We can extract state-of-the-art feature embeddings for audio, text, and image data.

Machine learning

Work with our team to build deep learning models from your voice data.

Speaker diarization

Work with our team to source-separate multiple speakers.

Data labeling

Use our team to manually or automatically label datasets.

Monetization

Work with our team to find new ways to monetize your voice data.

Server deployments

Use our API to clean, featurize, and model voice data on servers.

Voicemails

Over 150,000 unique voicemails and speakers, with messages left around anniversaries and birthdays.

Emotions

3,000+ labels of each emotion type: happy, sad, disgust, angry, fear, surprise, and neutral.

Accents

1,000+ samples in 16 types of English accents all repeating the same sentence.

Genders

10,000+ gender labels (e.g. male or female) across many accents.

Microphone type

1,000+ voice samples in each class: iPhone or iMac computer.

Speaking type

1,000+ samples of users speaking naturally or abnormally (e.g. singing).

General computer use

5,000+ samples labeled as voice events, music events, typing events, and silence events.

Fatigue

1,000+ samples labeled with fatigue level (e.g. awake or tired).

Depression

500+ voice data samples with self-reported PHQ-9 scores, a measure for depression.

Schizophrenia

300+ voice data samples with self-reported BPRS scores, a measure for psychosis.

Parkinson's disease

2,000+ patients labeled with Parkinson's disease, each with a short voice utterance ('ahhh').

Alzheimer's disease

200+ patients with MRI images labeled with mild cognitive impairment vs. controls.

Step 1:
Discovery Call ➡

Schedule a call to discuss how we can add value to your organization through a subscription, data license, or a service contract.

Step 2:
Strategic plan ➡

We then write up a short strategic plan with a mission, goals, and success metrics for a risk-free 14-day trial.

Step 3:
Trial period ➡

We then move forward with this free 14-day trial to test the subscription, data license, or service contract on the defined success metrics.

Step 4:
Contract signing

We can then move forward with an engagement. Typically, engagements occur over 3, 6, 12, or 18 months.

How can I learn more about voice computing?

We recently wrote a textbook to help train more developers in this field.

Check it out at this link.

Can you provide custom services?

Yes! We can customize service contracts to meet your needs. Here are some services that we can provide beyond our core product offerings:

1. Manual data annotation - $1/minute of audio per label
2. Automated data annotation - $0.12/minute of audio
3. Data cleaning - $100/hour
4. Data modeling - $150/hour
5. Research assistants (collect data) - $15/hour
6. Speaker diarization (up to 3 speakers) - $0.024/minute of audio processed
7. Hardware design - Varies based on scope
8. Manual transcription (100% accuracy) - $1-2/minute
9. Custom software development - Varies based on scope

If you need something that is not on the above list, reach out to us. With more than 30 voice engineers in our lab, we can most likely help.
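As a rough illustration of how the per-minute rates in the list above add up, here is a minimal Python sketch. The `estimate_cost` helper and the service keys are hypothetical names made up for this example, not part of any product or API; only the rates come from the list, and actual quotes may differ.

```python
# Per-minute rates (USD) copied from the price list above.
# The dictionary keys are hypothetical names chosen for this example.
RATES_PER_MINUTE = {
    "manual_annotation": 1.00,     # charged per label
    "automated_annotation": 0.12,
    "speaker_diarization": 0.024,  # up to 3 speakers
}


def estimate_cost(service, audio_minutes, labels=1):
    """Hypothetical helper: rate x minutes (x labels for manual annotation)."""
    if service not in RATES_PER_MINUTE:
        raise ValueError(f"unknown service: {service}")
    cost = RATES_PER_MINUTE[service] * audio_minutes
    if service == "manual_annotation":
        cost *= labels  # manual annotation is priced per label
    return round(cost, 2)


# Example: 10 hours (600 minutes) of audio.
print(estimate_cost("manual_annotation", 600, labels=2))
print(estimate_cost("automated_annotation", 600))
print(estimate_cost("speaker_diarization", 600))
```

Hourly services such as data cleaning and data modeling are left out of the sketch because their cost depends on hours worked rather than audio length.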

When can you start a project?

We can start projects immediately and scale up quickly.

Just book a meeting on our calendar here and we'll handle the rest.

What features can you extract?

We are often asked how many and what types of features we extract to build machine learning models. In short, there are four main types of features that we can extract:

1. Audio features - acoustic features extracted from an audio file (e.g. fundamental frequency)

2. Text features - language features extracted from a transcript (e.g. noun frequency)

3. Mixed features - combinations of audio features and text features, often as a ratio (e.g. speaking rate)

4. Meta features - features predicted by machine learning models trained on any of the features described above (e.g. an age range such as 20s)

You can read more about each of these types of features at this GitHub link.
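To make the idea of a mixed feature concrete, here is a minimal Python sketch that computes speaking rate as a ratio of a text feature (word count from a transcript) to an audio feature (clip duration). The function names and the words-per-second definition are assumptions for illustration only; this is not the lab's actual feature extraction code.

```python
def word_count(transcript: str) -> int:
    """Text feature: number of whitespace-separated words in the transcript."""
    return len(transcript.split())


def speaking_rate(transcript: str, duration_seconds: float) -> float:
    """Mixed feature: words per second (text feature / audio feature).

    Assumes the clip duration has already been measured from the audio file.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count(transcript) / duration_seconds


# Example: a 9-word transcript spoken over a 3-second clip.
rate = speaking_rate("the quick brown fox jumps over the lazy dog", 3.0)
print(rate)  # 3.0 words per second
```

A meta feature would go one step further and feed values like this into a trained model to predict a label such as an age range.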

Can I join your research lab?

Absolutely!

If you are interested in joining our lab, please fill out this application to join our innovation fellows program.

After applying, we will be in touch within 1-3 months.

If you'd like to speak sooner, schedule a demo meeting. We are always willing to speak about how you can contribute to our company.

We've been acquired by Sonde Health!