S02 E09

How Deepgram is making AI interaction colloquial

Lauren Sypniewski

Lauren is the Head of Data Operations at Deepgram, where she’s been making a big impact for the past four years. Her journey has taken her from teaching writing to leading data teams, and along the way, she’s worn many hats, from managing projects to building communication strategies and even transforming hiring processes. Lauren brings a unique mix of creativity and structure to everything she does, and we can’t wait to hear her story and insights!

Episode Summary

The role of data at Deepgram

At Deepgram, data powers everything from speech-to-text and text-to-speech models to advanced audio intelligence features. By owning every aspect of their data—from collection and labeling to model training—they’re building AI systems that listen, understand, and respond naturally, making human-AI interactions seamless and intuitive.

Company culture and Sypniewski's top tips to lead a data team

Sypniewski describes Deepgram’s culture as fast-paced, supportive, and puzzle-driven. The team thrives on curiosity, collaboration, and solving complex challenges. Her advice for leading a data team: stay curious, know when to automate, and focus resources on solving the hardest problems.

The data footprint at Deepgram

Deepgram’s data operations span everything from custom-built labeling tools to pipelines for continuous model improvement. The team has built scalable systems to create training data for diverse use cases, including multilingual models and voice customization. Distributed globally, they handle projects with precision and speed, enabling fast iteration.

The biggest data wins at Deepgram

One of Lauren’s proudest achievements is scaling the launch of new language models. What once took months now takes weeks, allowing Deepgram to expand quickly and solve more complex problems. Their adaptability in tackling nuanced challenges, like tone and context in conversations, sets them apart.

What’s next for Sypniewski and team?

Sypniewski’s team is diving deeper into perfecting real-time AI interactions. From improving conversational flow to addressing cultural nuances in different languages, they aim to make AI exchanges indistinguishable from human ones. The goal is to deliver voice AI experiences that are natural, engaging, and transformative.

Transcript

Tarush Aggarwal (00:00)

Welcome to another episode of People of Data, where we get to highlight the wonderful people of data. These are data leaders from various industries, geographies, and disciplines; we discover how companies are using data, their achievements, and the challenges they face. Today on the show, we're very excited to have Lauren Sypniewski, the head of data operations at Deepgram, where she's been making a big impact for the last four years.

Her journey has taken her from teaching writing to leading data teams. And along the way, she's worn many hats, from managing projects to building communication strategies and even transforming hiring processes. Lauren brings a unique mix of creativity and structure to everything she does. We're super excited to hear her story and insights. Welcome to the show, Lauren.

Lauren Sypniewski (00:46)

Thank you so much. Happy to be here. Yeah, let's do it.

Tarush Aggarwal (00:49)

Are you ready to get into it?

Awesome. So for those of us who don't know, what does Deepgram do? And how does data play a role in helping Deepgram do what it wants to do?

Lauren Sypniewski (01:02)

Yeah, yeah, so Deepgram.

Deepgram is a voice AI company. So we build speech recognition and generative AI models. And in doing so, we're helping developers, we're helping data scientists create that next generation of voice applications. So at the core of everything is voice. So Deepgram is driven by the mission that we are going to increase the world's productivity with the intelligent technology that we build. So we're trying to make interactions with machines as natural as talking to another person.

So if you think about voice, like voice is faster than typing, it's more engaging than reading, and it's how humans naturally communicate, and we believe that voice is going to be the future of AI experiences and the AI interface. So we're focused on leading the pack and leading the way in conversational audio, ensuring that the systems we build can listen, can understand, and can respond just like a human.

So our core products are going to be speech to text, text to speech, audio intelligence features, so things like sentiment analysis and topic detection. And then we have also the full AI voice agent experience. So that's building AI voice assistants in a way that can carry conversations and engage with users that make it feel human and intuitive.

So that's kind of Deepgram in a nutshell.
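For readers who want to see what the speech-to-text product Lauren describes looks like in practice, here is a minimal sketch of calling Deepgram's hosted transcription endpoint over plain HTTP from Python. The endpoint path, the punctuate parameter, and the response shape reflect our reading of Deepgram's public REST API documentation; treat them as assumptions and check the current docs before relying on them.

```python
# Minimal sketch: transcribing an audio file with Deepgram's REST API.
# Assumes the /v1/listen endpoint and a DEEPGRAM_API_KEY env var;
# verify parameters and response fields against Deepgram's docs.
import os
import requests

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"

def transcribe(path: str) -> str:
    with open(path, "rb") as f:
        audio = f.read()
    resp = requests.post(
        DEEPGRAM_URL,
        params={"punctuate": "true"},  # assumed query parameter
        headers={
            "Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}",
            "Content-Type": "audio/wav",
        },
        data=audio,
    )
    resp.raise_for_status()
    body = resp.json()
    # Response shape assumed from Deepgram's documented JSON structure.
    return body["results"]["channels"][0]["alternatives"][0]["transcript"]

if __name__ == "__main__":
    print(transcribe("sample.wav"))
```

The same request pattern, with a different endpoint and payload, would apply to the text-to-speech and audio intelligence products she mentions.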

Tarush Aggarwal (02:30)

That's awesome. You know, there are a bunch of, sort of, hyperscalers, Google, Amazon, that have APIs for text to speech and speech to text. How do you differentiate yourself from, you know, some of the big clouds?

Lauren Sypniewski (02:46)

Yeah, great question. We want to build the entire experience. So when you're thinking about communicating with a human, it's what we're doing right now, right? So you are listening and you're perceiving what I'm saying, you're ingesting it. So that's our speech to text model, ingesting it, and you're understanding it. You're synthesizing, you're making connections, your brain's on fire, right?

And then you're going to respond, you're going to output something, and that's our text to speech. And so you have this very normal conversational cycle that happens with humans that we want to replicate, and we want to be able to do all of that and not necessarily have to have businesses use one part of Google, one part of Amazon, one part of Microsoft, one part of Deepgram, and have to stitch all of these different components together. So that's one part of it.

Tarush Aggarwal (03:36)

and

Lauren Sypniewski (03:39)

The other part of it is that we are a foundational AI company. So we're building all of our own models. We host all of our own infrastructure in our own data centers. We label our own audio. We're very, very vertically integrated. So we do it all, right? And so we're not necessarily just shelling out money to different cloud providers or this tooling or that tooling. So we're able to invest that back into the research that we're doing.

So that's a big differentiator too. And then the third component is, I would say, the ability for us to adapt our systems to our customers' specific pain points. And I can talk more about that, but that's our ability to adapt and customize our speech to text models to the specific use cases, vocabulary, and recording conditions of our customers. So for example, we've created models for NASA for the ground-to-air, ground-to-space communication that's really, really messy and hard to hear. It has a lot of, like, jargon and terminology. And so we created a model that's far more accurate than anything off the shelf from some of the legacy providers.

Tarush Aggarwal (04:51)

That's awesome. I love the NASA use case. Thank you for sharing that. What does the culture look like at Deepgram?

Lauren Sypniewski (04:56)

Yeah, so I'm probably going to get a little cheesy here because I love Deepgram's culture. Deepgram is a place where I personally get to solve a puzzle that no one else is solving like every single day. And if you can give me a puzzle every day, I'm happy as a clam. So not only do I get these really interesting puzzles, but

Tarush Aggarwal (05:01)

Yeah.

Lauren Sypniewski (05:19)

As cliche as it sounds, I get to work with the most amazing group of people, right? So people that are not only insanely smart, but actually very kind and very generous with their time. And so we have such a fast-paced company in a fast-paced world working with fast-paced AI tech that you'd think everyone's just so hurry-scurry, but they're not, like,

Tarush Aggarwal (05:25)

Yeah.

Yeah.

Lauren Sypniewski (05:46)

you can truly see that regardless of official roles, no one says that's not my job. Everybody is there to help each other out. And I think that it really shows how much our team and our culture and our company is not just the sum of its parts, we're more than that because we truly do lift each other up.

Tarush Aggarwal (06:02)

Yeah.

Yeah, I love that. I love speaking with people who have so much passion for what they do. So how did you find...

Lauren Sypniewski (06:13)

Yeah.

Tarush Aggarwal (06:17)

How did you find Deepgram, and how do you evaluate a company, really? You know, before this you were probably looking at a few different companies. How do you really hone in on, like, actually this one has got a culture worth joining? We'd sort of love to hear your thoughts on that.

Lauren Sypniewski (06:32)

Yeah, I think that, like, my story of joining Deepgram would be very different than the story of joining Deepgram today. You know, four years at a startup is a long time for things to become more formal or for different ways to influence the interview process and things like that. I think it's hard for people today to

be interviewing at a company and truly understand what is the culture. I think that when I sit in interviews and I watch my colleagues sit in interviews, I feel like they bring themselves to the interview. So there's the same sort of camaraderie during the interview process that they have in the day to day.

My story was very abnormal. My background, I went to school for writing and I worked in academia. And then after that, I was like, academia is kind of, like, unless I go for a tenure position, it kind of felt like a dead end. And so what do I do next? And so I started working for a federal contractor,

sort of more in the recruiting and marketing and communications. They did a lot of AI applications for DARPA, and so I was becoming more familiar with that space. And then I knew some people that worked at Deepgram because they were in the area, because some of the founders, well, both of the founders came from the University of Michigan, which is, you know, the space where I'm at. And so at that time, they were just looking

for smart people. And I was like, I don't know if I have the right background. My background's so odd. And it's so interesting how my background's more in arts and communication, and how people communicate. But that's really the kind of problem that we're working with at Deepgram, because we're dealing with speech data, which is messy and complex and nuanced. And so truly,

As you said, like I head up the data operations team. What is the data operations team? I head up the team that builds all of our training data. So if you're building a speech to text model, your training data is audio and transcripts. Okay. And if you're building audio intelligence, your training data looks like X and looks like Y. And so really my job is melding the worlds of art and communication with the science of how do you make these labels consumable by machines?

Tarush Aggarwal (08:28)

Yeah.

Yeah.

Yeah.

Lauren Sypniewski (08:50)

And then how do you teach people, so my background in teaching, how do you teach a global team of many different annotators working on many different projects to all do the same thing, so we have that consistency for the R&D teams to use that data.
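To make the "audio plus transcripts" training data concrete, here is a small, hypothetical sketch of what a labeled sample and a style-guide consistency check might look like. The names and rules below are illustrative assumptions, not Deepgram's internal schema or tooling.

```python
# Hypothetical sketch of a labeled training sample and a style-guide check.
# None of these names reflect Deepgram's internal tooling.
from dataclasses import dataclass

@dataclass
class LabeledSample:
    audio_path: str    # path to the audio clip
    transcript: str    # human-produced transcript
    language: str      # e.g. "hi" for Hindi
    annotator_id: str  # who labeled it, for QA and consistency tracking

def violates_style_guide(sample: LabeledSample) -> list[str]:
    """Return style-guide violations, so annotations stay consistent
    across a distributed team of annotators."""
    problems = []
    if not sample.transcript.strip():
        problems.append("empty transcript")
    if "  " in sample.transcript:
        problems.append("double spaces")
    # A real style guide would also cover the tags for stutters,
    # repeated words, reduplication, and so on that come up later
    # in this conversation.
    return problems
```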

Tarush Aggarwal (09:06)

That sounds so interesting. Talking about, you know, running the data operations team and being really the middle layer between these assets and the ultimate products: what does the sort of tech stack your team uses look like, and the current data footprint of the team, from team size to how they function on a day-to-day basis?

Lauren Sypniewski (09:32)

Yeah, and data is such a loaded term. And so when I'm working with data, I'm working again with this training data, the data that we meld, like audio and text, and we send it to the research team, they use it for training. And that's really the backbone of everything that Deepgram

Tarush Aggarwal (09:46)

Yeah.

Lauren Sypniewski (09:52)

does and Deepgram builds. And then on top of that, we own the entire process from the data centers to the foundational models, the infrastructure. The team at Deepgram, we had an office pre-COVID in San Francisco and then COVID hit and that got shut down. And then we had so much hiring that happened without a centralized office. So the team became very distributed.

Tarush Aggarwal (09:54)

Yeah.

Lauren Sypniewski (10:19)

So I'm gonna probably mess this up, but I think we're in something like 23 different states as well as Canada and Europe, and then we own a subsidiary that's based in the Philippines. So it's truly, truly distributed. On my team, we also work with a global team of contractors and they are working in

Tarush Aggarwal (10:31)

Yeah. Yeah.

Lauren Sypniewski (10:41)

various different languages for different language products, different product lines, so whether that's speech to text or different features within our speech to text or text to speech, whatever it may be. And then on the R&D side, we have anywhere from data scientists, research scientists, data engineers. We have people that come with a background in linguistics, because, again, we're dealing with speech data, so that's very much a part of what we do. And then,

We have some folks that have come with backgrounds in working with audio and then some that have no experience working with audio before. So I think that's good in terms of creating like a really well-rounded team. As far as like the tech stack goes, we build a lot in-house. So we have custom in-house labeling tools that we've built and those labeling tools

Tarush Aggarwal (11:20)

Yeah.

Lauren Sypniewski (11:34)

will be things like labeling transcription for audio projects, and we also have tools for audio generation. When we're building text to speech products, we have to work with voice actors to curate data in a very specific way to build that voice into the product. And so we have those audio generation platforms as well. We also have...

Tarush Aggarwal (11:58)

Yeah.

Lauren Sypniewski (12:01)

learning pipelines that we've set up so we're able to continuously improve our models and target specific audio or specific unlabeled data to curate and pipe back into our system. And then there's the other side of data, like the business side of data and the customer usage and customer insights, and that is a

whole operation that sits with, like, our... we have someone working in revenue operations and we have someone working on just being data-centric around customer data. So that side of things I don't touch so much, except to view the beautiful things that they have created, but I can't speak to it so much personally.

Tarush Aggarwal (12:44)

Yeah.

Yeah, no, that makes a lot of sense. In your four years over here, and it seems like, you know, you're one of the OG team, what's been one of your biggest achievements which you're really proud of?

Lauren Sypniewski (12:58)

man, sometimes I think about this and I'm like, I don't even remember yesterday, let alone, like, four years ago. I think it is how well my team has understood what needs a fine touch and what can be automated. And so our ability to scale

Tarush Aggarwal (13:02)

Yeah. Yeah.

Lauren Sypniewski (13:22)

Okay, so when I joined Deepgram, one of the first things I did was I established our Hindi team. So we had not built a Hindi speech to text model. We needed to develop a style guide for our transcriptionists. We needed to understand what the market looked like, how much to pay people, how we would get them onboarded, how to train them. We needed folks that were multilingual in

Tarush Aggarwal (13:30)

Yeah.

Lauren Sypniewski (13:49)

conversational English and understood the products we were building so they could pass that on to the transcriptionists. Like, all of these different things, and it took us months to be able to do anything: what data are we working on, what data is going to best influence the model. We hadn't had a product built, and so we couldn't dogfood it either. So the transcriptionists were working from nothing, which they don't like to do, right? They would much prefer to have something pre-labeled by some sort of machine learning model, right? And so that took months.

Tarush Aggarwal (13:58)

Yeah.

Lauren Sypniewski (14:23)

and there were a lot of, like, headaches around, wait, that doesn't make sense, can you explain it? Like, I only speak English, and that is not a good thing at Deepgram; I need to learn more languages. But I don't understand, like, reduplication, for example. Should we transcribe reduplication differently than repeated words? Differently than a stutter?

Tarush Aggarwal (14:33)

Yeah.

Lauren Sypniewski (14:47)

Well, yeah, they mean something different. So we have to tag it in a way that it looks and feels different, because it is a different situation. Okay. Okay. And trying to understand all of these different problems. Today, we stand up new languages within, like, a week or two, not months. And so we've been able to figure out what sort of stuff is golden rules: they never change, they're language agnostic. And then

the network that we've created over the years to be able to just pull people from the crowdsourcing that we've done, from these relationships that we've created. We've worked with contractors for over five years. So even though we say contractors and freelancers, they've been with us since the very beginning. And we've created relationships with them so that we're able to have this very wide network. So I think that's probably like one thing where I'm just like, wow.

Tarush Aggarwal (15:27)

Yeah.

Yeah.

That's all.

Lauren Sypniewski (15:42)

This is something we would never have been able to do four years ago that is just easy peasy, piece of cake today, in really exciting ways. Because then we can focus on the really hard problems that in four years will be easy, right?

Tarush Aggarwal (15:49)

Yeah.

That's such a great example. It reminds me of the quote that people overestimate what they can do in one year and vastly underestimate what they can do in longer periods of time. And this is a beautiful example of that. On the flip side, what's one of the challenges which you've had, which you might currently be navigating, or something which you have navigated in the past?

Lauren Sypniewski (16:18)

Yeah, I would say, thinking about, like, what we're tackling right now, I think one of our challenges is enhancing our real-time AI interactions. So I'd mentioned the voice agents that we build, and they already lead the industry in terms of managing interruptions and back-channeling and these dynamic conversations, barge-in and things like that. They're great.

They could be better, though, right? So there's always room for that improvement. I don't know if you've worked with or tried to experiment with voice agents before, but it often happens when you, like, interrupt it: it'll cut off in a very awkward way, and then there's silence, and it's just like, wait, is the human gonna talk?

Tarush Aggarwal (17:01)

Yeah.

Yeah.

Lauren Sypniewski (17:06)

Is the AI gonna talk? And it's just like, whose turn is it? Is it my turn? Is it your turn? And then sometimes the human starts talking and the AI starts talking, and then the AI shuts up because it hears the human, and it's just really messy. And so we've certainly made progress, and I think the experience is phenomenal, but there's still a lot of work to do to truly make that exchange indistinguishable from, like, a human-on-human interaction.

Tarush Aggarwal (17:16)

Yeah. Yeah.

Yeah.

Lauren Sypniewski (17:36)

So I think that involves tackling very nuanced challenges, so things like tone and phrasing and context. Because when I say, or you just said, yeah, I know that you're not trying to interrupt me. I know that I can keep talking. You're actually encouraging me to talk, but that's not built into the systems of today, and that needs to be. Well...

Tarush Aggarwal (17:57)

Wow, I didn't even

think about that.

Lauren Sypniewski (17:59)

Yeah, that would be considered an interruption. The AI would shut up. So what sort of things do you allow to flow through and not be considered a barge-in? And it becomes very nuanced. And then you also want to make sure that your latency remains low and the performance is still seamless while also incorporating these complexities.

On top of that, which I think is really interesting, is then scaling this capability across different languages, because conversational norms do vary widely culturally. Like, there are some languages where even saying mm-hmm or okay is considered rude; you should be completely silent while the other person talks. And there are some where it's rude if you don't say mm-hmm, because then it sounds like you're not listening. And so what's natural for English

Tarush Aggarwal (18:46)

Yeah. Yeah.

Lauren Sypniewski (18:51)

might feel completely off in something like Japanese or Korean or Spanish. And so I think the aim is not to be language agnostic. I think that is completely wrong. I think it's to be language fluent. So adapting these different language and cultural norms into the model so it just understands that context. I think this is very challenging, something that we're working on. Can't tell you what all the problems are yet because we're still kind of like working through it.

Tarush Aggarwal (19:05)

Yeah.

Yeah.

Lauren Sypniewski (19:20)

But it's very exciting.
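As a toy illustration of the turn-taking problem discussed above, the sketch below classifies user speech during an agent's turn as either a back-channel (let the agent keep talking) or a true barge-in (stop the agent). It is a deliberately simplified, hypothetical heuristic, not Deepgram's voice-agent logic, which, as Lauren notes, must also weigh tone, context, latency, and per-language norms.

```python
# Toy heuristic for barge-in vs. back-channel during agent speech.
# Purely illustrative; real voice agents use acoustic and semantic
# signals, per-language norms, and low-latency streaming models.
BACKCHANNELS = {"mm-hmm", "uh-huh", "yeah", "okay", "right"}

def should_stop_speaking(user_text: str, user_speech_ms: int) -> bool:
    """Return True if the agent should treat user speech as a barge-in."""
    words = user_text.lower().strip(".,!? ").split()
    # Short acknowledgements are encouragement, not interruption...
    if len(words) <= 2 and all(w in BACKCHANNELS for w in words):
        return False
    # ...but sustained or substantive speech is a real barge-in.
    return user_speech_ms > 300 or len(words) > 2

assert should_stop_speaking("mm-hmm", 200) is False    # back-channel
assert should_stop_speaking("wait, stop", 500) is True  # barge-in
```

Even this toy version shows why Lauren's "language fluent, not language agnostic" point matters: the same "okay" that counts as encouragement in English may need to be treated as an interruption, or as rudeness, elsewhere, so even the BACKCHANNELS set would have to be per-language.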

Tarush Aggarwal (19:21)

I feel like, when I was getting ready to speak to you, I was thinking, what's an example of this which I interact with daily? And I try to not use that much technology. But one of the things I have is the HomePod at home. And speaking to Siri, I fully recognize the awkward silent-speaking thing. So that helped me really conceptualize what you're talking about.

Lauren Sypniewski (19:42)

Mm-hmm.

Tarush Aggarwal (19:48)

You know, it sounds like you're working on some incredible problems, and we're very excited to see where you end up with this. What advice would you give to someone who wants to come work at Deepgram? I know it's not four years ago, but if someone's looking at Deepgram today, what's one piece of advice you would give them?

Lauren Sypniewski (20:06)

That is a great question.

I don't know if this is like...

What's the word?

I don't know if this is, like, untraditional advice, but what I would do is I would play with all of Deepgram's products, figure out what works, what doesn't, and come to the table with a proposal: hey, I did all this testing, I found this really interesting, and this is something that I see. This is the value I could bring. This is how I've adapted this in the past. And so I think

Tarush Aggarwal (20:17)

Mmm.

Lauren Sypniewski (20:37)

One thing that's incredibly consistent across all Deepgrammers is their, like, intense curiosity. And so if you're able to demonstrate, I was really curious about Deepgram, so curious that I ended up, you know, rabbit-holing all the way down into these different Discord conversations and testing over here, and I found this really interesting. That brings

Tarush Aggarwal (20:45)

Yeah.

Lauren Sypniewski (21:04)

a lot to the table on top of the person's background and skills and formal education.

Tarush Aggarwal (21:10)

I love that. Lauren, thank you so much for sharing your wisdom and thank you for being a people of data.

Lauren Sypniewski (21:15)

Yeah, thank you for having me.
