Learn more about opportunities in Alkeon’s VC Portfolio

Audio Engineer

Deepgram

Software Engineering

United States · Remote

Posted on Feb 10, 2026

Location

USA | Remote

Employment Type

Full time

Location Type

Remote

Department

Data Operations

Compensation

Base Salary Range $120K – $175K • Offers Equity • Offers Bonus • 10% Annual Bonus

This range is determined by work location and additional factors, including job-related skills and experience. In some instances, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

Please note that the compensation details listed on US role postings reflect the base salary only and do not include bonus, equity, or benefits.

Company Overview

Deepgram is the leading platform underpinning the emerging trillion-dollar Voice AI economy, providing real-time APIs for speech-to-text (STT), text-to-speech (TTS), and production-grade voice agents at scale. More than 200,000 developers and 1,300+ organizations build voice offerings that are ‘Powered by Deepgram’, including Twilio, Cloudflare, Sierra, Decagon, Vapi, Daily, Cresta, Granola, and Jack in the Box. Deepgram’s voice-native foundation models are accessed through cloud APIs or as self-hosted and on-premises software, with unmatched accuracy, low latency, and cost efficiency. Backed by a recent Series C led by leading global investors and strategic partners, Deepgram has processed over 50,000 years of audio and transcribed more than 1 trillion words. There is no organization in the world that understands voice better than Deepgram.

Company Operating Rhythm

At Deepgram, we expect an AI-first mindset—AI use and comfort aren’t optional, they’re core to how we operate, innovate, and measure performance.

Every team member at Deepgram is expected to actively use and experiment with advanced AI tools, and even to build their own into their everyday work. We measure how effectively AI is applied to deliver results, and consistent, creative use of the latest AI capabilities is key to success here. Candidates should be comfortable adopting new models and modes quickly, integrating AI into their workflows, and continuously pushing the boundaries of what these technologies can do.

Additionally, we move at the pace of AI. Change is rapid, and you can expect your day-to-day work to evolve just as quickly. This may not be the right role if you’re not excited to experiment, adapt, think on your feet, and learn constantly, or if you’re seeking something highly prescriptive with a traditional 9-to-5.

Opportunity

Deepgram is looking for an Audio Engineer to own and scale audio quality across our voice AI products. This role sits at the intersection of professional audio engineering and machine-learning infrastructure. You will be responsible for ensuring our voices don’t just sound “correct,” but sound genuinely great to human listeners, across thousands of voices, recording conditions, and use cases.

This is a foundational role. You’ll help define how audio engineering fits into our end-to-end pipeline: from on-site voice actor recording, to speaker-specific cleanup for fine-tuning, to synthetic data generation and large-scale TTS training. You’ll take traditionally manual, GUI-driven audio workflows and turn them into scalable, programmatic systems that can operate at Deepgram’s scale.

What You’ll Do

Identify and correct audio artifacts, loudness inconsistencies, frequency imbalances, and sibilance issues across large-scale voice datasets.
Design and implement scalable audio processing pipelines for voice data.
Define and implement scalable audio processing pipelines (EQ, compression, de-essing, dynamic range optimization) and normalization strategies, both across voices and within a single voice’s recordings.
Optimize audio quality across real and synthetic voices to ensure a consistent product experience across multiple use cases.
Lead audio quality decisions during on-site voice actor recording sessions, including microphone selection, placement, gain staging, and environment setup.
Define, document, and enforce audio quality standards for external vendors, including recording setup requirements, signal characteristics, and post-processing expectations, ensuring vendor-produced audio meets Deepgram’s training and product needs even when recordings are not done on-site.
Convert expert-driven, manual audio workflows into automated, repeatable, code-based systems.
Collaborate closely with research to improve training data quality, especially TTS speaker-specific fine-tuning.
Contribute to synthetic data pipelines by defining and validating acoustic characteristics, guiding how different “sound profiles” should be produced and evaluated.

You’ll Love This Role If You

Instinctively hear volume, EQ, and dynamic differences that others miss.
Obsess over why one voice sounds more pleasing than another—even when both are “natural.”
Are equally comfortable tweaking a signal chain in Logic Pro and implementing it in FFmpeg or Python.
Enjoy scaling handcrafted quality decisions to thousands of recordings.
Lose sleep over missed opportunities to improve audio quality or training data diversity.
Think ahead about how today’s recording choices enable tomorrow’s models.

It’s Important To Us That You Have

Professional audio engineering experience (studio, podcast, radio, live sound, or equivalent).
Deep understanding of EQ, compression, limiting, de-essing, and mastering techniques.
Strong familiarity with professional audio tools (Adobe Audition, Logic Pro, Pro Tools, or similar).
Hands-on experience with FFmpeg and command-line audio processing tools.
Solid understanding of microphone characteristics, placement, and acoustic principles.
A highly trained ear for subtle audio quality differences across voices and environments.

It Would Be Great if You Had

Programming ability (Python preferred) to automate and scale audio workflows.
Experience building custom audio plugins or DSP tools.
Open-source contributions to audio or signal-processing projects.
Background in batch or programmatic audio processing at scale.
Familiarity with ML audio preprocessing for ASR or TTS.
Experience managing large-scale audio datasets.
Comfort working in creative/audio communities and technical open-source ecosystems.

Benefits & Perks

Holistic health

Medical, dental, vision benefits
Annual wellness stipend
Mental health support
Life, short-term disability (STD), and long-term disability (LTD) income insurance plans

Work/life blend

Unlimited PTO
Generous paid parental leave
Flexible schedule
12 Paid US company holidays
Quarterly personal productivity stipend
One-time stipend for home office upgrades
401(k) plan with company match
Tax Savings Programs

Continuous learning

Learning / Education stipend
Participation in talks and conferences
Employee Resource Groups
AI enablement workshops / sessions

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $215M in total funding. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.

We are happy to provide accommodations for applicants who need them.
