Audio QA Lead
Support the development of high-quality training datasets for next-generation voice AI models.
Part-time contract (can turn into full-time) | Remote | Speech Data, Quality & Annotation
About the role
We are hiring an Audio QA Lead to support the development of high-quality training datasets for next-generation voice AI models. In this role, you will work hands-on to improve the quality, consistency, and usability of speech datasets across applications such as text-to-speech, transcription, speech-to-speech, ASR, and conversational voice systems. Your work will directly influence how data is collected, reviewed, and delivered for real-world model training.
You will work across three core areas: defining and applying audio quality standards, recording high-quality speech on demand, and performing annotation and QA across speech datasets. This is not a generic audio production role. The work focuses on making audio usable for model training and requires a strong understanding of how data quality impacts model performance.
What you'll do
- Develop, refine, and apply audio quality guidelines for speech and voice datasets.
- Review audio files against technical, linguistic, and task-specific standards, making clear approval, rejection, or revision decisions.
- Identify audio and annotation issues such as background noise, clipping, distortion, plosives, echo, low signal, segmentation errors, transcript mismatches, and speaker-label inconsistencies.
- Perform annotation and QA tasks, including transcription, timestamp validation, VAD/segmentation, diarization, pronunciation checks, and metadata review.
- Record speech based on provided scripts and performance guidelines, delivering natural, high-quality, specification-compliant audio.
- Document edge cases, update review rubrics, and improve internal SOPs and quality standards.
- Collaborate with research, ML, and operations teams to translate model requirements into data specifications and evaluation criteria.
- Ensure consistency and integrity across audio files, transcripts, annotations, and associated metadata.
Who we're looking for
The ideal candidate has direct experience working with audio AI datasets and understands what makes speech data effective for model training. You have a strong ear for audio quality, are comfortable applying annotation standards, and can consistently produce and evaluate high-quality recordings.
- Direct experience working with audio AI training datasets or evaluation workflows.
- Hands-on experience with TTS, ASR, transcription, speech-to-speech, or related voice AI systems.
- Experience developing or applying audio quality standards in production environments.
- Experience with speech annotation tasks such as transcription, timestamp QA, VAD/segmentation, and diarization.
- Strong auditory judgment with the ability to consistently identify subtle audio quality issues.
- Ability to produce high-quality recordings in a controlled, quiet environment using professional or near-professional equipment.
- Strong written communication skills with the ability to provide clear, actionable feedback.
- High attention to detail and sound judgment when evaluating edge cases.
- Comfort working with structured data formats such as spreadsheets, CSV, or JSON.
Bonus qualifications
- Experience with audio tools such as Audacity, Praat, or similar.
- Basic scripting skills in Python, Bash, or SQL for QA or dataset analysis.
- Background in linguistics, phonetics, speech research, or voiceover work.
- Experience evaluating both real and synthetic audio.
- Multilingual experience or familiarity with accents and dialect variation.
- Familiarity with compliant handling of consented and licensed voice data.
Licensed Audio Data for Voice AI Models. Collect, license, annotate, and evaluate high-quality conversational audio datasets.