📊 How to create a language evaluation for ASR?
Set up a simple workflow to compare ASR models and evaluate the right choice for your product
Last updated
@Dara.network / Gooey.AI / support@gooey.ai
When testing ASR models, your main goal is to scope out which model is best suited for your project's needs.
There are several components to test:
understanding model proficiency in global languages and local accents
assessing the accuracy of translations
checking for a low word error rate (WER)
measuring latency
comparing price
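Word error rate, one of the components above, is typically computed as the word-level edit distance between the model's transcript and the human reference, divided by the number of reference words. A minimal sketch in Python (not Gooey's internal implementation):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / len(r)

# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words
print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2/6 ≈ 0.333
```

A WER of 0.0 means a perfect transcript; lower is better when comparing models.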
Run several models in one click
Choose any of the API Response Outputs to populate your test
Built-in evaluation tool for quick analysis
Use CSV or Google Sheets as input
Get output in CSV for further data analysis
Collect all your audio voice samples into a Google Drive folder (make sure they are in .wav or .mp3 format)
In a new Google spreadsheet or CSV file, copy the links of all your audio files
Add the human-created transcription for each sample
Add the English translation for each sample
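The steps above can be scripted with Python's standard csv module. The column names and Drive links below are hypothetical placeholders; substitute your own audio URLs, transcriptions, and translations:

```python
import csv

# Hypothetical rows: replace the URLs, transcripts, and translations with
# your own Google Drive links and human-created references.
rows = [
    {
        "audios": "https://drive.google.com/uc?id=SAMPLE_FILE_1",  # placeholder link
        "transcript": "नमस्ते, आप कैसे हैं?",
        "translation": "Hello, how are you?",
    },
    {
        "audios": "https://drive.google.com/uc?id=SAMPLE_FILE_2",  # placeholder link
        "transcript": "धन्यवाद",
        "translation": "Thank you",
    },
]

# One audio link per row, with its transcription and translation alongside it
with open("eval_input.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["audios", "transcript", "translation"])
    writer.writeheader()
    writer.writerows(rows)
```

The resulting eval_input.csv can be uploaded directly in the next step, or pasted into a Google Sheet.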
Head to our bulk and eval workflow.
LINK TO HINDI ASR EVALUATION EXAMPLE
In the example, we have already pre-filled the various models that can be tested. You can choose the ones you want to run by selecting them in the dropdown.
Upload your CSV/Google Sheet from Step 0. In this example, we have used a Google Sheet of 10 audio samples with transcripts and translations. A preview of your sheet will appear once it is correctly uploaded.
Select the input column from the dropdown box; the outputs will appear as additional columns. In this example, it is the "audios" column.
In the "Evaluation Workflows" section select the "Speech Recognition Model Evaluator".
Once you hit submit, the selected ASR model workflows (see Step 1) will run for each audio file in the sheet (see Step 2). An output CSV will be generated on the right-hand side of the page.
After the runs are complete, the selected Evaluator (see Step 4) will compare the ASR model outputs to the human-generated translations. It will assess and rate how accurately each model has translated each audio sample.
A bar graph with the performance will appear once the entire evaluation is complete.
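For analysis beyond the built-in bar graph, the output CSV can be summarized with a few lines of Python. The column names below ("model", "score") are assumptions for illustration; check the actual headers in your downloaded file:

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of an output CSV; real column names depend on your run.
sample_output = """audios,model,score
a1.wav,whisper_large_v2,0.92
a1.wav,google_asr,0.85
a2.wav,whisper_large_v2,0.88
a2.wav,google_asr,0.90
"""

# Group evaluator scores by model and average them
scores = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample_output)):
    scores[row["model"]].append(float(row["score"]))

averages = {model: sum(vals) / len(vals) for model, vals in scores.items()}
print(averages)  # {'whisper_large_v2': 0.9, 'google_asr': 0.875}
```

To run this on your own results, replace the inline string with `open("your_output.csv")` and adjust the column names to match.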
A: Arrange the audio sample links in the first column; for each audio link, add its transcription and translation in the same row.
A: Any recording under 40 minutes will work successfully. If you are a Copilot creator, we recommend limiting audio samples to 2-3 minutes.
A: Gooey's ASR workflow will accept .wav and .mp3 audio formats.
A: Our partners have tested up to 1,000 audio files in one run!
A: Make sure your transcriptions and translations are as accurate as possible.