πŸ“Š How to create a language evaluation for ASR?

Set up a simple workflow to compare ASR models and pick the right one for your product

Why do you need a Bulk and Evaluation Workflow?

When testing ASR models, your main goal is to scope out which model is best suited for your project needs.

There are several components to test:

  • understanding model proficiency in global languages and local accents

  • assessing the accuracy of translations

  • measuring Word Error Rate (WER; see the sketch after this list)

  • comparing latency

  • comparing price
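
Word Error Rate (WER) is the standard accuracy metric for ASR: the number of word substitutions, deletions, and insertions needed to turn a model's transcript into the human reference, divided by the number of words in the reference. Here is a minimal sketch of computing it offline with the open-source jiwer library (not part of the Gooey workflow itself; shown only to make the metric concrete):

```python
# pip install jiwer
from jiwer import wer

# Hypothetical example: a human reference transcript vs. an ASR output
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"

# WER = (substitutions + deletions + insertions) / reference word count
error_rate = wer(reference, hypothesis)
print(f"WER: {error_rate:.2%}")  # 2 substitutions / 9 words ~= 22.22%
```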

Features of Bulk Runner and Evaluation Workflow

  1. Run several models in one click

  2. Choose any of the API Response Outputs to populate your test

  3. Built-in evaluation tool for quick analysis

  4. Use CSV or Google Sheets as input

  5. Get output in CSV for further data analysis

Getting Started

Step 0 - Prepare your data

  1. Collect all your audio voice samples into a Google Drive folder (make sure they are in .wav or .mp3 format)

  2. In a new Google Sheet or CSV file, add the links to all your audio files

  3. Add the human-created transcription for each sample

  4. Add the English translation for each sample
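
As a sketch, the steps above might produce a CSV like the one written below. The links and the "transcript"/"translation" column names are hypothetical; "audios" matches the input column used later in Step 3. The same three-column layout works as a Google Sheet.

```python
# pip install pandas
import pandas as pd

# Hypothetical rows: one Google Drive audio link per row, with its
# human-created transcription and English translation alongside.
rows = [
    {
        "audios": "https://drive.google.com/uc?id=SAMPLE_FILE_ID_1",
        "transcript": "नमस्ते, आप कैसे हैं?",
        "translation": "Hello, how are you?",
    },
    {
        "audios": "https://drive.google.com/uc?id=SAMPLE_FILE_ID_2",
        "transcript": "आज मौसम बहुत अच्छा है",
        "translation": "The weather is very nice today",
    },
]

# Write the CSV that will be uploaded in Step 2.
pd.DataFrame(rows).to_csv("asr_eval_input.csv", index=False)
```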

Step 1 - Select the ASR Models

Head to our bulk and eval workflow.

LINK TO HINDI ASR EVALUATION EXAMPLE

In the example, we have already pre-filled the various models that can be tested. You can choose the ones you want to run by selecting them in the dropdown.

Step 2 - Add your CSV/Google Sheets

Upload your CSV/Google Sheet from Step 0. In this example, we have used a Google Sheet of 10 audio samples with transcripts and translations. A preview of your sheet will appear once it is correctly uploaded.

Step 3 - Select the input column

Select the input column from the dropdown box; the model outputs will appear as new columns. In this example, the input column is "audios".

Step 4 - Select the pre-built evaluator

In the "Evaluation Workflows" section select the "Speech Recognition Model Evaluator".

Step 5 - Hit Submit

Once you hit submit, the selected ASR model workflows (see Step 1) will run for each audio file in the sheet (see Step 2). An output CSV will be generated on the right-hand side of the page.

After the runs are complete, the selected Evaluator (see Step 4) will compare the ASR model outputs to the human-created transcriptions and translations, assessing and rating how accurately each model handled the audio sample.

A bar graph of each model's performance will appear once the entire evaluation is complete.
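
If you want to dig deeper than the built-in bar graph, the output CSV can be analyzed offline. A minimal sketch, assuming one column per model; the filename and column names below are hypothetical, so substitute the ones from your generated CSV:

```python
# pip install pandas jiwer
import pandas as pd
from jiwer import wer

df = pd.read_csv("bulk_runner_output.csv")  # hypothetical filename

# Hypothetical columns: "transcript" is the human reference; the rest
# hold each ASR model's output for the same row.
model_columns = ["model_a_output", "model_b_output"]

for model in model_columns:
    # Mean WER across all rows for this model (lower is better).
    scores = [
        wer(ref, hyp)
        for ref, hyp in zip(df["transcript"], df[model])
        if isinstance(ref, str) and isinstance(hyp, str)  # skip blank rows
    ]
    print(f"{model}: mean WER = {sum(scores) / len(scores):.2%}")
```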

FAQs

Q: How should I prepare the transcription and translation data?

A: Put the audio sample links in the first column; for each audio link, add its transcription and translation in the same row.

Q: What is the ideal length of the recording?

A: Any recording under 40 minutes will work. If you are a copilot creator, we recommend limiting audio samples to 2-3 minutes.
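
If you want to verify clip lengths before uploading, here is a quick sketch using only the Python standard library (the local folder name is hypothetical, and this covers .wav files only):

```python
import wave
from pathlib import Path

# Hypothetical local folder mirroring your Google Drive samples
for path in sorted(Path("audio_samples").glob("*.wav")):
    with wave.open(str(path), "rb") as wav:
        seconds = wav.getnframes() / wav.getframerate()
    flag = "" if seconds <= 40 * 60 else "  <-- over the 40-minute limit"
    print(f"{path.name}: {seconds / 60:.1f} min{flag}")
```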

Q: What audio format should we use?

A: Gooey's ASR workflow will accept .wav and .mp3 audio formats.

Q: How many files can I test?

A: Our partners have tested up to 1000 audio files in one run!

Q: How can we increase the quality of the test?

A: Make sure your transcriptions and translations are as accurate as possible.
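
One subtle source of noise worth knowing about: punctuation and casing differences can inflate WER even when every word matches. If you compute WER yourself, a light normalization pass helps (a sketch; whether Gooey's built-in evaluator normalizes text is not covered here):

```python
import string
from jiwer import wer

def normalize(text: str) -> str:
    # Lowercase, strip punctuation, and collapse whitespace so WER
    # reflects real word errors rather than formatting differences.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

# Identical words, different formatting -> WER of 0.0 after normalization
print(wer(normalize("Hello, how are you?"), normalize("hello how are you")))
```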
