# Create language evaluation for Speech Recognition

### Why do you need a Bulk and Evaluation Workflow?

When testing automatic speech recognition (ASR) models, your main goal is to find out which model best suits your project's needs.

There are several components to test:

* model proficiency in global languages and local accents
* accuracy of translations
* Word Error Rate (WER), where lower is better
* latency
* price
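For readers unfamiliar with Word Error Rate, here is a minimal sketch of how it is commonly computed: the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The function name `word_error_rate` is illustrative, and this is not necessarily how the built-in evaluator scores results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # reference word missing from hypothesis
                    curr[j - 1] + 1,     # extra word in hypothesis
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One of six reference words is missing from the hypothesis.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

In practice, transcripts are usually normalised (lowercased, punctuation removed) before comparison, so that formatting differences are not counted as errors.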

### Features of Bulk Runner and Evaluation Workflow

1. Run several models in one click
2. Choose any of the API Response Outputs to populate your test
3. Built-in evaluation tool for quick analysis
4. Use CSV or Google Sheets as input
5. Get output in CSV for further data analysis

#### Also see:

<table data-view="cards" data-full-width="true"><thead><tr><th></th><th data-hidden data-card-target data-type="content-ref"></th><th data-hidden data-card-cover data-type="files"></th></tr></thead><tbody><tr><td>🏎️ Global Language Understanding for AIs</td><td><a href="https://app.gitbook.com/s/leYcqBx5FRZcVr3wI4f4/global-language-understanding-for-ais">Global Language Understanding for AIs</a></td><td><a href="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2FEsQiSroaBOslOnzBxvb0%2Fgooey.ai%20-%20cute%20robots%20racing%20vintage%20painted%20magazine%20advertisement%20muted%20colorful%20illustration.png?alt=media&#x26;token=798b495e-f992-4c28-8bd6-a216e84d38f3">gooey.ai - cute robots racing vintage painted magazine advertisement muted colorful illustration.png</a></td></tr><tr><td>🗣️ Check out our Hindi ASR Evaluation</td><td><a href="https://gooey.ai/bulk/compare-hindi-speech-recognition-hkgs8120p11t/">https://gooey.ai/bulk/compare-hindi-speech-recognition-hkgs8120p11t/</a></td><td><a href="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2F4EXnTSflMtbydX8zsc5D%2Fgooey.ai%20-%20cute%20vintage%20poster%20style%20illustration%20of%20a%20small%20group%20of%20indian%20kids%20learning%20hindi%20(1).png?alt=media&#x26;token=030fe6e9-1bf9-4c88-af80-1fd77f79462e">gooey.ai - cute vintage poster style illustration of a small group of indian kids learning hindi (1).png</a></td></tr></tbody></table>

## Getting Started

### Step 0 - Prepare your data

1. Collect all your audio samples in a Google Drive folder (make sure they are in .wav or .mp3 format)
2. In a new Google Sheet or CSV file, paste the link to each audio file
3. Add the human-created transcription for each sample
4. Add the English translation for each sample

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2Fwkygeel1sUYQxl3F1llR%2FScreenshot%202024-05-07%20at%2012.27.38%20AM.png?alt=media&#x26;token=67fc2a7c-2055-4551-8d24-afe8fd4974bc" alt=""><figcaption></figcaption></figure>
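If you prefer to generate the CSV with a script, the steps above can be sketched as follows. The "audios" column name matches the example sheet used in this guide; the `transcription` and `translation` column names, the sample link, and the sample text are placeholders chosen for illustration.

```python
import csv
import io

# "audios" matches the example sheet in this guide.
# The other two column names are assumptions made for this sketch.
FIELDNAMES = ["audios", "transcription", "translation"]


def build_input_csv(samples):
    """Return CSV text with a header row and one row per audio sample."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerow(sample)
    return buffer.getvalue()


samples = [
    {
        "audios": "https://example.com/audio/sample-01.wav",  # placeholder link
        "transcription": "नमस्ते, आप कैसे हैं?",  # human-created transcription
        "translation": "Hello, how are you?",  # English translation
    },
]

csv_text = build_input_csv(samples)

# To save it for upload:
# with open("asr_samples.csv", "w", encoding="utf-8", newline="") as f:
#     f.write(csv_text)
```

Using the `csv` module rather than joining strings by hand means transcriptions that contain commas or quotes are escaped correctly.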

### Step 1 - Select the ASR Models

Head to our Bulk and Evaluation workflow.

[Hindi ASR evaluation example](https://gooey.ai/bulk/compare-hindi-speech-recognition-hkgs8120p11t/)

The example is pre-filled with the models that can be tested. Choose the ones you want to run by selecting them in the dropdown.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2Fs0kGs24lBSrF5ZDdWT5A%2FScreenshot%202024-05-07%20at%2012.31.01%20AM.png?alt=media&#x26;token=ee780ca0-a181-494e-a355-f44ff38a64b6" alt=""><figcaption><p>Select the models you want to evaluate your audio samples on</p></figcaption></figure>

### Step 2 - Add your CSV/Google Sheets

Upload your CSV/Google Sheet from [Step 0](#step-0-prepare-your-data). In this example, we have used a Google Sheet of 10 audio samples with transcripts and translations. A preview of your sheet will appear once it is correctly uploaded.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2F33VS6GGJhWxIZBzpKF4d%2FScreenshot%202024-05-07%20at%2012.34.15%20AM.png?alt=media&#x26;token=d2284655-7fbf-4954-a890-6dedabd6f4a3" alt=""><figcaption></figcaption></figure>

### Step 3 - Select the input column

From the dropdown, select the input column that contains your audio links. In this example, it is the "audios" column. The model outputs will appear as new columns.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2FdyuGqCl7lK3nY0JfUb2C%2FScreenshot%202024-05-07%20at%2012.36.40%20AM.png?alt=media&#x26;token=3cd6eebd-a4e3-43d3-85ec-27f850e838de" alt=""><figcaption></figcaption></figure>

### Step 4 - Select the pre-built evaluator

In the "Evaluation Workflows" section select the "Speech Recognition Model Evaluator".

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2FhaGKh94feqnWvy2qxuxq%2FScreenshot%202024-05-07%20at%2012.38.13%20AM.png?alt=media&#x26;token=5b84b3bf-d561-4ab2-beca-da19d401efd3" alt=""><figcaption></figcaption></figure>

### Step 5 - Hit Submit

Once you hit Submit, the selected ASR model workflows (see [Step 1](#step-1-select-the-asr-models)) will run for each audio file in the sheet (see [Step 2](#step-2-add-your-csv-google-sheets)). An output CSV will be generated on the right-hand side of the page.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2FsaNCCNOYJ9kukpZ3PYT3%2FScreenshot%202024-05-07%20at%2012.47.58%20AM.png?alt=media&#x26;token=c9b24086-d90e-4eb1-b7d9-367bafec741e" alt=""><figcaption><p>The outputs appear as a table on the right of the page, in new columns after your original columns</p></figcaption></figure>

After the runs are complete, the selected Evaluator (see [Step 4](#step-4-select-the-pre-built-evaluator)) will compare the ASR model outputs to the human-generated translations. It will assess and rate how accurately each model has translated the audio sample.

A bar graph of the models' performance will appear once the entire evaluation is complete.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2Fr2OwaUzlty2QzT7OnYD3%2FScreenshot%202024-05-07%20at%2012.50.05%20AM.png?alt=media&#x26;token=bf6f1a8c-afe3-4043-bd6a-eef742e24c20" alt=""><figcaption><p>After the evaluation is complete, a table and a bar graph will show the evaluation scores</p></figcaption></figure>
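Once you have downloaded the output CSV, you may want to summarise it yourself. The sketch below averages numeric score columns per model. The column names `model_a_score` and `model_b_score` and the sample values are made up for illustration; check the header row of your own output file and pass in the names you find there.

```python
import csv
import io
from statistics import mean


def average_scores(csv_text, score_columns):
    """Average each named score column, skipping empty cells."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    averages = {}
    for column in score_columns:
        values = [
            float(row[column])
            for row in rows
            if (row.get(column) or "").strip()
        ]
        # None signals that the column had no usable values.
        averages[column] = mean(values) if values else None
    return averages


# Made-up data standing in for a downloaded output CSV.
example_output = (
    "audios,model_a_score,model_b_score\n"
    "sample-01.wav,0.9,0.7\n"
    "sample-02.wav,0.8,\n"
)

averages = average_scores(example_output, ["model_a_score", "model_b_score"])
```

Empty cells are skipped rather than treated as zero, so a model that failed on one sample is not penalised twice; decide whether that suits your comparison.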

## FAQs

#### Q: How should I prepare the transcription and translation data?

A: Put the audio sample links in the first column. For each link, add the transcription and translation in the same row.

<figure><img src="https://662560811-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F5BFP5RUm6rTLXk8wUSTf%2Fuploads%2FqUx4WaG1UQCjzlUIUThV%2FScreenshot%202024-05-08%20at%2011.08.26%20PM.png?alt=media&#x26;token=c8d2510d-0b21-4d6d-af79-7386233388fc" alt=""><figcaption><p>Screenshot of audio sample links with transcriptions and English translations</p></figcaption></figure>

#### Q: What is the ideal length of the recording?

A: Any recording shorter than 40 minutes will work. If you are a copilot creator, we recommend limiting each audio sample to 2-3 minutes.

#### Q: What audio format should we use?

A: Gooey's ASR workflow will accept .wav and .mp3 audio formats.
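Before uploading a large batch, a quick check of file extensions can catch unsupported formats early. This is a minimal sketch that only looks at file names, not at the audio content itself; the function name `unsupported_files` is illustrative.

```python
from pathlib import Path

# The two formats named in this guide.
SUPPORTED_EXTENSIONS = {".wav", ".mp3"}


def unsupported_files(filenames):
    """Return the file names whose extension is not .wav or .mp3."""
    return [
        name
        for name in filenames
        if Path(name).suffix.lower() not in SUPPORTED_EXTENSIONS
    ]


to_fix = unsupported_files(["intro.wav", "answer.MP3", "notes.ogg", "untitled"])
```

Any file listed in `to_fix` would need converting to .wav or .mp3 before you add its link to the sheet.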

#### Q: How many files can I test?

A: Our partners have tested up to 1000 audio files in one run!

#### Q: How can we increase the quality of the test?

A: Make sure your transcriptions and translations are as accurate as possible.
