How to use ASR?

A simple guide on how to use Automatic Speech Recognition

Last updated 6 months ago

Speech recognition and translation are important when we create AI workflows for frontline workers, traders, and impact organizations.

In AI copilot scenarios, we found that users prefer to send queries by voice rather than text. This could mean:

  • Voice notes sent to WhatsApp and web copilots

  • Voice-based IVR in areas with low internet coverage

So as part of our AI Workflow Standards, we now host over 15 ASR models that can be chained into AI Copilot workflows.

Here is a simple guide to using the ASR Workflow.

Step 1: Add your audio samples

  • Head to the Gooey.AI Speech Recognition and Translation Workflow: https://gooey.ai/speech/

  • Click the "Clear all" button, then upload your audio file from a local folder or add a link to it

If you are using a link, you can use:

  • a hosted media link

  • a Google Drive link to the audio file

  • a YouTube video link
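
If you collect audio links in a script before uploading them, it can help to label which of the three link types each one is. The following is a minimal sketch, not part of Gooey.AI: the function name `classify_audio_link` and the list of hostnames are assumptions chosen for illustration, and any link that is not recognized as Google Drive or YouTube is simply treated as a hosted media link.

```python
from urllib.parse import urlparse

# Assumption: these hostnames are illustrative, not an official Gooey.AI list.
YOUTUBE_HOSTS = {"youtu.be", "youtube.com", "www.youtube.com", "m.youtube.com"}
GOOGLE_DRIVE_HOSTS = {"drive.google.com"}


def classify_audio_link(url: str) -> str:
    """Label a link as 'google_drive', 'youtube', or 'hosted_media'."""
    host = (urlparse(url).hostname or "").lower()
    if host in GOOGLE_DRIVE_HOSTS:
        return "google_drive"
    if host in YOUTUBE_HOSTS:
        return "youtube"
    # Anything else is assumed to be a directly hosted media file.
    return "hosted_media"
```

For example, `classify_audio_link("https://youtu.be/xyz")` is labeled as a YouTube link, while a direct link to a `.wav` file on your own server falls through to the hosted media case.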

Step 2: Select the Speech to Text provider

  • Select a model from the "Speech-to-Text Provider" dropdown.

If you are unsure which models work with your source language, use the "Filter by Language" dropdown.

Step 3: Select your translation model

  • Click on the "Translate" checkbox

  • Select the translation model of your choice

Step 4: Click "Run"

  • Click on the "Run" button
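
The four steps above can also be mirrored in a script that prepares an API request. The following is a minimal sketch under stated assumptions: the endpoint URL, the JSON field names (`documents`, `selected_model`, `translation_model`, `translation_target`), and the model identifiers are illustrative guesses rather than confirmed values, so check the workflow's API tab for the exact request shape. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: illustrative endpoint; confirm it in the workflow's API tab.
ASR_ENDPOINT = "https://api.gooey.ai/v2/asr"


def build_asr_payload(audio_urls, asr_model, translation_model=None,
                      translation_target=None):
    """Collect the choices from Steps 1-3 into one JSON-ready dict.

    All field names here are assumptions made for illustration.
    """
    payload = {
        "documents": list(audio_urls),   # Step 1: audio samples
        "selected_model": asr_model,     # Step 2: speech-to-text provider
    }
    if translation_model is not None:    # Step 3: optional translation
        payload["translation_model"] = translation_model
        if translation_target is not None:
            payload["translation_target"] = translation_target
    return payload


def build_asr_request(payload, api_key):
    """Step 4: prepare (but do not send) the POST request."""
    return urllib.request.Request(
        ASR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually run the request you would pass the returned object to `urllib.request.urlopen`, using your own Gooey.AI key and the model names shown in the API tab.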

Frequently Asked Questions

Q: How do I test the ASR models that can transcribe Swahili?

A: Choose Swahili in the "Filter by Language" dropdown to list the models that work with it.

Q: In the translate section I can see "Google Translate" and "GhanaNLP". Which model should I use?

A: If you are translating an African language, test whether GhanaNLP is a better choice. GhanaNLP Machine Translation supports: Twi, Ewe, Ga, Fanti, Yoruba, Dagbani, Kikuyu, Fra fra, Luo (Kenya, Tanzania), Meru, and Kusaal.
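
The advice above can be written as a small lookup. This is a minimal sketch, not a Gooey.AI feature: the helper name `translation_models_to_test` is hypothetical, the language list is copied from the answer above, and the sketch assumes Google Translate is always worth trying without checking whether it supports the language.

```python
# Languages the answer above lists for GhanaNLP Machine Translation.
GHANANLP_LANGUAGES = {
    "twi", "ewe", "ga", "fanti", "yoruba", "dagbani",
    "kikuyu", "fra fra", "luo", "meru", "kusaal",
}


def translation_models_to_test(source_language: str) -> list[str]:
    """Suggest which translation models to compare for a source language."""
    name = source_language.strip().lower()
    if name in GHANANLP_LANGUAGES:
        # Both are candidates; compare their output on your own samples.
        return ["GhanaNLP", "Google Translate"]
    # Assumption: Google Translate is the only remaining option to try.
    return ["Google Translate"]
```

For a Twi recording this suggests testing both models side by side, while for Hindi it suggests Google Translate only.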

Q: I have tested a few models, but I want to evaluate a larger dataset without using the API. Is that possible?

A: Yes! It is easy to set up large-scale ASR evaluations in Gooey.AI. Here is the guide: How to create language evaluation for ASR?

Q: What is the "Prompt" section when I choose GPT4o-Audio?

A: GPT4o-Audio is an LLM-based transcription model, so the prompt section lets you control how the transcribed audio is output. For example, if you input a Hindi audio sample, you can say "Translate the Hindi recording as accurately as possible". This uses the LLM directly to translate the audio. You could also use it in other ways, such as "Summarize the Hindi recording to English in bullet points", which could give you the salient points of the recording directly. See the example run "Summarize the Hindi recording to English in bullet points" (Speech by Computational Mama aka Ambika).

Links:

  • How to create language evaluation for ASR?

  • AI Workflow Standards

  • Speech Recognition and Translation Workflow: https://gooey.ai/speech/

  • Example run: Summarize the Hindi recording to English in bullet points (Speech by Computational Mama aka Ambika, Gooey.AI)