Transcribe audio into text

Supported in: Batch

Transcribe audio files into text.

Expression categories: Media

Declared arguments

  • Media reference - The column containing media references to audio files in the media sets.
    Expression<Media reference>
  • optional Language - The language to detect in the input file. If no language is provided, it will be inferred from the first 30 seconds of audio.
    Enum<Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, and more ...>
  • optional Output mode - Choose between a simple output, where the result is returned as the output type and errors are returned as null, or a struct output with the result and the error as separate fields.
    Enum<Simple, With errors>
  • optional Performance mode - The performance mode to use when running transcription. If no mode is provided, we will default to the more economical option.
    Enum<More economical, More performant>

Output type: String | Struct<ok, error>
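
As a rough illustration of the two output modes, the Python sketch below shows how downstream code might handle either output shape. The TranscriptionResult class and the transcript_or_default helper are hypothetical names, not part of the product, and treating ok and error as nullable strings is an assumption this page does not confirm.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TranscriptionResult:
    """Hypothetical stand-in for Struct<ok, error> (field types are assumed)."""
    ok: Optional[str] = None
    error: Optional[str] = None


def transcript_or_default(
    value: Union[str, TranscriptionResult, None],
    default: str = "",
) -> str:
    """Return the transcript text from either output mode, or a default.

    Simple mode: value is a string, or None when transcription failed.
    With errors mode: value is a struct carrying ok and error fields.
    """
    if value is None:
        # Simple mode reports errors (and null inputs) as null.
        return default
    if isinstance(value, TranscriptionResult):
        # With errors mode: fall back to the default when ok is missing.
        return value.ok if value.ok is not None else default
    return value
```

In this sketch the simple mode loses the reason for a failure, while the struct form keeps it in the error field, which is the trade-off the Output mode argument describes.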

Examples

Example 1: Base case

Description: Transcribe the audio file.

Argument values:

  • Media reference: mediaReference
  • Language: null
  • Output mode: null
  • Performance mode: null

mediaReference | Output
{"mimeType":"audio/mpeg","reference":{"type":"mediaSetItem","mediaSetItem":{"mediaSetRid":"ri.mio.main.media-set.a", "mediaItemRid":"ri.mio.main.media-item.a"}}} | This is an example transcription from Whisper

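The media reference value in Example 1 is a JSON document. As a minimal sketch, assuming the reference is available as a JSON string with the same shape as that example, the Python below pulls out the MIME type and the media set and media item identifiers. The describe_media_reference helper and its returned key names are hypothetical and only illustrate the structure.

```python
import json

# Media reference value copied from Example 1.
MEDIA_REFERENCE = (
    '{"mimeType":"audio/mpeg","reference":{"type":"mediaSetItem",'
    '"mediaSetItem":{"mediaSetRid":"ri.mio.main.media-set.a", '
    '"mediaItemRid":"ri.mio.main.media-item.a"}}}'
)


def describe_media_reference(raw: str) -> dict:
    """Parse a media reference string shaped like the one in Example 1."""
    parsed = json.loads(raw)
    item = parsed["reference"]["mediaSetItem"]
    return {
        "mime_type": parsed["mimeType"],
        # Audio MIME types start with "audio/" (for example audio/mpeg).
        "is_audio": parsed["mimeType"].startswith("audio/"),
        "media_set_rid": item["mediaSetRid"],
        "media_item_rid": item["mediaItemRid"],
    }


info = describe_media_reference(MEDIA_REFERENCE)
```

A check like is_audio could be used to filter out non-audio items before transcription, although this page does not say how the expression itself treats them.
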
Example 2: Null case

Argument values:

  • Media reference: mediaReference
  • Language: null
  • Output mode: null
  • Performance mode: null

mediaReference | Output
null | null