Transcribe an audio media set

This guide walks through how to transcribe audio in Foundry using media sets.

Part 1: Import audio files in Foundry as a media set

First, import your audio files as a media set. There are two ways to do this:

Once imported, you will be able to view your audio media set.

Audio media set

Part 2: Transcribe audio media set via Pipeline Builder

  1. Create a new pipeline in Pipeline Builder. Detailed steps can be found in the initial setup section of the Pipeline Builder documentation.

  2. Add your audio media set to the pipeline.

    Add audio media set to Pipeline Builder.

    Your imported audio media set should look like this:

    Imported audio media set.

  3. Next, use Transforms to select the Transcribe audio into text transformation.

    Transcribe audio into text transform.

  4. Specify the inputs for the Transcribe audio into text transformation and select Apply.

    Example inputs for transformation. Use the media_reference column from the media set input and select the desired language. If no language is provided, the language is inferred from the first 30 seconds of audio. Choose whether to output the transcription as a plain text string or to include segment details with timestamps and confidence scores.

  5. Preview the transcription output in the table.

    Preview audio transcription output.

  6. If needed, continue transforming the transcription string output with the available string transformations.
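
If you choose the segment-detail output in step 4, you may want to filter or reshape the segments in a later step. The following is a minimal Python sketch of that idea, not Foundry's API: the `Segment` class and its field names (`start`, `end`, `text`, `confidence`) are hypothetical stand-ins for whatever the actual output schema contains, and the assumption that confidence scores fall between 0 and 1 should be checked against the preview table.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical field names; verify them against the real output schema.
    start: float       # segment start, in seconds
    end: float         # segment end, in seconds
    text: str          # transcribed text for this segment
    confidence: float  # assumed to lie between 0 and 1


def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as HH:MM:SS, dropping fractional seconds."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def normalize_text(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def segments_to_lines(segments, min_confidence=0.0):
    """Return one '[HH:MM:SS] text' line per segment, ordered by start time.

    Segments whose confidence is below min_confidence are skipped.
    """
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.confidence < min_confidence:
            continue
        stamp = format_timestamp(seg.start)
        lines.append(f"[{stamp}] {normalize_text(seg.text)}")
    return lines
```

Calling `segments_to_lines(segments, min_confidence=0.5)` on a list of `Segment` values would keep only the segments scored at 0.5 or higher. The threshold is an arbitrary illustration; a suitable value depends on your audio quality and use case.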

Part 3: Save pipeline output

Choose the desired pipeline output. You can output a dataset, or add the output to the Ontology by selecting an object type output. Creating an object type allows you to use your pipeline outputs in Workshop.

Learn more about how to save your pipeline output.