Transcribe an audio media set

This guide walks through how to transcribe audio in Foundry using media sets.

Part 1: Import audio files into Foundry as a media set

First, import your audio files as a media set. There are two ways to do this:

Once the files are imported, you can view your audio media set.

Audio media set

Part 2: Transcribe audio media set via Pipeline Builder

  1. Create a new pipeline in Pipeline Builder. Detailed steps can be found in the initial setup section of the Pipeline Builder documentation.

  2. Add your audio media set to the pipeline.

    Add audio media set to Pipeline Builder.

    Your imported audio media set should look like this:

    Imported audio media set.

  3. Convert the media set into table rows using Transforms.

    Convert audio media set to table rows.

    This generates a media reference for each item in your media set. A media reference lets you use a media item in Foundry without copying the media item itself. Learn more about media references.

  4. Next, select the Transcribe audio into text transformation.

    Transcribe audio into text transform.

  5. Specify the inputs for the Transcribe audio into text transformation and select Apply.

    Example inputs for transformation.

    Use the mediaReference generated in step 3 and select the desired language. If no language is provided, the language is inferred from the first 30 seconds of audio.

  6. Preview the transcription output in the table.

    Preview audio transcription output.

  7. If needed, continue to transform the transcription string output with the available string transformations.
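To make the data flow in steps 3 through 7 concrete, the following is a minimal Python sketch of the same shape: media items become rows that carry a reference rather than a copy, a transcription column is filled from that reference, and a string transformation cleans the result. It is not Pipeline Builder's implementation or API. `MediaReference`, `Row`, `media_set_to_rows`, `transcribe_rows`, `clean_transcription`, and `fake_transcriber` are all hypothetical names, and the transcriber is a stub that returns placeholder text instead of processing audio.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class MediaReference:
    """Hypothetical stand-in for a media reference: a pointer, not the audio bytes."""
    media_set_id: str
    media_item_id: str


@dataclass(frozen=True)
class Row:
    """One table row per media item (column names are assumptions)."""
    path: str
    media_reference: MediaReference
    transcription: Optional[str] = None


def media_set_to_rows(media_set_id: str, items: dict[str, str]) -> list[Row]:
    """Step 3: one row per media item, carrying a reference instead of a copy."""
    return [
        Row(path=path, media_reference=MediaReference(media_set_id, item_id))
        for item_id, path in sorted(items.items())
    ]


def transcribe_rows(
    rows: list[Row],
    transcribe: Callable[[MediaReference, Optional[str]], str],
    language: Optional[str] = None,
) -> list[Row]:
    """Steps 4-5: fill the transcription column from each row's media reference."""
    return [
        Row(r.path, r.media_reference, transcribe(r.media_reference, language))
        for r in rows
    ]


def clean_transcription(row: Row) -> Row:
    """Step 7: an example string transformation (trim and collapse whitespace)."""
    text = " ".join((row.transcription or "").split())
    return Row(row.path, row.media_reference, text)


def fake_transcriber(ref: MediaReference, language: Optional[str]) -> str:
    # Stub: a real transcriber would resolve the reference to audio and decode it.
    tag = language or "auto"
    return f"  [{tag}]  transcript of   {ref.media_item_id} "


# Hypothetical media set with two items, keyed by item ID.
rows = media_set_to_rows("audio-set", {"item-2": "b.wav", "item-1": "a.mp3"})
rows = [clean_transcription(r) for r in transcribe_rows(rows, fake_transcriber)]
for r in rows:
    print(r.path, "->", r.transcription)
```

Passing `language=None` mirrors leaving the language input empty in step 5; the stub only labels that case as `auto` and does not attempt any language detection.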

Part 3: Save pipeline output

Choose the desired pipeline output. You can output a dataset, or add the output to the Ontology by selecting an object type output. Creating an object type allows you to use your pipeline outputs in Workshop.
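Before choosing an object type output, it can help to confirm that the transcription table has a column that identifies each row uniquely. The sketch below is a plain Python illustration of that check, assuming the object type would be keyed on a single column with unique values; that requirement, the column names, and the helper `duplicate_keys` are assumptions made for illustration and are not taken from this guide.

```python
from collections import Counter


def duplicate_keys(rows: list[dict], key: str) -> list[str]:
    """Return the values of `key` that appear in more than one row, sorted."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical pipeline output rows; column names are illustrative only.
output_rows = [
    {"media_item_id": "item-1", "path": "a.mp3", "transcription": "hello"},
    {"media_item_id": "item-2", "path": "b.wav", "transcription": "world"},
]

print(duplicate_keys(output_rows, "media_item_id"))  # prints []
```

An empty list means the chosen column is unique across rows; any values returned would need to be resolved before that column could serve as a key.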

Learn more about how to save your pipeline output.