Speech and Singing Alignment

This module aligns the text of a subtitle file with the speech and singing in an input audio file. It produces word-level and line-level timestamps that can be used for accessibility, translation, or further analysis.

Documentation

Settings

Name: use_segments
Type: boolean
Description:

Name: language
Type: string
Description: The language of the input audio file. Must be one of the supported language codes.

Input

Name: subtitleInputFileUrl
Type: string
Description: Subtitle file you want to align with the audio

Name: audioInputFileUrl
Type: string
Description: Audio file containing speech or singing that you want to align with the subtitles
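
To make the settings and inputs above concrete, here is a minimal Python sketch that assembles them into a single request body. Only the field names (use_segments, language, subtitleInputFileUrl, audioInputFileUrl) come from this page. The "settings"/"input" nesting, the helper name build_alignment_request, the example.com URLs, and the "en" language code are assumptions made for illustration, and the sketch does not show how the request is actually submitted.

```python
import json
from urllib.parse import urlparse


def build_alignment_request(audio_url, subtitle_url, language, use_segments=False):
    """Assemble a request body from the documented settings and inputs.

    Hypothetical helper: the "settings"/"input" nesting is an assumption,
    not a documented request format.
    """
    # Both inputs are documented as URL strings, so reject anything else early.
    for name, url in (("audioInputFileUrl", audio_url),
                      ("subtitleInputFileUrl", subtitle_url)):
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {url!r}")
    # use_segments is documented as a boolean setting.
    if not isinstance(use_segments, bool):
        raise TypeError("use_segments must be a boolean")
    return {
        "settings": {"use_segments": use_segments, "language": language},
        "input": {
            "audioInputFileUrl": audio_url,
            "subtitleInputFileUrl": subtitle_url,
        },
    }


request_body = build_alignment_request(
    audio_url="https://example.com/song.wav",      # placeholder URL
    subtitle_url="https://example.com/song.srt",   # placeholder URL
    language="en",  # assumed to be one of the supported language codes
)
print(json.dumps(request_body, indent=2))
```

Validating the URLs and the boolean before sending anything keeps simple mistakes, such as passing a local file path, from reaching the service.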

Output

Name: alignedByWord
Type: string
Description: JSON file containing individual words and timestamps aligned with the input audio

Name: alignedByLine
Type: string
Description: JSON file containing lines from the subtitle file and timestamps aligned with the input audio
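
As a rough illustration of consuming the alignedByWord output, the Python sketch below parses a small JSON sample and prints each word with its time range. The schema of the JSON file is not described on this page, so the sample and its "word", "start", and "end" keys (with times in seconds) are purely hypothetical. Adjust them to match the file you actually receive.

```python
import json

# Hypothetical sample: the real alignedByWord schema is not documented here.
SAMPLE_ALIGNED_BY_WORD = """
[
  {"word": "hello", "start": 0.5, "end": 0.92},
  {"word": "world", "start": 1.0, "end": 1.48}
]
"""


def format_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (SRT-style timestamp)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_lines(words):
    """Render each aligned word as 'start --> end word'."""
    return [
        f"{format_timestamp(w['start'])} --> {format_timestamp(w['end'])} {w['word']}"
        for w in words
    ]


words = json.loads(SAMPLE_ALIGNED_BY_WORD)
for line in words_to_lines(words):
    print(line)
```

The same loop would apply to alignedByLine if it uses a similar structure, with a full subtitle line in place of a single word.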

Related Modules

Audio Pad

Add configurable periods of silence at the beginning and end of an audio file.

Segment

Extract a segment from the input audio file, starting at a specified timestamp and lasting a chosen duration.

Audio Activity Detection

Automate the detection and analysis of audio activity, streamlining the editing workflow.
