audio package

translate module

multimodaltranslation.audio.translate.audio_to_text(audio_bytes: bytes, model: str) → str

Transcribes the given audio bytes into text.

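Parameters:

audio_bytes (bytes) – The audio data to transcribe.

model (str) – The Vosk model to use for transcription.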

Returns:

The transcription of the audio.

Return type:

str

Raises:

RuntimeError – If converting the audio to WAV format fails.

multimodaltranslation.audio.translate.convert_to_wav_bytes(audio_bytes: bytes) → BytesIO

Converts audio from other formats to WAV using ffmpeg. WAV is the format the model requires.
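
The conversion described above can be sketched as a call to the ffmpeg command-line tool that reads the input from stdin and writes WAV to stdout. This is a minimal illustration, not the package's actual implementation: the names `build_ffmpeg_command` and `convert_to_wav_bytes_sketch` are hypothetical, and the mono, 16 kHz, 16-bit PCM settings are an assumption (they are commonly used with Vosk models, but the source only says "wav").

```python
import subprocess
from io import BytesIO


def build_ffmpeg_command(sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that reads audio from stdin and writes WAV to stdout.

    The mono / 16-bit PCM / 16 kHz settings are assumptions, not taken from the package.
    """
    return [
        "ffmpeg",
        "-hide_banner",
        "-loglevel", "error",
        "-i", "pipe:0",            # read the input audio from stdin
        "-ac", "1",                # downmix to a single channel
        "-ar", str(sample_rate),   # resample to the requested rate
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM samples
        "-f", "wav",               # force the WAV container
        "pipe:1",                  # write the result to stdout
    ]


def convert_to_wav_bytes_sketch(audio_bytes: bytes) -> BytesIO:
    """Hypothetical stand-in for convert_to_wav_bytes(); raises RuntimeError on failure."""
    try:
        result = subprocess.run(
            build_ffmpeg_command(),
            input=audio_bytes,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
        )
    except FileNotFoundError as exc:
        # ffmpeg is not installed or not on PATH
        raise RuntimeError("ffmpeg executable not found") from exc
    except subprocess.CalledProcessError as exc:
        message = exc.stderr.decode(errors="replace")
        raise RuntimeError(f"ffmpeg conversion failed: {message}") from exc
    return BytesIO(result.stdout)
```

Piping through stdin and stdout avoids temporary files, although some container formats may need a seekable input, in which case writing the bytes to a temporary file first would be the safer choice.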

multimodaltranslation.audio.translate.get_model(lang: str) → str

Returns the path to the Vosk model for the given language. Downloads it if not already installed.
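
The "download if not already installed" behaviour can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the function name `get_model_sketch`, the `models_dir` and `download` parameters, and the `vosk-model-<lang>` directory naming are hypothetical and may not match how the package stores or fetches models.

```python
from pathlib import Path
from typing import Callable


def get_model_sketch(
    lang: str,
    models_dir: Path,
    download: Callable[[str, Path], None],
) -> str:
    """Return the local path of the model for `lang`, downloading it only if missing.

    `download` is a hypothetical callable that fetches the model for `lang`
    and unpacks it into the given destination directory.
    """
    model_path = models_dir / f"vosk-model-{lang}"  # hypothetical naming scheme
    if not model_path.is_dir():
        # Model not installed yet: make sure the parent exists, then fetch it.
        models_dir.mkdir(parents=True, exist_ok=True)
        download(lang, model_path)
    return str(model_path)
```

Passing the downloader in as a parameter keeps the caching logic separate from the network code, so a second call for the same language returns the cached path without downloading again.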

multimodaltranslation.audio.translate.translate_audio(audio_bytes: bytes, lang: str, targets: list) → list

Calls audio_to_text() to transcribe the audio, then translates the transcript into the target languages using translate_text().

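Parameters:

audio_bytes (bytes) – The audio data to translate.

lang (str) – The language of the audio.

targets (list) – The target languages.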

Returns:

A list of the translated texts, each paired with its target language.

Return type:

list
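
The two-step flow of translate_audio (transcribe first, then translate once per target language) can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: `translate_audio_sketch` is a hypothetical name, the `transcribe` and `translate` callables stand in for audio_to_text() and translate_text() (whose exact signatures are not documented here), and the dictionary shape of each result entry is assumed.

```python
from typing import Callable


def translate_audio_sketch(
    audio_bytes: bytes,
    lang: str,
    targets: list,
    transcribe: Callable[[bytes, str], str],
    translate: Callable[[str, str, str], str],
) -> list:
    """Transcribe the audio once, then translate the transcript into each target.

    `transcribe(audio_bytes, lang)` and `translate(text, source, target)` are
    hypothetical stand-ins for the package's own functions.
    """
    # Step 1: speech to text, done once regardless of how many targets there are.
    text = transcribe(audio_bytes, lang)
    # Step 2: one translation per target language, keeping the input order.
    return [
        {"lang": target, "text": translate(text, lang, target)}  # assumed shape
        for target in targets
    ]
```

With stub callables, the flow can be exercised without any audio or models:

```python
result = translate_audio_sketch(
    b"fake-audio",
    "en",
    ["fr", "es"],
    transcribe=lambda audio, lang: "hello",
    translate=lambda text, source, target: f"{text} ({target})",
)
```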