audio package
translate module
- multimodaltranslation.audio.translate.audio_to_text(audio_bytes: bytes, model: str) → str
Transcribes an audio file to text.
- Parameters:
audio_bytes (bytes) – The bytes of the audio file.
model (str) – The path to the model to use, as a string.
- Returns:
The transcription of the audio.
- Return type:
str
- Raises:
RuntimeError – If converting the audio file to WAV fails.
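The transcription step can be sketched as below. This is a minimal illustration, not the module's actual implementation: `transcribe_wav` and `chunk_frames` are hypothetical names, and the recognizer is assumed to expose `AcceptWaveform()`, `Result()` and `FinalResult()` returning JSON strings with a `"text"` field, as `vosk.KaldiRecognizer` does.

```python
import io
import json
import wave


def transcribe_wav(wav_io: io.BytesIO, recognizer, chunk_frames: int = 4000) -> str:
    """Feed WAV frames to a Vosk-style recognizer and join the recognized text.

    `recognizer` is an assumption: any object with AcceptWaveform(),
    Result() and FinalResult(), such as vosk.KaldiRecognizer.
    """
    pieces = []
    with wave.open(wav_io, "rb") as wav:
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break
            # A truthy return value means a complete utterance is ready.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Flush whatever is still buffered in the recognizer.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

With Vosk installed, the recognizer would typically be built as `KaldiRecognizer(Model(model), sample_rate)`, where `model` is the path passed to `audio_to_text`.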
- multimodaltranslation.audio.translate.convert_to_wav_bytes(audio_bytes: bytes) → BytesIO
Converts audio in other formats to WAV using ffmpeg; the model requires WAV input.
- Parameters:
audio_bytes (bytes) – The audio file in bytes.
- Returns:
The converted audio file.
- Return type:
io.BytesIO
- Raises:
RuntimeError – If the conversion process fails.
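One way such a conversion might look is shown below: the input bytes are piped through ffmpeg and the WAV output is read back from its standard output. This is a sketch under assumptions, not the module's code: `convert_to_wav_sketch` and the injectable `run` parameter are hypothetical, and the 16 kHz mono output settings are a guess at what the model expects.

```python
import io
import subprocess


def convert_to_wav_sketch(audio_bytes: bytes, run=subprocess.run) -> io.BytesIO:
    """Convert arbitrary audio bytes to WAV by piping them through ffmpeg."""
    # Assumption: 16 kHz mono output; the real module may use other settings.
    cmd = ["ffmpeg", "-i", "pipe:0", "-f", "wav", "-ar", "16000", "-ac", "1", "pipe:1"]
    try:
        proc = run(cmd, input=audio_bytes, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError as exc:
        # Raised, for example, when the ffmpeg binary is not installed.
        raise RuntimeError(f"could not start ffmpeg: {exc}") from exc
    if proc.returncode != 0:
        raise RuntimeError("ffmpeg conversion failed: " + proc.stderr.decode(errors="replace"))
    return io.BytesIO(proc.stdout)
```

The `run` parameter exists only so the function can be exercised without ffmpeg installed; in normal use the default `subprocess.run` is called.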
- multimodaltranslation.audio.translate.get_model(lang: str) → str
Returns the path to the Vosk model for the given language. Downloads it if not already installed.
- Parameters:
lang (str) – The language of the model.
- Returns:
Path to the model folder as a string.
- Return type:
str
- Raises:
Exception – If no model is available for the given language.
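The "return the path, download if missing" behaviour can be illustrated roughly as follows. Everything here is hypothetical: `get_model_sketch`, `MODEL_NAMES`, `models_dir` and the `download` callback are invented for the example, and the single mapping entry is illustrative rather than the module's actual language list.

```python
from pathlib import Path

# Illustrative mapping only; the languages the real module supports are not shown here.
MODEL_NAMES = {"en": "vosk-model-small-en-us-0.15"}


def get_model_sketch(lang: str, models_dir: Path, download) -> str:
    """Return the local model folder for `lang`, fetching it on first use."""
    try:
        name = MODEL_NAMES[lang]
    except KeyError:
        raise Exception(f"Language model not available: {lang}") from None
    path = models_dir / name
    if not path.is_dir():
        # `download(name, destination)` is a hypothetical callback that
        # fetches and unpacks the model into `destination`.
        download(name, path)
    return str(path)
```

Checking for the folder before calling `download` means a model is fetched at most once per `models_dir`.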
- multimodaltranslation.audio.translate.translate_audio(audio_bytes: bytes, lang: str, targets: list) → list
Calls audio_to_text() to transcribe the audio into text,
then translates the text into the desired languages using translate_text().
- Parameters:
audio_bytes (bytes) – The bytes of the audio file.
lang (str) – The original language of the audio.
targets (list) – A list of target languages for the translation.
- Returns:
A list of the translated texts, each with its target language.
- Return type:
list
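The transcribe-then-translate flow described for translate_audio can be sketched as a small pipeline. This is an illustration only: `translate_audio_sketch` is a hypothetical name, the three steps are passed in as callables so the example stays self-contained, and both the `translate_text(text, source, target)` argument order and the `{"lang": ..., "text": ...}` result shape are assumptions not confirmed by this reference.

```python
from typing import Callable, Dict, List


def translate_audio_sketch(
    audio_bytes: bytes,
    lang: str,
    targets: List[str],
    get_model: Callable[[str], str],
    audio_to_text: Callable[[bytes, str], str],
    translate_text: Callable[[str, str, str], str],
) -> List[Dict[str, str]]:
    """Transcribe once, then translate the transcript into each target language."""
    model_path = get_model(lang)
    text = audio_to_text(audio_bytes, model_path)
    # Assumed result shape: one entry per target language.
    return [{"lang": target, "text": translate_text(text, lang, target)} for target in targets]
```

Transcribing once and reusing the transcript for every target keeps the speech-recognition cost independent of the number of target languages.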