About
What is multimodal translation?
Simply put, it’s translating content across different types of media, such as text and audio.
Why is multimodality important?
Types of multimodal translation:
Text-to-text: The simplest form, where text is translated from one language into another.
Audio-to-text: The audio is first transcribed, and the transcript is then translated into one or more languages.
Audio-to-audio: May be implemented in the future. It’s the same concept as audio-to-text, but the output is delivered as audio instead of text.
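The audio-to-text flow described above (transcribe first, then translate) can be sketched roughly as follows. This is only an illustration, not the application's actual code: the names audio_to_text, Transcriber and Translator are hypothetical, and the two fake functions stand in for a real speech recognizer and a real translation model.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a real application would plug in its own
# speech recognizer and translation model here.
Transcriber = Callable[[bytes], str]      # audio -> transcript
Translator = Callable[[str, str], str]    # (text, target language) -> text


def audio_to_text(audio: bytes, targets: List[str],
                  transcribe: Transcriber,
                  translate: Translator) -> Dict[str, str]:
    """Transcribe the audio once, then translate the transcript
    into each requested target language."""
    transcript = transcribe(audio)
    return {lang: translate(transcript, lang) for lang in targets}


# Fake components, used only to show the flow end to end.
def fake_transcribe(audio: bytes) -> str:
    # Pretends the audio bytes already contain the spoken words.
    return audio.decode("utf-8")


def fake_translate(text: str, target: str) -> str:
    # Tags the text with the target language instead of translating it.
    return f"[{target}] {text}"


result = audio_to_text(b"hello", ["fr", "de"], fake_transcribe, fake_translate)
```

Transcribing once and reusing the transcript for every target language avoids running speech recognition again for each translation.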
Technology used:
Speech recognition: Needed to recognize spoken language so that it can be interpreted and translated. The result can then be output as text or as audio.
Limitations:
Language support: It is hard to support every language, since each language needs its own model, which has to be trained and installed into the application.
Maintaining context: The context may shift when content moves from one medium to another, so the translation must be checked to make sure the original meaning is preserved.
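The language support limitation can be pictured as a lookup of installed models. The sketch below assumes one model per language, as described above; the registry contents, the model names and the find_model function are made up for illustration.

```python
from typing import Dict

# Hypothetical registry: language code -> name of the installed model.
INSTALLED_MODELS: Dict[str, str] = {
    "fr": "french-model",
    "de": "german-model",
}


def find_model(language: str) -> str:
    """Return the installed model for a language, or raise an error
    if no model for that language has been trained and installed."""
    try:
        return INSTALLED_MODELS[language]
    except KeyError:
        raise ValueError(f"No model installed for language: {language}") from None
```

Supporting a new language then means training a model and adding an entry for it, which is why covering every language is costly.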
Improvements:
As mentioned above, audio-to-audio translation may be implemented in the future. Support for other media types, such as videos and images, could also be added.