Technical Debt

The list below represents our technical debt, which we will address in the near future.

We will check the debt which is done by a

New Features:

1- Audio to many audio translations
2- Video to many text translations (pure text)
3- Video to many video translations (live)
4- Video to text translated on the videos

All of the above need to be implemented from one language to many languages. For example, an English video should be translated into French, Italian, etc., either as text, as a video in a different language, or live.

Audio search for words in a given language (optional):

We need to be able to search an audio file for given words in a given language. For example, if I have audio in Swedish and want to search for words such as “eat” and “bread” in English, I convert the audio to English and then search for the given English words. This is optional for now. Keyword search is part of pocketsphinx; see https://cmusphinx.github.io/wiki/tutorialpocketsphinx/#advanced-usage. pocketsphinx also allows us to build a language model, which we will not do now but may revisit later.
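The final search step can be sketched as follows. This is a minimal sketch, assuming the audio has already been transcribed and translated into English text by a step that is not shown here. `find_keywords` is a hypothetical helper written for illustration; it is not part of pocketsphinx or SpeechRecognition.

```python
import re
from collections import Counter


def find_keywords(transcript, keywords):
    """Count how often each keyword occurs in an English transcript.

    Hypothetical helper: `transcript` is assumed to be the text produced
    by an earlier speech-to-text (and translation) step.
    """
    # Lowercase and split into words so that matching is
    # case-insensitive and ignores punctuation.
    words = Counter(re.findall(r"[a-z']+", transcript.lower()))
    return {kw: words[kw.lower()] for kw in keywords}


if __name__ == "__main__":
    hits = find_keywords("We eat bread. They eat rice.", ["eat", "bread", "milk"])
    print(hits)  # {'eat': 2, 'bread': 1, 'milk': 0}
```

Searching the text rather than the audio keeps this step independent of the recognition engine, at the cost of inheriting any transcription or translation errors.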

Format limitation:

Currently, SpeechRecognition supports the following file formats:

- WAV: must be in PCM/LPCM format (this is the only format we support)
- AIFF
- AIFF-C
- FLAC: must be native FLAC format; OGG-FLAC is not supported

Though a WAV file can contain compressed audio, the most common WAV audio format is uncompressed audio in the linear pulse-code modulation (LPCM) format. LPCM is also the standard audio coding format for audio CDs, which store two-channel LPCM audio sampled at 44.1 kHz with 16 bits per sample.
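Because only PCM WAV input is supported, it may help to reject other files before recognition starts. The following is a minimal sketch using Python's standard `wave` module, which reads uncompressed PCM data and raises `wave.Error` for other content. `is_pcm_wav` is a hypothetical helper name, not part of SpeechRecognition.

```python
import wave


def is_pcm_wav(source):
    """Return True if `source` is a WAV file readable as PCM audio.

    `source` may be a path string or a binary file object.
    """
    try:
        with wave.open(source, "rb") as wav:
            # getcomptype() is "NONE" for uncompressed (PCM) data.
            return wav.getcomptype() == "NONE"
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, a truncated file, or a compressed WAV.
        return False
```

A missing file still raises `FileNotFoundError`, so the caller can tell "wrong format" apart from "no such file".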

online play https://www.luxa.org/audio

No noise filter is implemented, and there is no support for ambient noise: the file needs to be clean of noise. This can be implemented in the future as a TODO or technical debt item using SciPy digital signal processing; see https://realpython.com/python-scipy-cluster-optimize/ and https://greenteapress.com/thinkdsp/thinkdsp.pdf
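To give a feel for what a future noise filter would do, here is a minimal sketch of a moving-average filter, which acts as a crude low-pass filter. It is written in plain Python as a stand-in, not as the SciPy-based filter this item calls for, and it is far from real noise suppression. `moving_average` and its `window` parameter are hypothetical names chosen for illustration.

```python
def moving_average(samples, window=5):
    """Smooth a sequence of PCM sample values with a sliding mean.

    A moving average attenuates fast fluctuations (such as hiss)
    while keeping slower changes in the signal.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(samples):
        running += value
        if i >= window:
            # Drop the sample that just left the window.
            running -= samples[i - window]
        # Near the start, fewer than `window` samples are available.
        smoothed.append(running / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    print(moving_average([0, 10, 0, 10], window=2))  # [0.0, 5.0, 5.0, 5.0]
```

A larger window removes more high-frequency content but also blurs the speech itself, so any real implementation would need tuning against actual recordings.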