Updated: Jul 19
Below is a list of features on our Roadmap for Speech.Works transcription.
You can influence the order in which we release them by voting in the poll on the Home page.
Uploading more files at once
If you have a lot of files to transcribe, e.g., your podcasts from the past few years, it can be tedious to submit the files for transcription one at a time. We will soon modify the file upload dialog to support submitting multiple files at a time.
Support for additional audio formats
We rely on feedback from our users to decide which new formats are most needed. Formats that we plan to support in the near future include:
AAC - stand-alone or embedded inside MP4 video files
Transcribe audio from a YouTube URL
This feature would allow you to provide the URL of a YouTube video as input to the transcription utility. Our system would then extract the audio from that video and transcribe it.
Process stereo channels separately
Currently, an uploaded stereo file is mixed down to mono before we submit it to our Speech-to-Text engine. Our plan is to provide an option to transcribe each of the stereo channels individually and output a separate transcript for each channel.
Of course, the timing information for both transcripts would be in sync, and the non-JSON transcript output would distinguish text from different channels.
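As a rough illustration of the idea (not the Speech.Works implementation), the sketch below contrasts mixing stereo to mono with splitting the channels, and shows one way two time-synced transcripts could be interleaved into a plain-text output that labels each channel. All function names, the segment layout `(start_seconds, text)`, and the output format are assumptions made for this example.

```python
from typing import List, Tuple

Frame = Tuple[int, int]        # one stereo frame: (left_sample, right_sample)
Segment = Tuple[float, str]    # hypothetical transcript segment: (start_seconds, text)


def mix_to_mono(frames: List[Frame]) -> List[int]:
    """Current behaviour in spirit: average both channels into one."""
    return [(left + right) // 2 for left, right in frames]


def split_channels(frames: List[Frame]) -> Tuple[List[int], List[int]]:
    """Planned behaviour in spirit: keep each channel as its own signal."""
    left = [l for l, _ in frames]
    right = [r for _, r in frames]
    return left, right


def merge_transcripts(left: List[Segment], right: List[Segment]) -> List[str]:
    """Interleave two per-channel transcripts by start time.

    Both transcripts share one timeline, so sorting by start time
    is enough to restore the order of the conversation.
    """
    labeled = [(start, "Channel 1", text) for start, text in left]
    labeled += [(start, "Channel 2", text) for start, text in right]
    labeled.sort(key=lambda item: item[0])
    return [f"[{start:.1f}s] {label}: {text}" for start, label, text in labeled]


if __name__ == "__main__":
    lines = merge_transcripts(
        [(0.0, "Hello, how are you?"), (3.2, "Glad to hear it.")],
        [(1.5, "Fine, thanks.")],
    )
    print("\n".join(lines))
```

In a JSON output the channel could simply be a field on each segment, which is why only the plain-text form needs an explicit label per line.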
Ability to edit transcript before download
We already provide transcript review functionality that lets you check whether the transcript matches the audio.
Our plan is to enhance this functionality with the ability to edit the transcript in place if you notice a discrepancy between the text and the audio. It would also be possible to fix incorrect capitalization and punctuation, and then save or download the modified transcript.
Identify different speakers in the audio (diarisation)
Currently, the output transcript does not identify the speakers in the audio. Thus, if the audio contains a dialogue, the transcript does not have any information identifying which speaker says which text.
We are working on functionality to label the transcribed text with its originating speaker whenever more than one speaker is detected in the audio.
Spanish language support
Currently, only (American) English is supported in Speech.Works. If we receive sufficient feedback indicating demand for Spanish language support, we will make it our priority to release a Spanish version of our Speech-to-Text engine.