So you’ve got some audio that you want to turn into text? There are a lot of good, inexpensive options out there powered by machines, people, and a mix of the two. If you are working with text at scale, I recommend you pay for one of the many wonderful services or pay to have someone integrate machine-learning-powered speech-to-text with your workflows. However, if you’re just looking to transcribe an interview for something like a case study, testimonial, or qualitative feedback, you don’t have the budget to pass off this tedious work to someone else, and you don’t have the patience to do everything yourself, then this hacked-together workflow will speed up the boring bits significantly.

I’m going to show you how to use IBM Watson’s Speech-to-Text along with Python to partially automate the transcription process. IBM allows 100 minutes of transcription a month, which should be enough for most people who need a hacky solution like this. Machine transcription is far from perfect, and IBM Watson is no exception. You’ll need to read the output while listening along, check the words, and add punctuation, but I’d estimate that I can do five minutes of audio in about 15 minutes of work now that I have this set up, whereas it could take the better part of an hour before.

You’ll need to be able to run Python in order to follow the instructions here. If you’re on Mac or Linux, you most likely already have Python installed and should be good to go. If you’re on a Windows machine, then you have some work to do: installing Python on Windows, using Cygwin, or setting up the Windows Subsystem for Linux (WSL). In my case, I’ve got WSL running Ubuntu Linux in Visual Studio Code. I installed Bash, integrated it with VS Code, and then followed the Linux instructions to get Python running. Setting this up is way beyond the scope of this article. It took me a lot of Googling, but I’m sure you’ll get there if you have enough time.

Alternately, ask a Mac-using friend to borrow their computer.

Sign Up for Watson

First, create an IBM Cloud account and confirm your email, agree to give up your next child in the TOS, and finish all of the usual other setup steps. Next, go to your IBM Bluemix Console (log in if you need to) and then search for and select “speech to text”. The defaults should be fine for a hacked-together tool like this. You should be on a page listing your speech-to-text settings. Note two values on this page: your API key (keep this secret) and the URL where IBM is listening for your requests. We will need both of these things when setting up your scripts.

With Python hopefully up and running and your IBM account created, you’re ready to set things up.

First, create a folder to host your scripts. We’re going to save the input audio here and have a text file to receive the output text, along with the command-line commands and Python scripts that do all of the work. If you’ve already got a dev environment set up, then you know where you want to put this. Otherwise, I recommend creating a dev folder under your user folder and, under that, adding utils and speech-to-text subfolders.

You can download the files or clone them from GitHub here, or follow along and add them as we go.

First, open or create a file called console.txt.
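If you would rather script the setup than click through it, the folder layout and the two Watson values described in this post can be sketched in Python. This is a minimal sketch under some assumptions of mine, not the scripts from the GitHub repo: the function names are hypothetical, I am assuming console.txt lives inside the speech-to-text folder, and the environment variable names WATSON_API_KEY and WATSON_URL are made up for illustration.

```python
import os
from pathlib import Path


def create_workspace(base: Path) -> Path:
    """Create dev/utils and dev/speech-to-text under `base`, plus console.txt.

    Returns the path of the speech-to-text folder.
    """
    dev = base / "dev"
    (dev / "utils").mkdir(parents=True, exist_ok=True)
    project = dev / "speech-to-text"
    project.mkdir(parents=True, exist_ok=True)
    # Assumption: console.txt sits in the project folder.
    # touch() creates the file if missing and leaves existing contents alone.
    (project / "console.txt").touch(exist_ok=True)
    return project


def load_credentials(env=os.environ):
    """Read the API key and service URL from environment variables.

    The variable names are hypothetical; keeping the key out of your
    scripts is one way to keep it secret.
    """
    names = ("WATSON_API_KEY", "WATSON_URL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["WATSON_API_KEY"], env["WATSON_URL"]
```

Calling `create_workspace(Path.home())` would build the folders under your user folder, and it is safe to run more than once because existing folders and files are left untouched. `load_credentials()` fails early with a clear message if either value has not been set.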