This article originally appeared in TidBITS on 2010-05-19 at 4:24 p.m.
The permanent URL for this article is: http://tidbits.com/article/11296

Transcribe Recordings With MacSpeech Scribe

by Matt Neuburg

In recent years, dictation software has become a firmly entrenched reality. It is perfectly possible to sit at your computer wearing a headset and speak to the computer and have it transcribe, with astonishing accuracy, the words that you speak. But what if you are not sitting at your computer? What if you have an idea that requires later transcription, and all you have with you is some sort of recording device? The promise of MacSpeech Scribe is a solution to that problem.

MacSpeech Scribe [1] (from the makers of MacSpeech Dictate [2], the speech recognition application) does not pretend to have the human ability to recognize just anyone's speech. You have to train it, and the speech that it recognizes is yours, and yours alone, the result of your deliberately dictating into a particular digital recording device. So you're not going to be using MacSpeech Scribe to transcribe a teacher's lecture, let alone a debate.

Nevertheless, there are good reasons why the capability to transcribe one's own speech from a recording might be preferable to real-time dictation. As I've already suggested, you might not have a computer with you at the moment you'd like to dictate something. Also, there are significant psychological and even physical differences between dictating directly to your computer and speaking into a recording device. I find that something about the computer sitting there waiting for me, the necessity of wearing the headset, the importance of maintaining strict silence, and other factors combine to make me extremely nervous and tongue-tied. I feel more relaxed talking into a digital recorder. I feel I have time to collect my thoughts. Also, I can clean up the digital file a little with a sound editor program before I hand it over to MacSpeech Scribe, so I'm less nervous about errors than I am when the computer is listening to me.

Another reason why MacSpeech Scribe might be more congenial than MacSpeech Dictate is that the user interface is simpler. MacSpeech Dictate allows you to dictate directly into any application. The price of that power is that you then have to use your voice and some floating windows to make any corrections; you must not make corrections directly by typing, because then you would be acting behind the program's back, as it were, and it would not know what edits you had made to the dictated material. MacSpeech Scribe, on the other hand, is far simpler, both when you are doing your original training and when you are transcribing an actual sound file. The folks at MacSpeech have reduced the interface to a single extremely simple window. (MacSpeech was recently bought by Nuance, the company from which it licensed its speech recognition engine, the same one used in Dragon NaturallySpeaking [3] for PC.) I find it quick and easy to make the very few corrections that might be necessary when MacSpeech Scribe transcribes a recording into text.

As a demonstration of the sort of thing that MacSpeech Scribe can do, I dictated almost the entirety of the first draft of this article using MacSpeech Scribe and a digital recording device, the Zoom H2 [4]. To give you a sense of what the experience is like, I've uploaded a portion of the actual recording [5] of myself speaking the original first draft, just making it up out of my head and saying it to the H2, along with MacSpeech Scribe's transcription [6] of that section of the recording, without any edits or changes. You can compare the two and see how accurately the program is able to interpret the recording. I think the results speak for themselves.

Training MacSpeech Scribe is simple. You speak to your device, enough to make a recording of at least 2 minutes in length; then you hand that recording over to MacSpeech Scribe. The program transcribes the first 15 seconds of the recording, and you run through the transcription phrase by phrase, either accepting or correcting each phrase. The program then starts over and transcribes the first 90 seconds of the recording, and you do the same thing. This is enough for MacSpeech Scribe to generate an initial voice profile for you; you can give it more recorded material for additional training and additional resulting accuracy.
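
Because training needs a recording of at least 2 minutes, it can be handy to check a file's length before handing it over. The following is only a minimal sketch in Python, not anything that ships with MacSpeech Scribe: it assumes the recording is an uncompressed WAV file, and the function names and the 120-second constant are illustrative choices derived from the figure above.

```python
import wave

# "At least 2 minutes", per the training description above.
MIN_TRAINING_SECONDS = 120


def wav_duration_seconds(path):
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Frame count divided by frames per second gives the duration.
        return wav.getnframes() / wav.getframerate()


def long_enough_for_training(path, minimum=MIN_TRAINING_SECONDS):
    """Report whether a recording meets the minimum training length."""
    return wav_duration_seconds(path) >= minimum
```

Calling long_enough_for_training on a 90-second file would return False, a hint to keep talking a little longer before starting the training session. Python's wave module reads only WAV files, so an AIFF or AAC recording would need a different reader.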

Transcribing is equally simple. You hand your recording over to MacSpeech Scribe. It presents the text result very quickly (much more quickly than it was spoken originally; I'm not sure how that magic is performed), in a window with two panes. When you click on any part of the text in the first pane, possible corrections appear in the second pane. If the correction you want isn't there, you can edit a correction that is there. You then click a button to enter that correction in place of the original interpretation. The accuracy seems very high, especially for non-technical subjects. Vocabulary can be added manually, a word or phrase at a time, or by giving MacSpeech Scribe a text file to analyze.

Despite all this simplicity, the program has some bugs. For example, there's a checkbox to stop MacSpeech Scribe from checking online for a new version of the program every time it starts up, but your setting here is forgotten. Several times, too, I got mysterious error dialogs about not being able to find a needed file or folder, and had to quit the program and start it up again.

My biggest complaint is about the manual and online help. Nothing tells you what punctuation you're allowed to say, a serious omission. Beyond that, I confess, I have a dog in this fight: I wrote the original manual and online help for MacSpeech Dictate, and these have been edited badly to create the help for MacSpeech Scribe. Thus the Scribe manual starts out with some material that's true of Dictate but false and irrelevant for Scribe, and a careless global replacement turned my sentence "Dictate, don't talk" into "Scribe, don't talk." I wasn't paid or credited for this reuse of my work, and considering the nature of the result, perhaps that's just as well.

Still, I find it astonishing that a program like MacSpeech Scribe is even possible. You're up and running, with the program trained and ready to go, in just a few minutes; after that, you have your own personal transcription secretary and you're ready to dictate the Great American Novel while you're out for a walk in the woods.

MacSpeech Scribe costs $149.99, and requires an Intel-based Mac running Mac OS X 10.6 Snow Leopard. Audio files must be WAV, AIFF, or AAC, and should be as high quality as possible; you can dictate into your computer or into a digital recorder (including, according to the manual, an iPhone).
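
The format requirement lends itself to a quick check before you transfer files from a recorder. Here is a hedged Python sketch: the mapping from file extensions to the three formats is an assumption for illustration (the program's own detection rules aren't documented here), and a file's extension says nothing certain about its actual contents or quality.

```python
from pathlib import Path

# Assumed mapping from common file extensions to the formats listed above.
ACCEPTED_EXTENSIONS = {
    ".wav": "WAV",
    ".aif": "AIFF",
    ".aiff": "AIFF",
    ".m4a": "AAC",
    ".aac": "AAC",
}


def accepted_format(filename):
    """Return the format name if the extension looks acceptable, else None."""
    return ACCEPTED_EXTENSIONS.get(Path(filename).suffix.lower())
```

Under these assumptions, a file named like the sample recording, dictatedRecording.m4a, is reported as AAC, while an MP3 comes back as None and would need converting first.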

[1]: http://www.macspeech.com/pages.php?pID=181
[2]: http://www.macspeech.com/pages.php?pID=143
[3]: http://www.nuance.com/naturallyspeaking/products/editions/
[4]: http://www.zoom.co.jp/english/products/h2/
[5]: http://www.tidbits.com/resources/2010-05/dictatedRecording.m4a
[6]: http://www.tidbits.com/resources/2010-05/transcribedRecording.txt