Tweaking Text To Speech (TTS)
We're evaluating Text To Speech (TTS) as a way of quickly and economically adding 'voice' to courses. Mileage definitely varies on the quality of "natural voices". If you're not familiar with it, you can buy voices based on gender and language (even "American" versus "UK" English). Right now we're evaluating NeoSpeech's VoiceText, NextUp's TextAloud, and hopefully Loquendo's TTS software.
Anyone familiar with any of these TTS apps? Any suggestions for sites that compare the different packages, or that explain how to tweak the text to get the best inflections and pronunciations?
Here are the three we're evaluating at this time:
NextUp's TextAloud seems like the most practical and inexpensive option. It's currently $29.95 for the basic software plus $25 for two AT&T Natural Voices, so $54.95 total for a single license. It had some trouble with acronyms, but I could create and store custom pronunciations. It has a 30-day trial, but the trial doesn't include the AT&T Natural Voices, so it sounds very robotic. I'm considering buying it just to see how the AT&T voices sound.
NeoSpeech's VoiceText seems very simple, with little ability to tweak, and it's pricey. Even for us, as internal users adding no resale value to the licenses, it's 1500 per year for three licenses (pricing is based on intended usage as well as the number of licenses needed). The interface is simpler than that of the less expensive TextAloud, but doesn't seem as intuitive. They have a 30-day trial (a bit convoluted to install). I was able to create 33 fairly short WAV files, and could tweak the text to get decent-sounding audio. I'm not sold on it though, and certainly not at that price. I'm waiting to find out whether the "server" component, which is where pronunciation is edited, costs extra. My content contains chemical abbreviations and terms and lots of acronyms, so custom pronunciation is a must.
Loquendo's TTS demo on their website had by far the best pronunciation and inflection, and it even has emotional emphasizers and human-related sound effects (coughing, etc.). While I could tell it was an artificial voice, it was much more natural than the others. It also supports more languages, which is a definite plus for me. However, I'm still waiting on pricing details and on whether they have an evaluation version. The company is in Italy, so it's taking some time to get responses.
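Since custom pronunciation of acronyms and chemical abbreviations matters whichever package we end up with, one tool-independent option is to rewrite the text before it goes to the TTS engine. Here's a minimal Python sketch of that idea. It doesn't use any feature of TextAloud, VoiceText, or Loquendo; the table entries and the function name `apply_pronunciations` are made up for illustration, and the spoken forms would need tuning by ear for each voice.

```python
import re

# Hypothetical pronunciation table: written form -> text the voice should read.
# These entries are illustrative only, not taken from any TTS product.
PRONUNCIATIONS = {
    "TTS": "T T S",
    "NaCl": "sodium chloride",
    "WAV": "wave",
}


def apply_pronunciations(text, table=PRONUNCIATIONS):
    """Replace whole-word, case-sensitive matches of each key with its spoken form."""
    if not table:
        return text
    # Try longer keys first so a long term wins over a shorter one sharing its prefix.
    keys = sorted(table, key=len, reverse=True)
    # The lookarounds keep a key from matching inside a longer word (e.g. "WAVES").
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: table[m.group(1)], text)


if __name__ == "__main__":
    print(apply_pronunciations("Save the TTS output as WAV."))
```

The matching is case-sensitive on purpose, so "NaCl" is rewritten but an ordinary lowercase word that happens to spell the same letters is left alone. The rewritten text can then be pasted into whichever TTS app is being trialed.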