Resolving Text to Speech Issues
Every once in a while, a customer reports that Voice Elements crashed unexpectedly. When we take a closer look at the Event Viewer, we find that the faulting DLL belongs to the TTS vendor that they use. We have seen this issue with most of the commonly used TTS vendors that use SAPI (NeoSpeech, Cepstral, Loquendo, etc.). While we don't know the root cause of the issue, here's what we do know:
- Some TTS products experience this issue more regularly than others.
- We've studied the SAPI documentation to make sure that the calls that we make via SAPI are correct.
- We thought that the issue may have been with Voice Elements. However, we have created test applications that are completely outside of Voice Elements to get to the bottom of the issue, and have seen the same problem.
- The only TTS vendor that I haven't seen this issue occur with is Microsoft (although that doesn't necessarily mean that it's immune from the problem -- see the point above).
- There seems to be little rhyme or reason to the issues that we see, which is why we've been unable to make much progress with each vendor-specific TTS package, or with Microsoft (who created SAPI -- where we think the issue may be occurring).
- We've set up test instances that made tens of thousands of TTS calls in a short span without ever seeing the error, while a few thousand calls in a different test instance caused the TTS engine to crash. I've even seen a case where the error occurred at 3 AM, although no calls had been made since 9 PM the day before.
- We have found that this most often occurs with customers that use TTS heavily.
What should you do?
If you have experienced this issue, try to find out how often it occurs. As described above, some TTS vendors are better than others. If you see the issue occur regularly (e.g. every few thousand TTS calls), you may want to look at a different TTS vendor. For example, I saw one TTS vendor that crashed like clockwork after just a few minutes of full load on the TTS engine (using all TTS licenses).
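One rough way to put a number on "how often" is to divide the TTS calls handled over a period by the crashes logged in the Event Viewer for that same period. The sketch below is only an illustration: the function name and the sample counts are hypothetical, and you would supply the counts from your own logs.

```python
def tts_calls_per_crash(total_tts_calls, crash_count):
    """Return the average number of TTS calls between crashes.

    Returns None when no crash was observed in the period.
    """
    if total_tts_calls < 0 or crash_count < 0:
        raise ValueError("counts must not be negative")
    if crash_count == 0:
        return None
    return total_tts_calls / crash_count


# Hypothetical numbers: 45,000 TTS calls and 9 faulting-DLL entries in one week.
print(tts_calls_per_crash(45_000, 9))
```

A result in the low thousands matches the "every few thousand TTS calls" pattern described above and suggests that trying a different TTS vendor is worthwhile.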
Recommended TTS Vendors
Below are a few TTS vendors that seem to hold up pretty well. Please note that we have seen these TTS vendors crash before, but much less frequently than other TTS packages that we've seen:
We sell NeoSpeech licenses, and we've found their voice quality to be quite good. The pronunciation isn't perfect, but it is much better than that of the typical TTS vendor. NeoSpeech does have the issue described above, but it occurs much less frequently than with other TTS vendors that we've seen. Here is the web address if you want to try out their voices. If you are interested, we can send you pricing information: http://neospeech.com/
We also sell Cepstral licenses. I believe the pricing is pretty similar between NeoSpeech and Cepstral, but I find the NeoSpeech voices to be of a higher quality (completely subjective). You are welcome to give those a listen as well:
We've recently added support for the Microsoft Speech Platform voices. These are free. While the voice quality isn't as good as other vendors, the pronunciation is a bit more accurate than the others (this is because the speech is synthesized instead of using voice files). Also, these voices do not use SAPI, and we suspect that they won't have the same issue with intermittent crashes that we've seen from other vendors (this is a little hard to confirm, because different vendors seem to crash at very different rates -- but I haven't seen it occur yet).
Here is a link to some of the voices that you can listen to: http://voiceelements.com/tts/default.aspx
Because several TTS vendors seem to have this issue, we've come up with a workaround that works well for our customers:
We created a separate process that handles all of the TTS commands. Voice Elements will receive the commands and send a UDP packet to the separate process to perform the TTS action. What's nice about this TTS solution is that when it crashes, it takes down a separate process instead of causing all of your calls to terminate. It's also built so that if it crashes, Voice Elements will spawn a new instance of the process, so the worst case scenario is that you get an exception on the TTS commands that were placed while it was down (usually just one). Because we throw an exception, you can also re-send your TTS command so that you end up getting the TTS that you need. Also, you don't have to change any of your code to implement this.
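Because a failed TTS command surfaces as an exception, re-sending it can be wrapped in a small retry helper. The sketch below is written in Python purely to illustrate the pattern and is not the Voice Elements API: `TtsUnavailableError`, `speak_with_retry`, and the `send_tts` callable are hypothetical names standing in for whatever exception and TTS call your application actually uses.

```python
import time


class TtsUnavailableError(Exception):
    """Hypothetical stand-in for the exception raised while the TTS process is down."""


def speak_with_retry(send_tts, text, attempts=3, delay_seconds=0.5):
    """Call send_tts(text); if it fails, wait briefly and re-send.

    The pause gives the server time to spawn a new TTS process.
    Re-raises the last error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send_tts(text)
        except TtsUnavailableError as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

Since usually only the commands placed while the process was down fail, a small number of attempts with a short delay should be enough in most cases.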
How to set up the workaround
- Back up your Voice Elements and HMP Elements directories (just in case something doesn't work as expected).
- Make sure that you are running a version of Voice Elements that is more recent than March 1st of 2012. If you are unsure of where to find updates, please e-mail firstname.lastname@example.org
- Download HmpElementsTTS.zip and place it in your Voice Elements directory. You may download it here: http://download.voiceelements.com/Customer/HmpElementsTTS.zip
- Add the following configuration settings to your VoiceElementsServer.exe.config (in the CTI32NetLib section as shown below). You will want to replace the TTSIp with the IP address of the Voice Elements Server.
<CTI32NetLib.Properties.Settings>
  <setting name="TTSIp" serializeAs="String">
    <value>192.168.1.130</value>
  </setting>
  <setting name="TTSPort" serializeAs="String">
    <value>54332</value>
  </setting>
</CTI32NetLib.Properties.Settings>
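After editing the file, a quick sanity check can catch a mistyped IP address or port before you restart the server. The following is a minimal sketch in Python, assuming the config file is well-formed XML. The helper names `read_tts_settings` and `check_tts_settings` are hypothetical, and the script only inspects the two settings shown above.

```python
import ipaddress
import xml.etree.ElementTree as ET

SECTION = "CTI32NetLib.Properties.Settings"


def read_tts_settings(config_xml):
    """Collect name/value pairs from every CTI32NetLib settings section in the XML."""
    root = ET.fromstring(config_xml)
    settings = {}
    for section in root.iter(SECTION):
        for setting in section.findall("setting"):
            value = setting.findtext("value", default="").strip()
            settings[setting.get("name")] = value
    return settings


def check_tts_settings(settings):
    """Return a list of problems found in TTSIp and TTSPort (empty if none)."""
    problems = []
    try:
        ipaddress.ip_address(settings.get("TTSIp", ""))
    except ValueError:
        problems.append("TTSIp is missing or not a valid IP address")
    port = settings.get("TTSPort", "")
    if not (port.isdigit() and 1 <= int(port) <= 65535):
        problems.append("TTSPort is missing or not in the range 1-65535")
    return problems
```

To use it, read VoiceElementsServer.exe.config into a string, pass it to `read_tts_settings`, and hand the result to `check_tts_settings`. An empty list means both values look plausible; it does not confirm that the address is actually the Voice Elements Server.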