Author Topic: New TTS engines  (Read 12020 times)

Offline ericonvera

  • Sr. Newbie
  • *
  • Posts: 24
  • Karma: +0/-0
Re: New TTS engines
« Reply #15 on: January 19, 2014, 09:10:49 pm »
Quote from: SM2k
Some time ago, I built a TTS replacement using python's SimpleHTTPServer and OSX's say command, just in case google's service ever went away. If anybody's interested, I can find the code and post it.

I'm pretty sure that's the same idea behind the one that I got working from this site -> http://wolfpaulus.com/jounal/mac/ttsserver/ but I'm not sure exactly how he implemented it.

Offline lolodomo

  • Moderator
  • Master Member
  • *****
  • Posts: 3484
  • Karma: +74/-10
Re: New TTS engines
« Reply #16 on: January 20, 2014, 03:01:47 pm »
Quote from: lolodomo
I just tested the patched version by @ericonvera and it is working very well ... but you don't have the choice of the language.

Quote from: ericonvera
I just downloaded a French voice on my Mac. If you choose "Audrey" from the list of available voices in that admin panel I linked you, you'll get French support. I've set it up as the default just so you can hear it. Let me know if it works for you.

Yes, it works and the quality is really excellent, maybe the best quality I have heard! But the speech is just too fast and I cannot use any special characters (French accents). Difficult to find a long sentence without that :) Maybe there is a charset encoding for the text to set up on the server side?

Even if it is not a real problem (generally you need one language), can you confirm that with this solution we can have only one language supported at a time? Right now, whether I use en or fr for the language, I always get Audrey's voice.

Offline ericonvera

Re: New TTS engines
« Reply #17 on: January 20, 2014, 03:46:47 pm »
Quote from: lolodomo
the speech is just too fast

This can be changed through the admin panel (I think).

I'm not sure what our options are for the charset. Ideally, I'd like to get the TTSServer code on GitHub and modify it so that the voice can be set from the URL, add adjustments for speech speed, and see what needs to be done for encoding. I've emailed the original author to let him know about our application here. If I hear back from him, I'll ask about forking the project - otherwise I'll look into writing from scratch or using a start from @SM2k.

Offline lolodomo

Re: New TTS engines
« Reply #18 on: January 20, 2014, 05:17:26 pm »
Read chapter 4.4: you can apparently choose your voice, but you have to use an HTTP POST request.

Offline lolodomo

Re: New TTS engines
« Reply #19 on: January 20, 2014, 05:50:15 pm »
What about the Microsoft Translator service? An HTTP API exists with a speak feature? Any idea of the audio quality?
The service is free.

Offline ericonvera

Re: New TTS engines
« Reply #20 on: January 21, 2014, 01:52:16 pm »
Audio quality is decent. Certainly better than most free TTS engines. It looks like the same type of method we use for Google or the OSX one should work. The URL structure is as follows:

http://api.microsofttranslator.com/V2/Http.svc/Translate?

text=the text to speak&
from=en&
to=en&
contentType=textPlain&
appId=123

There's some more information about the method here -> http://msdn.microsoft.com/en-us/library/ff512421.aspx

That appId is something you need to get from Microsoft. There are instructions to do so here -> http://msdn.microsoft.com/en-us/library/hh454950.aspx
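
As a rough illustration of the URL structure above, here is a minimal Python 3 sketch that assembles the request with proper percent-encoding, so spaces and accented characters survive. The endpoint and parameter names are copied from this post and have not been checked against Microsoft's documentation; the function name build_request_url and the appId value are placeholders.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the post above (not verified against the official docs).
BASE_URL = "http://api.microsofttranslator.com/V2/Http.svc/Translate"


def build_request_url(text, app_id, lang="en"):
    """Assemble the query string from the post, percent-encoding every value."""
    params = {
        "text": text,
        "from": lang,
        "to": lang,
        "contentType": "textPlain",  # value as given in the post
        "appId": app_id,             # placeholder; a real one comes from Microsoft
    }
    # urlencode encodes spaces as '+' and non-ASCII characters as UTF-8 bytes.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("the text to speak", "123"))
```

Encoding the text this way is also what keeps accented characters intact on the way to the server.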

Offline SM2k

  • Full Member
  • ***
  • Posts: 179
  • Karma: +4/-0
Re: New TTS engines
« Reply #21 on: January 22, 2014, 11:09:39 am »
Quote from: ericonvera
- otherwise I'll look into writing from scratch or using a start from @SM2k.

Sorry for the delay. It took me some time to locate the file. :) This code is horribly direct and was designed to simply mimic the pieces of Google's translate server that were used by the Sonos plugin. One note, I chose to use the Samantha voice (the new voice of Siri) as I think it sounds much better than other options. Depending on how new your OS is, you might need to download that voice: http://osxdaily.com/2012/06/04/add-voice-of-siri-to-mac-os-x/

Obviously, to open port 80 you'll need to execute this script as root, which isn't a terrific idea of course, but this was for testing. One trick I used to test this without modifying the Sonos plugin code was to add a local IP for translate.google.com to Vera's /etc/hosts file. I wouldn't recommend that for anything but testing, of course.

Final thought: you'll need lame installed on your Mac as well. I used MacPorts for that, but installers for OS X appear to be readily available.

Code: [Select]
#!/usr/bin/python
from BaseHTTPServer import BaseHTTPRequestHandler,HTTPServer
import os
import urlparse
import tempfile
import subprocess
import shutil

PORT_NUMBER = 80

#http://translate.google.com/translate_tts?tl=%s&q=%s
class voiceProxy(BaseHTTPRequestHandler):

    #Handler for the GET requests
    def do_GET(self):
        parts = urlparse.urlparse(self.path)
        text = urlparse.parse_qs(parts.query).get('q', [''])[0]
        tmpDir = tempfile.mkdtemp()
        try:
            aiff_fn = os.path.join(tmpDir, 'translate_tts.aiff')
            mp3_fn = os.path.join(tmpDir, 'translate_tts.mp3')

            # create an aiff file of the submitted text
            p = subprocess.Popen(['say', '-v', 'Samantha', '-o', aiff_fn],
                stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
            p.stdin.write(text)
            out, err = p.communicate()
            status = p.wait()

            # translate the file to mp3
            p = subprocess.Popen(['lame', '-h', '-m', 'm', '-b', '64', aiff_fn, mp3_fn],
                stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
            out, err = p.communicate()
            status = p.wait()

            # send it back (binary mode, since the mp3 is not text)
            with open(mp3_fn, 'rb') as f:
                self.send_response(200)
                self.send_header('Content-type', "audio/mpeg")
                self.end_headers()
                self.wfile.write(f.read())
        finally:
            shutil.rmtree(tmpDir, ignore_errors = True)

try:
    #Create a web server and define the handler to manage the
    #incoming request
    server = HTTPServer(('', PORT_NUMBER), voiceProxy)
    print 'Started httpserver on port ' , PORT_NUMBER

    #Wait forever for incoming http requests
    server.serve_forever()

except KeyboardInterrupt:
    print '^C received, shutting down the web server'
    server.socket.close()
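
The wishes raised earlier in the thread (choosing the voice from the URL, slowing the speech down, handling accented text) could be layered onto a handler like the one above. What follows is only a sketch under assumptions, written for Python 3 rather than the Python 2 of the script: the helper name build_say_command, the language-to-voice table, and the rate query parameter are all hypothetical. The tl and q parameters come from the Google URL in the comment above, and -v, -r, and -o are options of the OS X say command.

```python
from urllib.parse import urlparse, parse_qs

# Language code -> OS X voice. Samantha and Audrey are the voices named in
# this thread; the mapping itself is an assumption for illustration.
VOICES = {"en": "Samantha", "fr": "Audrey"}
DEFAULT_VOICE = "Samantha"


def build_say_command(path, aiff_fn):
    """Return (argv, text) for a request path like /translate_tts?tl=fr&q=..."""
    # parse_qs percent-decodes values as UTF-8, so accents arrive intact.
    query = parse_qs(urlparse(path).query)
    text = query.get("q", [""])[0]
    lang = query.get("tl", ["en"])[0]
    argv = ["say", "-v", VOICES.get(lang, DEFAULT_VOICE), "-o", aiff_fn]
    rate = query.get("rate", [""])[0]  # hypothetical parameter: words per minute
    if rate.isdigit():
        argv += ["-r", rate]
    return argv, text


if __name__ == "__main__":
    print(build_say_command("/translate_tts?tl=fr&rate=150&q=%C3%A9t%C3%A9", "/tmp/out.aiff"))
```

The text is returned separately so it can still be written to the say process's stdin as in the script above; in Python 3 it would need to be encoded to bytes first, and UTF-8 is the assumed encoding here.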

Offline lolodomo

Re: New TTS engines
« Reply #22 on: January 23, 2014, 06:55:30 am »
@ericonvera: shall I keep the lang parameter in the URL? This parameter is not mentioned here: http://wolfpaulus.com/jounal/mac/ttsserver/


Here are my plans:
1 - move all the TTS stuff, including multiple engines, into a library (for easy re-use in the DLNA plugin)
2 - add a new "engine" parameter to the Say action (to let the user select the engine) + add a new variable to define the default engine
3 - move the UI for TTS (setup and playback) into a new tab

I will try to finish (commit) the work on points 1 and 2 today or tomorrow.
Point 3 will be handled later.
« Last Edit: January 23, 2014, 06:57:38 am by lolodomo »

Offline ericonvera

Re: New TTS engines
« Reply #23 on: January 23, 2014, 08:38:13 am »
Quote from: lolodomo
@ericonvera: shall I keep the lang parameter in the URL? This parameter is not mentioned here: http://wolfpaulus.com/jounal/mac/ttsserver/

You can remove it. I had put it in there in hopes of modifying the TTSServer to accept it and choose a voice based on language but that certainly hasn't happened yet.

I'm glad to hear that this is making it into the general release soon.

Offline SM2k

Re: New TTS engines
« Reply #24 on: January 23, 2014, 11:00:20 am »
I recall there being something like a 100-character limit inside the Sonos plugin (well, the plugin breaks speech into chunks that large). I think that might have been imposed by Google. I think other TTS options don't necessarily impose a limit (I know the simple engine I posted doesn't). It would be nice to expose whether each engine chunks text and, if so, at how many characters, because I artificially pad larger messages I send to the Sonos plugin with spaces so that longer messages don't oddly pause mid-sentence.

Offline ericonvera

Re: New TTS engines
« Reply #25 on: January 23, 2014, 11:10:46 am »
Correct. Google has a limit of 100 characters. The other service I tested doesn't have a limit so the plugin doesn't break it into chunks. There could probably be some smarter logic for Google that would use punctuation to break up long messages instead of just spaces like it does now.

Offline SM2k

Re: New TTS engines
« Reply #26 on: January 23, 2014, 11:29:31 am »
Ah! If the plugin were able to break on punctuation then I wouldn't need to artificially pad large messages, nor would I even need to know about how each engine chunks text. That approach would work even better.

Offline ericonvera

Re: New TTS engines
« Reply #27 on: January 23, 2014, 01:18:53 pm »
@lolodomo, this should be as simple as the following:

Code: [Select]
-- Search the reversed chunk so the first match is the last occurrence in the original.
local chunk = string.reverse(string.sub(remaining, 1, cutSize+1))
-- Plain find (4th argument true): otherwise "." is a pattern that matches any character.
local pos = string.find(chunk, ".", 1, true)
if (pos == nil) then
  pos = string.find(chunk, ",", 1, true)
  if (pos == nil) then
    pos = string.find(chunk, " ", 1, true)
  end
end

This way it first looks for a period, then a comma, then a space if it can't find either. This would go just before the line reading

Code: [Select]
if (pos ~= nil) then
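
To make the period, then comma, then space preference concrete, here is a standalone sketch of the whole chunking loop. It is written in Python rather than the plugin's Lua and is not the plugin's actual code; the function name chunk_text and the hard-cut fallback are assumptions, and the default limit of 100 is the Google limit mentioned above.

```python
def chunk_text(text, limit=100):
    """Split text into pieces of at most `limit` characters.

    Within each window, cut after the last period if there is one, else after
    the last comma, else at the last space; hard-cut if none is found.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = -1
        for sep in (".", ",", " "):
            cut = window.rfind(sep)
            if cut > 0:
                break
        if cut <= 0:
            cut = limit - 1  # no usable separator: cut exactly at the limit
        chunks.append(remaining[:cut + 1].strip())
        remaining = remaining[cut + 1:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("First sentence. Second part, with a comma and more words", 30):
        print(piece)
```

Because every chunk ends at punctuation where possible, a caller would not need to pad messages with spaces to control where the pauses fall.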

Offline AgileHumor

  • Hero Member
  • *****
  • Posts: 984
  • Karma: +51/-27
  • KISS
Re: New TTS engines
« Reply #28 on: January 23, 2014, 02:56:41 pm »
I don't really use a TTS engine dynamically. Instead, I use mControl to play static audio files when certain luup devices are on/off/armed/tripped. mControl running on Windows is nice in that it also integrates Vera and Media Center.

Downside is the price and having to duplicate some logic in both places...as well as the price.

I create and download the WAV files here:
http://www2.research.att.com/~ttsweb/tts/demo.php

Offline SM2k

Re: New TTS engines
« Reply #29 on: January 23, 2014, 03:35:03 pm »
http://www.assistiveware.com/product/infovox-ivox I haven't downloaded any of these voices, but some of them sound fairly realistic. I *think* they integrate directly with OS X system voices (per their claim that they're system wide). If that's true they should work with any of the OS X TTS solutions that have been discussed. Looks like voices cost 20 to 30 bucks each and you can trial them for a month.