Advanced => Plugins & Plugin Development => Programming => Sonos Plugin => Topic started by: lolodomo on January 18, 2014, 12:16:17 pm

Title: New TTS engines
Post by: lolodomo on January 18, 2014, 12:16:17 pm
I am opening a new topic after a short private discussion with one of you.

Apparently there are alternatives to the Google TTS engine, such as Cepstral or the OS X TTS engine, that we could easily use in the plugin (in place of Google).

First, I changed the code a little to make it easier to add an alternative TTS engine (mainly by moving the Google-specific code into a function).

I hope we can have good news soon.
Title: Re: New TTS engines
Post by: flyveleder on January 18, 2014, 01:20:11 pm
One of the best engines I have come across is this one : http://demo.acapela-group.com/

Sounds perfect in my language (Danish)

/Martin
Title: Re: New TTS engines
Post by: lolodomo on January 18, 2014, 01:28:04 pm
Of course, we need an engine that offers an API, if possible through a simple HTTP request.
Title: Re: New TTS engines
Post by: lolodomo on January 18, 2014, 01:37:07 pm
Quote
One of the best engines I have come across is this one : http://demo.acapela-group.com/

Sounds perfect in my language (Danish)

Oh yes, not too bad in French too 8)
Title: Re: New TTS engines
Post by: lolodomo on January 18, 2014, 01:40:07 pm
Google TTS is a free service available on the Internet.

We could imagine using a TTS engine that is not directly on the Internet but runs on a server in our LAN.
I think this is what will be explained, for example, with the OS X TTS engine.
What we need here is an HTTP API to ask the engine to produce the audio file.
Title: Re: New TTS engines
Post by: flyveleder on January 18, 2014, 01:46:44 pm
Since Vera runs Linux, maybe it could be installed straight onto the device itself? ;-) ...
Title: Re: New TTS engines
Post by: lolodomo on January 18, 2014, 01:48:06 pm
Quote
One of the best engines I have come across is this one : http://demo.acapela-group.com/

A cloud service seems to be available.
I imagine it is not free ?
Title: Re: New TTS engines
Post by: ericonvera on January 19, 2014, 01:43:09 pm
I've succeeded in implementing this using the TTS Server provided here -> http://wolfpaulus.com/jounal/mac/ttsserver/

The big benefits of using this are that it doesn't require an internet connection, it will always exist (meaning Google can't just pull the plug), it's customizable (you can select from many voices and adjust the speech rate), and it's just cool.

I have an implementation file for @lolodomo to review, as I'm not sure of the best way to implement this without my hardcoded values pointing at my server.
Title: Re: New TTS engines
Post by: lolodomo on January 19, 2014, 02:26:21 pm
Quote
I've succeeded in implementing this using the TTS Server provided here -> http://wolfpaulus.com/jounal/mac/ttsserver/

That's really great news. 8)

Quote
The big benefits of using this are that it doesn't require an internet connection, it will always exist (meaning google can't just pull the plug), it's customizeable (you can select from many voices and adjust the speech rate), and it's just cool.

I am curious to know whether there is a TTS engine, free if possible, that could run on a Raspberry Pi, for example?

Quote
I have an implementation file for @lolodomo to review as I'm not sure of the best way to implement this in a way without my hardcoded values pointing at my server.

I have to see your code, but I think a simple variable defining the URL of the TTS server would probably solve this problem.
Title: Re: New TTS engines
Post by: ericonvera on January 19, 2014, 02:43:50 pm
Cepstral works on a Raspberry Pi. They have a page on their website talking about this. It's quite inexpensive from what I understand. The API could probably be put together in a few lines of Python.

Edit: here's the link -> http://www.cepstral.com/en/raspberrypi
Title: Re: New TTS engines
Post by: lolodomo on January 19, 2014, 03:09:17 pm
Quote
Cepstral works on a raspberry pi. They have a page on their website talking about this. It's quite inexpensive from what I understand. The API could probably be put together in a few lines of python.

I tried the Cepstral demo voices and I am not fully convinced by the quality! Plus, there are no French (France) voices :)
Title: Re: New TTS engines
Post by: lolodomo on January 19, 2014, 03:57:40 pm
I just tested the patched version by @ericonvera and it is working very well ... but you don't have the choice of the language.

The way to request TTS is essentially the same as what we do with Google, but in this case you point to a personal TTS server rather than Google.

I will make the required changes and commit them very soon. 8)


Integration of other TTS engines is of course welcome.
Title: Re: New TTS engines
Post by: lolodomo on January 19, 2014, 04:06:39 pm
http://wolfpaulus.com/jounal/mac/ttsserver/

This article has a link to LumenVox: http://www.lumenvox.com/products/tts/
You can try the TTS there and it works pretty well, but the prices seem very high. Maybe not a solution for personal use.
Title: Re: New TTS engines
Post by: ericonvera on January 19, 2014, 04:22:05 pm
Quote
I just tested patched version by @ericonvera and it is working very well ... but you don't have the choice of the language.

I just downloaded a French voice on my Mac. If you choose "Audrey" from the list of available voices in the admin panel I linked for you, you'll get French support. I've set it up as the default just so you can hear it. Let me know if it works for you.
Title: Re: New TTS engines
Post by: SM2k on January 19, 2014, 08:58:06 pm
Some time ago, I built a TTS replacement using Python's SimpleHTTPServer and OS X's say command, just in case Google's service ever went away. If anybody's interested, I can find the code and post it.
Title: Re: New TTS engines
Post by: ericonvera on January 19, 2014, 09:10:49 pm
Quote
Some time ago, I built a TTS replacement using python's SimpleHTTPServer and OSX's say command, just in case google's service ever went away. If anybody's interested, I can find the code and post it.

I'm pretty sure that's the same idea behind the one that I got working from this site -> http://wolfpaulus.com/jounal/mac/ttsserver/ but I'm not sure exactly how he implemented it.
Title: Re: New TTS engines
Post by: lolodomo on January 20, 2014, 03:01:47 pm
Quote
I just tested patched version by @ericonvera and it is working very well ... but you don't have the choice of the language.

Quote
I just downloaded a French voice on my mac. If you choose "Audrey" from the list of available voices in that admin panel I linked you, you'll get French support. I've set it up as the default just so you can hear it. Let me know if it works for you.

Yes, it works and the quality is really excellent, maybe the best I have heard! But the speech is just too fast and I cannot use any special characters (French accents). It is difficult to find a long sentence without them :) Maybe there is a charset encoding for the text to set up on the server side?

Even if it is not a real problem (generally you need only one language), can you confirm that with this solution only one language can be supported at a time? Right now, whether I use en or fr for the language, I always get Audrey's voice.
Title: Re: New TTS engines
Post by: ericonvera on January 20, 2014, 03:46:47 pm
Quote
the speech is just too fast

This can be changed through the admin panel (I think).

I'm not sure what our options are for the charset. Ideally, I'd like to get the TTSServer code on GitHub and modify it so that the voice can be set from the URL, add adjustments for speech speed, and see what needs to be done for encoding. I've emailed the original author to let him know about our application here. If I hear back from him, I'll ask about forking the project - otherwise I'll look into writing from scratch or using a start from @SM2k.
Title: Re: New TTS engines
Post by: lolodomo on January 20, 2014, 05:17:26 pm
Read chapter 4.4: apparently you can choose your voice, but you have to use an HTTP POST request.
Title: Re: New TTS engines
Post by: lolodomo on January 20, 2014, 05:50:15 pm
What about the Microsoft Translator service? An HTTP API exists with a speak feature. Any idea of the audio quality?
The service is free.
Title: Re: New TTS engines
Post by: ericonvera on January 21, 2014, 01:52:16 pm
Quote
What about the Microsoft translator service ? A HTTP API exists with a speak feature ? Any idea of the audio quality?
The service is free.

Audio quality is decent. Certainly better than most free TTS engines. It looks like the same type of method we use for Google or the OSX one should work. The URL structure is as follows:

http://api.microsofttranslator.com/V2/Http.svc/Translate?

text=the text to speak&
from=en&
to=en&
contentType=textPlain&
appId=123

There's some more information about the method here -> http://msdn.microsoft.com/en-us/library/ff512421.aspx

That appId is something you need to get from Microsoft. There are instructions to do so here -> http://msdn.microsoft.com/en-us/library/hh454950.aspx
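
As a rough illustration of the URL structure above, here is a minimal Python sketch that assembles the request with the text properly URL-encoded. The parameter names and values are copied from the list above; the function name `build_translator_url` and the appId value are placeholders, and the exact parameters should be checked against Microsoft's documentation before relying on them.

```python
from urllib.parse import urlencode

# Endpoint exactly as quoted in the post above.
BASE_URL = "http://api.microsofttranslator.com/V2/Http.svc/Translate"


def build_translator_url(text, app_id, source="en", target="en"):
    """Return the request URL with every parameter URL-encoded.

    build_translator_url is a made-up name for this sketch; app_id must
    be the value obtained from Microsoft.
    """
    params = {
        "text": text,
        "from": source,
        "to": target,
        "contentType": "textPlain",  # value as given in the post, not verified
        "appId": app_id,
    }
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_translator_url("the text to speak", "123"))
```

The point of going through `urlencode` is that spaces and accented characters in the text are escaped instead of breaking the URL.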
Title: Re: New TTS engines
Post by: SM2k on January 22, 2014, 11:09:39 am
Quote
- otherwise I'll look into writing from scratch or using a start from @SM2k.

Sorry for the delay. It took me some time to locate the file. :) This code is horribly direct and was designed to simply mimic the pieces of Google's translate server that were used by the Sonos plugin. One note, I chose to use the Samantha voice (the new voice of Siri) as I think it sounds much better than other options. Depending on how new your OS is, you might need to download that voice: http://osxdaily.com/2012/06/04/add-voice-of-siri-to-mac-os-x/

Obviously to open port 80, you'll need to execute this script as root, which isn't a terrific idea of course--but this was for testing. One trick I used to test this without modification to the Sonos plugin code was to add a local IP for translate.google.com to vera's /etc/hosts file. I wouldn't recommend that for anything but testing of course.

Final thought: you'll need lame installed on your Mac as well. I used MacPorts for that, but installers for OS X appear to be readily available.

Code: [Select]
#!/usr/bin/env python3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
import os
import tempfile
import subprocess
import shutil

PORT_NUMBER = 80

# Mimics http://translate.google.com/translate_tts?tl=%s&q=%s
class voiceProxy(BaseHTTPRequestHandler):

    # Handler for the GET requests
    def do_GET(self):
        parts = urlparse(self.path)
        text = parse_qs(parts.query).get('q', [''])[0]
        tmpDir = tempfile.mkdtemp()
        try:
            aiff_fn = os.path.join(tmpDir, 'translate_tts.aiff')
            mp3_fn = os.path.join(tmpDir, 'translate_tts.mp3')

            # create an aiff file of the submitted text (say reads it from stdin)
            subprocess.run(['say', '-v', 'Samantha', '-o', aiff_fn],
                input=text.encode('utf-8'),
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)

            # convert the file to mp3
            subprocess.run(['lame', '-h', '-m', 'm', '-b', '64', aiff_fn, mp3_fn],
                stdin=subprocess.DEVNULL,
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)

            # send it back (the mp3 must be read in binary mode)
            with open(mp3_fn, 'rb') as f:
                data = f.read()
            self.send_response(200)
            self.send_header('Content-type', 'audio/mpeg')
            self.end_headers()
            self.wfile.write(data)
        finally:
            shutil.rmtree(tmpDir, ignore_errors=True)

# Create a web server and define the handler to manage the
# incoming requests
server = HTTPServer(('', PORT_NUMBER), voiceProxy)
print('Started httpserver on port', PORT_NUMBER)

try:
    # Wait forever for incoming http requests
    server.serve_forever()
except KeyboardInterrupt:
    print('^C received, shutting down the web server')
    server.server_close()
Title: Re: New TTS engines
Post by: lolodomo on January 23, 2014, 06:55:30 am
@ericonvera: shall I keep the lang parameter in the URL ? This parameter is not mentioned here: http://wolfpaulus.com/jounal/mac/ttsserver/


Here are my plans:
1 - move all the TTS stuff, including multiple engines, into a library (for easy re-use in the DLNA plugin)
2 - add a new "engine" parameter to the Say action (to let the user select the engine) + add a new variable to define the default engine
3 - move the UI for TTS (setup and playback) to a new tab

I will try to finish (commit) the work for points 1 and 2 today or tomorrow.
Point 3 will be handled later.
Title: Re: New TTS engines
Post by: ericonvera on January 23, 2014, 08:38:13 am
Quote
@ericonvera: shall I keep the lang parameter in the URL ? This parameter is not mentioned here: http://wolfpaulus.com/jounal/mac/ttsserver/

You can remove it. I had put it in there in hopes of modifying the TTSServer to accept it and choose a voice based on language but that certainly hasn't happened yet.

I'm glad to hear that this is making it into the general release soon.
Title: Re: New TTS engines
Post by: SM2k on January 23, 2014, 11:00:20 am
I recall there being something like a 100 character limit inside the Sonos plugin (well the plugin breaks speech into chunks that large). I think that might have been imposed by Google. I think other TTS options don't necessarily impose a limit (I know the simple engine I posted doesn't). It would be nice to expose if and how many characters each engine chunks text into, because I artificially pad spaces into larger messages I send to the Sonos plugin so that longer messages don't oddly pause mid-sentence.
Title: Re: New TTS engines
Post by: ericonvera on January 23, 2014, 11:10:46 am
Quote
I recall there being something like a 100 character limit inside the Sonos plugin (well the plugin breaks speech into chunks that large). I think that might have been imposed by Google. I think other TTS options don't necessarily impose a limit (I know the simple engine I posted doesn't). It would be nice to expose if and how many characters each engine chunks text into, because I artificially pad spaces into larger messages I send to the Sonos plugin so that longer messages don't oddly pause mid-sentence.

Correct. Google has a limit of 100 characters. The other service I tested doesn't have a limit so the plugin doesn't break it into chunks. There could probably be some smarter logic for Google that would use punctuation to break up long messages instead of just spaces like it does now.

Title: Re: New TTS engines
Post by: SM2k on January 23, 2014, 11:29:31 am
Quote
Correct. Google has a limit of 100 characters. The other service I tested doesn't have a limit so the plugin doesn't break it into chunks. There could probably be some smarter logic for Google that would use punctuation to break up long messages instead of just spaces like it does now.

Ah! If the plugin were able to break on punctuation then I wouldn't need to artificially pad large messages, nor would I even need to know about how each engine chunks text. That approach would work even better.
Title: Re: New TTS engines
Post by: ericonvera on January 23, 2014, 01:18:53 pm
Quote
Correct. Google has a limit of 100 characters. The other service I tested doesn't have a limit so the plugin doesn't break it into chunks. There could probably be some smarter logic for Google that would use punctuation to break up long messages instead of just spaces like it does now.

Quote
Ah! If the plugin were able to break on punctuation then I wouldn't need to artificially pad large messages, nor would I even need to know about how each engine chunks text. That approach would work even better.

@lolodomo, this should be as simple as the following:

Code: [Select]
local chunk = string.reverse(string.sub(remaining, 1, cutSize+1))
-- plain search (4th argument true): otherwise "." is a pattern matching any character
local pos = string.find(chunk, ".", 1, true)
if (pos == nil) then
  pos = string.find(chunk, ",", 1, true)
  if (pos == nil) then
    pos = string.find(chunk, " ", 1, true)
  end
end

This way it first looks for a period, then a comma, then a space if it can't find either. This would go just before the line reading

Code: [Select]
if (pos ~= nil) then
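
To illustrate the idea outside the plugin (which is written in Lua), here is a small Python sketch of the same fallback order: period first, then comma, then space. The function name `chunk_text` is made up for this example, and the 100-character default reflects the Google limit mentioned earlier in the thread; the plugin's real splitting code may differ.

```python
def chunk_text(text, limit=100):
    """Split text into pieces of at most `limit` characters.

    Each piece ends at the last period inside the limit if there is one,
    otherwise at the last comma, otherwise at the last space. If none of
    these is found, the text is cut at the limit.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        for sep in (".", ",", " "):
            cut = window.rfind(sep)
            if cut > 0:
                break
        else:
            cut = limit - 1  # no separator at all: hard cut
        chunks.append(remaining[:cut + 1].strip())
        remaining = remaining[cut + 1:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks


if __name__ == "__main__":
    message = "Hello world. This is a test, with commas and more words"
    for piece in chunk_text(message, limit=20):
        print(repr(piece))
```

With this approach a caller would not need to pad messages with spaces, since pauses fall on sentence or clause boundaries whenever one is available.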
Title: Re: New TTS engines
Post by: AgileHumor on January 23, 2014, 02:56:41 pm
I don't really use a TTS engine dynamically. Instead, I use mControl to play static audio files when certain luup devices are on/off/armed/tripped. mControl running on Windows is nice in that it also integrates Vera and Media Center.

Downside is the price and having to duplicate some logic in both places...as well as the price.

I create and download the WAV files here:
http://www2.research.att.com/~ttsweb/tts/demo.php
Title: Re: New TTS engines
Post by: SM2k on January 23, 2014, 03:35:03 pm
http://www.assistiveware.com/product/infovox-ivox I haven't downloaded any of these voices, but some of them sound fairly realistic. I *think* they integrate directly with OS X system voices (per their claim that they're system wide). If that's true they should work with any of the OS X TTS solutions that have been discussed. Looks like voices cost 20 to 30 bucks each and you can trial them for a month.
Title: Re: New TTS engines
Post by: lolodomo on January 23, 2014, 06:37:13 pm
@ericonvera : I have committed (in the trunk). It is a first step; improvements are possible.
It only covers point 1, which I mentioned earlier in the day.

New variables:
- DefaultEngineTTS: use either GOOGLE or OSX_TTS_SERVER
- OSXTTSServerURL: set the URL of your personal TTS server, something like http://myserver.org:12345

It is working with the two engines.
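
A rough Python sketch of how these two variables could drive engine selection (the plugin itself is Lua, so this only shows the logic). It assumes the personal server answers the same `translate_tts?tl=...&q=...` request as Google, which is what SM2k's script earlier in the thread mimics; the real TTSServer request format may differ. The function name `tts_request_url` and the dictionary of variables are hypothetical.

```python
from urllib.parse import urlencode

# Google endpoint as it appears in the comment of SM2k's script.
GOOGLE_BASE_URL = "http://translate.google.com"


def tts_request_url(variables, text, language):
    """Build the request URL for the engine named in DefaultEngineTTS.

    `variables` stands in for the plugin's device variables.
    """
    engine = variables.get("DefaultEngineTTS", "GOOGLE")
    if engine == "GOOGLE":
        base = GOOGLE_BASE_URL
    elif engine == "OSX_TTS_SERVER":
        # Assumption: the personal server accepts a Google-style request.
        base = variables["OSXTTSServerURL"].rstrip("/")
    else:
        raise ValueError("unknown TTS engine: " + engine)
    return base + "/translate_tts?" + urlencode({"tl": language, "q": text})


if __name__ == "__main__":
    settings = {
        "DefaultEngineTTS": "OSX_TTS_SERVER",
        "OSXTTSServerURL": "http://myserver.org:12345",
    }
    print(tts_request_url(settings, "Hello world", "en"))
```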
Title: Re: New TTS engines
Post by: ericonvera on January 23, 2014, 07:04:50 pm
@lolodomo that is great news

Title: Re: New TTS engines
Post by: SM2k on January 25, 2014, 03:47:50 pm
Quote
@ericonvera : I have committed (in the trunk), it is a first step, improvments are possible.
It only covers the point 1 I mentioned earlier in the day.

New variables:
- DefaultEngineTTS: use either GOOGLE or OSX_TTS_SERVER
- OSXTTSServerURL: set the URL of your personal TTS server, something like http://myserver.org:12345

It is working with the two engines.

I've glanced at trunk, and there are a lot more files than there used to be.

If I wanted to smoke-jump and install from trunk, I assume I'll need everything in the services directory in addition to roughly what the wiki says for beta 2. It looks like S_SonosAVTransport1.xml was renamed to S_AVTransport1.xml and likewise for S_SonosGroupRenderingControl1.xml -> S_RenderingControl1.xml, correct?

Could I safely remove the beta 2 files after uploading everything from trunk?
Title: Re: New TTS engines
Post by: SM2k on January 25, 2014, 04:13:12 pm
Never mind! I realized the comments on the files in the services directory were along the lines of "hide stuff that isn't ready/doesn't need to be dealt with". I went ahead and installed the files and have text to speech coming from my Mac now. :D
Title: Re: New TTS engines
Post by: allmoney.ws on January 31, 2014, 06:37:59 pm
Quote
One of the best engines I have come across is this one : http://demo.acapela-group.com/

Sounds perfect in my language (Danish)

Quote
Oh yes, not too bad in French too 8)

The Russian voice is better than Google TTS ;)
Title: Re: New TTS engines
Post by: flyveleder on February 12, 2014, 06:24:05 am
Is anyone working on how to get Acapela TTS working with the Sonos Plugin ?
Title: Re: New TTS engines
Post by: lolodomo on February 12, 2014, 08:36:05 am
Quote
Is anyone working on how to get Acapela TTS working with the Sonos Plugin ?

If you can provide a Lua function that takes the text and language as input parameters and returns the URL of the produced (local) audio file and its duration (in seconds), I will gladly add it to the plugin and more generally to the TTS library.

First you need to know what kind of API is available. If an HTTP API is available, it should be doable relatively easily.
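
To make that contract concrete, here is a minimal sketch in Python (the real function would have to be Lua; this only illustrates the shape). Every name here is hypothetical, including `register_engine`, `say`, and the placeholder engine with its made-up URL and duration guess. Nothing in it reflects an actual Acapela API.

```python
from typing import Callable, Dict, Tuple

# Contract described above: (text, language) -> (audio file URL, duration in seconds)
TtsEngine = Callable[[str, str], Tuple[str, float]]

ENGINES: Dict[str, TtsEngine] = {}


def register_engine(name: str, engine: TtsEngine) -> None:
    """Make an engine selectable by name."""
    ENGINES[name] = engine


def say(text: str, language: str, engine_name: str) -> Tuple[str, float]:
    """Look up the engine and return (audio URL, duration in seconds)."""
    if engine_name not in ENGINES:
        raise ValueError("unknown TTS engine: " + engine_name)
    return ENGINES[engine_name](text, language)


def placeholder_engine(text: str, language: str) -> Tuple[str, float]:
    # Hypothetical stand-in: a real engine would call its HTTP API, store the
    # audio where the player can fetch it, and measure the real duration.
    url = "http://192.168.1.10/tts/say_" + language + ".mp3"  # made-up address
    duration = 0.5 * len(text.split())  # crude guess: half a second per word
    return url, duration


register_engine("PLACEHOLDER", placeholder_engine)

if __name__ == "__main__":
    print(say("hello there", "en", "PLACEHOLDER"))
```

Keeping every engine behind the same two inputs and two outputs is what would let a new one be added without touching the playback code.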
Title: Re: New TTS engines
Post by: flyveleder on February 12, 2014, 09:00:55 am
I can provide you with exactly nothing :-) I don't have a clue about Lua programming, nor do I know whether Acapela provides a public API.

My question was merely if someone was looking into it; Otherwise I will lower my expectations ;-)

(Google TTS is doable - but far from perfect).

Thanks,
Martin