
Topic: Plugin 2.1 Beta - Text To Speech Support

Offline JoeyD

  • Sr. Member
  • ****
  • Posts: 410
  • Karma: +36/-5
Plugin 2.1 Beta - Text To Speech Support
« on: February 11, 2015, 10:38:22 am »
Howdy folks,

Version 2.1 (beta) is ready for consumption.  This release includes text-to-speech (TTS) support.  This is supposed to work as follows:

1) Your squeezebox player may be doing nothing (be idle) or may be playing something.
2) When a command is sent to say something at a given volume...
     a) The player starts saying the message at the volume specified
     b) As the message is being said, the player will display a two line message (if the message is supplied and if your player supports it)
     c) When finished playing the message, the player will revert to its prior state.  (Continue playing the track where it left off, reset the volume, etc.)

Code Structure Changes:
  • Addition of L_SqueezeboxTTS.lua file.  This is essentially a re-use (with permission) of lolodomo's library that is used in the Sonos and DLNA plug-ins.

Usability / Cosmetic Enhancements:
  • The Squeezebox Server Device now has UI to support the setting of defaults related to TTS.
  • The squeezebox Player Devices support a new action: "Say" (documented below)
  • The Squeezebox Player Device(s) now has UI support to test the Say action


To install the beta, (on either UI5 or UI7):

  • Please upgrade to the official 2.0 version from the app store if you have not done so already.  Make sure your server device and one or more player devices are functioning properly.
  • Download the attached zip, and upload all of the included files to Vera (Apps-->Develop Apps-->luup files). Make sure the check-box for "restart..." has a check mark in it.
  • After the luup restart, refresh your browser (press F5).

Set-up:
The text-to-speech engine supports Google's "non official" API, and the OSX TTS engine.  I have exposed the same default settings that are exposed in the Sonos and DLNA plugins.  You access the default settings by going to the Squeezebox Server device, and going into the "Player Control" tab.  (Wrench in UI5, or the ">" button in UI7).

Here are the settings.  They will have defaults automatically populated (Language=en, Engine=GOOGLE, google URL=http://translate.google.com)  when the plug-in first installs.
  • Language:  This is the language code. In theory it should dictate the voice that is used and any dialect-specific inflections.  I have tried some other dialects (like en-UK and en-AU), and they seem to work as intended.  Other language codes are referenced here that you may want to try....your mileage may vary.   The current beta has an issue, though: if you change the language setting (or the server or URL settings), the change doesn't go into effect until a luup restart.  This will be fixed prior to official release.
  • Default Engine:  Use GOOGLE's service, or your own Mac OSX TTS service if you are running one.  NOTE:  If your vera is set-up to be "secured", you will not be able to use google's service.
  • Google TTS Server URL: Some folks have gotten different voices by changing the url.
  • OSX TTS Server URL: If you're using an OSX server, the local url for that server.

Testing TTS
Go to one of your squeezebox player devices and to the "Player Control" tab.  (Wrench in UI5, or the ">" button in UI7).  Enter in some text in the box next to the "Test Say" button, and press the Test Say button.  You should hear the text-to-speech announcement at the volume level that is currently set for the device.

"Say" action

Action Name: Say
Service Name: urn:micasaverde-com:serviceId:SqueezeBoxPlayer1
Parameters:
  • Text   (the text to say)
  • Line1  (optional) The text to display on line one of your player as the message is being said
  • Line2  (optional) The text to display on line two of your player as the message is being said
  • Volume  (optional) Value from 0 to 100.  If omitted, the speech will be read at the current volume of your player device.
  • Language  (optional) the language code to use.  If omitted, the default value (see set-up above) will be used
  • Engine  (optional) Either "GOOGLE" or "OSX_TTS_SERVER"  If omitted, the default value (see set-up above) will be used.

Sample LUA for the Say action:


Below is sample lua to send a text announcement of "The quick brown fox jumped over the lazy dog.  Have a nice day!" at volume level 70% to the squeezebox player with a vera ID of 10.  While the message is being read, the player will display "Vera Squeezebox Plugin" on line one, and "Text to speech test" on line two of the display.

Code: [Select]
local lul_arguments = {}
lul_arguments["Text"] = "The quick brown fox jumped over the lazy dog.  Have a nice day!"
lul_arguments["Line1"] = "Vera Squeezebox Plugin"
lul_arguments["Line2"] = "Text to speech test"
lul_arguments["Volume"] = 70
luup.call_action("urn:micasaverde-com:serviceId:SqueezeBoxPlayer1","Say",lul_arguments,10)
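
Since most of the Say parameters are optional, a small wrapper can keep scene code tidy.  The following is only a sketch: build_say_arguments is a hypothetical helper that is not part of the plugin, and the text, volume and device ID used in the example call are illustrative.

```lua
-- Hypothetical helper (not part of the plugin): builds the argument table
-- for the Say action and leaves out optional parameters that were not given.
SAY_SERVICE = "urn:micasaverde-com:serviceId:SqueezeBoxPlayer1"

function build_say_arguments(opts)
  assert(type(opts.Text) == "string" and opts.Text ~= "", "Text is required")
  local args = { Text = opts.Text }
  -- Copy only the optional parameters that were actually supplied.
  for _, name in ipairs({ "Line1", "Line2", "Volume", "Language", "Engine" }) do
    if opts[name] ~= nil then
      args[name] = opts[name]
    end
  end
  if args.Volume ~= nil then
    local volume = tonumber(args.Volume)
    assert(volume and volume >= 0 and volume <= 100, "Volume must be 0 to 100")
  end
  return args
end

-- On a Vera, where the global `luup` exists, the call would look like this
-- (device ID 10 is the same example ID used in the sample above).
if luup then
  local args = build_say_arguments({ Text = "Front door opened", Volume = 50, Language = "en" })
  luup.call_action(SAY_SERVICE, "Say", args, 10)
end
```

Anything left out of the table (Line1, Line2, Engine and so on) falls back to the behaviour described in the parameter list above.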

Known issues / things to watch out for
I have not done thorough testing with trying to say "multiple" things at once.  In other words, send a say command, and then before it is done, send another command.  In theory there is a queue system that should work.  In practice, I have not tested it.
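
For readers wondering what such a queue could look like, here is a minimal first-in, first-out sketch.  It is not the plugin's actual code; SayQueue and its method names are made up for illustration, and a request can be any value (for example the argument table of a Say call).

```lua
-- Hypothetical FIFO queue for Say requests (illustrative names only).
SayQueue = {}
SayQueue.__index = SayQueue

function SayQueue.new()
  return setmetatable({ items = {}, busy = false }, SayQueue)
end

-- Add a request. Returns the request if it can start right away,
-- or nil if another message is still in progress and it had to wait.
function SayQueue:push(request)
  if self.busy then
    table.insert(self.items, request)
    return nil
  end
  self.busy = true
  return request
end

-- Call once the current message has finished and the player has been
-- restored. Returns the next request to start, or nil if nothing is waiting.
function SayQueue:finish()
  local nextRequest = table.remove(self.items, 1)
  self.busy = nextRequest ~= nil
  return nextRequest
end
```

The idea is that a second Say command arriving mid-message is parked rather than sent to the player, and only started when finish() hands it back.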

Some audio file formats do not support skipping to specific times within the file.  So if you're playing something and it gets interrupted by speech, the plugin may not be able to "pick-up where you left off" after the speech is done.  The result is likely to just start the track over (or if it is an on-line stream, start as if you just reconnected to the stream).

You may not see the 2 lines of text on your display if you have "special characters" in the text.

If you change the default TTS settings (language, server, url), a luup restart may be needed for those settings to take.  Checking into this...

Future enhancements
I have not tried to expose the option to dynamically sync and unsync players at this time.  For example, sync all your players, make the announcement, and then revert your players to how they were before the speech.   I may try and add this in a future release.  Right now, I believe if you have players synced and you send a speech command to one of those players, all sync'd players will also make the announcement.

As always...comments welcome!


« Last Edit: February 13, 2015, 05:02:02 pm by JoeyD »

Offline konradwalsh

  • Hero Member
  • *****
  • Posts: 566
  • Karma: +19/-6
Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #1 on: February 11, 2015, 02:59:29 pm »
I am using picoreplayer to run all my squeezelites. Same problem exists here as when I manually run a google TTS link..

It plays but no sound...

Can I do anything to help you?

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #2 on: February 11, 2015, 03:41:06 pm »
Well, I have no idea what a picoreplayer is, but I can try.   ;)

First some background so you can get a feel for what is actually happening behind the scenes:

What happens with this plug-in (via the TTS library), is actually different than going directly to google.  Previously when you tried to send a TTS link to your player, it would try and stream the file directly from Google's servers.  That should work as long as your text is less than 100 characters.  (Any more than that, and google returns an error.)  So you can see why a plugin to say any length of speech needs to work differently.

What the plug-in does is:
1) splits the text "intelligently" into chunks
2) goes to google and sends a call for each chunk
3) Gets all the mp3 file(s) from google, stitches them together and stores the complete mp3 file on your vera.  It then tells your squeezebox player to play the MP3 file directly from your vera unit.

(Then after the speech is said, the mp3 file is deleted from your vera.)
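
Step 1 above can be sketched as follows.  This is an assumption about the general approach, not the TTS library's real code: split_text is a hypothetical function that breaks at spaces so each chunk stays within the 100-character limit mentioned earlier, whereas the real library may well split at sentence boundaries instead.

```lua
-- Hypothetical chunker: splits text into pieces of at most maxLen characters,
-- breaking at whitespace. Runs of whitespace are collapsed to a single space.
function split_text(text, maxLen)
  maxLen = maxLen or 100
  local chunks, current = {}, ""
  for word in text:gmatch("%S+") do
    -- A single word longer than maxLen has to be hard-split.
    while #word > maxLen do
      if current ~= "" then
        chunks[#chunks + 1] = current
        current = ""
      end
      chunks[#chunks + 1] = word:sub(1, maxLen)
      word = word:sub(maxLen + 1)
    end
    if current == "" then
      current = word
    elseif #current + 1 + #word <= maxLen then
      current = current .. " " .. word
    else
      chunks[#chunks + 1] = current
      current = word
    end
  end
  if current ~= "" then
    chunks[#chunks + 1] = current
  end
  return chunks
end
```

Each chunk would then be sent to the TTS service as its own request, and the returned MP3 data joined in the same order.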

When you say "it plays but there is no sound" what does that mean?  (you see the squeezeplayer change tracks?)

If your squeeze players can't even play a manual TTS link (only a few words), that may indicate some network connectivity issue.  Let me think about what we can do to troubleshoot.

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #3 on: February 11, 2015, 05:40:42 pm »
konradwalsh,

Let's try the following.  Upload the attached updated file to your vera and re-start.

Go to the test box, and enter in the following text:
Quote
This is a test of the squeezebox say system.  It should be about 7 seconds long.

And press the say button.  Give the system 30 seconds or so, and then look at your log.  (I see you're developing the kettle app, so I know you're familiar with logs.:)  ) You should see in your log entries similar to this:
Code: [Select]
luup_log:24: SQUEEZE: Getting OS Command to retrieve MP3 file... <0x2b54e000>
luup_log:24: SQUEEZE: file = /www/Say.25.mp3 <0x2b54e000>
luup_log:24: SQUEEZE: GoogleServerURL = http://translate.google.com <0x2b54e000>
luup_log:24: SQUEEZE: language = en <0x2b54e000>
luup_log:24: SQUEEZE: fragment1 = This%20is%20a%20test%20of%20the%20squeezebox%20say%20system%2e%20%20It%20should%20be%20about%2010%20seconds%20long%2e <0x2b54e000>
luup_log:24: SQUEEZE: OS Command: rm /www/Say.25.mp3 ; wget --output-document /www/Say.25.mp3 \
luup_log:24: SQUEEZE: LOCAL FILE STORED AS: http://192.168.1.13:80/Say.25.mp3    DURATION = 7 seconds <0x2b54e000>

A few things to watch:

1) If you don't see any of these, then the code is erroring out before it tries to even say anything.
2) We can verify the MP3 file name that is being created, and the other settings.
3) You will notice the last line where it says "DURATION = 7 seconds".  If it turns out that the duration is more like 2 seconds, this means that the plug-in was unable to connect successfully to the google server and retrieve the mp3 file.
4) If the duration is in fact 7 seconds, then your squeezebox devices are not able to get the file from vera.  To confirm if this is the case, try and play the url directly from your squeeze player.  (In this example, try and play "http://192.168.1.13:80/Say.25.mp3".)
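
For reference when reading the log: the fragment1 value appears to be the text percent-encoded, with every non-alphanumeric character escaped as lowercase hex (a space becomes %20, a period becomes %2e).  The sketch below reproduces that format; url_encode is a hypothetical name, and how the TTS library actually encodes the text is an assumption here.

```lua
-- Hypothetical encoder that mimics the format seen in the fragment1 log line:
-- every character that is not a letter or digit becomes %xx (lowercase hex).
function url_encode(text)
  -- The extra parentheses drop gsub's second return value (the match count).
  return (text:gsub("[^%w]", function(ch)
    return string.format("%%%02x", ch:byte())
  end))
end
```

Comparing the encoded test sentence against the fragment1 line is a quick way to check that the text reached the plug-in intact.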

Let me know what you find.
« Last Edit: February 11, 2015, 05:43:57 pm by JoeyD »

Offline konradwalsh

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #4 on: February 12, 2015, 06:27:03 am »
I will test all of this now and report back to you

Offline sota

  • Sr. Newbie
  • *
  • Posts: 46
  • Karma: +0/-1
Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #5 on: February 12, 2015, 08:59:33 am »
I installed this last night and tested it on a few devices this morning. It works as described, apart from one small problem. It seems that after the plugin has played the file, it sets the device volume to zero, so you cannot hear subsequent messages. It's not always consistent, but it is repeatable on several different devices. If you reset the volume on the plugin, the next message will be heard, but the volume drops to zero again afterwards.
Thanks,

Pat

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #6 on: February 12, 2015, 09:34:37 am »
Thanks for the feedback, Pat.

I have to do some housekeeping when changing the settings back.  It gets a little complicated since it appears that the squeezeboxes don't really have a command buffer. So if you send a command too soon after a previous one was issued it tends to be ignored.

For example, I have to set the volume to zero after the statement is read....because when I then switch back to the previous track it immediately starts playing...even if it was paused prior to the speech.  So the current workflow is like this:

1) Store the current settings of the squeezebox device
2) Add the speech file to the current playlist.
3) Start playing the speech track
4) change the volume to the requested speech volume.

Then when it's finished playing:

5) change volume to zero
6) revert playlist track to the previous settings.  (Squeezebox immediately starts playing it at this point, but the volume is zero so you don't hear it.)
7) Delete the speech track from the playlist
8) move the playback time of the track to the previous setting.
9) change the volume to previous setting.

I need to have a delay between some of those steps to ensure that they are all received.  Apparently, the delay to change the volume back to the original level is not long enough...so it's being ignored.
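
Steps 5 to 9 with a fixed delay between them can be sketched like this.  It is an illustration only: build_restore_schedule is a hypothetical function, the command strings follow the Logitech Media Server CLI style but are not necessarily what the plug-in sends, and the 2 second gap is an assumed value.

```lua
-- Hypothetical sketch: builds the restore sequence (steps 5 to 9) as a list of
-- { delay = seconds from now, command = string } entries.
-- prev holds the state stored in step 1: playlist index, track time, volume.
-- speechIndex is the playlist position of the temporary speech track.
function build_restore_schedule(prev, speechIndex, gap)
  gap = gap or 2  -- assumed spacing in seconds between commands
  local steps = {
    "mixer volume 0",                 -- 5) silence the player
    "playlist index " .. prev.index,  -- 6) back to the previous track
    "playlist delete " .. speechIndex, -- 7) remove the speech track
    "time " .. prev.time,             -- 8) seek to the previous position
    "mixer volume " .. prev.volume,   -- 9) restore the previous volume
  }
  local schedule = {}
  for i, command in ipairs(steps) do
    schedule[i] = { delay = (i - 1) * gap, command = command }
  end
  return schedule
end
```

On a Vera, each entry could be handed to luup.call_delay with its delay value so the commands go out spaced apart; if the gap is too short, the last entries are the ones most likely to be dropped, which matches the volume staying at zero.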

Currently the plugin uses asynchronous i/o so that the plug-in controls react to changes you make on the devices.  That works well for its intention, but it makes things complicated when trying to send multiple commands programmatically.  I'm trying to work out a way (using i/o intercept) so that I "know" when a command is processed, rather than just use a wait time.

In any case, try uploading the attached file and see if the volume resets properly for you after speech is said.

Offline sota

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #7 on: February 12, 2015, 03:49:27 pm »
OK, that explains what was happening. To be fair and honest, I was "pounding" the app a bit and probably not giving it an opportunity to complete its steps before being hit with another request. I'll try the new version over the weekend and report back.

Pat

Offline konradwalsh

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #8 on: February 13, 2015, 04:05:57 am »
At this point I can confirm everything you have already stated... everything functions according to your plan..
If I finally take the dynamic link and open it in a browser.. the MP3 plays perfectly...

I see that same link open on the Squeeze Player but no audio plays...

Another oddity is that, like Pat stated, the volume resets sometimes but also, sometimes it carries on playing the last track/playlist when it's done..
I think the ideal situation would be .. Pause, If Playing, Stop if wasn't playing..

Anyway.. PiCorePlayer is just a linux squeezelite that runs on RPI.
I will do more extensive testing this evening when I am home, including copying that dynamic SAY Mp3 file to my library and making sure it can play that too..

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #9 on: February 13, 2015, 07:15:27 am »
Edit: you may want to skip to my next post first before doing any other extensive troubleshooting.

Quote
At this point I can confirm everything you have already stated... everything functions according to your plan..
If I finally take the dynamic link and open it in a browser.. the MP3 plays perfectly...

I see that same link open on the Squeeze Player but no audio plays...

Just so that I understand...are you referring to the MP3 file that is created on your Vera?  (It is created fine and it plays in your browser if you point to it, but not your PiCorePlayer....  Correct?)

Have you tried creating a Squeezeplay software player on your PC?  And tried to say something on that?  If that works, then I assume there is something about the file format that your PiCorePlayer does not like, or a connectivity issue between your PiCorePlayers and your vera.

Quote
Another oddity is that, like Pat stated, the volume resets sometimes but also, sometimes it carries on playing the last track/playlist when it's done..
I think the ideal situation would be .. Pause, If Playing, Stop if wasn't playing..

Can you explain this a little better?  It's supposed to switch back to the prior track, and either continue playing (if it was playing prior to the speech), or pause / stop if it was not playing prior to the speech.  Are you saying that it sometimes starts playing the track even though it was paused / stopped prior to the speech?

This is another command buffer / delay issue.  There is no Squeezebox command I'm aware of to "switch to a different track, but don't start playing."  Whenever you switch to a new track, it automatically starts playing, no matter if your player is currently paused or stopped.  This is why I set the volume to zero prior to switching back, and then have to pause / stop AFTER switching back (not before).   I may just need to extend the length of the "wait time" between commands to be sure they are all processed. This means it will take longer for the squeezeplayers to return to their prior state after the speech, but it will better ensure returning to the correct state.

I am still investigating if I can i/o intercept the return string of the commands I sent, which should then just resolve the timing issues.

Quote
Anyway.. PiCorePlayer is just a linux squeezelite that runs on RPI.
I will do more extensive testing this evening when I am home, including copying that dynamic SAY Mp3 file to my library and making sure it can play that too..

That's a good idea...to ensure that your player does not have an issue with the MP3 file itself.  If it does, then I'm afraid there's not much that can be done about that aside from an update to your PiCorePlayer to address it....or perhaps the next post is the issue?
« Last Edit: February 13, 2015, 05:05:17 pm by JoeyD »

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #10 on: February 13, 2015, 07:40:35 am »
@konradwalsh,

Google is our friend: :)

I found several instances (google search) where a recurring theme tends to be: Google TTS + RPI Squeezelite + USB sound card = no sound.

If you start reading here though...it looks like some folks have gotten it to work.  Specifically post 56 in that thread.  Apparently you may just need to update your SL_SOUNDCARD setting (wherever that is) to a different setting that corresponds to your USB device.  Good luck!
« Last Edit: February 13, 2015, 07:57:26 am by JoeyD »

Offline konradwalsh

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #11 on: February 13, 2015, 04:30:23 pm »
Thanks for doing that extra research... trying that suggestion out... will let you know

Offline sota

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #12 on: February 14, 2015, 07:19:25 am »
So I had a play with it this morning. It seems to be an improvement over the previous version, but I find that you must wait about 6 to 8 seconds after the message plays before submitting another. Otherwise, you end up with the volume at zero and can hear nothing. Is it necessary to set the volume to zero after playing? Or perhaps the option to simply use the current volume settings for the player as the TTS volume would get around the problem.

Pat

Offline JoeyD

Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #13 on: February 14, 2015, 08:25:25 am »
Thanks for the feedback, sota.

I am going to tackle getting a proper "queueing" system in place as part of the next phase of development.  (So it would more gracefully handle multiple say requests.)

There is a reason I believe I need to set the volume to zero temporarily after the say command:
1) When you switch to a new track, there is no way (to my knowledge) to prevent the new track from starting to play immediately from time zero.
2) It usually takes at least one or two seconds in between commands in order for each command to be processed.

Without putting the volume to zero after the speech, this is what would happen:
1) Say you are on track 3, paused at 2:15.
2) You make an announcement at 70% volume.
3) Announcement is done, you'll hear the first second or two of track 3 at 70% volume
4) Then the track will pause.
5) Then it moves to 2:15

It would all be much simpler if there was a way to switch tracks without having the squeezebox immediately start playing it.

Offline jtmoore

  • Full Member
  • ***
  • Posts: 171
  • Karma: +2/-1
Re: Plugin 2.1 Beta - Text To Speech Support
« Reply #14 on: February 20, 2015, 07:00:43 pm »
Thanks for working on this plugin. I've been looking forward to trying it with my Synology and Open Karotz Squeezebox combination.
When I try TTS, the sound comes out as a short high-pitched squeak. It sounds like it is talking 10 times faster than normal (too fast to make out any words). It also says it twice.
Is there a speed setting I need to set? And how do I stop it repeating the message? Thanks.
jtmoore