
Author Topic: New TTS engine: Microsoft Translator  (Read 66237 times)

Offline JsChiSurf

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +4/-0
Re: New TTS engine: Microsoft Translator
« Reply #120 on: February 27, 2016, 11:10:40 am »
@tomgru

As a follow up to my previous post, I did a bit more digging and can see there is also available the Project Oxford Bing Voice Output API:

https://msdn.microsoft.com/en-us/library/mt679063.aspx

Whose API endpoint is:

https://speech.platform.bing.com/synthesize

I'm guessing this is a newer API and, I suppose, the one most likely to be supported going forward? In your post, however, the people you spoke with mentioned the "/Speak" endpoint, which this one is not. The endpoint used by the plugin has a very different URL, but it does call "/Speak".

Regardless, I used the documentation on the Oxford Project based API, where gender can indeed be specified, and built out some PHP code to test, and can, with 100% consistency, have my text converted to audio in the specific "dialect" and "gender" specified.

The key difference is calling this:

https://speech.platform.bing.com/synthesize

Versus:

http://api.microsofttranslator.com/V2/Http.svc/Speak

So I suppose, if worst comes to worst, the plugin could potentially be updated to use the Oxford Project based API, although its limits seem to be a bit lower.  I was able to use the same client ID, but had to sign up there to obtain a new / different key / secret to get it going.




Offline lolodomo

  • Moderator
  • Master Member
  • *****
  • Posts: 3484
  • Karma: +74/-10
« Reply #121 on: February 28, 2016, 03:53:29 am »
I confirm there is no gender in the API.
For French, the voice is stable, but it has changed from female to male.

Offline JsChiSurf

« Reply #122 on: February 28, 2016, 11:50:58 am »
@lolodomo,

Any chance of considering using / offering the use of the Project Oxford version of the MS API referenced above, so that users can choose the specific voice they prefer?

Offline Mai Pensato

  • Full Member
  • ***
  • Posts: 228
  • Karma: +5/-1
« Reply #123 on: February 28, 2016, 02:55:31 pm »
Here (Dutch) the female voice also suddenly changed to a male voice last week. It is also much slower now: it takes more than 10 seconds before the TTS starts. I also use the Google TTS, which is much faster (immediate response), but I can use it only 2 times per day on one of my Sonos players. I hope the Microsoft TTS can be improved...

Offline lolodomo

« Reply #124 on: March 01, 2016, 03:05:36 am »
I noticed a bigger delay between the end of the TTS message and the resumption of the previous playback. I have to check whether they increased the bitrate of the MP3 file.

Offline Mai Pensato

« Reply #125 on: March 01, 2016, 07:22:57 am »
The delay is also at the start. When I manually run a "say" message, it takes more than 5 seconds before it starts. Previously this was much faster with the MS TTS. Google TTS always reacts instantly. I have made my own personal weather announcement that runs every morning and consists of several separate messages. With Google it's done in 15-20 seconds. With the recent MS it takes more than 1 minute... Really annoying, so I switched this one back to Google.

Offline tomgru

  • Hero Member
  • *****
  • Posts: 1403
  • Karma: +18/-6
« Reply #126 on: March 01, 2016, 09:30:55 am »
This was another reply from the Microsoft team.  Do you want me to ask them about the delay?

-------------------

Looking at the doc in the link below (https://msdn.microsoft.com/en-us/library/ff512420.aspx), I could see why this is not clear, as it is not fully explained.

This is a sample API call for a female voice in the en-CA language.


http://api.microsofttranslator-int.com:80/v2/http.svc/Speak?appId=xxxx&language=en-CA&format=audio%2Fmp3&options=MinSize%7CFemale&text=Someone%20at%20the%20door.


In this case, language query parameter has the value of:
                en-CA

The options query parameter has the value of:
MinSize|Female
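
A minimal Lua sketch of how a request URL like the sample above can be assembled. The base URL is the public Speak endpoint mentioned earlier in the thread, and the appId value is the same placeholder as in the sample. The escape helper is only a stand-in for url.escape from LuaSocket (which the plugin code calls), and speak_url is a hypothetical name, not something from the plugin.

```lua
-- Percent-encode a string for use in a URL query string.
-- Stand-in for LuaSocket's url.escape so the sketch has no dependencies.
local function escape(s)
  return (s:gsub("[^%w%-_%.~]", function(c)
    return string.format("%%%02X", string.byte(c))
  end))
end

-- Build a Speak request URL (hypothetical helper, not from the plugin).
-- `options` carries size/quality and gender, separated by "|".
local function speak_url(base, appId, language, options, text)
  return string.format(
    "%s/Speak?appId=%s&language=%s&format=%s&options=%s&text=%s",
    base, escape(appId), escape(language),
    escape("audio/mp3"), escape(options), escape(text))
end

local url = speak_url(
  "http://api.microsofttranslator.com/V2/Http.svc",
  "xxxx",              -- placeholder, as in the sample above
  "en-CA",
  "MinSize|Female",    -- the "|" is sent as %7C
  "Someone at the door.")
print(url)
```

The point to notice is that gender is not a parameter of its own: it rides along inside the options value, after the "|" separator.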

Offline JsChiSurf

« Reply #127 on: March 01, 2016, 09:40:36 am »
Hot diggity dog!  That's it.  The documentation does not reference gender and they didn't state that this was an options parameter in their initial response to you!

I just took this line from the Lua file:

local returnCode = os.execute(SAY_EXECUTE:format(file, file, token, url.escape(text), language, url.escape("audio/mp3"), "MaxQuality"))

The last value, which is hard-coded, is what gets set for the "options" query string parameter. I modified it so the line became:

local returnCode = os.execute(SAY_EXECUTE:format(file, file, token, url.escape(text), language, url.escape("audio/mp3"), "MaxQuality|Male"))

And after an upload and Luup restart, I'm 10 for 10 with the Male voice (based on my settings).

While I didn't mind my morning weather announcement as a female, my nightly joke of the day, after setting the alarm, comes off so much better in a male voice :-)

Thanks for looking into this for us!

I'm guessing a quick update to the plugin could be released with a drop-down for gender (assuming all languages support it), so that it can be set dynamically. This may be a bit tricky, though, since (I'm assuming) only the Microsoft API supports gender.


« Last Edit: March 01, 2016, 10:56:47 am by JsChiSurf »

Offline lolodomo

« Reply #128 on: March 01, 2016, 11:23:57 am »
The solution could be to finally add a voice/gender parameter to the Say action.
For MS Translator, we will use Male or Female as the parameter value.
For Mary TTS, it will allow choosing the voice.
And for MS Translator, a default gender will be required.

Offline lolodomo

« Reply #129 on: March 01, 2016, 03:58:22 pm »
Following up on the bigger delay I noticed between the end of the TTS message and the resumption of the previous playback: my guess about the MP3 bitrate was right.
After checking, the MP3 file is now @128 kbps (MaxQuality).
If I switch to option "MinSize", bitrate is @32 kbps.

So I fixed the file L_SonosTTS.lua. The delay before playback resumes is now OK.
You can get the latest version from the ZIP file available for download at the bottom of this page: http://code.mios.com/trac/mios_sonos-wireless-music-systems/browser/trunk

This increased bitrate could explain why it takes more time to start playing: the file to download is bigger. We could decide to switch to the MinSize option to get a smaller file and so a faster download.
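
A rough sketch of why the bitrate matters for the resume delay. It assumes the playing time is estimated from the file size and an assumed constant bitrate; the thread does not show how the plugin actually computes it. The function name and the file size are made up for illustration.

```lua
-- Estimate the playing time of a constant-bitrate MP3 from its size.
-- Assumption: duration is derived from file size and bitrate only.
local function mp3_duration_seconds(size_bytes, bitrate_kbps)
  return (size_bytes * 8) / (bitrate_kbps * 1000)
end

local size = 160000  -- hypothetical file size in bytes

-- Correct estimate if the file really is encoded at 128 kbps: 10 seconds.
print(mp3_duration_seconds(size, 128))

-- Same file, but 32 kbps wrongly assumed: 40 seconds, so the player
-- would wait about four times too long before resuming.
print(mp3_duration_seconds(size, 32))
```

The same arithmetic shows the download side: for a message of a given length, a 128 kbps file is four times the size of a 32 kbps one.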
« Last Edit: March 01, 2016, 07:14:39 pm by lolodomo »

Offline lolodomo

« Reply #130 on: March 01, 2016, 06:32:03 pm »
I have finally added a parameter "Microsoft option". By default, the parameter is empty and the result will be a small audio file and no gender specified.
You can set in the plugin UI one of these values: "Male", "Female", "MinSize|Male", "MinSize|Female", "MaxQuality|Male", "MaxQuality|Female".

You will have to upload the 4 files I have updated: I_Sonos1.xml J_Sonos1.js L_SonosTTS.lua S_Sonos1.xml
Don't forget to clear your web browser cache, as the JavaScript file has been updated.
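
Since the option is free text for now, a value outside the list above is easy to enter by mistake. Here is a minimal sketch of a check against the six values, falling back to the empty default. The names VALID_OPTIONS and normalize_ms_option are hypothetical and not taken from the plugin.

```lua
-- The accepted values, plus the empty default (no gender, small file).
local VALID_OPTIONS = {
  [""] = true,
  ["Male"] = true,
  ["Female"] = true,
  ["MinSize|Male"] = true,
  ["MinSize|Female"] = true,
  ["MaxQuality|Male"] = true,
  ["MaxQuality|Female"] = true,
}

-- Return the value unchanged if it is accepted, otherwise the default.
local function normalize_ms_option(value)
  value = value or ""
  if VALID_OPTIONS[value] then
    return value
  end
  return ""
end

print(normalize_ms_option("MaxQuality|Male"))
print(normalize_ms_option("Loud") == "")
```

Matching is exact and case-sensitive here, so a value such as "male" would fall back to the default.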

Offline lolodomo

« Reply #131 on: March 01, 2016, 07:10:48 pm »
So with this update, smaller files will be produced by default. That is safer for our Vera with its small memory, and by the way I don't really hear a difference in quality. The other advantage is a shorter delay before the text is heard, which is especially noticeable with a long text.
Of course, if you think the quality is reduced too much, you can set the new option to MaxQuality to restore the previous quality.
And of course, with this update the duration calculation is fixed, meaning a faster resume.
And finally, you can even use the new option parameter to force a male or female voice.
« Last Edit: March 01, 2016, 07:12:25 pm by lolodomo »

Offline lolodomo

« Reply #132 on: March 01, 2016, 07:18:24 pm »
Tomorrow I will replace the free text field in the UI with a list of choices. Easier for everybody.

Offline tomgru

« Reply #133 on: March 01, 2016, 10:05:01 pm »
awesome...thanks guys.  Glad my stomping grounds could actually help!

Offline BOFH

  • Sr. Hero Member
  • ******
  • Posts: 2409
  • Karma: +112/-140
« Reply #134 on: March 01, 2016, 10:21:51 pm »
Thanks to everyone who helped figure this out. I'll see in the AM whether the friendly Canadian lady will be telling me the weather (as I set) or whether it's the burly Canuck guy doing so...
Vera3 UI5 UI7 Edge Plus
Trane TZEMT400AB32 | Schlage BE369 FE599 | GE 45601 45602 45603 45604 45606 45609 45631 | Intermatic HA01C HA03C HA05C HA07C CA600 CA3000 | Aeon DSC06106 | Telguard GDC1 | Foscam FI8910W FI8905W FI9821W | D-Link 930L | Wanscam JW0011 | ZModo ZPIBH13W