Cloud Text-to-Speech API

I’ve been developing voice apps since 2005. One thing that has always bothered me is the robotic experience that text-to-speech provides. I know, I know, this is a hard problem to solve, but I think Google has leaped ahead of everyone else. Google just announced the release of a new API, Cloud Text-to-Speech. At 4.00 USD per 1 million characters, you can go pretty far. Below is an example of a request that converts text into a synthesized audio response.

# Send the synthesis request; the JSON body is read from stdin via the heredoc.
curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data @- "https://texttospeech.googleapis.com/v1beta1/text:synthesize" <<'EOF'
{
  "input": {
    "text": "I've added the event to your calendar."
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
EOF
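
The request above does not return an MP3 file directly. It returns a JSON document whose audioContent field holds the audio as a base64 string, so one more step is needed before you can play it. Here is a minimal Python sketch of that step, using only the standard library. The function names build_request_body and save_audio are my own, not part of any Google library, and the sketch assumes you saved the curl output to a file yourself (for example by appending "> response.json" to the command).

```python
import base64
import json


def build_request_body(text):
    """Return the same JSON payload the curl command sends, as a string.

    Hypothetical helper: the voice settings mirror the example request.
    """
    body = {
        "input": {"text": text},
        "voice": {
            "languageCode": "en-gb",
            "name": "en-GB-Standard-A",
            "ssmlGender": "FEMALE",
        },
        "audioConfig": {"audioEncoding": "MP3"},
    }
    return json.dumps(body)


def save_audio(response_text, path):
    """Decode the base64 'audioContent' field and write the raw bytes to path.

    Hypothetical helper: returns the number of bytes written.
    """
    response = json.loads(response_text)
    audio_bytes = base64.b64decode(response["audioContent"])
    with open(path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)
```

With the response saved as response.json (an assumed filename), calling save_audio(open("response.json").read(), "output.mp3") should leave a playable MP3 next to it. If the API returned an error instead of audio, the audioContent key will be missing and the function will raise a KeyError, which is a reasonable signal to go and read the error message in the file.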