Amazon Polly is a service that turns text into lifelike speech, letting you create applications that talk and build entirely new categories of speech-enabled products. It is an Amazon AI service that uses deep learning to synthesize speech that sounds like a human voice. Polly includes 47 voices across 24 languages, so you can select a suitable voice and build speech-enabled applications that work in many different countries. You can also cache and save Polly’s speech audio to replay offline or redistribute.
Amazon Polly delivers the consistently fast response times required to support real-time, interactive dialog, and it is easy to use. You send the text you want converted into speech to the Amazon Polly API, and it immediately returns an audio stream that your application can play directly or store in a standard audio file format, such as MP3.
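The request and response flow described above can be sketched in Python. This is a minimal sketch, assuming the AWS SDK for Python (boto3) and its Polly `synthesize_speech` call; the helper name `synthesize_to_file`, the voice "Joanna", and the output file name are illustrative choices, not part of the original text.

```python
from pathlib import Path


def synthesize_to_file(client, text, out_path, voice_id="Joanna"):
    """Send text to Polly and write the returned MP3 audio to out_path.

    `client` is expected to behave like boto3.client("polly").
    Returns the number of audio bytes written.
    """
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",  # a standard audio file format, as in the text
        VoiceId=voice_id,    # assumed voice; pick any voice available to you
    )
    # "AudioStream" is a file-like streaming body; read() returns the bytes.
    audio = response["AudioStream"].read()
    Path(out_path).write_bytes(audio)
    return len(audio)


# Usage (requires AWS credentials and network access, so left as comments):
#   import boto3
#   polly = boto3.client("polly")
#   synthesize_to_file(polly, "Hello from Amazon Polly.", "hello.mp3")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object, and the saved file can be replayed later without another request.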
With Amazon Polly, you pay only for the number of characters you convert to speech, and you can save and replay the generated speech. Its low cost per character converted, and the lack of restrictions on storage and reuse of voice output, make it a cost-effective way to enable Text-to-Speech everywhere.
Amazon Polly also makes it easy to request an additional stream of metadata that indicates when particular sentences, words and sounds are pronounced. Using this metadata stream alongside the synthesized speech audio stream, you can build applications with an enhanced visual experience, such as speech-synchronized facial animation or karaoke-style word highlighting.
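Karaoke-style highlighting from the metadata stream can be sketched as follows. This is a sketch under stated assumptions: the metadata (speech marks) arrives as newline-delimited JSON with `time` in milliseconds and `start`/`end` as byte offsets into the input text, as requested through boto3 with `OutputFormat="json"` and `SpeechMarkTypes`. The `SAMPLE` data is hand-written for illustration, not real service output, and the function names are hypothetical.

```python
import json


def parse_speech_marks(raw):
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_at(marks, elapsed_ms):
    """Return the latest word mark started by elapsed_ms, or None.

    Assumes marks are ordered by time, as in the sample below.
    """
    current = None
    for mark in marks:
        if mark["type"] != "word":
            continue
        if mark["time"] > elapsed_ms:
            break
        current = mark
    return current


def highlight(text, mark):
    """Bracket the marked word; offsets are treated as UTF-8 byte offsets."""
    data = text.encode("utf-8")
    start, end = mark["start"], mark["end"]
    return (data[:start] + b"[" + data[start:end] + b"]" + data[end:]).decode("utf-8")


# Hand-written sample in the assumed shape (timings are made up).
TEXT = "Mary had a little lamb."
SAMPLE = "\n".join([
    '{"time": 0, "type": "sentence", "start": 0, "end": 23, "value": "Mary had a little lamb."}',
    '{"time": 6, "type": "word", "start": 0, "end": 4, "value": "Mary"}',
    '{"time": 373, "type": "word", "start": 5, "end": 8, "value": "had"}',
    '{"time": 604, "type": "word", "start": 9, "end": 10, "value": "a"}',
    '{"time": 728, "type": "word", "start": 11, "end": 17, "value": "little"}',
    '{"time": 1027, "type": "word", "start": 18, "end": 22, "value": "lamb"}',
])

# A real request would look roughly like this (needs AWS credentials):
#   client.synthesize_speech(Text=TEXT, OutputFormat="json",
#                            SpeechMarkTypes=["sentence", "word"], VoiceId="Joanna")
```

A player would call `word_at` with the current playback position on each frame and redraw the text with `highlight` whenever the returned mark changes.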
Amazon Polly supports Speech Synthesis Markup Language (SSML), a W3C-standard, XML-based markup language for speech synthesis applications, including common SSML tags for phrasing, emphasis, and intonation. This flexibility helps you create lifelike speech that will attract and hold the attention of your audience.
Amazon Polly can be used from all the programming languages included in the AWS SDK (Java, Node.js, .NET, PHP, Python, Ruby, Go, and C++) and the AWS Mobile SDK (iOS/Android). Polly also supports an HTTP API so you can implement your own access layer.
With Amazon Polly’s custom lexicons, or vocabularies, you can modify the pronunciation of particular words, such as company names, acronyms, foreign words and neologisms (e.g., “ROTFL”, or “C’est la vie” when spoken in a non-French voice). To customize these pronunciations, you upload an XML file with lexical entries.
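The SSML markup and the lexicon XML file mentioned above can be sketched in Python. This is a minimal sketch, assuming the lexicon follows the W3C Pronunciation Lexicon Specification (PLS) with `<alias>` entries and that the SSML uses the `<emphasis>` and `<break>` tags; the helper names `build_ssml` and `build_lexicon`, the 300 ms pause, and the alias text are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_ssml(before, phrase, after):
    """Return an SSML document that stresses `phrase` and pauses after it."""
    return (
        "<speak>"
        f"{escape(before)}"
        f'<emphasis level="strong">{escape(phrase)}</emphasis>'
        '<break time="300ms"/>'  # illustrative pause length
        f"{escape(after)}"
        "</speak>"
    )


def build_lexicon(aliases, lang="en-US"):
    """Return a PLS lexicon mapping written words to spoken replacements."""
    lexemes = "".join(
        "  <lexeme>\n"
        f"    <grapheme>{escape(word)}</grapheme>\n"
        f"    <alias>{escape(spoken)}</alias>\n"
        "  </lexeme>\n"
        for word, spoken in aliases.items()
    )
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" alphabet="ipa" xml:lang="{lang}">\n'
        f"{lexemes}"
        "</lexicon>\n"
    )
    ET.fromstring(xml.encode("utf-8"))  # raises ParseError if not well-formed
    return xml


# Upload and use (needs AWS credentials; "acronyms" is a made-up lexicon name):
#   client.put_lexicon(Name="acronyms", Content=build_lexicon({...}))
#   client.synthesize_speech(Text=build_ssml(...), TextType="ssml",
#                            LexiconNames=["acronyms"], OutputFormat="mp3",
#                            VoiceId="Joanna")
```

For example, `build_lexicon({"ROTFL": "rolling on the floor laughing"})` yields a file in which the acronym is read out as the full phrase. Escaping the inputs keeps characters such as `&` from breaking the XML.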