Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Polly includes 47 lifelike voices spread across 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries. Amazon Polly delivers the consistently fast response times required to support real-time, interactive dialog. You can cache and save Polly’s speech audio to replay offline or redistribute. And Polly is easy to use. You simply send the text you want converted into speech to the Polly API, and Polly immediately returns the audio stream to your application so your application can play it directly or store it in a standard audio file format, such as MP3. With Polly, you only pay for the number of characters you convert to speech, and you can save and replay Polly’s generated speech. Polly’s low cost per character converted, and lack of restrictions on storage and reuse of voice output, make it a cost-effective way to enable Text-to-Speech everywhere.
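The send-text, get-audio flow described above can be sketched in a few lines. This is only a minimal illustration, assuming the boto3 AWS SDK for Python and its Polly client's synthesize_speech call. The helper name synthesize_to_file, the "Joanna" voice, and the output path are illustrative choices, not something the Polly description specifies.

```python
from pathlib import Path


def synthesize_to_file(polly_client, text, out_path, voice_id="Joanna"):
    """Send text to Polly and store the returned audio stream as an MP3 file.

    polly_client is assumed to be a boto3 Polly client (or any object
    exposing the same synthesize_speech method). Returns the number of
    audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,            # the text to convert into speech
        OutputFormat="mp3",   # a standard audio file format
        VoiceId=voice_id,     # assumed voice; pick any voice Polly offers
    )
    # The response carries the audio as a readable stream.
    audio_bytes = response["AudioStream"].read()
    Path(out_path).write_bytes(audio_bytes)
    return len(audio_bytes)


# Hypothetical usage, assuming boto3 is installed and AWS credentials are configured:
#
#   import boto3
#   client = boto3.client("polly")
#   synthesize_to_file(client, "Hello from Amazon Polly.", "hello.mp3")
```

Passing the client in as an argument keeps the helper easy to try with a stand-in object before pointing it at a real AWS account. Because billing is by characters converted, saving the file once and replaying it avoids paying for the same text twice.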
These are the organizations I come across in my research that are doing interesting things in the API space. They could be companies, institutions, government agencies, or any other type of organizational entity. My goal is to aggregate them so I can stay in tune with what they are up to and how it impacts the API space.
ReadSpeaker speech-enables online content on the fly in 35+ languages and 100+ voices. In 1999, ReadSpeaker pioneered the first-ever speech-enabling application for websites. Today, the company provides a portfolio of web-based text-to-speech solutions for websites, mobile sites, mobile apps, RSS feeds, online documents and forms, as well as online campaigns. Its solutions are used by over 5000 corporate, media, government, and nonprofit customers around the world.
SpeakerText is a video transcription company with a dead-simple web interface and well-documented API. Video publishers of all sizes have flocked to the service. Many use the video transcripts to boost SEO, others to create closed captions for accessibility, and still others to simply improve the viewer experience.
Clarify makes what was said, searchable. As the world records more and more of its interactions, in more and more ways, it needs a new generation of tools to filter, manage, and process the raw data. Clarify's API, SDKs and plugins enable developers and entrepreneurs to add audio and video search to any application. Clarify is already being used by disruptors and innovators, as well as some of the world’s largest institutions.
Pop Up Archive makes sound searchable. Search is a big deal. So why are we limited to only searching text? We get Google alerts every time someone types our name on the web, but not every time someone says our name. Pop Up Archive helps solve this problem by making spoken word searchable. Using speech-to-text software built for specific industries, we enable media companies and institutions like NPR, KQED, Princeton, and Stanford to find, reuse, and monetize media. Rich media search provides accessibility, search engine optimization, and advertising opportunities.
Allows any device with a speaker, microphone and Internet connection to become a voice interactive product. An end-to-end service for implementing voice interaction on any hardware. Our mission is to make interaction with technology and the world around us seamless, secure and natural through voice by enabling everyday objects to come alive.
AT&T Developer Program is the API platform for all AT&T devices and the AT&T network. AT&T provides device, call, location, messaging, speech, notifications, payments as well as advertising and healthcare solutions across multiple devices and mobile operating systems.
TelAPI provides developers with a cloud-based telephony platform with advanced telecom features and customizations not available in other cloud communications APIs. TelAPI provides the ability to send SMS messages and manage quota for each account. The platform also provides a free trial for developers, with pay-as-you-go, unit-based pricing for services that go beyond trial access.
CallFire is a cloud-based telephony company that provides voice and text connectivity services. It offers the necessary tools for businesses to communicate and market effectively. The company works to provide a diverse line of innovative products that enable its users to get their messages delivered.
Jeannie (Voice Actions) is a virtual assistant with over two million downloads, now also available via API. The objective of this service is to provide you and your robot with the smartest answer to any natural language question, just like Siri. This service provides an interface to the standard functions that users demand of modern voice assistants, such as chatting, looking up information, creating messages, and much more.
Plivo is an API platform for building voice and SMS features into web and mobile applications that run in the cloud. Plivo provides web APIs that allow developers to integrate voice and SMS sending, receiving, and account management features into any application. Plivo provides a free tier for playing around with the API, as well as unit-based, pay-as-you-go pricing, which includes volume pricing for larger-scale operations.
Skype is a software application that allows users to make voice and video calls and chat over the Internet. Calls to other users within the Skype service are free, while calls to both traditional landline telephones and mobile phones can be made for a fee using a debit-based user account system. Skype was founded by Niklas Zennstrom and Janus Friis, who were also the founders of the file sharing application Kazaa. Skype has also become popular for its additional features, which include instant messaging, file transfer, and video conferencing. Skype had 663 million registered users as of 2010.
Maluuba’s mission is to empower people with the ability to find exactly what they want by speaking to their smartphone. Maluuba’s proprietary, patent-pending engine provides superior capabilities to traditional voice recognition systems. Asking a question like "What movies are playing nearby?" enables users to buy tickets, find theater directions, and share search results on social platforms such as Facebook and Twitter.
Api.ai provides developers and companies with the advanced tools they need to build voice interfaces for apps and hardware devices. The Api.ai platform lets developers seamlessly integrate intelligent voice command systems into their products to create consumer-friendly voice-enabled user interfaces. Api.ai is also the company behind Assistant, a first-of-its-kind conversational assistant app created in 2010 that now has more than 20 million users and is the highest rated assistant app available. The Api.ai team and board of advisors bring decades of experience in artificial intelligence, machine learning, and human-computer interaction services.
SightCall software development kits for browsers and mobile devices help developers rapidly integrate Realtime Video, Audio and Text capabilities directly into their Website or Mobile App. SightCall software products are aimed at developers and make incorporating Realtime Communication as easy as using any other client-side framework.
If you think there is an organization I should have listed here, feel free to tweet it at me or submit it as a GitHub issue. Even though I do this full time, I'm still a one-person show. I miss quite a bit, and I depend on my network to help me know what is going on.