Software or SaaS for speech enabling?
For the last 8 years I have worked on developing SaaS solutions that speech enable web content in different ways. The business logic is quite straightforward: the customers (content owners) are the ones subscribing to and paying for the product, and their visitors are the ones using it. So, it is a service that the website owners offer to their visitors.
There are other ways for the visitors to listen to the content as well, even without the website owner subscribing to any service. The user can either download one of the many programs available at places like download.com or simply use the built-in text-to-speech engine available in almost all existing operating systems. This is nothing new; it has been built into Macs for decades and into Windows since Windows 95. I remember my Amiga 1000 had it as well when I was a child. However, the quality of the voices available for free is just not THAT fantastic, but for many people it is good enough. It really depends on how badly you need it.
People who depend on text-to-speech in order to use their computer at all already have a solution installed on their PC. They need it for everything from starting MS Office to starting the web browser. These we can call “professional users”. Their demands and requirements on the text-to-speech differ quite a lot from those of people who do not depend on it. The professionals I am talking about are severely visually impaired or blind. They normally use software and equipment that costs tens of thousands of Euros. That would be pretty much out of reach if they didn’t get support from the government.
Then we have a really large group, almost 20% of the population in most European countries, that has milder difficulties with text. These are, for example, people with reading difficulties, dyslexia or a low literacy level, or people who are not native speakers. These groups have proven to be greatly helped by having the text read to them. They have other requirements on the text-to-speech voice, and they would usually need to buy the software themselves if they want a higher quality than the free ones. A good TTS for personal use can cost anything between 100 and 1500 Euros. Most people cannot afford this.
The even bigger group of people these days that has proven to appreciate text-to-speech is “all the others”. Listening to online content like the newspaper, their email, reports etc. enables them to be more efficient. It lets them perform other tasks while they “read”, for example driving a car, doing the dishes or commuting. These people have other requirements on the TTS. For example, it should sound very human. And there are no TTS products that target these groups of people.
The explosion of audio book consumption all over the world shows that many people, even those without any reading or visual problems, prefer to listen. It is just a fact. And then the people who depend on a speech version of printed text are the true winners! Design for all is just fantastic!
I think that publishing lots of text comes with a responsibility: a responsibility to make it accessible to as many people as possible. And I do not need to say that making text content talk helps a whole lot of people in various situations.
Back to the SaaS track. The products I have been working with do just this. We are speech enabling the web. This enables the customers to give their visitors/users a service and support. Today millions of people are using our services every month, and that proves that it is a well adopted feature. Web based services need to be not only accessible and user friendly. In fact, they also need to be “use worthy”.
It is like when you go to the supermarket. They usually offer you, as a free service, a piece of assistive technology known as “a shopping cart”. This is to help people overcome their handicap of not being able to carry a lot of groceries at the same time. It is great. Everybody uses one if they intend to buy more stuff than they can carry. It helps the customers to buy more. This is a service. It is use worthy.
You do not need to buy (or build) your own shopping cart to go mass shopping since this is a supportive service the supermarket offers.
There are a couple of “non SaaS” suppliers of Speech software.
Some think like this: “We let the users buy, download and install software on their PC so that they can listen to web pages.”
Others think like this: “We let the users download and install software on their PC for free, and then we bill the website owners that would like this software to be able to read their websites.”
Those two are not services at all. They are simply ways to sell software products. The only difference is who is paying.
I reason like this: the user/visitor would like to use the text-to-speech service without having to download anything. And the user should not need to pay anything to use it. And the user should be able to use this service from whatever device he likes: from a web browser in an Internet café or other public location, from his mobile phone, from whatever device and from wherever. The user can “carry” around his need to listen wherever he goes, and his need is not in any way tied to a specific individual computer where he has installed the software.
There are also others that try to do something in between: offering a limited web based speech service with the option for the user to download and install software for additional features. That software only works for a limited number of hours, and then the user needs to re-install it to have it work for another few hours. I wonder how smooth a solution that is in the long run.
No, the way to go is a totally server based product, where the best technology for producing high quality text-to-speech can be used, all improvements and updates are fully centralized, and the solution is device independent and works for anybody anywhere. Someone needs to pay, yes. It is pretty much the same logic as with the shopping carts: the content provider pays.
Doing what for whom exactly?
I came across an interesting fact today. There are no reports/studies/statistics showing that “having a web shop that is accessible for people with disabilities actually increases sales” (if I’m wrong here, please let me know where to find such a report). Or maybe there are, but under a slightly different name? Like for example “Having a web shop that actually works in different browsers and on different devices increases sales”. Hmm.
In web accessibility, it is time to stop talking about people’s lack of capabilities and time to really focus on the basic fact that the website should be rich and structured enough to give the user the power to decide how the information should be presented. The user is king anyway, since he has the ultimate power to decide if he should read this at all or just go ahead and do something else.
Anyway, I think this can illustrate what I mean:
Consultant;-“We can increase the accessibility of your ecommerce store for disabled people, but that will cost you.”
Owner;-“What’s the ROI?”
Consultant;-“Hmm, let’s see, how many disabled people are in your primary target group?”
Owner;-“They are not in our target group. We are targeting middle class working people with a decent income with our home electronics, so I do not see that this is relevant for us.”
These kinds of dialogues are not as uncommon as you might think.
If the consultant had opened like this instead:
Consultant;-“We can make your web shop work for everyone, independent of what sort of device or web browser they use, but it will cost you.”
The reaction could have been something like:
Owner;-“Would it mean that we could get more customers?”
Sales;-“Yep, people would actually be able to buy things in your store from their PCs, Macs, Linux boxes, Blackberries, mobile phones, hey, maybe also from their digital TV sets.”
Owner;-“What a killer! I’ll take two!”
There is not really any difference in the outcome of the two (quite different) dialogues. The thing is that the second one sounds attractive and is sellable; the first one is not.
A new Blog is born!
Ok, I have finally decided to start a blog.
It will be about my passions.
Those being: entrepreneurship, software as a service (SaaS), text-to-speech (TTS), web accessibility and the world of web business. I have been in the industry since 1999 and have founded a number of companies and technologies on my way towards the future. In 1999 I, together with some friends, founded Phoneticom AB, a small (only me) company that was going to explore what audio in general and speech in particular could add to the user experience on websites. After only one year we launched the first ever server based web service for text-to-speech. ReadSpeaker was born. After a few years of setting up companies to sell the ReadSpeaker services, selling consultancy in web accessibility, creating new innovations etc., I think I have finally landed in my new company VoiceCorp.
My philosophies about:
SaaS
It is better to let one person do a proper job of installing the software on a server and let other people just use it.
Web accessibility
This is broad. It’s not about making the web work for people with disabilities; it is a way to reach out to a maximum number of people, independent of the device used, technology skills or abilities.
Entrepreneurship
Either you are or you’re not, or maybe you are but you just haven’t found out yet.
TTS
Using text-to-speech technology is a way to free text from letters and words. It is just great for all occasions where reading is just not a good option. The quality of the speech is more about how badly you need it than about how well it really speaks.
