Meet me at Web2.0 Expo San Francisco

If you would like to meet me for a live chat in San Francisco between the 21st and 25th of April, please contact me or my office. I will be attending the Web2.0 Expo conference, co-produced by TechWeb and O’Reilly Media, this year as well. More about the Web2.0 Expo at the conference homepage.

Last year the event was really successful. It was a great opportunity to network with people ranging from newbies to gurus in the field of Web2.0.

About this Web2.0 thing: I attended the Monaco Media Forum at the end of last year, and there was a conference track, moderated by Dale Dougherty (who coined the expression Web2.0 in the first place), with the title “Web3.0”. Now, what is that all about, I asked myself. Obviously, most other attendees seemed to ask themselves the same thing, so the conference room was packed. And what was it then? Was it a new technology thing? A new Internet “behaviour” or business logic kind of thing? Well, we just have to wait and see ;-)

Niclas

Posted in: General

Software or SaaS for speech enabling?

For the last 8 years I have worked on developing SaaS solutions that, in different ways, speech enable web content. The business logic is quite straightforward: the customers (content owners) are the ones subscribing to and paying for the product, and their visitors are the ones using it. So, it is a service that website owners offer to their visitors.

There are other ways for the visitors to listen to the content as well, even without the website owner subscribing to any service. The user can either download one of the many programs available at places like download.com or simply use the built-in text-to-speech engine available in almost all existing operating systems. This is nothing new; it has been built in on Macs for decades and in Windows since Windows 95. I remember my Amiga 1000 had it as well when I was a child. However, the quality of the voices available for free is not THAT fantastic, though for many people it is good enough. It really depends on how badly you need it.
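
To make the "already built in" point concrete, here is a minimal Python sketch that picks a speech command per operating system. It is only an illustration under stated assumptions: the function names `tts_command` and `speak` are hypothetical, the Windows branch assumes PowerShell with the .NET System.Speech assembly is available, and the Linux branch assumes the free eSpeak engine has been installed (it is not necessarily there by default).

```python
import platform
import subprocess


def tts_command(text, system=None):
    """Return the command line that would speak `text` on the given OS."""
    system = system or platform.system()
    if system == "Darwin":
        # macOS ships the `say` command.
        return ["say", text]
    if system == "Windows":
        # Windows exposes its speech engine through .NET's System.Speech.
        # Single quotes are doubled to escape them inside a PowerShell string.
        escaped = text.replace("'", "''")
        script = (
            "Add-Type -AssemblyName System.Speech; "
            "(New-Object System.Speech.Synthesis.SpeechSynthesizer)"
            f".Speak('{escaped}')"
        )
        return ["powershell", "-NoProfile", "-Command", script]
    # Assumption: on Linux and other systems eSpeak is installed.
    return ["espeak", text]


def speak(text):
    """Speak `text` aloud; raises if the engine is missing on this machine."""
    subprocess.run(tts_command(text), check=True)
```

Calling `speak("Hello from the built-in voice")` should read the sentence aloud with whatever free voice the machine has, which is a quick way to judge for yourself whether it is "good enough".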

People who depend on text-to-speech in order to use their computer at all already have a solution installed on their PC. They need it for everything from starting MS Office to starting the web browser. These we can call “Professional users”. Their demands and requirements on text-to-speech differ quite a lot from those of people who do not depend on it. The professionals I am talking about are severely visually impaired or blind. They normally use software and equipment that costs tens of thousands of Euros. That would be pretty much out of reach if they didn’t get support from the government.

Then we have a really large group, almost 20% of the population in most European countries, who have milder difficulties with text. These are, for example, people with reading difficulties, dyslexia or a low literacy level, or people who are not native speakers. These groups have proven to be greatly helped by having the text read to them. They have other requirements on the text-to-speech voice, and they would usually need to buy the software themselves if they want higher quality than the free voices. A good TTS for personal use can cost anything between 100 and 1500 Euros. Most people cannot afford this.

The even bigger group of people who these days have proven to appreciate text-to-speech is “all the others”. Listening to online content such as the newspaper, their email or reports enables them to be more efficient. They can perform other tasks while they “read”, such as driving a car, doing the dishes or commuting. These people have other requirements on the TTS. For example, it should sound very human. And there are no TTS products that target these groups of people.

The explosion of audio book consumption all over the world shows that many people, even those without any reading or visual problems, prefer to listen. It is just a fact. And the people who depend on a spoken version of printed text are the true winners! Design for all is just fantastic!

I think that publishing lots of text comes with a responsibility: a responsibility to make it accessible to as many people as possible. And I do not need to say that making text content talk helps a whole lot of people in various situations.

Back to the SaaS track. The products I have been working with do just this. We are speech enabling the web. This enables the customers to give their visitors/users a service and support. Today millions of people are using our services every month, and that proves that it is a well-adopted feature. Web based services need to be more than accessible and user friendly. They in fact also need to be “use worthy”.

It is like when you go to the supermarket. They usually offer, for free, a piece of assistive technology known as “a shopping cart”. This helps people overcome their handicap of not being able to carry a lot of groceries at the same time. It is great. Everybody uses them if they intend to buy more stuff than they can carry. It helps the customers to buy more. This is a service. It is use worthy.

You do not need to buy (or build) your own shopping cart to go mass shopping since this is a supportive service the supermarket offers.

There are a couple of “non-SaaS” suppliers of speech software.

Some think like this: “We let the users buy, download and install software on their PC so that they can listen to web pages.”

Others think like this: “We let the users download and install software on their PC for free, and then we bill the website owners who would like this software to be able to read their websites.”

Those two are not services at all. They are simply ways to sell software products. The only difference is who is paying.
 
I reason like this: the user/visitor would like to use the text-to-speech service without having to download anything. The user should not need to pay anything to use it. And the user should be able to use this service from whatever device he likes: from a web browser at an Internet café or other public location, from his mobile phone, from whatever device and from wherever. The user can “carry” his need to listen with him wherever he goes, and that need is not in any way tied to one specific computer where he has installed the software.

There are also others who try to do something in between: offering a limited web based speech service with the option for the user to download and install software for additional features. That software only works for a limited number of hours, and then the user needs to re-install it to have it work for another few hours. I wonder how smooth a solution that is in the long run.

No, the way to go is a totally server based product, where the best technology for producing high quality text to speech can be used, all improvements and updates are fully centralized and it is a device independent solution that works for anybody anywhere. Someone needs to pay, yes. It is pretty much the same logic as with the shopping carts; the content provider pays.

More about them Blades

Yesterday we installed the new blade centres and the new storage systems. I felt like a boy carrying around new toys, with that warm and fuzzy “Christmas Eve just after opening the presents” feeling. When we arrived at the new hosting facility and looked down at the power outlets under the rack, my happiness slowly turned into a wearier mood. Two three-phase 32 ampere sockets were staring me in the eyes. If you read my previous post “From servers to blades”, you will understand why. All of a sudden, we didn’t have the correct power cords any more. Hi-tech never wants to be easy.

Well, they managed to solve the problem. Cables were connected, lamps blinking, fans started to roar. And there it was. The rSpeak infrastructure was ready!
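
For a sense of scale on those 32 ampere three-phase sockets, the standard formula for a balanced three-phase feed is S = √3 × line voltage × current. The small sketch below does the arithmetic. The 400 V line-to-line voltage is an assumption (the nominal European value, not something stated by the hosting facility), and the function name is made up for the illustration.

```python
import math


def three_phase_apparent_power_kva(line_voltage_v, current_a):
    """Apparent power of a balanced three-phase feed: S = sqrt(3) * V_LL * I."""
    return math.sqrt(3) * line_voltage_v * current_a / 1000.0


# Assumption: 400 V line-to-line, the nominal European three-phase voltage.
per_socket = three_phase_apparent_power_kva(400, 32)
print(f"One 32 A socket: {per_socket:.1f} kVA")
print(f"Two 32 A sockets: {2 * per_socket:.1f} kVA")
```

Under that assumption each socket can deliver roughly 22 kVA, far more than an ordinary single-phase wall outlet, which goes some way to explain why the usual power cords were of no use.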

In a few days we will start beta testing our new range of products for private blogs and websites. If you would like to become one of the exclusive beta testers, please become a member by signing up on the VoiceCorp website.

Stay tuned!

Posted in: General

Want to become a TTS Voice? Now you can!

I met one of our TTS suppliers the other day: Lars-Erik Larsson, the CEO of Acapela Group. He told me about their latest development, a service to create corporate voices for their text-to-speech engine. The coolest thing is that they can now offer it at a very reasonable cost thanks to new technology and procedures they have developed. It is called the Acapela Voice Factory, and it takes only 3 weeks(!) for the simplest version and up to 14 weeks for the full quality version!

Still, the cost level is not really within reach for private users, but there are other (free) alternatives as well, such as FestVox with the Festival TTS platform (however, you cannot in any way compare the quality with the commercial solutions).

With a price tag starting at 7500€ (excluding the cost of the speaker) it is well within reach for a lot of companies that want a corporate TTS voice. It can easily pay off by using TTS to automate some customer support, automate switchboards, or just as a fun thing in, for example, interactive web campaigns. You would also need to buy licenses for the engine itself if you want to use it, but that is a very reasonable cost in such a project.

Is there a market? Sure. Let’s say we have two car manufacturers that want to integrate speech into their cars. Obviously, Volvo would not want to use the same voice as Saab, for instance. You must hear the difference :-) . Some companies today have a corporate voice that they use in all radio and TV commercials, and wouldn’t it be great to have that voice talent also answering the phone on 30 lines at the same time?

However, if you have dreamt of immortality, this is one step closer. But if you decide to give your voice up to a TTS engine, there are a couple of things you should be aware of.

  • Your voice can technically be used to speak very dirty words, and there is always the risk of people using it in a very, very bad way.
  • You can never really use your voice as something that identifies you as actually being you, i.e. quite a problem if you also happen to be a big fan of speech verification systems…

The target group for corporate voices is mainly, well, corporations. Congratulations Acapela Group!

Posted in: TTS

Doing what for whom exactly?

I came across an interesting fact today. There are no reports/studies/statistics showing that “having a web shop that is accessible for people with disabilities actually increases sales” (if I’m wrong, please let me know where to find such a report). Or maybe there are, but under a slightly different name? Like for example “Having a web shop that actually works in different browsers and on different devices increases sales”. Hmm.

In web accessibility, it is time to stop talking about people’s lack of capabilities and time to really focus on the basic fact that the website should be rich and structured enough to hand the power to decide how the information is presented over to the user. The user is king anyway, since he has the ultimate power to decide whether to read this at all or just go ahead and do something else.

Anyway, I think this can illustrate what I mean:

The web consultant says: “We can increase the accessibility of your e-commerce store for disabled people, but that will cost you.”

The web shop owner says: “What’s the ROI?”

Consultant: “Hmm, let’s see, how many disabled people are in your primary target group?”

Owner: “They are not in our target group. We are targeting middle class working people with a decent income with our home electronics, so I do not see that this is relevant for us.”

These kinds of dialogues are not as uncommon as you might think.

If the consultant had opened like this instead:

“We can make your web shop work for everyone, whatever device or web browser they use, but it will cost you.”

The reaction could have been something like:

Owner: “Would it mean that we could get more customers?”

Consultant: “Yep, people would actually be able to buy things in your store from their PCs, Macs, Linux boxes, Blackberries, mobile phones, hey, maybe also from their digital TV sets.”

Owner: “What a killer! I’ll take two!”

There is not really any difference in the results of the two (quite different) dialogues; the thing is that the second one sounds attractive and is sellable, and the first one is not.

VoiceCorp helps Dow Jones to reach out!

Dow Jones have really understood what speech enabling is about. It is not just something you do to help people with various disabilities. It is so much more than that. This is true “design for all”, where the people with different difficulties are the biggest winners. And on top of that, they use it as a competitive advantage. That’s the spirit, Dow Jones!

See Dow Jones press release below. / Niclas

Dow Jones Factiva Listen Capability Transforms the Way Users Consume News

Time-Saving Tool Enables Users to Perform Other Tasks While Listening to Relevant News
NEW YORK, (March 5, 2008) – Dow Jones & Company introduced a new “text-to-speech” capability in Dow Jones Factiva that allows users to listen to the news that drives their business. With one click, users can now listen to a news article rather than read it, freeing them to do other things and multitask as the pace of business today requires.

Currently available in beta format, a “Listen to Article” link appears at the top of any full-text article with fewer than 4,000 words. The listen capability is available in English, French, German, Italian and Spanish languages and automatically defaults to reading in the interface language previously selected by the user.

“Dow Jones Factiva continues to set itself apart from the competition by being the first to offer text-to-speech technology in the current awareness, news and research market,” said Dennis Cahill, senior vice president and chief product officer of the Dow Jones Enterprise Media Group. “This new capability builds on our commitment to provide customers with relevant news when, where and how they need it and to reinforce our No. 1 position in the marketplace.”

The listen capability is a Web-based service provided by VoiceCorp (www.voice-corp.com) that converts text into speech on the fly. It is made available wherever full-text articles are found, including alerts, search results and newsletters. Once the link is clicked, the listen capability uses a Flash player to read the article.

The addition of text-to-speech further builds on Dow Jones’s goal of integrating various forms of multimedia content into Dow Jones Factiva. In August 2007, Dow Jones Factiva added highly relevant video and audio information including business news, CEO interviews, executive speeches, shareholders meetings, product reviews and other meaningful business content. Dow Jones Factiva searches across more than 14,000 authoritative sources, including the exclusive combination of The Wall Street Journal, Dow Jones and Reuters newswires.
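
The rules described in the release (a link only on articles under 4,000 words, five languages, defaulting to the user’s interface language) are simple enough to write down in a few lines. This is only a sketch of those stated rules, not Dow Jones’s or VoiceCorp’s actual implementation: the function names, counting words by splitting on whitespace, and falling back to English for an unsupported interface language are all assumptions.

```python
MAX_WORDS = 4000  # the release says "fewer than 4,000 words"
# English, French, German, Italian and Spanish, as listed in the release.
LISTEN_LANGUAGES = {"en", "fr", "de", "it", "es"}


def show_listen_link(article_text):
    """True when an article is short enough to get a "Listen to Article" link."""
    # Assumption: words are counted by splitting on whitespace.
    return len(article_text.split()) < MAX_WORDS


def reading_language(interface_language, fallback="en"):
    """Default to the user's interface language; the fallback is an assumption."""
    code = interface_language.lower()
    return code if code in LISTEN_LANGUAGES else fallback
```

For example, `reading_language("FR")` gives `"fr"`, while an interface language outside the five listed would, under the assumption above, be read in English.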

Posted in: Customers

Design contest that rocks!

We needed a fresh and intuitive new logo for our upcoming rSpeak range of products. None of us here at the company is really a designer and we do not (yet) have an agency for these things. So what to do?

The answer is this; http://99designs.com

Neat! You submit what you need (like a logo or a design of any kind), some preferences and a price tag. Then people all over the world get to work on it. After only 2 days we had more than 50 nicely designed logos to choose from. Now, after a week, the number has grown to over 150. It is getting more and more difficult to choose the right one.

The good thing is that you can see how many times a designer has won before, and also look at some of their previous work. The best part is that you get some help along the way in choosing the right one. The designs are rated by visitors and other designers. So, if you know “almost” what you want, it is a great support to see which contributions others think are good. And that is definitely worth something. And hey, it is a real time saver. However, you are recommended to give some feedback to the designers, since that will probably result in them submitting new versions that are even better.

So, a piece of advice: if you want a nice design and you do not really know who can help you out, try 99designs.

I will post the winner in just a couple of days here on the blog.

Posted in: General

From servers to blades

Since we started with our speaking web services about 9 years ago we have grown quite a bit. We have gone from a lonely Sun Ultra 5 back in 2000 to about 50 high capacity servers of various kinds today. That’s a whole lot of servers…

Managing that number of servers is not the easiest task, and wow, how much space they take up.

We have now designed a brand new infrastructure for our future needs, built on Linux clusters running on IBM blade centres. We have also invested in a really powerful SAN (Storage Area Network) over the iSCSI protocol. It is a very competitively priced solution compared to Fibre Channel based SANs: almost the same result for only a fraction of the cost. Anyway, going from “normal” rack servers to something like a blade centre was not just a matter of starting to work with another server. With our “normal” servers, we are used to configuring them, plugging in the network cables, plugging in the power cord, and off you go!

When our brand new 14 slot blade centre arrived the other day, we noted that it was not just a matter of plugging in the power cords. Instead of the usual plug, there was a (very large) plug completely unknown to us. But wait! There are some good old-fashioned power cables! Almost… After a call to IBM, they kindly advised us to invest in a 1300€ PDU (Power Distribution Unit) to be connected to 3-phase electricity. Into that we could easily plug these “almost normal” power cables. We understood that this was not just another server; this was a whole new level of computer infrastructure.

Anyway, the new infrastructure is now up and running, and I must say: if you currently have 10+ servers and you plan to grow quite a bit, invest in blade centres!

And it’s better for the environment too! They consume nowhere near as much electricity (per server) as rack servers, and they do not require as much cooling either. From a management point of view it is a brand new world. And EVERYTHING is redundant! I feel like a happy kid. Pretty much the same feeling I experienced the day my mom upgraded my first computer (the C64) to the brand new Amiga 1000 :-)

Posted in: General
© 2010 VoiceCorp International B.V. | www.voice-corp.com | Powered by WordPress