20. Sep 2011


 TTS Demos

Text-to-Speech (TTS) systems: In learning and teaching, I take full advantage of technology, and TTS is one such tool. I have been using Linguatec Personal Translator Professional (which contains a TTS module) and AT&T Natural Voices. While a low-quality TTS system with a robotic voice might be enough to clue a native speaker in on how to pronounce an unfamiliar word, such audio would be virtually useless for a foreign learner. Only high-quality TTS systems will do. Be aware that even high-end TTS systems can still sound not quite natural in intonation.

Here are some TTS samples I produced in March 2008 to illustrate the quality differences:

0. Robotic

1. Medium

2. Medium High

3. High Quality

Here are a few drill clips I created in the past, using TTS for my own learning, or to show others:

EN ECB - the European Central Bank
FR BCE - la Banque centrale européenne
DE EZB - die Europäische Zentralbank
FR ONU - l'Organisation des Nations unies
FR USA - les États-Unis d'Amérique

« Bonjour, bienvenue à Bravo Centre de Technologie. Nous sommes le département impartir de informatique, pour de petites entreprises. Nous fournissons les ordinateurs, le logiciel, le réseau, la formation et la consultation. Nous servons des clients aux Etats-Unis, au Royaume-Uni, et au Hong Kong. »

Here is a sample recording of my own practice, following the clips I produced above. This is one of the actual methods I use to learn and improve my pronunciation of certain phrases or complex words. It approximates the benefit of a patient human tutor breaking down the syllables at the trouble spots.

I'm learning here. Do NOT rely on my pronunciation accuracy.

Tech Notes

Of course, I'm just using a software product, not creating it. My only value-add here is scripting the source document for each clip and refining the strategy and concept: sequencing, repetition, and so on. Typically, some tagging is needed to control pace and other characteristics. I often also need to tweak the spelling, either to accommodate the quirks of the algorithm in a particular TTS engine or to override the default when multiple pronunciations exist for the same letter pattern.
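To give a rough idea of what such tagging and respelling can look like, here is a minimal Python sketch. It assumes an engine that accepts standard SSML (the `<break>` and `<prosody>` elements); the products named above may use different tags. The function names `respell` and `drill_ssml` and the single respelling entry are hypothetical, made up for illustration only.

```python
from xml.sax.saxutils import escape

# Hypothetical respelling table: maps a token the engine might misread to a
# spelling that nudges it toward the intended sound. Illustrative entry only.
RESPELLINGS = {"ONU": "O N U"}


def respell(text, table=RESPELLINGS):
    """Replace whole whitespace-separated words found in the table."""
    return " ".join(table.get(word, word) for word in text.split())


def drill_ssml(short, full, lang, repeats=2, pause_ms=600, rate="slow"):
    """Build an SSML drill: the abbreviation once, then the full phrase
    repeated at a slower rate, with a pause after each item."""
    pause = f'<break time="{pause_ms}ms"/>'
    parts = [
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
    ]
    parts.append(f"{escape(respell(short))}{pause}")
    for _ in range(repeats):
        parts.append(f'<prosody rate="{rate}">{escape(full)}</prosody>{pause}')
    parts.append("</speak>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(drill_ssml("ONU", "l'Organisation des Nations unies", "fr-FR"))
```

The repeat count, pause length, and rate are parameters because those are the knobs that usually need adjusting per clip; the respelling table stays separate so engine-specific quirks do not leak into the drill text itself.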

Besides serving as mini drills to send to others, voice clips are also great for inserting into playlists between songs. It's like having brief lessons as 30-second "commercial" breaks. I have written some "batch scripts" to automate that process. In fact, I routinely listen to my own "radio programs": a mix of songs I'm currently learning and favorites of the month/week, with a batch of linguistic drill clips and mini tutorials (self-produced or otherwise acquired) inserted programmatically in randomized sequence.
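The playlist idea can be sketched in a few lines of Python. This is not my actual batch script, just a minimal illustration under stated assumptions: the file names are hypothetical, the output is a plain M3U playlist, and one clip is inserted after every `every` songs.

```python
import random


def build_program(songs, clips, every=1, seed=None):
    """Interleave songs with drill clips, one clip after every `every` songs.

    Clips are drawn from a shuffled pool that is refilled only when empty,
    so every clip plays once before any clip repeats.
    """
    rng = random.Random(seed)
    pool = []
    program = []
    for count, song in enumerate(songs, start=1):
        program.append(song)
        if clips and count % every == 0:
            if not pool:
                pool = list(clips)
                rng.shuffle(pool)
            program.append(pool.pop())
    return program


def write_m3u(entries, path):
    """Write the entries as an extended M3U playlist, one path per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("#EXTM3U\n")
        for entry in entries:
            handle.write(entry + "\n")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    songs = ["song1.mp3", "song2.mp3", "song3.mp3", "song4.mp3"]
    clips = ["drill_ecb.mp3", "drill_onu.mp3"]
    for entry in build_program(songs, clips, every=2):
        print(entry)
```

Calling `write_m3u(build_program(songs, clips), "program.m3u8")` would save the result for a media player. Songs keep their original order here; only the choice of clip is randomized.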



