Add synthesized speech to your web site with

WebNation QuickSpeech™

A Text-to-Speech-to-QuickTime Converter for Macintosh

Build version #0

July 8, 1999


Contents

Overview and Demo

Requirements for using QuickSpeech

Download QuickSpeech

QuickSpeech Usage Notes

Input and Output URLs

Menu commands

Preferences

Play Speech from URL

Status Window

Embedding QuickSpeech movies in your web page

Tips

QuickSpeech Sites

Possible future projects

Correspondence


Overview and Demo

WebNation QuickSpeech for Macintosh makes it easy to add synthesized speech to your web site.

Given a text file as input, QuickSpeech converts the text to synthesized speech, then saves the audio output as a QuickTime movie file.

The input and output files may be located on your disk drive or an Internet server, so you can automatically upload the QuickSpeech movie to your web site.

You can embed the QuickSpeech movie in your web pages. The speech audio will play back in the web browser of any Mac or Windows computer with QuickTime 4 installed.

If the speech needs to be updated periodically, the QuickSpeech Automatic Conversion mode can repeat the process according to your settings.

We're experimenting with QuickSpeech at ConwayWeather.com to give a talking weather forecast. The content is updated every 20 minutes or so.


Requirements for using QuickSpeech

QuickSpeech runs on the Power Mac only. 680x0 processors are not supported.

QuickSpeech requires MacOS 8.6. It uses several technologies built into MacOS 8.6.

QuickSpeech requires QuickTime 4 Pro. This is necessary in order to save the audio output files.

(Note: These requirements apply to the creation of the QuickSpeech output, not playback. Any Mac or Windows computer with the free version of QuickTime 4 can listen to the QuickSpeech movie.)


Download QuickSpeech

Follow this link to download QuickSpeech.


QuickSpeech Usage Notes

When you launch QuickSpeech for the first time, you should go to the Preferences window and click the "Change QuickTime output settings" button. The recommended output formats are Qualcomm PureVoice or QDesign Music2.

The QuickTime movie produced in the final output is ready for immediate use on the web. It has been compressed to a QuickTime audio format, flattened, stored in the data fork, and formatted for fast start playback.

Currently, there are no controls in QuickSpeech to select the Speech Manager Voice. Use Apple's "Speech" control panel to select the voice and rate of speech. The best voices for intelligible speech are Victoria High Quality, Bruce High Quality and Agnes High Quality.

In Automatic Conversion mode, you may find it difficult to use other programs while QuickSpeech is creating the text-to-speech output. Most programs are halted during the speech synthesis, audio capture and format conversion phases. (Some programs, such as IPNetRouter, continue to run in the background during QuickSpeech conversion.) This situation is expected to improve when MacOS X is available. Until then, you may want to consider running QuickSpeech on an unattended Macintosh.

During the speech synthesis and audio capture phase of the program, a large AIFF audio file is created. As a general guideline, reserve 10 megabytes of disk storage for every minute of audio. After the final audio compression, the final output can be reduced to 150K per minute, or less.
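The guideline above is simple arithmetic, and a short sketch may help when planning disk space. The constants are the figures quoted in this section; the function name is hypothetical and is not part of QuickSpeech.

```python
# Rough disk-space estimate based on the guideline figures in this document.
# estimate_disk_usage is a hypothetical helper, not part of QuickSpeech.

AIFF_MB_PER_MINUTE = 10          # temporary AIFF capture, per the guideline
COMPRESSED_KB_PER_MINUTE = 150   # upper figure quoted for the final movie


def estimate_disk_usage(minutes):
    """Return (temporary AIFF size in MB, final movie size in KB)."""
    return minutes * AIFF_MB_PER_MINUTE, minutes * COMPRESSED_KB_PER_MINUTE


aiff_mb, movie_kb = estimate_disk_usage(3)
print(f"3 minutes of speech: about {aiff_mb} MB of work space, "
      f"final movie up to about {movie_kb} KB")
```

The actual sizes depend on the QuickTime output settings you choose, so treat these numbers as a starting point only.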

QuickSpeech creates several work files during the conversion process, then deletes them when it is finished. To prevent conflicts, do not use these file names in the same folder as QuickSpeech:

  • "Sound 1"
  • "output1.aiff"
  • "output2.mov"
  • "output3.mov"

If your Macintosh crashes for some reason while using QuickSpeech, you should drag those files to the trash before starting again.


Input and Output URLs

QuickSpeech uses Uniform Resource Locators (URLs) to specify the input and output files. These URLs are entered in the Preferences window.

There are three ways to specify an input file:

  • http:// - gets the input file from a web server
  • ftp:// - gets the input file from an ftp server
  • file:/// - gets the input file from your disk drive or an AppleShare file server

Here are some examples of input URLs:

http://www.my-isp.com/~my-userid/speech.txt
   
ftp://ftp.my-isp.com/home/my-userid/public_html/news.txt
   
file:///my-disk/my-folder/myinputfile.txt

There are two ways to specify the output file:

  • ftp:// - transfers the output file to an ftp server
  • file:/// - transfers the output file to your disk drive

Here are some examples of output URLs:

ftp://my-userid:my-password@ftp.my-isp.com/home/my-userid/public_html/news.mov
   
file:///forecast.mov

When transferring files to or from an ftp server, you will probably need to include your User ID and password in the URL. In the ftp example above, the User ID ("my-userid") and password ("my-password") are specified in the URL. Be sure to include the colon (:) and the at sign (@) that are used as delimiters. You may also need to use your User ID and password to get an input file from an ftp server.
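To check that the delimiters in an ftp URL are in the right places, it can help to see how a general-purpose URL parser splits it. The sketch below uses Python's standard urllib.parse module on the example output URL from this section. It only illustrates the URL layout; it is not how QuickSpeech itself processes the URL, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Break a URL into the pieces separated by ':', '@' and '/'."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # text before "://"
        "user": parts.username,      # text before the colon
        "password": parts.password,  # text between the colon and the at sign
        "host": parts.hostname,      # text after the at sign
        "path": parts.path,          # directory and file name on the server
    }


info = split_ftp_url(
    "ftp://my-userid:my-password@ftp.my-isp.com"
    "/home/my-userid/public_html/news.mov"
)
for name, value in info.items():
    print(f"{name}: {value}")
```

If the colon or at sign is missing, the user or password entry comes back empty (None), which is a quick sign that the URL is malformed.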

In this version of QuickSpeech, if the "file:///" output URL is used, the file will be saved in the same folder as the QuickSpeech application. Any slash characters that are normally used as directory delimiters will instead be part of the output file name. So your best bet is to keep the output URL simple, as in the example above.


Menu commands

Play Speech from URL - This command takes a URL input, acquires the file and sends it to the Speech Synthesizer, without converting it to QuickTime audio. It uses some of the same parameters as the Preferences window, and is useful for testing different settings for the input file. It's also fun just to have it speak web pages.

Automatic Conversion Mode - This repeats the text-to-speech conversion automatically, based on the settings in the Preferences window.

Convert Now - Performs an immediate, one-time text-to-speech conversion, based on the settings in the Preferences window.

Preferences - Opens the Preferences dialog window. This is where most of the settings are made.

Quit - Quits the program.


Preferences

The first step in using QuickSpeech is to set your preferences:

Do Automatic Conversion - Check this box if you want QuickSpeech to run Automatic Conversions when the program is launched. This is useful for unattended startup of QuickSpeech. You can put an alias in the "Startup Items" folder to launch QuickSpeech when the computer is started, and this setting will allow QuickSpeech to start automatic conversions. The "Automatic Conversion Mode" menu command overrides this setting.

Seconds between Automatic Conversions - The number of seconds to wait between automatic text-to-speech conversion tasks.

Maximum Characters to Convert - An optional value to limit the maximum number of characters that will be converted. The current maximum is 32,767 (after HTML filtering).

Start Phrase - An optional text string to indicate where speech conversion should start. This can be plain text or an HTML tag.

End Phrase - An optional text string to indicate where speech conversion should end. This can be plain text or an HTML tag.

HTML Filter - If this option is checked, QuickSpeech will attempt to remove HTML tags from the input file. This HTML Filter is not very good. Things like JavaScript can slip through and cause errors in the filtering. If you are using the HTML Filter, but still getting HTML tags spoken in the output, try using the Start Phrase setting to get past the problem.

Input URL - The location of the text input file. The maximum length for the URL is 255 characters. See the Input and Output URLs section for further details.

Output URL - The location of the output QuickTime movie file. The maximum length for the URL is 255 characters. See the Input and Output URLs section for further details.

Change QuickTime Compression Settings - Press this button to change the QuickTime audio compression settings. You will probably want to experiment with different settings of the QuickTime Sample Rate and other options to get the best balance of quality and speed. Usually, this should be set to Qualcomm PureVoice or QDesign Music2 format. Be sure to click the "Options" button for the respective format and make the appropriate settings.
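The Start Phrase, End Phrase, HTML Filter and Maximum Characters settings above work together to decide which part of the input is spoken. The following Python sketch is a hypothetical model of that selection, not QuickSpeech's actual code. It assumes that speech begins just after the Start Phrase and stops just before the End Phrase, that the phrases are matched before tags are removed (since a phrase may itself be an HTML tag), and that the character limit is applied after filtering. The function name, the simple tag-stripping pattern and the whitespace cleanup are all assumptions made for illustration.

```python
import re

MAX_CHARS = 32767  # current maximum stated in this document


def select_text(source, start_phrase="", end_phrase="",
                html_filter=False, max_chars=MAX_CHARS):
    """Hypothetical model of the input settings; not QuickSpeech's own logic."""
    text = source
    if start_phrase:
        pos = text.find(start_phrase)
        if pos != -1:
            # Assumption: keep only the text after the Start Phrase.
            text = text[pos + len(start_phrase):]
    if end_phrase:
        pos = text.find(end_phrase)
        if pos != -1:
            # Assumption: drop the End Phrase and everything after it.
            text = text[:pos]
    if html_filter:
        # Naive tag removal; like any simple filter, scripts can slip through.
        text = re.sub(r"<[^>]*>", " ", text)
    # Assumption: collapse runs of whitespace left behind by the filtering.
    text = " ".join(text.split())
    return text[:max_chars]


page = ("<html><body><h1>News</h1><!--start-->"
        "<p>Sunny and <b>warm</b> today.</p>"
        "<!--end--><p>Footer</p></body></html>")
print(select_text(page, "<!--start-->", "<!--end-->", html_filter=True))
```

Marking the spoken region of a page with a pair of distinctive phrases, as in the example, avoids relying on the filter to cope with everything else on the page.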


Play Speech from URL

This is an interactive tool to help set the input parameters. Most of the controls are the same as the Preferences window.

There are two additional buttons for this window:

Speak the Text - Click this button to download the Input URL file and immediately play the synthesized speech. This function does not do the QuickTime compression. If you want to stop the speech while it is playing, press Command-period on the keyboard.

Copy to Preferences - This will copy the settings from this window to the corresponding settings in the Preferences window.


Status Window

The Status Window is always open. It displays a line of text showing the current status of QuickSpeech. In Automatic Conversion mode, it will update the countdown timer every ten seconds. Error messages will also be displayed there.


Embedding QuickSpeech movies in your web page

Here is a typical HTML statement to put a QuickTime movie such as the QuickSpeech audio output into your web page:

<EMBED SRC="http://www.my-isp.com/my-speech.mov" 
PLUGINSPAGE="http://quicktime.apple.com" 
WIDTH=8 HEIGHT=8 CONTROLLER=false LOOP=false AUTOPLAY=true HIDDEN>

For more information about embedding QuickTime files in web pages, see this page from Apple.

You can also access a QuickTime file on the Internet with Apple's QuickTime Player program, with the "Open URL" command.


Tips

You can place embedded speech commands in the text input file, and the Speech Manager will use them to control rate, pitch, silence, etc.
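As an illustration, the sketch below builds input text containing two embedded commands: one to set the speaking rate and one to insert a pause between sentences. It assumes the double-bracket command syntax and the "rate" and "slnc" (silence, in milliseconds) commands described in Apple's Speech Manager documentation; check that documentation for the commands your system supports. The function name and the sample sentences are hypothetical.

```python
def with_pauses(sentences, pause_ms=500, rate=None):
    """Join sentences, placing a silence command between each pair."""
    parts = []
    if rate is not None:
        # Assumed syntax: [[rate N]] sets the speaking rate in words per minute.
        parts.append(f"[[rate {rate}]]")
    # Assumed syntax: [[slnc N]] inserts N milliseconds of silence.
    separator = f" [[slnc {pause_ms}]] "
    parts.append(separator.join(sentences))
    return " ".join(parts)


text = with_pauses(
    ["Today will be sunny.", "Tonight will be clear."],
    pause_ms=750,
    rate=160,
)
print(text)
```

Saving the resulting text as the input file lets the Speech Manager apply the commands when QuickSpeech converts it.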

On WebNation's Apache web server (on a RedHat Linux system), we use PHP and wget to create the text input files for QuickSpeech.

To ensure that web browsers download a fresh copy of the QuickTime movie each time they visit the page, configure Apache to adjust the Expiration header for the movie file.


QuickSpeech Sites

Currently, only one site is using QuickSpeech to generate synthesized speech. If you use QuickSpeech for your site, we would be happy to add it to this list.


Possible future projects

Improve the HTML Filter. A regex library has been compiled for future work in this area.

Add a dialog window for speech synthesis parameters (voice, rate of speech, etc.)

Add an optional hints track for streaming servers (QuickTime Streaming Server for Mac OS X Server, PRISS for Linux).

Improve automatic conversion scheduling, add multiple jobs, etc.

Allow different voices based on phrases in the text input. For instance, in the text of an interview, the phrase "Q:" could use one voice and the phrase "A:" could use another voice. This would help the synthesized speech sound like a conversation between two people. This would work with screenplays, scripts, press conferences and other literature too.

Modify the built-in AIFF Writer component to use less disk space for the audio capture, or divide the text-to-speech conversion into smaller segments and merge the compressed segments into the final output file.

Port to MacApp and Carbon libraries. Add support for AppleEvents and scripting, and for operating as a web browser helper application.

Improve the URL Output specification to allow movie files to be saved in locations on the local disk drive other than the application folder.


Correspondence

Correspondence regarding QuickSpeech may be directed to Doug Ward at dsward@aol.com. Be sure to put "QuickSpeech" in the subject line.

There is no official support for QuickSpeech.

 

QuickSpeech is a trademark of Doug Ward and WebNation.

Other trademarks are the property of their respective owners.

Copyright © 1999 by Doug Ward and WebNation.