Source File Character Set

Internally, TADS 3 uses Unicode to represent all text strings.  Unicode was created to provide a single character set that can represent virtually all of the characters used by all of the world's written languages.

 

Unfortunately, most computers today don't use Unicode as their native format, so your source files will probably use some other character set.  For example, if you're a Windows user in the United States or Western Europe, your source files probably use "code page 1252," which includes the ASCII characters plus the accented letters and special symbols commonly used in Western European languages such as French, Spanish, and German.  Macintosh users in these parts of the world use a different, Mac-specific character set, which includes almost exactly the same set of glyphs but assigns different character codes to most of the accented letters and special symbols.
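The difference between the Windows and Macintosh character sets can be seen in a minimal sketch. It is written in Python rather than TADS, and it assumes that Python's built-in `cp1252` and `mac_roman` codecs are reasonable stand-ins for the two character sets described above. The sample string is purely illustrative.

```python
# Illustration only: Python's built-in codecs stand in for the two
# character sets described above; this is not TADS code.
text = "café"

windows_bytes = text.encode("cp1252")     # Windows Western European
mac_bytes = text.encode("mac_roman")      # classic Macintosh Western set

# Print the character codes in hex. The ASCII letters share the same
# codes in both sets; the accented letter does not.
print(windows_bytes.hex(" "))
print(mac_bytes.hex(" "))

# Decoding each byte sequence with its own character set recovers the
# same Unicode string.
same_text = windows_bytes.decode("cp1252") == mac_bytes.decode("mac_roman")
print(same_text)
```

The glyphs are the same in both files, but a program reading the bytes has to know which character set produced them before it can translate them to Unicode correctly.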

 

The TADS 3 compiler is designed to work with these diverse system character sets.  When the compiler reads your source file, it automatically translates the characters from the local format used by your operating system into Unicode characters.  Only Unicode characters are stored in the compiled version of your program (the image file).

 

The TADS 3 compiler uses an external file called a "character mapping file" to translate from your source format into Unicode.  The character map is a file that comes with the compiler; its name ends with ".tcm" (TADS Character Map).  In most cases, you won't need to worry about this file, because the compiler automatically loads the appropriate mapping for your operating system.  In some cases, however, you might want to specify a mapping explicitly rather than letting the compiler choose.  For example, if the source file you're compiling was created on a different type of computer and you haven't translated it into your local character set, you might need to use the mapping for that other computer.  You might also be intentionally using a character set other than your system's default: if your machine is localized for Western Europe but you're writing a game in Russian, you'll probably prepare your source file in a Cyrillic character set rather than your computer's default Western European one.

 

To specify a mapping, use the -cs option when compiling, followed by the name of the mapping.  The mapping name is simply the mapping file's name without the ".tcm" suffix.  For example, to compile source code created using code page 1250 (Windows Central/Eastern Europe), you'd use this command:

 

    t3make -cs cp1250 mygame.t
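To see why the choice of mapping matters, here is a small Python sketch of what happens when code page 1250 text is translated with the matching mapping and with a mismatched one. This is only an analogy: the `to_unicode` helper is hypothetical, and Python's codec names stand in for TADS mapping files. It does not show how the compiler itself is implemented.

```python
# Illustration only: Python codecs play the role of character mapping
# files here; the codec names are Python's, not TADS mapping names.
def to_unicode(raw: bytes, charset: str) -> str:
    """Translate source bytes to Unicode using the named character set."""
    return raw.decode(charset)

# A Polish word as it would be saved by an editor using code page 1250.
original = "żółw"
source_bytes = original.encode("cp1250")

right = to_unicode(source_bytes, "cp1250")   # matching mapping
wrong = to_unicode(source_bytes, "cp1252")   # mismatched mapping

print(right == original)   # the matching mapping recovers the text
print(wrong == original)   # the mismatched mapping does not
```

With the mismatched mapping, every byte still translates to some Unicode character, so nothing fails outright; the text simply comes out with the wrong letters.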

 

If the TADS 3 distribution doesn't include a mapping file for the character set you're using, you can build your own mapping file.