www.delorie.com/gnu/docs/elisp-manual-21/elisp_198.html   search  
Buy the book!

GNU Emacs Lisp Reference Manual

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

15.3 Loading Non-ASCII Characters

When Emacs Lisp programs contain string constants with non-ASCII characters, these can be represented within Emacs either as unibyte strings or as multibyte strings (see section 33.1 Text Representations). Which representation is used depends on how the file is read into Emacs. If it is read with decoding into multibyte representation, the text of the Lisp program will be multibyte text, and its string constants will be multibyte strings. If a file containing Latin-1 characters (for example) is read without decoding, the text of the program will be unibyte text, and its string constants will be unibyte strings. See section 33.10 Coding Systems.

To make the results more predictable, Emacs always performs decoding into the multibyte representation when loading Lisp files, even if it was started with the `--unibyte' option. This means that string constants with non-ASCII characters translate into multibyte strings. The only exception is when a particular file specifies no decoding.

The reason Emacs is designed this way is so that Lisp programs give predictable results, regardless of how Emacs was started. In addition, this enables programs that depend on using multibyte text to work even in a unibyte Emacs. Of course, such programs should be designed to notice whether the user prefers unibyte or multibyte text, by checking default-enable-multibyte-characters, and convert representations appropriately.

In most Emacs Lisp programs, the fact that non-ASCII strings are multibyte strings should not be noticeable, since inserting them in unibyte buffers converts them to unibyte automatically. However, if this does make a difference, you can force a particular Lisp file to be interpreted as unibyte by writing `-*-unibyte: t;-*-' in a comment on the file's first line. With that designator, the file will unconditionally be interpreted as unibyte, even in an ordinary multibyte Emacs session. This can matter when making keybindings to non-ASCII characters written as ?vliteral.

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

  webmaster   donations   bookstore     delorie software   privacy  
  Copyright 2003   by The Free Software Foundation     Updated Jun 2003