
Gnus Manual


3.18 Charsets

People use different charsets, and we have MIME to let us know what charsets they use. Or rather, we wish we had. Many people use newsreaders and mailers that do not understand or use MIME, and just send out messages without saying what character sets they use. To help a bit with this, some local news hierarchies have policies that say what character set is the default. For instance, the `fj' hierarchy uses iso-2022-jp-2.

This knowledge is encoded in the gnus-group-charset-alist variable, which is an alist of regexps (to match group names) and default charsets to be used when reading these groups.
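As a rough sketch of what an entry looks like, the fragment below adds one (REGEXP CHARSET) pair to the alist. The hierarchy name `example.ru` is hypothetical, and the `require` line assumes the variable is defined by the gnus-sum library in your Gnus version; adjust it if that is not the case.

```elisp
;; Sketch for the Gnus init file (e.g. ~/.gnus.el).
;; Assumption: gnus-sum is the library that defines the variable.
(require 'gnus-sum)

;; Each entry has the shape (REGEXP CHARSET).
;; "example.ru" is a hypothetical hierarchy used only for illustration.
;; The leading \(^\|:\) lets the regexp match both plain group names
;; and names carrying a "server:" prefix.
(add-to-list 'gnus-group-charset-alist
             '("\\(^\\|:\\)example\\.ru\\>" koi8-r))
```

Using add-to-list rather than setq keeps the built-in entries, such as the one for the `fj' hierarchy, in place.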

In addition, some people use so-called MIME-aware agents that are not. These blithely mark messages as being in iso-8859-1 even if they are really in koi-8. To help here, the gnus-newsgroup-ignored-charsets variable can be used. A charset listed there is ignored when it appears in a message's charset declaration, so the message is treated as if it had not declared that charset. The variable can be set on a group-by-group basis using the group parameters (see section 2.10 Group Parameters). The default value is (unknown-8bit), which is something some agents insist on having in there.
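A minimal sketch of a global setting, assuming you want to distrust iso-8859-1 labels everywhere in addition to the default unknown-8bit:

```elisp
;; Sketch: ignore bogus "iso-8859-1" labels as well as the default
;; "unknown-8bit".  This sets the global default; the group parameters
;; can still override it for individual groups.
(setq gnus-newsgroup-ignored-charsets '(unknown-8bit iso-8859-1))
```

The trade-off is that messages which really are in iso-8859-1 lose their label too, so a per-group setting is usually the safer choice.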

When posting, gnus-group-posting-charset-alist is used to determine which charsets should not be encoded using the MIME encodings. For instance, some hierarchies discourage using quoted-printable header encoding.

This variable is an alist of regexps and permitted unencoded charsets for posting. Each element of the alist has the form (test header body-list), where:

test
    is either a regular expression matching the newsgroup header or a variable to query,
header
    is the charset which may be left unencoded in the header (nil means encode all charsets),
body-list
    is a list of charsets which may be encoded using 8bit content-transfer encoding in the body, or one of the special values nil (always encode using quoted-printable) or t (always use 8bit).
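To make the (test header body-list) shape concrete, here is a hedged sketch of a complete value. The `relcom' regexp is a simplified illustration, not the exact pattern Gnus ships with, and setting the variable this way replaces whatever default your Gnus version provides.

```elisp
;; Sketch only: this replaces the whole alist, including any defaults.
(setq gnus-group-posting-charset-alist
      '(;; TEST is a regexp matched against the newsgroup header.
        ;; HEADER koi8-r: koi8-r may be left unencoded in headers.
        ;; BODY-LIST (koi8-r): koi8-r bodies may be sent as 8bit.
        ("^relcom\\." koi8-r (koi8-r))
        ;; TEST is a variable to query.  For mail: encode every
        ;; charset in headers, always use quoted-printable in bodies.
        (message-this-is-mail nil nil)
        ;; For any other news posting: encode every charset in
        ;; headers, always use 8bit in bodies.
        (message-this-is-news nil t)))
```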

Other charset tricks that may be useful, although not Gnus-specific:

If there are several MIME charsets that encode the same Emacs charset, you can choose what charset to use by saying the following:

(put-charset-property 'cyrillic-iso8859-5
                      'preferred-coding-system 'koi8-r)

This means that Russian will be encoded using koi8-r instead of the default iso-8859-5 MIME charset.

If you want to read messages in koi8-u, you can cheat and say

(define-coding-system-alias 'koi8-u 'koi8-r)

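This will almost do the right thing: koi8-u differs from koi8-r in only a handful of code points, the ones used for Ukrainian-specific letters, so only those letters will display incorrectly.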

And finally, to read charsets like windows-1251, you can say something like

(codepage-setup 1251)
(define-coding-system-alias 'windows-1251 'cp1251)

If you use a non-Latin-1 language environment, you can still see the Latin-1 subset of windows-1252 using:

(define-coding-system-alias 'windows-1252 'latin-1)
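If your Emacs may already know these coding systems natively, the aliases can be guarded so that they do not shadow a real definition. This is a small sketch using the standard coding-system-p predicate; whether the guard is needed depends on your Emacs version.

```elisp
;; Sketch: define the fallback aliases only when Emacs does not
;; already provide the coding system under that name.
(unless (coding-system-p 'koi8-u)
  (define-coding-system-alias 'koi8-u 'koi8-r))

(unless (coding-system-p 'windows-1252)
  (define-coding-system-alias 'windows-1252 'latin-1))
```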


Copyright 2003 by The Free Software Foundation. Updated Jun 2003.