www.delorie.com/gnu/docs/recode/recode_15.html
 
The recode reference manual

3.9 Debugging considerations

In our experience, when recode does not give satisfying results, one of three things has usually happened: recode was not called properly, the results were correct but looked doubtful nevertheless, or the files to recode were somewhat mangled. Genuine bugs are of course possible too.

Unless you are already a recode expert, it might be a good idea to quickly revisit the tutorial (see section 1. Quick Tutorial) or the prior sections in this chapter, to make sure that you formatted your recoding request properly. If you intended to use recode as a filter, make sure that you did not forget to redirect standard input (using the < symbol in the shell, say). Some apparent recode mysteries are also easily explained; see section 3.5 Reversibility issues.

For the other cases, some investigation is needed. To illustrate how to proceed, let's presume that you want to recode the `nicepage' file, encoded in UTF-8, into HTML. The problem is that the command `recode u8..h nicepage' yields:

 
recode: Invalid input in step `UTF-8..ISO-10646-UCS-2'

One good trick is to use recode in filter mode instead of file replacement mode (see section 3.1 Synopsis of recode call). Another good trick is to use the `-v' option, which asks for a verbose description of the recoding steps. We could rewrite our recoding call as `recode -v u8..h <nicepage', to get something like:

 
Request: UTF-8..:libiconv:..ISO-10646-UCS-2..HTML_4.0
Shrunk to: UTF-8..ISO-10646-UCS-2..HTML_4.0
[...some output...]
recode: Invalid input in step `UTF-8..ISO-10646-UCS-2'

This might help you better understand what the diagnostic means. The recoding request is achieved in two steps: the first recodes UTF-8 into UCS-2, the second recodes UCS-2 into HTML. The problem occurs within the first of these two steps, and since the input of this step is the input file given to recode, it is the overall input file which seems to be invalid. Also, when used in filter mode, recode processes as much input as possible before the error occurs, and sends the result of this processing to standard output. Since standard output has not been redirected to a file, it is merely displayed on the user's screen. By inspecting the end of the resulting HTML output, that is, what was recoded shortly before the recoding was interrupted, you may infer where the error lies in the actual UTF-8 input file.
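The inspection described above can be made a little more comfortable by keeping the partial output and the diagnostics apart. The following is only a sketch: `recode_tail' is a hypothetical helper name, not part of recode, and it assumes that `mktemp' and `tail' are available and that diagnostics arrive on standard error.

```shell
# Hypothetical helper: run recode as a filter, then show the diagnostics
# and the last lines it managed to produce before stopping.
recode_tail() {
    request=$1   # for example: u8..h
    input=$2     # for example: nicepage (only read, never overwritten)
    out=$(mktemp) || return 1
    err=$(mktemp) || { rm -f "$out"; return 1; }
    status=0
    # Filter mode: input comes from stdin, partial output goes to $out.
    recode -v "$request" <"$input" >"$out" 2>"$err" || status=$?
    echo "--- diagnostics ---"
    cat "$err"
    echo "--- last lines recoded before recode stopped ---"
    tail -n 5 "$out"
    rm -f "$out" "$err"
    # Hand back recode's own exit status to the caller.
    return "$status"
}
```

It could be called as `recode_tail u8..h nicepage'. The text shown last is what was recoded shortly before the error, so searching the input file for it should lead close to the offending bytes.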

If you have the proper tools to examine the intermediate recoding data, you might also prefer to reduce the problem to a single step to better study it. This is what I usually do. For example, the last recode call above is more or less equivalent to:

 
recode -v UTF-8..ISO-10646-UCS-2 <nicepage >temporary
recode -v ISO-10646-UCS-2..HTML_4.0 <temporary
rm temporary

If you know that the problem is within the first step, you might prefer to concentrate on using the first recode line. If you know that the problem is within the second step, you might execute the first recode line once and for all, and then play with the second recode call, repeatedly using the `temporary' file created once by the first call.
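When no dedicated tool is at hand for examining the intermediate data, a plain hex dump is often enough. The sketch below uses the standard `od' utility; `dump_tail' is a hypothetical helper name chosen for this illustration, and the default of four lines is arbitrary.

```shell
# Hypothetical helper: hex-dump a file and keep only the last few dump lines,
# which is where output stops when a recoding step was interrupted.
dump_tail() {
    file=$1
    lines=${2:-4}
    # -A d prints decimal file offsets, -t x1 prints each byte in hex.
    od -A d -t x1 "$file" | tail -n "$lines"
}
```

For instance, `dump_tail temporary' shows the final bytes of the intermediate file. Since UCS-2 stores each character in two bytes, every pair of hex columns corresponds to one character.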

Note that the `-f' switch may be used to force the production of HTML output despite invalid input. This might be satisfying enough for you, and easier than repairing the input file; it depends on how strict you would like to be about the precision of the recoding process.

If you later see that your HTML file begins with `&lt;html&gt;' when you expected `<html>', then recode might have done a bit more than you wanted. In this case, your input file was already half-UTF-8, half-HTML, that is, a mixed file (see section 3.7 Using mixed charset input). There is a special `-d' switch for this case. So you might end up calling `recode -fd u8..h nicepage'. Until you are quite sure that you accept overwriting your input file no matter what, I recommend that you stick with filter mode.
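One way to stick with filter mode while experimenting with these switches is to always write to a fresh file. This is a sketch only: `recode_preview' is a hypothetical helper name, and the refusal to overwrite an existing output file is a precaution added for this illustration, not a recode feature.

```shell
# Hypothetical helper: try a request in filter mode, leaving the input intact.
recode_preview() {
    request=$1   # for example: u8..h
    input=$2     # file to read; never modified here
    output=$3    # new file to create
    if [ -e "$output" ]; then
        echo "recode_preview: $output already exists, not overwriting" >&2
        return 2
    fi
    # -f and -d are the switches discussed above; redirecting stdin and
    # stdout keeps recode in filter mode.
    recode -fd "$request" <"$input" >"$output"
}
```

A call such as `recode_preview u8..h nicepage nicepage.html' leaves `nicepage' untouched, so the result can be compared with the original before anything is replaced.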

If, after such experiments, you seriously think that the recode program does not behave properly, there might be a genuine bug in the program itself, in which case I invite you to contribute a bug report (see section 2.3 Contributions and bug reports).


Copyright © 2003 by The Free Software Foundation. Updated Jun 2003.
