recode reference manual
The request level functions are meant to cover most recoding needs
programmers may have; they should provide all usual functionality.
Their API is almost stable by now. To get started with request level
functions, here is a full example of a program whose sole job is to filter
ibmpc code on its standard input into latin1 code on its
standard output.
     #include <stdio.h>
     #include <stdlib.h>   /* for exit */
     #include <stdbool.h>
     #include <recode.h>

     const char *program_name;

     int
     main (int argc, char *const *argv)
     {
       program_name = argv[0];
       RECODE_OUTER outer = recode_new_outer (true);
       RECODE_REQUEST request = recode_new_request (outer);
       bool success;

       /* Describe the recoding once.  */
       recode_scan_request (request, "ibmpc..latin1");

       /* Filter standard input into standard output.  */
       success = recode_file_to_file (request, stdin, stdout);

       recode_delete_request (request);
       recode_delete_outer (outer);

       exit (success ? 0 : 1);
     }
The header file <recode.h> declares a RECODE_REQUEST structure,
which the programmer should use for allocating a variable in the program.
This request variable is given as the first argument to all request
level functions and, in most cases, may be considered opaque.
     RECODE_REQUEST recode_new_request (outer);
     bool recode_delete_request (request);
No request variable may be used in other request level
functions of the recoding library before having been initialised by
recode_new_request. There may be many such request
variables, in which case they are independent of one another and
they all need to be initialised separately. To avoid memory leaks, a
request variable should not be initialised a second time without
first calling recode_delete_request to "un-initialise" it.
As with recode_delete_outer, the call to recode_delete_request
prior to program termination, in the example above, may be left out.
struct recode_request
Here are the fields of a struct recode_request which may be
meaningfully changed, once a request has been initialised by
recode_new_request, but before it gets used. It is not very frequent,
in practice, that these fields need to be changed. To access the fields,
you need to include `recodext.h' instead of `recode.h',
in which case there also is a greater chance that you need to recompile
your programs if a new version of the recoding library gets installed.
verbose_flag
This field is initially false. When set to true, the
library will echo to stderr the sequence of elementary recoding steps
needed to achieve the requested recoding.
diaeresis_char
In the texte charset, some countries use double quotes to mark
diaeresis, while other countries prefer colons. This field contains the
diaeresis character for the texte charset.
make_header_flag
This field is initially false. When set to true, it
indicates that the program is merely trying to produce a recoding table in
source form rather than completing any actual recoding. In such a case,
the optimisation of the step sequence can be attempted much more
aggressively. If the step sequence cannot be reduced to a single step,
table production will fail.
diacritics_only
This field is initially false. For the HTML and LaTeX
charsets, it is often convenient to recode the diacriticized characters
only, while leaving alone other HTML code using ampersands or angular
brackets, or LaTeX code using backslashes. Set the field to true
to get this behaviour. In the other charset, one can then edit text as
well as HTML or LaTeX directives.
ascii_graphics
This field is initially false, and relates to characters 176 to
223 in the ibmpc charset, which are used to draw boxes. When set
to true, while getting out of ibmpc, ASCII characters are
selected so as to graphically approximate these boxes.
     bool recode_scan_request (request, "string");
The main role of a request variable is to describe a set of
recoding transformations. Function recode_scan_request studies
the given string, and stores an internal representation of it into
request. Note that string may be a full-fledged recode
request, possibly including surface specifications, intermediary
charsets, sequences, aliases or abbreviations (see section 3.2 The request parameter).
The internal representation automatically receives some pre-conditioning
and optimisation, so the request may later be used many times
to achieve many actual recodings. Calling recode_scan_request many
times with the same string would be inefficient; when several recodings
are needed, it is better to keep many request variables instead.
Once the request variable holds the description of a recoding transformation, a few functions use it for achieving an actual recoding. Either input or output of a recoding may be a string, an in-memory buffer, or a file.
Functions with names like
recode_input-type_to_output-type request an actual
recoding, and are described below. It is easy to remember which arguments
each function accepts, once a few simple principles for each
possible type are grasped. However, one of the recoding functions escapes
these principles and is discussed separately, first.
     char *recode_string (request, string);
The function recode_string recodes string according
to request, and directly returns the resulting recoded string
freshly allocated, or NULL if the recoding could not succeed for
some reason. When this function is used, it is the responsibility of
the programmer to ensure that the memory used by the returned string is
later reclaimed.
     bool recode_string_to_buffer (request,
       input_string,
       &output_buffer, &output_length, &output_allocated);
     bool recode_string_to_file (request,
       input_string,
       output_file);
     bool recode_buffer_to_buffer (request,
       input_buffer, input_length,
       &output_buffer, &output_length, &output_allocated);
     bool recode_buffer_to_file (request,
       input_buffer, input_length,
       output_file);
     bool recode_file_to_buffer (request,
       input_file,
       &output_buffer, &output_length, &output_allocated);
     bool recode_file_to_file (request,
       input_file,
       output_file);
All these functions return a bool result, false meaning that
the recoding was not successful, often because of reversibility issues.
The name of each function indicates which type it reads and which
type it produces. Let's discuss these three types in turn.
A string is merely an in-memory buffer which is terminated by a NUL
character (using as many bytes as needed), instead of being described
by a byte length. For input, a pointer to the buffer is given through
one argument.
It is notable that there are no to_string functions. Only one
function recodes into a string, and it is recode_string, which
has already been discussed separately, above.
A buffer is a sequence of bytes held in computer memory. For input, two arguments provide a pointer to the start of the buffer and its byte size. Note that for charsets using many bytes per character, the size is given in bytes, not in characters.
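The difference between byte size and character count can be made concrete with a small sketch. It does not use the recoding library at all: `utf8_character_count` and `input_byte_size` are hypothetical helper names introduced here for illustration, and UTF-8 is merely assumed as an example of a charset using many bytes per character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the characters in a UTF-8 buffer of `size` bytes by skipping
   continuation bytes (those of the form 10xxxxxx).  Assumes valid UTF-8.  */
size_t
utf8_character_count (const char *buffer, size_t size)
{
  size_t count = 0;
  size_t i;

  assert (buffer != NULL || size == 0);
  for (i = 0; i < size; i++)
    if (((unsigned char) buffer[i] & 0xC0) != 0x80)
      count++;
  return count;
}

/* Byte size of a NUL-terminated UTF-8 input.  This, and not the character
   count, is the kind of value an input length argument expects.  */
size_t
input_byte_size (const char *string)
{
  return strlen (string);
}
```

For the UTF-8 text "café", written in C as `"caf\xC3\xA9"`, the byte size is 5 while the character count is 4; the size passed along with a buffer would be 5.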
For output, three arguments provide the address of three variables, which
will receive the buffer pointer, the used buffer size in bytes, and the
allocated buffer size in bytes. If at the time of the call, the buffer
pointer is NULL, then the allocated buffer size should also be zero,
and the buffer will be allocated afresh by the recoding functions. However,
if the buffer pointer is not NULL, it should be already allocated,
and the allocated buffer size then gives its size. If the allocated size
gets exceeded while the recoding goes on, the buffer will be automatically
reallocated bigger, probably elsewhere, and the allocated buffer size will
be adjusted accordingly.
The second variable, giving the in-memory buffer size, will receive the
exact byte size which was needed for the recoding. A NUL character
is guaranteed at the end of the produced buffer, but is not counted in the
byte size of the recoding. Beyond that NUL, there might be some
extra space after the recoded data, extending to the allocated buffer size.
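The convention for the three output variables can be sketched without the library. The function below, `copy_to_buffer`, is a hypothetical stand-in which merely copies its input unchanged; it is not the library's implementation, and only shows how a pointer, a used size and an allocated size might be handled according to the rules just described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a *_to_buffer function: no recoding is done,
   the input bytes are copied as is into the caller's output buffer.  */
bool
copy_to_buffer (const char *input_buffer, size_t input_length,
                char **output_buffer, size_t *output_length,
                size_t *output_allocated)
{
  size_t needed = input_length + 1;   /* room for the trailing NUL */

  assert (output_buffer && output_length && output_allocated);
  /* A NULL buffer pointer must come with a zero allocated size.  */
  assert (*output_buffer != NULL || *output_allocated == 0);

  if (*output_allocated < needed)
    {
      /* Grow the buffer; realloc may move it elsewhere.  */
      char *grown = realloc (*output_buffer, needed);

      if (grown == NULL)
        return false;
      *output_buffer = grown;
      *output_allocated = needed;
    }
  if (input_length > 0)
    memcpy (*output_buffer, input_buffer, input_length);
  (*output_buffer)[input_length] = '\0';
  *output_length = input_length;      /* the NUL is not counted */
  return true;
}
```

A caller of this sketch would start from `char *buffer = NULL; size_t length = 0, allocated = 0;`, pass the three addresses on every call so the same buffer gets reused and grown as needed, and release the buffer once at the end.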
A file is a sequence of bytes held outside computer memory, but
buffered through it. For input, one argument provides a pointer to a
file already opened for reading. The file is then read and recoded from its
current position until the end of the file, effectively swallowing it into
memory if the destination of the recoding is a buffer. For reading a file
filtered through the recoding library, but only a little bit at a time, one
should rather use recode_filter_open and recode_filter_close
(these two functions are not yet available).
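What "from its current position until the end of the file" means for the caller can be sketched with the standard I/O library alone. The function `read_rest_of_file` below is a hypothetical stand-in: it performs no recoding and is not part of the recoding library, it only shows a file being swallowed into a memory buffer starting wherever the stream currently stands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in: read `input_file` from its current position up to
   end of file into a freshly allocated, NUL-terminated buffer.  */
bool
read_rest_of_file (FILE *input_file, char **output_buffer,
                   size_t *output_length)
{
  size_t allocated = 64;
  size_t used = 0;
  char *buffer;
  int byte;

  assert (input_file && output_buffer && output_length);
  buffer = malloc (allocated);
  if (buffer == NULL)
    return false;

  while ((byte = getc (input_file)) != EOF)
    {
      if (used + 1 == allocated)      /* keep one byte for the NUL */
        {
          char *grown = realloc (buffer, allocated * 2);

          if (grown == NULL)
            {
              free (buffer);
              return false;
            }
          buffer = grown;
          allocated *= 2;
        }
      buffer[used++] = (char) byte;
    }
  if (ferror (input_file))
    {
      free (buffer);
      return false;
    }
  buffer[used] = '\0';
  *output_buffer = buffer;
  *output_length = used;              /* the NUL is not counted */
  return true;
}
```

Anything the caller has already read, or skipped with fseek, before the call is left out of the result, so a program can consume a header itself and hand over only the rest of the stream.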
For output, one argument provides a pointer to a file already opened for writing. The result of the recoding is written to that file starting at its current position.
The following special function is still subject to change:
     void recode_format_table (request, language, "name");
and is not documented anymore for now.
Copyright © 2003 by The Free Software Foundation. Updated Jun 2003.