gnu2html

gnu2html is the program I use to maintain my GNU Documentation page. Due to the high number of requests for copies of this archive (which my modem can't handle), I've decided to publish gnu2html so that others may create their own local copies of this archive. If you try this, please send me e-mail and let me know how it went - so far, nobody has gotten back to me!

If You Are Impatient

What You Need Before You Get Started

gnu2html is a Perl 5 script that runs on a Unix machine. The machine you run it on must have direct FTP access to a GNU mirror (such as prep.ai.mit.edu). This program can't go through a web proxy or SOCKS firewall. You will need about 44 MB of disk space for the 11,000 or so HTML files it will produce. You will need to create four template HTML files that will be used to put your standard headers and trailers on generated index pages (more on that later). You will need to download and install a copy of texi2html.
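Some of the requirements above can be checked before a first run. What follows is a minimal sketch in Python and is not part of gnu2html. The helper names are made up for illustration, the 44 MB threshold is the estimate quoted above, and it assumes perl and texi2html are meant to be found on your PATH. It does not check FTP access or the Perl version.

```python
import shutil

REQUIRED_BYTES = 44 * 1024 * 1024  # roughly 44 MB, the estimate quoted above


def has_enough_space(free_bytes, required_bytes=REQUIRED_BYTES):
    """Return True when the free space covers the estimated archive size."""
    return free_bytes >= required_bytes


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def preflight(web_dir, tools=("perl", "texi2html")):
    """Collect human-readable problems; an empty list means none were found."""
    problems = []
    free = shutil.disk_usage(web_dir).free
    if not has_enough_space(free):
        problems.append(f"only {free // (1024 * 1024)} MB free in {web_dir}")
    for tool in missing_tools(tools):
        problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `preflight(".")` from the directory that will hold the HTML archive returns a list of warnings to read before starting.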

What You Need From Me

Most importantly, you'll need the gnu2html program itself, which is a Perl script. There are some changes you'll need to make to it (more on that later) to configure it for your site. I've also written man2html.c, a program to convert man pages to HTML. It's not perfect, but it converts most of the GNU manual pages pretty well, and it doesn't rely on having groff or nroff installed. You'll have to figure out how to compile it on your system, but it's pretty standard C. You'll also have to edit it to put in your own header and trailer HTML.

There's also a control file called gnudocs.ini that tells gnu2html how to extract the documentation out of the source archives and process it into the HTML archive. By default, gnu2html will FTP to my server and get the latest copy of this file, as I keep it up to date for my own archive. You can get a copy of this file and edit it, but it's easier to just let gnu2html grab a copy from my server. If you need to edit it, here are some instructions.

How To Configure And Install It

First, check the first line of gnu2html to make sure it refers to an appropriate copy of Perl 5.

You'll need to edit gnu2html to get files from your local GNU source mirror. If the closest mirror is prep.ai.mit.edu, you won't need to edit that part. You also have to tell it where to put the HTML files. In the gnu2html file you'll find these variables:

$ftp_host
This should be set to the name of the FTP server where you get GNU sources from. The default is prep.ai.mit.edu, but you should set it to something closer to you.

$ftp_user
The name of the user to log in to the FTP server as. anonymous is usually a good choice, but you may need to use a personal account if you're using a private FTP server.

$ftp_password
The password to log in with. For anonymous FTP, this is usually your e-mail address, like joe@somewhere.com (please use your actual email address!). The usual warnings about storing passwords in clear text apply.

$ftp_directory
The directory on the FTP server where the GNU documentation is kept. This directory will have a couple of hundred .tar.gz files (example: emacs-*.tar.gz).

$ini_*
Same as $ftp_* (above), but this is the place where gnu2html will get the latest gnudocs.ini file from. If you're getting it from me, you can leave these alone. If you want to use a local file, remove (or comment out) these lines and store your file as gnudocs.ini in the directory you run gnu2html from (or give the full path on the command line).

$default_ini
If you're not using FTP to get your gnudocs.ini file, you can set this to be the default location of your local gnudocs.ini file. This is a file path, not a URL!

$temp
Where to store temporary files. Such files include downloaded GNU sources, uncompressed sources, and converted HTML files. On some systems, /tmp won't have much room, so you should set this to a suitable directory (maybe /usr/tmp).

$web
This is the directory that becomes the top of the GNU Documentation archive. gnu2html will create $web/index.html and many subdirectories here.

$ftp_cache
If you uncomment this line and set it to a suitable directory, then any files that are downloaded from the GNU FTP server will be stored there and not deleted. If you do this, you can use "gnu2html -getnew" to download new files automatically and be notified of them without converting them.

$man
The program to use to convert man pages to HTML. This program should take the man sources (e.g. *.1) via stdin, and produce HTML via stdout.

$texi
The location of texi2html on your system.
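To see how the $ftp_* settings fit together, it can help to try the same login by hand before editing the script. The sketch below uses Python's standard ftplib and is only an illustration. It is not how gnu2html itself talks to the server, and the function names are hypothetical. The four parameters correspond to $ftp_host, $ftp_user, $ftp_password and $ftp_directory.

```python
from ftplib import FTP


def select_archives(names):
    """Keep only gzip-compressed tarballs such as emacs-*.tar.gz."""
    return sorted(name for name in names if name.endswith(".tar.gz"))


def list_archives(host, user, password, directory):
    """Log in, change to the source directory, and list the tarballs there."""
    with FTP(host) as ftp:        # connects when the object is created
        ftp.login(user, password)
        ftp.cwd(directory)        # fails with an FTP error if the path is wrong
        return select_archives(ftp.nlst())
```

For example, `list_archives("ftp.example.org", "anonymous", "joe@somewhere.com", "/pub/gnu")` should return a couple of hundred names if the directory is the right one. The host and path here are placeholders, so substitute your own mirror's details.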

Next, you'll need to create four template files that gnu2html will use to generate the index.html files. The one for subpages can have %s where you want the title placed. I've got samples available in the "if you are impatient" section at the top of this page.
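As an illustration of the %s placeholder in the subpage template, here is a small Python sketch of the substitution. It is an assumption about the behaviour, not code taken from gnu2html. The template markup and the function name are hypothetical, and escaping the title is a choice made here rather than something the script is known to do.

```python
import html


def fill_title(template, title):
    """Replace the %s placeholder in `template` with the HTML-escaped title."""
    # str.replace is used instead of the % operator so that other percent
    # signs in the template (for example width="100%") are left alone.
    return template.replace("%s", html.escape(title))


# Hypothetical subpage header; the real markup is whatever suits your site.
SUBPAGE_HEADER = "<html><head><title>%s</title></head>\n<body>\n"

print(fill_title(SUBPAGE_HEADER, "GNU Documentation: emacs"))
```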

OK, if you've got gnu2html configured, and man2html and texi2html installed, you're ready to try it. Just run gnu2html and wait (this could take many hours the first time).
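Since a first run can take hours, it may be worth confirming beforehand that the program named in $man really behaves as a filter: man source in on stdin, HTML out on stdout. The following Python sketch shows one way to check that. The function names are hypothetical, and the stand-in converter is a one-line stub used only so the example runs without man2html installed.

```python
import subprocess
import sys


def run_filter(command, man_source):
    """Feed `man_source` to `command` on stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=man_source,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


def looks_like_html(output):
    """Very loose check: output is non-empty and contains a tag bracket pair."""
    return "<" in output and ">" in output


# Stand-in for the real converter: wraps whatever it reads in <pre> tags.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('<pre>' + sys.stdin.read() + '</pre>')",
]

output = run_filter(STAND_IN, ".TH LS 1\n")
print(looks_like_html(output))
```

To test the real converter, replace `STAND_IN` with the command you set in $man and pass in the contents of any man source file you have at hand.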

The last thing to do is install a cron entry to keep your documents up to date. This one runs early Sunday morning:

    0 3 * * 0 /home/dj/src/gnu2html/gnu2html

  Copyright 1999   by The Free Software Foundation     Updated Jan 1999