Mail Archives: djgpp/1995/01/09/06:35:32

Date: Mon, 9 Jan 1995 20:18:08 +0900
From: Stephen Turnbull <turnbull AT shako DOT sk DOT tsukuba DOT ac DOT jp>
To: djgpp AT sun DOT soe DOT clarkson DOT edu
Subject: Searchable DJGPP archive (FAQ? ;)

Two announcements concerning the Yaseppochi-gumi archive:

(1) I am making Eli Zaretskii's beta FAQ available.  I am adding a
    search capability for it.
(2) I now believe that the *.stripped.gz files have no loss of
    content.  From my announcement:

      One caveat: to speed up searches, I have stripped duplicate headers
      generated by RMail and nuisance headers (such as "Reply-to:" and
      "Received:") from the archives.  However, the reduction in size of the
      *.gz files is suspiciously large, [...]

I was right.  This has been fixed; the *.stripped.gz are now
substantially larger in several cases (you can look at the DU-sorted
file on my server---of course I used the "-s" option; before comes
first).  Some are smaller because I added "X400-[-A-Za-z]+:" to the
list of nuisance headers.
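
For illustration only, here is a minimal sketch of the kind of header
stripping described above. It is not the actual filter used on the
archive. The function name, the case-insensitive matching, and the
handling of continuation lines are assumptions; only "Reply-to:",
"Received:", and the X400 pattern come from the post. The sketch only
touches lines before the first blank line, so a body line that happens
to start with "Received:" survives.

```python
import re

# Header names taken from the post: "Received:", "Reply-to:", and the
# quoted pattern "X400-[-A-Za-z]+:". Case-insensitive matching is an
# assumption made for this sketch.
NUISANCE = re.compile(r"^(Received|Reply-to|X400-[-A-Za-z]+):", re.IGNORECASE)


def strip_nuisance_headers(message: str) -> str:
    """Drop nuisance header fields (and their continuation lines)."""
    # Headers end at the first blank line; everything after is body.
    head, sep, body = message.partition("\n\n")
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Continuation line: shares the fate of the field it extends.
            if not skipping:
                kept.append(line)
            continue
        skipping = bool(NUISANCE.match(line))
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + sep + body
```

Restricting the filter to the header block is one way to avoid the
kind of content loss mentioned above, since no pattern is ever applied
to body text.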

Some FAQs on the archive search (well, these are the *only* questions
I've got so far, so they're the most F-lyAQs ;)

   That is probably just the Received headers.  Your message came here
   as seen below, so you can see that you should expect some shrinking.

I deleted the appended "Received:" headers; we all know what they look
like---and if we don't, we don't want to.  That's why I filter them.
(Well, much more importantly, it substantially speeds up the greps.)

As Bob Babcock (I think it was) pointed out, if stuff like "Received:"
headers can make the *.gz files balloon (in one case, to 3 times the
size!), gzip ain't on the job.  In fact, when I filtered my own .sig,
I was stripping large amounts of content from a couple of files.  This
is due to the fact that the "last-line-of-my-sig" regexp didn't catch
some variant .sigs I use ;-) I don't use my .sig all the time (that's
why whole files didn't disappear).

   Also, I just tried searching for "unsubscibe" and came up with way
                                          ^
                                          |
typo ;-) ---------------------------------+

   too little text.  I don't know why.

These are my personal received-mail files---my left middle finger sits
on the 'd' key just to filter "unsubscribe".  I will eventually use
the Clarkson archives, but for the moment I'm using my personal stuff
as it's more easily available to me (my Clarkson copy is about 4 months
old and offline).  When I *do* get the Clarkson archives, I will filter
all messages with fewer than 4 lines of content containing "subscribe",
"add", or "delete" ;-)
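
A rough sketch of how such a filter might look, under stated
assumptions: "content" is taken to mean non-blank body lines, the
keywords are matched as whole words (with "unsubscribe" added so it is
caught too), and the names ADMIN_WORDS and is_admin_noise are
hypothetical. None of this is the filter that will actually be used.

```python
import re

# Keywords from the post, matched as whole words so that "address"
# does not count as "add". Including "unsubscribe" is an assumption.
ADMIN_WORDS = re.compile(r"\b((un)?subscribe|add|delete)\b", re.IGNORECASE)


def is_admin_noise(body: str, max_lines: int = 4) -> bool:
    """True if the body is short and mentions a list-admin keyword."""
    # Assumption: blank lines do not count as content.
    content = [line for line in body.splitlines() if line.strip()]
    if len(content) >= max_lines:
        return False
    return any(ADMIN_WORDS.search(line) for line in content)
```

Something like `is_admin_noise("unsubscribe\n")` would flag the
message, while a four-line question that merely uses the word "add"
would be kept.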
    I hope there's nothing too mortifyingly personal or insulting in
there ;-)
    --Steve <turnbull AT shako DOT sk DOT tsukuba DOT ac DOT jp>
