
Re: Archiving of big mailboxes



On Sun, Apr 14, 2002 at 02:16:31PM -0700, Brian Nelson wrote:
> Grzegorz Prokopski <greg@sente.pl> writes:
> 
> > Hello!
> > 
> > I thought some of you might have a problem similar to mine.
> > I am subscribed to many, many lists and I get more than 200 emails
> > every day (maybe more, I didn't check exactly).
> > I use procmail to automatically sort this mail into different files.
> > Then I use an IMAP server and various clients (NN, Evolution, mutt) to
> > access my mail.
> > 
> > The problem is that the mailboxes grow and grow steadily and it takes
> > more time to check for new mail in each of them, more time to get
> > message indexes etc. And - to be honest - I'd like to back up those
> > 100MB+ files to some CD or at least compress them. But you need
> > to cut out the "older" part from them first.
...
> gnus, total expiry.  It's all automatic.  I just delete the stuff since
> all my mailing lists have web archives, but you can make it expire to
> another folder.
> 
> Back when I used mutt, I used
> 
> folder-hook "+lists" 'push "<delete-pattern>~d >2w<enter>"'
> 
> which marks everything in folders matching "+lists" older than 2 weeks
> for deletion.

Sounds like he wants to keep the old mail, which is what I do.

I use mutt to read the mail.  Depending on a list's volume, once
a month, quarter, half-year, or year I select all messages older
than that period and save them to another folder.  You can then
copy that folder off to a CD if you like, and/or compress it.

I also use procmail to pre-sort the mail into topics.

So:  mail comes in, and procmail pre-sorts it into topics.
Then, as I read the various folders, once in a while I
save all the older messages to a separate folder with the date
and time period in the name, e.g. debian-user-200203 for
the March 2002 debian-user messages.  My mutt configuration
automatically marks those messages in the debian-user mail
folder as deleted, and I can delete them safely because they
now exist in the other folder (debian-user-200203).  I usually
then go to a shell and compress the older folders:

$ gzip debian-user-200203
$ ls debian-user*
debian-user            debian-user-200010.gz  debian-user-200108.gz
debian-user-200001.gz  debian-user-200011.gz  debian-user-200109.gz
debian-user-200002.gz  debian-user-200012.gz  debian-user-200110.gz
debian-user-200003.gz  debian-user-200101.gz  debian-user-200111.gz
debian-user-200004.gz  debian-user-200102.gz  debian-user-200112.gz
debian-user-200005.gz  debian-user-200103.gz  debian-user-200201.gz
debian-user-200006.gz  debian-user-200104.gz  debian-user-200202
debian-user-200007.gz  debian-user-200105.gz  debian-users
debian-user-200008.gz  debian-user-200106.gz
debian-user-200009.gz  debian-user-200107.gz

Oops, I see I've saved one or a few messages to the wrong
folder name (debian-users).  I'll have to clean that up some day.

Once they are compressed, you can copy them to a CD and
erase them from your hard disk.
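
If you end up with several uncompressed archives, the gzip step can be
looped.  This is only a sketch of what I do by hand; the function name
compress_archives and its arguments are made up for illustration, and
it assumes the archives are plain files named LIST-<digits...> in one
directory:

```shell
#!/bin/sh
# compress_archives DIR LIST   (hypothetical helper, not part of mutt or procmail)
# Gzip every uncompressed dated archive of LIST in DIR, e.g. debian-user-200202.
# The live folder (plain "debian-user") has no "-<digit>" suffix, so the
# pattern below never matches it and it is left alone.
compress_archives() {
    dir=$1
    list=$2
    for f in "$dir/$list"-[0-9]*; do
        [ -f "$f" ] || continue              # pattern matched nothing
        case $f in *.gz) continue ;; esac    # already compressed
        gzip -- "$f"
    done
}
```

For example, "compress_archives ~/Mail debian-user" would compress
debian-user-200202 in the listing above and skip both debian-user and
the stray debian-users folder (assuming the folders live in ~/Mail).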

Other lists don't accumulate so quickly, and I save them every
three months, every six months, or every year.  In the names below,
q1-q4 is the quarter and h1/h2 is the half-year:

debian-sparc-2001q1.gz  netbsd-users-2000h2.gz  sparc-list-2000h1.gz
debian-sparc-2001q2.gz  netbsd-users-2001h1.gz  sparc-list-2000h2.gz
debian-sparc-2001q3.gz  pilot-2001h1.gz         sparc-list-2001h1.gz
hurd-2000h1.gz          pilot-2001h2.gz         sparclinux-2001h1.gz
hurd-2000h2.gz          port-sparc-2000h1.gz    sparclinux-2001h2.gz
hurd-2001h1.gz          port-sparc-2000h2.gz    suns-at-home-2000h1.gz
netbsd-help-2000h1.gz   port-sparc-2001h1.gz    suns-at-home-2000h2.gz
netbsd-help-2000h2.gz   port-sparc64-2001h1.gz  suns-at-home-2001h1.gz
netbsd-help-2001h1.gz   spam-2000h1.gz          tech-userlevel-2000h1.gz
netbsd-help-2001h2.gz   spam-2000h2.gz          tech-userlevel-2000h2.gz
netbsd-users-2000h1.gz  spam-2001h1.gz          tech-userlevel-2001h1.gz
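
The naming scheme can be written down as a small shell function.  I
type the names by hand, so treat this as a sketch: archive_name and
the period keywords (month, quarter, half) are invented for this
example, and yearly archives are left out because no yearly name
appears in the listings above.

```shell
#!/bin/sh
# archive_name LIST YEAR MONTH PERIOD   (hypothetical helper)
# Prints the archive folder name for the period containing MONTH.
#   month   -> LIST-YYYYMM    e.g. debian-user-200203
#   quarter -> LIST-YYYYqN    e.g. debian-sparc-2001q3
#   half    -> LIST-YYYYhN    e.g. hurd-2000h2
archive_name() {
    list=$1
    year=$2
    month=${3#0}     # drop a leading zero so 08 and 09 are not read as octal
    period=$4
    case $period in
        month)   printf '%s-%s%02d\n' "$list" "$year" "$month" ;;
        quarter) printf '%s-%sq%d\n' "$list" "$year" $(( (month - 1) / 3 + 1 )) ;;
        half)    printf '%s-%sh%d\n' "$list" "$year" $(( (month - 1) / 6 + 1 )) ;;
        *)       echo "unknown period: $period" >&2; return 1 ;;
    esac
}
```

So "archive_name debian-user 2002 03 month" prints debian-user-200203,
which is the folder name to save the March messages to from within mutt.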


To grep through old messages, you can use:

$ zcat hurd-*gz | grep what_youre_looking_for | less

(again, at the command line)
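
The zcat pipeline above shows the matching lines but not which archive
they came from.  A possible variation, again only a sketch (the name
search_archives is made up), prefixes each hit with its file name:

```shell
#!/bin/sh
# search_archives PATTERN FILE...   (hypothetical helper)
# Decompress each gzipped archive to stdout, grep it for PATTERN, and
# prefix every matching line with the archive it came from.
search_archives() {
    pattern=$1
    shift
    for f in "$@"; do
        gzip -dc -- "$f" \
            | grep -e "$pattern" \
            | awk -v name="$f" '{ print name ": " $0 }'
    done
}
```

Usage would be something like "search_archives what_youre_looking_for
hurd-*gz | less".  It still prints single lines rather than whole
messages, so once you know the archive you would gunzip it and open it
in your mail reader.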



I just do this for high-volume lists.  When a list gets to be
"too big" (i.e., takes too long to load into mutt), I cut off
a piece of its history and compress it with the above procedure.


I do this because I have a slow dial-up connection, and this is
one way to be able to look stuff up before asking a question on the
mailing list.  Also, who knows when the net will cease to be free
(in either the beer or speech meaning)?

-- 
bjb@achilles.net
Welcome to the GNU age!   http://www.gnu.org

