
Bug#61342: your mail



On Wed, Mar 16, 2005 at 04:55:06PM -0800, Don Armstrong wrote:
> On Wed, 16 Mar 2005, Colin Watson wrote:
> > I realise it's a database format change, but I'd really prefer to
> > have the metadata files be pure UTF-8, so that we don't have to
> > process them for display every time, and to make things like
> > searching easier. We can always write a migration script.
> 
> I think that's the optimal solution too. However, this patch at least
> will work now, and we can move to pure UTF-8 later.

I've taken the approach of creating a new .summary format version; the
way the .summary file format works means that we can have
"Format-Version: 2" indicate RFC1522 metadata and "Format-Version: 3"
indicate UTF-8 metadata. I haven't yet made format version 3 the
default, but I will do in time.

This made the code a lot simpler, because metadata only needs to be
decoded/encoded in the two functions responsible for reading/writing
.summary files.
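
To illustrate the idea (this is only a rough Python sketch, not the
actual debbugs code, which is Perl): the reader looks at Format-Version
once and decodes accordingly, so callers always see decoded text. The
"Key: value" line layout, the "Submitter" field name, and the
parse_summary() helper are assumptions made for the example;
decode_rfc1522() here is a stand-in built on Python's email.header
module.

```python
from email.header import decode_header, make_header


def decode_rfc1522(value: str) -> str:
    """Decode RFC 1522 encoded-words (=?charset?q?...?=) into plain text."""
    return str(make_header(decode_header(value)))


def parse_summary(raw: bytes) -> dict:
    """Parse hypothetical 'Key: value' summary lines.

    Format-Version 2: values may contain RFC 1522 encoded-words.
    Format-Version 3: values are already UTF-8 and need no further decoding.
    A missing Format-Version is treated as 2 (an assumption of this sketch).
    """
    fields = {}
    for line in raw.decode("utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    version = fields.get("Format-Version", "2")
    if version == "2":
        # Only the reader knows about RFC 1522; callers get decoded text.
        fields = {k: decode_rfc1522(v) for k, v in fields.items()}
    elif version != "3":
        raise ValueError(f"unsupported Format-Version: {version}")
    return fields


if __name__ == "__main__":
    old = b"Format-Version: 2\nSubmitter: =?iso-8859-1?q?J=F6rg?=\n"
    new = "Format-Version: 3\nSubmitter: J\u00f6rg\n".encode("utf-8")
    print(parse_summary(old)["Submitter"] == parse_summary(new)["Submitter"])
```

The writer would mirror this: encode on output only when asked to
produce version 2, and write UTF-8 as-is for version 3.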

I've checked this into CVS, along with some of the uses of
decode_rfc1522() from your patch and the changes to make bugreport.cgi
and pkgreport.cgi output UTF-8, and installed it on bugs.debian.org.
This means that at least maintainer and submitter addresses are now
displayed properly.

The .log metadata and mail character set fixes still need more work; I'm
almost inclined to introduce a new, more structured record type to
replace the html record type at the same time, and have that be encoded
in UTF-8.

Cheers,

-- 
Colin Watson                                       [cjwatson@debian.org]


