Bug#96085: mgdiff has wrong heuristics for determining whether a file is text or not



Package: mgdiff
Version: 1.0-12
Severity: important

mgdiff uses a heuristic to sanity-check whether it is handling text
files or not. But this heuristic gets it wrong for any text that
contains non-ASCII characters. I would accept it if it were only a
warning, but mgdiff flatly refuses to diff my files. This bug is in fact
very old and well known here; I don't quite know why it hasn't been
fixed. I guess upstream development has ceased. Actually, with the advent
of vimdiff I don't need mgdiff anymore, but anyway, here is a quick patch:

-----------------------------------------------------------------
diff silly/files.c serious/files.c
54c54
< static int is_ascii_text (char *filename);
---
> static int is_text (char *filename);
81,82c81,83
< /* 
<  * quick heuristic to test whether a file's contents are ascii text
---
> /*
>  * quick heuristic to test whether a file's contents are text
>  * MPi: too quick! changed to accept at least all of the latin-* stuff
84c85
< static int is_ascii_text (char *filename)
---
> static int is_text (char *filename)
86c87
<     int fd, bytes, i;
---
>     int fd, bytes, i, byte;
93,94c94,98
<     for (i = 0; i < bytes; i++)
<       if (!isascii (buffer[i]))
---
>     for (i = 0; i < bytes; i++) {
>       byte=buffer[i];
>       if (!( isascii (byte) ||
>                   (160<=byte && byte<=255) ||
>                   (-96<=byte && byte<=-1) ))
95a100
>     }
128c133
<     if (!is_ascii_text (filename)) {
---
>     if (!is_text (filename)) {
diff silly/patchlevel.h serious/patchlevel.h
36c36
< #define PATCHLEVEL "0"
---
> #define PATCHLEVEL "0-debian1"
-----------------------------------------------------------------
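
To make the byte test in the patched loop easier to see, here is a
minimal standalone sketch of the same check. It is not the code from
files.c: the name looks_like_text and its buffer-plus-length signature
are hypothetical, chosen only for illustration. It assumes the intent of
the patch is to accept 7-bit ASCII plus the bytes 160..255 used by the
latin-* character sets, and to reject 128..159.

```c
#include <stddef.h>

/*
 * Sketch only: looks_like_text() is a hypothetical helper, not a
 * function from mgdiff. It returns 1 if every byte is either 7-bit
 * ASCII (0x00-0x7F) or in the latin-* range 0xA0-0xFF, and 0 as soon
 * as it sees a byte in 0x80-0x9F.
 */
static int looks_like_text(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char byte = buf[i];

        /* Only 0x80..0x9F falls outside both accepted ranges. */
        if (byte >= 0x80 && byte < 0xA0)
            return 0;
    }
    return 1;
}
```

Reading the buffer as unsigned char means a single range test is enough.
The patch instead copies a plain char into an int, so it checks both
160..255 and -96..-1 to cover platforms where char is signed as well as
those where it is unsigned.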

Also I'd like to know - is this package orphaned? It has debian-qa as
its maintainer, but it doesn't appear on the WNPP list. If no one
objects, I'll take it; it doesn't look like a heavy burden. Oh wait,
sorry, I see an ITA, but it's from January...


-- System Information
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux kosh 2.4.0-ac7 #5 Mon Jan 22 09:11:26 CET 2001 i686

Versions of packages mgdiff depends on:
ii  file                         3.33-4      Determines file type using "magic"
ii  lesstif1                     1:0.92.26-1 OSF/Motif implementation released 
ii  libc6                        2.2.2-4     GNU C Library: Shared libraries an
ii  mawk                         1.3.3-5     a pattern scanning and text proces
ii  xlibs                        4.0.2-13    X Window System client libraries  

-- 
|=| Michael Piefel                    piefel@informatik.hu-berlin.de
|=| Humboldt-Universität zu Berlin              http://www.piefel.de
|=| Tel. (+49 30) 2093 3831


