Re: Bug#722156: ITP: binaryornot -- Ultra-lightweight pure Python package to check if a file is binary or text.

 ❦  8 September 2013 17:04 CEST, Julian Taylor <jtaylor.debian@googlemail.com>:

>> This Python package provides a function to check if a file is a text
>> file or a binary file. It uses the same heuristic as file(1) by
>> looking at the first 1024 bytes of the file and checks that all
>> characters are printable.
> do we need a package for that?
> it is effectively:
>     with open(filename, 'r') as f:
>         chunk = f.read(1024)
>         textchars = ''.join(map(chr, [7,8,9,10,12,13,27] + range(0x20, 0x100)))
>         bool(chunk.translate(None, textchars))
> and only works with utf-8 or ascii encoded files.
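
For reference, the quoted heuristic could be sketched in Python 3 roughly
as follows. This is only an illustration of the idea described above and
not binaryornot's actual code: the names looks_binary, TEXT_BYTES and
blocksize are made up for this example, and the byte list is taken from
the quoted snippet.

```python
# Hypothetical sketch of the quoted heuristic, ported to Python 3.
# Bytes treated as "text": a few control characters plus 0x20..0xFF.
TEXT_BYTES = bytes([7, 8, 9, 10, 12, 13, 27]) + bytes(range(0x20, 0x100))


def looks_binary(filename, blocksize=1024):
    """Return True if the first `blocksize` bytes contain non-text bytes."""
    # Read raw bytes so that no decoding happens before the check.
    with open(filename, "rb") as f:
        chunk = f.read(blocksize)
    # Delete every "text" byte; anything left over marks the chunk as binary.
    return bool(chunk.translate(None, TEXT_BYTES))
```

With this sketch, an empty file counts as text, and any NUL byte in the
first block makes the file count as binary.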

Yes, it is quite light. I believe that the detection mechanism will be
extended later. You could argue that we could just as well package it
then, but I need it now as a dependency of another package.

I have already uploaded it, but if there is controversy about its
usefulness, I can ask for it to be removed and patch the package that
needs those bits until binaryornot becomes more useful.
Make it right before you make it faster.
            - The Elements of Programming Style (Kernighan & Plauger)
