Re: Bug#722156: ITP: binaryornot -- Ultra-lightweight pure Python package to check if a file is binary or text.
* Julian Taylor <email@example.com>, 2013-09-08, 17:04:
* URL : https://github.com/audreyr/binaryornot
* License : BSD
* Programming Lang: Python
Description : Ultra-lightweight pure Python package to check if a file is binary or text.
This Python package provides a function to check if a file is a text file or
a binary file. It uses the same heuristic as file(1) by looking at the first
1024 bytes of the file and checks that all characters are printable.
do we need a package for that?
I wouldn't mind a library that does only one little thing, if it did it right, was
well-documented and came with a decent test suite. Unfortunately, binaryornot
is currently not like that. Its bug density is rather high:
PY3 = sys.version > '3'
Eww, the sys.version_info tuple should be used for comparisons instead.
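To illustrate, here is a minimal sketch of the tuple-based check. The string comparison only happens to work for single-digit major versions; the '10.0' value below is a made-up version used purely to show the failure mode.

```python
import sys

# Compare the version tuple, not the human-readable version string.
PY3 = sys.version_info[0] >= 3

# String comparison is lexicographic, so a hypothetical two-digit
# major version would sort before '3':
assert '10.0' < '3'      # wrong order when compared as strings
assert (10, 0) > (3,)    # right order when compared as tuples
```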
def unicode_open(filename, *args, **kwargs):
Opens a file as usual on Python 3, and with UTF-8 encoding on Python 2.
So it uses the locale encoding in Python 3, but UTF-8 in Python 2. Why such an
inconsistency? Also, this function isn't used anywhere...
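One way to get the same behaviour on both major versions, assuming UTF-8 is what was intended, would be to go through io.open() with an explicit encoding. This is only a sketch; the simplified signature (mode and encoding as keyword parameters) is my assumption, not what binaryornot does.

```python
import io


def unicode_open(filename, mode='r', encoding='utf-8'):
    """Open a file in text mode with the same encoding on Python 2 and 3."""
    # io.open() exists since Python 2.6 and is the built-in open() on
    # Python 3, so the encoding argument is honoured on both.
    return io.open(filename, mode, encoding=encoding)
```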
:param filename: File to open and get the first little chunk of.
:returns: Starting chunk of bytes.
with open(filename, 'r') as f:
    chunk = f.read(1024)
Docstring says it returns "bytes", but in Python 3 it returns a Unicode string.
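For the docstring to be true on both versions, the file would have to be opened in binary mode. A rough sketch, where the length parameter is my own addition for illustration:

```python
def get_starting_chunk(filename, length=1024):
    """Return the first `length` bytes of the file as a byte string."""
    # 'rb' skips decoding entirely, so the result is bytes on Python 3
    # and a plain (byte) str on Python 2.
    with open(filename, 'rb') as f:
        return f.read(length)
```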
:param bytes: A chunk of bytes to check.
The parameter's name is "bytes_to_check", not "bytes".
textchars = ''.join(
    map(chr, [7, 8, 9, 10, 12, 13, 27] + range(0x20, 0x100)))
In Python 3, this raises TypeError.
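The TypeError comes from adding a list to a range object. A version that should work on Python 2.6+ and 3 could build the table as a byte string instead. This is a sketch under the assumption that the table is meant to be used with translate(); contains_nontext() is a hypothetical helper name, not part of binaryornot.

```python
# Control characters commonly found in text, plus everything from 0x20 up.
_text_values = [7, 8, 9, 10, 12, 13, 27] + list(range(0x20, 0x100))

# bytearray() accepts an iterable of ints on both Python 2.6+ and 3;
# bytes() turns it into an immutable byte string on both.
textchars = bytes(bytearray(_text_values))


def contains_nontext(chunk):
    """Return True if the byte string has bytes outside textchars."""
    # translate(None, delete) removes every byte listed in `delete`;
    # whatever is left over is not a "text" byte.
    return bool(chunk.translate(None, textchars))
```

With this, contains_nontext(b'hello\n') is false and contains_nontext(b'\x00abc') is true.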
:param filename: File to check.
:returns: True if it's a binary file, otherwise False.
How is is_binary_alt() different from is_binary()? They have identical docstrings.
chunk = get_starting_chunk(filename)
if not PY3:
There's no "else" branch...