[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: One-line password generator



On 25/08/17 04:21, Thomas Schmitt wrote:
> One can estimate entropy by an approximation of the best possible
> compression in the context of the knowledge of the reader.
> The compression result will generally be longer if the compressor has
> fewer knowledge about the message.

In principle, yes, but in practice, not at all. File compressors are
designed assuming that the common case will be compression of data much
longer than passwords (at least 1 MB). The behavior for message less
than 100 B long will be highly anomalous.

Moreover, the meta-data (like magic number, container, et cetera) add
overhead to the compressed file. If we interpret compressed length as
entropy this will inflate your estimate of entropy by tens of bytes,
which is enough to make it useless.

The problem trying to estimate entropy of a message M' given a prior
message M (the _context_ in your wording) can be formulated
mathematically in terms of Kolmogorov complexity. Unfortunately,
determining “the” Kolmogorov complexity of a message (given an universal
encoding scheme, for example, programs in untyped λ-calculus) is
algorithmically undecidable. Worse yet, Chaitin proved a theorem (now
called Chaitin incompleteness theorem) that for any consistent formal
system there exist a bound N such that the formal system can not prove
that “the” Kolmogorov complexity of any specific string is higher than N.

> The second password class and my knowledge about it gives me not more
> than a reduction of text bit number by 25 percent (6 bit text -> 8 bit
> binary) and a couple of bits which are harder to harvest.
> E.g. i know that a dictionary attack is of few use.  That's one bit,
> because it's the first decision i can make. Any further insight might add
> only a fraction of a bit. (It's probabilistic. So we can grind bits to dust.)

This is a somewhat oversimplified analysis. You know beforehand that a
password is almost surely a sequence of printable characters among the
allocated code points in Unicode. If you know the program in which the
password has been input, then you can know the character encoding as
well. Assuming it is UTF-8, you can discard a large fraction of all
possible 8-bit strings (not all 8-bit strings are valid UTF-8). Thus the
prior distribution has less than 8 bits of entropy per bit.

-- 
Do not eat animals, respect them as you respect people.
https://duckduckgo.com/?q=how+to+(become+OR+eat)+vegan

Attachment: signature.asc
Description: OpenPGP digital signature


Reply to: