
Re: User-Agent strings, privacy and Debian browsers




On Sep 23, 2007, at 1:39 AM, Joerg Jaspert wrote:
On 11150 March 1977, Peter Eckersley wrote:

This is highly debatable. There may be tens or thousands of users of
the same package visiting a web site.
I've seen reports from very large sites indicating that User-Agent
strings are almost as useful as cookies for tracking their users.

I can't believe this. Looking at the stats from packages.debian.org, the
U-A is the worst possible way to "track users". It would be totally dumb
to try something with the U-A:

Whether it is dumb or not, it is widely used.

Same for anything matching "Firefox/", has 467789 total hits,
with more detail, first 15 rows we get
89003 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)
51159 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
21879 Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
11289 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)
10975 Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
10217 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
 8542 Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.6) Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)
 7572 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
 6029 Mozilla/5.0 (Windows; U; Windows NT 5.1; es-ES; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
 5379 Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
 4885 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
 4859 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7
 4606 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
 4549 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20070919 Ubuntu/7.10 (gutsy) Firefox/2.0.0.6
 4472 Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7

And that's a quick and very inaccurate way to do it. But it nicely shows
that modifying your UA (or forcing others to do so) does not gain you or
anyone else anything. The only effect you have is to make statistics
more unusable than they already are.

I think that is the stated goal.
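
To put rough numbers on the figures quoted above, here is a minimal Python sketch that turns three of those rows into a share of the 467789 "Firefox/" hits and the corresponding number of bits (-log2 of the share). The function name share_and_bits is my own, and this counts hits rather than distinct users, so treat it only as an order-of-magnitude illustration:

```python
import math

# Total quoted above for anything matching "Firefox/".
TOTAL_FIREFOX_HITS = 467789

# Three of the rows quoted above: (hits, User-Agent string).
ROWS = [
    (89003, "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) "
            "Gecko/20061201 Firefox/2.0.0.6 (Ubuntu-feisty)"),
    (51159, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.7) "
            "Gecko/20070914 Firefox/2.0.0.7"),
    (4472, "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.7) "
           "Gecko/20070914 Firefox/2.0.0.7"),
]


def share_and_bits(hits, total):
    """Return (share of total hits, bits of information = -log2(share))."""
    share = hits / total
    return share, -math.log2(share)


if __name__ == "__main__":
    for hits, ua in ROWS:
        share, bits = share_and_bits(hits, TOTAL_FIREFOX_HITS)
        print(f"{hits:6d}  {share:6.1%}  {bits:4.1f} bits  {ua}")
```

The more common a string is, the fewer bits it carries, which is the sense in which the top rows say very little about any one visitor.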


I am still not sure what the issue is from a privacy standpoint. Is it that the EFF fears that information in web server logs might point to a particular user because that user could be identified by the package version of the web browser they are using, as stated in the UA string? This seems a pretty flimsy premise for identifying someone before a court, not least because that string is completely malleable.

Furthermore, the second that package gets updated, the string will change. Packages can change frequently, at least in comparison to new versions of Debian itself. Any new upstream release should bump the version string you speak of, and so should any new package revision inside Debian (if the package is not Debian native, the last part of the version string is often the Debian revision, i.e. the trailing -1 in Debian-1.1.4-1). So the package version is a volatile string and not something that a web site analytics tool (like yaalr for instance :) ) would use to effectively "track" the user.
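
As a rough illustration of how volatile that last part is, here is a small Python sketch that splits a version such as 1.1.4-1 into its upstream part and its Debian revision, and drops a "(Debian-...)" comment from a UA string. The function names and the exact shape of the UA comment are my assumptions for the example, not something any particular log analyzer is known to do:

```python
import re

# Assumed shape of the packaging comment in the UA string, e.g. "(Debian-1.1.4-1)".
DEBIAN_TOKEN = re.compile(r"\(Debian-(?P<version>[^)\s]+)\)")


def split_debian_version(version):
    """Split 'upstream-revision' at the last hyphen.

    A version without a hyphen is treated as a native package,
    which has no separate Debian revision.
    """
    upstream, sep, revision = version.rpartition("-")
    if not sep:
        return version, None
    return upstream, revision


def strip_packaging(ua):
    """Remove the assumed '(Debian-...)' comment, keeping the rest of the UA."""
    return DEBIAN_TOKEN.sub("", ua).strip()


if __name__ == "__main__":
    print(split_debian_version("1.1.4-1"))
    print(strip_packaging("Iceweasel/2.0.0.6 (Debian-2.0.0.6-1)"))
```

Anything keyed on the revision part would stop matching at the next upload, while the stripped string stays stable across package revisions.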

Furthermore, it seems highly unlikely that a web site would drill down so far into the UA string to get this data and use it as a unique identifier. What purpose would that serve? Certainly no web site relies on the package version number of Iceweasel or Firefox in order to render correctly, and if one did, it would more likely look at the version string of the software itself and ignore any Debian packaging.

I could see one scenario where this might have relevance. That would be if the UA string was logged on several servers. For example, our hypothetical user goes to stealmp3.com and leaves her UA string. Then she goes to hacktheNSA.org, leaving the same UA string. She carefully refused any cookies and used different IP addresses, but the UA string shows which version of the Iceweasel package she used, and the authorities know that that package was only available during a two-week period, coincidentally the same time as our user was surfing. The authorities (or the RIAA) use this information to narrow down the network and potentially the location of the user (through geolocation perhaps, but that is also unreliable).

But this scenario seems highly implausible.
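
For what it is worth, the cross-server comparison in that scenario would amount to something like the Python sketch below: take the UA strings seen in two logs and keep the ones that appear in both and are rare in each. The function name, the max_hits threshold, and the sample strings are all hypothetical, and a match says nothing about whether the same person was behind both requests:

```python
from collections import Counter


def rare_shared_agents(log_a, log_b, max_hits=1):
    """Return UA strings present in both logs with at most max_hits in each."""
    count_a = Counter(log_a)
    count_b = Counter(log_b)
    shared = set(count_a) & set(count_b)
    return sorted(
        ua for ua in shared
        if count_a[ua] <= max_hits and count_b[ua] <= max_hits
    )


if __name__ == "__main__":
    # Hypothetical log extracts: one UA string per request.
    site_one = [
        "Firefox/2.0.0.7",
        "Firefox/2.0.0.7",
        "Iceweasel/2.0.0.6 (Debian-2.0.0.6-1)",
    ]
    site_two = [
        "Firefox/2.0.0.7",
        "Iceweasel/2.0.0.6 (Debian-2.0.0.6-1)",
    ]
    print(rare_shared_agents(site_one, site_two))
```

With the common Firefox string filtered out by its hit count, only the rare packaged string survives, which is exactly why the scenario depends on the package being used by very few people.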

	Jeremiah




