
Re: Linux' (and other OS's) code patterns present in device drivers, the kernel and userland code ...



I'm sorry, I still don't fully understand the point you are trying
to make, but I'll try...

On Mon, Dec 02, 2013 at 08:26:04PM +0000, Albretch Mueller wrote:
>  as I mentioned in relation to adler32's hashing implementation, there
> are quite a few important libraries heavily using it:
> ~
>  http://packages.debian.org/search?searchon=contents&keywords=adler32&mode=filename&suite=stable&arch=any

This isn't a very reliable way to ascertain this, because...

>  I think code correlation issues, even if not a totally trivial,
> syntactic problem (compilers could take care of), pertains only to
> adler32's hashing implementation, yet in the case of such relatively
> straightforward and simple piece of code (more than) 16 libraries
> include their own implementations:
> ~
>  golang | grub | lib32go0 | lib64go0 | libavutil | libbotan1 |
> libcrypto++ | libgcj12 | libghc | libgo0 | libsrecord | libwireshark |
> openswan | php5 | ri1 | tcllib

...You are listing binary package search results. Several of those binary
packages are from the same source package (e.g., lib32go0, lib64go0,
libgo0 all belong to the source package gcc-4.7).

The simplicity of the hash implementation is one reason that there is
such code duplication. I recall once looking for an MD5 library for a
program I was working on and finding instead embedded copies of MD5
implementations everywhere.  I eventually took the public domain one
from dpkg.
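
To give a sense of how little code is involved, here is a minimal,
unoptimised sketch of Adler-32 in Python. It is my own illustration,
not the code from any of the packages listed above; the names
MOD_ADLER and adler32 are just ones I picked.

```python
MOD_ADLER = 65521  # largest prime below 2**16


def adler32(data: bytes) -> int:
    """Plain Adler-32 over a bytes object (illustrative, unoptimised)."""
    a, b = 1, 0  # a starts at 1, b at 0, per the algorithm's definition
    for byte in data:
        a = (a + byte) % MOD_ADLER  # running sum of the bytes
        b = (b + a) % MOD_ADLER     # running sum of the running sums
    return (b << 16) | a            # b in the high 16 bits, a in the low 16
```

Real implementations usually defer the modulo over blocks of input for
speed, but the core remains about this size, which is why copying it is
so tempting.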

Libraries don't come for free. The work required to create a shared
"libmd5" library would far outweigh the work of writing the algorithm
itself. The same is probably true for all individual hash algorithms.

Shared libraries that offer a collection of hashes do exist, but there
is also a cost for a downstream user to use a library: license concerns
(GPL vs. openssl for example); portability issues; increasing the
complexity of source builds for some.

> In the search results above, golang 
> ~
>  and many such as rsync and zlib do their own adler32 hashing in code modules
> ~

Even when an appropriate shared library does exist, you could only
use it in C directly: for C++ or scripting languages you would need
to wrap the C library in native bits to make it work. Do this for
a simple algorithm and the wrapping would once again outweigh the
algorithm itself. In many cases, you would also lose one of the reasons you
were writing in a higher-level language in the first place, with your
library interface either a close match to the C semantics or a thick
abstraction layer. In some cases you would get better performance
re-implementing the algorithm. In some cases, of course, you get better
performance using a C (or assembler) implementation.
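
As a rough sketch of that trade-off in one scripting language: Python's
standard zlib module is itself a native wrapper around the C zlib
library, so its adler32() can be compared with a pure-Python
re-implementation. The helper names adler32_pure and compare are
hypothetical, and whatever timings you get will depend on the
interpreter and the machine.

```python
import timeit
import zlib


def adler32_pure(data: bytes) -> int:
    """Pure-Python Adler-32 (illustrative, unoptimised)."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a


def compare(data: bytes, repeats: int = 5) -> dict:
    """Check both implementations agree, then time each one in seconds."""
    if adler32_pure(data) != zlib.adler32(data):
        raise ValueError("implementations disagree")
    return {
        "pure": timeit.timeit(lambda: adler32_pure(data), number=repeats),
        "zlib": timeit.timeit(lambda: zlib.adler32(data), number=repeats),
    }
```

Calling compare(b"x" * 100000) returns the two timings side by side.
Which way the balance tips depends entirely on the language runtime,
which is the point: neither choice is right everywhere.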

>  For example, this is what {kadav, swift} @cs.wisc.edu say about
> (similar) code patterns:
> ~
>  http://pages.cs.wisc.edu/~kadav/study/study.pdf

I don't see the resemblance between this paper and what I have
interpreted you as trying to say above.

P.S.: what's with all the tildes?

