
Linux' (and other OS's) code patterns present in device drivers, the kernel and userland code ...



>>  Do you know of any code correlation analysis in Linux, preferably,
>> based on some measurable metrics?
~
> You may get more of a response to this question if you ask again with a more appropriate Subject.
~
 As I mentioned in relation to adler32's hashing implementation, quite
a few important libraries use it heavily:
~
 http://packages.debian.org/search?searchon=contents&keywords=adler32&mode=filename&suite=stable&arch=any
~
 I think code correlation issues, even if not a totally trivial,
syntactic problem (one that compilers could take care of), pertain only
to adler32's hashing implementation, yet even for such a relatively
straightforward and simple piece of code (more than) 16 libraries
include their own implementations:
~
 golang | grub | lib32go0 | lib64go0 | libavutil | libbotan1 |
libcrypto++ | libgcj12 | libghc | libgo0 | libsrecord | libwireshark |
openswan | php5 | ri1 | tcllib
~
 and many others, such as rsync and zlib, do their own adler32 hashing
in their code modules
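~
 To show how small the piece of code in question is, here is a minimal
sketch of the Adler-32 computation in Python. It is my own illustration,
not the code of any of the packages listed above; the names adler32 and
MOD_ADLER are chosen for this example, and the only outside call is
zlib.adler32 from the Python standard library, used as a cross-check.

```python
import zlib

# Largest prime below 2**16; both running sums are reduced modulo this value.
MOD_ADLER = 65521


def adler32(data: bytes) -> int:
    """Return the Adler-32 checksum of data as an unsigned 32-bit integer."""
    a = 1  # running sum of the bytes, seeded with 1
    b = 0  # running sum of the successive values of a
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    # b goes into the high 16 bits, a into the low 16 bits
    return (b << 16) | a


if __name__ == "__main__":
    sample = b"code correlation"
    # Cross-check the sketch against the implementation shipped with zlib.
    print(adler32(sample) == zlib.adler32(sample))
```

 A real implementation would defer the modulo over blocks of input for
speed, which is probably one of the places where the many copies differ.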
~
 More concise code would also influence reliability and security. I
don't think that the NSAs of the world out there would like it a bit if
networking were taken entirely out of the kernel and defined explicitly
for certain users, especially if drivers (and code in general)
cleanly separated their input, output and i/o venues
~
 Even though monkey me is interested in such problems (corpora
research) I doubt I am the only one who has been bugged by those
questions
~
 For example, this is what {kadav, swift} @cs.wisc.edu say about
(similar) code patterns:
~
 http://pages.cs.wisc.edu/~kadav/study/study.pdf
~
 "... First, a substantial number of assumptions about drivers, such
as class behavior, lack of computation, are true for many drivers but
by no means all drivers. For example, instead of request handling, the
bulk of driver code is dedicated to initialization/cleanup and
configuration, together accounting for 51% of driver code. A
substantial fraction (44%) of drivers have behavior outside the class
definition, and 15% perform significant computations over data. Thus,
relying on a generic frontend network driver, as in Xen
virtualization, conceals the unique features of different devices.
Similarly, synthesizing driver code may be difficult, as this
processing code may not be possible to synthesize."
~
 "At one end, miniport drivers contain almost exclusively
device-specific code that talks to the device, leaving kernel
interactions to a shared library. At the other end, some drivers make
extensive calls to the kernel and very few into shared device
libraries."
~
 "We find that USB and Xenbus provide the opportunity to utilize the
extra cycles on devices by executing drivers on them and can
effectively be used to remove drivers from the kernel leaving only
standardized bus code in the kernel."
~
 "Finally, we find strong evidence that there are substantial
opportunities to reduce the amount of driver code. The similarity
analysis shows that there are many instances of similar code patterns
that could be replaced with better library abstractions, or in some
cases with tables."
~
 Probably there is also some level of "emperor's-clothes" secrecy
going on. Let's face it, even though we talk about "computer
'sciences'" (not even engineering) coding is basically some carpentry
;-). If all the paths of a state machine can be written in a fully
addressable way (such as XML) there wouldn't be much left of a great
many programming languages' obfuscations ...
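~
 As a rough sketch of what I mean by writing the paths of a state
machine as data, here is a transition table in Python. The states and
events (a made-up device lifecycle) and the names TRANSITIONS, step and
run are all hypothetical, invented for this example; they are not taken
from any real driver or from the study quoted above.

```python
# Hypothetical device lifecycle written as a plain table:
# (current state, event) -> next state. Every legal path is addressable data.
TRANSITIONS = {
    ("down", "probe"): "initialized",
    ("initialized", "open"): "running",
    ("running", "close"): "initialized",
    ("initialized", "remove"): "down",
}


def step(state: str, event: str) -> str:
    """Look up the next state; anything not in the table is rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, state: str = "down") -> str:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state


if __name__ == "__main__":
    print(run(["probe", "open", "close", "remove"]))
```

 The control logic shrinks to one lookup, and the table itself could
just as well be kept in XML or any other addressable format.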
~
 lbrtchx

