Re: [neroden@twcny.rr.com: Re: Re: more evil firmwares found]

To: Guillem Jover <guillem@debian.org>
Cc: debian-devel <debian-devel@lists.debian.org>
Subject: Re: [neroden@twcny.rr.com: Re: Re: more evil firmwares found]
From: Donovan Baarda <abo@obsidian.com.au>
Date: Wed, 12 May 2004 12:58:40 +1000
Message-id: <1084330720.1025.116.camel@schizo>
In-reply-to: <20040511024820.GB6039@zulo.hadrons.org>
References: <20040511024820.GB6039@zulo.hadrons.org>
On Tue, 2004-05-11 at 12:48, Guillem Jover wrote:
> Hi Donovan,
> 
> Same as the other mail, not CCed to you.

Thanks for slinging this to me... can't resist responding :-)

> ----- Forwarded message from Nathanael Nerode <neroden@twcny.rr.com> -----
[...]
> Donovan Baarda wrote:
> 
> > G'day all,
> > 
> > I'm not subscribed to the debian-devel list... feel free to repost to
> > the list if this bounces.
> > 
> > I've been browsing the kernel firmware threads in the archives, and felt
> > the need to throw in my 2c.
> > 
> > I have at various times worked on firmware for custom hardware. This
> > often involved large sequences of hand coded hex.
> Was the *entire* firmware ever hand coded hex?  (How often?  How long?)
[...]

From a hardware developer's point of view, the "entire firmware" would
include all the stuff burned into hardware ROM or FPGA. The "software"
provided to support that hardware/firmware could easily be a single hex
array and support code to load it. Everything else is burned into the
hardware, and all that is needed is initialisation values for registers.
This allows you to tweak timing values etc with a software upgrade.
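
To make the "single hex array and support code to load it" concrete,
here is a minimal sketch in C. The register values, the names init_regs
and load_regs, and the idea that the registers sit in one contiguous
byte-wide window are all assumptions made up for illustration; they do
not come from any real device.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical initialisation values, worked out by hand from a data
 * sheet. The numbers are invented for this sketch. */
static const uint8_t init_regs[] = {
    0x3f, 0x00, 0x81, 0x10,   /* e.g. timing and mode registers */
    0x00, 0x00, 0xff, 0x02,
};

#define INIT_REGS_COUNT (sizeof(init_regs) / sizeof(init_regs[0]))

/* Copy the table into the register window, one register at a time.
 * In a real driver `regs` would be a memory-mapped base address; here
 * it is any caller-supplied buffer so the sketch can be exercised
 * without hardware. */
static void load_regs(volatile uint8_t *regs, const uint8_t *values,
                      size_t count)
{
    for (size_t i = 0; i < count; i++)
        regs[i] = values[i];
}
```

Calling load_regs(base, init_regs, INIT_REGS_COUNT) is then the whole of
the "software"; tweaking a timing value means editing one hex digit in
the array.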

> > With things like gate arrays, the preferred form for modification often
> > is a large binary file. This file is "edited" using a large commercial
> > application, often provided by the gate-array manufacturer. Anyone who
> > has this application can "open" and modify this file. There is no source
> > or compiler as such; the binary file is directly modified and directly
> > loaded into the gate array. The format of this binary file is usually
> > documented in the gate array spec's, and in theory you could use any
> > tool to edit this file.
> I see.  OK, if it's fully documented, great, that's GPL-compliant!  For the
> DFSG, if you have no tool, that's 'contrib'; if you have the tool it can go
> in 'main'.

Often the gate array is well documented by the gate-array manufacturer,
but the particular piece of hardware will not include documentation that
tells you which gate array is used, or how it is wired up and used.
Examining the hardware may or may not be enough to identify the part
used. So even though the gate array is well documented, you don't know
that it is used, or how it is used, so you are still in the dark.

> If it's an embedded hex blob, however -- since the preferred form for
> modification is a binary file -- then you need to supply the binary file,
> not just the hex blob!
[...]

Yeah. I suspect that the complications of convincing the Open Source
world that a binary blob _is_ the "source code" have encouraged hardware
manufacturers to turn their binary blobs into C. This has not helped :-(
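
To show how little that conversion adds, here is a rough sketch of a
blob-to-C converter. The function name blob_to_c and the output layout
are my own invention for illustration, not any manufacturer's tool.

```c
#include <stddef.h>
#include <stdio.h>

/* Format `len` bytes of `blob` into `out` as the text of a C array
 * called `name`, eight bytes per line. Returns 0 on success, or -1 if
 * the output buffer is too small or formatting fails. */
static int blob_to_c(char *out, size_t outsz, const char *name,
                     const unsigned char *blob, size_t len)
{
    size_t used;
    int n = snprintf(out, outsz, "static const unsigned char %s[] = {", name);

    if (n < 0 || (size_t)n >= outsz)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        /* start a new indented line before every eighth byte */
        n = snprintf(out + used, outsz - used, "%s0x%02x,",
                     (i % 8 == 0) ? "\n    " : " ", (unsigned)blob[i]);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, outsz - used, "\n};\n");
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return 0;
}
```

The resulting C array carries exactly the same bytes as the blob and
nothing more, so it is no closer to a preferred form for modification
than the binary file was.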

> > Without details on the hardware, it would be nearly impossible to
> > determine if the binary file you have is or isn't the preferred form for
> > modification. You would have to take the distributors word on that.
> A little documentation would help.  Or indeed, the copyright holder's word,
> rather than hypotheses and implications.

Documentation would definitely help, but I suspect that documentation
will be harder to squeeze out of manufacturers than source. It assumes
they actually have some documentation in a usable form, and not just in
some engineer's head and/or scribbled notes. Decent engineers working for
quality companies produce good internal documentation, but I don't know
how many budget hardware manufacturers fall into that category.
Companies that do produce decent internal documentation will consider
the documentation to be more important IP than the source code.

> Still when I see something in the format "static u32 text[] = {...}; static
> u32 rodata[] = {...}; ...", (as in the tg3 firmware) it looks pretty much
> like a assembled executable file, and I have trouble believing that it
> isn't.
[...]

It is really difficult to tell. Remember that many bits of hardware
don't even have processors. Many are effectively just hardwired state
machines with arrays of registers that modify their behaviour. Even if
they do have processors, they often have code burned in so they act like
hardwired state machines. Even relatively complex bits of hardware are
implemented this way (the example with the large binary blob was a
hardware Japanese character recognition engine implemented using a
gate-array state machine).

Many other bits of hardware are assembled using off-the-shelf chipsets
that each have their own register sets and hardwired behaviour. There is
no "assembler" for this, just the data-sheets for the chipsets
themselves that help you figure out the hex array you need to load.

I have never seen assembled executable code converted into a hex array.
Perhaps that is because I've never tried to pass such code off as "Open
Source", but I've still never seen it done before.

> > Firmware walks the grey line between hardware and software.
> Yes.  If it isn't burnt into ROM, however, it's software.  :-)

Actually, I would argue that they are all software. Even the hardware
design is "software". However, the hardware and "burnt" firmware isn't
distributed by Debian, so it's not Debian's problem.

> >> > Hasn't anyone considered that maybe the binary blob *is* the source
> >> > code?
> >> 
> >> Yes, and we've pretty much rejected it. While they might actually use
> >> ASM, I've yet to run into someone who regularly codes (and modifies
> >> their code) in machine language.
> > 
> > In firmware for custom hardware I've found it very common to have long
> > runs of hex data hand coded in C arrays. These arrays are the preferred
> > form for modification, and no scripts are used to generate them.
> Hooray!  A fact!
> 
> Are they ever the *entire* firmware?

See above for my definition of the "entire firmware". However, it would
be fairly common for chipset and state-machine based designs to have a
single hand coded hex array and code to load it as the only "software"
provided with the hardware.

> > A typical example would be hardware with large numbers of configuration
> > registers (up to 256 registers would be fairly common, but I would not
> > be surprised if there was hardware out there with many K's worth of
> > registers). Each register needs to be loaded with specific values that
> > are evaluated by hand from the hardware's spec's. Evaluating many
> > hundreds of hex values and typing them into an array is painful, but it
> > only needs to be done once, so you do it by hand. During testing you
> > might find you need to tweak the values, which is again done by hand.
> 
> So here you would
> *specifying the width of the registers
> *make an array of items of that width
> *loading it in some clear way, *not* written in hex, into the registers

Yes. The width of the registers would typically be 8-64 bits. In some
cases, the width of the registers would vary, so you would use an array
of the largest register size, and some values would include multiple
adjacent smaller registers and/or unused bytes. Sometimes there are
"holes" in the register address space where there are unused or
non-existent registers, so you fill the array with dummy values.
Sometimes chips have a primitive communications protocol where you need
to send them "packets" of data, so you hand code the packets in hex. 
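
As a rough illustration of the mixed-width case, here is a sketch of a
32-bit register image with packed 8-bit registers and a hole. The
layout, the values, the byte order (lowest address in the least
significant byte) and the names REG_HOLE, PACK8x4 and reg_image are all
assumptions for the example. In practice you would often type the
packed word directly in hex; the macro is only there to make the
packing visible.

```c
#include <stddef.h>
#include <stdint.h>

/* Dummy value written where the address space has no register. */
#define REG_HOLE 0x00000000u

/* Pack four adjacent 8-bit registers into one 32-bit slot, lowest
 * address in the least significant byte (an assumption). */
#define PACK8x4(r0, r1, r2, r3)                      \
    ((uint32_t)(r0) | ((uint32_t)(r1) << 8) |        \
     ((uint32_t)(r2) << 16) | ((uint32_t)(r3) << 24))

static const uint32_t reg_image[] = {
    0x00010203u,                      /* one full 32-bit register */
    PACK8x4(0x10, 0x20, 0x30, 0x40),  /* four adjacent 8-bit registers */
    REG_HOLE,                         /* nothing lives at this address */
    PACK8x4(0xaa, 0xbb, 0x00, 0x00),  /* two 8-bit registers, two unused bytes */
};

#define REG_IMAGE_COUNT (sizeof(reg_image) / sizeof(reg_image[0]))

/* Write the image word by word; `regs` stands in for the device's
 * memory-mapped register window. */
static void load_reg_image(volatile uint32_t *regs)
{
    for (size_t i = 0; i < REG_IMAGE_COUNT; i++)
        regs[i] = reg_image[i];
}
```

Someone reading only the array sees four opaque words; the structure is
in the data sheet and in the head of whoever typed it.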

BTW, after doing this stuff for a while, you start to think base 10
sucks... why couldn't the metric system have been based on a base 16
number system instead :-)

> Anyway, would you call such an array 'microcode'?

Possibly. If you have some kind of specialised processor (possibly
implemented using a gate-array state machine), you might have
"microcode" for it. A specialised processor for a single application
will have its own specialised instruction set. You wouldn't bother
writing an assembler or compiler for that instruction set, you would
just manually write the hex. Particularly something like microcode, you
don't expect to change it once it's working.... why build a whole
development tool-set for it?

> I have to assume that the 'microcode' in the r128 and radeon drivers is, in
> fact, microcode.

I would tend to agree, except I wouldn't bet on it. From ATI's point of
view it would make sense to build development assemblers and compilers
for their "high level" instruction set. However, it might be easier to
just hand code the microcode once for each chipset to implement that
"high level" instruction set. The overhead of building microcode
development tool-sets might not be worth it, particularly if each
chipset is so different the tool-set can't be re-used between them.

> > Often they don't know; all they have
> > is the same source you do, and the incomprehensible scribbled
> > calculations of the engineer who wrote it were either filed in the
> > trash, or are buried in a working file somewhere with configuration
> > management.
> Well, that's OK, of course.

Well... from a good design practices point of view it is totally evil
:-)

More 2 c's... :-(

-- 
Donovan Baarda <abo@obsidian.com.au>
Obsidian Consulting Group