... if you're feeling really adventurous look at depinit rather than systemd :)
i recall a few years back there was some company claiming they'd
managed a 1 second boot time (was it redhat or was it IBM?), and there
were also some embedded companies that managed under 350ms including
starting up a single-screen dedicated Qt app. this was on 720MHz TI
OMAPs so it's definitely doable.
one of the things i remember them doing was removing damn udev! i
recall (as recently as 2005) having a 90MHz Pentium-I system
which i used as a firewall. the depth of the bash shell scripts fired
up by udev was flat-out *insane*. the fork/process tree was in some
cases well over 30 deep. it was only because i had such a slow system
that i was able to catch udev "in the act" so to speak.
i think i ended up reporting a debian bug for the pty / tty creation
at the time, because there were 256 ptys, 256 ttys, and another mad
bunch of 256 ttys somewhere else. this resulted in 768 *separate*
instances of udev insanity at shell script depth 30 each. it was
therefore no wonder that that poor pentium I system, with little of
the process context-switching support that modern CPUs now
have, was flipping its nuts off and took over *twenty seconds* to
complete the udev setup phase.
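
(a rough back-of-envelope for the numbers above, as a python sketch. the one-process-per-script-level figure and charging the whole twenty seconds to the tty/pty events are both assumptions made for illustration, not measurements.)

```python
# back-of-envelope for the udev load described above
DEVICE_GROUPS = 3      # 256 ptys, 256 ttys, and the other bunch of 256 ttys
NODES_PER_GROUP = 256
SCRIPT_DEPTH = 30      # assumption: one process per level of script nesting
PHASE_SECONDS = 20     # "over twenty seconds", so treat this as a lower bound

# one udev event per device node
udev_events = DEVICE_GROUPS * NODES_PER_GROUP

# rough count of processes forked if every event walks the full script depth
processes = udev_events * SCRIPT_DEPTH

# average wall-clock cost per event, assuming the whole phase was tty/pty work
ms_per_event = PHASE_SECONDS * 1000 / udev_events

print(udev_events, processes, round(ms_per_event, 1))
```

so even at a couple of dozen milliseconds per event, sheer repetition is enough to add up to the whole delay.
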
now, the relevance here to ARM is that context-switching on ARM CPUs
is not as heavily hardware-optimised as it is in the high-end x86
world with "hyperthreading" and 4+ MB of 2nd-level cache pushing
the number of transistors close to and in some cases above a billion.
the recommendation was therefore, if you want to keep udev, to
recompile the kernel reducing the number of MAX_TTYs.
now, the reason i mentioned depinit was because when i explored this i
took a different approach. basically what i did was create two
*separate* udev initialisation trigger scripts, and created separate
parallel dependencies on each.
the first udev trigger script fired off the absolute minimum necessary
stuff: only 10 ptys, /dev/sd*, /dev/hd*, that sort of thing.
following on from that it was possible to make networking, disks and
so on depend on that.
the *second* udev trigger script was the "normal" one that you get
every day on the majority of linux distros. it fired eeeverything.
dependent on the completion of this script i therefore had everything
else. cups printer service. ssh server. etc. etc.
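
(the two-trigger arrangement can be sketched as a small dependency graph. this is python rather than actual depinit configuration, and the service names and durations are made up for illustration. it also assumes the two triggers really do run in parallel without slowing each other down, which a single slow CPU won't quite give you.)

```python
from graphlib import TopologicalSorter  # python 3.9+

# service -> set of services it waits for (hypothetical names)
deps = {
    "udev-minimal": set(),           # 10 ptys, /dev/sd*, /dev/hd* only
    "udev-full": set(),              # the "normal" trigger, fires everything
    "networking": {"udev-minimal"},
    "disks": {"udev-minimal"},
    "cups": {"udev-full"},
    "sshd": {"udev-full"},
}

# made-up durations in seconds, purely illustrative
duration = {
    "udev-minimal": 1, "udev-full": 20,
    "networking": 2, "disks": 1, "cups": 1, "sshd": 1,
}

def finish_times(deps, duration):
    """Earliest finish time of each service with unlimited parallelism."""
    finish = {}
    # static_order() yields every service after the ones it depends on
    for name in TopologicalSorter(deps).static_order():
        start = max((finish[d] for d in deps[name]), default=0)
        finish[name] = start + duration[name]
    return finish

for name, t in sorted(finish_times(deps, duration).items(), key=lambda kv: kv[1]):
    print(f"{t:>3}s  {name}")
```

with these invented numbers networking and disks are up within about 3 seconds, while cups and sshd wait for the slow full trigger. that is the whole point of the split: the slow trigger still runs, it just stops blocking the things you need first.
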
it worked like a charm and i had a boot time on a 1GHz pentium-III
laptop *including* X-Server startup at something like 15 seconds.
shutdown time (thanks to depinit) was something like 3 seconds, and
much of that was the actual hardware shutting down. depinit didn't
mess about there :)
if it really bothers you that udev's too slow, you _should_ be able
to replicate this with other parallel startup systems, but the advice
to find out *where* the main time is being spent, first, is very very
good!
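
(one low-tech way to do that is to log a timestamp at the end of each boot phase and rank the gaps. a minimal python sketch follows. the phase labels and timestamps are invented, and how you actually collect them on your system is left open.)

```python
def rank_phases(marks):
    """marks: list of (label, seconds_since_boot), in the order they occurred.

    Each label names the phase that *ended* at that timestamp; the first
    entry is only the starting point. Returns (label, duration) pairs,
    slowest phase first.
    """
    durations = [
        (label, t - prev_t)
        for (_, prev_t), (label, t) in zip(marks, marks[1:])
    ]
    return sorted(durations, key=lambda pair: pair[1], reverse=True)

# hypothetical timestamps, purely for illustration
marks = [
    ("init started", 0.0),
    ("udev done", 21.5),
    ("network up", 23.0),
    ("X ready", 27.0),
]

for label, secs in rank_phases(marks):
    print(f"{secs:5.1f}s  {label}")
```

whatever lands at the top of that list is the thing worth attacking first.
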
also wasn't there something recently about the 3.15 kernel having a
more parallel approach to hardware startup? although... you're a bit
buggered there because you'd need to patch together your own kernel...
l.