Re: armelfp: new architecture name for an armel variant

To: Debian ARM <debian-arm@lists.debian.org>
Cc: Matt Sealey <matt@genesi-usa.com>
Subject: Re: armelfp: new architecture name for an armel variant
From: Hector Oron <hector.oron@gmail.com>
Date: Fri, 9 Jul 2010 02:02:46 +0200
Message-id: <AANLkTik39gxqGFPGs5Rj72snxK-QFoc3iPI1q9msW49D@mail.gmail.com>
In-reply-to: <AANLkTimoAsV_lxMTplp3aQv9vDM5Gx-31tU7pqnI6k-e@mail.gmail.com>
References: <AANLkTimoAsV_lxMTplp3aQv9vDM5Gx-31tU7pqnI6k-e@mail.gmail.com>

I hereby quote some comments from Matt Sealey (Genesi-USA developer):

( Call for hardware: anyone who wants hardware to work on the port,
please contact me, preferably ASAP. Genesi-USA is kind enough
to provide it. )

--

Just a few comments on the IRC log:

The kernel on gitorious was kind of aborted. We are still working
through the very last of the bugs from moving to the May BSP. i2c
isn't working so we don't get a display. Not sure how we're fixing it
yet, but in the end the display driver needs to be rewritten from
scratch in our opinion to be DRM/KMS and suchlike and allow pluggable
encoders/connectors and all the fancy stuff DRM provides for us.

Hopefully Freescale can do this and bring the kernel to a real 2.6.31
technology level instead of the 2.6.28 forward port :D but of course
making a DRM-style driver requires some serious expert knowledge of
the way the IPU works. At the very least, getting from
"1280x720-16MR@60" to the IPU setting up clocks for that mode is
a scattered, horrible mess even if you don't move outside
of drivers/video/mxc/mxc_ipuv3_fb.c, because of the way the kernel
framebuffer subsystem is organized with var, fix, par and horrible
Amiga-era design decisions and VESA BIOS holdovers. You still need to
do it for a framebuffer on top of DRM encoders, connectors, mapped
framebuffers and so on, but at least the actual management of the chip
is sane, with this clunky layer on top, instead of it being so tightly
integrated as to be mind-numbingly complex.

Until then, mainlining the Efika support is doable; we just wouldn't
want to impose the current horrible display hacks (Pegatron's fault)
for the SII9022A on any mainline tree.

http://www.powerdeveloper.org/linux-2.6-imx.tar.lzma

This is the code as we've got it up to about 4:30pm this afternoon
(CST). You can try it and if you fix anything or have any suggestions
or patches please say so :)

We have an updated U-Boot too that we could ship, but we don't want to
publish it yet. Instead we are finishing a unification of the desktop
and netbook U-Boot source trees which, coming from different business
departments at Pegatron, seem to have diverged in the way they handle
things like configurations, SD cards and boot variables, and we'd like
them to work the same. This port won't take too long but might not be
ready for DebConf, as we have a lot to do internally in the meantime
and it obviously needs a lot of testing before we ship, as you can
really brick a board if you're not careful. I can send it to you guys
anyway if you like. We're also looking into moving to 2009.08 (since
Freescale abandoned the 2009.01 release our U-Boot is based on) but
may also just take 2010.06 or so and run straight for a real U-Boot
mainlining, so we can wash our hands of the opensource fw support
forever and concentrate on our proprietary (device tree based, no
iomux litter everywhere in linux, board specific changes to the BSP of
~100 lines or less) solution.

Anyway, this is the state of affairs right now. We're really excited
that Debian has taken this on and we hope we can support
you guys, but what we need is not just hard work but suggestions on
how we can help or what we should be doing better.

BTW I really advocate patching dpkg so that it does not rely on the
compiler tuple to identify the port. After all, who knows what compiler
you'd use: arm-none-linux-gnueabi- from CodeSourcery will compile
everything under the sun, whether the tuple has the explicit vendor
"none" or omits that part (arm-linux-gnueabi) as Ubuntu and Debian
ship it. BTW is the Linaro effort going to push CodeSourcery as the
standard compiler on any of these systems? We're really not too fond
of using gcc 4.5 (too unstable) or gcc 4.4.1 (without CS patches it
generates some weirdo code..). Konstantinos already did this and had
great results with the compiler. If it is not used as the "main"
distro compiler, then I really think it should be available as a sort
of "native" cross compiler (installed separately and accessible
through the full tuple?) so you don't HAVE to cross compile to get the
advantages of the advanced ARM support it has over gcc 4.4.1 or gcc
4.5, depending on where you are getting it from (2010q1 or unreleased
2010q3 resp).
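
As a rough illustration of the tuple point above, here is a minimal
Python sketch of comparing two toolchain tuples while ignoring the
optional vendor field. It is not how dpkg works; the names Triplet,
parse_triplet and same_port are hypothetical, and it assumes the only
shapes seen are cpu-vendor-kernel-abi and cpu-kernel-abi (a bare-metal
tuple such as cpu-vendor-abi would be misread).

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """The parts of a toolchain tuple that identify the port (hypothetical type)."""
    cpu: str
    kernel: str
    abi: str


def parse_triplet(triplet: str) -> Triplet:
    """Split a GNU-style tuple, discarding the optional vendor field.

    Assumption: input is either cpu-vendor-kernel-abi or cpu-kernel-abi.
    A trailing '-' (as in a tool prefix like 'arm-none-linux-gnueabi-')
    is tolerated.
    """
    parts = triplet.rstrip("-").split("-")
    if len(parts) == 4:
        cpu, _vendor, kernel, abi = parts
    elif len(parts) == 3:
        cpu, kernel, abi = parts
    else:
        raise ValueError(f"unsupported tuple: {triplet!r}")
    return Triplet(cpu, kernel, abi)


def same_port(a: str, b: str) -> bool:
    """True when two tuples differ at most in their vendor field."""
    return parse_triplet(a) == parse_triplet(b)
```

With this, same_port("arm-none-linux-gnueabi-", "arm-linux-gnueabi")
is true, while a tuple with a different ABI part does not match.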

--

I would also say you should set your sights on:

ARMv7 as a baseline processor model.

VFPv3-D16 as a baseline FPU model.

VFPv3-D32 (proper VFPv3) and the updated VFP in Cortex-A9 can be
supported through ld.so hwcaps where necessary, if there really is a
performance gain from the extra 16 registers or extra-special VFP
scheduling.
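
The selection idea behind this can be sketched in Python as follows.
This is only an illustration, not glibc's actual search algorithm: the
variant names "vfpv3-d32" and "base" and the function pick_variant are
hypothetical, and it assumes the kernel reports both "vfpv3" and
"vfpv3d16" on 16-register parts and only "vfpv3" on 32-register parts.

```python
# Hypothetical library variants, ordered from most to least specialised.
# Each entry: (name, features required, features that must be absent).
VARIANTS = [
    ("vfpv3-d32", {"vfpv3"}, {"vfpv3d16"}),  # optional build using all 32 registers
    ("base", {"vfpv3"}, set()),              # baseline build: ARMv7 + VFPv3-D16
]


def pick_variant(features):
    """Return the first variant whose requirements the CPU features satisfy.

    `features` is a set of feature names as reported by the kernel
    (assumed spelling: "vfpv3", "vfpv3d16", "neon", ...).
    """
    for name, required, forbidden in VARIANTS:
        if required <= features and not (forbidden & features):
            return name
    raise RuntimeError("CPU lacks the VFPv3 baseline")
```

A D16-only CPU falls through to "base", so the baseline binaries run
everywhere and the D32 build is purely an optional extra.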

That should cover iMX5x, Snapdragon, Dove, Tegra2, OMAP3 and any
Cortex-A9, which are the really important cases where you will want
to do heavy work (600MHz+). Anything less and you are crazy to want to
do FPU math on it anyway. VFPv3-D16 is pretty much dictated by
Dove/Tegra I think, but the performance difference - in my benchmarks
at least - is minimal unless you really have a 20-variable equation
to work on. Those are pretty rare in a desktop environment.

NEON should be handled by ld.so hwcaps or - better yet - things like
ffmpeg will automatically plug in NEON code based on hwcaps themselves,
without involving the linker, meaning a single binary will work for
everyone. I would not trust autovectorizing compilers: I have yet to
see any visible benefit from automatically unrolling a loop with NEON,
AltiVec or SSE. The real differences come from libraries like
ffmpeg and pixman, anything that does a lot of FPU work and is
used all the time (for instance rendering SVG icons or window
decorations) and can benefit from NEON optimization. Autovectorizing
cannot do the clever performance optimizations like using different
floating point approximations or linear algebra. Everything else can
stick to the normal FPU or manually use NEON through ld.so hwcaps.
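
The "single binary, pick the code path at run time" approach can be
sketched in Python like this. It is an illustration only, not how
ffmpeg or pixman implement it: cpu_features, select_scale and the two
scale_* routines are hypothetical names, the NEON routine is a
stand-in, and the sample text assumes the usual "Features" line format
of /proc/cpuinfo on ARM.

```python
def cpu_features(cpuinfo_text):
    """Return the flags from the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def scale_generic(samples, gain):
    """Portable implementation that works on any CPU."""
    return [s * gain for s in samples]


def scale_neon(samples, gain):
    """Stand-in for a hand-written NEON routine (same result in this sketch)."""
    return [s * gain for s in samples]


def select_scale(features):
    """Choose the implementation once, based on the detected features."""
    return scale_neon if "neon" in features else scale_generic


# Assumed sample of /proc/cpuinfo content, for illustration only.
SAMPLE = (
    "Processor\t: ARMv7 Processor rev 5 (v7l)\n"
    "Features\t: swp half thumb fastmult vfp edsp neon vfpv3\n"
)

scale = select_scale(cpu_features(SAMPLE))
```

On a real board the text would come from reading /proc/cpuinfo; the
caller only ever sees scale(), so the same package runs with or
without NEON.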

While we want it to work best on IMX515 for obvious reasons,
ARMv7+VFPv3 is a good enough start. The important bits are that
it is built with the assumption that an FPU is always present
and of a certain level of functionality, and the improved FP
argument-passing in the hardfp ABI. It is not about maximizing the
number of registers available on the FPU to the detriment of other
very popular ARMv7 processors.

Best Regards,
-- 
 Héctor Orón

"Our Sun unleashes tremendous flares expelling hot gas into the Solar
System, which one day will disconnect us."
