Re: Parallella project on Kickstarter

On Tue, Oct 16, 2012 at 9:31 AM, Jari Kirma <email@example.com> wrote:
> Although I'm fan of Parallela, just a small comment on GPUs vs. Parallela:
> although both are computing technologies exploiting parallelism, they are
> also quite different beasts. GPUs tend to be vectorized, often leaning
> towards VLIW architecture and have considerable amount of
> application-specific units and instructions.

yes. this is why i particularly like ICubeCorp's approach. they've
ported open64 to their architecture, which is a VLIW multi-core
general purpose processor with instructions that help accelerate both
video and 3D graphics. rather than have a separate GPU with its own
built-in CPU. or a video engine with its own built-in CPU.

i've dealt with things like Aspex Semiconductor's "ASP". parallel
processors like this: i tell you, they're absolute hell to program.
when i say "hell" i mean that you measure productivity in *days* per
line of assembly code. not lines of code per day - DAYS per line of
code. the benefit however is that for certain algorithms you get a
performance boost of 10:1 or even 100:1 over any other type of chip
(on a price-performance metric).

in the case of the parallella IC, the grid arrangement is something
that, with a hell of a lot of work, could be exploited to, say,
dedicate 16 of the 64 cores to performing 3D GPU SIMD (vector)
processing. another 16 could be dedicated to video processing, and so
on. but it would take an enormous amount of work to do that
hard-coded programming. where are the tools which aid and assist in
doing that, and take away much of the pain?
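
to make the partitioning idea concrete, here is a minimal python sketch.
it is purely illustrative: it assumes the 64 cores sit in an 8x8 grid
and carves that into four 4x4 blocks of 16 cores each. the role names
and the functions `partition` and `cores_for` are made up for this
example and are not part of any parallella toolchain. it only shows the
bookkeeping, not the hard part, which is writing the code that runs on
each block.

```python
# Hypothetical sketch: assign each core of an assumed 8x8 grid to a role,
# 16 cores (one 4x4 block) per role. Names are illustrative only.
GRID = 8    # assumed grid width and height (8 * 8 = 64 cores)
BLOCK = 4   # block width and height (4 * 4 = 16 cores)
ROLES = ["3d_simd", "video", "unassigned_a", "unassigned_b"]


def partition(grid=GRID, block=BLOCK, roles=ROLES):
    """Map every (row, col) core coordinate to the role of its block."""
    blocks_per_row = grid // block
    assignment = {}
    for row in range(grid):
        for col in range(grid):
            # Blocks are numbered left to right, top to bottom.
            index = (row // block) * blocks_per_row + (col // block)
            assignment[(row, col)] = roles[index]
    return assignment


def cores_for(assignment, role):
    """Return the sorted list of core coordinates assigned to a role."""
    return sorted(core for core, r in assignment.items() if r == role)


if __name__ == "__main__":
    layout = partition()
    for role in ROLES:
        print(role, len(cores_for(layout, role)), "cores")
```

keeping each role in a contiguous block is a design choice made for the
sketch, on the assumption that neighbouring cores in a grid are cheaper
to communicate between than distant ones.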

by contrast, with ICubeCorp's processor, because they have a compiler
expert on board with over 15 years of experience with SGI *before* he
started work for ICubeCorp, you can just take the standard free
software MesaGL library for example, and do "CC=/usr/bin/mvpcc
./configure" and err... you're done.

when you have a decent compiler that takes care of all the hard work,
the advantage of the VLIW approach is that the clock speed is *much*
lower, yet it achieves performance that rivals systems with 3 or 4
times the clock rate (and therefore roughly 10 to 15 times the power).

tensilica is another company that has a VLIW compiler associated with
their RISC core. both are unfortunately proprietary: the performance
however is unrivaled, and they've sold over 1.5 *billion* RISC cores.
whereas the work that ICubeCorp has done on porting open64 to their
architecture is, of course, entirely GPL'd.

the bottom line is that whilst i'm delighted to see what parallella are
doing, they're still quite a long way behind in actually getting
anything useful from a commercial perspective out the door. their
architecture is very similar to what ziilabs have, *BUT*, remember:
ziilabs' grid-based parallel processor is on-board in combination with
an ARM processor (ZMS08 etc.)

what i'm trying to say is that whilst parallella's approach is
laudable, they've only done about 25% of the work needed (the hardware
side). to be a commercial success, they need the software side as
well. not just to release the tools or the documentation, which is of
course itself a laudable step, but to *prove* that, commercially, they
can handle 1080p30 video, and that they can do over 70 million
triangles per second or whatever it is.