The Parallella project will make parallel computing accessible to everyone.
4,965 backers pledged $898,921 to help bring this project to life.

Demos, white papers, and example projects

There have been a lot of requests for demos and examples (and rightfully so!). Here's a taste, with more to be published in the coming week. Clearly, one of our greatest challenges will be convincing everyone that the Parallella board and Epiphany chips are not "vaporware". Yesterday we published the Epiphany architecture and SDK manuals. Today we are publishing white papers and source code for some important signal processing and mathematical kernels running on existing Epiphany chips and boards.

Demo #1: Using a Scalable Parallel 2D FFT for Image Enhancement (see the sketch after this list)

Demo #2: Scalable Parallel Multiplication of Big Matrices

Demo #3: Approaching Peak Theoretical Performance with Standard C
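For a concrete feel of Demo #1, here is a minimal serial sketch of the row-column decomposition that a parallel 2D FFT builds on. To be clear, this is not the published demo code: it is plain C, and the point is that the row pass and the column pass are each embarrassingly parallel, so each core can take a slice of rows and then a slice of columns.

    /* Row-column 2D FFT sketch (NOT the published demo code).
       Serial plain C; in a parallel version each core would run fft1d
       over its own slice of rows, synchronize, then over its own slice
       of columns. N must be a power of two. */
    #include <complex.h>
    #include <stdio.h>

    #define N 8

    /* In-place iterative radix-2 FFT of a contiguous length-n vector. */
    static void fft1d(double complex *a, int n)
    {
        const double PI = 3.14159265358979323846;
        for (int i = 1, j = 0; i < n; i++) {      /* bit-reversal permutation */
            int bit = n >> 1;
            for (; j & bit; bit >>= 1)
                j ^= bit;
            j ^= bit;
            if (i < j) {
                double complex t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        for (int len = 2; len <= n; len <<= 1) {  /* butterfly passes */
            double complex w = cexp(-2.0 * I * PI / len);
            for (int i = 0; i < n; i += len) {
                double complex wk = 1.0;
                for (int k = 0; k < len / 2; k++) {
                    double complex u = a[i + k];
                    double complex v = a[i + k + len / 2] * wk;
                    a[i + k]           = u + v;
                    a[i + k + len / 2] = u - v;
                    wk *= w;
                }
            }
        }
    }

    /* 2D FFT: every row is independent, then every column is
       independent, so each pass splits naturally across cores. */
    static void fft2d(double complex img[N][N])
    {
        for (int r = 0; r < N; r++)               /* row pass */
            fft1d(img[r], N);
        for (int c = 0; c < N; c++) {             /* column pass */
            double complex col[N];
            for (int r = 0; r < N; r++) col[r] = img[r][c];
            fft1d(col, N);
            for (int r = 0; r < N; r++) img[r][c] = col[r];
        }
    }

    int main(void)
    {
        double complex img[N][N] = {{0}};
        img[0][0] = 1.0;                          /* unit impulse -> flat spectrum */
        fft2d(img);
        printf("F[0][0] = %g, F[%d][%d] = %g\n",
               creal(img[0][0]), N - 1, N - 1, creal(img[N - 1][N - 1]));
        return 0;
    }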

The main messages we want to convey with these demos are:

1.) How well the architecture scales to large array sizes. We know a lot of you are interested in much more powerful chip and board combinations for the future.

2.) How few restrictions there are for expert programmers. No magic library calls or proprietary drivers required (see the sketch after this list).

3.) How it should be possible to build abstraction layers above the fabric exposed in these demos to make parallel programming practical even for novice users.
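As an illustration of point 2.), the per-core inner kernel of a blocked matrix multiply can be ordinary C. The sketch below is hypothetical (it is not the Demo #2 source, and tile_mult and BLK are made-up names): each core would run something like it on the block of the result it owns, while the surrounding code streams A and B blocks through local memory.

    /* Hypothetical per-core kernel for a blocked matrix multiply:
       plain C, no magic library calls. `tile_mult` and BLK are
       illustrative names, not the Demo #2 source or the Epiphany SDK. */
    #include <stdio.h>

    #define BLK 4    /* block edge, sized to fit a core's local memory */

    /* C_blk += A_blk * B_blk. The k-middle loop order keeps the inner
       loop streaming contiguously through a row of b. */
    static void tile_mult(const float a[BLK][BLK], const float b[BLK][BLK],
                          float c[BLK][BLK])
    {
        for (int i = 0; i < BLK; i++)
            for (int k = 0; k < BLK; k++) {
                float aik = a[i][k];
                for (int j = 0; j < BLK; j++)
                    c[i][j] += aik * b[k][j];
            }
    }

    int main(void)
    {
        float a[BLK][BLK], b[BLK][BLK], c[BLK][BLK] = {{0}};
        for (int i = 0; i < BLK; i++)
            for (int j = 0; j < BLK; j++) {
                a[i][j] = (float)(i + j);
                b[i][j] = (i == j) ? 1.0f : 0.0f;  /* identity: c must equal a */
            }
        tile_mult(a, b, c);
        printf("c[1][2] = %g (expect %g)\n", c[1][2], a[1][2]);
        return 0;
    }

On the actual chip the interesting part is the outer choreography: which core holds which block, and how blocks circulate between neighboring cores while each core repeats a kernel like this.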

We hope that these examples will excite those developers who are interested in looking behind the curtain. Rest assured that we are working on flashier demos that should appeal to a wider audience.

Best wishes,

Adapteva Team


    1. Liad Weinberger on October 9, 2012

      With the new reward level, I really wish I could pledge 10K but alas, I can't. I hope some people could though.

    2. RadovanS on October 9, 2012

      Would be interested to explore a systolic realization of matrix multiply, followed by convolution, and even the modified Faddeeva algorithm on this chip. It all depends on the inter-cell synchronization mechanism and the control of data pumping between cells.
      Very good design, congratulations.
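      (For illustration: the wavefront schedule such a systolic array follows can be modeled in a few lines of plain C. This toy is purely illustrative; the skewed step indexing stands in for cells forwarding operands east and south, and none of it uses Epiphany inter-core primitives.)

      /* Toy model of a systolic matrix-multiply schedule: cell (i,j)
         performs its k-th multiply-accumulate at step i + j + k, as if
         A values flowed in from the west and B values from the north. */
      #include <stdio.h>
      #define N 3

      int main(void)
      {
          double A[N][N] = {{1,2,3},{4,5,6},{7,8,9}};
          double B[N][N] = {{9,8,7},{6,5,4},{3,2,1}};
          double C[N][N] = {{0}};
          for (int step = 0; step <= 3 * (N - 1); step++)
              for (int i = 0; i < N; i++)
                  for (int j = 0; j < N; j++) {
                      int k = step - i - j;   /* operand pair arriving now */
                      if (k >= 0 && k < N)
                          C[i][j] += A[i][k] * B[k][j];  /* the cell's MAC */
                  }
          for (int i = 0; i < N; i++)
              printf("%4g %4g %4g\n", C[i][0], C[i][1], C[i][2]);
          return 0;
      }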

    3. Richard Henderson on October 9, 2012

      I'm interested in this (indeed, I've pledged), but not for mathematical work per se. My interest is in vector processing and filtering of data streams (think CEP), so please keep an eye on data throughput, pipeline composition, and shortcuts for low latency and general I/O scaling. I would suggest this is more commercially interesting than teraflops on their own, and it is currently rather painful on FPGAs and GPUs. Good luck with the fund-raising!

    4. Beau Webber on October 9, 2012

      What a fascinating project.
      I program in the array-processing language APL, and I have just ported the runtime library of the APL-to-C compiler aplc to the Altera NiosII soft processors and the XMOS chips. I would love to see it working on your hardware, with decent floating-point units. I don't claim the code will ever be as efficient as hand-written code (why not? You can do it better? Teach the compiler how!), but if, like me, you are doing a different thing each day with curve fitting, peak solving, and digital filtering, and your main attention is on the physics, it is the only way to go.
      So I wish you guys the best.
      cheers, Beau

      Here are some simple matrix operations: first the source code, then the output running on the NiosII - a few minutes' work from start. Other algorithms, such as generating Gaussian noise and then binning it to get a normal curve, have also just worked.


      A ← ⍳ 5
      ' '
      B ← ⍳ 9
      ' '
      'C: outer product:'
      C ← A ∘.× B
      ' '
      'T: transpose:'
      T ← ⍉ C
      ' '
      'O: inner product (matrix multiply):'
      O ← C +.× T
      ' '
      'S: solve matrix:'
      S ← C ⌹ A

      On the NiosII "Media" floating-point processor on a Terasic DE1:

      1 2 3 4 5

      1 2 3 4 5 6 7 8 9

      C: outer product:
      1 2 3 4 5 6 7 8 9
      2 4 6 8 10 12 14 16 18
      3 6 9 12 15 18 21 24 27
      4 8 12 16 20 24 28 32 36
      5 10 15 20 25 30 35 40 45

      T: transpose:
      1 2 3 4 5
      2 4 6 8 10
      3 6 9 12 15
      4 8 12 16 20
      5 10 15 20 25
      6 12 18 24 30
      7 14 21 28 35
      8 16 24 32 40
      9 18 27 36 45

      O: inner product (matrix multiply):
      285 570 855 1140 1425
      570 1140 1710 2280 2850
      855 1710 2565 3420 4275
      1140 2280 3420 4560 5700
      1425 2850 4275 5700 7125

      S: solve matrix:
      1 2 3 4 5 6 7 8 9

    5. Adapteva on October 8, 2012

      @Mike Certainly matrix inversion and eigenvalue calculations are possible; the challenge becomes writing algorithms that scale and run fast. This is a problem that has been seen time and time again on all parallel architectures to date (and a problem that we want to attack together with the Parallella project).

      @Mark Thanks for the support and suggestion! Maybe we can give each backer access to "one core". :-) We'll think about it.

    6. mark foley on October 8, 2012

      Hi guys,

      This is a really important project - I wish you the best! I'm one of those people who think this will benefit humanity but I have no actual purchase need for a board right now. Hopefully more people like me will be able to throw in just a bit to get you over the line. I've been thinking about what kind of reward might entice more people like myself to contribute (perhaps access to a shared sandbox board online or something like that might be good). Good luck Adapteva!

      PS. I'm a C# person - is there any way this platform could be made to work with Mono?

    7. Mike Theisen on October 8, 2012

      I'm kinda new to this, but I have a quick question: I have a lot of code that requires matrix inversion and eigenvalue calculations. Is this possible in addition to parallel multiplication?