Slide 1

Slide 1 text

Get Ready for Parallel Programming Featuring Ancient City Ruby St. Augustine, FL, USA April 6-8, 2016 RayHightower.com

Slide 2

Slide 2 text

RJ-45 Power μUSB μHDMI μSD

Slide 3

Slide 3 text

RISC ARM + FPGA 18 cores: 2 ARM + 16 RISC

Slide 4

Slide 4 text

RayHightower.com

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

Why?

Slide 9

Slide 9 text

What can we do now that we could not do before?
How do we show the difference visually?

Slide 10

Slide 10 text

Parallel: So what?
What can we do in parallel that we cannot do serially?
How do we show the difference visually?

Slide 11

Slide 11 text

Moore’s Law: 2x every 18 months

Slide 12

Slide 12 text

Moore’s Law: 2x every 18 months
Image labels: 1993, 2013
http://www.washingtonpost.com/blogs/innovations/wp/2015/04/14/10-images-that-explain-the-incredible-power-of-moores-law/
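As a rough worked example of the doubling rule, assuming the 1993 and 2013 labels on this slide mark the two ends of the comparison (an assumption; the slide does not spell out the arithmetic):

```latex
\[
\text{growth}(t) = 2^{\,t / 18\ \text{months}}, \qquad
t = 2013 - 1993 = 20\ \text{years} = 240\ \text{months}
\]
\[
2^{240/18} = 2^{13.33\ldots} \approx 1.0 \times 10^{4}
\]
```

Under that assumption, twenty years of doubling every 18 months comes to roughly a ten-thousand-fold increase.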

Slide 13

Slide 13 text

Parallelism

Slide 14

Slide 14 text

If one ox could not do the job, they did not try to grow a bigger ox, but used two oxen. When we need greater computer power, the answer is not to get a bigger computer, but to build systems of computers and operate them in parallel.
-Grace Hopper

Slide 15

Slide 15 text

Concurrency is not Parallelism.
-Rob Pike, Go
https://www.youtube.com/watch?v=cN_DpYBzKso&list=PLOnWKC1gI_OPU8SDIBnCLHsgzNLSbnPJQ&index=3

Slide 16

Slide 16 text

Concurrency: At least two threads are making progress.
vs.
Parallelism: At least two threads are executing simultaneously.
Source: Oracle Multithreaded Programming Guide, http://docs.oracle.com/cd/E19455-01/806-5257/6je9h032b/index.html

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Watts & Dollars

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

17.8 million watts
$17.8 million per year

Slide 21

Slide 21 text

5 watts for Parallella?

Slide 22

Slide 22 text

http://rayhightower.com/blog/2014/09/09/solar-powered-parallella/

Slide 23

Slide 23 text

Solar!
5 volts × 1 amp = 5 watts
http://rayhightower.com/blog/2014/09/09/solar-powered-parallella/
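The figures on the power slides can be checked with a little arithmetic. The sketch below assumes continuous operation (8,760 hours per year) and derives the electricity rate implied by the 17.8 million watt and $17.8 million figures. The function names are illustrative, and the rate is inferred from those two numbers rather than stated in the talk.

```c
#include <assert.h>

#define HOURS_PER_YEAR 8760.0 /* 24 h x 365 days; assumes the machine never powers down */

/* Electrical power in watts from volts and amps (P = V * I). */
double power_watts(double volts, double amps)
{
    return volts * amps;
}

/* Energy used in one year of continuous operation, in kilowatt-hours. */
double annual_kwh(double watts)
{
    assert(watts >= 0.0);
    return watts * HOURS_PER_YEAR / 1000.0;
}

/* Electricity rate in dollars per kWh implied by a yearly bill for a given load. */
double implied_rate(double dollars_per_year, double watts)
{
    assert(watts > 0.0);
    return dollars_per_year / annual_kwh(watts);
}
```

With the slide's numbers, implied_rate(17.8e6, 17.8e6) works out to roughly $0.11 per kWh, or about $1 per watt per year. Under the same assumptions, a 5-watt board (power_watts(5.0, 1.0)) would cost on the order of $5 per year to run.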

Slide 24

Slide 24 text

RISC: Reduced Instruction Set Computer

Slide 25

Slide 25 text

ARM: Advanced (or Acorn) RISC Machine

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Find all primes up to 16,000,000. Serial on Parallella.

Slide 28

Slide 28 text

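#include <math.h> /* sqrt(); the other three header names are missing from this transcript */
#define DEFAULT_MAX_TESTS 16000000
/* Returns 1 if number is prime, 0 otherwise. Only odd divisors are tried,
   so the result is correct only for odd numbers >= 3. */
static inline int isprime(unsigned long number)
{
    unsigned long i;
    unsigned long s = sqrt(number); /* no divisor above the square root needs testing */
    for (i = 3; i <= s; i += 2)
    {
        if (number % i == 0)
            return 0;
    }
    return 1;
}
/* Copyright (c) Adapteva, contributed by M. Thompson with modifications by T. Malthouse. */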

Slide 29

Slide 29 text

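/* Returns 1 if number is prime, 0 otherwise. Only odd divisors are tried,
   so the result is correct only for odd numbers >= 3. */
static inline int isprime(unsigned long number)
{
    unsigned long i;
    unsigned long s = sqrt(number); /* no divisor above the square root needs testing */
    for (i = 3; i <= s; i += 2)
    {
        if (number % i == 0)
            return 0;
    }
    return 1;
}
/* Copyright (c) Adapteva, contributed by M. Thompson with modifications by T. Malthouse. */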
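For context, here is a minimal sketch of how a serial driver might use a test like `isprime` to count primes up to a limit. The slides do not show the driver, so `count_primes`, `is_odd_prime`, and the loop structure are assumptions. The sketch also compares `i * i` against the candidate instead of calling `sqrt`, which keeps it free of the math library.

```c
#include <assert.h>

/* Trial division by odd numbers only; valid for odd candidates >= 3. */
int is_odd_prime(unsigned long number)
{
    unsigned long i;

    assert(number >= 3 && number % 2 == 1);
    for (i = 3; i * i <= number; i += 2) {
        if (number % i == 0)
            return 0;
    }
    return 1;
}

/* Count the primes in [2, limit], testing only odd candidates. */
unsigned long count_primes(unsigned long limit)
{
    unsigned long n, count;

    if (limit < 2)
        return 0;
    count = 1; /* 2 is the only even prime */
    for (n = 3; n <= limit; n += 2)
        count += is_odd_prime(n);
    return count;
}
```

Calling count_primes(16000000) would correspond to the workload named on the slides.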

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

237.1 sec

Slide 32

Slide 32 text

Find all primes up to 16,000,000. Serial on Mac OS X.

Slide 33

Slide 33 text

Same serial code, written in C. Build it on OS X.

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

14.4 sec

Slide 36

Slide 36 text

Find all primes up to 16,000,000. Parallel on Parallella.

Slide 37

Slide 37 text

#include <e-hal.h> /* Epiphany host library; declares e_platform_t and e_epiphany_t */
// Default max number of prime tests per core
// Used if a limit is not provided in argv[1]
#define DEFAULT_MAX_TESTS 500000
int main(int argc, char *argv[])
{
    unsigned row, col, coreid, i, j;
    e_platform_t platform;
    e_epiphany_t dev;
/* Copyright (c) Adapteva, contributed by M. Thompson with modifications by T. Malthouse. */

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

18.6 sec

Slide 40

Slide 40 text

Summary: Finding Primes
Run                   Time (sec)   Hardware
Parallel Parallella         18.6   $150.00 Parallella
Serial Mac                  14.4   $2,000.00 Apple MacBook Pro
Serial Parallella          237.1   $150.00 Parallella
(Bar chart; axis runs from 0 to 250 seconds.)
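The timings in the summary can be turned into a speedup and a parallel efficiency figure. The sketch below uses the standard definitions (speedup = serial time / parallel time, efficiency = speedup / number of cores). The function names are illustrative and not part of the talk.

```c
#include <assert.h>

/* Speedup of a parallel run relative to a serial run on the same hardware. */
double speedup(double serial_sec, double parallel_sec)
{
    assert(parallel_sec > 0.0);
    return serial_sec / parallel_sec;
}

/* Fraction of ideal linear scaling achieved across `cores` cores. */
double efficiency(double serial_sec, double parallel_sec, unsigned cores)
{
    assert(cores > 0);
    return speedup(serial_sec, parallel_sec) / cores;
}
```

With the Parallella timings, speedup(237.1, 18.6) is about 12.7 and efficiency(237.1, 18.6, 16) is about 0.80, assuming all 16 Epiphany cores took part in the parallel run.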

Slide 41

Slide 41 text

Embarrassingly parallel problem.
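Prime testing is embarrassingly parallel because each candidate can be checked without reference to any other. Below is a minimal sketch of one way to split the work, simulated in plain C on a single CPU: each of `ncores` workers takes every `ncores`-th odd candidate, and the host adds up the per-core counts. The interleaved split and all function names are assumptions for illustration; the Adapteva code runs on the Epiphany cores and may partition the range differently.

```c
#include <assert.h>

/* Trial division by odd numbers only; valid for odd candidates >= 3. */
int is_odd_prime(unsigned long number)
{
    unsigned long i;

    for (i = 3; i * i <= number; i += 2) {
        if (number % i == 0)
            return 0;
    }
    return 1;
}

/* Work for one core: test `tests` odd candidates, starting at the core's
   own offset and striding past the candidates owned by the other cores. */
unsigned long primes_on_core(unsigned core, unsigned ncores, unsigned long tests)
{
    unsigned long k, count = 0;

    assert(ncores > 0 && core < ncores);
    for (k = 0; k < tests; k++) {
        unsigned long candidate = 3 + 2 * (core + k * ncores);
        count += is_odd_prime(candidate);
    }
    return count;
}

/* Host side: no core needs another core's result, so partial counts just add. */
unsigned long primes_all_cores(unsigned ncores, unsigned long tests)
{
    unsigned long total = 1; /* count 2, the only even prime */
    unsigned core;

    for (core = 0; core < ncores; core++)
        total += primes_on_core(core, ncores, tests);
    return total;
}
```

With 16 cores and the 500,000 tests per core given as DEFAULT_MAX_TESTS on the parallel code slide, this scheme would cover 8,000,000 odd candidates, which reaches roughly 16,000,000 and so matches the problem size stated on the slides.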

Slide 42

Slide 42 text

Mandelbrot Set

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

Why?

Slide 45

Slide 45 text

Weather Prediction
Grid spacing influences accuracy.
https://www.e-education.psu.edu/worldofweather/node/2029

Slide 46

Slide 46 text

Finite Element Analysis
Will it break? wear out? work?

Slide 47

Slide 47 text

Finite Element Analysis http://www.ce.ncsu.edu/news/article/21550/making-bridges-more-robust-to-earthquakes/

Slide 48

Slide 48 text

Finite Element Analysis http://www.ce.ncsu.edu/news/article/21550/making-bridges-more-robust-to-earthquakes/

Slide 49

Slide 49 text

Free Body Diagram
F_applied, F_friction, F_gravity, F_normal

Slide 50

Slide 50 text

Where can Ruby (or Python, Node.js, Erlang, etc.) fit?

Slide 51

Slide 51 text

Ruby + Sinatra + WebSockets

Slide 52

Slide 52 text

Original example by The Hybrid Group and Engine Yard. Modified by WisdomGroup.

Slide 53

Slide 53 text

FPGA: Field Programmable Gate Array

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

A B | Q
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 0
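The truth table on this slide (Q is 0 only when A and B are both 1) is that of a NAND gate. FPGAs typically implement such logic with small lookup tables (LUTs) whose contents are set by configuration. A minimal C model of a 2-input LUT, with hypothetical names, shows the idea: the same lookup code becomes a different gate when the 4-bit table changes.

```c
#include <assert.h>

/* 4-bit LUT contents; bit (A*2 + B) holds the output Q for inputs A, B. */
#define LUT_NAND 0x7u /* 0111b: Q = 1 for AB = 00, 01, 10; Q = 0 for AB = 11 */
#define LUT_AND  0x8u /* 1000b: Q = 1 only for AB = 11 */

/* Evaluate a 2-input lookup table for single-bit inputs a and b. */
unsigned lut2(unsigned table, unsigned a, unsigned b)
{
    assert(a <= 1 && b <= 1);
    return (table >> (a * 2 + b)) & 1u;
}
```

Swapping LUT_NAND for LUT_AND changes the gate without touching lut2, which is loosely analogous to reprogramming an FPGA rather than rewiring it.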

Slide 56

Slide 56 text

WebPACK Edition http://www.xilinx.com/products/design-tools/vivado.html

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

FPGA Uses…
•Finance: High-freq trading
•Routers
•Health care: MRIs, CAT scanners
•Cell phone towers
•Anything with lots of connections
•HDMI interface on Parallella.

Slide 61

Slide 61 text

Embedded Systems

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

https://www.parallella.org/2015/06/01/the-open-camera-project-1000-bounty-for-open-firmwaredrivers-for-raspberry-pi-camera-module/

Slide 65

Slide 65 text

Parallella vs GPU

Slide 66

Slide 66 text

384 cores MacBook Pro Video

Slide 67

Slide 67 text

Specialized vs. General

Slide 68

Slide 68 text

Kickstarter: Jan 2016 Pine 64 4 cores 64-bit $15.00

Slide 69

Slide 69 text

Pine 64 4 cores 64-bit 4K video Starts at $15.00 New!

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

Pine64 vs Parallella
          Pine64            Parallella
Video     4K video          HDMI
Cores     4 cores           18 cores (2 + 16)
RAM       2GB RAM           1GB RAM
Power     2.5 watts         5 watts
Ports     Full-size ports   Micro-ports
FPGA      no FPGA           FPGA
Price     $15 - $29         $99 - $150

Slide 72

Slide 72 text

Pine64 vs Parallella
          Pine64            Parallella
Video     4K video          HDMI
Cores     4 cores           18 cores (2 + 16)
RAM       2GB RAM           1GB RAM
Power     2.5 watts         5 watts
Ports     Full-size ports   Micro-ports
Size      Larger board      Smaller board
Price     $15 - $29         $99 - $150
Winner? Depends on design goals.

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

Thanks! RayHightower.com