Slide 1

Slide 1 text

Yakatabune.swift
Meet high performance image filtering in Swift
@kntkymt

Slide 2

Slide 2 text

Index
• What is "image filtering"
• How to accelerate it on CPU
• In """pure""" Swift

Slide 3

Slide 3 text

Index
• What is "image filtering"
• How to accelerate it on CPU
• In """pure""" Swift
Note: I will NOT talk about CIFilter

Slide 4

Slide 4 text

Index
• What is "image filtering"
• How to accelerate it on CPU
• In """pure""" Swift

Slide 5

Slide 5 text

Image Filtering
• e.g. Gaussian Filter
• Used for
  • Edge / Feature detection
  • Noise Reduction

Slide 6

Slide 6 text

Box Filter
• a.k.a. Average Filter

Slide 7

Slide 7 text

Box Filter
let input: [[UInt8]] = …
var output: [[UInt8]] = …
for j in 0..
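
The loop on this slide is cut off after `for j in 0..`. As a rough sketch of what a naive box filter over `[[UInt8]]` could look like (this is not the speaker's code): the function name `naiveBoxFilter`, the `radius` parameter, and the clamp-to-edge border handling are all assumptions made for illustration.

```swift
/// Naive box (average) filter on a grayscale image stored as rows of UInt8.
/// `radius` is a hypothetical parameter: the window is (2 * radius + 1) pixels
/// on each side. Coordinates outside the image are clamped to the nearest edge
/// (an assumed border policy).
func naiveBoxFilter(_ input: [[UInt8]], radius: Int) -> [[UInt8]] {
    precondition(radius >= 0, "radius must not be negative")
    let height = input.count
    let width = input.first?.count ?? 0
    let side = 2 * radius + 1
    var output = [[UInt8]](repeating: [UInt8](repeating: 0, count: width), count: height)

    for j in 0..<height {
        for i in 0..<width {
            // Visit every pixel in the window: O(R^2) work per output pixel.
            var sum = 0
            for dy in -radius...radius {
                let y = min(max(j + dy, 0), height - 1)
                for dx in -radius...radius {
                    let x = min(max(i + dx, 0), width - 1)
                    sum += Int(input[y][x])
                }
            }
            // The mean of UInt8 values always fits back into UInt8.
            output[j][i] = UInt8(sum / (side * side))
        }
    }
    return output
}
```

For example, a 3 x 3 image whose only non-zero pixel is a 90 in the center averages to 10 at the center with `radius: 1`.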

Slide 22

Slide 22 text

Box Filter
• a.k.a. Average Filter

Slide 23

Slide 23 text

Index
• What is "image filtering"
• How to accelerate it on CPU
• In """pure""" Swift

Slide 24

Slide 24 text

Accelerate image filtering
• Algorithm
• Hardware / Programming Language

Slide 25

Slide 25 text

Accelerate image filtering
• Algorithm
  • Naive: O(R^2)
  • Separable Filtering: O(R)
  • Integral Image: O(1)

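To illustrate the step from O(R^2) to O(R) per pixel, here is a minimal sketch of separable box filtering: one horizontal pass followed by one vertical pass over the intermediate sums. The name `separableBoxFilter`, the `radius` parameter, and the clamp-to-edge border handling are assumptions, not taken from the slides.

```swift
/// Separable box filter: a horizontal pass, then a vertical pass.
/// Each output pixel costs O(R) additions instead of O(R^2).
/// `radius` and the clamp-to-edge border policy are assumptions.
func separableBoxFilter(_ input: [[UInt8]], radius: Int) -> [[UInt8]] {
    precondition(radius >= 0, "radius must not be negative")
    let height = input.count
    let width = input.first?.count ?? 0
    let side = 2 * radius + 1

    // Horizontal pass: window sum along each row, kept as Int to avoid overflow.
    var rowSums = [[Int]](repeating: [Int](repeating: 0, count: width), count: height)
    for y in 0..<height {
        for x in 0..<width {
            var sum = 0
            for d in -radius...radius {
                sum += Int(input[y][min(max(x + d, 0), width - 1)])
            }
            rowSums[y][x] = sum
        }
    }

    // Vertical pass: add up the row sums along each column, then normalize.
    var output = [[UInt8]](repeating: [UInt8](repeating: 0, count: width), count: height)
    for y in 0..<height {
        for x in 0..<width {
            var sum = 0
            for d in -radius...radius {
                sum += rowSums[min(max(y + d, 0), height - 1)][x]
            }
            output[y][x] = UInt8(sum / (side * side))
        }
    }
    return output
}
```

Because clamping is applied independently per axis, this sketch produces the same values as a direct 2D window sum with the same border policy.
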
Slide 26

Slide 26 text

Accelerate image filtering
• Algorithm
  • Naive: O(R^2)
  • Separable Filtering: O(R)
  • Integral Image: O(1)

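The O(1) entry refers to integral images (summed-area tables): after one pass that builds cumulative sums, any window sum needs only four lookups, regardless of R. A minimal sketch follows; the name `integralBoxFilter` and the `radius` parameter are hypothetical, and, as an assumed border policy, windows are cropped at the image edge and averaged over the pixels they actually cover.

```swift
/// Box filter via an integral image (summed-area table).
/// integral[y][x] holds the sum of all pixels in rows 0..<y and columns 0..<x,
/// so any rectangular window sum is four lookups: O(1) per output pixel.
/// Windows are cropped at the image border (an assumed policy).
func integralBoxFilter(_ input: [[UInt8]], radius: Int) -> [[UInt8]] {
    precondition(radius >= 0, "radius must not be negative")
    let height = input.count
    let width = input.first?.count ?? 0

    var integral = [[Int]](repeating: [Int](repeating: 0, count: width + 1), count: height + 1)
    for y in 0..<height {
        var rowSum = 0
        for x in 0..<width {
            rowSum += Int(input[y][x])
            integral[y + 1][x + 1] = integral[y][x + 1] + rowSum
        }
    }

    var output = [[UInt8]](repeating: [UInt8](repeating: 0, count: width), count: height)
    for y in 0..<height {
        let y0 = max(y - radius, 0)
        let y1 = min(y + radius + 1, height)
        for x in 0..<width {
            let x0 = max(x - radius, 0)
            let x1 = min(x + radius + 1, width)
            let sum = integral[y1][x1] - integral[y0][x1] - integral[y1][x0] + integral[y0][x0]
            output[y][x] = UInt8(sum / ((y1 - y0) * (x1 - x0)))
        }
    }
    return output
}
```

The trade-off is an extra table of wider integers the size of the image, which is why this approach pays off mainly for large R.
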
Slide 27

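Slide 27 text

Accelerate image filtering
• Algorithm
  • Naive: O(R^2)
  • Separable Filtering: O(R)
  • Integral Image: O(1)
[Table: time by kind (Naive, Separable), measured on a … x … grayscale image with R = …, on an M… Pro (… Core); numeric values are missing]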

Slide 28

Slide 28 text

Accelerate image filtering
• Hardware / Programming Language
  • SIMD

Slide 29

Slide 29 text

SIMD
• Single Instruction Multiple Data
• Instructions for … bits
  • X86_64: SSE, AVX, AVX…
  • ARM: NEON

Slide 30

Slide 30 text

Swift SIMD Vector Types

Slide 31

Slide 31 text

Swift SIMD Vector Types
• Inside the Standard Library (Swift 5 and later)

Slide 32

Slide 32 text

Swift SIMD Vector Types
let a0: UInt16 = 1
let a1: UInt16 = 2
…
let b0: UInt16 = 9
let b1: UInt16 = 10
…
a0 + b0
a1 + b1
…
a7 + b7

ldrh w0, …
ldrh w1, …
ldrh w2, …
ldrh w3, …
…
ldrh w15, …
add w0, w0, w1
add w2, w2, w3
…
add w14, w14, w15

Slide 33

Slide 33 text

Swift SIMD Vector Types
let a = SIMD8<UInt16>(1, 2, 3, …)
let b = SIMD8<UInt16>(9, 10, 11, …)
a &+ b

ldr q0 …
ldr q1 …
add.8h v0, v0, v1

let a0: UInt16 = 1
let a1: UInt16 = 2
…
let b0: UInt16 = 9
let b1: UInt16 = 10
…
a0 + b0
a1 + b1
…
a7 + b7

ldrh w0, …
ldrh w1, …
ldrh w2, …
ldrh w3, …
…
ldrh w15, …
add w0, w0, w1
add w2, w2, w3
…
add w14, w14, w15
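
A complete, runnable version of the vector addition above might look like the following. The elided lanes are filled in with consecutive values purely for illustration, and the constant names (`sum`, `wrapped`, `values`, `firstEight`) are hypothetical.

```swift
// Eight UInt16 lanes per vector. The concrete lane values continue the
// 1, 2, 3, … and 9, 10, 11, … patterns as an assumption.
let a = SIMD8<UInt16>(1, 2, 3, 4, 5, 6, 7, 8)
let b = SIMD8<UInt16>(9, 10, 11, 12, 13, 14, 15, 16)

// `&+` adds all eight lanes at once with wrapping (non-trapping) semantics.
let sum = a &+ b

// Wrapping means overflow silently rolls over instead of crashing.
let wrapped = SIMD8<UInt16>(repeating: .max) &+ SIMD8<UInt16>(repeating: 1)

// A vector can also be loaded from a slice with exactly eight elements.
let values: [UInt16] = Array(1...16)
let firstEight = SIMD8<UInt16>(values[0..<8])

print(sum, wrapped, firstEight)
```

Integer SIMD vectors use the wrapping operators (`&+`, `&-`, `&*`), so the caller is responsible for picking a lane type wide enough for the sums involved.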

Slide 34

Slide 34 text

Box Filter using SIMD
for y in 0..<…
….zero
for k in 0..<…
…(yresult[startIndex..<…
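
Most of the code on this slide did not survive, so the following is only a sketch of the general idea, not a reconstruction: summing the same columns of several rows 16 pixels at a time by widening `UInt8` lanes to `UInt16`. The function name `averageRowsSIMD` and its signature are hypothetical.

```swift
/// Averages several rows of equal length column by column, 16 pixels per step.
/// Pixels are widened from UInt8 to UInt16 so the running sum cannot overflow
/// as long as at most 257 rows are added (257 * 255 == 65535).
func averageRowsSIMD(_ rows: [[UInt8]]) -> [UInt8] {
    precondition(!rows.isEmpty && rows.count <= 257, "need 1...257 rows")
    let width = rows[0].count
    precondition(rows.allSatisfy { $0.count == width }, "rows must have equal length")

    let divisor = SIMD16<UInt16>(repeating: UInt16(rows.count))
    var result = [UInt8](repeating: 0, count: width)

    var x = 0
    while x + 16 <= width {
        var sum = SIMD16<UInt16>.zero
        for row in rows {
            // Load 16 pixels and zero-extend each lane to 16 bits.
            let pixels = SIMD16<UInt8>(row[x..<x + 16])
            sum &+= SIMD16<UInt16>(truncatingIfNeeded: pixels)
        }
        // Lane-wise integer division, then narrow back to UInt8.
        let mean = SIMD16<UInt8>(truncatingIfNeeded: sum / divisor)
        for lane in 0..<16 {
            result[x + lane] = mean[lane]
        }
        x += 16
    }

    // Scalar loop for the columns that do not fill a whole vector.
    while x < width {
        var sum = 0
        for row in rows {
            sum += Int(row[x])
        }
        result[x] = UInt8(sum / rows.count)
        x += 1
    }
    return result
}
```

Whether the compiler emits the expected vector instructions depends on the build configuration, so a sketch like this is best checked in a release build.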

Slide 35

Slide 35 text

Box Filter using SIMD with Pointer
let imagePointer: UnsafeMutablePointer<…> = …
var resultPointer: UnsafeMutableBufferPointer<…> = …
let weightSIMD = SIMD16(repeating: UInt16(L * L))
let widthExtended = width + 2 * radius
// Not deallocating on every run leaks memory, but the result of the
// final run is still needed, so deallocation is done here at the start.
resultPointer.deinitialize()
resultPointer.deallocate()
resultPointer = .allocate(capacity: width * height)
resultPointer.initialize(repeating: .zero)
let extendedPointer: UnsafeMutableBufferPointer<…> = .allocate(capacity: L)
for k in 0..<…
….allocate(capacity: widthExtended)
yresultPointer.initialize(repeating: .zero)
for y in 0..<…
….zero
for k in 0..<…
…(extendedPointer[k][x..<…
….zero
for k in 0..<…
…(extendedPointer[k][offset..<…
….zero
for k in 0..<…
…(yresultPointer[startIndex..<…
….zero
for k in 0..<…
…(yresultPointer[startIndex..<…
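
The allocate / initialize / deinitialize / deallocate sequence used above can be shown in isolation. This is a minimal sketch under assumptions: the function `horizontalWindowSums`, its parameters, and the clamp-to-edge border handling are hypothetical, and it only demonstrates the buffer lifecycle on one image row. Note that `deinitialize()` on `UnsafeMutableBufferPointer` requires a recent toolchain (Swift 5.8 or later).

```swift
/// Computes the window sum around every pixel of one row, writing into a
/// manually managed buffer. Window width is (2 * radius + 1), capped at 257
/// so a UInt16 sum cannot overflow.
func horizontalWindowSums(_ row: [UInt8], radius: Int) -> [UInt16] {
    precondition(radius >= 0 && 2 * radius + 1 <= 257, "unsupported radius")
    let width = row.count

    // Manual memory management: allocate, then initialize before first use.
    let sums = UnsafeMutableBufferPointer<UInt16>.allocate(capacity: width)
    sums.initialize(repeating: .zero)
    defer {
        // Skipping these two calls would leak the buffer.
        sums.deinitialize()
        sums.deallocate()
    }

    row.withUnsafeBufferPointer { pixels in
        for x in 0..<width {
            var sum: UInt16 = 0
            for d in -radius...radius {
                // Clamp to the nearest edge pixel (assumed border policy).
                sum &+= UInt16(pixels[min(max(x + d, 0), width - 1)])
            }
            sums[x] = sum
        }
    }

    // Copy out before `defer` releases the buffer.
    return Array(sums)
}
```

Using `defer` keeps the cleanup next to the allocation, which makes it harder to forget than a deallocation placed at the start of the next run.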

Slide 36

Slide 36 text

Box Filter using SIMD with Pointer
[Table: Naive, Separable, Separable + SIMD + Pointer, with speedup factors (…x, …x); numeric values are missing]

Slide 37

Slide 37 text

Box Filter using SIMD with Pointer
[Table: Naive, Separable, Separable + SIMD + Pointer; numeric values are missing]

Slide 38

Slide 38 text

Box Filter using SIMD with Pointer
[Table: time by kind (Naive, Separable, Separable + SIMD + Pointer); numeric values are missing]

Slide 39

Slide 39 text

Summary
• Swift can do everything
• Swift can do high-performance processing

Slide 40

Slide 40 text

References
• SE-0229 SIMD Vectors
  • https://github.com/apple/swift-evolution/blob/main/proposals/0229-simd.md
• SIMD
  • https://developer.apple.com/documentation/swift/simd
• Sample Code
  • https://github.com/kntkymt/swift-image-processing-sample

Slide 41

Slide 41 text

Environment
• swift-driver version: … Apple Swift version … (swiftlang-… clang-…)
• Target: arm64-apple-macosx…
• cmd: swift build -c release