Improving performance by switching from C++ to Python

For an electronic artwork displayed at Spier's recent "Festival of Light" I had the opportunity (need?) to rewrite a legacy C++ code base for driving an LED cube from a Raspberry Pi.

While this might appear to be a talk about making code run faster, it's really a talk about thinking about software carefully at different levels, doing the right thing, not giving up when things are hard, and asking friends for help when needed.

Note: This is an entirely different talk about controlling LEDs with Python to the one I gave last year -- different artwork, completely different kind of LEDs and drivers, and no C involved at all!

Simon Cross

April 06, 2019

Transcript

  1. Table of contents

    1. Backstory
    2. Why rewrite?
    3. Seeing the forest
    4. Bringing it all together
  2. Tesseract 1.0

    For AfrikaBurn 2017 we took up an LED cube called the Tesseract.
    • Wooden frame
    • Control software in C
    • 8 x 8 x 8 LEDs (512)
    • 2m x 2m x 2m (physical size)
    • Electronics on veroboard
    • No budget
  3. Tesseract 1.0

    While this was happening ...
    ... the Tesseract was tumbled 100m. The frame was wrecked.
  4. Tesseract 1.0

    This is what we had left ...
    ... physical size – 20cm x 20cm x 20cm. :|
  5. Tesseract 2.0

    Dream big!
    • Steel frame
    • Control software in Python
    • 8 x 8 x 8 LEDs (512)
    • 2m x 2m x 2m (physical size)
    • Electronics on custom PCB
  6. What the code needs to do

    • Render horizontal layers one at a time as rapidly as possible.
    • Signal clocking needs extremely tight timing.
    • Render the animation frame.
    • Ideally provide a simulator.

    The legacy C++ code did all of this.
  7. Legacy C++

    What was wrong with the C++ code:
    • Performance wasn’t great (used lots of CPU).
    • Original developer had left the project.
    • No one knew why some parts were the way they were.
    • No unit tests.
    • Simulator and real renderer intermingled with #ifdefs.
    • Adding new features was slow.

    Plus it was just so ludicrously low level – we’re trying to make ART!
  8. Legacy C++

    Tight rendering loop.

        #ifdef BCM2835_RENDER
        while(1) {
            for(int i = 0; i < 8; i++ ) // horizontal layers
            {
                memset(gsData, 0x00, sizeof(gsData));
                for( int j = 0; j < 8; j++)
                    for( int k = 0; k < 8; k++ ) {
                        // set item in gsData
                        unsigned char ucValue = m_pLattice->GetValue(j,k,(7-i));
                        int pin_no = pin_mask[j * 8 + k];
                        if( ucValue > 125 )
                            memset(&gsData[pin_no * 12], 0x01, 12);
                    }
                for( int j = 0; j < 16; j++ ) {
                    if( row_mask[i*16+j] != 0 )
                        memset(&gsData[16*4*12+j*12], 0x01, 12);
                }
                // loop over gsData toggling pins on and off
                ...
                TLC5940_SetGS_And_GS_PWM();
            }
            ...
        }
        #endif
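To give a feel for how much of this loop collapses in a higher-level language, here is a minimal sketch of the per-layer threshold step in Python with numpy. It assumes the frame is an 8 x 8 x 8 array of bytes indexed `[j, k, layer]`; the name `layer_bits` is hypothetical, and the `pin_mask` / `row_mask` wiring tables from the slide are not modelled, so this is not the code from the talk.

```python
import numpy as np

THRESHOLD = 125  # same cut-off as the `ucValue > 125` test on the slide


def layer_bits(frame, layer):
    """Return 64 on/off flags for one horizontal layer, ordered j * 8 + k."""
    plane = frame[:, :, 7 - layer]  # same (7 - i) flip as the C++ loop
    return (plane > THRESHOLD).ravel()


# Usage: light a single LED and look at the layer that contains it.
frame = np.zeros((8, 8, 8), dtype=np.uint8)
frame[2, 3, 7] = 200
bits = layer_bits(frame, 0)
```

The two nested `j` / `k` loops become a single array comparison; mapping each flag to a physical driver pin would still need a lookup table like the slide's `pin_mask`.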
  9. Legacy C++

    Rendering text was MAD.

        ...
        // @558 'z' (4 pixels wide)
        0x00, //
        0x00, //
        0xF0, // ####
        0x10, //    #
        0x60, //  ##
        0x80, // #
        0xF0, // ####
        0x00, //
        0x00, //
        ...

    And the code for PLACING the text was even crazier!
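For readers unfamiliar with this kind of font table, the sketch below decodes the 'z' glyph from the slide: each byte is one row, and the most significant bits are the leftmost pixels. The function names are hypothetical and this only illustrates the data format, not how the talk's Python code rendered text.

```python
# Row bytes for 'z' copied from the slide; the glyph is 4 pixels wide.
Z_ROWS = [0x00, 0x00, 0xF0, 0x10, 0x60, 0x80, 0xF0, 0x00, 0x00]
Z_WIDTH = 4


def glyph_pixels(rows, width):
    """Turn row bytes into a list of rows of booleans (True = pixel on)."""
    return [[bool(row & (0x80 >> col)) for col in range(width)] for row in rows]


def glyph_text(rows, width):
    """Render a glyph as strings of '#' and ' ' for quick inspection."""
    return ["".join("#" if on else " " for on in line)
            for line in glyph_pixels(rows, width)]


for line in glyph_text(Z_ROWS, Z_WIDTH):
    print(line)
```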
  10. Legacy C++

    Drawing a sphere wasn’t easy.

        virtual UCHAR GenerateValue(Lattice &xLattice, int i, int j, int k)
        {
            int iResolution = xLattice.GetResolution();
            float x = float(i) / float(iResolution - 1);
            ...
            vector3f vPos(x - 0.5f, y - 0.5f, z - 0.5f);
            matrix3f mRot1 = matrix3f::IDENTITY;
            mRot1.rotate_x(m_fTimer * 0.01f);
            ...
            matrix3f mTrans = matrix3f::IDENTITY;
            mTrans.set_translation(vPos);
            matrix3f mResult;
            multiply<4,4,float>(mRot3, mTrans, mResult);
            mResult.get_translation(vPos);
            for (std::vector<vector3f>::iterator iter = m_vecPoints.begin();
                 iter != m_vecPoints.end(); ++iter)
            {
                float fDist = (vPos - (*iter) * m_fRadius).length();
                const float fMaxDist = (1.0f/float(iResolution-1)) * m_fRadius * 8.0f;
                if (fDist <= fMaxDist)
                    return UCHAR(((fMaxDist - fDist) / fMaxDist) * 255);
            }
            return 0;
        }

    And the second function needed is as long. :|
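By way of contrast, a numpy version of a simple sphere effect might look something like the sketch below. It is a simplification under stated assumptions: it draws a static spherical shell centred in the cube, with brightness falling off linearly with distance from the shell, and does not reproduce the rotation or the point list of the legacy effect. The function name and the `radius` / `thickness` parameters are hypothetical.

```python
import numpy as np


def sphere_frame(radius=0.4, thickness=0.15, size=8):
    """Return a size^3 uint8 frame containing a spherical shell.

    Coordinates run from -0.5 to 0.5 along each axis, matching the
    `x - 0.5f` centring in the legacy code.
    """
    axis = np.linspace(-0.5, 0.5, size)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    dist = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    # 1.0 exactly on the shell, falling to 0.0 at `thickness` away from it.
    closeness = 1.0 - np.abs(dist - radius) / thickness
    return (np.clip(closeness, 0.0, 1.0) * 255).astype(np.uint8)


frame = sphere_frame()
```

The whole volume is computed in one pass over arrays instead of one virtual call per LED.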
  11. Virtual Frames

    Shared interface from effect engine to simulator and LED driver.
    • 8 x 8 x 8 array of bytes
    • one byte per LED
    • numpy (Python)
    • ZeroMQ (network)
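A minimal sketch of what such a virtual frame could look like in numpy follows. The helper names are hypothetical and the network step is left out: the 512-byte payload is simply what one would hand to a messaging library such as ZeroMQ, and the talk's actual wire format is not shown in the transcript.

```python
import numpy as np

FRAME_SHAPE = (8, 8, 8)  # one byte per LED, 512 bytes in total


def frame_to_bytes(frame):
    """Serialise a frame to exactly 512 bytes."""
    frame = np.ascontiguousarray(frame, dtype=np.uint8)
    if frame.shape != FRAME_SHAPE:
        raise ValueError(f"expected shape {FRAME_SHAPE}, got {frame.shape}")
    return frame.tobytes()


def frame_from_bytes(payload):
    """Rebuild a (read-only) frame from a 512-byte payload."""
    if len(payload) != 8 * 8 * 8:
        raise ValueError(f"expected 512 bytes, got {len(payload)}")
    return np.frombuffer(payload, dtype=np.uint8).reshape(FRAME_SHAPE)


# Usage: the effect engine produces a frame, a consumer decodes it.
frame = np.zeros(FRAME_SHAPE, dtype=np.uint8)
frame[1, 2, 3] = 255
payload = frame_to_bytes(frame)
decoded = frame_from_bytes(payload)
```

Because the simulator and the LED driver would both consume the same payload, neither needs to know which effect produced it.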
  12. SPI

    What the heck is SPI?
    • Serial Peripheral Interface (unhelpful)

    (Diagram: an SPI master and an SPI slave, each with SCLK, MOSI, MISO and SS lines.)

    • Miniature processor just for toggling pins!
    • TLCs need two clocks (eek).
    • Can be written to as a file once set up (yay, Unix!)
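To make the "written to as a file" point concrete, here is a hedged sketch of packing grayscale values for one 16-channel driver chip and writing them to a file-like object. It assumes 12-bit values sent most significant bit first with the highest channel first, which is my reading of the TLC5940 datasheet and should be checked against it; the function names are hypothetical, `/dev/spidev0.0` is only a typical Linux device name, and the second clock and latch signals mentioned on the slide are not handled here.

```python
import io


def pack_grayscale(values):
    """Pack 16 twelve-bit values into 24 bytes, highest channel first."""
    if len(values) != 16:
        raise ValueError("expected 16 channel values")
    bits = 0
    for value in reversed(values):  # channel 15 ends up in the top bits
        if not 0 <= value <= 0xFFF:
            raise ValueError("grayscale values must fit in 12 bits")
        bits = (bits << 12) | value
    return bits.to_bytes(24, "big")


def send_grayscale(device, values):
    """Write one chip's worth of data to an open binary file object."""
    device.write(pack_grayscale(values))


# Usage with an in-memory stand-in; on a Raspberry Pi one might instead
# open a device such as "/dev/spidev0.0" in "wb" mode with buffering=0
# (assumed path).
fake_device = io.BytesIO()
send_grayscale(fake_device, [0] * 15 + [0xFFF])
```

Keeping the packing separate from the write means the byte layout can be unit tested without any hardware attached.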
  13. Time to code!

    • Keep the big picture in mind!
    • Keep an eye on the critical paths.
    • Know your tools well.

    Let’s look at the Python!
  14. Results

    What did we end up with?
    • Clear conceptual layers.
    • Solid choices for how data moves through the system.
    • Higher level language for writing effects in.
    • Appropriate use of available hardware.