Given an existing application, how do you get it to run inside Python?

Thoughts
• Rewriting the whole application is not an option.
• Breaking the application for an extended period is not an option.
• Probably want to evolve the code and verify correct behavior as you go.

Personal bias
• I think that Python should always be an optional program feature.
• Clean separation of what a program does and how it is controlled.
• You always want to keep your options open.
• Maybe something better than Python will come along.
• Or you will want to use the code in a different way.

Alternative view
• Build your application as a direct extension of Python using Python objects.
• I won’t discuss that option.
is not to just add a Python interpreter
• You’re really trying to make your software more flexible.
• Python is only a means to an end.

Tasks
• You want to expose the internals of your application.
• Internal data structures, global variables, functions, constants, etc.
• You want to use Python as a mechanism for exploring data.
• Python as a control program.
• Break the traditional batch processing cycle.
• If possible, you want to clean up your software by creating modules.

Comment
• Almost everything I will discuss is independent of Python.
• Would apply to other scripting environments (Perl, Tcl, Ruby, etc.).
• Mostly relates to software engineering and software architecture.
What you need to start
• Python
• An extension building tool (e.g., SWIG)
• A good book on Python.
• Source code for your application (especially header files).
• C/C++ compiler.
• Some understanding of makefiles (hopefully).

Assumptions
• I will assume the use of SWIG in this tutorial.
• Will also assume C/C++ code.
• If using Fortran, there are special Fortran tools (f2py, pyfort).
• Extensions still involve some C programming, however.
main() and expose everything that’s left
• Python interpreter is in control.
• main() no longer used.
• Python wrappers added to provide interface (generated by SWIG).

[Diagram: main() + myprog on one side; python + wrappers + myprogmodule.so on the other]
Module

Make sure you can compile a Python extension
• Create an empty extension module (no functions, no variables, etc.)

    // swig: myprog.i
    %module myprog

• Compile

    $ swig -python myprog.i
    $ cc -c -I/usr/local/include/python2.1 myprog_wrap.c
    $ cc -shared myprog_wrap.o -o myprogmodule.so

Try it and make sure there are no errors

    $ python
    Python 2.1 (#3, Aug 20 2001, 15:41:42)
    [GCC 2.95.2 19991024 (release)] on sunos5
    >>> import myprog
    >>>
Scripting language extensions are libraries
• You need to convert your application to a library.

Suggestion
• Change the makefile to build an archive in addition to an executable.

Before:

    target:
        cc $(OBJS) -lm -o myprog

After:

    target:
        cc $(OBJS) -lm -o myprog
        ar cr libmyprog.a $(OBJS)

Comment
• Include everything in the archive. Don’t worry about main() or other functions.
your extension module with your library
• Easy to do in the Makefile (add a new rule).

    python:
        swig -python myprog.i
        cc -c -I/usr/local/include/python2.1 myprog_wrap.c
        cc -shared myprog_wrap.o -L. -lmyprog -o myprogmodule.so

Try loading into Python

    >>> import myprog
    >>>

• If you get an ImportError, you are probably missing some libraries.
• Add extra libraries to the link line and repeat until problems go away.

Now the tricky part...
• Actually doing something.
• Replicating the functionality of main().
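The "try loading, fix the link line, repeat" loop above can be scripted. The following is a minimal sketch, assuming the module is called myprog as in the slides; try_import is a hypothetical helper, not part of Python or SWIG.

```python
import importlib


def try_import(name):
    """Return (True, module) if `name` imports, else (False, error text).

    try_import is a hypothetical helper written for this sketch.
    """
    try:
        return True, importlib.import_module(name)
    except ImportError as exc:
        # A missing module and a shared library that fails to load
        # (e.g. unresolved symbols) both surface as ImportError.
        return False, str(exc)


if __name__ == "__main__":
    ok, result = try_import("myprog")  # module name taken from the slides
    print("import ok" if ok else "import failed: %s" % result)
```

The error text usually names the unresolved symbol or missing library, which hints at what to add to the link line before rebuilding.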
for initialization code
• In main(), there is probably some code similar to this:

    #include "header.h"
    int main() {
        ...
        init_memory();
        init_io();
        ...
    }

Expose to Python and test

SWIG interface:

    %module myprog
    %{
    #include "header.h"
    %}
    void init_memory();
    void init_io();
    ...

Python session:

    >>> import myprog
    >>> myprog.init_memory()
    >>> myprog.init_io()
    ...

• See if you can get the program to do anything without crashing.
Problem

Pick a very simple computational problem
• Something with a known result.
• Maybe a test run of some kind.

Replicate as a Python script
• Look at the code in main().
• Expose all of the functions needed to run the problem from Python.
• Add to the SWIG interface file.
• Write a short Python script that executes the same sequence of operations.
• Verify the output.

Comments
• This stage of development is probably the most difficult.
• Modifications to build environment.
• Forcefully ripped the whole control mechanism out of the code.
• A tangled mess left behind.

Question
• Now what?
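A rough sketch of such a script follows. FakeMyprog is a hypothetical stand-in for the SWIG-generated module so the example runs on its own; init_memory() and init_io() come from the earlier slide, while the integrate(dt, nsteps) call, the clock it advances, and the expected value are assumptions made for illustration.

```python
import math


class FakeMyprog:
    """Hypothetical stand-in for the wrapped application (not the real module)."""

    def __init__(self):
        self.initialized = []
        self.time = 0.0

    def init_memory(self):
        self.initialized.append("memory")

    def init_io(self):
        self.initialized.append("io")

    def integrate(self, dt, nsteps):
        # Pretend simulation: only advances a clock.
        for _ in range(nsteps):
            self.time += dt
        return nsteps


def run_test_problem(prog, dt=0.01, nsteps=100):
    """Issue the same sequence of calls main() would make for a small test run."""
    prog.init_memory()
    prog.init_io()
    prog.integrate(dt, nsteps)
    return prog.time


def verify(result, expected, tol=1e-9):
    """Compare against the known result with a floating-point tolerance."""
    return math.isclose(result, expected, rel_tol=tol, abs_tol=tol)


if __name__ == "__main__":
    final_time = run_test_problem(FakeMyprog())
    print("PASS" if verify(final_time, 1.0) else "FAIL")
```

With the real extension, passing the imported module in place of FakeMyprog() would be the natural change, with the known result taken from a run of the original executable.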
a library
• The program consists of functions, global variables, constants, classes, etc.
• You will need to think about how all of this is organized.

Header files
• You will probably want to create some kind of library header file.
• Describes every function and variable in your program (that you want to expose).

    /* myprog.h */
    #ifndef _HAVE_MYPROG_H
    #define _HAVE_MYPROG_H
    extern void init_memory();
    extern void init_io();
    extern int integrate(double Dt, int nsteps);
    extern double Cutoff;
    ...
    #endif

• This step may be easy (existing headers).
• May have to do some searching and cleanup.
Extension Module

Library --> Extension
• Expose all functions, variables, structures, etc. in library header to Python.
• Complexity of header will determine difficulty of this task.
• In SWIG...

    %module myprog
    %{
    /* Include header in wrapper file */
    #include "myprog.h"
    %}
    /* Parse the header to generate wrappers */
    %include "myprog.h"

Goal
• Create some kind of rudimentary extension module that mirrors the application.

Comments
• May need to do extra work to make this work.
• May want to tackle the library in small pieces.
• Focus on functions and variables, then structures and classes.
Program

Replicate the functionality of main()
• Write some scripts to solve various problems.
• Exercise various features of the scripting language interface.
• Make sure everything is working like you expect.
• You might even rewrite main() entirely in Python as an experiment.
• Or to maintain backwards compatibility with old programs.

Case study
• In our own application, we added integrated visualization.
• Created Python wrappers for some graphics functions.
• Wrote some small functions to plot data.
• Experimented with interactive simulation/visualization.

    >>> integrate(100)     # simulation
    >>> plot()             # visualization
    >>> rotx()             # visualization
    >>> zoom(200)          # visualization
    >>> integrate(1000)    # simulation
    >>> plot()             # visualization
    ...
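One way a main() rewritten in Python might look is sketched below. The --nsteps and --runs options are invented for illustration (the slides do not show the original command line), and integrate() here is a stand-in for the wrapped function.

```python
import argparse


def integrate(nsteps):
    """Hypothetical stand-in for the wrapped integrate() call."""
    return nsteps


def main(argv=None):
    """Python replacement for the old C main(): parse options, drive the run."""
    parser = argparse.ArgumentParser(prog="myprog")
    # Both options are assumptions made for this sketch.
    parser.add_argument("--nsteps", type=int, default=100)
    parser.add_argument("--runs", type=int, default=1)
    args = parser.parse_args(argv)

    total_steps = 0
    for _ in range(args.runs):
        total_steps += integrate(args.nsteps)
    return total_steps


if __name__ == "__main__":
    print("steps taken:", main())
```

Keeping the option names identical to those of the old executable is what would let existing batch scripts keep working unchanged.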
using the scripted code, you will discover problems
• Awkward function calls.
• Inaccessible data.
• Missing functionality.
• Illogical control flow.
• Catastrophic program crashes.
• Other annoyances.

You will also uncover useful information
• A better sense of how different pieces of the program relate.
• Parts of the program that are independent.
• Opportunities for cleanup and improvement.

This starts a cycle of incremental refinement
• This depends on the application.
• I will give a few examples.
many applications, main() is the glue
• Along with bits and pieces of user interface (UI) code.
• Code elimination may reveal more structure.

An opportunity for code reorganization

[Diagram: a big monolithic program (main() plus scattered UI code) becomes loosely coupled modules once main() is removed]
of having one module, create many

    src/            lib/
    io/             libio.a
    integrate/      libintegrate.a
    grid/           libgrid.a
    crack/          libcrack.a
    force/          libforce.a
    ...             ...

Separate Python modules
• Create a Python extension module for each library independently.
• Clean up header files and other source code.

Comments
• This process takes a lot of work (and thought).
• Separates parts of your application into logical subcomponents.
• Changes development.
• You no longer work on the whole application; you work on smaller components.
• There are some complications with libraries (will cover later).
Programming with modules
• Subdivision into modules makes the application seem smaller.
• Each module is only a few thousand lines of code.
• Doesn’t look or feel like a huge application anymore.
• Simplifies maintenance. Can work on individual modules.

Case study
• In our application, "user" code dropped from 30000 to about 2000 lines of code.
• This was the code users typically wrote to set up and run simulations.
• Everything else disappeared into libraries.
• Instead of copying the whole package into their own directory, users copy a module.
force you to think about data structures
• How modules exchange data.
• The organization of internal data structures.
• Mechanisms for accessing data.
• Extensible data structures.
• Dynamic allocation of data.
• Parameter passing.

Example
• Parameters vs. globals vs. objects

    /* Parameters */
    void foo(int x, int y, double a, double b, int n);

    /* Globals */
    int x, y;
    double a, b;
    void foo(int n);

    /* Objects */
    struct Foo {
        int x, y;
        double a, b;
    };
    void foo(Foo *f, int n);
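Seen from the Python side, the three styles look roughly like the sketch below. The arithmetic inside each function is invented only so the variants can be compared; the slides give no body for foo().

```python
from dataclasses import dataclass


# Style 1: every input is an explicit parameter.
def foo_params(x, y, a, b, n):
    return n * (x + y) * (a + b)


# Style 2: inputs live in module-level globals; only n is passed.
x, y = 1, 2
a, b = 0.5, 1.5


def foo_globals(n):
    return n * (x + y) * (a + b)


# Style 3: inputs are bundled into one object that is passed around.
@dataclass
class Foo:
    x: int
    y: int
    a: float
    b: float


def foo_object(f, n):
    return n * (f.x + f.y) * (f.a + f.b)


if __name__ == "__main__":
    print(foo_params(1, 2, 0.5, 1.5, 2))
    print(foo_globals(2))
    print(foo_object(Foo(1, 2, 0.5, 1.5), 2))
```

The global version has the shortest call, but it allows only one set of inputs at a time; the object version keeps calls short while still allowing several independent Foo instances.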
simulation codes, almost everything centers on data
• Grids, meshes, particles, etc.
• We decided to make data the center of the system (as globals).
• Collection of utility functions for accessing the data.

Modules
• Merely perform operations on the global data (queries, transforms, etc.).
• Python simply loads and coordinates the modules.

[Diagram: Simulation Data at the center, with Python, Visualize, Analysis, I/O, Integrate, and Force around it]
• Assumes that a user only runs one simulation at a time.
• Greatly simplifies the interface between functions and modules.
• Global simulation data is implicitly assumed everywhere (not passed).
• Object-orientation used elsewhere in the system (e.g., visualization).
• This is not the only approach.

Contrast to data-flow

[Diagram: data passed between Initial Condition, Iterator, Integrate, Visualize, Boundary, and Force stages]
Data

Data Wrapping
• C/C++ data structures can be hidden behind Python objects.
• Objects can use operator overloading to mimic Python lists, dictionaries, etc.

Operators
• Almost all standard mathematical operators can be overloaded.

Tools
• SWIG makes this very easy: simplified implementation of accessors.

    double *ary;
    ...

    class dArray:
        def __init__(self,ptr):
            self.ptr = ptr
        def __getitem__(self,index):
            ....
        def __setitem__(self,index,value):
            ....
        ...

    >>> a = dArray(ary)
    >>> print a[3]
    3.887233
    >>> a[3] = 2.5
    >>>
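A runnable version of the dArray idea is sketched below. It uses ctypes as a stand-in for the SWIG-generated accessor functions, which is an assumption made so the example is self-contained; the explicit length argument and the bounds checking are also additions, since a bare C pointer carries no size.

```python
import ctypes


class dArray:
    """Proxy exposing a C array of doubles through list-like indexing."""

    def __init__(self, ptr, length):
        self.ptr = ptr          # ctypes pointer to the first element
        self.length = length    # tracked here because C pointers have no size

    def __len__(self):
        return self.length

    def _check(self, index):
        if not isinstance(index, int):
            raise TypeError("dArray indices must be integers")
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("dArray index out of range")
        return index

    def __getitem__(self, index):
        # Reads go straight to the underlying C memory.
        return self.ptr[self._check(index)]

    def __setitem__(self, index, value):
        # Writes modify the C memory in place.
        self.ptr[self._check(index)] = float(value)


# Stand-in for memory owned by the C application.
storage = (ctypes.c_double * 5)(0.0, 1.0, 2.0, 3.887233, 4.0)
a = dArray(ctypes.cast(storage, ctypes.POINTER(ctypes.c_double)), len(storage))

if __name__ == "__main__":
    print(a[3])
    a[3] = 2.5
    print(storage[3])
```

Because __getitem__ raises IndexError past the end, the proxy also works in for loops and with list(a) without any extra code.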
Data

Access vs. marshalling
• Primary goal is to provide access to underlying data structures.
• Usually do not want to convert data into Python objects (marshalling).
• Example: convert a C/C++ array into a Python list.
• Instead, you provide a proxy object that refers to C/C++ data.

Discussion
• Should you use native Python objects in the simulation?
• Example: Numeric Python arrays.
• I think it’s a bad idea if you care about portability and long-term development.
• Not everyone agrees with me on this point.
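The distinction can be demonstrated with the standard library alone. In this sketch an array.array of doubles plays the role of the C/C++ storage (an assumption for illustration), list() is the marshalling step, and memoryview is the proxy.

```python
from array import array

# Stand-in for storage owned by the C/C++ side.
data = array("d", [1.0, 2.0, 3.0])

# Marshalling: build a new Python list; it is detached from the storage.
copied = list(data)

# Proxy: a view that refers to the same underlying buffer.
view = memoryview(data)

copied[0] = 99.0   # changes only the copy
view[1] = 42.0     # changes the underlying storage

if __name__ == "__main__":
    print(list(data))
```

Besides the cost of copying large arrays, the marshalled list goes stale as soon as the simulation updates its data, whereas the proxy always reflects the current contents.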
turned into a library
• Elimination of old user interface.
• Identification of useful functionality.

Incremental refinement
• Identification of modules.
• Source cleanup.
• Changes to the API.
• Changes to data structures (if necessary).
• This is a gradual process.

Role of Python
• Python allows application to be used interactively.
• Can explore data structures, execute functions, etc.
• This kind of exploration can identify problems in the design.
• If it’s hard to use from Python, then maybe there is a better way.
• Interactivity allows you to experiment with module interfaces.
• Interaction between modules.
code
• Initial scripting language interface took a few days.
• Incremental refinement for more than a year afterwards.
• Data structure cleanup.
• Modules.
• Error handling.
• User interface problems. Changed some functions around.
• Changes were always motivated by the use of the code from Python.

Overview
• System broken into about 8 different modules.
• Each module built as a separate Python extension.
• SWIG interface for each module (incorporated into build process).
• Modules maintained separately (split the source code).

Discussion