An Extensible Compiler for Creating Scriptable Scientific Software

David M. Beazley
Department of Computer Science
The University of Chicago
[email protected]

ICCS’02, April 22, 2002

Scripted Scientific Software

Many scientific programs are being transformed
[Figure: a C/C++ application run as non-interactive batch processing becomes a C/C++ application used as an extension of a scripting language (Python, Perl, Tcl, etc.).]

Motivation: Scripting languages offer many benefits
• Interpreted and interactive user environment.
• Extensible with compiled C/C++/Fortran code.
• Rapid prototyping.
• Debugging.
• Systems integration and components.
• Can make applications look a lot like MATLAB, IDL, etc.

Extension Code Example

Python wrapper

    #include "Python.h"

    extern int gcd(int,int);

    PyObject *wrap_gcd(PyObject *self, PyObject *args) {
        int x,y;
        int result;
        if (!PyArg_ParseTuple(args,"ii", &x, &y)) {
            return NULL;
        }
        result = gcd(x,y);
        return PyInt_FromLong(result);
    }

• Data conversion from C <---> Python.
• You would write a wrapper for each part of your program.
• Ex: 300 C functions ==> 300 wrapper functions.
• C++ classes, structures, templates, etc. are more complicated.

The Problem

No one wants to write extension code
• Highly repetitive.
• Prone to error.
• Difficult for complicated programs.

Other issues
• Scientific programs are characterized by rapid change.
• Functions change, variables change, objects change.
• Piecemeal development.
• Would require continual maintenance of the wrappers.
• Complicates development.
• Makes scripting languages impractical to use in early stages of a project.

The SWIG Project

SWIG
• A C/C++ compiler for generating wrappers to existing code.
• Freely available and in development since 1995.
• Currently targets Python, Perl, Tcl, Ruby, Java, PHP, Guile, and MzScheme.

Source translation
• C++ header files are parsed to generate wrappers.
[Figure: header files (.h) ==> swig ==> wrapper code connecting C/C++ to Perl, Python, Tcl, Ruby, ...]

Goals
• Make it extremely easy for users (scientists) to build wrappers.
• Allow scripting interface to automatically track changes in underlying source.
• Make the wrapping process as invisible as possible.

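To make the source-translation idea concrete, here is a toy sketch in Python that turns one simple C prototype into the kind of wrapper shown in the extension code example. It is not how SWIG works internally (SWIG uses a real C/C++ parser, not a regular expression). The names `generate_wrapper` and `FORMAT_CODES` are hypothetical, and only `int` results with `int` or `double` arguments are handled.

```python
import re

# Hypothetical table: C argument type -> PyArg_ParseTuple format code.
FORMAT_CODES = {"int": "i", "double": "d"}

# Matches only prototypes of the form: int name(type arg, type arg, ...);
PROTOTYPE = re.compile(r"^\s*int\s+(\w+)\s*\(([^)]*)\)\s*;\s*$")


def generate_wrapper(prototype):
    """Return C wrapper source for a prototype such as 'int gcd(int x, int y);'."""
    match = PROTOTYPE.match(prototype)
    if match is None:
        raise ValueError("unsupported prototype: %r" % prototype)
    name, arg_text = match.groups()

    args = []
    for piece in arg_text.split(","):
        words = piece.split()
        if len(words) != 2 or words[0] not in FORMAT_CODES:
            raise ValueError("unsupported argument: %r" % piece)
        args.append((words[0], words[1]))

    fmt = "".join(FORMAT_CODES[ctype] for ctype, _ in args)
    decls = "".join("    %s %s;\n" % (ctype, arg) for ctype, arg in args)
    refs = ", ".join("&" + arg for _, arg in args)
    call = ", ".join(arg for _, arg in args)
    return (
        "PyObject *wrap_%s(PyObject *self, PyObject *args) {\n" % name
        + decls
        + "    int result;\n"
        + '    if (!PyArg_ParseTuple(args, "%s", %s)) {\n' % (fmt, refs)
        + "        return NULL;\n"
        + "    }\n"
        + "    result = %s(%s);\n" % (name, call)
        + "    return PyInt_FromLong(result);\n"
        + "}\n"
    )
```

Calling `print(generate_wrapper("int gcd(int x, int y);"))` emits a `wrap_gcd` function; rerunning the generator after a header changes is what lets the scripting interface track the source.
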
SWIG Overview

Key components:
• Header file parsing.
• Special SWIG directives.

Supported C++ features
• Functions, variables, constants.
• Classes.
• Inheritance and multiple inheritance.
• Pointers, references, arrays, member pointers.
• Overloading (with renaming).
• Operators.
• Namespaces.
• Templates.
• Preprocessing.

Not supported
• Nested classes, member templates, template partial specialization.

Will show a few examples
• Not complete coverage of SWIG.

Creating a Module

Compilation and linking

    % swig -python example.i
    % cc -c -I/usr/local/include/python2.1 example_wrap.c
    % cc -shared example_wrap.o $(OBJS) -o examplemodule.so

Use

    % python
    Python 2.1 (#3, Aug 20 2001, 15:41:42)
    [GCC 2.95.2 19991024 (release)] on sunos5
    >>> import example
    >>> example.gcd(12,16)
    4
    >>>

Comments:
• Modules are built as shared libraries/DLLs.
• Dynamic loading is used to import into the interpreter.
• Contents of the module are similar to C.

How it Works

C++ (input)

    class Complex {
    public:
        Complex(double r = 0, double i = 0);
        Complex operator+(const Complex &);
        double re();
        ...
    };

[Figure: the C++ input passes through SWIG-generated procedure wrappers in the extension module (DLL) and a SWIG-generated Python class to reach the Python script.]

Python script

    >>> a = Complex(2,3)
    >>> b = Complex(4,5)
    >>> c = a + b
    >>> c.re()
    6.0
    >>>

How it Works

• User only works with the input file (C++) and scripts.
• Details of wrappers are hidden.
• Wrappers are not modified by the user. They are only used to compile the DLL.

C++ (input)

    class Complex {
    public:
        Complex(double r = 0, double i = 0);
        Complex operator+(const Complex &);
        double re();
        ...
    };

[Figure: C++ input ==> swig ==> extension module (DLL) ==> Python script.]

Python script

    >>> a = Complex(2,3)
    >>> b = Complex(4,5)
    >>> c = a + b
    >>> c.re()
    6.0
    >>>

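The layering of a Python class on top of procedure wrappers can be sketched in pure Python. Everything here is a stand-in: `new_Complex`, `Complex_re`, and `Complex_add` are hypothetical names playing the role of compiled wrapper functions, and a dictionary of handles plays the role of C++ objects on the heap. The code SWIG actually generates differs in detail.

```python
import itertools

# Stand-in for the compiled extension module. In a real build these would be
# C wrapper functions that receive and return pointers to C++ objects.
_heap = {}                     # handle -> (re, im)
_handles = itertools.count(1)  # source of fresh handles


def new_Complex(r=0.0, i=0.0):
    handle = next(_handles)
    _heap[handle] = (float(r), float(i))
    return handle


def Complex_re(handle):
    return _heap[handle][0]


def Complex_add(handle, other):
    (r1, i1), (r2, i2) = _heap[handle], _heap[other]
    return new_Complex(r1 + r2, i1 + i2)


class Complex(object):
    """Proxy class: stores a handle and forwards to the procedural wrappers."""

    def __init__(self, r=0.0, i=0.0, _handle=None):
        # _handle lets the proxy adopt an object created by a wrapper call.
        self.this = new_Complex(r, i) if _handle is None else _handle

    def __add__(self, other):
        return Complex(_handle=Complex_add(self.this, other.this))

    def re(self):
        return Complex_re(self.this)
```

With this in place, `Complex(2,3) + Complex(4,5)` returns another proxy, so the script reads like the C++ class while every operation still goes through flat function calls.
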
Challenges

C/C++ is a bad interface definition language
• Type system complexity:

    typedef int (*PFIA[20])(int, double *x);
    double foo(PFIA *const x);

• Ambiguity in data conversion (pointers, arrays, output values, etc.):

    double bar(double *x, double *y, double *r);

• Structures, classes, unions.
• Templates, namespaces, overloading, operators, etc.

SWIG solution
• Declaration annotation.
• Pattern based type conversion.
• Will provide a brief tour of internals.

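The pointer ambiguity can be seen from Python with the standard `ctypes` module. This is only an illustration of the problem, not part of SWIG: the same `double *` type describes both an input array and a single output value, so a declaration such as `bar` above cannot say which one is meant.

```python
import ctypes

DoublePtr = ctypes.POINTER(ctypes.c_double)

# Interpretation 1: the pointer refers to the first element of an input array.
values = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
as_array = ctypes.cast(values, DoublePtr)

# Interpretation 2: the pointer refers to one double that the callee fills in.
result = ctypes.c_double(0.0)
as_output = ctypes.pointer(result)
as_output[0] = 4.5  # what a C function would do with an output argument

# Both objects have exactly the same pointer type.
same_type = type(as_array) is type(as_output)
```

Here `same_type` is true even though the two pointers are used in completely different ways, which is why extra annotation is needed to pick a conversion.
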
Type Conversion

Problem: marshalling
• Must convert data between scripting and C representation.

Example:

    int gcd(int x, int y);                  /* integers */
    int count(char *buf, int len, char c);  /* string, integer, single character */

In Python

    >>> gcd(12,16)
    4
    >>> count("Hello",5,"e")
    1

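A minimal sketch of the marshalling step for the `count` example, written in modern Python with the standard `ctypes` module. The helper names (`to_c_int`, `to_c_string`, `to_c_char`, `marshal_count_args`) are hypothetical and the ASCII encoding is an assumption; a generated wrapper performs the equivalent checks in C.

```python
import ctypes


def to_c_int(obj):
    # bool is a subclass of int in Python; it is rejected to keep the rule strict.
    if isinstance(obj, bool) or not isinstance(obj, int):
        raise TypeError("expected an integer")
    return ctypes.c_int(obj)


def to_c_string(obj):
    if not isinstance(obj, str):
        raise TypeError("expected a string")
    return ctypes.c_char_p(obj.encode("ascii"))


def to_c_char(obj):
    # A C char argument is passed from Python as a one-character string.
    if not isinstance(obj, str) or len(obj) != 1:
        raise TypeError("expected a single character")
    return ctypes.c_char(obj.encode("ascii"))


def marshal_count_args(buf, length, c):
    """Convert Python arguments for: int count(char *buf, int len, char c)."""
    return to_c_string(buf), to_c_int(length), to_c_char(c)
```

For example, `marshal_count_args("Hello", 5, "e")` yields a C string, a C int, and a C char, while passing `"ee"` as the last argument raises `TypeError`.
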
Typemaps and Datatypes

Pattern matching integrated with C++ typesystem

    %typemap(in) int { ... }

    typedef int Integer;
    ...
    Integer gcd(Integer x, Integer y);

    namespace std {
        class string;
        %typemap(in) string * { ... };
    }

    namespace S = std;
    using std::string;
    ...
    void foo(string *a, S::string *b);

Comments:
• All type conversion in SWIG is pattern based.
• Type conversion by naming convention.
• Mostly hidden from users.
• Allows advanced customization.

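The effect of matching through the type system can be modelled with a toy lookup in Python. This is a sketch under strong simplifying assumptions: types are plain strings, aliases are reduced before any lookup, and the tables `ALIASES` and `TYPEMAPS` and the functions `canonical` and `find_typemap` are hypothetical. SWIG's real matching rules are more elaborate.

```python
# Toy tables; the entries mirror the declarations on the slide.
ALIASES = {
    "Integer": "int",            # typedef int Integer;
    "string": "std::string",     # using std::string;
    "S::string": "std::string",  # namespace S = std;
}

TYPEMAPS = {
    ("in", "int"): "int typemap",
    ("in", "std::string *"): "std::string * typemap",
}


def canonical(ctype):
    """Reduce the base name of a type such as 'S::string *' through ALIASES."""
    parts = ctype.split()
    base, suffix = parts[0], parts[1:]
    seen = set()
    while base in ALIASES and base not in seen:  # 'seen' guards against cycles
        seen.add(base)
        base = ALIASES[base]
    return " ".join([base] + suffix)


def find_typemap(method, ctype):
    """Return the typemap registered for the canonical type, or None."""
    return TYPEMAPS.get((method, canonical(ctype)))
```

In this model both arguments of `foo(string *a, S::string *b)` resolve to the `std::string *` entry, and `Integer` resolves to the `int` entry, without the user writing extra typemaps.
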
Advanced Typemap Example

Conversion of Numeric Python array to C

    %typemap(in) (double *mat, int nx, int ny) {
        PyArrayObject *array;
        if (!PyArray_Check($input)) {
            PyErr_SetString(PyExc_TypeError,"Expected an array");
            return NULL;
        }
        array = (PyArrayObject *)
            PyArray_ContiguousFromObject($input, PyArray_DOUBLE, 2, 2);
        if (!array) {
            PyErr_SetString(PyExc_ValueError,
                "array must be two-dimensional and of type float");
            return NULL;
        }
        $1 = (double *) array->data;    /* Assign mat */
        $2 = array->dimensions[0];      /* Assign nx */
        $3 = array->dimensions[1];      /* Assign ny */
    }
    ...
    double determinant(double *mat, int nx, int ny);

Key points
• SWIG can be customized to handle new datatypes.
• Customized data marshalling.

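The same checks can be mirrored in pure Python to show what the typemap hands to `determinant`. This is only a model: nested lists stand in for a Numeric array, the standard `array` module stands in for the contiguous C buffer, and `marshal_matrix` is a hypothetical name.

```python
from array import array

SHAPE_ERROR = "array must be two-dimensional and of type float"


def marshal_matrix(obj):
    """Return (mat, nx, ny): a contiguous buffer of doubles plus both dimensions."""
    if not isinstance(obj, (list, tuple)):
        raise TypeError("Expected an array")
    rows = list(obj)
    if not rows or not all(isinstance(row, (list, tuple)) for row in rows):
        raise ValueError(SHAPE_ERROR)
    ny = len(rows[0])
    if any(len(row) != ny for row in rows):
        raise ValueError(SHAPE_ERROR)
    try:
        # Row-major flattening, as in a contiguous C array.
        mat = array("d", (float(v) for row in rows for v in row))
    except (TypeError, ValueError):
        raise ValueError(SHAPE_ERROR)
    return mat, len(rows), ny
```

One Python argument expands into three C arguments, which is the point of the multi-argument pattern `(double *mat, int nx, int ny)`.
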
Using SWIG

Summary
• Existing C/C++ header files are used to build wrappers.
• Process is guided by some special SWIG directives.
• Most details are hidden from the user.
• Can customize output using typemaps and other features.

[Figure: the header files (.h) of a scientific application (C/C++/Fortran) and an interface file (.i) are processed by swig into a wrapper layer, which is built together with the application into a DLL.]

Extending SWIG

SWIG consists of several components
• Preprocessor.
• C++ parser.
• C++ type system.
• Fully supports multi-pass compilation/code generation.
• Internal data structures loosely based on XML-DOM.

Target language modules
• Implemented as C++ classes.
• Virtual methods redefined according to target language.

    class SomeLanguage : public Language {
    public:
        virtual void main(int argc, char *argv[]);
        virtual int top(Node *n);
        virtual int functionWrapper(Node *n);
        virtual int variableWrapper(Node *n);
        ...
    };

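The role of the virtual methods can be illustrated with a small Python analogy. It is not SWIG's C++ code: the dictionary-based parse tree, the dispatch inside `top`, and the `ToyPython` and `ToyTcl` classes are all assumptions made for the sketch; only the method names come from the class shown above.

```python
class Language(object):
    """Toy base class: walks a parse tree and dispatches on the node kind."""

    def top(self, node):
        output = []
        for child in node["children"]:
            if child["kind"] == "function":
                output.append(self.functionWrapper(child))
            elif child["kind"] == "variable":
                output.append(self.variableWrapper(child))
        return output

    def functionWrapper(self, node):
        raise NotImplementedError

    def variableWrapper(self, node):
        raise NotImplementedError


class ToyPython(Language):
    # Each target language redefines only the code-generation hooks.
    def functionWrapper(self, node):
        return "python function wrapper for %s" % node["name"]

    def variableWrapper(self, node):
        return "python variable wrapper for %s" % node["name"]


class ToyTcl(Language):
    def functionWrapper(self, node):
        return "tcl command for %s" % node["name"]

    def variableWrapper(self, node):
        return "tcl linked variable for %s" % node["name"]


tree = {
    "kind": "top",
    "children": [
        {"kind": "function", "name": "gcd"},
        {"kind": "variable", "name": "Foo"},
    ],
}
```

Running `ToyPython().top(tree)` and `ToyTcl().top(tree)` over the same tree gives different output for each target, while the parsing and traversal are shared.
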
Limitations

Unsupported C++ features
• Nested classes (soon).
• Certain advanced features of templates.
• Not all C++ features map cleanly to a scripting interface.
• Subtle differences in semantics (assignment, overloading, etc.).

Problematic topics
• Callback functions and methods.
• Memory management (object ownership).
• Arrays: no universal representation, marshalling, or mapping to arguments.

Current Status and Availability

SWIG is actively used and developed
• 750 members on mailing list ([email protected]).
• 86,000 downloads in the last 3 years.
• Used in industry and commercial products.
• And real scientific computing applications.

Status
• Currently working on a major new release (SWIG-1.3.x ===> SWIG-2.0).
• About 6 active developers.
• Major enhancements to C++ handling (templates, namespaces, type system).
• New target languages.

Availability:
• http://www.swig.org
• And many Linux distributions.