An Extensible Compiler for Creating Scriptable Scientific Software

Conference presentation. ICCS 2002. Amsterdam.

David Beazley

April 22, 2002

Transcript

  1. An Extensible Compiler for Creating Scriptable Scientific Software

    ICCS’02, April 22, 2002
    David M. Beazley
    Department of Computer Science, The University of Chicago
    beazley@cs.uchicago.edu
    PY-MA
  2. Scripted Scientific Software

    Many scientific programs are being transformed

    [Diagram: a C/C++ application used for non-interactive batch processing becomes a C/C++ application running as an extension of a scripting language (Python, Perl, Tcl, etc.)]

    Motivation: Scripting languages offer many benefits
    • Interpreted and interactive user environment.
    • Extensible with compiled C/C++/Fortran code.
    • Rapid prototyping.
    • Debugging.
    • Systems integration and components.
    • Can make applications look a lot like MATLAB, IDL, etc.
  3. Extension Programming

    Wrappers
    • main() replaced by wrappers (data marshalling, error handling, etc.)
    • Similar to stub-code with RPC, CORBA, COM, etc.
    • Goal: expose application internals to interpreter (functions, classes, variables, ...)

        >>> gcd(12,16)
        4

    [Diagram: in the original application, main() sits on top of the scientific application (C/C++/Fortran); in the scripted version, a scripting interpreter sits on top of a wrapper layer, which sits on top of the scientific application.]
  4. Extension Code Example

    Python wrapper

        #include "Python.h"
        extern int gcd(int,int);

        PyObject *wrap_gcd(PyObject *self, PyObject *args) {
            int x,y;
            int result;
            if (!PyArg_ParseTuple(args,"ii", &x, &y)) {
                return NULL;
            }
            result = gcd(x,y);
            return PyInt_FromLong(result);
        }

    • Data conversion from C <---> Python.
    • You would write a wrapper for each part of your program.
    • Ex: 300 C functions ==> 300 wrapper functions
    • C++ classes, structures, templates, etc. are more complicated.
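The steps a hand-written wrapper performs (unpack the arguments, check their types, call the C function, box the result) can be modelled in plain Python. This is only an illustrative sketch, not SWIG output: `gcd` below is a Python stand-in for the C function (Euclid's algorithm is assumed), and the type check merely imitates what the `"ii"` format string asks `PyArg_ParseTuple` to do.

```python
def gcd(x, y):
    """Stand-in for the C function `int gcd(int, int)` (assumed: Euclid)."""
    while y:
        x, y = y, x % y
    return x


def wrap_gcd(args):
    """Model of the wrapper: unpack and check args, call gcd, box the result."""
    # Rough equivalent of PyArg_ParseTuple(args, "ii", &x, &y)
    if len(args) != 2 or not all(isinstance(a, int) for a in args):
        raise TypeError("gcd() expects two integers")
    x, y = args
    result = gcd(x, y)
    # Rough equivalent of PyInt_FromLong(result)
    return int(result)


print(wrap_gcd((12, 16)))  # 4
```

Every exposed function needs a wrapper of this shape, which is why 300 C functions imply 300 wrappers.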
  5. The Problem

    No one wants to write extension code
    • Highly repetitive.
    • Prone to error.
    • Difficult for complicated programs.

    Other issues
    • Scientific programs characterized by rapid change.
    • Functions change, variables change, objects change.
    • Piecemeal development.
    • Would require continual maintenance of the wrappers.
    • Complicates development.
    • Makes scripting languages impractical to use in early stages of a project.
  6. The SWIG Project

    SWIG
    • A C/C++ compiler for generating wrappers to existing code.
    • Freely available and in development since 1995.
    • Currently targets Python, Perl, Tcl, Ruby, Java, PHP, Guile, and Mzscheme.

    Source translation
    • C++ header files are parsed to generate wrappers

    [Diagram: C/C++ header files (.h) are fed to swig, which produces wrapper code for Perl, Python, Tcl, Ruby, ...]

    Goals
    • Make it extremely easy for users (scientists) to build wrappers.
    • Allow scripting interface to automatically track changes in underlying source.
    • Make the wrapping process as invisible as possible.
  7. SWIG Overview

    Key components:
    • Header file parsing.
    • Special SWIG directives.

    Supported C++ features
    • Functions, variables, constants.
    • Classes.
    • Inheritance and multiple inheritance.
    • Pointers, references, arrays, member pointers.
    • Overloading (with renaming).
    • Operators.
    • Namespaces.
    • Templates.
    • Preprocessing.

    Not supported
    • Nested classes, member templates, template partial specialization.

    Will show a few examples
    • Not a complete coverage of SWIG.
  8. Input files

    C/C++ declarations mixed with special SWIG directives.

        // example.i : Sample SWIG input file
        %module example
        %{
        #include "example.h"
        %}

        // Resolve a name clash with Python
        %rename(cprint) print;

        // C/C++ declarations
        extern int gcd(int x, int y);
        extern int fact(int n);
        ...
        %include "example.h"      // Parse a header file
        ...
  9. Creating a Module

    Compilation and linking

        % swig -python example.i
        % cc -c -I/usr/local/include/python2.1 example_wrap.c
        % cc -shared example_wrap.o $(OBJS) -o examplemodule.so

    Use

        % python
        Python 2.1 (#3, Aug 20 2001, 15:41:42)
        [GCC 2.95.2 19991024 (release)] on sunos5
        >>> import example
        >>> example.gcd(12,16)
        4
        >>>

    Comments:
    • Modules built as shared libraries/DLLs.
    • Dynamic loading used to import into interpreter.
    • Contents of the module similar to C.
  10. A More Complicated Example

    • Structures/classes mapped into wrapper objects.
    • Provides natural access from an interpreter.

    C++

        class Complex {
            double real, imag;
        public:
            Complex(double r = 0, double i = 0);
            Complex(const Complex &c);
            Complex &operator=(const Complex &c);
            Complex operator+(const Complex &);
            Complex operator-(const Complex &);
            Complex operator*(const Complex &);
            Complex operator-();
            double re();
            double im();
            ...
        };

    Python

        >>> a = Complex(3,4)
        >>> b = Complex(5,6)
        >>> c = a + b
        >>> c.re()
        8.0
        >>> c.im()
        10.0
        >>>
  11. Structure Extension

    Converting C structures to classes
    • Can make C programs look OO (or extend C++ classes)

        typedef struct {
            double x,y,z;
        } Vector;
        ...
        %addmethods Vector {
            Vector(double x, double y, double z) {
                Vector *r = (Vector *) malloc(sizeof(Vector));
                r->x = x; r->y = y; r->z = z;
                return r;
            }
            double magnitude() {
                return sqrt(self->x*self->x+self->y*self->y + self->z*self->z);
            }
            ...
        };
  12. Template Wrapping

    %template directive

        template<class T> class vector {
        public:
            vector();
            ~vector();
            T get(int index);
            int size();
            ...
        };

        // Instantiate templates
        %template(intvector) vector<int>;
        %template(doublevector) vector<double>;

    In Python

        >>> v = intvector()
        ...
        >>> x = v.get(2)
        >>> print v.size()
        10
        >>>
  13. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };
  14. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Procedure wrappers

        Complex *
        new_Complex(double r, double i) {
            return new Complex(r,i);
        }
        Complex *
        Complex_operator___add__(Complex *self, Complex *other) {
            Complex *r;
            r = new Complex(self->operator+(*other));
            return r;
        }
        double Complex_re(Complex *self) {
            return self->re();
        }
  15. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Procedure wrappers, compiled into the extension module (DLL)

        Complex *
        new_Complex(double r, double i) {
            return new Complex(r,i);
        }
        Complex *
        Complex_operator___add__(Complex *self, Complex *other) {
            Complex *r;
            r = new Complex(self->operator+(*other));
            return r;
        }
        double Complex_re(Complex *self) {
            return self->re();
        }
  16. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Procedure wrappers, compiled into the extension module (DLL)

        Complex *
        new_Complex(double r, double i) {
            return new Complex(r,i);
        }
        Complex *
        Complex_operator___add__(Complex *self, Complex *other) {
            Complex *r;
            r = new Complex(self->operator+(*other));
            return r;
        }
        double Complex_re(Complex *self) {
            return self->re();
        }

    Python class

        class Complex:
            def __init__(self,r,i):
                self.this = new_Complex(r,i)
            def __add__(self,other):
                return Complex_operator___add__(self.this,other)
            def re(self):
                return Complex_re(self.this)
            ...
  17. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Procedure wrappers, compiled into the extension module (DLL)

        Complex *
        new_Complex(double r, double i) {
            return new Complex(r,i);
        }
        Complex *
        Complex_operator___add__(Complex *self, Complex *other) {
            Complex *r;
            r = new Complex(self->operator+(*other));
            return r;
        }
        double Complex_re(Complex *self) {
            return self->re();
        }

    Python class

        class Complex:
            def __init__(self,r,i):
                self.this = new_Complex(r,i)
            def __add__(self,other):
                return Complex_operator___add__(self.this,other)
            def re(self):
                return Complex_re(self.this)
            ...

    Python script

        >>> a = Complex(2,3)
        >>> b = Complex(4,5)
        >>> c = a + b
        >>> c.re()
        6
        >>>
  19. How it Works

    C++ (input)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Procedure wrappers, compiled into the extension module (DLL): SWIG generated

    Python class: SWIG generated

    Python script

        >>> a = Complex(2,3)
        >>> b = Complex(4,5)
        >>> c = a + b
        >>> c.re()
        6
        >>>
  20. How it Works

    • User only works with input file (C++) and scripts.
    • Details of wrappers hidden.
    • Wrappers not modified by user. Only used to compile DLL.

    C++ (input), processed by swig into the extension module (DLL)

        class Complex {
        public:
            Complex(double r = 0, double i = 0);
            Complex operator+(const Complex &);
            double re();
            ...
        };

    Python script

        >>> a = Complex(2,3)
        >>> b = Complex(4,5)
        >>> c = a + b
        >>> c.re()
        6
        >>>
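The layering walked through in this sequence (flat procedure wrappers underneath, a thin Python class on top) can be tried out without a compiler. In the sketch below the "extension module" is simulated in pure Python: the handle table `_objects` and the helper `_from_handle` are hypothetical stand-ins, and, unlike the simplified class on the slides, `__add__` passes `other.this` and re-wraps the returned handle so the result is again a `Complex`.

```python
import itertools

# --- Stand-in for the compiled extension module. Assumption: in the real
# --- system these are C++ functions in a DLL returning opaque pointers;
# --- here an integer handle indexes a Python dict.
_objects = {}
_next_handle = itertools.count(1)


def new_Complex(r, i):
    handle = next(_next_handle)
    _objects[handle] = (float(r), float(i))
    return handle


def Complex_operator___add__(self_h, other_h):
    (r1, i1), (r2, i2) = _objects[self_h], _objects[other_h]
    return new_Complex(r1 + r2, i1 + i2)


def Complex_re(self_h):
    return _objects[self_h][0]


# --- Proxy class layered on top of the flat procedure wrappers.
class Complex:
    def __init__(self, r=0, i=0):
        self.this = new_Complex(r, i)

    @classmethod
    def _from_handle(cls, handle):
        # Hypothetical helper: wrap an existing handle without allocating.
        obj = cls.__new__(cls)
        obj.this = handle
        return obj

    def __add__(self, other):
        handle = Complex_operator___add__(self.this, other.this)
        return Complex._from_handle(handle)

    def re(self):
        return Complex_re(self.this)


a = Complex(2, 3)
b = Complex(4, 5)
c = a + b
print(c.re())  # 6.0
```

The script at the bottom only ever touches the `Complex` class; the handle-based functions stay hidden, which is the point the slide makes about wrapper details.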
  21. Challenges

    C/C++ is a bad interface definition language
    • Type system complexity:

        typedef int (*PFIA[20])(int, double *x);
        double foo(PFIA *const x);

    • Ambiguity in data conversion (pointers, arrays, output values, etc.)

        double bar(double *x, double *y, double *r);

    • Structures, classes, unions.
    • Templates, namespaces, overloading, operators, etc.

    SWIG solution
    • Declaration annotation.
    • Pattern based type conversion.
    • Will provide a brief tour of internals.
  22. Declaration Annotation

    The underlying customization mechanism

    Declaration modifiers (special directives)

        %module example
        %rename(cprint) print;
        %ignore Complex::operator=;

        %include "example.h"

    Pattern matching (unmodified C/C++)

        // example.h
        void print(char *s);
        class Complex {
        public:
            void print();
            ...
            Complex& operator=(const Complex &);
            ...
        };
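One way to picture declaration annotation is as lookup tables that are filled in by the directives and consulted while the unmodified header is walked. The Python sketch below is an illustration under stated assumptions, not SWIG's implementation: the flat `(scope, name)` declaration list is hypothetical, and an unqualified `%rename` is assumed to apply to every declaration with that name, as the slide's example suggests.

```python
# Directives collected before the header is parsed (taken from the slide).
renames = {"print": "cprint"}        # %rename(cprint) print;
ignores = {"Complex::operator="}     # %ignore Complex::operator=;

# Hypothetical flat view of the declarations found in example.h,
# written as (scope, name); an empty scope means global.
declarations = [
    ("", "print"),
    ("Complex", "print"),
    ("Complex", "operator="),
]


def annotate(decls):
    """Return (qualified C++ name, scripting name) pairs, skipping ignored ones."""
    out = []
    for scope, name in decls:
        qualified = f"{scope}::{name}" if scope else name
        if qualified in ignores:
            continue  # matched by %ignore: no wrapper is generated
        out.append((qualified, renames.get(name, name)))
    return out


for cxx_name, script_name in annotate(declarations):
    print(cxx_name, "->", script_name)
```

The header itself is never edited; only the tables change when the interface needs adjusting.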
  23. Declaration Annotation

    Advanced features
    • Fully integrated with the C++ type system.
    • Annotations can be parameterized with type signatures.

    Example:

        %ignore Object::bar(string *s) const;
        ...
        class Object {
            ...
            void bar(string *s);
            void bar(string *s) const;     // Ignored
            ...
        };
        class Foo : public Object {
            ...
            void bar(string *s);
            void bar(string *s) const;     // Ignored
            ...
        };
  24. Type Conversion

    Problem: marshalling
    • Must convert data between scripting and C representation.

    Example:

        int gcd(int x, int y);
        int count(char *buf, int len, char c);

    Datatypes involved: integers, a string, a single character.

    In Python

        >>> gcd(12,16)
        4
        >>> count("Hello",5,"e")
        1
  25. Pattern-Based Type Conversion

    Typemap patterns: C datatype conversion code (depends on target language)

        %typemap(in) int {
            $1 = PyInt_AsLong($input);
        }
        %typemap(out) int {
            $result = PyInt_FromLong($1);
        }
        %typemap(in) char * {
            $1 = PyString_AsString($input);
        }
        ...
        %include "example.h"

    C header

        int gcd(int x, int y);
        ...
        int count(char *buf, int len, char c);
        ...

    Note: user rarely writes this.
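A typemap can be read as a code template that is selected by C type and then has its `$`-variables filled in when a wrapper is generated. The sketch below models just that substitution step in Python; the table layout and the replacement names `arg1`, `obj0`, and `resultobj` are assumptions made for illustration.

```python
# Typemap table keyed by (method, C type); bodies copied from the slide.
typemaps = {
    ("in", "int"): "$1 = PyInt_AsLong($input);",
    ("out", "int"): "$result = PyInt_FromLong($1);",
    ("in", "char *"): "$1 = PyString_AsString($input);",
}


def expand(method, ctype, subst):
    """Look up a typemap by C type and substitute its $-variables."""
    code = typemaps[(method, ctype)]
    # Replace longer variable names first so a short name can never
    # clobber the beginning of a longer one.
    for var in sorted(subst, key=len, reverse=True):
        code = code.replace("$" + var, subst[var])
    return code


print(expand("in", "int", {"1": "arg1", "input": "obj0"}))
print(expand("out", "int", {"result": "resultobj", "1": "result"}))
```

Because the lookup is driven by the declared type, every `int` parameter in the header picks up the same conversion without further annotation.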
  26. Typemaps

    Named typemaps:

        %typemap(in) double nonnegative {
            $1 = PyFloat_AsDouble($input);
            if ($1 < 0) {
                PyErr_SetString(PyExc_ValueError,"domain error!");
                return NULL;
            }
        }
        double sqrt(double nonnegative);

    Sequences

        %typemap(in) (char *buf, int len) {
            $1 = PyString_AsString($input);
            $2 = PyString_Size($input);
        }
        int count(char *buf, int len, char c);

        >>> count("Hello","e")
        1
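The two variations on this slide, matching on a parameter name and matching a sequence of consecutive parameters, can be modelled as a lookup that prefers the most specific pattern. The Python sketch below is a simplified illustration; the table contents, the labels, and the greedy longest-first strategy are assumptions, not a description of SWIG's actual matching rules.

```python
# Patterns are tuples of (type, name); a name of None matches any name.
typemaps = {
    (("double", "nonnegative"),): "check-nonnegative",
    (("double", None),): "plain-double",
    (("char *", "buf"), ("int", "len")): "string-and-length",
    (("char", None),): "single-char",
}


def match(params):
    """Assign a typemap to each run of parameters.

    Multi-argument patterns are tried before single-argument ones, and an
    exact (type, name) match is preferred over a type-only match.
    Returns a list of (typemap label, number of C parameters consumed).
    """
    result, i = [], 0
    longest = max(len(key) for key in typemaps)
    while i < len(params):
        for width in range(longest, 0, -1):
            window = tuple(params[i:i + width])
            if len(window) < width:
                continue
            unnamed = tuple((ctype, None) for ctype, _ in window)
            key = window if window in typemaps else unnamed
            if key in typemaps:
                result.append((typemaps[key], width))
                i += width
                break
        else:
            raise LookupError(f"no typemap for {params[i]}")
    return result


# int count(char *buf, int len, char c);
print(match([("char *", "buf"), ("int", "len"), ("char", "c")]))
```

The first entry consumes two C parameters for one scripting-side argument, which is why `count("Hello","e")` takes two arguments in Python while the C function takes three.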
  27. Typemaps and Datatypes

    Pattern matching integrated with C++ typesystem

        %typemap(in) int { ... }

        typedef int Integer;
        ...
        Integer gcd(Integer x, Integer y);

        namespace std {
            class string;
            %typemap(in) string * { ... };
        }

        namespace S = std;
        using std::string;
        ...
        void foo(string *a, S::string *b);

    Comments:
    • All type conversion in SWIG is pattern based.
    • Type conversion by naming convention.
    • Mostly hidden from users.
    • Allows advanced customization.
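For the patterns to match through `typedef`, namespace aliases, and `using` declarations, each type name has to be reduced to the spelling under which the typemap was registered. The Python sketch below shows one simple way to do that for the declarations on this slide; the three tables and the `canonical` helper are hypothetical and handle only a single trailing pointer.

```python
typedefs = {"Integer": "int"}              # typedef int Integer;
namespace_aliases = {"S": "std"}           # namespace S = std;
using_decls = {"string": "std::string"}    # using std::string;

typemaps = {
    "int": "int-conversion",
    "std::string *": "string-conversion",
}


def canonical(ctype):
    """Reduce a type name to the spelling the typemap was registered under."""
    pointer = ctype.endswith(" *")
    base = ctype[:-2] if pointer else ctype
    if "::" in base:
        # Expand a namespace alias such as S:: to std::
        ns, _, name = base.partition("::")
        base = namespace_aliases.get(ns, ns) + "::" + name
    # Resolve names brought in by a using declaration.
    base = using_decls.get(base, base)
    # Follow typedef chains down to the underlying type.
    while base in typedefs:
        base = typedefs[base]
    return base + (" *" if pointer else "")


def find_typemap(ctype):
    return typemaps[canonical(ctype)]


for t in ("Integer", "string *", "S::string *"):
    print(t, "->", find_typemap(t))
```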
  28. Advanced Typemap Example

    Conversion of Numeric Python array to C

        %typemap(in) (double *mat, int nx, int ny) {
            PyArrayObject *array;
            if (!PyArray_Check($input)) {
                PyErr_SetString(PyExc_TypeError,"Expected an array");
                return NULL;
            }
            array = (PyArrayObject *)
                PyArray_ContiguousFromObject($input, PyArray_DOUBLE, 2, 2);
            if (!array) {
                PyErr_SetString(PyExc_ValueError,
                    "array must be two-dimensional and of type float");
                return NULL;
            }
            $1 = (double *) array->data;     /* Assign grid */
            $2 = array->dimensions[0];       /* Assign nx */
            $3 = array->dimensions[1];       /* Assign ny */
        }
        ...
        double determinant(double *mat, int nx, int ny);

    Key point
    • SWIG can be customized to handle new datatypes.
    • Customized data marshalling.
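The checks in this typemap (is it an array, is it two-dimensional, can it be made contiguous as doubles) can be mirrored in standard-library Python to make the marshalling step concrete. The sketch below swaps the Numeric array for nested lists and uses the `array` module for the contiguous buffer; `to_double_matrix` is a hypothetical helper, not part of SWIG or Numeric.

```python
from array import array

_SHAPE_ERROR = "array must be two-dimensional and of type float"


def to_double_matrix(obj):
    """Return (flat, nx, ny): a contiguous row-major buffer of doubles and its shape."""
    if not isinstance(obj, (list, tuple)):
        raise TypeError("Expected an array")
    rows = [list(r) if isinstance(r, (list, tuple)) else None for r in obj]
    # Reject empty input, non-sequence rows, and ragged rows.
    if not rows or any(r is None for r in rows) or len({len(r) for r in rows}) != 1:
        raise ValueError(_SHAPE_ERROR)
    try:
        flat = array("d", (float(v) for r in rows for v in r))
    except (TypeError, ValueError):
        raise ValueError(_SHAPE_ERROR) from None
    return flat, len(rows), len(rows[0])


flat, nx, ny = to_double_matrix([[1, 2, 3], [4, 5, 6]])
print(list(flat), nx, ny)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0] 2 3
```

As in the typemap, one scripting-side object is turned into three C-side values: the data pointer and the two dimensions.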
  29. Using SWIG

    Summary
    • Existing C/C++ header files used to build wrappers.
    • Process guided by some special SWIG directives.
    • Most details hidden from user.
    • Can customize output using typemaps and other features.

    [Diagram: the header files (.h) of the scientific application (C/C++/Fortran) and an interface file (.i) are fed to swig, which produces the wrapper layer; the wrapper layer and the scientific application are built into a DLL.]
  30. Extending SWIG

    SWIG consists of several components
    • Preprocessor.
    • C++ parser.
    • C++ type system.
    • Fully supports multi-pass compilation/code generation.
    • Internal data structures loosely based on XML-DOM.

    Target language modules
    • Implemented as C++ classes.
    • Virtual methods redefined according to target language.

        class SomeLanguage : public Language {
        public:
            virtual void main(int argc, char *argv[]);
            virtual int top(Node *n);
            virtual int functionWrapper(Node *n);
            virtual int variableWrapper(Node *n);
            ...
        };
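The target-language module design is a tree walk that dispatches to overridable handlers. The Python sketch below mimics that shape only: the `Node` class, the `kind` attribute, and the string results are all hypothetical, and the real modules are C++ classes whose handlers emit wrapper code.

```python
class Node(dict):
    """Hypothetical parse-tree node: a dict of attributes plus child nodes."""

    def __init__(self, kind, name, children=()):
        super().__init__(kind=kind, name=name)
        self.children = list(children)


class Language:
    """Base class: walks the tree and dispatches to overridable handlers."""

    def top(self, node):
        out = []
        for child in node.children:
            # e.g. kind "function" dispatches to functionWrapper
            handler = getattr(self, child["kind"] + "Wrapper")
            out.append(handler(child))
        return out

    def functionWrapper(self, node):
        raise NotImplementedError

    def variableWrapper(self, node):
        raise NotImplementedError


class SomeLanguage(Language):
    """A target module overrides only the handlers it needs."""

    def functionWrapper(self, node):
        return f"wrap function {node['name']}"

    def variableWrapper(self, node):
        return f"wrap variable {node['name']}"


tree = Node("top", "example", [Node("function", "gcd"), Node("variable", "Foo")])
print(SomeLanguage().top(tree))
```

Adding a new target language then amounts to writing one more subclass; the parser and type system are shared.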
  31. Limitations

    Unsupported C++ features
    • Nested classes (soon).
    • Certain advanced features of templates.
    • Not all C++ features map cleanly to scripting interface.
    • Subtle differences in semantics (assignment, overloading, etc.)

    Problematic topics
    • Callback functions and methods.
    • Memory management (object ownership).
    • Arrays. No universal representation, marshalling, mapping to arguments.
  32. Related Work

    Many extension building tools are available

    Goals
    • Simplify extension programming.
    • Automate extension programming.

    Common approaches
    • Programming libraries.
    • Specialized compilers.
    • Mixed language procedure inlining.

    • SWIG
    • CABLE
    • Inline
    • Boost Python
    • Wrappy
    • Grad
    • f2py
    • pyfort
    • G-wrap
    • Tolua
    • CXX
    • Pyrex
    • Weave
    • Many others
  33. Current Status and Availability

    SWIG is actively used and developed
    • 750 members on mailing list (swig@cs.uchicago.edu)
    • 86000 downloads in last 3 years.
    • Used in industry and commercial products.
    • And real scientific computing applications.

    Status
    • Currently working on major new release (SWIG-1.3.x ===> SWIG-2.0).
    • About 6 active developers.
    • Major enhancements to C++ handling (templates, namespaces, type system).
    • New target languages.

    Availability:
    • http://www.swig.org
    • And many Linux distributions.