Slide 1

Slide 1 text

RETHINKING EXTENSION DEVELOPMENT FOR PHP AND HHVM
A look at the new MongoDB driver
Jeremy Mikola
jmikola
With special thanks to Derick Rethans and Hannes Magnusson for assisting with slides and working tirelessly to get us to 1.0.0

Slide 2

Slide 2 text

One Extension, Three Engines PHP 5 HHVM PHP 7 A History Lesson Rethinking It All A New Architecture Implementation and Userland Current Status and Roadmap

Slide 3

Slide 3 text

History

Slide 4

Slide 4 text

History: 0.8.4 (April 2009)
Resources for connections
Classes for BSON types: Code, Date, Object ID, Regex, BinData
Very few functions: mongo_connect, mongo_close, mongo_remove, mongo_query, mongo_insert, mongo_batch_insert, mongo_update, mongo_has_next, mongo_next, mongo_gridfs_init, mongo_gridfs_store, mongo_gridfile_write

Slide 5

Slide 5 text

History: 0.8.4 (April 2009)
include 'Mongo.php';
$m = new Mongo();
$c = $m->selectCollection('phpt', 'find');
$c->drop();
$c->insert(array('foo' => 'bar', 'a' => 'b', 'b' => 'c',));
$cursor = $c->find(array('foo' => 'bar'), array('a' => 1, 'b' => 1));
while ($cursor->hasNext()) {
    var_dump($cursor->getNext());
}

Slide 6

Slide 6 text

History: 0.9.0 (May 2009) Support for PHP 5.3 Reimplement most PHP-based functionality in C Tests are implemented with PHPUnit

Slide 7

Slide 7 text

History: 1.0.0 (September 2009)
Query operators don’t have to start with $ anymore:
$m = new Mongo;
$c = $m->selectCollection('dbname', 'colname');
$r = $c->find(array('fieldname' => array('$gte' => 42)));
ini_set('mongo.cmd', '@');
$m = new Mongo;
$c = $m->selectCollection('dbname', 'colname');
$r = $c->find(array('fieldname' => array('@gte' => 42)));

Slide 8

Slide 8 text

History: 1.0.0 (September 2009)
BSON arrays and documents come back as PHP arrays
$m = new Mongo;
$c = $m->selectCollection('demo', 'test');
$c->drop();
$c->insert((object) array('foo' => 'bar'));
$c->insert(array('foo' => 'bar'));
foreach ($c->find(array('foo' => 'bar'), array('_id' => 0)) as $result) {
    var_dump($result);
}
↓
array(1) {
  ["foo"]=>
  string(3) "bar"
}
array(1) {
  ["foo"]=>
  string(3) "bar"
}

Slide 9

Slide 9 text

History: 1.2.8 (February 2012) Derick joins as a contractor, after having helped with integer types (November 2011) We no longer use any functionality implemented as PHP classes (everything is C) Connection handling uses connection pools Kristina’s last release

Slide 10

Slide 10 text

History: 1.3.0 (November 2012) Hannes starts helping out (June 2012) Rewritten connection handling and replica set connection management Sane branching in GIT:

Slide 11

Slide 11 text

History: 1.4.0 – 1.6.0
Releases now come together with MongoDB server releases
May 2013 – 1.4.0 – MongoDB 2.4: SSL support, Jeremy starts helping out
Apr 2014 – 1.5.0 – MongoDB 2.6: SASL support, command cursors, write commands
Jan 2015 – 1.6.0 – MongoDB 3.0: Rewritten cursor support, MONGO_METHOD refactoring

Slide 12

Slide 12 text

Current State of the Legacy Driver
Former, "great" design choices:
Array and object deserialization
Positional arguments vs. options arrays
insert() injection of generated Object IDs
No clear strategy for adding command helpers
Configuration via INI and static properties: MongoCursor::$timeout, MongoCursor::$slaveOkay
MongoLog and stream context listeners
GridFS implementation could get some more love

Slide 13

Slide 13 text

Rethinking It All

Slide 14

Slide 14 text

Goals for a New Driver Support for multiple engines Well-thought-out and minimalistic API Faster to write, easier to maintain Build on top of existing code

Slide 15

Slide 15 text

Which Engines?
PHP 5: PHP 5.4+
HHVM: HHVM 3.9+
PHP 7: As of 1.1.1

Slide 16

Slide 16 text

Engine: PHP 5 PHP 5.4 – PHP 5.6 Not sure what else to say…

Slide 17

Slide 17 text

Engine: HHVM Alternative PHP runtime from Facebook Evolution of HipHop, which compiled PHP to C++ binaries Implements PHP language spec, supports Hack language Written in C++, OCAML, et al.

Slide 18

Slide 18 text

Engine: PHP 7 Userland code is mostly compatible with PHP 5 Internals are quite different from PHP 5

Slide 19

Slide 19 text

New Architecture

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

libbson
libbson is a new shared library written in C for developers wanting to work with the BSON serialization format.
It’s used by libmongoc
It’s used by the drivers directly as well
High performance BSON serialization and deserialization
Callback API for deserialization

Slide 22

Slide 22 text

libmongoc (mongo-c-driver)
libmongoc is a client library written in C for MongoDB
Meant as both a low level driver for applications and higher level languages
About two years old
Used by the PHP and HHVM drivers, among others

Slide 23

Slide 23 text

Phongo (mongo-php-driver) New MongoDB extension for PHP 5.4+ Backwards compatibility is not preserved Very minimal API Core classes, BSON, exceptions Meant to be built upon by PHP libraries (like the 0.8.4 legacy driver) Uses PHP namespaces

Slide 24

Slide 24 text

On Backwards Compatibility Old extension name is "mongo" New extension name is "mongodb" Both extensions may be loaded simultaneously PECL PHP.net docs

Slide 25

Slide 25 text

Legacy API (mongo.so)
$m = new MongoClient('mongodb://localhost:27017');
$m->demo->test->drop();
$m->demo->test->insert([
    'string' => 'bar',
    'number_i' => 55,
    'number_l' => 12345678901234567,
    'bool' => true,
    'null' => null,
    'float' => M_PI,
]);
$cursor = $m->demo->test->find();
foreach ($cursor as $result) {
    var_dump($result);
}

Slide 26

Slide 26 text

New API (mongodb.so)
$m = new MongoDB\Driver\Manager('mongodb://localhost:27017');
$cmd = new MongoDB\Driver\Command(['drop' => 'test']);
$m->executeCommand('demo', $cmd);
$bulk = new MongoDB\Driver\BulkWrite;
$bulk->insert([
    'string' => 'bar',
    'number_i' => 55,
    'number_l' => 12345678901234567,
    'bool' => true,
    'null' => null,
    'float' => M_PI,
]);
$m->executeBulkWrite('demo.test', $bulk);
$query = new MongoDB\Driver\Query([]);
$cursor = $m->executeQuery('demo.test', $query);
foreach ($cursor as $result) {
    var_dump($result);
}

Slide 27

Slide 27 text

Hippo’s API Same implementation of Phongo’s public API, but available as an extension for HHVM Drop-in replacement HHVM is C++, so no shared code with Phongo Parts of the extension may be written in Hack HNI: HHVM-Native Interface

Slide 28

Slide 28 text

Serialization Specification
Persistence in Hippo and Phongo
This document discusses how compound structures (documents, arrays, objects) are persisted through the drivers, and how they are brought back into PHP land.
Serialization to BSON
Arrays
If an array is a packed array (i.e. the keys start at 0 and are sequential without gaps): BSON array.
If the array is not packed (i.e. it has associative (string) keys, the keys don't start at 0, or there are gaps): BSON object.
A top-level (root) document always serializes as a BSON document.
Examples
These serialize as a BSON array:
[8, 5, 2, 3] => [8, 5, 2, 3]
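The packed-array rule above can be sketched in plain PHP. This is only an illustration of the rule as stated on the slide, not the driver's own code: the function name bsonContainerFor and its $isRoot parameter are hypothetical, and treating an empty nested array as packed is an assumption.

```php
<?php
// Hypothetical helper (not part of the driver): decides which BSON container
// a PHP array would map to under the rule described on the slide.
function bsonContainerFor(array $value, bool $isRoot = false): string
{
    // A top-level (root) document always serializes as a BSON document.
    if ($isRoot) {
        return 'document';
    }

    // Packed means the keys are exactly 0, 1, 2, ... in order, without gaps.
    $expected = 0;
    foreach ($value as $key => $unused) {
        if ($key !== $expected) {
            return 'document';
        }
        $expected++;
    }

    // Assumption: an empty nested array counts as packed.
    return 'array';
}

var_dump(bsonContainerFor([8, 5, 2, 3]));        // sequential keys from 0
var_dump(bsonContainerFor(['foo' => 'bar']));    // string key
var_dump(bsonContainerFor([0 => 8, 2 => 5]));    // gap in the keys
var_dump(bsonContainerFor([8, 5, 2, 3], true));  // root value
```

Only the first call reports 'array'; a string key, a gap in the keys, or being the root value all lead to 'document'.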

Slide 29

Slide 29 text

A Built-in ODM, too!
Object serialization
This driver includes four public interfaces to facilitate serializing PHP classes to and from BSON.
MongoDB\BSON\Type (no methods)
MongoDB\BSON\Serializable: abstract public method bsonSerialize();
MongoDB\BSON\Unserializable: abstract public method bsonUnserialize(array $data);
MongoDB\BSON\Persistable: extends both BSON\Serializable and BSON\Unserializable
Objects that implement the MongoDB\BSON\Type interface get special treatment by the BSON serializer. In general, these objects represent a BSON type that cannot be natively represented in PHP (e.g. MongoDB\BSON\UTCDatetime, MongoDB\BSON\ObjectID) and are explicitly checked for and handled by the driver. MongoDB\BSON\Type should not be implemented directly by userland classes.
Userland classes that require custom BSON serialization should utilize the MongoDB\BSON\Serializable interface and implement the bsonSerialize() function, which should return a document (i.e. PHP array or stdClass object) representing the document that should be stored.
Furthermore, if the object implements the MongoDB\BSON\Persistable interface, the driver will also inject a MongoDB\BSON\Binary value (with type 0x80 and an internal field name) into the document, which contains the userland object's fully qualified classname. This field can then be used during unserialization to ensure that the BSON document becomes an object of the same class on the way out of the database.
During unserialization of a document, if a BSON Binary value (with type 0x80 and within the expected internal field name) is encountered, the driver will peek at the value and attempt to resolve it to a classname (triggering autoloaders if necessary). If the class name cannot be resolved, nothing magical happens; however, if the class exists, the driver will create a new instance (without invoking the constructor) and invoke its
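As a rough illustration of the interfaces listed on this slide, a userland class might implement MongoDB\BSON\Persistable along the following lines. This is a minimal sketch that assumes ext-mongodb is loaded; the Person class, its fields, and its getter are hypothetical, and only the interface and its two method names come from the slide.

```php
<?php
// Sketch only: this example needs the mongodb extension (ext-mongodb).
if (!extension_loaded('mongodb')) {
    exit("This sketch needs ext-mongodb\n");
}

// Hypothetical userland class; the interface is provided by the extension.
class Person implements MongoDB\BSON\Persistable
{
    private $name;
    private $createdAt;

    public function __construct(string $name)
    {
        $this->name = $name;
        $this->createdAt = time();
    }

    // Returns the document (here a PHP array) that should be stored.
    public function bsonSerialize(): array
    {
        return ['name' => $this->name, 'createdAt' => $this->createdAt];
    }

    // Receives the stored document's fields as an array and restores state.
    public function bsonUnserialize(array $data): void
    {
        $this->name = $data['name'];
        $this->createdAt = $data['createdAt'];
    }

    public function getName(): string
    {
        return $this->name;
    }
}

$person = new Person('Ada');
var_dump($person->bsonSerialize());
```

An instance like $person could then be handed to a write and read back through a query, at which point the class-name field described above is what lets the driver hand back a Person rather than a generic structure.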

Slide 30

Slide 30 text

The Implementation

Slide 31

Slide 31 text

Hippo (implementation)
An extract from ext_mongodb.php:
<?hh
namespace MongoDB\Driver;
<<__Native>>
public function __construct(string $dsn = "localhost", array $options = [], array $driverOptions = []);
<<__Native>>
public function __debugInfo(): array;
<<__Native>>
public function executeQuery(string $namespace, Query $query, ReadPreference $readPreference = null): Cursor;
<<__Native>>
public function selectServer(ReadPreference $readPreference): Server;
// …
}
?>

Slide 32

Slide 32 text

Hippo (implementation)
An extract from src/MongoDB/Driver/Manager.cpp:
void HHVM_METHOD(MongoDBDriverManager, __construct, const String &dsn, const Array &optio
{
    MongoDBDriverManagerData* data = Native::data<MongoDBDriverManagerData>(this_);
    mongoc_uri_t *uri;
    mongoc_client_t *client;
    uri = hippo_mongo_driver_manager_make_uri(dsn.c_str(), options);
    client = mongoc_client_new_from_uri(uri);
    if (!client) {
        throw MongoDriver::Utils::throwRunTimeException("Failed to create Manager from UR
    }
    data->m_client = client;
    hippo_mongo_driver_manager_apply_ssl_opts(data->m_client, driverOptions);
    hippo_mongo_driver_manager_apply_rp(data->m_client, options);
    hippo_mongo_driver_manager_apply_wc(data->m_client, options);
}

Slide 33

Slide 33 text

Hippo (implementation)
An extract from ext_mongodb.php:
<?hh
namespace MongoDB\Driver;
<<__NativeData("MongoDBDriverReadPreference")>>
final class ReadPreference {
    <<__Native>>
    private function _setReadPreference(int $readPreference): void;
    <<__Native>>
    private function _setReadPreferenceTags(array $tagSets): void;
    public function __construct(int $readPreference, mixed $tagSets = null) {
        if ($tagSets !== NULL && Utils::mustBeArrayOrObject('parameter 2', $tagSets
            return;
        }
        switch ($readPreference) {
            case ReadPreference::RP_PRIMARY:
            case ReadPreference::RP_PRIMARY_PREFERRED:
            case ReadPreference::RP_SECONDARY:
            case ReadPreference::RP_SECONDARY_PREFERRED:
            case ReadPreference::RP_NEAREST:
                // calling into Native
                $this->_setReadPreference($readPreference);
                if ($tagSets) {

Slide 34

Slide 34 text

Hippo vs. Phongo
void HHVM_METHOD(MongoDBDriverManager, __construct, const String &dsn, const Array &optio
{
    /* … */
}
PHP_METHOD(Manager, __construct)
{
    php_phongo_manager_t *intern;
    zend_error_handling error_handling;
    mongoc_uri_t *uri;
    char *uri_string;
    int uri_string_len;
    zval *options = NULL;
    bson_t bson_options = BSON_INITIALIZER;
    zval *driverOptions = NULL;
    (void)return_value; (void)return_value_ptr; (void)return_value_used;
    zend_replace_error_handling(EH_THROW, phongo_exception_from_phongo_domain(PHONGO_ERRO
    intern = (php_phongo_manager_t *)zend_object_store_get_object(getThis() TSRMLS_CC)
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|a!a!", &uri_string, &uri_stri
        zend_restore_error_handling(&error_handling TSRMLS_CC);
        return;
    }
    zend_restore_error_handling(&error_handling TSRMLS_CC);
    if (options) {
        zval_to_bson(options, PHONGO_BSON_NONE, &bson_options, NULL TSRMLS_CC);
    }
    /* … */

Slide 35

Slide 35 text

Adding PHP 7 Support to Phongo
Order of elements in structs
char* vs. zend_string
zend_hash changes
PHP 7 itself was being developed alongside our new driver, which meant frequent internal API changes
PHP 7 was released after we shipped 1.0 (PHP 5, HHVM), so support was added in 1.1.1

Slide 36

Slide 36 text

A Minimalistic API Does Not a Full Driver Make

Slide 37

Slide 37 text

MongoDB PHP Library (PHPLIB)
Userland library sitting atop Phongo and Hippo
Depends on ext-mongodb
Implements the convenience methods lacking from the basic extension APIs: CRUD specification, command helpers, and collection and index enumeration
Installable via Composer:
$ mkdir new-project && cd new-project
$ composer require "mongodb/mongodb=^1.0.0"

Slide 38

Slide 38 text

Library Usage
require 'vendor/autoload.php';
$m = new MongoDB\Client("mongodb://localhost:27017");
$c = $m->selectCollection('test', 'demo');
$c->drop();
$c->insertOne([
    'string' => 'bar',
    'number_i' => 55,
    'number_l' => 12345678901234567,
    'bool' => true,
    'null' => null,
    'float' => M_PI,
]);
foreach ($c->find() as $result) {
    var_dump($result);
}

Slide 39

Slide 39 text

Challenges Along the Way

Slide 40

Slide 40 text

Dependencies and APIs were a Moving Target As the first major driver built atop l i b m o n g o c , there were bugs to be found and fixed PHPLIB required identical Phongo and Hippo APIs, which wasn’t always the case during development Namespace changes along the way Exceptions moved to MongoDB\Driver\Exception Debates over a top-level BSON\ namespace Consolidating classes and methods CommandResult and QueryResult → Cursor Remove single-write helpers on Manager

Slide 41

Slide 41 text

HHVM Extension Writing Woes Many of the internal APIs are not documented Subtle compatibility issues regarding exceptions and constructors Bundling and configuring HHVM does not yet have a PECL equivalent Sharing connections and state across requests Derick’s memoirs: HHVM/HNI Extension Cookbook

Slide 42

Slide 42 text

Current Status

Slide 43

Slide 43 text

PHP and HHVM Drivers
Phongo 1.1.2 released
Hippo 1.1.0RC1 released
Docs: php.net/mongodb
Userland Library
1.0.0 released
Docs: mongodb.github.io/mongo-php-library

Slide 44

Slide 44 text

Differentiating New from Old
New PHP Driver
JIRA: PHPC
PECL: mongodb
Docs: PHP.net/mongodb
GitHub: mongodb/mongo-php-driver
Legacy PHP Driver
JIRA: PHP
PECL: mongo
Docs: PHP.net/mongo
GitHub: mongodb/mongo-php-driver-legacy

Slide 45

Slide 45 text

On the Roadmap Resolve HHVM memory issues and ship 1.1.0 Additional ODM features planned for 1.2.0 GridFS coming to PHPLIB in 1.1.0 Upgrading Doctrine ODM and other OSS libs for legacy apps Mongo PHP Adapter

Slide 46

Slide 46 text

Future Possibilities

Slide 47

Slide 47 text

PHP Code Service by François Laupretre
An easy way to mix C and PHP in extensions
PECL extension: pcs
Supports both PHP 5 and 7 (pecl-compat)
Opcode caching and minimal loading overhead
Classes may be autoloaded
Functions/constants registered on RINIT

Slide 48

Slide 48 text

PCS Architecture

Slide 49

Slide 49 text

PCS with MongoDB

Slide 50

Slide 50 text

Thanks! speakerdeck.com/jmikola joind.in/talk/0929c Questions?