XHP Behind the Scenes

XHP is a PHP 7 extension that augments the syntax of the language so that XML document fragments become valid PHP expressions.

Fadhil Mandaga

January 10, 2017

Transcript

  1. History
    Developed by Facebook; the repo is now abandoned: https://github.com/facebookarchive/xhp-php5-extension
    First commit June 23rd, 2009; last commit Jan 21st, 2015 (~6 years), for PHP 5.6.
    First release tagged 1.0.1 on Aug 20th, 2009; last release tagged 1.6.0.
    XHP Class Library repo: https://github.com/facebook/xhp-lib
    Wiki: https://github.com/facebook/xhp-lib/wiki/
  2. PHP 7 Improvement by KMK-ONLINE
    Active repo: https://github.com/KMK-ONLINE/xhp-php7-extension
    Wiki: https://github.com/KMK-ONLINE/xhp-php7-extension/wiki
    Latest version: tag 1.7.1.
    Compiles against Zend Engine v3.0.0 and supports PHP 7.0 syntax.
    Scanner and parser are decoupled; new functions xhp_token_get_all and xhp_token_name were added.
    Enabled by using the <?hh open tag.
    Already used in production at http://www.liputan6.com
  3. Requirements and Specification
    PHP is written in C, but XHP is written in C++.
    The PHP lexer uses re2c, but the XHP lexer uses flex.
    Both the PHP and XHP parsers use bison.
    Text concatenation uses the __gnu_cxx rope (a non-standard STL extension used as an alternative to std::string).
    The PHP XHP extension hooks the Zend API zend_compile_file and uses the XHP preprocessor to convert XHP syntax into valid PHP syntax.
  4. PHP XHP Extension: ext.cpp
        typedef zend_op_array* (zend_compile_file_t)(zend_file_handle*, int TSRMLS_DC);
        static zend_compile_file_t* dist_compile_file;
        static zend_op_array* xhp_compile_file(zend_file_handle* f, int type TSRMLS_DC);

        static PHP_MINIT_FUNCTION(xhp) {
          // Save the original compile hook, then install the XHP one.
          dist_compile_file = zend_compile_file;
          zend_compile_file = xhp_compile_file;
          return SUCCESS;
        }
  5. PHP XHP Extension: ext.cpp (Cont’d)
        // Abridged on the slide: declarations and some arguments are elided.
        static zend_op_array* xhp_compile_file(zend_file_handle* f, int type TSRMLS_DC) {
          original_code = f->handle.stream.mmap.buf;
          result = xhp_preprocess(original_code, rewrit, ...);
          if (result == XHPErred) {
            zend_bailout();                        // preprocessing failed: abort compilation
          } else if (result == XHPRewrote) {
            code_to_give_to_php = &rewrit;         // XHP syntax was rewritten into PHP
          } else {
            code_to_give_to_php = &original_code;  // nothing to rewrite
          }
          // Hand the chosen source to the original compiler through a fake file handle.
          zend_file_handle fake_file;
          fake_file.handle.stream.mmap.buf = code_to_give_to_php->c_str();
          fake_file.handle.stream.mmap.len = code_to_give_to_php->size();
          zend_op_array* ret = dist_compile_file(&fake_file, type TSRMLS_CC);
          return ret;
        }
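The hook installation on the two ext.cpp slides follows a common "save the previous function pointer, install your own, delegate to the saved one" pattern. A minimal standalone sketch of that pattern is given below. It does not use the Zend API: `compile_fn`, `default_compile`, `compile_hook`, `previous_compile`, `hooked_compile` and `install_hook` are hypothetical stand-ins, and the `<?hh` to `<?php` replacement is only a toy rewrite, not what the XHP preprocessor does.

```cpp
#include <string>

// Hypothetical stand-in for zend_compile_file_t: source text in, "compiled" text out.
using compile_fn = std::string (*)(const std::string& source);

// Stand-in for PHP's default compiler.
std::string default_compile(const std::string& source) {
  return "compiled:" + source;
}

// Global hook, playing the role of zend_compile_file.
compile_fn compile_hook = default_compile;
// Saved previous hook, playing the role of dist_compile_file.
compile_fn previous_compile = nullptr;

// Playing the role of xhp_compile_file: rewrite the source, then delegate.
std::string hooked_compile(const std::string& source) {
  std::string rewritten = source;
  const std::string from = "<?hh";
  const std::string::size_type pos = rewritten.find(from);
  if (pos != std::string::npos) {
    rewritten.replace(pos, from.size(), "<?php");  // toy rewrite only
  }
  return previous_compile(rewritten);
}

// Playing the role of PHP_MINIT_FUNCTION(xhp).
void install_hook() {
  if (compile_hook == hooked_compile) return;  // already installed
  previous_compile = compile_hook;
  compile_hook = hooked_compile;
}
```

After `install_hook()`, every call through `compile_hook` passes through the rewrite first, and whatever hook was installed before still runs afterwards, which is why the extension keeps the old pointer instead of calling the default compiler directly.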
  6. XHP Preprocessor
    re2c - fastpath.re: detects whether the file is XHP or not by looking for the <?hh tag.
    flex - scanner.l: scanner (tokenizer).
    bison - parser.y: parser.
  7. XHP Preprocess: xhp_preprocess.cpp
        XHPResult xhp_preprocess(std::string &in, std::string &out, ...) {
          // Fast path: skip files that cannot contain XHP.
          if (!xhp_fastpath(buffer, ...)) {
            return XHPDidNothing;
          }
          xhplex_init(&scanner);
          xhpset_extra(&extra, scanner);
          xhpparse(scanner, &new_code);
          if (extra.terminated) {
            return XHPErred;
          } else if (extra.used || extra.hh_tags) {
            out = new_code.c_str();
            return XHPRewrote;
          } else {
            return XHPDidNothing;
          }
        }
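The fast path and the three-way result can be sketched without re2c, flex or bison. The code below is a simplified illustration under stated assumptions: `PreprocessResult`, `fastpath_has_hh_tag`, `preprocess` and `select_source` are hypothetical names, the fast path is a plain substring search instead of the generated re2c matcher, and the rewrite step is a placeholder that copies its input.

```cpp
#include <string>

// Mirrors the three outcomes named on the slides.
enum class PreprocessResult { DidNothing, Rewrote, Erred };

// Fast path: a file without the "<?hh" tag cannot contain XHP.
bool fastpath_has_hh_tag(const std::string& source) {
  return source.find("<?hh") != std::string::npos;
}

// Placeholder driver: the real function runs the scanner and parser here
// and can also report an error; this sketch only copies the input.
PreprocessResult preprocess(const std::string& in, std::string& out) {
  if (!fastpath_has_hh_tag(in)) {
    return PreprocessResult::DidNothing;
  }
  out = in;
  return PreprocessResult::Rewrote;
}

// Chooses which buffer the compiler should see; nullptr means "bail out".
const std::string* select_source(PreprocessResult result,
                                 const std::string& original,
                                 const std::string& rewritten) {
  switch (result) {
    case PreprocessResult::Erred:   return nullptr;
    case PreprocessResult::Rewrote: return &rewritten;
    default:                        return &original;
  }
}
```

The point of the fast path is cost: plain PHP files return early and never pay for scanning and parsing.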
  8. Scanner or Tokenizer: scanner.l
    Start state: <INITIAL>
        Rule: "<?hh" { token: T_OPEN_TAG, state: ST_PHP }
    Current: <ST_PHP>
        Rule: "echo" { token: T_ECHO }
        Rule: <ST_PHP>"<"[a-zA-Z_\x7f-\xff] { check last token, token: T_XHP_TAG_LT, push current state, state: ST_XHP_IN_TAG, consume: "<" }
    Current: <ST_XHP_IN_TAG>
        Rule: {XHPLABEL} { token: T_XHP_LABEL, consume: "a" }
        Rule: {XHPLABEL} { token: T_XHP_LABEL, consume: "href" }
        Rule: "=" { token: "=" }
        Rule: ["][^"]*["] { token: T_XHP_TEXT, decode value }
        Rule: ">" { token: T_XHP_TAG_GT, state: ST_XHP_CHILD }
  9. Scanner or Tokenizer: scanner.l (Cont’d)
    Current: <ST_XHP_CHILD>
        Rule: [^{<]+ { token: T_XHP_TEXT, decode value }
        Rule: "</" { token: T_XHP_TAG_LT, consume: "<", state: ST_XHP_END_CLOSE_TAG }
    Current: <ST_XHP_END_CLOSE_TAG>
        Rule: "/" { token: "/" }
        Rule: {XHPLABEL} { token: T_XHP_LABEL, consume: "a" }
        Rule: ">" { token: T_XHP_TAG_GT, pop into current state, state: ST_PHP }
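The state transitions on the two scanner slides can be imitated by a small hand-written tokenizer. This is only a sketch of the state-stack idea, not the flex scanner: `State`, `Token`, `is_label_char` and `scan` are hypothetical names, the label pattern is simplified, entity decoding ("decode value") is omitted, and nested tags and `{expr}` children are skipped.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Scanner states, named after the start conditions on the slides.
enum class State { Php, XhpInTag, XhpChild, XhpCloseTag };

struct Token {
  std::string name;  // e.g. "T_XHP_LABEL", or the character itself for literals
  std::string text;  // matched value (quotes stripped for attribute text)
};

// Simplified label test; the real {XHPLABEL} pattern is richer.
bool is_label_char(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0 ||
         c == '_' || c == '-' || c == ':';
}

std::vector<Token> scan(const std::string& src) {
  std::vector<Token> out;
  std::vector<State> saved;  // state stack, pushed when a tag opens
  State state = State::Php;
  std::size_t i = 0;

  // Reads a run of label characters starting at i.
  auto read_label = [&]() -> std::string {
    const std::size_t start = i;
    while (i < src.size() && is_label_char(src[i])) ++i;
    return src.substr(start, i - start);
  };

  while (i < src.size()) {
    const char c = src[i];
    if (state == State::Php) {
      const bool tag_start = c == '<' && i + 1 < src.size() &&
          (std::isalpha(static_cast<unsigned char>(src[i + 1])) != 0 ||
           src[i + 1] == '_');
      if (std::isspace(static_cast<unsigned char>(c)) != 0) {
        ++i;
      } else if (src.compare(i, 4, "echo") == 0) {
        out.push_back({"T_ECHO", "echo"});
        i += 4;
      } else if (tag_start) {
        out.push_back({"T_XHP_TAG_LT", "<"});
        saved.push_back(state);  // remember where to return after the tag
        state = State::XhpInTag;
        ++i;
      } else {
        out.push_back({std::string(1, c), std::string(1, c)});
        ++i;
      }
    } else if (state == State::XhpInTag) {
      if (c == '=') {
        out.push_back({"=", "="});
        ++i;
      } else if (c == '"') {
        const std::size_t end = src.find('"', i + 1);
        if (end == std::string::npos) break;  // unterminated attribute value
        out.push_back({"T_XHP_TEXT", src.substr(i + 1, end - i - 1)});
        i = end + 1;
      } else if (c == '>') {
        out.push_back({"T_XHP_TAG_GT", ">"});
        state = State::XhpChild;
        ++i;
      } else if (is_label_char(c)) {
        out.push_back({"T_XHP_LABEL", read_label()});
      } else {
        ++i;  // whitespace and anything this sketch does not model
      }
    } else if (state == State::XhpChild) {
      if (src.compare(i, 2, "</") == 0) {
        out.push_back({"T_XHP_TAG_LT", "<"});  // only "<" is consumed
        state = State::XhpCloseTag;
        ++i;
      } else {
        std::size_t end = src.find_first_of("{<", i);
        if (end == std::string::npos) end = src.size();
        if (end == i) {
          ++i;  // nested tags and {expr} are not modelled here
        } else {
          out.push_back({"T_XHP_TEXT", src.substr(i, end - i)});
          i = end;
        }
      }
    } else {  // State::XhpCloseTag
      if (c == '/') {
        out.push_back({"/", "/"});
        ++i;
      } else if (c == '>') {
        out.push_back({"T_XHP_TAG_GT", ">"});
        state = saved.back();  // pop back into the state saved at tag open
        saved.pop_back();
        ++i;
      } else if (is_label_char(c)) {
        out.push_back({"T_XHP_LABEL", read_label()});
      } else {
        ++i;
      }
    }
  }
  return out;
}
```

Calling `scan` on `echo <a href="m.liputan6.com">Liputan6</a>;` walks through the same states as the slides: ST_PHP, ST_XHP_IN_TAG, ST_XHP_CHILD, ST_XHP_END_CLOSE_TAG, and back to ST_PHP for the final `;`.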
  10. Parser: parser.y
        start: top_statement_list
        top_statement_list: top_statement_list top_statement | /* empty */
        top_statement: statement | ...
        statement: T_OPEN_TAG | unticked_statement
        unticked_statement: T_ECHO echo_expr_list ';' | ...
        echo_expr_list: expr | ...
        expr: expr_without_variable | ...
        expr_without_variable: xhp_tag_expression | ...
  11. Parser: parser.y (Cont’d)
        xhp_tag_expression: xhp_tag_open xhp_children xhp_tag_close | ...
        xhp_tag_open: xhp_tag_start xhp_attributes T_XHP_TAG_GT
        xhp_tag_start: T_XHP_TAG_LT T_XHP_LABEL
        xhp_attributes: xhp_attributes xhp_attribute | /* empty */
        xhp_attribute: T_XHP_LABEL '=' xhp_attribute_value
        xhp_attribute_value: T_XHP_TEXT | ...
        xhp_children: xhp_literal_text | ...
        xhp_literal_text: T_XHP_TEXT
        xhp_tag_close: T_XHP_TAG_LT '/' T_XHP_LABEL T_XHP_TAG_GT
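The grammar pairs xhp_tag_open with xhp_tag_close, and the example slides note that the parser saves the current tag on open and validates it on close. One simple way to do that bookkeeping is a stack of open tag names. The sketch below is an assumption about how such a check could look, not the code in parser.y; `OpenTags` and its methods are hypothetical.

```cpp
#include <string>
#include <vector>

// Tracks the tags that are currently open, innermost last.
class OpenTags {
 public:
  // Called when xhp_tag_open is reduced: remember the tag name.
  void open(const std::string& tag) { stack_.push_back(tag); }

  // Called when xhp_tag_close is reduced: the name must match the
  // innermost open tag. Returns false (and changes nothing) on mismatch.
  bool close(const std::string& tag) {
    if (stack_.empty() || stack_.back() != tag) {
      return false;
    }
    stack_.pop_back();
    return true;
  }

  // True when every opened tag has been closed.
  bool balanced() const { return stack_.empty(); }

 private:
  std::vector<std::string> stack_;
};
```

A stack is used instead of a single "current tag" variable so that nested elements such as `<div><a>...</a></div>` are validated in the right order.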
  12. Example Xhpized (3)
    top_statement <?php [T_ECHO] echo xhp_tag_start a [T_XHP_LABEL] href [=] ( [T_XHP_TEXT] "m.liputan6.com") -> xhp_attribute_value
  13. Example Xhpized (4)
    top_statement <?php [T_ECHO] echo xhp_tag_start a ( [T_XHP_LABEL] href [=] xhp_attribute_value "m.liputan6.com") -> xhp_attribute
  14. Example Xhpized (5)
    top_statement <?php [T_ECHO] echo xhp_tag_start a (xhp_attribute 'href' => "m.liputan6.com" empty) -> xhp_attributes
  15. Example Xhpized (6)
    top_statement <?php [T_ECHO] echo (xhp_tag_start a xhp_attributes 'href' => "m.liputan6.com" [T_XHP_TAG_GT] >) -> xhp_tag_open (saves the current tag, 'a')
  16. Example Xhpized (7)
    top_statement <?php [T_ECHO] echo xhp_tag_open new xhp_a(array('href' => "m.liputan6.com"), array( ( [T_XHP_TEXT] Liputan6) -> xhp_literal_text
  17. Example Xhpized (8)
    top_statement <?php [T_ECHO] echo xhp_tag_open new xhp_a(array('href' => "m.liputan6.com"), array( (xhp_literal_text Liputan6) -> xhp_children
  18. Example Xhpized (9)
    top_statement <?php [T_ECHO] echo xhp_tag_open new xhp_a(array('href' => "m.liputan6.com"), array( xhp_children 'Liputan6' ( [T_XHP_TAG_LT] < [/] [T_XHP_LABEL] a [T_XHP_TAG_GT] >) -> xhp_tag_close (validates that the close tag matches 'a')
  19. Example Xhpized (10)
    top_statement <?php [T_ECHO] echo (xhp_tag_open new xhp_a(array('href' => "m.liputan6.com"), array( xhp_children 'Liputan6' xhp_tag_close ) -> xhp_tag_expression = echo_expr_list
  20. Example Xhpized (11)
    top_statement <?php ( [T_ECHO] echo echo_expr_list new xhp_a(array('href' => "m.liputan6.com"), array('Liputan6')) [;]) -> unticked_statement = top_statement
  21. Example Xhpized (12)
    (top_statement <?php top_statement echo new xhp_a(array('href' => "m.liputan6.com"), array('Liputan6')); empty) -> top_statement_list = start (valid PHP syntax)
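The end result of the reductions above is a plain `new xhp_a(array(...), array(...))` expression. A minimal sketch of that final emission step follows. It is an illustration only: `php_single_quoted` and `emit_xhp_new` are hypothetical helpers, the class name is formed by simply prefixing `xhp_` (any mangling of special characters in tag names is left out), and values are emitted as single-quoted PHP strings, where the slides show the attribute value in double quotes.

```cpp
#include <string>
#include <utility>
#include <vector>

// Quotes a value as a single-quoted PHP string literal,
// escaping the two characters that are special there.
std::string php_single_quoted(const std::string& value) {
  std::string out = "'";
  for (char c : value) {
    if (c == '\\' || c == '\'') out += '\\';
    out += c;
  }
  out += "'";
  return out;
}

// Builds the PHP expression for one element with literal text children.
std::string emit_xhp_new(
    const std::string& tag,
    const std::vector<std::pair<std::string, std::string>>& attributes,
    const std::vector<std::string>& children) {
  std::string out = "new xhp_" + tag + "(array(";
  for (std::size_t i = 0; i < attributes.size(); ++i) {
    if (i > 0) out += ", ";
    out += php_single_quoted(attributes[i].first) + " => " +
           php_single_quoted(attributes[i].second);
  }
  out += "), array(";
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i > 0) out += ", ";
    out += php_single_quoted(children[i]);
  }
  out += "))";
  return out;
}
```

For the running example, `emit_xhp_new("a", {{"href", "m.liputan6.com"}}, {"Liputan6"})` produces `new xhp_a(array('href' => 'm.liputan6.com'), array('Liputan6'))`.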
  22. XHP PHP Class
        <?php
        class xhp_a {
          private $attributes, $children;

          public function __construct($attributes, $children) {
            $this->attributes = $attributes;
            $this->children = $children;
          }

          final public function __toString() {
            $output = '<a';
            foreach ($this->attributes as $key => $val) {
              $output .= ' ' . $key . '="' . $val . '"';
            }
            $output .= '>';
            foreach ($this->children as $child) {
              $output .= (string)$child;
            }
            $output .= '</a>';
            return $output;
          }
        }
  23. Further Improvement
    Standardize: the rope __gnu_cxx::rope<char> is GNU compiler specific. Other compilers such as Clang (Apple) and VC (Microsoft) should use std::string, which may degrade performance; see: https://gist.github.com/josephg/3474848
    Referring to the C++ FQA (http://yosefk.com/c++fqa/), it may be better to use C, since the PHP source and its extensions are also in C.
    Prefer re2c to flex, which may help developers learn both the PHP and XHP lexers faster.
    Use the ZendMM API instead of the default C/C++ malloc or __gnu_cxx::__pool_alloc<char>.
    The output and input line numbers (__LINE__) should be equal to help debugging.
    Thanks.