
Content integration @ Bookmate

exAspArk
October 07, 2015

#MaybeMonad #XML #ONIX #Ruby

https://github.com/exAspArk/better_struct


Transcript

  1. Content integration at Bookmate, Evgeny Li (exAspArk)

  2. None
  3. • 3 million users
     • 600 publishers
     • 1000 new uploaded books per day
  4. Simplified architecture (diagram labels): Publisher; Metadata (XML, CSV); Book (EPUB, FB2); Cover (JPEG, PNG); Bookmate; Source; Clients
  5. Metadata: ONIX (ONline Information eXchange) is an XML-based international standard for representing and communicating book industry product information in electronic form. http://www.editeur.org/83/Overview
  6. How many people in Russia use ONIX http://25.media.tumblr.com/tumblr_m784a1326G1rb3v0jo1_250.gif

  7. ONIX sample

        <?xml version="1.0" encoding="utf-8"?>
        <ONIXMessage xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     release="3.0" xmlns="http://www.editeur.org/onix/3.0/reference">
          <Header>
            <SentDateTime>20131202T192459</SentDateTime>
            <DefaultPriceType>42</DefaultPriceType>
            <DefaultCurrencyCode>EUR</DefaultCurrencyCode>
            ...
          </Header>
          <Product>
            <NotificationType>03</NotificationType>
            <RecordSourceType>01</RecordSourceType>
            <ProductIdentifier>
              <ProductIDType>15</ProductIDType>
              <IDValue>9781824302123</IDValue>
            </ProductIdentifier>
            <DescriptiveDetail>
              <Contributor>
                <SequenceNumber>1</SequenceNumber>
                <ContributorRole>A01</ContributorRole>
                <PersonName>David Herbert Lawrence</PersonName>
              </Contributor>
              <Subject>
                <SubjectSchemeIdentifier>10</SubjectSchemeIdentifier>
                <SubjectCode>FIC000000</SubjectCode>
              </Subject>
              ...
            </DescriptiveDetail>
            <PublishingDetail>
              <PublishingDate>
                <PublishingDateRole>11</PublishingDateRole>
                <DateFormat>00</DateFormat>
                <Date>20101028</Date>
              </PublishingDate>
              <SalesRights>
                <SalesRightsType>06</SalesRightsType>
                <Territory>
                  <CountriesIncluded>AS CA GU MP PH PR US VI</CountriesIncluded>
                </Territory>
              </SalesRights>
              ...
            </PublishingDetail>
            ...
  8. Specification is about 500 pages long http://1.bp.blogspot.com/-6-PaY1EWEK0/UnZt-TRK5QI/AAAAAAAAOnU/hSTQA3YNNxQ/s1600/chandler-bing-book.gif

  9. RubyGems
     • ONIX 2.1 – https://github.com/yob/onix (not maintained)
     • ONIX 3.0 – https://github.com/immateriel/im_onix
  10. Gems use the same names for modules & classes https://media.giphy.com/media/ToMjGpnXBTw7vnokxhu/giphy.gif

  11. RubyGems
     • ONIX 2.1 – https://github.com/exAspArk/onix2 (fork)
     • ONIX 3.0 – https://github.com/immateriel/im_onix
  12. • Easy to start
     • Open source
     • No ubiquitous language (https://en.wikipedia.org/wiki/Domain-driven_design)
     • Extra data abstractions
  13. Diagram layers: XML (ONIX 2.1, ONIX 3.0); Gems (onix2, im_onix); Application (onix2 adapter, onix3 adapter); Book metadata
  14. Simple Made Easy http://www.infoq.com/presentations/Simple-Made-Easy Simple Easy

  15. Diagram layers: XML (ONIX 2.1, ONIX 3.0); Gems (onix2, im_onix); Application (onix2 adapter, onix3 adapter); Book metadata. Why use this extra layer?
  16. Diagram layers: XML (ONIX 2.1, ONIX 3.0); Application (onix2 adapter, onix3 adapter); Book metadata
  17. https://github.com/exAspArk/better_struct

  18. BetterStruct example

        BetterStruct.new(nil) == BetterStruct.new(nil).no_undefined_method_error
        # => true

  19. BetterStruct example

        BetterStruct.new(nil) == BetterStruct.new(nil).no_undefined_method_error
        # => true

        BetterStruct.new(nil) == BetterStruct.new(nil).maybe.monad
        # => true
  20. BetterStruct example

        BetterStruct.new(nil) == BetterStruct.new(nil).no_undefined_method_error
        # => true

        BetterStruct.new(nil) == BetterStruct.new(nil).maybe.monad
        # => true

        like_open_struct = BetterStruct.new({ title: "Fifty Shades of Grey" })
        like_open_struct.title.value
        # => "Fifty Shades of Grey"
  21. BetterStruct example

        BetterStruct.new(nil) == BetterStruct.new(nil).no_undefined_method_error
        # => true

        BetterStruct.new(nil) == BetterStruct.new(nil).maybe.monad
        # => true

        like_open_struct = BetterStruct.new({ title: "Fifty Shades of Grey" })
        like_open_struct.title.value
        # => "Fifty Shades of Grey"

        like_open_struct.title.sub("Fifty", "No").first.value
        # => "N"
  22. ONIX example

        <?xml version="1.0" encoding="utf-8"?>
        <ONIXMessage xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     release="3.0" xmlns="http://www.editeur.org/onix/3.0/reference">
          <Product>
            <DescriptiveDetail>
              <TitleDetail>
                <TitleElement>
                  <TitleText>Atlas Shrugged</TitleText>
                </TitleElement>
              </TitleDetail>
            </DescriptiveDetail>
          </Product>
        </ONIXMessage>
  23. ONIX example

        class ONIX::V3::ProductAdapter
          def initialize(product)
            @product = BetterStruct.new(product)
          end

          def title
            @product.descriptive_detail.title_detail.title_element.title_text.value
          end
        end

        <?xml version="1.0" encoding="utf-8"?>
        <ONIXMessage xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     release="3.0" xmlns="http://www.editeur.org/onix/3.0/reference">
          <Product>
            <DescriptiveDetail>
              <TitleDetail>
                <TitleElement>
                  <TitleText>Atlas Shrugged</TitleText>
                </TitleElement>
              </TitleDetail>
            </DescriptiveDetail>
          </Product>
        </ONIXMessage>
  24. ONIX example

        class ONIX::V3::ProductAdapter
          def initialize(product)
            @product = BetterStruct.new(product)
          end

          def title
            @product.descriptive_detail.title_detail.title_element.title_text.value
          end
        end

        xml = "<DescriptiveDetail><TitleDetail><TitleElement><TitleText>Atlas Shrugged</TitleText></TitleElement></TitleDetail></DescriptiveDetail>"
  25. ONIX example (https://github.com/savonrb/nori)

        class ONIX::V3::ProductAdapter
          def initialize(product)
            @product = BetterStruct.new(product)
          end

          def title
            @product.descriptive_detail.title_detail.title_element.title_text.value
          end
        end

        xml = "<DescriptiveDetail><TitleDetail><TitleElement><TitleText>Atlas Shrugged</TitleText></TitleElement></TitleDetail></DescriptiveDetail>"

        product = Nori.new.parse(xml)
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}
  26. ONIX example (https://github.com/savonrb/nori)

        class ONIX::V3::ProductAdapter
          def initialize(product)
            @product = BetterStruct.new(product)
          end

          def title
            @product.descriptive_detail.title_detail.title_element.title_text.value
          end
        end

        xml = "<DescriptiveDetail><TitleDetail><TitleElement><TitleText>Atlas Shrugged</TitleText></TitleElement></TitleDetail></DescriptiveDetail>"

        product = Nori.new.parse(xml)
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        product_adapter = ONIX::V3::ProductAdapter.new(product)
        product_adapter.title
        # => "Atlas Shrugged"
  27. ONIX example

        product
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        BetterStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value
  28. ONIX example

        product
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        BetterStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value

        product["DescriptiveDetail"]["TitleDetail"]["TitleElement"]["TitleText"]
  29. ONIX example

        product
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        BetterStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value

        product["DescriptiveDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"]["TitleText"]
  30. ONIX example

        product
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        BetterStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value

        product["DescriptiveDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"]["TitleText"]

        product["DescriptiveDetail"].
          try(:[], "TitleDetail").
          try(:[], "TitleElement").
          try(:[], "TitleText")
  31. ONIX example

        product
        # => { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" }}}}

        BetterStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value

        product["DescriptiveDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"] &&
          product["DescriptiveDetail"]["TitleDetail"]["TitleElement"]["TitleText"]

        product["DescriptiveDetail"].
          try(:[], "TitleDetail").
          try(:[], "TitleElement").
          try(:[], "TitleText")

        product["DescriptiveDetail"]["TitleDetail"]["TitleElement"]["TitleText"] rescue nil
  32. CSV example

        Título completo        | Año publicación | Nombre 1      | eISBN epub
        Programa de transición | 2008            | Trotsky, León | 9781449262123

        filepath = "spanish_publisher.csv"
        books = CSV.parse(File.read(filepath), headers: true)
        book_data = BetterStruct.new(books.first.to_h)
  33. CSV example

        Título completo        | Año publicación | Nombre 1      | eISBN epub
        Programa de transición | 2008            | Trotsky, León | 9781449262123

        filepath = "spanish_publisher.csv"
        books = CSV.parse(File.read(filepath), headers: true)
        book_data = BetterStruct.new(books.first.to_h)

        book_data.titulo_completo.value
        # => "Programa de transición"
  34. CSV example

        Título completo        | Año publicación | Nombre 1      | eISBN epub
        Programa de transición | 2008            | Trotsky, León | 9781449262123

        filepath = "spanish_publisher.csv"
        books = CSV.parse(File.read(filepath), headers: true)
        book_data = BetterStruct.new(books.first.to_h)

        book_data.titulo_completo.value
        # => "Programa de transición"

        book_data.ano_publicacion.value
        # => "2008"
  35. CSV example

        Título completo        | Año publicación | Nombre 1      | eISBN epub
        Programa de transición | 2008            | Trotsky, León | 9781449262123

        filepath = "spanish_publisher.csv"
        books = CSV.parse(File.read(filepath), headers: true)
        book_data = BetterStruct.new(books.first.to_h)

        book_data.titulo_completo.value
        # => "Programa de transición"

        book_data.ano_publicacion.value
        # => "2008"

        book_data.nombre_1.split(", ").reverse.join(" ").value
        # => "León Trotsky"
  36. CSV example

        Título completo        | Año publicación | Nombre 1      | eISBN epub
        Programa de transición | 2008            | Trotsky, León | 9781449262123

        filepath = "spanish_publisher.csv"
        books = CSV.parse(File.read(filepath), headers: true)
        book_data = BetterStruct.new(books.first.to_h)

        book_data.titulo_completo.value
        # => "Programa de transición"

        book_data.ano_publicacion.value
        # => "2008"

        book_data.nombre_1.split(", ").reverse.join(" ").value
        # => "León Trotsky"

        book_data.e_isbn_epub.value
        # => "9781449262123"
  37. Conclusion: "Don’t ruin information behind a micro-language, i.e. a class with information-specific methods" (Rich Hickey)
  38. Thank you!
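
The slides show how BetterStruct is used but not how such a wrapper could work. The following is a minimal sketch of the idea, not the gem's actual implementation: `MaybeStruct` and its `normalize` helper are hypothetical names invented here. The sketch assumes that hash keys are matched after transliterating accents and underscoring CamelCase or spaced names, and that any unknown method yields a wrapped `nil` rather than raising.

```ruby
# Hypothetical sketch of a nil-safe wrapper; NOT the better_struct gem's code.
class MaybeStruct
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Two wrappers are equal when they wrap equal values.
  def ==(other)
    other.is_a?(MaybeStruct) && value == other.value
  end

  # "TitleText" -> "title_text", "Título completo" -> "titulo_completo"
  def self.normalize(key)
    key.to_s.encode(Encoding::UTF_8)
       .unicode_normalize(:nfd)            # split accented letters into base + mark
       .gsub(/\p{Mn}/, "")                 # drop the combining marks
       .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # CamelCase boundary -> underscore
       .gsub(/[^A-Za-z\d]+/, "_")          # spaces and punctuation -> underscore
       .downcase
  end

  # Hash keys win over methods; unknown names give a wrapped nil instead of raising.
  def method_missing(name, *args, &block)
    if value.is_a?(Hash)
      key = value.keys.find { |k| self.class.normalize(k) == name.to_s }
      return MaybeStruct.new(value[key]) if key
    end
    return MaybeStruct.new(nil) unless value.respond_to?(name)

    MaybeStruct.new(value.public_send(name, *args, &block))
  end
end

# Same shape as the parsed ONIX hash shown on the slides.
product = { "DescriptiveDetail" => { "TitleDetail" => { "TitleElement" => { "TitleText" => "Atlas Shrugged" } } } }
title = MaybeStruct.new(product).descriptive_detail.title_detail.title_element.title_text.value

# A missing branch ends in nil instead of a NoMethodError.
missing = MaybeStruct.new({}).descriptive_detail.title_detail.title_element.title_text.value

# Same shape as the CSV row on the slides, written as a plain hash.
row = { "Título completo" => "Programa de transición", "Nombre 1" => "Trotsky, León", "eISBN epub" => "9781449262123" }
book = MaybeStruct.new(row)
author = book.nombre_1.split(", ").reverse.join(" ").value

puts title, missing.inspect, author, book.e_isbn_epub.value
```

Because every call returns another wrapper, the chain only leaves the wrapper at `.value`, which is where a caller decides what to do with a possible `nil`. For plain nested hashes with known keys, Ruby 2.3 and later also offer `Hash#dig`, which covers the nil-safe lookup but not the key normalization or method chaining.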