jOpenDocument: Restarting the ODF manipulation Java library after a seven-years hiatus

Talk for COSCUP 2021 OSPN Japan Track on Aug. 1st, 2021.

This talk describes my attempt to reboot a rather old Java library.
jOpenDocument is a useful ODF manipulation library for JVM languages, but its development stopped in 2014. So I decided to reboot it myself: this is the power of open source.

Naruhiko Ogasawara

July 17, 2021

Transcript

  1. jOpenDocument: Restarting the ODF manipulation Java library after a seven-years hiatus
    Naruhiko Ogasawara
    Twitter: @naru0ga
    Facebook: naruoga
    Telegram: @naruoga
  2. 01/08/2021 COSCUP 2021 Day2 / OSPN track
    Who am I
    • 小笠原 (OGASAWARA) 徳彦 (Naruhiko)
      – Call me “NARU”
    • FLOSS lover from Japan
      – LibreOffice, Ubuntu, Selenium, Jenkins, ...
    • An employee of a security vendor in Japan
      – Internal tools development (such as report generation systems)
      – DevSecOps service development
  3. Agenda
    • About ODF and jOpenDocument
    • Motivation for this “modernization”
    • Topic 1: Mavenize and publish for devs
    • Topic 2: Make the tests run and pass
    • Topic 3: Continuous Integration on GitHub
    • Conclusion
  4. About ODF and jOpenDocument
  5. OpenDocument Format (ODF): overview
    • http://opendocumentformat.org/
    • A “REAL” international standard file format for office productivity suites
      – Standardized by OASIS (Open Document Format for Office Applications TC)
      – ISO/IEC 26300
    • Native format of LibreOffice (and its predecessor, OpenOffice.org)
    • Other software can use it because it is an open standard
      – Microsoft Office and Google Drive also support it
    • Simple, human-readable, easy-to-machine-manipulate zipped XML
    • Keeps up with the evolution of the applications
      – Unlike the “pseudo standard,” which is essentially unrevised from the proprietary application document format released in 2007
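
The "zipped XML" point can be checked with nothing but the JDK. The following is a minimal sketch, not jOpenDocument code: the class name `OdfZipPeek` and the in-memory sample package are assumptions made for illustration, and a real ODF file contains more entries than the two shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: treats an ODF package as a plain ZIP archive.
final class OdfZipPeek {
    /** Returns the entry names of a ZIP package (for example the bytes of an .odt file). */
    static List<String> listEntries(byte[] pkg) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Returns one entry (for example "content.xml") decoded as UTF-8, or null if absent. */
    static String readEntry(byte[] pkg, String name) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equals(name)) {
                    // readAllBytes stops at the end of the current entry
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return null;
    }

    /** Builds a tiny stand-in package in memory so the sketch runs without a real file. */
    static byte[] samplePackage() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buf)) {
            zip.putNextEntry(new ZipEntry("mimetype"));
            zip.write("application/vnd.oasis.opendocument.text".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write("<text:p>Hello, ODF</text:p>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }
}
```

For a real document, the byte array could come from `java.nio.file.Files.readAllBytes` on the file's path instead of `samplePackage()`.
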
  6. ODF ecosystems (manipulation libraries)
    • https://github.com/search?q=opendocument+format&ref=opensearch
    • Several libraries should be available in your favorite programming language
    • Or you can easily develop your own library, because ODF is so simple
  7. ODF + LibreOffice → PDF solution
  8. Our choice: jOpenDocument
    • http://www.jopendocument.org/
    • Good template handling with a dedicated extension
    • Open Source under the GPLv3 License
    • Simple and useful API
    • A bit old: latest release in 2014 (1.4 rc2)
    • But still useful
  9. Anyway, I already mentioned this last year ...
    https://speakerdeck.com/naruoga/why-odf-is-the-best-intermediate-format-for-report-generation-systems
  10. Motivation for this “modernization”
  11. We are happy with it, but...
    • (Again) A bit old: latest release in 2014 (1.4 rc2)
    • Some technical concerns
      – Lack of ODF 1.3 support (= the standard for LibreOffice ≥ 7.0)
      – Will not work on new JDKs in the near future
    • But our business strongly depends on it
    • Hard to migrate to other libraries
  12. Open Source Rule: We always have the right to fork!
    • If no one else solves our problem, we can solve it ourselves
    • So I decided to “modernize” jOpenDocument
    • With
      – A modern build system (Ant → Maven)
      – A modern development process (GitHub, JitPack, CI)
      – Modern library dependencies
      – ...
  13. Topic 1: Mavenize and publish for devs
  14. I don’t want to learn Ant in the 21st century
    • Migrate from Ant to Maven
    • From ZIP-archived source files to a modern Git repo
    • All dependencies should be managed in a text file (pom.xml)
    • NOT as JAR files in “lib/”
  15. A Java beginner’s way to migrate from Ant to Maven
    • Create a new Maven project with IntelliJ IDEA
    • Look up every JAR’s artifact on Maven Central via https://search.maven.org/
      – Sometimes by filename
      – Sometimes by package name (taken from the unzipped JAR)
    • Add them to <dependencies> in pom.xml
    • Add some extra plugins (surefire, sources, ...)
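
The "by package name" lookup can be partly automated instead of unzipping each JAR by hand. The following is a minimal sketch using only the JDK's `java.util.jar` API; the class name `JarPackages`, the helper `sampleJar`, and the sample entry names are hypothetical, and the package names it prints still have to be searched on Maven Central manually.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: lists the Java packages contained in a JAR.
final class JarPackages {
    /** Collects the distinct package names of all .class entries outside META-INF. */
    static SortedSet<String> packagesOf(byte[] jarBytes) {
        SortedSet<String> packages = new TreeSet<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            for (JarEntry e = jar.getNextJarEntry(); e != null; e = jar.getNextJarEntry()) {
                String name = e.getName();
                int slash = name.lastIndexOf('/');
                // slash > 0 skips root-level files such as module-info.class
                if (name.endsWith(".class") && slash > 0 && !name.startsWith("META-INF/")) {
                    packages.add(name.substring(0, slash).replace('/', '.'));
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return packages;
    }

    /** Builds an in-memory archive with empty entries, standing in for a JAR from "lib/". */
    static byte[] sampleJar(String... entryNames) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buf)) {
            for (String entryName : entryNames) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buf.toByteArray();
    }
}
```

For a JAR on disk, the bytes could be read with `java.nio.file.Files.readAllBytes` and passed to `packagesOf`.
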
  16. … and check that it works
    • IntelliJ IDEA has a “Maven” pane that can run every task without a local Maven install
    • So first, use “package” to create the JAR file
    • Then I added it to my small example project
      – https://github.com/naruoga/jopendocumentsample
    • Luckily, it worked on the first try
  17. Published to developers
    • GitHub
      – https://github.com/naruoga/jOpenDocument
    • JitPack
      – https://jitpack.io/#naruoga/jOpenDocument
      – Can publish the JAR of a public GitHub repo as if it were a Maven repo
      – Just enter the repo URL, no login needed
      – My sample project uses this, and it also runs well
        • See the “fix/jopendocumentNg” branch
  18. Topic 2: Make the tests run and pass
  19. DON’T TRUST SOFTWARE WITHOUT AUTOMATED TESTS
    • Thanks to the previous devs, who thought the same thing
    • There are some automated unit tests in the source repo
    • In the Ant way
    • That is, they are mixed into a single src directory, not separated from the product code
  20. Make all tests follow the Maven conventions
    • Ant and Maven have slightly different conventions for managing (unit) test files
    • And resources, too
    • In Maven, all test Java files go into src/test/java
    • And resources go into src/(main|test)/resources
  21. It should comply with the Maven way
    • Or Gradle, SBT, or any other modern build system
  22. OK, the tests run, but lots of RED and YELLOW...
    • Resource loading paths
      – We have to write absolute paths from “src/test/resources”
    • At this time, the schema validation feature for RelaxNG doesn’t work
      – XML + Java is my weakness
      – Please send me a PR to fix it
    • Some tests have been broken for a long time
      – They did not take non-French environments into account
  23. NOT fixed yet, but make them SUCCEED
    • Fix references to resources
      – Just grepped for “getResource…()” and fixed them
    • Ignore some tests
      – RelaxNG schema related
        • I know it’s fundamental, but it will be homework
      – And already-broken tests
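
The "absolute path" point reflects how `Class.getResourceAsStream` resolves names: a name without a leading "/" is looked up relative to the class's own package, while a name with a leading "/" is looked up from the classpath root, which is where Maven places the contents of src/test/resources. The following is a minimal sketch of a loader that always uses the absolute form; the class name `TestResources` is hypothetical, and this is not the project's actual fix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: loads classpath resources by absolute name.
final class TestResources {
    /** Turns a name into the absolute form resolved from the classpath root. */
    static String absolute(String name) {
        return name.startsWith("/") ? name : "/" + name;
    }

    /** Loads a resource by absolute name and fails loudly when it is missing. */
    static byte[] load(Class<?> owner, String name) {
        String path = absolute(name);
        try (InputStream in = owner.getResourceAsStream(path)) {
            if (in == null) {
                // A null stream means the name did not resolve on the classpath
                throw new IllegalArgumentException("Resource not found on classpath: " + path);
            }
            return in.readAllBytes();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Under these assumptions, a file stored as src/test/resources/template/test.ods (a made-up path) would be loaded from a test class `MyTest` with `TestResources.load(MyTest.class, "/template/test.ods")`. Failing with an exception instead of returning null makes a wrong path visible at the point of loading.
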
  24. Every test looks good, but...
    • This doesn’t mean “the code quality is good”
    • But I would like to move on to the next step… Continuous Integration!
  25. Topic 3: Continuous Integration on GitHub
  26. In the good old days, there were many, many CI tools and services
  27. But if we publish OSS on GitHub...
    • GitHub Actions is a choice worth considering
    • So I wanted to try it out
  28. Quickstart for a Java project
    • They provide rich docs
      – https://docs.github.com/en/actions/quickstart
    • And a sample repo
      – https://github.com/actions/starter-workflows
    • Of course, a Java/Maven sample is included
      – https://github.com/actions/starter-workflows/blob/main/ci/maven.yml
  29. Just put the files into our repo… Easy!
    • .github/workflows/*.yml
    • Declarative syntax
    • Multiple actions can be declared in one repo
    • But at this time, I only have two “toy” actions there (because of my laziness :)
  30. Conclusion
  31. Conclusion
    • I’m now rebooting the Java library jOpenDocument after a seven-year hiatus, with a modern development workflow
    • This shows the POWER OF OPEN SOURCE
    • It has just started to work, but there is still a ton of homework
    • If you are strong in Java and XML, and love to manipulate office documents with code, please join me :)
  32. Questions?
    Twitter: @naru0ga
    Facebook: naruoga
    Telegram: @naruoga
  33. Question and Answer
    • Q: Where is a good starting point for someone interested in your project?
    • A: Lots of… :(
      – The first priority is fixing RelaxNG schema validation, because validation is vital for an XML manipulation library. If you are strong in XML handling with Java, please help me.
      – ODF 1.3 support is also a high priority.
      – The XML engine is quite old, so I want to migrate it to another ODF manipulation library, such as ODF Toolkit developed by TDF.
        • That might solve the ODF 1.3 issue.
      – To do this, we need more tests to verify that all functionality is preserved after the migration.
      – Documentation improvements (for users and for developers) are also needed.
      – I have filed several issues in my GitHub repo. Please check them out: https://github.com/naruoga/jOpenDocument/issues/