New Features in Texdoc 3.0 / texdoc3

Watson
April 29, 2018

Texdoc is a command-line program for finding documentation in TeX Live. The new version, Texdoc 3.0, has two major features: a new option parser and fuzzy search. In this talk, I will also explain how Texdoc finds documents related to the input keywords and discuss how that process can be improved. The latest source code is available from https://github.com/TeX-Live/texdoc.

Transcript

  1. New Features in Texdoc 3.0
    New Features in Texdoc 3.0
    Takuto ASAKURA (wtsnjp)
    National Institute of Informatics
    BachoTeX 2018

  2. New Features in Texdoc 3.0
    What is Texdoc?
    … A command line tool
    … Search & view documents in TeX Live
    … Cross-platform (Windows, macOS, Linux, etc.)
    Usage
    $ texdoc [⟨options⟩] ⟨keyword⟩
    … Typically, ⟨keyword⟩ is a package name
    … Documents related to ⟨keyword⟩ will be shown
    Please try “texdoc texdoc” for more details

  3. New Features in Texdoc 3.0
    Why did I become
    a maintainer of Texdoc?
    (off-topic)

  4. New Features in Texdoc 3.0
    Snowman culture in Japan
    … Snowmen are popular among Japanese TeXperts
    … They are known as “Snowman Comedians”
    … August 8th is “Snowman’s Day”
    … Just like the Duck culture in the West (?)
    Japan Western countries

  5. New Features in Texdoc 3.0
    Variety of snowman glyphs
    Fonts shown: A1 Gothic, Shuei Nijimi Mincho, Shuei Maru Gothic, Bunkyu Gothic, HGP Gothic, Nishiki-teki, Ro Hon Mincho, Meiryo

  6. New Features in Texdoc 3.0
    SC-series: LaTeX packages
    … The scsnowman package (DL: CTAN)
    … Displaying many variants of snowmen
    … Utilizing TikZ (independent of fonts)
    … cf. There is a package tikzducks for duck lovers
    Examples: scsnowman, tikzducks
    … The scpremiumfriday package (DL: GitHub)
    … The scwrapfig package (DL: GitHub)

  7. New Features in Texdoc 3.0
    The Whitesnowman language
    … A variant of the Whitespace language, using:
    … ☃ (U+2603) Space
    … ⛄ (U+26C4) Tab
    … ⛇ (U+26C7) New line
    … I developed an interpreter in TeX (DL: GitHub)
    Example

  8. New Features in Texdoc 3.0
    SC-ripts: TeX-related scripts
    There are also SC-variant scripts (DL: GitHub)
    Mostly developed by Takayuki YATO (a.k.a. ZR)
    … scmakesvf: creating Snowman VFs
    … scmendex: variant of mendex†
    … scptex2pdf: variant of ptex2pdf
    … scxml2ltx: variant of xml2ltx
    I also created one...
    … sctexdoc: variant of Texdoc (DL: Gist)
    To develop the above, I read the source code of Texdoc
    → I became familiar with its implementation!
    †Japanese-aware version of the makeindex program

  9. New Features in Texdoc 3.0
    Let’s talk about
    Texdoc 3.0

  10. New Features in Texdoc 3.0
    Current status of Texdoc
    … Maintained by the TeX Live team:
    Karl Berry, Norbert Preining, and me
    … Development is hosted on GitHub
    → The svn on puszcza has been dropped!
    … Texdoc 2.0171 is the latest (TeX Live 2017)
    … The main features of v3.0 are already implemented
    → Those are in the fuzzy_search branch
    Please visit:
    https://github.com/TeX-Live/texdoc

  11. New Features in Texdoc 3.0
    New features in Texdoc 3.0
    … New option parser
    → to specify multiple options much more easily
    … Fuzzy search
    → now you can mistype ⟨keyword⟩
    … Other small changes
    … Documentation updates
    … Removing a cache file
    There has been no major update since 2012...
    Now it’s time to change a lot!

  12. New Features in Texdoc 3.0
    New option parser
    Features
    … As POSIX-compatible as possible
    … Allowing users to group short options
    Example
    In past versions:
    $ texdoc -v -l -I ...
    Now you can simply write:
    $ texdoc -vlI ...
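
The grouping of short options can be illustrated with a minimal Python sketch. This is not Texdoc's actual parser (Texdoc itself is written in Lua); the helper name `expand_short_options` is hypothetical, and the sketch assumes every grouped short option is a plain flag that takes no argument.

```python
def expand_short_options(args):
    """Split grouped short flags such as -vlI into -v, -l, -I.

    Assumption: no short option in a group takes an argument.
    """
    expanded = []
    for index, arg in enumerate(args):
        if arg == "--":
            # "--" ends option parsing; copy the rest untouched.
            expanded.extend(args[index:])
            break
        is_short_group = (
            arg.startswith("-") and not arg.startswith("--") and len(arg) > 2
        )
        if is_short_group:
            # One "-x" entry per character after the leading dash.
            expanded.extend("-" + ch for ch in arg[1:])
        else:
            expanded.append(arg)
    return expanded


print(expand_short_options(["-vlI", "texdoc"]))
```

With this expansion, `-vlI` and `-v -l -I` reach the rest of the program as the same list of flags.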

  13. New Features in Texdoc 3.0
    What is fuzzy search?

  14. New Features in Texdoc 3.0
    How to implement fuzzy search?
    Levenshtein distance
    … Difference between two strings
    … Minimum number of single-character edits
    … insertions
    … deletions
    … substitutions
    Example
    Levenshtein distance between “duck” and “dog”:
    1. duck → duc (deletion)
    2. duc → dug (substitution)
    3. dug → dog (substitution)
    The distance is 3

  15. New Features in Texdoc 3.0
    Calculating Levenshtein distance
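    Levenshtein distance can be calculated with DP:
    … consider two strings a and b; and
    … let |a| and |b| be the lengths of a and b, respectively.
    Calculating the distance in O(|a||b|) time:
    \[
    \operatorname{lev}_{a,b}(i, j) =
    \begin{cases}
    \max(i, j) & \text{if } \min(i, j) = 0; \\
    \min \begin{cases}
    \operatorname{lev}_{a,b}(i - 1, j) + 1 \\
    \operatorname{lev}_{a,b}(i, j - 1) + 1 \\
    \operatorname{lev}_{a,b}(i - 1, j - 1) + 1_{(a_i \neq b_j)}
    \end{cases} & \text{otherwise.}
    \end{cases}
    \]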
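
The recurrence can be written as a short Python sketch. It is an illustration of the standard dynamic-programming approach, not Texdoc's own Lua code; the function name `levenshtein` is chosen here for illustration. It keeps only two rows of the table, so the time is O(|a||b|) and the memory is O(|b|).

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between strings a and b."""
    # previous[j] is the distance between the prefix of a handled so far
    # and the first j characters of b; it starts as lev(0, j) = j.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # lev(i, 0) = i
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b), # substitution (free on match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("duck", "dog"))  # 3
```

The result for “duck” and “dog” is 3, the distance given in the talk's example.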

  16. New Features in Texdoc 3.0
    Don’t you think...
    we heard a similar story last year?
    Implementing bioinformatics algorithms in TeX
    The Gotoh algorithm: DP
    Sequence alignment has a slightly more complex
    scoring scheme.
    Example
    \( \mathrm{match} = 1,\ \mathrm{mismatch} = -1,\ g(\ell) = -d - (\ell - 1)e \)
    The algorithm
    Sequence alignment in O(mn) time:
    \[ M_{i+1,j+1} = \max\{ M_{ij},\ x_{ij},\ y_{ij} \} + c_{a_i b_j} \]
    where
    \[ x_{i+1,j} = \max\{ M_{ij} - d,\ x_{ij} - e,\ y_{ij} - d \}, \qquad
       y_{i,j+1} = \max\{ M_{ij} - d,\ y_{ij} - e \}. \]

  17. New Features in Texdoc 3.0
    Fuzzy search in Texdoc 3.0
    … Runs when the normal search cannot find
    any documentation in TeX Live
    … Finds the package name with the minimum
    Levenshtein distance from the ⟨keyword⟩
    Let me show you a demonstration!
    More details
    … By default, distances up to 5 are allowed
    … To change this: set fuzzy_level in texdoc.cnf
    … Set the value to 0 to disable fuzzy search
    … The result is shown as an “info” message
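
The behaviour described on this slide can be sketched in Python as follows. This is a hedged illustration, not Texdoc's implementation: the function `fuzzy_search`, the inclusive reading of "up to 5", and the choice to return every name tied for the minimum distance are all assumptions made here.

```python
def levenshtein(a, b):
    """Levenshtein distance via two-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (char_a != char_b),
            ))
        previous = current
    return previous[-1]


def fuzzy_search(keyword, package_names, fuzzy_level=5):
    """Return the package names closest to keyword, or [] if none qualify.

    fuzzy_level mirrors the setting named on the slide: 0 disables the
    search, otherwise it is the largest distance still accepted
    (assumed inclusive).
    """
    if fuzzy_level <= 0:
        return []
    scored = [(levenshtein(keyword, name), name) for name in package_names]
    accepted = [(dist, name) for dist, name in scored if dist <= fuzzy_level]
    if not accepted:
        return []
    best = min(dist for dist, _ in accepted)
    return sorted(name for dist, name in accepted if dist == best)


names = ["tikzducks", "scsnowman", "texdoc", "pgf"]  # sample names only
print(fuzzy_search("texdic", names))
```

In a real run the candidate list would come from the TeX Live package database rather than a hard-coded list.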

  18. New Features in Texdoc 3.0
    Other changes in Texdoc 3.0
    These are small things we’re working on now:
    Documentation updates
    … Explaining the new features
    … Removing old information
    Removing a cache file
    … A large cache file (1.5 MB) was in the repo
    … It was used only when tlpdb did not exist
    → That won’t happen in a normal TeX Live installation

  19. New Features in Texdoc 3.0
    Call for testers
    … Where can we get the development version?
    → https://github.com/TeX-Live/texdoc
    … How to run the development version?
    → http://tug.org/texdoc/dev/

  20. New Features in Texdoc 3.0
    How to report bugs?
    … Creating issues on GitHub
    … Pull requests are also welcome
    … Sending a mail to the mailing list
    [email protected]

  21. New Features in Texdoc 3.0
    Lastly, I’d like to discuss
    the future of Texdoc

  22. New Features in Texdoc 3.0
    Other functions of Texdoc
    Alias
    … Aliases can be defined in texdoc.cnf
    … Both the original keyword and ⟨name⟩ will be searched
    alias ⟨original keyword⟩ = ⟨name⟩
    Adjusting scores (ad hoc)
    … You can adjust scores in texdoc.cnf
    adjscore ⟨pattern⟩ = ⟨score adjustment⟩
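
To make the two directive forms concrete, here is a small Python sketch that reads lines of the shape shown on this slide. It is not Texdoc's configuration reader: the function names are hypothetical, the sample entries are invented for illustration, only the two forms shown above are handled, and treating `#` as a comment marker is an assumption.

```python
import re

# Matches "alias X = Y" and "adjscore X = Y" (the two forms on the slide).
DIRECTIVE = re.compile(r"^(alias|adjscore)\s+(.+?)\s*=\s*(.+?)\s*$")


def parse_texdoc_cnf(text):
    """Collect alias and adjscore directives from configuration text."""
    aliases = {}      # original keyword -> list of alias names
    adjustments = {}  # pattern -> score adjustment
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # assumed comment syntax
        match = DIRECTIVE.match(line)
        if match is None:
            continue  # blank line or a directive this sketch ignores
        kind, left, right = match.groups()
        if kind == "alias":
            aliases.setdefault(left, []).append(right)
        else:
            adjustments[left] = float(right)
    return aliases, adjustments


def expand_keyword(keyword, aliases):
    """Return the original keyword followed by its aliases."""
    return [keyword] + aliases.get(keyword, [])


sample = """
# hypothetical entries, for illustration only
alias foo = foobar
adjscore /Makefile = -10
"""
aliases, adjustments = parse_texdoc_cnf(sample)
print(expand_keyword("foo", aliases), adjustments)
```

`expand_keyword` reflects the point that both the original keyword and the alias name are searched.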

  23. New Features in Texdoc 3.0
    Improving fuzzy search
    [Flowchart: Finding → Scoring → Results exist? → yes: Viewing; no: Fuzzy search]
    … Even if the normal search returns results,
    could fuzzy results still be helpful?
    … Can fuzzy results outrank the normal results?
    … Should we search more than just package names?
    → What else should be searched?

  24. New Features in Texdoc 3.0
    Problems in the scoring
    The scoring scheme needs to be improved:
    Too much adjscore (in default)
    Too many false cases
    Too complicated
    Possible solution
    … Allowing package authors to control the
    priority of the documents in their package
    → How should they be compared with other packages?
    … Using useful metadata in tlpdb
    We need your wonderful ideas

  25. New Features in Texdoc 3.0
    Documents not included in TeX Live
    Problem
    There is no way to find such documents
    Solution 1
    … A variant of alias to specify absolute paths
    alias* foo = /path/to/your/foobar.pdf
    Solution 2
    … Make it possible to specify a particular directory
    Example: ~/texdoc
    … Searching the documents in that directory
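
Solution 2 is only a proposal in this talk, so the following Python sketch shows one possible shape of it rather than anything Texdoc does. The function name `find_local_docs`, the list of file suffixes, and the rule of matching the keyword against file names are all assumptions.

```python
from pathlib import Path

# Assumed set of suffixes that count as documentation.
DOC_SUFFIXES = {".pdf", ".html", ".txt", ".md"}


def find_local_docs(keyword, doc_dir):
    """Return documents under doc_dir whose file name contains keyword."""
    root = Path(doc_dir).expanduser()  # allows paths such as ~/texdoc
    if not root.is_dir():
        return []
    needle = keyword.lower()
    return sorted(
        path for path in root.rglob("*")
        if path.is_file()
        and path.suffix.lower() in DOC_SUFFIXES
        and needle in path.stem.lower()
    )
```

For example, `find_local_docs("foo", "~/texdoc")` would list matching files under that directory, or return an empty list if the directory does not exist.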

  26. New Features in Texdoc 3.0
    Conclusions
    … Please visit our GitHub repository
    https://github.com/TeX-Live/texdoc
    … New features of Texdoc 3.0 are:
    … New option parser
    … Fuzzy search
    … If you are interested in the features:
    1. Get them from the fuzzy_search branch
    2. Report bugs on GitHub or via the mailing list
    … Any suggestions to improve Texdoc are welcome
    … Sending a mail to [email protected]
    … Creating an issue or pull request on GitHub
    Thank you & Happy Texdocing