
How GitHub Supports Vim License Detection, The Five Years Journey

othree
July 30, 2022


Transcript

  1. othree at COSCUP 2022
    How GitHub Supports Vim License Detection
    The Five Years Journey

    View Slide

  2. othree at COSCUP 2022
    The Story of GitHub Supporting the Vim License
    A Five-Year Open Source Contribution

    View Slide

  3. Who is othree
    • Web Developer


    • Mozillian (MozTW)


    • Blogger https://blog.othree.net


    • Speaker https://speakerdeck.com/othree


    • Vimmer


    • ❤ OSS

    View Slide

  4. Before We Start
    • GitHub and Open Source License


    • Vim License

    View Slide

  5. GitHub and Open Source License

    View Slide

  6. View Slide

  7. View Slide

  8. View Slide

  9. License detection
    • Highlight selected License


    • Search by license


    • Only major licenses are supported


    • Detected by Licensee, a Ruby Gem

    View Slide

  10. The Vim License

    View Slide

  11. The Vim License
    • A special License only for Vim


    • GPL compatible according to Richard Stallman


    • Tightly coupled to Vim itself


    • Many Vim scripts use the Vim license, a.k.a. “Same as Vim”


    • https://vimhelp.org/uganda.txt.html

    View Slide

  12. VIM LICENSE

    I)  There are no restrictions on distributing unmodified copies of Vim except
        that they must include this license text. You can also distribute
        unmodified parts of Vim, likewise unrestricted except that they must
        include this license text. You are also allowed to include executables
        that you made from the unmodified Vim sources, plus your own usage
        examples and Vim scripts.

    II) It is allowed to distribute a modified (or extended) version of Vim,
        including executables and/or source code, when the following four
        conditions are met:
        1) This license text must be included unmodified.
        2) The modified Vim must be distributed in one of the following five ways:
           a) If you make changes to Vim yourself, you must clearly describe in
              the distribution how to contact you. When the maintainer asks you
              (in any way) for a copy of the modified Vim you distributed, you
              must make your changes, including source code, available to the
              maintainer without fee. The maintainer reserves the right to
              include your changes in the official version of Vim. What the

    View Slide

  20. View Slide

  21. The Vim License
    • The word “Vim” appears many times in the text


    • The Vim Maintainer

    View Slide

  22. VIM LICENSE

    I)  There are no restrictions on distributing unmodified copies of Vim except
        that they must include this license text. You can also distribute
        unmodified parts of Vim, likewise unrestricted except that they must
        include this license text. You are also allowed to include executables
        that you made from the unmodified Vim sources, plus your own usage
        examples and Vim scripts.

    II) It is allowed to distribute a modified (or extended) version of Vim,
        including executables and/or source code, when the following four
        conditions are met:
        1) This license text must be included unmodified.
        2) The modified Vim must be distributed in one of the following five ways:
           a) If you make changes to Vim yourself, you must clearly describe in
              the distribution how to contact you. When the maintainer asks you
              (in any way) for a copy of the modified Vim you distributed, you
              must make your changes, including source code, available to the
              maintainer without fee. The maintainer reserves the right to
              include your changes in the official version of Vim. What the
              maintainer will do with your changes and under what license they
              will be distributed is negotiable. If there has been no negotiation
              then this license, or a later version, also applies to your changes.
              The current maintainer is Bram Moolenaar . If this
              changes it will be announced in appropriate places (most likely
              vim.sf.net, www.vim.org and/or comp.editors). When it is completely
              impossible to contact the maintainer, the obligation to send him
              your changes ceases. Once the maintainer has confirmed that he has
              received your changes they will not have to be sent again.
           b) If you have received a modified Vim that was distributed as
              mentioned under a) you are allowed to further distribute it
              unmodified, as mentioned at I). If you make additional changes the
              text under a) applies to those changes.
           c) Provide all the changes, including source code, with every copy of
              the modified Vim you distribute. This may be done in the form of a
              context diff. You can choose what license to use for new code you
              add. The changes and their license must not restrict others from
              making their own changes to the official version of Vim.
           d) When you have a modified Vim which includes changes as mentioned
              under c), you can distribute it without the source code for the
              changes if the following three conditions are met:
              - The license that applies to the changes permits you to distribute
                the changes to the Vim maintainer without fee or restriction, and
                permits the Vim maintainer to include the changes in the official
                version of Vim without fee or restriction.
              - You keep the changes for at least three years after last
                distributing the corresponding modified Vim. When the maintainer
                or someone who you distributed the modified Vim to asks you (in
                any way) for the changes within this period, you must make them
                available to him.
              - You clearly describe in the distribution how to contact you. This
                contact information must remain valid for at least three years
                after last distributing the corresponding modified Vim, or as long
                as possible.
           e) When the GNU General Public License (GPL) applies to the changes,
              you can distribute the modified Vim under the GNU GPL version 2 or
              any later version.
        3) A message must be added, at least in the output of the ":version"
           command and in the intro screen, such that the user of the modified Vim
           is able to see that it was modified. When distributing as mentioned
           under 2)e) adding the message is only required for as far as this does
           not conflict with the license used for the changes.
        4) The contact information as required under 2)a) and 2)d) must not be
           removed or changed, except that the person himself can make
           corrections.

    III) If you distribute a modified version of Vim, you are encouraged to use
         the Vim license for your changes and make them available to the
         maintainer, including the source code. The preferred way to do this is
         by e-mail or by uploading the files to a server and e-mailing the URL.
         If the number of changes is small (e.g., a modified Makefile) e-mailing a
         context diff will do. The e-mail address to be used is

    IV) It is not allowed to remove this license from the distribution of the Vim
        sources, parts of it or from a modified version. You may use this
        license for previous Vim releases instead of the license that they came
        with, at your option.

    View Slide

  23. Motivation

    View Slide

  24. Guide to Contribution

    View Slide

  25. View Slide

  26. View Slide

  27. View Slide

  28. 2017

    View Slide

  29. First Issue

    View Slide

  30. View Slide

  31. View Slide

  32. View Slide

  33. View Slide

  34. 2018

    View Slide

  35. View Slide

  36. View Slide

  37. View Slide

  38. 2019

    View Slide

  39. Contribute Again

    View Slide

  40. View Slide

  41. Requirements
    1. The license must have an SPDX identifier
       If your license isn't registered with SPDX, please request that it be added

    2. The license must be listed on one of the following approved
       lists of licenses:
       * List of OSI approved licenses
       * GNU's list of free licenses
         (*note: the license must be listed in one of the three
         "free" categories*)
       * Open Definition's list of conformant licenses

    3. The license must be used in at least *1,000* public repositories.
       This may be documented, for example, with a GitHub code search

    4. 3 notable projects using the license must be identified. These
       must have straightforward LICENSE files which serve as examples
       newcomers can follow and that could be detected by licensee
       if it knew about the license.


    View Slide

  46. View Slide

  47. View Slide

  48. View Slide

  49. View Slide

  50. 3 Notable Projects
    • Vim


    • pathogen.vim by Tim Pope


    • TBD

    View Slide

  51. View Slide

  52. View Slide

  53. View Slide

  54. View Slide

  55. Substitutions

    View Slide

  56. Substitutions
    Vim → software? package? project?
    Vim Maintainer → softwareMaintainer? …etc

    View Slide

  57. Should I?

    View Slide

  58. View Slide

  59. Yes, But

    View Slide

  60. Not The Vim Maintainer

    View Slide

  61. Everything is Ready

    View Slide

  62. View Slide

  63. The Licensee

    View Slide

  64. Licensee
    • A Ruby Gem that detects the license a project is distributed under


    • Uses data from choosealicense.com


    • Only detects a selected set of licenses


    • Supports LICENSE, package file, statements in README, SPDX file …

    View Slide

  65. The matching method
    • Exact match (hash)


    • Sørensen–Dice coefficient (aka Dice)


    • Similarity of two sets (word sets)


    • String length
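
The first three points above can be sketched in a few lines of Python. This is only an illustration of the general techniques (a hash of normalized text for exact matches, and a Sørensen–Dice score over word sets), not Licensee's actual Ruby implementation. The function names, the normalization rules, and the choice of SHA-1 are all assumptions made for this example.

```python
import hashlib
import re


def words(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def exact_match(a, b):
    """Compare hashes of case- and whitespace-normalized text."""
    def digest(t):
        normalized = " ".join(t.lower().split())
        return hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return digest(a) == digest(b)


def dice(a, b):
    """Sørensen–Dice coefficient of two word sets, as a percentage."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 100.0
    # 2 * |A ∩ B| / (|A| + |B|), scaled to 0..100
    return 100.0 * 2 * len(wa & wb) / (len(wa) + len(wb))


if __name__ == "__main__":
    print(exact_match("Vim  License\n", "vim license"))
    print(round(dice("copies of Vim", "copies of Foo"), 2))
```

In this sketch, two texts that differ only in case or spacing hash to the same value, while swapping a single word lowers the Dice score because the swapped word drops out of the intersection.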

    View Slide

  66. View Slide

  67. The Issue
    • There are too many [project] placeholders in the license text


    • After replacing every substitution with “Vim”, the confidence is still < 98

    View Slide

  68. Possible solutions
    • Add tolerance to the specified license


    • Use regex if there are substitutions


    • Dynamic tolerance based on the substitutions
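
As a rough sketch of the regex idea in the list above: a template containing `[project]` placeholders can be turned into a pattern that accepts any project name in those positions. The sample template sentence and the helper name `template_to_regex` are hypothetical, and this is not how Licensee ended up handling the problem.

```python
import re


def template_to_regex(template, placeholder="[project]"):
    """Build a regex that accepts any name where the placeholder sits."""
    # Escape the literal text between placeholders so it matches verbatim.
    parts = [re.escape(p) for p in template.split(placeholder)]
    # Each placeholder may be filled by one or more non-newline characters.
    return re.compile(r"[^\n]+?".join(parts))


# Hypothetical one-sentence template, for illustration only.
template = "You can distribute unmodified copies of [project] and parts of [project]."
pattern = template_to_regex(template)

if __name__ == "__main__":
    text = "You can distribute unmodified copies of pathogen.vim and parts of pathogen.vim."
    print(pattern.fullmatch(text) is not None)
```

The trade-off is strictness: a regex like this tolerates any substituted name but rejects every other difference, whereas a similarity score tolerates small edits anywhere in the text.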

    View Slide

  69. View Slide

  70. View Slide

  71. View Slide

  72. SPDX License List

    View Slide

  73. Software Package Data Exchange (SPDX)
    • Open standard for communicating software bill of materials (SBOM) information


    • Hosted by the Linux Foundation


    • An ISO standard (ISO/IEC 5962:2021)


    • Example

    View Slide

  74. SPDX License List
    • Most popular standard by SPDX


    • Standardize license full name, text, identifier


    • A canonical permanent URL for each license


    • Supported by most ecosystems (e.g. npm, Composer, Cabal)
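
To illustrate what a standardized identifier buys you, here is a minimal sketch that reads an `SPDX-License-Identifier:` tag from a file header. `Vim` is the short identifier the SPDX License List uses for the Vim license. The helper name `spdx_identifier` and the sample header are made up for this example, and the pattern only handles a single simple identifier, not full SPDX license expressions.

```python
import re

# Matches a single short identifier such as "Vim", "MIT" or "GPL-2.0-or-later".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def spdx_identifier(source):
    """Return the first SPDX short identifier declared in the text, or None."""
    match = SPDX_TAG.search(source)
    return match.group(1) if match else None


# Hypothetical Vim script header (Vim comments start with a double quote).
header = '" SPDX-License-Identifier: Vim\n" Maintainer: example\n'

if __name__ == "__main__":
    print(spdx_identifier(header))  # prints: Vim
```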

    View Slide

  75. View Slide

  76. View Slide

  77. View Slide

  78. First Pull Request

    View Slide

  79. View Slide

  80. View Slide

  81. View Slide

  82. View Slide

  83. The Result
    • Mike Linksvayer had already created a PR to fix the issue


    • Add extra tolerance to substitution fields

    View Slide

  84. View Slide

  85. vim-license.dev

    View Slide

  86. Vim License Gen
    • A web app that helps you generate a Vim License text file with a custom project name

    View Slide

  87. View Slide

  88. View Slide

  89. 2020

    View Slide

  90. The 2nd/3rd Pull Request

    View Slide

  91. • Add LICENSE file to the notable projects


    • vim


    • pathogen.vim

    View Slide

  92. View Slide

  93. The Main Pull Request

    View Slide

  94. 3 notable projects
    • Vim


    • pathogen.vim by Tim Pope


    • vim-license-gen (aka vim-license.dev)

    View Slide

  95. View Slide

  96. Merged !!

    View Slide

  97. View Slide

  98. View Slide

  99. Wait for Licensee Update

    View Slide

  100. View Slide

  101. Wait for Licensee Release

    View Slide

  102. View Slide

  103. Wait for GitHub to Upgrade

    View Slide

  104. View Slide

  105. Wait and Wait
    • Put zero pressure on GitHub


    • Wait and wait


    • Then one day, I tested Licensee again

    View Slide

  106. View Slide

  107. View Slide

  108. The Cause
    • The change of the similarity algorithm


    • No test case for Vim License

    View Slide

  109. Fix Licensee

    View Slide

  110. The Solution
    • Update the algorithm, or manually adjust a decision parameter


    • Changed one parameter from 10 to 13


    • Add test case for Vim License

    View Slide

  111. View Slide

  112. The Result
    • Mike Linksvayer had already created a PR to fix the issue


    • For each substitution field, 2 characters of slack do not count toward the penalty


    • Merged in Oct 2020
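
The per-field allowance mentioned above can be pictured with a small sketch. It assumes that "2 ch" means two characters and that the penalty is based on the difference in text length. The function and parameter names are hypothetical, and the slides do not show how Licensee folds this penalty into its final similarity score.

```python
def length_penalty(template_len, candidate_len, substitution_fields, slack_per_field=2):
    """Length difference left over after granting slack to each substitution field."""
    delta = abs(template_len - candidate_len)
    # Each field may differ by up to `slack_per_field` characters for free.
    allowance = substitution_fields * slack_per_field
    return max(0, delta - allowance)


if __name__ == "__main__":
    # Six characters of difference spread over four fields is fully absorbed.
    print(length_penalty(100, 106, 4))
    # Twelve characters of difference leaves four characters to be penalized.
    print(length_penalty(100, 112, 4))
```

The point of scaling the allowance by the number of fields is that a license with many substitution fields, like the Vim License, naturally drifts further in length when a project name is swapped in.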

    View Slide

  113. View Slide

  114. Wait Again

    View Slide

  115. View Slide

  116. Feb 2021

    View Slide

  117. GitHub Support

    View Slide

  118. View Slide

  119. View Slide

  120. Wait Again and Again

    View Slide

  121. View Slide

  122. Jan 2022

    View Slide

  123. GitHub Search by License

    View Slide

  124. View Slide

  125. Vim’s License Still Not Correct

    View Slide

  126. View Slide

  127. View Slide

  128. GitHub Support

    View Slide

  129. View Slide

  130. View Slide

  131. View Slide

  132. View Slide

  133. View Slide

  134. Summary

    View Slide

  135. • 2 issues


    • 5 pull requests


    • 1 mailing-list thread


    • 1 new web app


    • 2 GitHub support requests


    • ~ 5 years

    View Slide

  136. Take away

    View Slide

  137. • How to start a contribution and how to communicate


    • What SPDX and the SPDX License List are


    • What Licensee is and how it works


    • A deep understanding of the Vim License

    View Slide

  138. Thanks for Listening

    View Slide

  139. Questions?

    View Slide