Slide 1

Slide 1 text

[email protected] https://itakigawa.github.io/

Slide 2

Slide 2 text

2 = ( ) ⾒ ⾒

Slide 3

Slide 3 text

3 ( )
• AlphaGeometry (Nature, 2024): solves olympiad geometry problems at close to the level of a human gold medallist
• AlphaCode2 (2023) & AlphaCode (Science, 2022): competitive programming (Codeforces); AlphaCode2 is estimated to perform better than 85% of human competitors
• FunSearch (Nature, 2023): discovered new constructions in extremal combinatorics (the Cap set problem)
• AlphaDev (Nature, 2023): discovered faster sorting routines for sequences of 3-5 elements, up to 70% faster for short sequences and about 1.7% faster for sequences of more than 250,000 elements; merged into LLVM's libc++
• AlphaTensor (Nature, 2022): first improvement in over 50 years; multiplies 4×4 matrices in modulo-2 arithmetic with 47 multiplications, beating Strassen's 49

Slide 4

Slide 4 text

4 ( ) Nature News https://bit.ly/3UQcgpu Stanford University’s 2024 AI Index https://aiindex.stanford.edu/report/

Slide 5

Slide 5 text

5 ( ) Measuring GitHub Copilot’s Impact on Productivity. DOI:10.1145/3633453. Case study asks Copilot users about its impact on their productivity, and seeks to find their perceptions mirrored in user data. BY ALBERT ZIEGLER, EIRINI KALLIAMVAKOU, X. ALICE LI, ANDREW RICE, DEVON RIFKIN, SHAWN SIMISTER, GANESH SITTAMPALAM, AND EDWARD AFTANDILIAN. Communications of the ACM, 67(3), March 2024.
key insights: AI pair-programming tools such as GitHub Copilot have a big impact on developer productivity. This holds for developers of all skill levels, with junior developers seeing the largest gains. The reported benefits of receiving AI suggestions while coding span the full range of typically investigated aspects of productivity, such as task time, product quality, cognitive load, enjoyment, and learning. Perceived productivity gains are reflected in objective measurements of developer activity. While suggestion correctness is important, the driving factor for these improvements appears to be not correctness as such, but whether the suggestions are useful as a starting point for further development.
CODE-COMPLETION SYSTEMS OFFERING suggestions to a developer in their integrated development environment (IDE) have become the most frequently used kind of programmer assistance.1 When generating whole snippets of code, they typically use a large language model (LLM) to predict what the user might type next (the completion) from the context of what they are working on at the moment (the prompt).2 This system allows for completions at any position in the code, often spanning multiple lines at once. Potential benefits of generating large sections of code automatically are huge, but evaluating these systems is challenging. Offline evaluation, where the system is shown a partial snippet of code and then asked to complete it, is difficult not least because for longer completions there are many acceptable alternatives and no straightforward mechanism for labeling them automatically.5 An additional step taken by some researchers3,21,29 is to use online evaluation and track the frequency of real users accepting suggestions, assuming that the more contributions a system makes to the developer’s code, the higher its benefit. The validity of this assumption is not obvious when considering issues such as whether two short completions are more valuable than one long one, or whether reviewing suggestions can be detrimental to programming flow. Code completion in IDEs using language models was first proposed in Hindle et al.,9 and today neural synthesis tools such as GitHub Copilot, CodeWhisperer, and TabNine suggest code snippets within an IDE with the explicitly stated intention to increase a user’s productivity. Developer productivity has many aspects, and a recent study has shown that tools like these are helpful in ways that are only partially reflected by measures such as completion times for standardized tasks.23,a Alternatively, we can leverage the developers themselves as expert assessors of their own productivity. This meshes well with current thinking in software engineering research suggesting measuring productivity on multiple dimensions and using self-reported data.6 Thus, we focus on studying perceived productivity. Here, we investigate whether usage measurements of developer interactions with GitHub Copilot can predict perceived productivity as reported by developers. We analyze 2,631 sur-
a Nevertheless, such completion times are greatly reduced in many settings, often by more than half.16
• (explain) • (brushes) • A B • GitHub Copilot ( ) Comm. ACM. 67(3) https://doi.org/10.1145/3633453 • GitHub Next / Copilot Labs https://githubnext.com/

Slide 6

Slide 6 text

6 ChatGPT ( ) ? https://arxiv.org/abs/2311.17035 IQ ? https://bit.ly/44qxNYQ

Slide 7

Slide 7 text

7 LLM ( ) https://bit.ly/44Og5z3 ( ) https://arxiv.org/abs/2309.12288

Slide 8

Slide 8 text

8 (Deductive) (Inductive) (Hypothetico-Deductive) / Retroductive (Abductive) (System 1 + System 2)

Slide 9

Slide 9 text

9 • Q. ( ) ( ) vs ( ) • • • ( , RAG, ReACT, LangChain, ) • ( ) • × ( ) •

Slide 10

Slide 10 text

10 • ( ) • ( ) • ( ) • ( ) • BHK ( ) ( ) ? (2015)

Slide 11

Slide 11 text

11 (2023 ) ACM A.M. Turing Award Honors Avi Wigderson for Foundational Contributions to the Theory of Computation https://awards.acm.org/about/2023-turing For Turing Award winner, everything is computation and some problems are unsolvable https://bit.ly/3xZzJLP

Slide 12

Slide 12 text

12 = / / • (Church, Kleene, Turing) • /μ (Gödel-Herbrand) • (Church) • (Turing) • • ( ) vs. (2021) (2016)

Slide 13

Slide 13 text

13 A. ( ) B. ( ) ( ) • 𝜃 ( ) ℱ = { f_θ ; θ ∈ Θ } • 𝜃

Slide 14

Slide 14 text

14 • • (Memorization) Overfitting ( ) [figure: training data points (x, y)]

Slide 15

Slide 15 text

15 1 • B A 1. 2. • (Cybenko, Hornik, etc) 3 1 1 1 3 • ( ) (Vapnik & Chervonenkis, etc) • (Computable Analysis)

Slide 16

Slide 16 text

16 2 • or (Universal Prediction): • ( ) n (2^-n) • • 1960 • ( ) ⾒ https://www.lesswrong.com/tag/solomonoff-induction R. J. Solomonoff, Machine Learning — Past and Future (2009). Dartmouth Artificial Intelligence Conference: The Next Fifty Years (AI@50), Dartmouth, 2006. https://bit.ly/3UvpMgG
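
For reference, the 2^-n weighting above corresponds to Solomonoff's universal prior; the formula below is the standard textbook formulation, not text recovered from the slide:

    M(x) = \sum_{p \,:\, U(p)\ \text{begins with}\ x} 2^{-|p|}

where U is a universal prefix Turing machine, p ranges over programs, and |p| is the program length in bits; shorter programs (simpler hypotheses) receive exponentially larger weight, and prediction uses the conditional M(x_{n+1} | x_1 ... x_n).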

Slide 17

Slide 17 text

17 • Q. ( ) ( ) vs ( ) • • • ( , RAG, ReACT, LangChain, ) • ( ) • × ( ) •

Slide 18

Slide 18 text

18 ⚠ • ( ) • • (Melanie Mitchell) • AI hype is built on high test scores. Those tests are flawed. https://bit.ly/44m7NxX • AI now beats humans at basic tasks — new benchmarks are needed, says major report https://www.nature.com/articles/d41586-024-01087-4 • ( “ ” )

Slide 19

Slide 19 text

19 (LLMs) 頻 ( ) GPT-4, Claude 3, Gemini 1.5, Mistral, Llama 3, Command, DBRX, PPLX, Grok, https://artificialanalysis.ai/

Slide 20

Slide 20 text

Me > ChatGPT4 > 1. 1944 2. 1966 3. 1971 4. 1971 5. 1981 6. 1990 7. 2007 ü : : ü ü ü ( )

Slide 21

Slide 21 text

Me > ⾒ Web ChatGPT4 > Ichigaku Takigawa ICReDD ⾒ ⾒ d

Slide 22

Slide 22 text

Claude 3 Opus > • • • • • • AC0 PARITY MAJORITY • TC0 W(S5) • • Transformers as recognizers of formal languages: A survey on expressivity. L Strobl, W Merrill, G Weiss, D Chiang, D Angluin https://arxiv.org/abs/2311.00208

Slide 23

Slide 23 text

Me > URL Web 10 https://www.lifeiscomputation.com/transformers-are-not-turing-complete/ ChatGPT4 > Web 1. : : 2. : : 3. : : 4. : : 5. : : 6. : : 7. : : 8. : : 9. : : 10. : : [ ] ü ü

Slide 24

Slide 24 text

24 • ( ) • • WWW • • (?)

Slide 25

Slide 25 text

25 Microsoft • OpenAI GPT AI Word, Excel, PowerPoint, Outlook, Teams, Bing/Edge, Azure, GitHub, VSCode, • GitHub (2018) VSCode GitHub Copilot / Copilot Workspace • 3 (440 ) 3 Apple 2 (🇬🇧🇫🇷🇮🇹 GDP ) • 頻 Windows Microsoft 365 Copilot • Word PowerPoint • Excel • Web

Slide 26

Slide 26 text

26 • • • ⾒ • • • • • 1cm 2 •

Slide 27

Slide 27 text

27 1. ( ) ( ) Embedding ( ) 2. ( ) • Vision-Language-Action (VLA) “computer” → [-0.005, 0.014, -0.007 , ..., -0.015]

Slide 28

Slide 28 text

28 OpenAI embedding models • text-embedding-ada-002 • text-embedding-3-small • text-embedding-3-large API KEY OK
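
A minimal sketch of calling the embedding models listed above, assuming the openai Python package (v1 client) and an OPENAI_API_KEY environment variable; only the model names come from the slide, the rest is illustrative:

    # Minimal sketch: get an embedding vector for a word or sentence.
    # Assumes `pip install openai` (v1 client) and OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.embeddings.create(
        model="text-embedding-3-large",   # or "text-embedding-3-small", "text-embedding-ada-002"
        input="computer",
    )
    vec = resp.data[0].embedding          # a list of floats, e.g. [-0.005, 0.014, ...]
    print(len(vec), vec[:5])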

Slide 29

Slide 29 text

29 ⾒ ≑ ⾒ • ( ): ⾒ ⾒ ( ) ⾒ ⾒ • : (= ) Wikipedia 2014 + Gigaword 5 40 ( ) OpenAI embeddings (large) 10 ※ GloVe: Global Vectors for Word Representation https://nlp.stanford.edu/projects/glove/
king – man + woman: king 0.690340, woman 0.569618, queen 0.521278, königin 0.479320, queene 0.477531, koningin 0.468316, women 0.460638, konig 0.459005, queenie 0.458508, queeny 0.455897
france – paris + london: london 0.682005, france 0.643014, england 0.563394, london-based 0.554125, longdon 0.539819, londen 0.532294, london-born 0.527847, londons 0.512141, londin 0.507284, britain 0.494870
computation: computation 1.000000, computations 0.863177, computing 0.773753, computational 0.728108, computationally 0.710972, computes 0.702328, computability 0.678299, calculation 0.647723, compute 0.637001, computable 0.612179
impeccable: impeccable 1.000000, impeccably 0.852758, unimpeachable 0.769426, irreproachable 0.725752, immaculate 0.716498, immaculately 0.665226, flawless 0.656289, faultless 0.633298, unblemished 0.628034, spotless 0.622993
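
The analogy lists above can be reproduced with a few lines of code; a minimal sketch, assuming the pre-trained glove.6B.300d.txt file from the GloVe page above has been downloaded (file name and top-k are illustrative):

    import numpy as np

    # Load GloVe vectors (word -> numpy array); assumes glove.6B.300d.txt is available locally.
    emb = {}
    with open("glove.6B.300d.txt", encoding="utf-8") as f:
        for line in f:
            word, *vals = line.rstrip().split(" ")
            emb[word] = np.asarray(vals, dtype=np.float32)

    def nearest(vec, k=10):
        # Rank the vocabulary by cosine similarity to the query vector.
        words = list(emb)
        mat = np.stack([emb[w] for w in words])
        sims = mat @ vec / (np.linalg.norm(mat, axis=1) * np.linalg.norm(vec) + 1e-9)
        order = np.argsort(-sims)[:k]
        return [(words[i], float(sims[i])) for i in order]

    query = emb["king"] - emb["man"] + emb["woman"]
    for w, s in nearest(query):
        print(f"{w}\t{s:.6f}")   # "queen" should rank near the top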

Slide 30

Slide 30 text

30 • ( ) • , , , , , , , , , (Character Level) • , , , , (Word Level) • ⾒ (Subword Level) ( + ) • Character/Byte Level (BPE) ( RePair) Byte fallback
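
To see subword (BPE) tokenization in action, a small sketch using the Llama 3 tokenizer that appears later in these slides (Hugging Face transformers; access to the gated meta-llama checkpoint is assumed, and any BPE tokenizer shows the same behaviour):

    from transformers import AutoTokenizer

    # Assumes access to the gated meta-llama repository.
    tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

    for text in ["computation", "impeccable", "535+61"]:
        ids = tok.encode(text, add_special_tokens=False)
        print(text, "->", tok.convert_ids_to_tokens(ids))
    # Frequent words map to single tokens; rare strings split into smaller subword
    # pieces, and unseen byte sequences fall back to byte-level tokens instead of
    # producing an unknown token.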

Slide 31

Slide 31 text

31 1/2 • Causal LM = Next Token Prediction ( ) • : n : : I'm fine, thank you (9.99e-01) for (1.09e-04) goodness (7.48e-05) • Meta-Llama-3-8B-Instruct ⾒ 頻8192 4096 128256 • n = 8192 ( / ) • vocab size = 128256 ( ) • embedding dim = 4096 ( ) • n : I'm fine, thank for asking. Just a little tired from the trip. I'll be okay
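
The next-token probabilities quoted above ("you" 9.99e-01, "for" 1.09e-04, ...) can be inspected directly; a minimal sketch assuming the same Meta-Llama-3-8B-Instruct checkpoint and enough GPU memory (exact numbers depend on the environment):

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    lm = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

    prompt = "I'm fine, thank"
    inputs = tok(prompt, return_tensors="pt").to(lm.device)
    with torch.no_grad():
        logits = lm(**inputs).logits[0, -1]            # scores for the next token only
    probs = torch.softmax(logits.float(), dim=-1)       # distribution over 128256 tokens
    top = torch.topk(probs, 3)
    for p, i in zip(top.values, top.indices):
        print(repr(tok.decode(int(i))), f"{p.item():.2e}")   # e.g. ' you' 9.99e-01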

Slide 32

Slide 32 text

32 2/2 = Transformer (Llama3 ) ( 8192) (4096) Transformer Decoder ❌ ❌ ❌ ❌ / (4096) (embedding) Everything is a remix https://www.everythingisaremix.info/

Slide 33

Slide 33 text

33 2/2 = Transformer (Llama3 ) ( 8192) (4096) Transformer Decoder … (128256) Transformer Decoder × 32: RMSNorm → Multihead Attention → + (residual), RMSNorm → MLP → + (residual); final RMSNorm → Linear. SiLU-gated MLP: y = w2(silu(w1(x)) * w3(x)). Scaled Dot Product Attention: o_i = Σ_j α_{i,j} v_j, with α_{i,j} = softmax(q_i · k_j / √(dim of k_j)). RoPE
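
A toy numerical version of the two formulas on this slide: scaled dot-product attention o_i = Σ_j α_{i,j} v_j, and the SiLU-gated MLP y = w2(silu(w1(x)) * w3(x)). Shapes are tiny and illustrative; RoPE, the causal mask, multi-head splitting and RMSNorm are omitted:

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    n, d, d_ff = 5, 8, 16          # toy sizes: 5 tokens, model dim 8, MLP dim 16
    x = torch.randn(n, d)

    # --- Scaled dot-product attention (single head) ---
    wq, wk, wv = (torch.randn(d, d) for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    alpha = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)   # α_{i,j}
    o = alpha @ v                                             # o_i = Σ_j α_{i,j} v_j

    # --- SiLU-gated MLP (SwiGLU): y = w2(silu(w1(x)) * w3(x)) ---
    w1, w3 = torch.randn(d, d_ff), torch.randn(d, d_ff)
    w2 = torch.randn(d_ff, d)
    y = (F.silu(x @ w1) * (x @ w3)) @ w2

    print(o.shape, y.shape)   # torch.Size([5, 8]) torch.Size([5, 8])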

Slide 34

Slide 34 text

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
llama3_lm = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print(llama3_lm); print(llama3_lm.model.config)

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(128256, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=128256, bias=False)
)

Slide 35

Slide 35 text

35 ( ) • ( ) Embedding ( ) • Llama3 15 ( ) • • ( ) 頻 • memorization

Slide 36

Slide 36 text

36 • ( ) ) GPT-4 ChatGPT • Instruction Tuning ( Fine Tuning) ( ) Fine Tuning • Human Feedback (RLHF) • • • Fine Tuning https://bit.ly/4bloMTk

Slide 37

Slide 37 text

37 AI ( ) 1950 1960 1970 1980 1990 2000 2010 2020 2 3 ( ) , , Level-1 Level-2 × Level-3 AI(GOFAI) !?

Slide 38

Slide 38 text

38 Level-1 • (Llama3 15 ) ( etc) • • (Fine Tuning) • RAG ( ) ReACT( ) ( VectorDB ) ) Python • ( ) ( Claude3 200K, Gemini 1.5 Pro 128K)
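
A bare-bones sketch of the RAG idea on this slide: embed documents, retrieve the most similar one for a question, and paste it into the prompt. The in-memory "vector DB", the document strings, and the model names are placeholders, not a specific LangChain or VectorDB API:

    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY in the environment

    docs = [
        "GitHub Copilot suggests code completions inside the IDE.",
        "FunSearch searches for programs and scores them with an evaluate function.",
    ]

    def embed(texts):
        r = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in r.data])

    doc_vecs = embed(docs)                       # toy in-memory "vector DB"

    question = "What does GitHub Copilot do?"
    q = embed([question])[0]
    scores = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = docs[int(np.argmax(scores))]       # retrieve the most similar document

    answer = client.chat.completions.create(
        model="gpt-4o-mini",                     # placeholder model name
        messages=[{"role": "user",
                   "content": f"Answer using this context:\n{context}\n\nQ: {question}"}],
    )
    print(answer.choices[0].message.content)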

Slide 39

Slide 39 text

39 Level-1 LangChain LLM https://udemy.benesse.co.jp/development/system/langchain.html

Slide 40

Slide 40 text

40 Level-2 ( ) Tree of Thoughts/ToT (Yao et al. 2023, Long 2023) • • ⾒ •
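
A schematic sketch of the Tree-of-Thoughts loop: propose several candidate "thoughts", evaluate them, keep the most promising, and repeat. propose() and score() are placeholders standing in for LLM calls:

    import random

    def propose(state, n=3):
        # Placeholder for an LLM call that proposes n candidate next "thoughts".
        return [state + f" -> step{random.randint(0, 99)}" for _ in range(n)]

    def score(state):
        # Placeholder for an LLM (or programmatic) evaluation of how promising a state is.
        return random.random()

    def tree_of_thoughts(root, depth=3, beam=2):
        frontier = [root]
        for _ in range(depth):
            candidates = [s for state in frontier for s in propose(state)]
            candidates.sort(key=score, reverse=True)   # evaluate and rank thoughts
            frontier = candidates[:beam]               # keep only the best (beam search)
        return frontier[0]

    print(tree_of_thoughts("problem"))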

Slide 41

Slide 41 text

41 AlphaCode2 • AlphaCode2 (2023) & AlphaCode (Science, 2022) (Codeforces) 85% • https://goo.gle/AlphaCode2 https://bit.ly/3UvYg2T https://bit.ly/3JLXhGD

Slide 42

Slide 42 text

42 AlphaCode2 ü Gemini Pro CodeContests Fine Tuning ü Tune Tuning ü 100 ü 95% ü ü Fine Tuning Gemini Pro 10
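
A toy sketch of the sample → filter-by-execution → cluster-and-submit pipeline summarized here; llm_sample, passes_public_tests, and behavior_signature are placeholders for what AlphaCode2 does at far larger sample counts:

    from collections import defaultdict

    def llm_sample(problem, n):
        # Placeholder: in AlphaCode2 this is a fine-tuned code model sampling many programs.
        return [f"# candidate {i}\nprint(sum(map(int, input().split())))" for i in range(n)]

    def passes_public_tests(code, tests):
        # Placeholder: run the program on the public examples and compare outputs.
        return hash(code) % 20 == 0   # toy stand-in; most samples fail the examples

    def behavior_signature(code, probe_inputs):
        # Placeholder: AlphaCode-style clustering groups programs by identical outputs on probe inputs.
        return hash(code) % 5

    candidates = llm_sample("A+B problem", n=1000)
    survivors = [c for c in candidates if passes_public_tests(c, tests=None)]

    clusters = defaultdict(list)
    for c in survivors:
        clusters[behavior_signature(c, probe_inputs=None)].append(c)

    # Submit one representative from each of the largest clusters (up to 10 submissions).
    submissions = [cs[0] for _, cs in sorted(clusters.items(), key=lambda kv: -len(kv[1]))][:10]
    print(len(candidates), len(survivors), len(submissions))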

Slide 43

Slide 43 text

43 AlphaGeometry • AlphaGeometry (Nature, 2024) • Nature 625, 476-482 (2024). https://doi.org/10.1038/s41586-023-06747-5 https://bit.ly/4dkSKc1

Slide 44

Slide 44 text

44 AlphaGeometry ü ü ü LLM ü ü

Slide 45

Slide 45 text

45 AlphaGeometry

Slide 46

Slide 46 text

46 AlphaGeometry 10

Slide 47

Slide 47 text

47 AlphaGeometry

Slide 48

Slide 48 text

48 FunSearch • FunSearch (Nature, 2023): new constructions in extremal combinatorics (the Cap set problem) • Nature 625, 468-475 (2024). https://doi.org/10.1038/s41586-023-06924-6 https://bit.ly/3WprEdx

Slide 49

Slide 49 text

49 FunSearch ü LLM ü FunSearch Fun ( ) ü Google LLM PaLM 2 Codey 蓄 Fine Tuning ü ü ü k ⾒

Slide 50

Slide 50 text

50 FunSearch evaluate
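
A toy version of the FunSearch loop from slides 49-50: an LLM proposes modified programs, a hand-written evaluate() function scores them, and improved programs go back into the pool that seeds the next prompt. llm_mutate is a stand-in for the real PaLM 2 / Codey calls, and the scoring objective is purely illustrative:

    import random
    import re

    def evaluate(program_src):
        # Problem-specific scorer written by a human (the key ingredient in FunSearch).
        # Toy objective: reward programs whose priority() returns a large value for n=10.
        scope = {}
        try:
            exec(program_src, scope)
            return scope["priority"](10)
        except Exception:
            return float("-inf")

    def llm_mutate(parent_src):
        # Placeholder for the LLM: here we just perturb the constant the parent returns.
        return re.sub(r"return \d+", f"return {random.randint(1, 100)}", parent_src)

    pool = ["def priority(n):\n    return 1\n"]
    for _ in range(200):
        parent = max(pool, key=evaluate)   # best-so-far program becomes the prompt
        child = llm_mutate(parent)         # LLM proposes a new candidate
        if evaluate(child) > evaluate(parent):
            pool.append(child)             # keep improvements, discard the rest

    best = max(pool, key=evaluate)
    print(best, "score:", evaluate(best))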

Slide 51

Slide 51 text

51 • Q. ( ) ( ) vs ( ) • • • ( , RAG, ReACT, LangChain, ) • ( ) • × ( ) •

Slide 52

Slide 52 text

52 Level-3 × • Q. ( ) • • or • • • / RAG • ReACT • •

Slide 53

Slide 53 text

53 • keras https://keras.io/examples/nlp/addition_rnn/ • "535+61" "596" • LSTM (1997) Folklore Sequence to Sequence Learning with Neural Networks (2014) https://arxiv.org/abs/1409.3215 Learning to Execute (2014) https://arxiv.org/abs/1410.4615 • +/- Reversed Karpathy's minGPT demo https://github.com/karpathy/minGPT/blob/master/projects/adder/adder.py • (2 +2 ) • Onehot +1 LSTM 99% (Transformer ) • memorization ( data leakage )
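
A small data-generation sketch for the reversed-digit addition experiment referenced above (keras addition_rnn, minGPT adder); the fixed-width padding and digit reversal follow the usual setup, the details are illustrative:

    import random

    def make_example(max_digits=3, reverse=True):
        a = random.randint(0, 10**max_digits - 1)
        b = random.randint(0, 10**max_digits - 1)
        q = f"{a}+{b}".ljust(2 * max_digits + 1)     # fixed-width input, e.g. "535+61 "
        ans = str(a + b).ljust(max_digits + 1)        # fixed-width target, e.g. "596 "
        if reverse:
            ans = ans[::-1]   # emitting the least-significant digit first makes carries "local"
        return q, ans

    random.seed(0)
    for _ in range(3):
        print(make_example())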

Slide 54

Slide 54 text

54 CS • • • ( ) ( ) • • • ( ) • ( ) • CS

Slide 55

Slide 55 text

55 1 • 1cm 2 ( ) / • ( ) , ( ) • • ( ) 2 ? https://bit.ly/4b9mcjz ( ) • 1000 1001 / ( )

Slide 56

Slide 56 text

56 2 • ( ) • ( ) ( or ) ( ) ( ) • • ? • ⾒ ?

Slide 57

Slide 57 text

57 ⾒ • ⾒ ( ) • ⾒ ⾒ ⾒ • • Underdetermined/Underspecified 頻 • ( ) • • ? ( ), ( )

Slide 58

Slide 58 text

58 Level-3 × • : Thinking slow (System 2) and fast (System 1) • • ⾒ • ⾒System 2( ) OK • • System2 • × https://arxiv.org/html/2401.14953v1 • LLM

Slide 59

Slide 59 text

59 1. • Transformer Stateless • 2. • Transformer Memoryless ( ?) • ( ) 3. ( Simulate Feedback) • • • / Optional

Slide 60

Slide 60 text

60 / Feedback • / • Self-Refine https://arxiv.org/abs/2303.17651 • Looped Transformer https://arxiv.org/abs/2301.13196 • The ConceptARC Benchmark https://arxiv.org/abs/2305.07141 • (e.g. ) • Go-Explore https://doi.org/10.1038/s41586-020-03157-9 Atari57 Montezuma's Revenge and Pitfall • Dreamer V3 https://arxiv.org/abs/2301.04104 • MuZero https://doi.org/10.1038/s41586-020-03051-4 , https://www.repre.org/repre/vol45/special/nobuhara/

Slide 61

Slide 61 text

61 • ( × ) AlphaCode2, AlphaGeometry, FunSearch • NN ( etc) • Mixture of Experts (MoE) System 2 System 1 • • Tokenization

Slide 62

Slide 62 text

62 Q. ( ) A. , • ( ) • ( ) By Design ( ) • ⾒ ( Transformer )

Slide 63

Slide 63 text

• ( ), . ( ) X- informatics , 2023 2 18 • ( ), ChatGPT . 2023, , 2023 8 25 • ( ), AI ChatGPT . AI , 2023 9 14 • (NTT CS ), Neuro-Symbolic AI . 2024 3 27 17 AFSA • ( ), . 18 AFSA 2024 4 24 • Private communications: , ( ), (NII) https://afsa.jp/

Slide 64

Slide 64 text

64 • • Great • May the ML Force be with you… PDF https://itakigawa.github.io/data/comp2024.pdf https://bit.ly/4bnMX3m

Slide 65

Slide 65 text

• . (2), , 93(12), 2023. https://bit.ly/3UNCvN8 • . , 77(10), 2023. https://bit.ly/3Wrjwtf • . , 64(12), 2023 https://bit.ly/3UMjAm4 • , , 49(1), 2022. https://bit.ly/3Quy3Aj • ⾒ , 70(3), 2022. https://bit.ly/3QyNOXa • (FPAI). , 34(5), 2019. https://bit.ly/3Qz8fTK • , . , 2018. • , vs. . , 2021. • , ? . , 2015. • , 11 20 , , 2007. • , . , 2016. • , . , 2012. • , . , 1989. • W G , : . , 2005. • , 1912~1951. , 2003. • , . , 2014. • , ⾒ . , 2016. • , & ? , 2014.

Slide 66

Slide 66 text

QA + • Q. tokenization Byte-level subword-level Char/Byte-level Byte fallback Byte • Q. System 2 System 1 System 1 1000 1001 ( ) • Q. ⾒ • 1 Transformer • 2 ( ) ( ) ( ) ( ) ( ) ( ) • 3