Feature Scattering in the Large: A Longitudinal Study of Linux Kernel Device Drivers (Modularity 2015) - Best Paper Award (slides and presentation by L. Passos)

Feature code is often scattered across wide parts of the code base. But scattering is not necessarily bad if used with care; in fact, systems with highly scattered features have evolved successfully over years. Among other benefits, feature scattering allows developers to circumvent limitations in programming languages and system design. Still, little is known about the characteristics governing scattering, which factors influence it, and its practical limits in the evolution of large and long-lived systems. We address this issue with a longitudinal case study of feature scattering in the Linux kernel. We quantitatively and qualitatively analyze almost eight years of its development history, focusing on scattering of device-driver features. Among other findings, we show that, while scattered features are regularly added, their proportion is lower than that of non-scattered ones, indicating that the kernel architecture allows most features to be integrated in a modular manner. The median scattering degree of features is constant and low, but the scattering-degree distribution is heavily skewed. Thus, the arithmetic mean is not a reliable threshold to monitor the evolution of feature scattering. When investigating influencing factors, we find that platform-driver features are 2.5 times more likely to be scattered across architectural (subsystem) boundaries than non-platform ones. Their use illustrates a maintenance-performance trade-off in creating architectures such as that of Linux kernel device drivers.

ASERG, DCC, UFMG

March 18, 2015

Transcript

  1. Feature Scattering in the Large: A Longitudinal
    Study of Linux Kernel Device Drivers
    Modularity’15 Research Track
    Leonardo Passos, University of Waterloo, Canada
    Krzysztof Czarnecki, University of Waterloo, Canada
    Thorsten Berger, University of Waterloo, Canada
    Sven Apel, University of Passau, Germany
    Jesús Padilla, University of Waterloo, Canada
    Marco Tulio Valente, Federal University of Minas Gerais, Brazil

  2. Feature = configuration option
    Feature CONFIG_ACPI is scattered across the
    IA-64 CPU code

  3. Hinders parallel development

  4.

  5. Leads to code tangling, negatively affecting code
    understanding

  6. Nonetheless, feature scattering is a popular
    mechanism to support new features

  7. Quick solution understood by all developers

  8. No modules, no interfaces, no design patterns, etc.

  9. Allows overcoming modularity limitations in existing
    programming languages
    (not every feature can be modular)

  10. Many large and long-lived software systems have shown that it is possible
    to evolve continuously in the face of feature scattering
    axTLS
    Coreboot
    SeaBIOS
    FreeBSD

  11. However, no empirical study has investigated
    feature scattering in the evolution of large
    and long-lived systems

  12. Studies of this kind are key
    in creating a general theory on how to
    effectively manage feature scattering

  13. Scattering
    How could a theory help?
    Scattering is harmful
    Scattering is not necessarily bad
    (easy & cheap solution)

  14. Many empirical works have to be performed
    before devising such a theory

  15. “A journey of a thousand miles must begin
    with the first step”

  16. Starting point: the Linux kernel

  17. > 13,000 features
    feature-oriented system
    continuously evolving

  18. r = 0.996

  19. Scope: device-driver features

  20. In our analyses, we consider scattering of
    features in terms of referring ifdefs

  21. Scattering degree (SD) of a feature f =
    number of ifdefs referring to f

  22. ifdefs: #ifdef, #ifndef, #elif, #if

  23. #if defined(CONFIG_ACPI) || defined(CONFIG_PM)
    #ifndef CONFIG_ACPI
    #if defined(CONFIG_ACPI)
    ...
    A feature f is scattered if SD(f) ≥ 2
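The definition of scattering degree on slides 21 to 23 can be sketched as a small counting routine. This is only an illustration, not the tooling used in the study: the function names `scattering_degree` and `is_scattered` and the sample source text are hypothetical, and the sketch assumes one directive per line with no line continuations.

```python
import re

# Conditional directives counted as "ifdefs" on the slides.
IFDEF_RE = re.compile(r"^\s*#\s*(ifdef|ifndef|elif|if)\b(.*)$")


def scattering_degree(feature, sources):
    """Count conditional directives that mention `feature` in the given source texts."""
    name = re.compile(r"\b" + re.escape(feature) + r"\b")
    degree = 0
    for text in sources:
        for line in text.splitlines():
            match = IFDEF_RE.match(line)
            # Only the condition part of the directive is searched for the feature name.
            if match and name.search(match.group(2)):
                degree += 1
    return degree


def is_scattered(feature, sources):
    """A feature is scattered when at least two ifdefs refer to it."""
    return scattering_degree(feature, sources) >= 2


# Hypothetical C fragment, made up for this example.
example = """\
#ifdef CONFIG_ACPI
int a;
#endif
#if defined(CONFIG_ACPI) || defined(CONFIG_PM)
int b;
#elif defined(CONFIG_PM)
int c;
#endif
#ifndef CONFIG_ACPI_EXTRA
int d;
#endif
"""

print(scattering_degree("CONFIG_ACPI", [example]))
print(is_scattered("CONFIG_ACPI_EXTRA", [example]))
```

Matching the feature name on word boundaries keeps `CONFIG_ACPI` from being counted inside a longer identifier such as `CONFIG_ACPI_EXTRA`.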

  24. From the kernel evolution history,
    some limits clearly emerge...

  25. % of scattered features is nearly constant
    (~ 18%)

  26. Local vs global scattering

  27. A feature is locally scattered when its referring
    ifdefs are restricted to files in the
    driver subsystem only

  28. A feature is globally scattered when there is
    at least one referring ifdef in a file outside
    the driver subsystem
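The local/global distinction on slides 27 and 28 can be expressed as a classification over the files that contain a feature's referring ifdefs. The sketch below is hypothetical: the function name `classify_scattering` is invented here, and treating every path under `drivers/` as the driver subsystem is an assumption made for illustration rather than the mapping used in the study.

```python
def classify_scattering(ifdef_files, driver_prefixes=("drivers/",)):
    """Classify a feature from the file paths of its referring ifdefs.

    `ifdef_files` holds one path per referring ifdef, so its length is the
    feature's scattering degree. `driver_prefixes` is an assumed stand-in
    for the driver subsystem.
    """
    if len(ifdef_files) < 2:
        return "not scattered"
    if all(path.startswith(driver_prefixes) for path in ifdef_files):
        return "local"
    # At least one referring ifdef lives outside the driver subsystem.
    return "global"


# Hypothetical paths, made up for this example.
print(classify_scattering(["drivers/acpi/bus.c", "drivers/pci/pci.c"]))
print(classify_scattering(["drivers/acpi/bus.c", "arch/ia64/kernel/acpi.c"]))
```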

  29. % of globally scattered features is increasing,
    but ≤ 43% at all times
    Stabilization (~ 43%)

  30. What about the scattering degree
    of features?

  31.

  32. For 50% (median) of scattered-driver
    features, SD ≤ 4

  33. For 75%, SD ≤ 8

  34. Non-outlier features: 8 < SD ≤ 55

  35. Outliers: 35 ≤ SD ≤ 377

  36. There appear to be different groups
    in Linux, with different SD limits

  37. Group 1 (low SD): SD ≤ 4
    50% of scattered-driver features

  38. Group 2 (medium SD): 5 ≤ SD ≤ 8
    25% of scattered-driver features

  39. Group 3 (high SD): SD > 8
    Non-outliers: ~ 22.5%
    Outliers: ~ 2.5% (max SD = 377)

  40. … a single SD-limit controlling all features
    does not seem to apply
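To see why a heavily skewed distribution argues against a single mean-based limit, consider the following sketch. The ten scattering degrees are synthetic values invented for this example, not data from the study; only the group boundaries (SD ≤ 4, 5 ≤ SD ≤ 8, SD > 8) and the maximum of 377 are taken from the slides. The name `sd_group` is hypothetical.

```python
import statistics

# Synthetic scattering degrees for ten hypothetical scattered features;
# these are NOT measurements from the study.
sample_sd = [2, 2, 3, 3, 4, 4, 5, 6, 8, 377]


def sd_group(sd):
    """Map a scattering degree to the group names used on the slides."""
    if sd <= 4:
        return "low"
    if sd <= 8:
        return "medium"
    return "high"


median_sd = statistics.median(sample_sd)
mean_sd = statistics.mean(sample_sd)

# Tally how many features fall into each group.
group_counts = {}
for sd in sample_sd:
    group = sd_group(sd)
    group_counts[group] = group_counts.get(group, 0) + 1

print(f"median = {median_sd}, mean = {mean_sd}")
print(group_counts)
```

In this synthetic sample a single extreme value pulls the mean far above the median, so nine of the ten features sit below a mean-based threshold that says little about any of them.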

  41. In summary, …

  42. % of scattered-driver features ~ 18%
    % of globally scattered-driver features ≤ 43%
    SD is not defined by a single absolute value,
    although most features (75%) have SD ≤ 8

  43. 75% of scattered-driver features have SD ≤ 8
    no more than 25% of features
    have SD > 8 (relative limit)

  44. What about possible factors influencing
    the observed scattering?

  45. Platform-driver features: features whose
    drivers support devices that cannot be
    discovered by the CPU
    Infrastructure-driver features: abstractions
    in the OS domain (e.g., ACPI)

  46. The analysis of a random sample
    shows statistically significant results:
    Platform-driver features are 2.5x more
    likely to be globally scattered than
    non-platform ones

  47. In the sample, global scattering of
    platform-driver features occurs
    mostly in the arch subsystem
    Tight relationship between platform-driver
    features and CPU-dependent code
    (hard to modularize)

  48. In general, there is no relationship
    between infrastructure-driver features
    and global-/local-scattering

  49. In general, there is no relationship between
    being a platform-driver or infrastructure-driver
    feature and scattering degree

  50. There is, however, a relationship
    between extreme scattering and
    infrastructure-related driver features

  51. Extreme scattering:
    9/15 are infrastructure

  52. Wrapping up…

  53. In the Linux kernel
    Most driver features are not scattered (~ 82%)
    C-language modularity constructs + the kernel
    plugin-based architecture are “good enough”

  54. In the Linux kernel
    When the existing solutions are not good
    enough, developers scatter features in code

  55. In the Linux kernel
    Scattering seems to respect some limits
    (consciously enforced???)

  56. Next steps

  57. Conduct interviews
    Are the observed limits consciously
    enforced in practice?
    If so, how, and how were they set up?

  58. If not, why do they occur?
    Do they indirectly stem from some
    development practice or process?

  59. Investigate whether the limits we found
    also apply to other systems
    (ongoing collaborative work)

  60. Thanks for listening
    :)
    http://lpassos.bitbucket.org/modularity15/