Slide 1

Slide 1 text

Thorsten Leemhuis Make Linux developers fix your kernel bug

Slide 2

Slide 2 text

intro; sometimes reports on kernel bugs will just fizzle out 😕

Slide 3

Slide 3 text

intro; in rare cases, developers will be unable to fix an issue 😓

Slide 4

Slide 4 text

intro; kernel contains code nobody is really responsible for 😓

Slide 5

Slide 5 text

intro; the Linux kernel is made by volunteers

Slide 6

Slide 6 text

intro; you can't really force volunteers to do work they can't do or don't want to do

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

intro; Linux kernel developers are obliged to fix some issues! 😃

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report!

Slide 12

Slide 12 text

intro; developers will gladly address most issues in their code, 😃 unless life gets in the way :-/

Slide 13

Slide 13 text

intro; then bad bug reports are the first ones developers will let fall through the cracks!

Slide 14

Slide 14 text

intro; developers will gladly address most issues in their code, 😃 unless life gets in the way :-/

Slide 15

Slide 15 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report!

Slide 16

Slide 16 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report!

Slide 17

Slide 17 text

intro; that's how you make most developers fix your bug, if they are able to

Slide 18

Slide 18 text

intro; you'll also learn when you can insist on a fix

Slide 19

Slide 19 text

intro; and how to spot issues unlikely to be fixed

Slide 20

Slide 20 text

[ act 1 ]

Slide 21

Slide 21 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 28

Slide 28 text

decent report; vanilla; most kernels used in the wild are not vanilla; often heavily modified & enhanced

Slide 29

Slide 29 text

decent report; vanilla; makes most distro kernels unsuitable for reporting issues to Linux kernel devs.

Slide 30

Slide 30 text

decent report; vanilla; you might want to report the issue to your Linux distributor

Slide 31

Slide 31 text

decent report; vanilla; or install a vanilla kernel yourself instead, for example a pre-built one

Slide 32

Slide 32 text

decent report; vanilla; or compile a kernel yourself hint: `make olddefconfig localmodconfig` makes things easier and relatively fast

Slide 33

Slide 33 text

decent report; vanilla; check if issue happens with a vanilla kernel, too

Slide 34

Slide 34 text

decent report; vanilla; focus on this kernel in your report, forget the distro's; mentioning the distro's even briefly often just complicates the report unnecessarily

Slide 35

Slide 35 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 38

Slide 38 text

decent report; freshness; test with latest mainline (aka -RC) release

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

decent report; freshness; focus your report on the freshest kernel you tested; mentioning older ones briefly somewhere can be okay, but often just makes the report hard to grasp

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

decent report; freshness; some bugfixes are never backported to stable/longterm kernel series

Slide 45

Slide 45 text

decent report; freshness; makes longterm (LTS) kernels quite unsuitable for reporting

Slide 46

Slide 46 text

decent report; freshness; exception: regressions within a stable or longterm series (something breaks from 5.15.10 -> 5.15.11)

Slide 47

Slide 47 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

decent report; integrity;

Slide 52

Slide 52 text

decent report; integrity;

Slide 53

Slide 53 text

decent report; integrity; Nvidia's proprietary graphics driver

Slide 54

Slide 54 text

decent report; integrity; all out-of-tree drivers are a problem incl. Nvidia's new open kernel driver

Slide 55

Slide 55 text

decent report; integrity; deinstall such drivers, reboot, check if issue still present and recheck the tainted flag!

Slide 56

Slide 56 text

decent report; integrity; many other incidents can taint kernel

Slide 57

Slide 57 text

decent report; integrity; an "Oops" for example

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

decent report; integrity; tainted kernels most of the time unsuitable for reporting bugs

Slide 61

Slide 61 text

decent report; integrity; big exception: the first Oops, warning, etc.

Slide 62

Slide 62 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

Slide 63

Slide 63 text

https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html

Slide 64

Slide 64 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity [continued] d) depict the problem adequately decent report; integrity;

Slide 65

Slide 65 text

decent report; integrity; is your hardware working reliably and as specified? memtest: great idea! overclocking: stupid idea!

Slide 66

Slide 66 text

decent report; integrity; issue with file-system? fsck the volume!

Slide 67

Slide 67 text

decent report; integrity; check `dmesg -H` look out for anything red or bold

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 72

Slide 72 text

No content

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

No content

Slide 75

Slide 75 text

No content

Slide 76

Slide 76 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS

Slide 77

Slide 77 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS

Slide 78

Slide 78 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS

Slide 79

Slide 79 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS

Slide 80

Slide 80 text

decent report; right place; sadly MAINTAINERS contains more than 2000 entries :-/
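
Each MAINTAINERS entry is a block of tagged lines, which makes it easy to sift mechanically. A minimal sketch, assuming the usual one-letter tags (`M:` maintainer, `L:` mailing list, `S:` status, `F:` file pattern); the function names and the sample entry are made up for illustration, and in a kernel tree the script `scripts/get_maintainer.pl` is the tool normally used for this lookup:

```python
def parse_entry(text):
    """Parse one MAINTAINERS-style entry into a title and lists of tag values."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    entry = {"title": lines[0].strip(), "tags": {}}
    for line in lines[1:]:
        # Split at the first colon only, so values may contain colons.
        tag, _, value = line.partition(":")
        entry["tags"].setdefault(tag.strip(), []).append(value.strip())
    return entry


def report_targets(entry):
    """Addresses a report would go to: maintainers (M:) plus lists (L:)."""
    tags = entry["tags"]
    return tags.get("M", []) + tags.get("L", [])


# Hypothetical entry, not copied from the real MAINTAINERS file.
SAMPLE = """
EXAMPLE WIDGET DRIVER
M: Jane Doe <jane@example.org>
L: widgets@lists.example.org
S: Maintained
F: drivers/widget/
"""

entry = parse_entry(SAMPLE)
print(entry["title"], "->", report_targets(entry))
print("status:", entry["tags"]["S"][0])
```

The `S:` line is worth a look before reporting: a status such as `Orphan` hints that nobody is actively looking after the code.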

Slide 81

Slide 81 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 82

Slide 82 text

No content

Slide 83

Slide 83 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 84

Slide 84 text

decent report; depiction; "how to write a good report" worth its own, quite long talk

Slide 85

Slide 85 text

decent report; depiction; a balancing act

Slide 86

Slide 86 text

decent report; depiction; think of it as asking for a favor

Slide 87

Slide 87 text

decent report; depiction; a favor from someone that doesn't have to help you

Slide 88

Slide 88 text

decent report; depiction; a favor from someone that might be stressed or really short on time

Slide 89

Slide 89 text

decent report; depiction; hence, make your depiction easy to grasp for recipients

Slide 90

Slide 90 text

decent report; depiction; describe the problem neither too briefly nor as a novella

Slide 91

Slide 91 text

decent report; depiction; mention version, vanilla, and taint status

Slide 92

Slide 92 text

decent report; depiction; upload & link clearly relevant logs or attach them but *don't* overload the report!

Slide 93

Slide 93 text

decent report; depiction; often relevant: output from `dmesg` & `lspci -nn`; maybe kernel's '.config', too

Slide 94

Slide 94 text

decent report; depiction; add two or three sentences summarizing the situation on top of your depiction

Slide 95

Slide 95 text

decent report; depiction; use an even more condensed and crystal-clear depiction as subject
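
The points from the last few slides can be put together as a plain-text skeleton. This is only a sketch: the function name, the field layout, and all sample values are hypothetical and not a format the kernel developers prescribe.

```python
def build_report(subject, summary, kernel_version, vanilla, tainted, details, links=()):
    """Assemble a plain-text report: subject, short summary on top, key facts, details."""
    lines = [
        f"Subject: {subject}",
        "",
        summary,
        "",
        f"Kernel version: {kernel_version}",
        f"Vanilla kernel: {'yes' if vanilla else 'no'}",
        f"Tainted: {'yes' if tainted else 'no'}",
        "",
        details,
    ]
    if links:
        # Link logs instead of pasting them, to avoid overloading the report.
        lines += ["", "Logs:"] + [f"  {url}" for url in links]
    return "\n".join(lines)


# All values below are invented for illustration.
print(build_report(
    subject="example: widget driver stops responding after resume",
    summary="After resume from suspend the widget stops responding until reboot.",
    kernel_version="6.x-rcN (mainline)",
    vanilla=True,
    tainted=False,
    details="Steps to reproduce: suspend, resume, then use the widget.",
    links=["https://example.org/dmesg.txt"],
))
```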

Slide 96

Slide 96 text

decent report; depiction; in general: don't over-think or overdo your report!

Slide 97

Slide 97 text

decent report; depiction; a short report will often do; getting the basics right (vanilla, fresh version, no taint, easy to grasp, ...) is what's important

Slide 98

Slide 98 text

decent report; depiction; check for existing reports about the problem to join: check lore.kernel.org/all/ and bugzilla.kernel.org

Slide 99

Slide 99 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

Slide 100

Slide 100 text

decent report; depiction; it tells you to check what kind of issue you deal with

Slide 101

Slide 101 text

[ act 2 ]

Slide 102

Slide 102 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions

Slide 103

Slide 103 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue;

Slide 104

Slide 104 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix;

Slide 105

Slide 105 text

https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html

Slide 106

Slide 106 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix;

Slide 107

Slide 107 text

kind of issue; mustfix; devastating; something really really bad data is lost or damaged, hardware is bricked, ...

Slide 108

Slide 108 text

kind of issue; mustfix; devastating; make impact & urgency obvious in your report

Slide 109

Slide 109 text

No content

Slide 110

Slide 110 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix;

Slide 111

Slide 111 text

kind of issue; mustfix; regressions; something breaks when updating the kernel say from 5.15 -> 5.16 or from 5.17.3 -> 5.17.4

Slide 112

Slide 112 text

kind of issue; mustfix; regressions; first rule of Linux kernel development: "we don't cause regressions"

Slide 113

Slide 113 text

No content

Slide 114

Slide 114 text

https://linux-regtracking.leemhuis.info/about/

Slide 115

Slide 115 text

No content

Slide 116

Slide 116 text

No content

Slide 117

Slide 117 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-regressions.html

Slide 118

Slide 118 text

kind of issue; mustfix; regressions; make it obvious your report is about a regression

Slide 119

Slide 119 text

kind of issue; mustfix; regressions; CC or forward the report to [email protected]

Slide 120

Slide 120 text

kind of issue; mustfix; regressions; fine print(1): only userland interfaces matter [it's thus not a regression if your out-of-tree kernel module breaks]

Slide 121

Slide 121 text

kind of issue; mustfix; regressions; fine print(2): the build config of the newer kernel version must be similar to the older one

Slide 122

Slide 122 text

kind of issue; mustfix; regressions; fine print(3): you often will be asked to find the culprit yourself

Slide 123

Slide 123 text

kind of issue; mustfix; regressions; if you find the culprit, a fix is pretty much guaranteed
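
Finding the culprit usually means bisecting: repeatedly testing a kernel halfway between a known-good and a known-bad version. In practice `git bisect` drives this. The sketch below only illustrates the underlying binary search, with a made-up commit list and a stand-in `is_bad` test, assuming the first commit is good, the last one is bad, and every commit after the culprit is bad as well:

```python
def first_bad(commits, is_bad):
    """Return (culprit, tests): the first bad commit and how many kernels were tested.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the culprit is bad too.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1  # one build-and-boot cycle in real life
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history of 1000 commits; commit 637 introduced the bug.
commits = list(range(1000))
culprit, tests = first_bad(commits, lambda c: c >= 637)
print(culprit, tests)
```

The number of test builds grows only with the logarithm of the number of commits, so about ten kernels are enough to single out one change among a thousand.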

Slide 124

Slide 124 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix;

Slide 125

Slide 125 text

No content

Slide 126

Slide 126 text

No content

Slide 127

Slide 127 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored kind of issue;

Slide 128

Slide 128 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits kind of issue; unlikely;

Slide 129

Slide 129 text

kind of issue; unlikely; deficits; Linux contains many incomplete drivers

Slide 130

Slide 130 text

kind of issue; unlikely; deficits; might lack a volunteer with enough time and/or motivation to improve it

Slide 131

Slide 131 text

kind of issue; unlikely; deficits; or some real-world issue prevents improvements

Slide 132

Slide 132 text

kind of issue; unlikely; deficits; check internet and docs for known deficits

Slide 133

Slide 133 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits kind of issue; unlikely;

Slide 134

Slide 134 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits II. code without an active maintainer kind of issue; unlikely;

Slide 135

Slide 135 text

kind of issue; unlikely; w/o maintainer; code often remains, as it is useful for people

Slide 136

Slide 136 text

https://www.kernel.org/doc/html/latest/process/maintainers.html

Slide 137

Slide 137 text

kind of issue; unlikely; w/o maintainer; sending at least a quick brief report definitely a good idea

Slide 138

Slide 138 text

https://www.kernel.org/doc/html/latest/process/maintainers.html

Slide 139

Slide 139 text

kind of issue; unlikely; w/o maintainer; sending at least a quick brief report likely worth it

Slide 140

Slide 140 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored

Slide 141

Slide 141 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored c) all the other issues

Slide 142

Slide 142 text

kind of issue; unlikely; the rest; the quality of your report!

Slide 143

Slide 143 text

[ grand finale ]

Slide 144

Slide 144 text

take this with you

Slide 145

Slide 145 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

Slide 146

Slide 146 text

takeaways; almost all kernel developers are volunteers

Slide 147

Slide 147 text

takeaways; they should act on every bug report, but can and will ignore bad reports

Slide 148

Slide 148 text

takeaways; act accordingly and send a decent report, then you'll be heard

Slide 149

Slide 149 text

takeaways; (1) check what kind of issue you deal with, as it…

Slide 150

Slide 150 text

takeaways; (a) might save you from wasting time on reporting known deficits

Slide 151

Slide 151 text

takeaways; (b) tells you what to expect from developers

Slide 152

Slide 152 text

takeaways; (2) do your homework

Slide 153

Slide 153 text

takeaways; (a) test and report with a *vanilla* kernel

Slide 154

Slide 154 text

takeaways; (b) test with a fresh mainline kernel

Slide 155

Slide 155 text

takeaways; (c) rule out local interferences

Slide 156

Slide 156 text

takeaways; (d) check MAINTAINERS to submit the report to the right place

Slide 157

Slide 157 text

takeaways; (e) write a friendly and decent report easy to grasp for others

Slide 158

Slide 158 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

Slide 159

Slide 159 text

takeaways; chances then are pretty good someone will help you

Slide 160

Slide 160 text

takeaways; and nearly perfect, if you report a bisected regression

Slide 161

Slide 161 text

takeaways; that's how you make the Linux developers fix kernel bugs they are able to fix

Slide 162

Slide 162 text

No content

Slide 163

Slide 163 text

questions?

Slide 164

Slide 164 text

Thorsten Leemhuis mail: [email protected] GPG Key: 0x72B6E6EF4C583D2D social media: @kernellogger, @knurd42, @knurd42rhfc, @thleemhuis and @thleemhuisfoss on #twitter & #friendica #EOF

Slide 165

Slide 165 text

Thorsten Leemhuis Make Linux developers fix your kernel bug * let me start by being fully honest * the title promises a little more than the reality can fulfill [simple reason]

Slide 166

Slide 166 text

intro; sometimes reports on kernel bugs will just fizzle out 😕 * this will always happen [not even the worst news yet]

Slide 167

Slide 167 text

intro; in rare cases, developers will be unable to fix an issue 😓 [one more bad thing]

Slide 168

Slide 168 text

intro; kernel contains code nobody is really responsible for 😓 [there is a simple reason for these three aspects]

Slide 169

Slide 169 text

intro; the Linux kernel is made by volunteers * note: volunteers != hobbyists * some of them hobbyists * most of them employees * but employees from companies contributing voluntarily [thing with them is]

Slide 170

Slide 170 text

intro; you can't really force volunteers to do work they can't do or don't want to do * motivate them * which Linus Torvalds actually does * only works up to a point * risk alienating them * might make them stop contributing * companies might decide to team up and fork [analogy helps understanding this situation]

Slide 171

Slide 171 text

* Linux is a bit like a playground built and maintained completely by volunteers * some of those volunteers are hobbyists that wanted to build something for their kids, to learn new stuff, or enjoy helping * many of those volunteers are actually employees from local or international companies that see some benefit in helping, for example if they have a coffee or gift shop nearby [but the thing is: sooner or later all hobbyists and companies move to something new, as their interests and priorities change over time]

Slide 172

Slide 172 text

* say because kids became adults or companies closed * some volunteers then vanish * others still help when at least kindly asked * often some other volunteer will step in * but you can't force them * luckily there often is no need to * unless some play structure breaks or is found to be dangerous [and that's the same with Linux and the reason]

Slide 173

Slide 173 text

intro; Linux kernel developers are obliged to fix some issues! 😃 * if they don't they will be looked at like this

Slide 174

Slide 174 text

* luckily things seldom break or become dangerous, as software doesn't decay like play structures on a playground ;-) [more good news: developers should fix other issues as well; and most]

Slide 175

Slide 175 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report! * as most feel proud of what they build and want to ensure it works well * thing is:

Slide 176

Slide 176 text

intro; developers will gladly address most issues in their code, 😃 unless life gets in the way :-/ * the particular developer might be short on time * stressed, sick, overwhelmed with reports or the boss forces the developer to focus on other things * thing is: that happens frequently * I guess you all know this from your life [and then…]

Slide 177

Slide 177 text

intro; then bad bug reports are the first ones developers will let fall through the cracks! * happens quite often * most kernel devs have a lot on their plate already * it's in your hands to prevent this fate [so let's reframe...]

Slide 178

Slide 178 text

intro; developers will gladly address most issues in their code, 😃 unless life gets in the way :-/ [as it's more like this]

Slide 179

Slide 179 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report! [emphasis]

Slide 180

Slide 180 text

intro; developers will gladly address most issues in their code, 😃 if you write a decent report! [that's why I'll tell you how to write one]

Slide 181

Slide 181 text

intro; that's how you make most developers fix your bug, if they are able to [in addition…]

Slide 182

Slide 182 text

intro; you'll also learn when you can insist on a fix * in case such a report isn't acted upon

Slide 183

Slide 183 text

intro; and how to spot issues unlikely to be fixed * save yourself trouble [which concludes the intro and brings us…]

Slide 184

Slide 184 text

[ act 1 ]

Slide 185

Slide 185 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately

Slide 186

Slide 186 text

* let me please stretch the playground analogy a bit further * for two reasons [first: the example is too small; think of something bigger]

Slide 187

Slide 187 text

* these days Linux is more like a really big amusement park * even bigger than this * let's call it "Linus land" * no entry fee * doesn't need staff * built, maintained, and constantly improved by volunteers * that's more accurate [the second: the kernel is immaterial]

Slide 188

Slide 188 text

* Linux is thus more like a freely available ebook with instructions on how to build your own "Linus land" * maintained by volunteers * and everyone everywhere can put the book into a gigantic 3D printer to build their own "Linus land" within a few minutes * or update theirs in a few minutes * sorry, a bit far-fetched, but good real-life analogies for software are hard to come by [got that? okay]

Slide 189

Slide 189 text

* you visit some park built from the book * your kid is injured on a water coaster a really good friend from school days designed * you tell your friend, who's living 2000 km away and just had a kid * friend checks the instructions for hours * can't find anything and reluctantly flies over * notices that a few things look slightly different

Slide 190

Slide 190 text

* turns out: the one that built that park modified the book before building the park * bigger water pipes and higher water pressure were used to "improve the performance", which led to the accident * your friend travels home really annoyed: he wasted money and hours and was blamed for an accident that's someone else's fault * you don't want to do that to a friend, do you? * that's why you don't want to do that to the volunteers that make Linux [and that's why you want to…]

Slide 191

Slide 191 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; * vanilla == built from sources as distributed by kernel.org [thing is: most kernels...]

Slide 192

Slide 192 text

decent report; vanilla; most kernels used in the wild are not vanilla; often heavily modified & enhanced * especially the RHEL, SLE, and Ubuntu kernels [such modifications make...]

Slide 193

Slide 193 text

decent report; vanilla; makes most distro kernels unsuitable for reporting issues to Linux kernel devs. * most kernel devs don't care at all about bugs with them * small mods can have a big impact * that's why some devs even reject bugs from distros that use a lightly patched kernel, like Fedora

Slide 194

Slide 194 text

decent report; vanilla; you might want to report the issue to your Linux distributor * warning: most of the time it will be a dead end, as they don't have the resources to deal with all the reports they get [that's why you might…]

Slide 195

Slide 195 text

decent report; vanilla; or install a vanilla kernel yourself instead, for example a pre-built one * pretty easy * available for all the big distros * and a few actually use them directly [there is another option]

Slide 196

Slide 196 text

decent report; vanilla; or compile a kernel yourself; hint: `make olddefconfig localmodconfig` makes things easier and relatively fast * lots of howtos on the net * use those with the mentioned commands [after installing vanilla…]

Slide 197

Slide 197 text

decent report; vanilla; check if issue happens with a vanilla kernel, too

Slide 198

Slide 198 text

decent report; vanilla; focus on this kernel in your report, forget the distro's; mentioning the distro's even briefly often just complicates the report unnecessarily

Slide 199

Slide 199 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; * concludes this point * next one is related

Slide 200

Slide 200 text

* you build a park and complain to your friend about a problem with an attraction designed by the friend * friend checks unsuccessfully and flies over * turns out: you used a two-year-old book with a bug that was eliminated 18 months ago * friend was not aware of the bug as it was caused by the infrastructure used by friend's attraction * or accidentally fixed it with a big redesign or improvement * or fixed it and forgot about it [you don't want to do this to a friend, do you?]

Slide 201

Slide 201 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; * kernel changes a lot all the time * bug might be fixed already [what qualifies as fresh?]

Slide 202

Slide 202 text

decent report; freshness; test with latest mainline (aka -RC) release * every bugfix has to land here first * no, testing RCs isn't dangerous, they are pretty stable * and you have backups, don't you? [find it on kernel.org]

Slide 203

Slide 203 text

* ignore the big yellow field * look at the top of table * pick that version [unless it looks like]

Slide 204

Slide 204 text

* then test the latest stable release [but when an rc-kernel exists…]

Slide 205

Slide 205 text

* you better avoid the latest stable * developers will wonder if the bug was fixed already by someone * already increases the chances your report might be ignored * it's not ideal to use such a kernel, but not totally bad * okay as fallback * definitely better reporting with this than not at all

Slide 206

Slide 206 text

decent report; freshness; focus your report on the freshest kernel you tested; mentioning older ones briefly somewhere can be okay, but often just makes the report hard to grasp [one more thing: don't use a longterm kernel]

Slide 207

Slide 207 text

* not even when a new version was released today [because]

Slide 208

Slide 208 text

decent report; freshness; some bugfixes are never backported to stable/longterm kernel series * sometimes that's simply too risky * quite a few known bugs there

Slide 209

Slide 209 text

decent report; freshness; makes longterm (LTS) kernels quite unsuitable for reporting * still better than not reporting at all * but there is a high risk that your report will not lead to anything * depends on the developer [no rule without...]

Slide 210

Slide 210 text

decent report; freshness; exception: regressions within a stable or longterm series (something breaks from 5.15.10 -> 5.15.11) * then it's okay to test the latest version from that series
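
Whether two versions belong to the same stable or longterm series can be read off the version string: the first two numbers name the series, the third the release within it. A small sketch with hypothetical helper names, assuming plain `X.Y` or `X.Y.Z` version strings (the scheme used since 3.0) without suffixes such as `-rc1`:

```python
def series(version):
    """Return the series of a kernel version, e.g. '5.15.11' -> (5, 15)."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts[:2])


def same_series(old, new):
    """True if both versions come from the same stable/longterm series."""
    return series(old) == series(new)


print(same_series("5.15.10", "5.15.11"))  # True: regression within the 5.15 series
print(same_series("5.15", "5.16"))        # False: regression between mainline releases
```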

Slide 211

Slide 211 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; [another important aspect follows]

Slide 212

Slide 212 text

* accidents happen regularly in your own up-to-date Linus land * say water and roller coasters often stop somewhere along the track * friend can't explain things and flies over * spots a mobile attraction in a corner of your park that you allowed to come by every day and use the park's infra * friend notices the workers of the mobile attraction even modified some water pipes in the park * friend suspects the power grid is unable to handle the extra load * but is not allowed to look closer at the mobile attraction, as owners consider it their "trade secret" [you don't want to annoy friends like that]

Slide 213

Slide 213 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; * IOW: make sure kernel remains vanilla when used and is healthy

Slide 214

Slide 214 text

* kernel can detect some integrity problems itself [check…]

Slide 215

Slide 215 text

decent report; integrity; * this is how it should look [and not like...]

Slide 216

Slide 216 text

decent report; integrity; * everything other than a 0: * kernel likely unsuitable for reporting [one popular thing that can cause a "1" here]

Slide 217

Slide 217 text

decent report; integrity; Nvidia's proprietary graphics driver * uses the kernel in unexpected ways and even changes it * that's why most kernel developers don't care about reports with kernels using this driver [but the thing is]

Slide 218

Slide 218 text

decent report; integrity; all out-of-tree drivers are a problem incl. Nvidia's new open kernel driver * kernel not vanilla anymore * taint number for FOSS drivers just different [that's why you...]

Slide 219

Slide 219 text

decent report; integrity; deinstall such drivers, reboot, check if issue still present and recheck the tainted flag! * then you are free to proceed with reporting [but note, there are]

Slide 220

Slide 220 text

decent report; integrity; many other incidents can taint kernel

Slide 221

Slide 221 text

decent report; integrity; an "Oops" for example * Oops = a critical error that was detected, intercepted, and contained * kernel can continue, but is in an undefined state * which can lead to subsequent faults * and thus is considered unreliable * taint flag indicates that

Slide 222

Slide 222 text

[BTW: the oops shows if kernel is tainted]

Slide 223

Slide 223 text

[in the end: tainted kernels…]

Slide 224

Slide 224 text

decent report; integrity; tainted kernels most of the time unsuitable for reporting bugs [exception]

Slide 225

Slide 225 text

decent report; integrity; big exception: the first Oops, warning, etc. [the kernel's docs explain this in more detail]

Slide 226

Slide 226 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html * reporting issues * section * links to a page dedicated to tainted kernels * which has a script and a table to decode the taint number/flags

Slide 227

Slide 227 text

https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html [there is more about integrity]
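
The taint number is a bit field, which is why an out-of-tree FOSS driver yields a different number than a proprietary one. A minimal sketch of decoding it is shown below. It is not the script from the kernel docs: the mapping deliberately covers only a handful of bits, and the function name is an assumption for this example. For the full and authoritative list, use the table on the page linked above.

```python
# Partial bit-to-reason mapping; deliberately incomplete, see the table in
# tainted-kernels.html for the authoritative and complete list.
KNOWN_TAINT_BITS = {
    0: ("P", "proprietary module was loaded"),
    7: ("D", "kernel died recently (Oops or BUG)"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value: int) -> list[str]:
    """List one human-readable line per bit that is set in the taint value."""
    reasons = []
    bit = 0
    while value >> bit:  # stop once no higher bits are set
        if value & (1 << bit):
            letter, reason = KNOWN_TAINT_BITS.get(
                bit, ("?", "not covered by this sketch, check the docs")
            )
            reasons.append(f"bit {bit} ({letter}): {reason}")
        bit += 1
    return reasons


if __name__ == "__main__":
    # 4097 = 4096 + 1: an out-of-tree module that is also proprietary.
    for line in decode_taint(4097):
        print(line)
```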

Slide 228

Slide 228 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity [continued] d) depict the problem adequately decent report; integrity; * there are things the kernel can't detect * that's why you had better think about a few other things as well

Slide 229

Slide 229 text

decent report; integrity; is your hardware working reliably and as specified? memtest: great idea! overclocking: stupid idea!

Slide 230

Slide 230 text

decent report; integrity; issue with file-system? fsck the volume!

Slide 231

Slide 231 text

decent report; integrity; check `dmesg -H` look out for anything red or bold [looks like this]

Slide 232

Slide 232 text

* it might tell you what's wrong * might give you an error msg to google * and save everyone a lot of time
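
Looking for "anything red or bold" roughly means looking for messages at warning level or worse. A small sketch of doing the same on text output: it assumes lines in the format printed by `dmesg --raw`, where each line starts with a priority such as `<3>`, and that the lower three bits of that number are the syslog level (0 = emergency ... 4 = warning ... 7 = debug). The function name and the sample messages are made up.

```python
import re

# A raw kernel log line starts with "<N>", N being the priority.
RAW_LINE = re.compile(r"^<(\d+)>(.*)$")


def suspicious_lines(raw_lines, max_level=4):
    """Return (level, message) pairs for lines at warning level or worse."""
    hits = []
    for line in raw_lines:
        match = RAW_LINE.match(line)
        if not match:
            continue  # not in the expected raw format, skip it
        level = int(match.group(1)) & 7  # lower three bits hold the level
        if level <= max_level:
            hits.append((level, match.group(2)))
    return hits


if __name__ == "__main__":
    sample = [
        "<6>[    0.000000] hypothetical informational message",
        "<3>[   12.345678] hypothetical error message",
        "<4>[   13.000000] hypothetical warning message",
    ]
    for level, message in suspicious_lines(sample):
        print(level, message)
```

On a real system the input could come from running `dmesg --raw`; reading the kernel log may require root privileges, depending on the distribution's settings.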

Slide 233

Slide 233 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 234

Slide 234 text

* you have a valid problem but only mention it at a school reunion where the friend later got pretty drunk and headed off with a love interest from the school days * or you reported it via a chat, message board, or forum on a website you know your friend used to visit when you both were young * even after months or years, your friend didn't do anything to fix the problem * that's your fault, as your friend might not visit the website anymore * and maybe someone else is responsible anyway these days

Slide 235

Slide 235 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; * web-forums definitely won't work * distro's bug tracker often a dead end as well [sadly, most of the time bugzilla.kernel.org is the wrong place, too]

Slide 236

Slide 236 text

* might look like the central bug tracker

Slide 237

Slide 237 text

* but it's not, which you learn when you follow that link

Slide 238

Slide 238 text

bugzilla situation: it's complicated * set up by some people who thought it was a good idea * some devs liked it and started using it * but many (most?) devs never liked the idea * didn't really fit into the email-based workflow * the idea was to have volunteers as go-between for such subsystems/maintainers * that never really worked out * that's why even today a lot of reports never reach the responsible developers (and are thus ignored) * that's why bugzilla.kernel.org often is a bad idea [instead do, what...]

Slide 239

Slide 239 text

* used by developers to find each other's contact addresses

Slide 240

Slide 240 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS [most of the entries]

Slide 241

Slide 241 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS * mention the maintainer and the list * just send your report by mail there * always CC the lists! * most prefer this way and it should always work * some subsystems use a bug-tracker * MAINTAINERS file mentions those few that do
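
To illustrate how a MAINTAINERS entry translates into "where to send the report", here is a rough sketch. It assumes the usual entry layout: a title line followed by tagged lines such as `M:` (maintainer), `L:` (mailing list), `S:` (status), and `B:` (bug tracker, only present for the few subsystems that use one). The entry itself, including all names and addresses, is entirely hypothetical.

```python
# Hypothetical entry; the real file separates tag and value with a tab.
SAMPLE_ENTRY = """\
EXAMPLE WIDGET DRIVER
M:\tJane Doe <jane@example.org>
L:\tlinux-widgets@lists.example.org
S:\tMaintained
F:\tdrivers/widget/
"""


def parse_entry(block: str) -> dict:
    """Split one MAINTAINERS entry into its title and tagged values."""
    lines = [ln for ln in block.strip().splitlines() if ln.strip()]
    entry = {"title": lines[0].strip()}
    for ln in lines[1:]:
        tag, _, value = ln.partition(":")  # split at the first colon only
        entry.setdefault(tag.strip(), []).append(value.strip())
    return entry


def report_recipients(entry: dict) -> dict:
    """Maintainers go into To, lists into CC; a tracker is optional."""
    return {
        "to": entry.get("M", []),
        "cc": entry.get("L", []),
        "tracker": entry.get("B", []),
    }


if __name__ == "__main__":
    print(report_recipients(parse_entry(SAMPLE_ENTRY)))
```

An empty "tracker" list is the common case and means: report by mail, with the list in CC.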

Slide 242

Slide 242 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS * mainly ACPI, PCI, and PM * about 20 out of more than 2000 entries [there are also a few that use other bug-trackers]

Slide 243

Slide 243 text

https://www.kernel.org/doc/html/latest/process/maintainers.html https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/MAINTAINERS * graphics drivers for AMD, Intel, etc.

Slide 244

Slide 244 text

decent report; right place; sadly MAINTAINERS contains more than 2000 entries :-/ * why are things so complicated, also with regard to bugzilla.kernel.org? * not by design, just happened over time * and no volunteer in sight to bring order into this

Slide 245

Slide 245 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report; [brings us to the last point]

Slide 246

Slide 246 text

* imagine your friend showing you a bug report from someone you both went to school with, but never really liked * you read the subject and first para of the report and don't get the slightest idea what this is all about * and the whole text is confusing and full of unnecessary or distracting details * and has five attachments and ten links * and is written in an unfriendly, demanding, and bearish way * reminder, friend can ignore this without consequences * would you suggest to do that?

Slide 247

Slide 247 text

1. create a decent report a) ensure your kernel is vanilla b) ensure your kernel is fresh c) ensure your kernel's and system's integrity d) submit your report to the right place e) depict the problem adequately decent report;

Slide 248

Slide 248 text

decent report; depiction; "how to write a good report" worth its own, quite long talk

Slide 249

Slide 249 text

decent report; depiction; a balancing act

Slide 250

Slide 250 text

decent report; depiction; think of it as asking for a favor

Slide 251

Slide 251 text

decent report; depiction; a favor from someone that doesn't have to help you

Slide 252

Slide 252 text

decent report; depiction; a favor from someone that might be stressed or really short on time

Slide 253

Slide 253 text

decent report; depiction; hence, make your depiction easy to grasp for recipients

Slide 254

Slide 254 text

decent report; depiction; describe the problem neither too briefly nor as a novella

Slide 255

Slide 255 text

decent report; depiction; mention version, vanilla, and taint status * avoids doubts * mention environment (distro, hw if relevant)

Slide 256

Slide 256 text

decent report; depiction; upload & link clearly relevant logs or attach them but *don't* overload the report! * if something missing, developer will ask for it

Slide 257

Slide 257 text

decent report; depiction; often relevant: output from `dmesg` & `lspci -nn`; maybe kernel's '.config', too * depends on the issue [then]

Slide 258

Slide 258 text

decent report; depiction; add two or three sentences summarizing the situation on top of your depiction * really important to get right * developers get a lot of reports * many will stop reading after first para [some don't even get that far]

Slide 259

Slide 259 text

decent report; depiction; use an even more condensed and crystal-clear depiction as subject * also really important to get right * most people will only read that
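
The pieces mentioned on the last few slides (subject, short summary on top, version/vanilla/taint status, environment, then the actual depiction and a few links to logs) can be thought of as a simple template. The sketch below is just one possible way to arrange them; the class, its field names, and the layout are assumptions for illustration, not a format the kernel developers prescribe.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    subject: str          # condensed, crystal-clear one-liner
    summary: str          # two or three sentences, goes on top
    kernel_version: str
    vanilla: bool
    taint_value: int
    environment: str      # distro, hardware if relevant
    details: str          # the actual depiction
    log_links: list = field(default_factory=list)

    def render(self) -> str:
        """Assemble the mail text with the most important parts first."""
        flavor = "vanilla" if self.vanilla else "NOT vanilla"
        parts = [
            f"Subject: {self.subject}",
            "",
            self.summary,
            "",
            f"Kernel: {self.kernel_version} ({flavor}), "
            f"taint value: {self.taint_value}",
            f"Environment: {self.environment}",
            "",
            self.details,
        ]
        if self.log_links:
            parts += ["", "Logs:"] + [f"  {link}" for link in self.log_links]
        return "\n".join(parts)
```

A reader who stops after the subject or the first paragraph still gets the gist; everything else is there for those who read on.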

Slide 260

Slide 260 text

decent report; depiction; in general: don't over-think or overdo your report!

Slide 261

Slide 261 text

decent report; depiction; short report will often do * getting the basics right (vanilla, fresh version, no taint, easy to grasp, ...) is important [ohh, and remember]

Slide 262

Slide 262 text

decent report; depiction; check for existing reports about the problem to join check , lore.kernel.org/all/, and bugzilla.kernel.org * You might wonder: * shouldn't I have done this earlier? * correct, that's why you should not do things in the order described by this talk [instead]

Slide 263

Slide 263 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html * tries to catch local problems early * reference section providing details when you need them [document also tells you]

Slide 264

Slide 264 text

decent report; depiction; it tells you to check what kind of issue you deal with * some require a few additional steps * there is another reason why you want to do that * it determines what you can expect * which is kinda important, too [which brings us to]

Slide 265

Slide 265 text

[ act 2 ]

Slide 266

Slide 266 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions [one kind are those]

Slide 267

Slide 267 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; * and there are three of those [the first]

Slide 268

Slide 268 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix; * will only happen to a few of you * but if it does, follow the reporting issues document [which will point you]

Slide 269

Slide 269 text

https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html [the second]

Slide 270

Slide 270 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix; * not "the paint is off" somewhere [something…]

Slide 271

Slide 271 text

kind of issue; mustfix; devastating; something really, really bad: data is lost or damaged, hardware is bricked, ... [luckily even more rare]

Slide 272

Slide 272 text

kind of issue; mustfix; devastating; make impact & urgency obvious in your report [and in case it's not quickly acted upon, get Linus in the loop]

Slide 273

Slide 273 text

[which brings us to the third, more common type]

Slide 274

Slide 274 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix; [a regression is...]

Slide 275

Slide 275 text

kind of issue; mustfix; regressions; something breaks when updating the kernel say from 5.15 -> 5.16 or from 5.17.3 -> 5.17.4 [not allowed in Linux]

Slide 276

Slide 276 text

kind of issue; mustfix; regressions; first rule of Linux kernel development: "we don't cause regressions" * coined and enforced by Linus * wants to take the fear out of updating * they nevertheless happen frequently :-/ * sadly some of the reports even fall through the cracks :-/ * that's why I volunteered as the kernel's regression tracker [and built a bot]

Slide 277

Slide 277 text

* ugly, isn't it? :-/ [funding from EU]

Slide 278

Slide 278 text

https://linux-regtracking.leemhuis.info/about/ * many thx for that * project ended

Slide 279

Slide 279 text

No content

Slide 280

Slide 280 text

* could talk about regressions and tracking them for hours * no time for it [once again there is a document]

Slide 281

Slide 281 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-regressions.html * new in 5.18 * mentioning everything important [that among others includes]

Slide 282

Slide 282 text

kind of issue; mustfix; regressions; make it obvious your report is about a regression

Slide 283

Slide 283 text

kind of issue; mustfix; regressions; CC or forward the report to [email protected] [note, there is some…]

Slide 284

Slide 284 text

kind of issue; mustfix; regressions; fine print(1): only userland interfaces matter [it's thus not a regression if your out-of-tree kernel module breaks] * these modules use kernel-internal interfaces

Slide 285

Slide 285 text

kind of issue; mustfix; regressions; fine print(2): the build config of the newer kernel version must be similar to the older one * otherwise optional new features might interfere * say a new security technique blocking something a few very rare apps need * the doc I mentioned explains how to achieve that
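
To get a feeling for whether two build configurations are "similar", one can compare the `.config` files of the old and the new kernel. The following is only a sketch for spotting differences, not the procedure from the kernel docs; it assumes the usual `.config` syntax (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`), and the option names in the example are made up.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text: str) -> dict:
    """Map option names to their value; explicitly disabled ones become 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def config_diff(old_text: str, new_text: str) -> dict:
    """Return {option: (old, new)} for every option whose value differs.

    None means the option does not show up in that file at all, which is
    typical for options introduced by the newer kernel version.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    old_cfg = "CONFIG_EXAMPLE_A=y\n# CONFIG_EXAMPLE_B is not set\n"
    new_cfg = "CONFIG_EXAMPLE_A=y\nCONFIG_EXAMPLE_B=y\nCONFIG_EXAMPLE_C=m\n"
    for name, (before, after) in config_diff(old_cfg, new_cfg).items():
        print(f"{name}: {before} -> {after}")
```

Options that appear only in the newer config are the likely candidates for "optional new features" that might interfere.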

Slide 286

Slide 286 text

kind of issue; mustfix; regressions; fine print(3): you often will be asked to find the culprit yourself * many bugs only happen in a certain environment * that's why the change that causes the problem often needs to be found by the reporter * the aforementioned doc explains how to do that with a bisection using "git bisect" * sounds hard, but might only take an hour or two * initial report without this is okay, as problem might be known already [and the good thing is]
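
Why a bisection is less work than it sounds: each step halves the range of suspect changes, so the number of kernels to build and test grows only with the logarithm of the number of commits. A small sketch, assuming release tags of the form `v5.15` and using a purely illustrative commit count (the figure is not from the talk); the helper names are made up, and the estimate is an approximation, as git's own step count can differ slightly.

```python
import math


def bisect_commands(good: str, bad: str) -> list[str]:
    """Commands that start a bisection between a working and a broken release."""
    return [
        "git bisect start",
        f"git bisect good v{good}",
        f"git bisect bad v{bad}",
    ]


def estimated_steps(commit_count: int) -> int:
    """Rough number of build-and-test rounds for a range of commits."""
    if commit_count <= 1:
        return 0
    return math.ceil(math.log2(commit_count))


if __name__ == "__main__":
    for command in bisect_commands("5.15", "5.16"):
        print(command)
    # Illustrative assumption: about 15000 commits between the two releases.
    print("rounds needed, roughly:", estimated_steps(15000))
```

After each round, the tester marks the checked-out kernel with `git bisect good` or `git bisect bad` until git names the first bad commit.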

Slide 287

Slide 287 text

kind of issue; mustfix; regressions; if you find the culprit, a fix is pretty much guaranteed * and the responsible volunteer and subsystem will be known * might be possible to revert it * tell me if that doesn't work out * that's why you really want to do that in case you face a regression!

Slide 288

Slide 288 text

2. the kind of issue at hand a) issues someone is obliged to address I. security vulnerabilities II. devastating bugs III. regressions kind of issue; mustfix; * enough about regressions and issues that have to be fixed * just one more thing you might be wondering about * who will take care of fixing such bugs * for regressions it's the author of the culprit * if MIA and for everything else it's the [maintainer]

Slide 289

Slide 289 text

[and sometimes this person]

Slide 290

Slide 290 text

No content

Slide 291

Slide 291 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored kind of issue;

Slide 292

Slide 292 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits kind of issue; unlikely;

Slide 293

Slide 293 text

kind of issue; unlikely; deficits; Linux contains many incomplete drivers * a basic, incomplete driver is way better than none at all [sometimes these drivers are never improved, if…]

Slide 294

Slide 294 text

kind of issue; unlikely; deficits; might lack a volunteer with enough time and/or motivation to improve it [second reason for known deficits]

Slide 295

Slide 295 text

kind of issue; unlikely; deficits; or some real-world issue prevents improvements * example: Nouveau * docs scarce * firmware prevents using the full capabilities of the hardware * what do these known deficits mean for your report? [if it looks like a missing feature]

Slide 296

Slide 296 text

kind of issue; unlikely; deficits; check internet and docs for known deficits * prevents wasting your time on preparing a report * if in doubt, send a quick "is this known" before writing a proper and lengthy report

Slide 297

Slide 297 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits kind of issue; unlikely; [another reason why some bugs are ignored]

Slide 298

Slide 298 text

2. the kind of issue at hand b) issues most likely to be ignored I. known deficits II. code without an active maintainer kind of issue; unlikely; * Linux contains quite a bit of such code [and it remains]

Slide 299

Slide 299 text

kind of issue; unlikely; w/o maintainer; code often remains, as it is useful for people * removing it would cause a regression, too * "no regression rule" should ensure nothing breaks * if people like you and me test and report problems [two different kinds of unmaintained code]

Slide 300

Slide 300 text

https://www.kernel.org/doc/html/latest/process/maintainers.html * nearly orphaned, but not fully

Slide 301

Slide 301 text

kind of issue; unlikely; w/o maintainer; sending at least a quick brief report definitely a good idea * the "odd fixer" or someone else might take care of it [there is also fully orphaned code]

Slide 302

Slide 302 text

https://www.kernel.org/doc/html/latest/process/maintainers.html

Slide 303

Slide 303 text

kind of issue; unlikely; w/o maintainer; sending at least a quick brief report likely worth it * maybe you find others affected and can team up with them
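
The two kinds of unmaintained code can be told apart by the status line (`S:`) of the corresponding MAINTAINERS entry. A tiny sketch of turning that status into the advice given on these slides; it assumes the status values "Odd Fixes" (nearly orphaned) and "Orphan" (fully orphaned) as used in the MAINTAINERS file, while the function name and the wording of the advice are made up for this example.

```python
# Advice for the two statuses discussed here; other statuses fall through
# to the default, i.e. the regular reporting procedure.
ADVICE = {
    "Odd Fixes": "send at least a quick, brief report: "
                 "the odd fixer or someone else might take care of it",
    "Orphan": "a quick, brief report is likely worth it: "
              "you might find others affected and team up with them",
}
DEFAULT_ADVICE = "report to the maintainer and CC the list as usual"


def advice_for(status: str) -> str:
    """Map the value of an entry's S: line to a reporting suggestion."""
    return ADVICE.get(status.strip(), DEFAULT_ADVICE)


if __name__ == "__main__":
    for status in ("Maintained", "Odd Fixes", "Orphan"):
        print(f"{status}: {advice_for(status)}")
```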

Slide 304

Slide 304 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored * this concludes this section [leaves the big and wibbly-wobbly area in between those two extremes]

Slide 305

Slide 305 text

2. the kind of issue at hand a) issues someone is obliged to address b) issues most likely to be ignored c) all the other issues [what matters here is quickly explained, as we discussed this in act 1 already]

Slide 306

Slide 306 text

kind of issue; unlikely; the rest; the quality of your report! [which brings us…]

Slide 307

Slide 307 text

[ grand finale ] * summary

Slide 308

Slide 308 text

take this with you

Slide 309

Slide 309 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html * looks a bit scary, but it is not * tries to catch local problems early * it's in your own interest to follow the steps [to understand why things are as they are, always keep in mind]

Slide 310

Slide 310 text

takeaways; almost all kernel developers are volunteers

Slide 311

Slide 311 text

takeaways; they should act on every bug report, but can and will ignore bad reports

Slide 312

Slide 312 text

takeaways; act accordingly and send a decent report, then you'll be heard [to do that, you]

Slide 313

Slide 313 text

takeaways; (1) check what kind of issue you deal with, as it…

Slide 314

Slide 314 text

takeaways; (a) might save you from wasting time on reporting known deficits

Slide 315

Slide 315 text

takeaways; (b) tells you what to expect from developers [in addition to that]

Slide 316

Slide 316 text

takeaways; (2) do your homework

Slide 317

Slide 317 text

takeaways; (a) test and report with a *vanilla* kernel

Slide 318

Slide 318 text

takeaways; (b) test with a fresh mainline kernel

Slide 319

Slide 319 text

takeaways; (c) rule out local interferences

Slide 320

Slide 320 text

takeaways; (d) check MAINTAINERS to submit the report to the right place

Slide 321

Slide 321 text

takeaways; (e) write a friendly and decent report that is easy to grasp for others

Slide 322

Slide 322 text

https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html * reporting issues takes care of this

Slide 323

Slide 323 text

takeaways; chances then are pretty good someone will help you

Slide 324

Slide 324 text

takeaways; and nearly perfect, if you report a bisected regression [and in the end that is]

Slide 325

Slide 325 text

takeaways; that's how you make the Linux developers fix kernel bugs they are able to fix [which is in everybody's interest and makes everyone happy]

Slide 326

Slide 326 text

No content

Slide 327

Slide 327 text

questions?

Slide 328

Slide 328 text

Thorsten Leemhuis mail: [email protected] GPG Key: 0x72B6E6EF4C583D2D social media: @kernellogger, @knurd42, @knurd42rhfc, @thleemhuis and @thleemhuisfoss on #twitter & #friendica #EOF