For Computer Science 202 "Introduction to Computing," summer 2021. Includes an extended discussion of learning analytics and Canvas tracking.
The surveillance-industrial complex: tracking, ML/AI, Big Data
Knowledge is power!
• I actually hate that cliché, being in the knowledge business.
Nothing’s that simple.
• But I do want to say out loud that knowledge about you
translates into power over you.
• And the other way around, too! It is NOT COINCIDENCE that areas with a lot of
people of color in them are heavily oversurveilled in the US! It’s a power play!
• It’s also not coincidence that you students are more heavily surveilled by UW-
Madison than I am as an employee!
• UW-Madison is more scared of me than it is of you, even if only a little bit. I have
certain protections enshrined in campus policy. You… don’t have nearly as many.
AI/ML, in a nutshell
• AI: “artificial intelligence.” An unrealistic pipe dream.
• Even humans don’t know exactly how humans think. So we’re gonna train a
computer to do it? Yeah… no, not really. It’s been tried; it’s always failed.
• In certain sharply limited situations, it can be made to work. Kind of.
• ML: “machine learning.” A set of computational and mathematical/statistical techniques that enable computers to find (“model”), report, and act on patterns in the data used to train them… and sometimes (only sometimes!!!) in data similar to the training data.
• Limitations in the training data = limitations in pattern detection capacity
• Bias in the training data = bias in the model
• Choose the wrong technique for your data? Garbage in, garbage out.
• They’re computers, not magic wands!
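• Want to see “bias in, bias out” in action? Here’s a tiny Python sketch — the data, the groups, and the “model” are all invented for illustration, and real ML is far fancier, but the failure mode is exactly this:

```python
# A deliberately simple "model": it memorizes, per applicant group, the
# most common outcome in its (hypothetical, biased) training data.
from collections import Counter

# Invented training data: (applicant_group, outcome).
# Group "A" was historically favored by human decision-makers.
training_data = [
    ("A", "hire"), ("A", "hire"), ("A", "hire"), ("A", "no-hire"),
    ("B", "no-hire"), ("B", "no-hire"), ("B", "no-hire"), ("B", "hire"),
]

def train(rows):
    """Learn the most frequent outcome seen for each group."""
    by_group = {}
    for group, outcome in rows:
        by_group.setdefault(group, Counter())[outcome] += 1
    return {g: c.most_common(1)[0][0] for g, c in by_group.items()}

model = train(training_data)
print(model["A"])  # 'hire'
print(model["B"])  # 'no-hire' -- the historical bias, faithfully "learned"
```

• Notice there’s no malice anywhere in that code. The bias rode in on the data.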
With that in mind…
• I’m going to toss y’all into breakout rooms again.
• In your rooms, make two lists:
• One list of information about students’ homework practices that you think it’s reasonable
for your instructors (good ones AND bad ones) to know.
• One list of information about students’ homework practices that you think is just plain
none of our business.
• “Homework practices” include (but are not limited to):
• When and how long you do homework
• Where you are (in realspace) and whom you’re with when you do homework
• What you read and do online (e.g. your web searches, clicks, browser URLs, online library
visits, etc) while you’re working on homework
• What you read and do offline (e.g. reading print books, highlighting, writing things on paper, going to the library) while you’re working on homework
• Mistakes you make and then correct on your homework
• NO RIGHT/WRONG ANSWERS! Just what you think.
• Personally Identifiable Information (PII). Anything that uniquely (or close to…) identifies you, such as:
• Your name
• ID numbers (SSN, driver’s license number, student ID number, passport number, …)
• Sensitive (combinations of) personal information are
sometimes included, such as:
• race/ethnicity, gender, birth date
• In the US, PII tends to be more protected, legally, than other kinds of information about you.
“Free” online services
• It is NOT FREE. You’re paying with your data, the analysis and sale of which can and do make money.
• Data you give the service about yourself (often PII)
• Data the service finds out about you by observing your behavior as you interact with it
• Data the service finds out about you by observing your behavior elsewhere online, through ad networks and/or other online-behavior tracking
• Data the service can infer (figure out) from your behavior (or, in especially creepy cases, by trying to manipulate your behavior)
• Data the service can buy about you from other services, from ISPs and
cell-service providers, or from data clearinghouses called data brokers (which
in turn are happy to buy data about you from services you use…)
• When you hear “Big Data” in the media, usually they mean
huge aggregations of Data About Individual People.
• Just to be perfectly clear: it doesn’t HAVE to be. Weather forecasting uses Big
Data, for example, but it’s not data about people.
• So does physics, biomedicine, economic modeling/forecasting…
• When you hear “AI” or “machine learning” in the media,
though, they’re usually talking about analyzing Big Data
About People with computers to look for patterns.
• Computers are quite a bit better at sifting through large piles of data looking for
patterns than human beings are.
• What they can’t do is figure out why the pattern is the way it is (as we saw with patterns of bias) or the implications of acting on the pattern.
• One variety of analysis for Big Data about people is often called “inference.”
• Once a computer finds patterns in Big Data about people, you can infer things about the people that they didn’t actually tell you and might not even want you knowing.
• Real-world inference based on data collected online has included: gender, race/
ethnicity, biological-family relationships (including ones unknown to the people
involved), romantic relationships, age, country of origin, sexuality, religion,
marital status, income, debt level, political preferences/beliefs, educational level,
(lots of variations on) health status (including mental health), veteran status,
pregnancy status, gullibility…
• Again, computers detect patterns humans can’t.
• So we usually can’t know exactly what it is in the data collected about us that will
give away something we don’t want known (or don’t want used against us). WE
CANNOT KNOW. And at present we have no realistic way to challenge the
conclusions or stop the inferencing.
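• Here’s inference in miniature — a Python sketch where every purchase, label, and learned “signal” is made up, but the mechanic is real: patterns in what you did reveal what you never said.

```python
# Toy "inference": predict an attribute a person never disclosed from
# patterns in data they did generate. All data here is invented.
from collections import Counter

# Hypothetical training data: (items_bought, disclosed_attribute)
history = [
    ({"unscented lotion", "prenatal vitamins"}, "pregnant"),
    ({"unscented lotion", "cotton balls"},      "pregnant"),
    ({"beer", "chips"},                         "not pregnant"),
    ({"chips", "cotton balls"},                 "not pregnant"),
]

# "Learn" which items correlate with the attribute (crude frequency count).
signal = Counter()
for items, label in history:
    for item in items:
        signal[item] += 1 if label == "pregnant" else -1

def infer(items):
    """Guess the undisclosed attribute from purchases alone."""
    score = sum(signal[i] for i in items if i in signal)
    return "pregnant" if score > 0 else "not pregnant"

# A new customer who disclosed nothing:
print(infer({"unscented lotion", "prenatal vitamins", "chips"}))  # pregnant
```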
Who collects Big Data?
• Online advertising relies on data collection.
• “Real-time ad bidding”: 1) You visit a website. 2) The website figures out who you are and what it already knows about you, and 3) sends that (plus new info about you) to an ad network asking “what ad should I show this person?” (A toy sketch of this flow follows this list.)
• QUITE PERSONAL INFORMATION, e.g. race/ethnicity, sexuality, health status, MAY TRAVEL
BETWEEN THE PARTIES HERE.
• Online media (including journalism) relies on online advertising
as well as other kinds/uses of surveillance.
• Social media, mobile apps, and companies called “data brokers”
rely on data collection, analysis, exploitation, and (re)sale.
• Increasingly, workplaces, schools (K-12 through higher ed), and
whole societies are surveilling their employees/students/citizens!
• In some cases, theoretically to “serve them better.” Some, “to justify our existence.” Some,
it’s plain old naked authoritarian control.
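• Here’s the ad-bidding flow from above as a toy Python sketch. Real exchanges use standardized protocols (OpenRTB and friends); the site name, profile fields, and bidding logic below are all invented:

```python
import json

# Step 2: the site (or its ad-tech partner) assembles what it knows.
# Everything here is an invented example.
visitor_profile = {
    "device_id": "ad-id-1234",                 # device identifier, not legally PII
    "inferred_interests": ["fitness", "loans"],
    "inferred_income_bracket": "low",          # quite personal, yet it travels
}

# Step 3: ship that profile to an ad network as a bid request.
bid_request = json.dumps({"page": "news-site.example", "user": visitor_profile})

def run_auction(request_json):
    """Toy ad network: advertisers bid based on what the profile reveals."""
    user = json.loads(request_json)["user"]
    bids = []
    if "loans" in user["inferred_interests"]:
        bids.append(("payday-loan-ad", 0.85))  # targets inferred hardship
    if "fitness" in user["inferred_interests"]:
        bids.append(("gym-ad", 0.40))
    return max(bids, key=lambda bid: bid[1])   # highest bid wins the slot

print(run_auction(bid_request))                # ('payday-loan-ad', 0.85)
```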
“So Big Data is all online?”
• Thanks for asking that! … No.
• Your physical location over time is heavily tracked.
• This happens substantially through your phone… but also via camera and voice
surveillance, license-plate surveillance, credit-card and other ID records, etc.
• Interactions you have in the physical world also leave traces
• Non-cash purchases, of course, but also…
• … through facial, gait, and voice recognition technologies.
• “Brick-and-mortar” stores are actively researching how to
track you more.
Device identifiers
• All your devices have them.
• Network-card identifiers: MAC addresses (though these are less useful for surveillance than formerly)
• Advertising identifiers on mobile devices, specifically for ad tracking but used in many other surveillance contexts as well
• Phone numbers
• Serial numbers
• Since most devices are one-owner, if they’ve got a device identifier, THEY PRETTY MUCH KNOW IT’S YOU.
• Weasel trick: don’t collect the stuff that counts as PII, but DO collect device identifiers.
• July 2021: revelation of the existence of companies that exist to match device identifiers to their owners’ identity
• Fingerprinting: combos of settings that make your device (or web browser) unique
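• How does fingerprinting work? Roughly like this Python sketch (the attribute values are invented, and real fingerprinters read dozens more signals, like canvas-rendering quirks):

```python
import hashlib

# Settings any website can read without asking. Each one is common;
# the COMBINATION is rare. (Values here are invented examples.)
attributes = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/89.0",
    "screen": "1920x1080x24",
    "timezone": "America/Chicago",
    "language": "en-US",
    "fonts": "Arial,DejaVu Sans,Noto Color Emoji",
}

# No single value identifies you; hashing the combination yields a
# stable pseudo-identifier for your device or browser.
fingerprint = hashlib.sha256(
    "|".join(f"{k}={v}" for k, v in sorted(attributes.items())).encode()
).hexdigest()

print(fingerprint[:16])  # follows you across sites, no cookie required
```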
• Fake cell phone towers
• Cell phones have to connect to towers to work at all.
• Exist to collect phone identifiers and locations, undetectably to phone users
• In use by many US local, state, and federal law-enforcement agencies
• Stores and some college/university campuses considering
using these (and similar follow-you-via-your-phone tech).
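• Why do fake towers work at all? Because phones are built to attach to the strongest signal they can hear and hand over an identifier, no questions asked. A toy Python simulation (classes and numbers all invented):

```python
# Toy model of why fake towers work: phones attach to the strongest
# tower automatically, and the tower never proves it's legitimate.

class Tower:
    def __init__(self, signal_strength, fake=False):
        self.signal_strength, self.fake = signal_strength, fake
        self.seen = []  # a fake tower's whole purpose: log who came by
    def register(self, imsi):
        self.seen.append(imsi)
        return "connected"

class Phone:
    def __init__(self, imsi):
        self.imsi = imsi  # the identifier the network uses for this phone
    def attach_to_strongest(self, towers):
        tower = max(towers, key=lambda t: t.signal_strength)
        return tower.register(self.imsi)

real = Tower(signal_strength=3)
stingray = Tower(signal_strength=9, fake=True)  # broadcasts louder on purpose

Phone("imsi-0001").attach_to_strongest([real, stingray])
print(stingray.seen)  # ['imsi-0001'] -- collected, user none the wiser
```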
“Wait, don’t they have to ask permission?”
• Sort of, some places. Not in the US.
• In the US, this is mostly governed through contract law,
which online means those Terms of Service and Privacy
Policy things you never read.
• Don’t lie; I know you don’t read them. There isn’t enough time in the universe to
read them all! Which is part of the problem here!
• US law is perfectly happy to let you agree to ToSes and PPs that are terrible for
you. You’re an adult, you are supposed to have read it! If you didn’t and you’re
hosed, OH WELL.
Notice and consent
• Notice and consent: A common legal maneuver (particularly for US-based online businesses) in which privacy concerns are considered satisfied if:
• the business notifies its customers how their data will be collected and used
• and has them consent to it somehow (e.g. via clickthrough).
• It doesn’t work; it gives people false trust.
• Impenetrable legalese allowed! Vague weasel-wording allowed! Changes without notification allowed! That is not communication.
• Misleading language is allowed in the gaining-consent process. Privacy policies can absolutely say “we share and/or sell your data” and most of them do!
“But it’s all anonymized,
so no big deal, right?”
• Wrong. Given enough data, removal of PII is meaningless.
Big Data knows it’s you.
• Remember those device identifiers? Yeah. Those. They don’t count as PII, legally.
• Pet peeve of mine: removal of PII is not “anonymization,” but deidentification.
• Anonymization is “ensuring that no one in the data can be identified from it”— and most security researchers believe it to be impossible.
• Reidentification is “determining someone’s identity from deidentified data.” There are several ways to do it, and it’s often not even hard.
• Common weasel phrase in ToSes and PPs: “We guard your PII!”
• What this actually means: “All the rest of the data is fair game for whatever we want to do with it and whoever we want to sell it to!”
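• Here’s the classic “linkage attack” in miniature (all records invented). Latanya Sweeney showed long ago that ZIP code + birth date + sex alone uniquely identify most Americans:

```python
# Reidentification by linkage: join a "deidentified" dataset (no names)
# to a public one on quasi-identifiers. Every record here is made up.

# "Deidentified" health data: names removed, so legally not PII...
deidentified = [
    {"zip": "53703", "birth": "1999-04-12", "sex": "F", "dx": "diabetes"},
    {"zip": "53711", "birth": "2000-01-30", "sex": "M", "dx": "asthma"},
]

# Public data (think: a voter roll) that DOES have names:
public = [
    {"name": "Alex Doe", "zip": "53703", "birth": "1999-04-12", "sex": "F"},
    {"name": "Sam Roe",  "zip": "53711", "birth": "2000-01-30", "sex": "M"},
]

# Link the two on (zip, birth date, sex):
keys = ("zip", "birth", "sex")
lookup = {tuple(p[k] for k in keys): p["name"] for p in public}

for record in deidentified:
    name = lookup.get(tuple(record[k] for k in keys))
    print(name, "->", record["dx"])
# Alex Doe -> diabetes
# Sam Roe -> asthma
```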
What are data about you used for?
• What they’ll tell you:
• Personalization: “Ads/content tailored to your interests!” (Not… exactly. Ads/
content they believe you will click on, whether it’s good for you or not. People get
bamboozled into believing conspiracy theories via “tailoring.”)
• “A better experience!” (Whatever that even means.)
• What they won’t tell you:
• Outright data sale, without your knowledge or consent
• Inferring further data (including stuff you’d rather keep private) about you
• Manipulating you (especially though not exclusively financially and politically)
• Making important decisions about you (loans, insurance, college admission, jobs)
• Rolling over on you to government, law enforcement, and other authorities
• Lots of other things! They won’t tell you what they’re doing! (FACEBOOK!)
When does personalization become manipulation?
• Well, always, really. All personalization is aimed at manipulating you.
• Social media, for example, uses personalization to manipulate you into staying longer.
Even when that’s not good for you or anyone else.
• With product advertising, we understand that and we tend to be
okay with it?
• Arguably in many situations we shouldn’t be! For example, Facebook tricking children
into overspending on Facebook games.
• But what about “personalized” education? Or “personalized”
news? Or “personalized” politics?
• We can end up with a dangerously skewed sense of the world this way… and that can
lead us to do dangerously messed-up things.
• (I say “us” because no one is immune to this! Not me, not you, not anyone. That’s not
how human brains work.)
How else can Big Data be
used against you?
• Deny you opportunity
• Facebook patented a system to test your loanworthiness based on analysis of your social network (your friends’ creditworthiness).
• Colleges and universities use data brokers and monitor use of the campus
website and social media to make admissions decisions.
• Deny you services
• Health insurers want to kick people who may have expensive health situations off
their rolls. They’ll do almost anything (short of violating HIPAA) to find out if you’re in such a situation.
• Yes, in the US Big Data can deny you health care!
• Get you in legal or reputational trouble
• Employers, for example, also want to know if you’re a “health risk” or if you’re
liable not to be the perfect employee for some reason.
But it’s okay if it’s the truth, right?
• Inferences especially can be wrong, and often are.
• Garbage (data) in, garbage (inferences) out.
• Bias in, bias out. For example, Amazon tried to infer who would be a good hire
from analyzing resumes. The result? The computer marked down anything
showing an applicant was female, because Amazon’s longstanding gender biases
in hiring showed up in the data!
• The data we can collect—even Big Data—can never paint a whole picture. (Jargon:
availability bias—we judge on the data we can easily collect, which is not
always the best or most useful data.)
• Correlation is not causation, and all the other cautions that come from a good
statistics or research-methods course!
• Even truths can harm you unfairly.
• Ask anyone who’s dealt with discrimination based on a personal characteristic.
But ____ are the good
guys, right? So it’s okay?
• Even if that’s so (and it’s debatable)…
• … anything ____ can collect, a bad guy (or bad government, or terrorist organization) can figure out how to collect also.
• Or buy. Or hack ____ for.
• Plenty of Big Datastores and data brokers and governments and whatnot have
been hacked, or have leaked info. (EQUIFAAAAAAAAAX)
• There is no such thing as tracking or data collection
“only for the good guys.”
• Moral: Even if you trust me as a person, or UW-Madison as
an organization… you shouldn’t just trust us with lots of
your data. Make us explain and justify what we’re doing!
Who’s making it easier
for law enforcement?
• Any information marketers can collect, law
enforcement can also collect.
• Sometimes directly from the marketers! After all, many sell data.
• But more commonly, they just copy marketers’ tracking techniques.
• Any tracking marketers can do, much of law enforcement can also do. It’s all algorithms!
• Anything marketers can learn about people, individually
or collectively, law enforcement can also learn.
• Few legal constraints on any of this!
Geofeedia: or, police surveillance of social media
• Geofeedia used geotagging on various social media along
with text mining to create alerts for police.
• Twitter, Facebook, Instagram, YouTube, Vine, Periscope, and more
• One known dubiously-legal, unethical use: tracking the real-world locations of
activists of color
• Geofeedia got bad press, was abandoned by police.
• Many similar tools still in use!
• Do you know what your local police use? Possibly time to ask!
• George Floyd protests: a data broker called Mobilewalla geolocated protestors via
their phones… just for fun, apparently… and published the results. They were
super-proud of themselves!
But I’m just a student…
• What if the computer finds a pattern suggesting students “like you” shouldn’t be in your major because they fail it a lot?
• What if the computer is actually noticing your gender or your race/ethnicity, and the
REAL problem is that the department offering that major behaves in racist and sexist
ways? Do you trust UW-Madison to handle this correctly?
• What if the computer discovers that students who eat that thing you often get at the Union don’t do well on exams? Or have mental-health issues?
• Such a pattern would almost certainly be nonsense coincidence! But would that
necessarily stop UW-Madison from acting on it?
• Basically, what if a computer thinks You Are Studenting Wrong?
• What is your recourse if that’s used against you? Or it’s incorrect? Or the problem isn’t
actually your problem, but the university’s?
Don’t these people have ethics?
• Often they don’t. Or they care about money more than they
care about you, or about us.
• Or they believe total garbage like “technology is neutral” or “it’s just a tool (so the implications aren’t my problem)” or “anything not actually illegal is obviously fine.”
• (If you believe any of these things, PLEASE STOP. Learn better.)
• There’s serious Fear of Missing Out (“FOMO”) around Big Data collection and analysis… even in non-profit sectors like public education.
• Yes, even in libraries. This DEVASTATES me as a librarian. I
was taught to do better! But it’s true.
Student surveillance, also known as “learning analytics”
Did anybody ever tell
you “this is going on your
permanent record”?
They were bluffing.
But now they’re not.
• Family Educational Rights and Privacy Act (FERPA)
• Protects any US-based “educational record” you have.
• Grades are educational records, for example.
• Not just anybody can waltz in and ask to see it; usually you (if you are an adult) or your parents/guardians (if not) have to consent first.
• Even your instructors/advisors/counselors etc. have to have a reason to look up your records.
• Not perfect law, but not bad either, for its time… which was
roughly my lifetime ago.
Here’s the thing…
• The current definition of what counts as an “educational record” is pretty specific and completely print-based.
• Most digital surveillance it’s possible to do in (for example)
Canvas? Doesn’t count as a “record” under FERPA.
• FERPA also has a giant loophole: an organization can use
your data for internal research and assessment.
• And can extend this ability to companies it contracts with to do research or
assessment. You see (based on what I’ve said) how that could get sticky, I trust?
• Add that to the Big Data movement, and you get…
• Surveilling students as they learn, both online and in the
physical world, and (supposedly… but not always) trying
to use the information to help them learn.
• Not properly tested. A lot of the “innovation” in this
space is going on hunches and guesses, even as it’s
affecting real students.
• I repeat: WE DON’T EVEN KNOW IF/WHEN THIS WORKS. The results we
have so far are pretty unimpressive.
• Worse, a lot of the experimentation is not undergoing regular research oversight.
• Obviously this information is gold to lots of others too…
• … imagine if a prospective employer got hold of it. (Some of them are sleazy
enough with the educational records they CAN get.)
• Anything you do in Canvas or Canvas add-ons
• E-resources you use from the library
• Website-based interactions and use, sometimes
• Anything you do in the Student Center
• enrollment, classes you look at (without enrolling in them), etc.
• And more.
• Higher-education institutions are building “data warehouses” to retain all this information and connect it up with other information… just like a commercial data broker, really, except with (so far! no guarantees!) somewhat less actual data sale.
Wait, the physical world
too? How does that work?
• ID-card swiping
• Any time you swipe your WisCard, that turns into a row in a campus database, tied directly to you (via your student ID number as identifier).
• That includes purchases, whenever you use WisCard for them!
• So that you know: if you WisCard-swipe your way into a building or other physical
space, that data goes to the UW Police Department.
• Same space-surveillance techniques that retail and law enforcement use
• Stingrays to track student cell phones
• Video surveillance
• IP address (for off-campus geolocation) when you use campus’s online resources
• Wi-Fi geolocation of your devices when you’re physically on-campus
[Screenshots: Canvas course analytics, from its documentation — when you did what, exact page views.]
And that’s just
what Canvas shows ME.
Canvas collects
more data than this.
The Unizin Consortium
is building a “data platform”
to hold it.
Here’s what UW-Madison
thinks it’s okay to do
with learning analytics.
As I explain, please consider
misinterpretations and abuses.
So many assumptions.
• “A pattern of interactions that works for one or a few
students (or even many!) must work for everybody.”
• Y’all are individuals!!! In individual situations!!!!!
• “There’s a correlation between time spent and grades.”
• OH MY GOSH Y’ALL. This is such nonsense I can’t even. Canvas can’t even measure all the time you spend! (And what if you just leave a window open, inactive?) (See the toy numeric example after this list.)
• Some A+ students don’t spend much time (often due to prior experience). Some F
students spend LOTS of time (because they’re lost). Do I trust that “predictive
analytics” systems understand this? I DO NOT.
• Communication habits depend on a lot of things — such as whether interactions
with the instructor and/or other students have been positive. (*-isms do happen
here. Are we going to blame their targets for “not interacting enough”?)
• Students do not all have equal amounts of time to dedicate to school! Are we
measuring “engagement” or privilege here?
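• The toy numeric example I promised: the numbers are made up, but watch what happens when experienced students finish fast and lost students grind for hours — the “more time = better grades” story doesn’t just weaken, it flips.

```python
# Made-up numbers: experienced students finish fast, lost students grind,
# and idle-open windows inflate "time" anyway.
hours_logged = [2, 3, 12, 14, 5, 1]       # what Canvas thinks it measured
grades       = [95, 92, 55, 60, 80, 97]   # actual outcomes

# Pearson correlation, computed by hand (no libraries needed):
n = len(hours_logged)
mx = sum(hours_logged) / n
my = sum(grades) / n
cov = sum((x - mx) * (y - my) for x, y in zip(hours_logged, grades))
sx = sum((x - mx) ** 2 for x in hours_logged) ** 0.5
sy = sum((y - my) ** 2 for y in grades) ** 0.5
print(round(cov / (sx * sy), 2))  # about -0.97: MORE time, LOWER grade here
```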
Not gonna lie:
when I was a new instructor
I had a lot to learn
about how best to understand
and interact with students.
(I don’t pretend
that I’m perfect at it now.)
Based on what I know, I can say
that these systems and assumptions
are lots more clueless than I was then.
I am ANGRY about this.
(Can you tell?)
But I do not control it.
(I’ve been told by someone I trust
that campus folks responsible for this
won’t work with me
or even talk with me
because of what they know
about my work and my beliefs
and how shy I’m NOT
about calling bad practice out.)
Do I use this stuff?
• Thank you for thinking about this. You are absolutely right
that you are owed this information.
• The answer: Very, very rarely.
• The only situation in which I go into an individual student’s
Canvas analytics is if they’ve dropped off the radar.
• And when that happens, I email them to ask what’s up. No rush to judgment.
• When or where or with whom you do homework, as long as
it’s in on time and you come prepared to class? I consider it
NONE OF MY BUSINESS.
• I also don’t pretend that beyond the incredibly obvious (turn work in!), I (much
less a computer) can make reliable predictions about student performance.
• And I wish the rest of campus thought as I do.
• Is there cheating? Yeah. A lot of it? Not that I’ve ever noticed
in my classes.
• Pedagogy research: A lot of cheating comes from students feeling anxious, unsure
what to do, or overloaded. I try to make clear that I’d rather students ASK than cheat!
• How instructors respond to potential cheating matters.
• Pedagogy research: Calling on students to be honorable people stops a lot of cheating.
• Draconian measures to prevent cheating damage student trust and raise student
anxiety. (I mean, duh, right?) My teaching style relies on student trust quite a bit.
• So I strongly prefer to treat you as the honorable adults you
are. I think I teach better and you learn better that way.
• I also believe that real cheaters get theirs, even if not directly.
Where’s the harm?
• I hope you now understand why I was vehement about not using remote proctoring when the course began.
• Northwestern University students are suing under Illinois’s Biometric Information Privacy Act.
• Student protest has gotten proctoring contracts cancelled at
several universities nationwide.
• And proctoring settings (marginally) improved elsewhere, here included.
• You DO NOT have to take this garbage lying down.
Where’s the harm?
• Naming-and-shaming: this is the University of Arizona,
business-school researcher Dr. Sudha Ram.
(the research ethics here are dubious,
but that’s a whole other discussion)
To help possible dropouts? Maybe. But
using these tools to judge and punish
instead of understand
is a serious, unsolved problem.
(And a microcosm
of Big Data About People use
in the rest of society.)
What students can do
• EACH ONE TEACH ONE. Tell others what you now know!
• Tell helicopter parents to step off.
• Too many parents are demanding that campus surveil students “for safety.”
• When your instructor says “Let’s use this online thing!” ask back “What’ll it do with my personal data? Behavioral data?”
• They probably won’t have thought about it. At least ask them to THINK.
• (“Don’t use your real name!” is one reason I’m comfortable using Scratch.)
• Raise this with student organizations and *PIRGs.
• If you need someone to explain it, CALL ON ME. PLEASE. I will back you up!
• We will discuss some tracking-prevention tools and
techniques a bit later on. Cross my heart.