Science “What do you think of when you read the phrase ‘data science’? It’s probably some combination of keywords like statistics, machine learning, deep learning, and ‘sexiest job of the 21st century’. Or maybe it’s an image of a data scientist, sitting at her computer, putting together stunning visuals from well-run A/B tests. Either way, it’s glamorous, smart, and sophisticated. This is the narrative that data science has been selling since I entered the field almost ten years ago.” - Vicki Boykis
Do
• A/B Testing on Website Path
• Image Recognition
• Predictions (Supervised Learning)
• Unsupervised Learning
• Natural Language Processing
• Recommendation Systems (Netflix)
• Time Series Analysis (trading)
• Operations Research (Uber)
contains at least two aspects. One is adding spatial variables; the other is adding spatial terms to the model. For example, f(x) = aX + b + error is simple linear regression, and you can add as many self-created spatial variables to X as you like. In f(x) = aX + Sigma + b + error, we add a covariance term Sigma, which represents the influence of the spatial relationships across the whole dataset (not necessarily related to a specific input x).
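The two aspects above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data is simulated, the "distance to center" variable and the exponential form of Sigma are made-up choices, and generalized least squares is used here as one common way to bring a spatial covariance into a linear model, not necessarily the method the slide has in mind.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations with coordinates and one ordinary feature.
n = 200
coords = rng.uniform(0, 10, size=(n, 2))   # made-up x/y locations
feature = rng.normal(size=n)               # a non-spatial predictor

# Aspect 1: a self-created spatial variable (distance to an assumed center).
center = np.array([5.0, 5.0])
dist_to_center = np.linalg.norm(coords - center, axis=1)

# Simulated target that depends on both predictors plus noise.
y = 2.0 * feature - 0.5 * dist_to_center + 3.0 + rng.normal(scale=0.1, size=n)

# Design matrix [feature, spatial variable, intercept], fit by ordinary least squares.
X = np.column_stack([feature, dist_to_center, np.ones(n)])
coef_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Aspect 2: a spatial covariance Sigma built from all pairwise distances.
# Assumption: covariance decays exponentially with distance (range 2.0),
# plus a small diagonal term to keep the matrix well conditioned.
pairwise = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
sigma = np.exp(-pairwise / 2.0) + 0.1 * np.eye(n)

# Generalized least squares: weight the fit by the inverse of Sigma.
sigma_inv = np.linalg.inv(sigma)
coef_gls = np.linalg.solve(X.T @ sigma_inv @ X, X.T @ sigma_inv @ y)

print("OLS coefficients:", coef_ols)
print("GLS coefficients:", coef_gls)
```

Note that Sigma depends on the locations of every observation, not on any single input row, which matches the point that the covariance term reflects relationships across the whole dataset.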
• Where are things and where do they happen: clusters, hot spots, disparities
• Why do they happen where they happen: understanding location decisions and movement patterns
• How does where things happen affect other things (context/environment), and how does context affect what happens: "I am my neighbor's neighbor"
• Where should things be located: optimization
data unique
• Spatial Context: understanding the effect that your neighbors have on an observation, and vice versa
• Spatial Support Problem: the scales of the data do not match (for example, zip codes and block groups)
• Spatial Scale of Observations: behavior does not match the unit of observation (you know which neighborhood you live in, but not which block group)
data unique
• Spatial Spillover: the activity at one location impacts the costs at other locations (closing a road will increase costs for a distribution center)
• Spatial Multiplier: a successful store will not just impact that store, but the nearby stores as well
• Spatial Decay: the effect of an observation changes and decays as you move away from it
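Spatial context and spatial decay can be made concrete with a distance-based weights matrix. The sketch below is a minimal illustration, assuming four made-up store locations, an exponential decay function, and an arbitrary bandwidth; none of these choices come from the slides.

```python
import numpy as np

# Hypothetical store locations (x, y) and a value observed at each one.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
sales = np.array([10.0, 20.0, 30.0, 40.0])

# Pairwise distances between all locations.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

# Spatial decay: influence falls off with distance.
# Assumption: exponential decay with an arbitrary bandwidth of 2.0.
bandwidth = 2.0
weights = np.exp(-dist / bandwidth)
np.fill_diagonal(weights, 0.0)   # a location is not its own neighbor

# Row-standardize so each location's neighbor weights sum to 1.
weights = weights / weights.sum(axis=1, keepdims=True)

# Spatial context: the "spatial lag" is the weighted average of the neighbors' values.
spatial_lag = weights @ sales

print(np.round(weights, 3))
print(np.round(spatial_lag, 2))
```

Nearby stores receive larger weights than distant ones, so each store's spatial lag is dominated by its closest neighbors.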
the toolkit to work with spatial data and do something with it: here are the tools, and we train people to use them. Much (but not all) of this work is descriptive in nature: where things are. Traditional GIS programs trained you on these tools, and then you found a career using them. And with ESRI there are many, many tools.
Intelligence focuses on outcomes from location data: you take the data in and you get some valuable insight out. Think about human mobility data: it is marketed as immediate insight from location data, when the real process is much more complex.
Spatial Data Science focuses on the journey and the data, or the underlying conditions leading to the insight, and prescribes recommendations for optimization. It looks at the factors that cause something to occur and creates models for spatial "optimization".
• Data science is a verbose space with many tools and skills
• Spatial data science concepts are newer to traditional ‘data scientists’
• There are many different roles and titles that all triangulate on ‘data science’
Takeaways
• Spatial Data Science is data science with spatial attributes (with a little more complexity)
• In Spatial Data Science, the journey to discover the problem is just as important as, if not more important than, the answer
• It is important to start with a problem, then work from the beginning to align the steps to get there
• Intro to Python
• NumPy, Pandas, Seaborn
• Introduction to CARTOframes
• Helper Functions in CARTOframes
• Los Angeles Real Estate Data Cleaning
• Los Angeles Real Estate Exploratory Data Analysis