field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured. • Data science combines skills from mathematics, statistics, computer science, domain knowledge and communication to solve complex problems and create value for organizations and society.
of data science that focuses on creating systems that can learn from data and make predictions or decisions without being explicitly programmed. • Machine learning uses algorithms that can learn from data and improve over time.
applied to various domains and industries, such as: • Healthcare: diagnosis, prognosis, treatment recommendation, drug discovery, etc. • Finance: credit scoring, fraud detection, portfolio optimization, algorithmic trading, etc. • Marketing: customer segmentation, churn prediction, recommendation systems, sentiment analysis, etc. • Manufacturing: quality control, predictive maintenance, process optimization, etc. • Education: adaptive learning, student performance prediction, plagiarism detection, etc. • And many more!
it can help organizations and society to: • Discover new insights and knowledge from data that were previously hidden or unknown • Make better decisions and actions based on data-driven evidence and predictions • Innovate new products, services and solutions that leverage data and analytics • Enhance efficiency, productivity and performance of processes and operations • Create value and competitive advantage for organizations and society
(BI) are both related to data analysis, but they have some key differences: • Data Science is more exploratory and experimental, while BI is more descriptive and reporting-oriented • Data Science uses advanced techniques such as machine learning, natural language processing and computer vision, while BI relies mainly on traditional techniques such as SQL, OLAP and dashboards • Data Science aims to answer complex questions such as why, what if and how, while BI aims to answer simpler questions such as what, when and where • Data Science focuses on generating insights and predictions from data, while BI focuses on delivering information and reports from data
some advantages over Business Intelligence (BI), such as: • Data Science can handle unstructured or semi-structured data, such as text, images, audio, video, etc., while BI can only handle structured data, such as tables or spreadsheets • Data Science can discover hidden patterns and trends in data that are not obvious or predefined, while BI can only show predefined metrics and indicators in data • Data Science can provide actionable recommendations and solutions based on data analysis, while BI can only provide information and reports based on data analysis
they can • Prevent the company from making risky investments that could create value and competitive advantage for stakeholders • Limit the company’s ability to explore new opportunities and discover new insights and knowledge from data • Reduce the company’s efficiency, productivity and performance by causing delays, errors or inefficiencies in processes and operations • Hinder the company’s innovation and creativity by discouraging experimentation and learning from mistakes • Backfire on the company when the status quo is unacceptable or threatened, and the only way to avoid loss is to take a risky option.
to understand how a machine learning model works and why it makes certain predictions or decisions. Model explainability is transformative because it can help users to: • Trust the model and its outputs by verifying its logic and reasoning • Debug the model and improve its performance by identifying and correcting errors or biases • Explain the model and its outputs to stakeholders and customers by providing clear and intuitive interpretations and visualizations • Comply with ethical and legal standards and regulations by ensuring transparency and accountability of the model and its outputs
data science that focuses on understanding the causal relationships between variables or events in a system. Causal inference aims to answer questions such as
• Go beyond correlation and discover the true causes and effects of phenomena in data • Estimate the causal effects of treatments or interventions in observational data without conducting experiments • Test and validate causal hypotheses and assumptions using data and statistical methods • Make better decisions and policies based on causal evidence and counterfactual analysis
that did not happen, but that could have happened under different conditions. It can be used to: • test cause-and-effect relationships • evaluate the impact of interventions • interrogate model decisions
of an online store and you want to know the effect of offering free shipping on customer satisfaction. You have data from a survey of 1000 customers who bought products from your store, including whether they received free shipping or not, and their satisfaction rating on a scale of 1 to 5. How can you use causal inference to answer this question?
inference is to use a technique called propensity score matching. This technique matches customers who received free shipping with similar customers who did not receive free shipping based on their observable characteristics, such as age, gender, product category, etc. This way, we can create a balanced sample of customers who are comparable in terms of their likelihood of receiving free shipping.
the average satisfaction rating of the customers who received free shipping with the average satisfaction rating of the matched customers who did not receive free shipping. This difference is called the average treatment effect (ATE), and it measures the causal effect of free shipping on customer satisfaction, provided the matching characteristics capture everything that influences both who gets free shipping and how satisfied customers are. Suppose we find that the ATE is 0.2, meaning that customers who received free shipping are on average 0.2 points more satisfied than comparable customers who did not receive free shipping. We can then use this information to decide whether offering free shipping is worth the cost and how it affects customer loyalty and retention.
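The matching and comparison steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis of a real survey: the data is simulated, the covariates (`age`, `order_value`), the built-in effect of 0.2 and all function names are hypothetical, and satisfaction is treated as a continuous score rather than a 1 to 5 rating. Because each customer with free shipping is matched to one similar customer without it, the result is, strictly speaking, the effect among customers who received free shipping.

```python
import math
import random

random.seed(0)
TRUE_EFFECT = 0.2  # effect built into the simulated data (assumption)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate_customers(n=1000):
    """Simulate (covariates, free_shipping, satisfaction) rows; purely illustrative."""
    rows = []
    for _ in range(n):
        age = random.gauss(0, 1)          # standardized age (hypothetical covariate)
        order_value = random.gauss(0, 1)  # standardized order value (hypothetical covariate)
        # Confounding: larger orders are more likely to get free shipping
        # and are also associated with higher satisfaction.
        p_free = sigmoid(0.3 * age + 1.0 * order_value)
        treated = 1 if random.random() < p_free else 0
        satisfaction = (3.0 + 0.1 * age + 0.5 * order_value
                        + TRUE_EFFECT * treated + random.gauss(0, 0.5))
        rows.append(([age, order_value], treated, satisfaction))
    return rows

def fit_propensity_model(X, t, lr=0.5, epochs=300):
    """Logistic regression by batch gradient descent; returns [bias, w1, w2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, ti in zip(X, t):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - ti
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity_scores(w, X):
    """Estimated probability of receiving free shipping for each customer."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def matched_effect(scores, t, y):
    """1-nearest-neighbor matching on the propensity score, with replacement."""
    controls = [(s, yi) for s, ti, yi in zip(scores, t, y) if ti == 0]
    diffs = []
    for s, ti, yi in zip(scores, t, y):
        if ti == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - s))
            diffs.append(yi - y_match)
    return sum(diffs) / len(diffs), len(diffs)

rows = simulate_customers()
X = [r[0] for r in rows]
t = [r[1] for r in rows]
y = [r[2] for r in rows]

weights = fit_propensity_model(X, t)
scores = propensity_scores(weights, X)
effect, n_pairs = matched_effect(scores, t, y)

# Unadjusted comparison of group means, for contrast with the matched estimate.
treated_y = [yi for ti, yi in zip(t, y) if ti == 1]
control_y = [yi for ti, yi in zip(t, y) if ti == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

print(f"naive difference:   {naive:.2f}")
print(f"matched difference: {effect:.2f} (simulated effect: {TRUE_EFFECT})")
```

In this simulation the unadjusted difference mixes the effect of free shipping with the effect of order value, which is why the matched comparison is the more relevant number. With real data, a library implementation of logistic regression and a check of covariate balance after matching would be reasonable next steps.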