What to expect
1. Data Analytics
2. Various types of Data Analytics
3. Role of Data Analytics in solving real-world problems
4. The importance of Data Analytics
5. What are the analytical tools used in data analytics
6. What are the best techniques and different uses of Data Analytics
7. What is the career growth in data analytics
About the speaker
2015, user experience designer by profession and an educator at heart, with skills demonstrated by years of experience in the field of innovation, entrepreneurship and design. For more than 11 years, I have been passionately helping local entrepreneurs and micro, small and medium brands and businesses in the Philippines transform, design and build functional, meaningful services and digital products.
Notable Feature
• Technopreneurship and Innovations – Batangas State University
• Hack Manila 2018
• WeRemote Philippines
• Ambidextr Media
• Creative Manila – Portfolio of the week
• Fomolist – Filipino Tech & Business
• Innovation+ Creative Edge
• Hustle to Freedom Podcast (US)
• And other local educational podcasts, design and business related digital media initiatives
Data Analytics at a glance
Stages: Preprocessing, Analytics, Post-processing
01 PROBLEM & SOLUTION
Identify the problem and sketch possible solutions
02 DATA
• Identify data sources
• Select the data
• Clean the data
• Transform the data
03 PROCESS
• Draw meaningful conclusions with an actionable approach
• Execution, usability testing and iteration
descriptive analytics. Descriptive analytics aims to answer the question “what happened?”
Advanced Analytics
An industry practice, part of data science, that uses advanced tools to extract data, make predictions and discover trends. This process addresses the question “what if?”
sketching hundreds or even thousands of possible solutions in a sketchpad or similar digital tools.
Expected Outcome
The process begins with drawing meaningful conclusions from complex and varied data sources and presenting the return on investment (ROI). It is valuable to understand why your idea(s) would be worth pursuing and who will benefit most when it is executed. This means understanding and giving a near-accurate financial evaluation of the worth of your solution, including the timeline, skill sets and expertise involved.
from direct competitors, teammates, other similar organizations and notable materials such as academic or scientifically proven research.
Select Important Data
This process involves identifying what is really useful for your project or suggested solution. You have to be careful and selective in order to extract meaningful conclusions.
Clean the Data
This involves careful and selective data cleaning. It needs a very detail-oriented individual aided by data analytics tools.
Transform the Data
This process involves transforming and customizing the cleaned data to meet the desired goals.
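The select, clean and transform steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the records, the field names and the rule of dropping rows with a missing customer or amount are all assumptions made for the example.

```python
# Hypothetical raw records; field names and values are invented for illustration.
RAW_RECORDS = [
    {"customer": " Ana ", "city": "Manila", "amount": "1200", "notes": "walk-in"},
    {"customer": "Ben", "city": "batangas", "amount": "", "notes": "online"},
    {"customer": "Ana", "city": "MANILA", "amount": "800", "notes": "online"},
    {"customer": "", "city": "Cebu", "amount": "450", "notes": "walk-in"},
]


def select(records, fields):
    """Select: keep only the fields that matter for the question at hand."""
    return [{field: record[field] for field in fields} for record in records]


def clean(records):
    """Clean: trim whitespace, normalize city names, drop incomplete rows."""
    cleaned = []
    for record in records:
        customer = record["customer"].strip()
        amount = record["amount"].strip()
        if not customer or not amount:
            continue  # assumed rule: a row without a customer or amount is unusable
        cleaned.append({
            "customer": customer,
            "city": record["city"].strip().title(),
            "amount": float(amount),
        })
    return cleaned


def transform(records):
    """Transform: reshape the cleaned rows into total amount per city."""
    totals = {}
    for record in records:
        totals[record["city"]] = totals.get(record["city"], 0.0) + record["amount"]
    return totals


selected = select(RAW_RECORDS, ["customer", "city", "amount"])
cleaned = clean(selected)
totals_by_city = transform(cleaned)
print(totals_by_city)
```

Keeping each step as its own function makes it easy to inspect the data between steps, which is where most cleaning mistakes are caught.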
works and what sticks over the long term, then deploying solutions (in the form of software, a new process or an innovative team) to meet the expected outcome and goals.
Descriptive: helps answer questions about what happened
Diagnostic: helps answer questions about why things happened
Predictive: helps answer questions about what will happen in the future
Prescriptive: helps answer questions about what should be done
Descriptive Analytics
stakeholders. By developing key performance indicators (KPIs), these strategies can help track successes or failures. Metrics such as return on investment (ROI) are used in many industries.
• Specialized metrics are developed to track performance in specific industries. This process requires the collection of relevant data, processing of the data, data analysis and data visualization. This process provides essential insight into past performance.
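As a small illustration of a descriptive metric, the sketch below computes ROI per campaign and the average across campaigns. The campaign names and figures are invented, and ROI is taken here as net gain divided by cost, which is one common definition.

```python
from statistics import mean

# Hypothetical campaign figures, invented for illustration.
campaigns = [
    {"name": "Email", "cost": 1000.0, "revenue": 1500.0},
    {"name": "Social", "cost": 2000.0, "revenue": 2400.0},
    {"name": "Search", "cost": 500.0, "revenue": 400.0},
]


def roi(cost, revenue):
    """Return on investment as a fraction: net gain divided by cost."""
    return (revenue - cost) / cost


# One KPI value per campaign, describing what already happened.
roi_by_campaign = {c["name"]: roi(c["cost"], c["revenue"]) for c in campaigns}
average_roi = mean(list(roi_by_campaign.values()))

for name, value in roi_by_campaign.items():
    print(f"{name}: {value:.0%}")
print(f"Average ROI: {average_roi:.1%}")
```

A negative value, as for the hypothetical Search campaign, simply means the campaign cost more than it returned.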
Diagnostic Analytics
the findings from descriptive analytics and dig deeper to find the cause.
• The performance indicators are further investigated to discover why they got better or worse. This generally occurs in three steps:
1. Identify anomalies in the data. These may be unexpected changes in a metric or a particular market.
2. Data that is related to these anomalies is collected.
3. Statistical techniques are used to find relationships and trends that explain these anomalies.
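The three steps can be sketched with basic statistics. This is only one possible approach: the weekly figures are invented, a z-score threshold of 2 is an assumed cut-off for "anomaly", and Pearson correlation stands in for the wider family of statistical techniques.

```python
from statistics import mean, pstdev

# Hypothetical weekly figures, invented for illustration.
weekly_sales = [100, 102, 98, 101, 99, 60, 100, 103]
weekly_ad_spend = [20, 21, 19, 20, 20, 5, 20, 22]  # step 2: related data collected


def find_anomalies(values, threshold=2.0):
    """Step 1: flag positions whose z-score exceeds the assumed threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def pearson(xs, ys):
    """Step 3: Pearson correlation between two equally long series."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / (spread_x * spread_y) ** 0.5


anomalous_weeks = find_anomalies(weekly_sales)
correlation = pearson(weekly_sales, weekly_ad_spend)
print("Anomalous week indexes:", anomalous_weeks)
print(f"Correlation between sales and ad spend: {correlation:.2f}")
```

A strong correlation suggests where to look next, but it does not by itself prove that the drop in ad spend caused the drop in sales.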
Predictive Analytics
determine if they are likely to recur.
• Predictive analytical tools provide valuable insight into what may happen in the future. Their techniques include a variety of statistical and machine learning methods, such as neural networks, decision trees, and regression.
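Of the techniques listed, regression is the simplest to show. The sketch below fits a straight line to past monthly sales with ordinary least squares and extrapolates one month ahead. The sales figures are invented, and a straight-line trend is an assumption that real data may not satisfy.

```python
# Hypothetical monthly sales, invented for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [112.0, 118.0, 131.0, 139.0, 152.0, 158.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(months, sales)
forecast_month_7 = intercept + slope * 7  # prediction for the next, unseen month
print(f"Trend: {slope:.2f} per month, forecast for month 7: {forecast_month_7:.1f}")
```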
Prescriptive Analytics
be made. This allows businesses to make informed decisions in the face of uncertainty.
• Prescriptive analytics techniques rely on machine learning strategies that can find patterns in large datasets. By analyzing past decisions and events, the likelihood of different outcomes can be estimated.
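The idea of estimating likelihoods from past decisions can be shown with plain frequency counts, which is far simpler than the machine learning approaches used in practice. The actions, outcomes and payoffs below are invented, and picking the action with the highest expected value is an assumed decision rule.

```python
from collections import Counter

# Hypothetical history of (action taken, whether the customer bought).
past_decisions = [
    ("discount", True), ("discount", True), ("discount", False), ("discount", True),
    ("free_shipping", True), ("free_shipping", False),
    ("free_shipping", False), ("free_shipping", False),
]

# Hypothetical profit earned when an action succeeds.
payoff_if_success = {"discount": 40.0, "free_shipping": 100.0}

trials = Counter(action for action, _ in past_decisions)
successes = Counter(action for action, succeeded in past_decisions if succeeded)

# Estimated likelihood of success for each action, from past outcomes.
likelihood = {action: successes[action] / trials[action] for action in trials}

# Expected value = likelihood of success times payoff; recommend the best one.
expected_value = {a: likelihood[a] * payoff_if_success[a] for a in trials}
recommended_action = max(expected_value, key=expected_value.get)

print(likelihood)
print(expected_value)
print("Recommended:", recommended_action)
```

Note that the action with the larger payoff is not recommended here, because its estimated likelihood of success is much lower.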
Data Analyst
statistics and business. They combine these fields in order to help businesses and organizations succeed. The primary goal of a data analyst is to increase efficiency and improve performance by discovering patterns in data.
• The work of a data analyst involves working with data throughout the data analysis pipeline. This means working with data in various ways. The primary steps in the data analytics process are data mining, data management, statistical analysis, and data presentation. The importance and balance of these steps depend on the data being used and the goal of the analysis.
• Additionally, they discover how data can be used to answer questions and solve problems. With the development of computers and an ever-increasing move toward technological intertwinement, data analysis has evolved. The development of the relational database breathed new life into data analysis, allowing analysts to use SQL (pronounced “sequel” or “s-q-l”) to retrieve data from databases.
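To make the SQL point concrete, here is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database. The orders table, its columns and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Luzon", 120.0), ("Visayas", 80.0), ("Luzon", 200.0), ("Mindanao", 50.0)],
)

# A typical analyst query: total sales per region, largest first.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM orders GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

for region, total in rows:
    print(region, total)
```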
Data Mining
tasks. This involves extracting data from unstructured data sources. These may include written text, large complex databases, or raw sensor data. The key steps in this process are to extract, transform, and load data (often called ETL). These steps convert raw data into a useful and manageable format. This prepares data for storage and analysis. Data mining is generally the most time-intensive step in the data analysis pipeline.
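A very small ETL run might look like the sketch below, using only the Python standard library. The sensor export, its column names, the unit suffixes and the choice to skip unparseable readings are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw sensor export with mixed units and one faulty reading.
RAW_SENSOR_CSV = """timestamp,sensor,reading
2021-01-01T00:00,a,21.5C
2021-01-01T00:05,a,ERR
2021-01-01T00:10,b,70.7F
"""


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert every reading to Celsius, skipping bad values."""
    converted = []
    for row in rows:
        raw = row["reading"].strip()
        unit = raw[-1:]
        try:
            value = float(raw[:-1])
        except ValueError:
            continue  # assumed rule: unparseable readings are dropped
        if unit == "F":
            value = (value - 32) * 5 / 9
        elif unit != "C":
            continue
        converted.append((row["timestamp"], row["sensor"], round(value, 2)))
    return converted


def load(rows, conn):
    """Load: store the transformed rows in a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(timestamp TEXT, sensor TEXT, celsius REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_SENSOR_CSV)), conn)
loaded = conn.execute(
    "SELECT sensor, celsius FROM readings ORDER BY timestamp"
).fetchall()
conn.close()
print(loaded)
```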
Data Management
a data analyst’s job. Data warehousing involves designing and implementing databases that allow easy access to the results of data mining. This step generally involves creating and managing SQL databases. Non-relational and NoSQL databases are becoming more common as well.
Data Presentation
presentation. This step allows insights to be shared with stakeholders. Data visualization is often the most important tool in data presentation. Compelling visualizations can help tell the story in the data, which may help executives and managers understand the importance of these insights.
the increasing value of data and — to our liking — accurately characterizes data as raw material. Data are to be seen as an input or basic resource needing further processing before actually being of use.”
Steps in the development, implementation, and operation of analytics within an organization.
STEP 01
problem to be addressed is needed. Some examples are: customer segmentation of a mortgage portfolio, retention modeling for a postpaid Telco subscription, or fraud detection for credit cards.
STEP 02: SOURCE
The golden rule here is: the more data, the better! The analytical model itself will later decide which data are relevant and which are not for the task at hand. All data will then be gathered and consolidated in a staging area which could be, for example, a data warehouse, data mart, or even a simple spreadsheet file.
STEP 03: PROCESS + TRANSFORM
The analytical model will be estimated on the preprocessed and transformed data. Depending on the business objective and the exact task at hand, a particular analytical technique will be selected and implemented by the data scientist.
STEP 04: INTERPRET + EVALUATE
The key issue is to find the unknown yet interesting and actionable patterns (sometimes also referred to as knowledge diamonds) that can provide new insights into your data that can then be translated into new profit opportunities!
STEP 05: VALIDATED + APPROVED
Once validated and approved, the model can be put into production as an analytics application (e.g., decision support system, scoring engine). Important considerations here are how to represent the model output in a user-friendly way
Google Analytics
and collect visitor insights. It can help organizations determine top sources of user traffic, gauge the success of their marketing activities and campaigns, track goal completions (such as purchases, adding products to carts), discover patterns and trends in user engagement and obtain other visitor information such as demographics.
• Small and medium-sized retail websites often use Google Analytics to obtain and analyze various customer behavior analytics, which can be used to improve marketing campaigns, drive website traffic and better retain visitors.
• Google Analytics acquires user data from each website visitor through the use of page tags. A JavaScript page tag is inserted into the code of each page. This tag runs in the web browser of each visitor, collecting data and sending it to one of Google's data collection servers.
• Google Analytics can then generate customizable reports to track and visualize data such as the number of users, bounce rates, average session durations, sessions by channel, page views, goal completions and more. The page tag functions as a web bug or web beacon, to gather visitor information. However, because it relies on cookies, the system can't collect data for users who have disabled them.
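To show what report metrics such as bounce rate, average session duration and sessions by channel mean, the sketch below computes them from a handful of session records. It does not call any Google Analytics API. The records and field names are hypothetical, and a bounce is assumed to be a session with a single page view, which is a simplification.

```python
from collections import Counter
from statistics import mean

# Hypothetical session records, invented for illustration.
sessions = [
    {"channel": "organic", "pageviews": 1, "duration_s": 0},
    {"channel": "organic", "pageviews": 4, "duration_s": 180},
    {"channel": "social", "pageviews": 1, "duration_s": 0},
    {"channel": "email", "pageviews": 3, "duration_s": 120},
]

# Assumed definition: a bounce is a session that viewed exactly one page.
bounces = sum(1 for s in sessions if s["pageviews"] == 1)
bounce_rate = bounces / len(sessions)

average_duration_s = mean(s["duration_s"] for s in sessions)
sessions_by_channel = Counter(s["channel"] for s in sessions)
total_pageviews = sum(s["pageviews"] for s in sessions)

print(f"Bounce rate: {bounce_rate:.0%}")
print(f"Average session duration: {average_duration_s} seconds")
print("Sessions by channel:", dict(sessions_by_channel))
print("Page views:", total_pageviews)
```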
Python
language for programming and web development and later extended for data science.
• It is one of the fastest-growing programming languages today. Python is a powerful data analytics tool and has a rich set of user-friendly libraries for any part of scientific computing.
• With Python, you can do advanced data manipulation and numeric analysis using data frames. Pandas is an integral tool for data masking, indexing and grouping data, data visualization, data cleaning, and much more.
Apache Spark
data analytics tools. It offers more than 80 high-level operators that make it simple to build parallel applications.
• It is one of the open-source data analytics tools used by a wide range of companies to process huge datasets.
• It helps run applications in a Hadoop cluster many times faster, both in memory and on disk. It is one of the open-source big data analytics tools that provides built-in APIs in Java, Scala, or Python.
where the master is called the “Driver” and the slaves are called “Workers”. When you run a Spark application, the Spark Driver creates a context that is the entry point to your application. All operations (transformations and actions) are executed on worker nodes, and the resources are managed by the Cluster Manager.
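The driver and worker split can be imitated in plain Python to show the shape of the idea. This toy sketch is not Spark and uses none of its APIs: a thread pool stands in for the worker nodes, and the function names are made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Driver side: split the dataset into roughly equal chunks."""
    return [data[i::n_parts] for i in range(n_parts)]


def worker_task(chunk):
    """Worker side: square each value in the chunk, then sum the chunk."""
    return sum(x * x for x in chunk)


def driver(data, n_workers=3):
    """Driver: hand one chunk to each worker and combine the partial results."""
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(worker_task, chunks))
    return sum(partial_sums)


total = driver(list(range(1, 11)))
print("Sum of squares of 1 to 10:", total)
```

In a real cluster the chunks live on different machines and a cluster manager decides where each task runs; the thread pool here only mimics that division of labour on one computer.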
Data Analytics in Healthcare
Drug discovery is a complex task with many variables. Machine learning can greatly improve drug discovery. Pharmaceutical companies also use data analytics to understand the market for drugs and predict their sales.
products and services
• Being utilized for competitor research
• Being utilized for predicting trends and business value
• Being utilized for marketing and sales reports
• Being utilized for analyzing and predicting future consumer behaviour
Industry Insight
8% between 2019 and 2029. On average, data analysts earned $94,280 in 2019. However, compensation for data analysts varies depending on where they work and what industry they work in.