I believe that data science offers the most value when the models are in production. Some of us call this a 'Data Product'. In this talk I will explain how to use ScienceOps from Yhat to build a model in production. Why should Amazon or Google get all the fun? Or the competitive advantage?
an ODE solver
Possible Solutions (and their problems)
- Port code to Java -> Cross-language validation
- PMML -> Doesn't have great language support
- Batch jobs -> High maintenance and config
More tools, more work, more time
an Ordinary Differential Equation. This is specific to ScienceOps by Yhat; undoubtedly there are other products on the market. An alternative would be to build your own server and expose the model via a service.
The schema I used
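As an aside on the "build your own server" alternative mentioned above, here is a minimal sketch of what exposing a model as a service could look like using only the Python standard library. It is not the schema from the talk and not how ScienceOps works. The `predict` function, its weights, the feature names `x1` and `x2`, the helper `call_app`, and the host and port are all hypothetical placeholders.

```python
import io
import json
from wsgiref.simple_server import make_server


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = {"x1": 0.5, "x2": -0.25}  # assumed values, for illustration only
    return sum(weights.get(name, 0.0) * float(value)
               for name, value in features.items())


def app(environ, start_response):
    """WSGI app: takes a JSON object via POST, returns a JSON prediction."""
    headers = [("Content-Type", "application/json")]
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", headers)
        return [json.dumps({"error": "use POST"}).encode("utf-8")]
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        body = {"prediction": predict(payload)}
        status = "200 OK"
    except (ValueError, TypeError, AttributeError) as exc:
        # Malformed JSON, non-numeric values, or a payload that is not an object.
        body = {"error": str(exc)}
        status = "400 Bad Request"
    start_response(status, headers)
    return [json.dumps(body).encode("utf-8")]


def call_app(payload):
    """Invoke the WSGI app in-process, without opening a socket."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def serve(host="127.0.0.1", port=8000):
    """Run the service until interrupted (host and port are assumptions)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts the service locally, and `call_app({"x1": 2, "x2": 0})` exercises the same code path without a network. A real deployment would still need authentication, logging, model versioning, and monitoring, which is the kind of maintenance work the hosted option is meant to take off your hands.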
a data science technology company that provides tools and systems that allow enterprises to turn data insights into data-driven products. ScienceOps, Yhat's flagship product, is a data science operations system for managing predictive and advanced decision-making APIs and workflows. From product recommendation systems to credit scoring models and customer attrition estimators, ScienceOps lets data science teams go from insight to prototype to data-driven product efficiently and at scale. (They helped me during my project, so I promised to plug them.)
- Analysts and software developers use different sets of tools.
- For advanced mathematics such as ODEs or statistics, not all languages have the libraries to do the work effectively. For example, Ruby doesn't have good stats libraries.
- Producing tools is harder than producing a report but provides a lot more value.
1) There is a bit of backlash against 'big data' and data science.
2) A possible solution is producing results, and data products are a good way to do that.
3) Producing results allows you to give knowledge to customers.
The goal of data science is to turn data into knowledge!
in Python and have it deployed!
- Software engineers aren't data scientists and shouldn't be expected to write models in code.
- Models only provide value when they are in production.
- Getting information from stakeholders is really valuable in improving models.
- A data scientist is often a 'translator' between business and developers.
science projects, this is no exception. ScienceOps has a great free service, but it is a challenge when they shut you down :( Products such as ScienceOps bring us a step closer to bridging the gap in understanding between mathematician (or data scientist) and web developer. Someone should write a book titled 'Probability Distributions for Software Engineers'.