How data science changes modern world

Moore's law keeps holding: computing power doubles about every 18 months. With petabytes of storage, we can collect as much raw data as we can imagine.

With techniques like MapReduce, machine learning, data mining, neural networks, and AI, we can answer a few interesting questions: are you human, a potential buyer, an engaged fan, a terrorist, or even ill?

Let's dig into the Big Data world and explore a new range of possibilities.

Wojciech Sznapka

October 04, 2013

Transcript

  1. Wikipedia says: Data science incorporates varying elements and builds on techniques and theories from many fields, including mathematics, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products.
  2. Judging by Amazon's success, the recommendation system works. The company reported a 29% sales increase to $12.83 billion during its second fiscal quarter, up from $9.9 billion during the same time last year. A lot of that growth arguably has to do with the way Amazon has integrated recommendations into nearly every part of the purchasing process from product discovery to checkout. http://tech.fortune.cnn.com/2012/07/30/amazon-5/
  3. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

    Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank. It includes fine grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality. To address them, we introduce the Recursive Neural Tensor Network. When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effect of contrastive conjunctions as well as negation and its scope at various tree levels for both positive and negative phrases. http://nlp.stanford.edu/sentiment/
  4. Spies Like Us: How We All Helped Build Prism

    [...] the NSA said in 2009 that it was building a system based on Hadoop, a software program for processing vast amounts of data that Google and Yahoo had popularized. The agency also set up its own open-source project for data mining called Accumulo. Among the citizen coders who’ve contributed to the NSA effort are employees of Silicon Valley startups (Hortonworks), cybersecurity firms (Endgame), and federal contractors (you guessed it: Booz Allen Hamilton (BAH)). The leaked NSA PowerPoint presentation shows that the agency considers Hadoop and MapReduce, another program designed for handling big data sets, crucial to its surveillance efforts. [...] http://www.businessweek.com/articles/2013-06-12/spies-like-us-how-we-all-helped-build-prism
  5. Big Data From Alzheimer's Disease Whole Genome Sequencing Will Be Available to Researchers Due to Novel Global Research Database

    The Alzheimer's Association and the Brin Wojcicki Foundation announced today that massive amounts of new data have been generated by the first "Big Data" project for Alzheimer's disease. The data will be made freely available to researchers worldwide to quickly advance Alzheimer's science. Discussed today at the Alzheimer's Association International Conference (AAIC) 2013 in Boston, the project obtained whole genome sequences on the largest cohort of individuals related to a single disease – more than 800 people enrolled in the Alzheimer's Disease Neuroimaging Initiative (ADNI). The genome sequencing data – estimated to be 200 terabytes – will be housed in and available through the Global Alzheimer's Association Interactive Network (GAAIN), a planned massive network of Alzheimer's disease research data made available by the world's foremost Alzheimer's researchers from their own laboratories, and which also is being publicly announced today at AAIC 2013. GAAIN is funded by an initial $5 million dollar investment by the Alzheimer's Association, made possible due to the generous support of donors. http://www.alz.org/aaic/_releases_2013/fri_400pm_gaain.asp
  6. The $1.3B Quest to Build a Supercomputer Replica of a Human Brain

    Even by the standards of the TED conference, Henry Markram’s 2009 TEDGlobal talk was a mind-bender. He took the stage of the Oxford Playhouse, clad in the requisite dress shirt and blue jeans, and announced a plan that—if it panned out—would deliver a fully sentient hologram within a decade. He dedicated himself to wiping out all mental disorders and creating a self-aware artificial intelligence. And the South African–born neuroscientist pronounced that he would accomplish all this through an insanely ambitious attempt to build a complete model of a human brain—from synapses to hemispheres—and simulate it on a supercomputer. Markram was proposing a project that has bedeviled AI researchers for decades, that most had presumed was impossible. He wanted to build a working mind from the ground up. [...] And now Markram has funding almost as outsized as his ideas. On January 28, 2013, the European Commission—the governing body of the European Union—awarded him 1 billion euros ($1.3 billion). For decades, neuroscientists and computer scientists have debated whether a computer brain could ever be endowed with the intelligence of a human. It’s not a hypothetical debate anymore. Markram is building it. Will he replicate consciousness? The EU has bet $1.3 billion on it. http://www.wired.com/wiredscience/2013/05/neurologist-markam-human-brain/all/
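
Slide 4 names Hadoop and MapReduce without showing what the technique looks like. The sketch below is only an illustration of the map-reduce idea on a single machine, using word counting as the task. It is not Hadoop's API and not taken from the deck; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would distribute each phase across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (illustrative, in-memory only)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

Because each `map_phase` call looks at one document only and each `reduce_phase` call looks at one key only, both steps could in principle run in parallel, which is the property that makes the approach suitable for very large data sets.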