open source in 2012. The ‘alpha’ version is under development.
2. Uses LLVM to produce speeds similar to C and Java.
3. Has LISP-like macros and ‘generated’ functions.
4. Has genuine shared memory and multi-tasking.
5. Easy connectivity to C and Python; R, Java and C++ are also possible.
6. Solves the ‘two-language’ problem.
7. Built with parallel/distributed processing in mind.
BI and BD analytics; implemented in JVM-based programming languages
Hadoop: Batch processing; latency in minutes; programmed as MapReduce jobs
Spark: Batch, graph and ML; latency of a few seconds; programmed in Scala/Java
Storm: Streaming only; sub-second latency; own Java API
on message passing to allow programs to run on multiple processors in shared or distributed memory.
2. Julia’s implementation of message passing is one-sided:
• the programmer needs to explicitly manage only one processor in a two-processor operation
• these operations typically do not look like message send and message receive but rather resemble higher-level operations like calls to user functions.
reference is an object that can be used from any processor to refer to an object stored on a particular processor.
• A remote call is a request by one processor to call a certain function on certain arguments on another (possibly the same) processor.
• A remote call returns a remote reference.
• Remote calls return immediately: the processor that made the call can then proceed to its next operation while the remote call happens somewhere else.
• You can wait for a remote call to finish by calling wait on its remote reference, and you can obtain the full value of the result using fetch.
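The call-then-fetch pattern described above can be sketched by analogy in Python. This is not Julia's API: a thread pool from Python's standard `concurrent.futures` module stands in for the other processor, the returned `Future` plays the role of the remote reference, and `result()` plays the role of fetch. The function `slow_square` and its argument are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_square(x):
    """Hypothetical stand-in for work that would run on another processor."""
    time.sleep(0.1)
    return x * x


# Assumption: a one-worker thread pool plays the part of the remote processor.
executor = ThreadPoolExecutor(max_workers=1)

# "Remote call": submit() hands the work over and returns a Future at once.
ref = executor.submit(slow_square, 7)

# The caller carries on with its own work while the call runs elsewhere.
local_work = sum(range(10))

# "Fetch": result() blocks until the call has finished, then returns its value.
result = ref.result()

executor.shutdown()
```

Only the calling side is managed explicitly here, which mirrors the one-sided style: there is no matching receive written for the worker.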
by numerical means alone.
• Among the different types of ML tasks, a crucial distinction is drawn between supervised and unsupervised learning.
• Supervised machine learning: The program is “trained” on a pre-defined set of “training examples”, which then facilitate its ability to reach an accurate conclusion when given new data.
• Unsupervised machine learning: The program is given a set of data and must find patterns and relationships within it.
where the value being predicted falls somewhere on a continuous spectrum. These systems help us with questions of “How much?” or “How many?”.
Classification ML: Systems where we seek a yes-or-no prediction, such as “Is this tumor cancerous?”, “Does this product meet specified quality standards?”, and so on.
Bayesian ML: Systems where we have some prior insight and wish to use the data to establish better predictive models.
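As a minimal sketch of the “How much?” versus yes-or-no distinction, the Python below fits a straight line to a few training examples by ordinary least squares and then turns the continuous prediction into a yes-or-no answer with a threshold. The data, the function names and the threshold of 6.0 are all invented for illustration, and the thresholding step is a deliberately crude stand-in for a real classifier.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training examples that lie exactly on y = 2x + 1.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(train_x, train_y)


def predict(x):
    """Regression: answers 'how much?' with a continuous value."""
    return slope * x + intercept


def classify(x, threshold=6.0):
    """Yes-or-no answer: is the predicted value at or above the threshold?"""
    return predict(x) >= threshold
```

Once trained on the four examples, `predict` can be applied to an input that was not in the training set, which is the supervised-learning step of generalising to new data.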
API written in Lua that supports machine-learning algorithms; used by large tech companies such as Facebook and Twitter.
• Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.
• TensorFlow™ is an open source software library for numerical computation using data flow graphs. A flexible architecture allows deployment of computation to one or more CPUs or GPUs.
• Caffe is a well-known and widely used machine-vision library that ported Matlab’s implementation of fast convolutional nets to C and C++.
• MxNet is a machine-learning framework with APIs in languages such as R, Python and Julia, which has been adopted by Amazon Web Services.
• General-purpose and special-purpose hardware are becoming increasingly important.
• Distributed systems and/or parallelism are necessary to handle non-trivial problems.
• Networked systems based on Hadoop will not be sufficient in the future.
• Languages such as Julia enable the data scientist to process and analyse large datasets within sensible timescales.