
# Topics - Weka

April 30, 2014

## Transcript

1. ### Introduction to Machine Learning – Data Analysis in Weka

Gregory Ditzler
Drexel University, Ecological and Evolutionary Signal Processing & Informatics Lab
Department of Electrical & Computer Engineering, Philadelphia, PA, USA
gregory.ditzler@gmail.com
http://github.com/gditzler/eces436-week1
April 21, 2014
2. ### Overview of the Support Vector Machine

Problem: given a binary classification problem, the soft margin between the classes is maximized by solving:

Maximize
$$L(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j t_i t_j k(x_i, x_j)$$

Subject to
$$0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i t_i = 0$$
3. ### Overview of the Support Vector Machine

Maximization of the margin between two classes given by $y \in \{-1, +1\}$. (Figure omitted: the decision boundary $y = 0$ with margin boundaries $y = \pm 1$ on either side; graphic from C. Bishop, PRML, 2006.)
4. ### Support Vector Machine Implementation

Optimization of $L(\alpha)$ is convex and can be solved using quadratic programming. The resulting decision function is

$$g(x) = \sum_{i=1}^{n} \alpha_i t_i \Phi(x_i)^T \Phi(x) + w_0 = \sum_{i=1}^{n} \alpha_i t_i k(x_i, x) + w_0$$

where, with $S$ the set of support vector indices,

$$w_0 = \frac{1}{|S|} \sum_{i \in S} \left( t_i - \sum_{j \in S} \alpha_j t_j k(x_i, x_j) \right)$$
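As a sketch of how the decision function above could be evaluated in code. The toy support vectors, labels, and multipliers below are made-up values chosen by hand to satisfy the constraint $\sum_i \alpha_i t_i = 0$, not the output of a real quadratic programming solver:

```python
# Sketch: evaluating the SVM decision function
#   g(x) = sum_i alpha_i * t_i * k(x_i, x) + w0
# with a linear kernel k(x, z) = x * z on 1-D toy data.

def linear_kernel(x, z):
    return x * z

# Hypothetical support vectors, labels, and multipliers
# (chosen by hand so that sum_i alpha_i * t_i = 0).
sv = [-1.0, 1.0]     # support vectors x_i
t = [-1.0, 1.0]      # labels t_i
alpha = [0.5, 0.5]   # Lagrange multipliers alpha_i

# Bias: w0 = (1/|S|) * sum_{i in S} (t_i - sum_{j in S} alpha_j t_j k(x_i, x_j))
w0 = sum(
    t[i] - sum(alpha[j] * t[j] * linear_kernel(sv[i], sv[j]) for j in range(len(sv)))
    for i in range(len(sv))
) / len(sv)

def g(x):
    """Decision function; the predicted class is the sign of g(x)."""
    return sum(alpha[i] * t[i] * linear_kernel(sv[i], x) for i in range(len(sv))) + w0

print(g(-2.0), g(2.0))  # negative on the class -1 side, positive on the class +1 side
```

Swapping `linear_kernel` for any other kernel changes the decision boundary without changing this evaluation code — that is the practical payoff of writing $g(x)$ in terms of $k(x_i, x)$.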
5. ### What are kernels?

In layman's terms, a kernel is a measure of similarity between two patterns $x$ and $x'$. Consider a similarity measure of the form

$$k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}, \qquad (x, x') \mapsto k(x, x')$$

where $k(x, x')$ returns a real-valued quantity measuring the similarity between $x$ and $x'$.

- One simple measure of similarity is the canonical dot product, which computes the cosine of the angle between two vectors, provided they are normalized to length 1.
- The dot product of two vectors forms a pre-Hilbert space.
- Kernels represent patterns in some dot product space $\mathcal{H}$ via a map $\Phi : \mathcal{X} \to \mathcal{H}$.
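A minimal illustration of the "dot product as cosine similarity" remark above, in plain Python (no Weka involved):

```python
import math

def dot(x, z):
    return sum(a * b for a, b in zip(x, z))

def normalize(x):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

x = [3.0, 0.0]
z = [1.0, 1.0]

# After normalizing both vectors to length 1, their dot product
# is exactly the cosine of the angle between them.
cos_angle = dot(normalize(x), normalize(z))
print(cos_angle)  # cos(45 degrees), about 0.7071
```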
6. ### Feature Space: X → H

A few notes about our mapping into $\mathcal{H}$ via $\Phi$:

1. The mapping lets us define a similarity measure from the dot product in $\mathcal{H}$: $k(x, x') := \Phi(x)^T \Phi(x')$
2. Patterns are dealt with geometrically; efficient learning algorithms may be applied using linear algebra.
3. The selection of $\Phi(\cdot)$ leads to a large selection of similarity measures.

8. ### The Kernel Trick

Consider the following dot product for $x = (x_1, x_2)^T$:

$$\Phi(x)^T \Phi(x) = \begin{pmatrix} x_1^2 \\ \sqrt{2}\, x_1 x_2 \\ x_2^2 \end{pmatrix}^T \begin{pmatrix} x_1^2 \\ \sqrt{2}\, x_1 x_2 \\ x_2^2 \end{pmatrix} = x_1^4 + 2 x_1^2 x_2^2 + x_2^4 = (x_1^2 + x_2^2)^2 = (x^T x)^2$$

so $\Phi(x)^T \Phi(x) = k(x, x)$.

Can you think of a kernel where the dot product occurs in an infinite-dimensional space? Why?
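The identity above is easy to check numerically. A quick sketch for the 2-D feature map $\Phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)^T$, using two arbitrary example vectors:

```python
import math

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def k(x, z):
    """Kernel trick: (x^T z)^2, computed without ever visiting feature space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x = [1.0, 2.0]
z = [3.0, -1.0]

lhs = sum(a * b for a, b in zip(phi(x), phi(z)))  # Phi(x)^T Phi(z)
rhs = k(x, z)                                     # (x^T z)^2
print(lhs, rhs)  # both equal 1.0
```

The point of the trick: `k` touches only the 2-D inputs, while `lhs` requires explicitly constructing the 3-D feature vectors — and for some kernels that explicit construction would be infinite-dimensional.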
9. ### Gaussian kernels implement dot products in an infinite dimensional space

The Gaussian kernel is defined as

$$k(x, x') = \exp\left(-\gamma \|x - x'\|^2\right)$$

The term in the exponent, $-\gamma \|x - x'\|^2$, is a scalar value computed in the vector space of the data. Recall the dot product for two arbitrary vectors is given by $x^T x' = \sum_{i=1}^{d} x_i x_i'$. Thus all we need to show is that the calculation of the Gaussian kernel occurs with an infinite number of elements:

$$\Phi(x)^T \Phi(x') = k(x, x') = \exp\left(-\gamma \|x - x'\|^2\right) = \sum_{n=0}^{\infty} \frac{\left(-\gamma \|x - x'\|^2\right)^n}{n!}$$
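A small numerical sketch of the series expansion above: truncating the exponential power series at enough terms reproduces the closed form $\exp(-\gamma \|x - x'\|^2)$, with each term of the series corresponding to one block of the infinite-dimensional feature space. The vectors and $\gamma$ below are arbitrary example values:

```python
import math

def gaussian_kernel(x, z, gamma=0.5):
    """Closed-form Gaussian kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def gaussian_kernel_series(x, z, gamma=0.5, terms=30):
    """Truncated power series exp(u) = sum_n u^n / n!, u = -gamma * ||x - z||^2."""
    u = -gamma * sum((a - b) ** 2 for a, b in zip(x, z))
    return sum(u ** n / math.factorial(n) for n in range(terms))

x = [1.0, 2.0]
z = [2.0, 0.0]

print(gaussian_kernel(x, x))         # 1.0: every point is maximally similar to itself
print(gaussian_kernel(x, z))         # decays with squared distance
print(gaussian_kernel_series(x, z))  # truncated series matches the closed form
```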
10. ### A simple kernel example

$\Phi(x)^T \Phi(x) = k(x, x)$

(Figures omitted: the data in the original space and its image in the feature space.)
11. ### What is Weka?

The Weka (Gallirallus australis) is a flightless bird species of the rail family that is native only to New Zealand. The bird's conservation status is currently listed as vulnerable. The Weka is thought to be a curious bird; folklore has tales of the Weka stealing shiny items and sugar. How is a Weka going to help us in this class? (Photo credit: Jörg Hempel)
12. ### What is Weka? (take two)

Waikato Environment for Knowledge Analysis

- Weka is a project run at the University of Waikato aimed at making machine learning methods available to the public.
- Weka is a collection of machine learning algorithms for data mining tasks.
- Weka has tools that can be applied directly to your data, or integrated into code that you are writing.
- Weka is open source and written in Java.
- It's possible to apply Weka to problems of Big Data, and Massive Online Analysis (also developed at the University of Waikato) can handle large-volume data streams.

14. ### Attribute Relation File Format (ARFF)

ARFF is the file format that Weka expects when you are going to perform classification or regression. Experiments in Weka are output in ARFF format.

The Header Section

- The header contains information about the data, such as expected formats and the number of attributes in a data set.
- A relation setting is used to give the data set a name. A relation is like naming a file, but it lets the program know the task.
- Comments can be used with "%"; they are not interpreted by the parser.

The Data Section

- The data are listed in a CSV format. The features (i.e., attributes) appear in the order they were defined in the header section (examples to come).
- Dense and sparse formats are available.
15. ### ARFF Example

```
% 1. Title: Iris Plants Database
% 2. Sources:
%    (a) Creator: R.A. Fisher
%    (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
%    (c) Date: July, 1988
%
@RELATION iris

@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth  NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth  NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
```
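Since an ARFF file is plain text, it is straightforward to generate one from your own data. A minimal hand-rolled sketch (no Weka or third-party ARFF library involved; the file name and helper function are illustrative, not part of any Weka API):

```python
def write_arff(path, relation, attributes, rows):
    """Write a dense ARFF file.

    attributes: list of (name, type_string) pairs, e.g. ("sepallength", "NUMERIC")
    rows: list of tuples, one per instance, in attribute order
    """
    with open(path, "w") as f:
        f.write(f"@RELATION {relation}\n\n")
        for name, dtype in attributes:
            f.write(f"@ATTRIBUTE {name} {dtype}\n")
        f.write("\n@DATA\n")
        for row in rows:
            f.write(",".join(str(v) for v in row) + "\n")

write_arff(
    "iris_sample.arff",
    "iris",
    [
        ("sepallength", "NUMERIC"),
        ("sepalwidth", "NUMERIC"),
        ("petallength", "NUMERIC"),
        ("petalwidth", "NUMERIC"),
        ("class", "{Iris-setosa,Iris-versicolor,Iris-virginica}"),
    ],
    [(5.1, 3.5, 1.4, 0.2, "Iris-setosa"), (4.9, 3.0, 1.4, 0.2, "Iris-setosa")],
)
```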
16. ### Attributes

The format for the @attribute statement is:

```
@attribute <attribute-name> <datatype>
```

Supported datatypes:

- Numeric
- Integer (treated as numeric)
- Real (treated as numeric)
- Nominal, for example {cat,dog}; any value that has a space must be placed in quotes
- String
- Date, declared as `@attribute <name> date [<date-format>]`; the default format string accepts the ISO-8601 combined date and time format yyyy-MM-dd'T'HH:mm:ss

```
@ATTRIBUTE weight    NUMERIC
@ATTRIBUTE age       INTEGER
@ATTRIBUTE height    REAL
@ATTRIBUTE sex       {male,female}
@ATTRIBUTE lastname  STRING
@ATTRIBUTE birthdate DATE "yyyy-MM-dd"
```
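One detail worth noting: the DATE format string uses Java's SimpleDateFormat pattern letters (yyyy-MM-dd), not C/Python strftime codes. A small sketch of reading such values outside of Weka; the translation table below only covers the two patterns shown on this slide, and the helper name is made up for illustration:

```python
from datetime import datetime

# Java SimpleDateFormat pattern -> Python strptime format,
# for the two patterns used on this slide only.
JAVA_TO_STRPTIME = {
    "yyyy-MM-dd'T'HH:mm:ss": "%Y-%m-%dT%H:%M:%S",  # ARFF default date format
    "yyyy-MM-dd": "%Y-%m-%d",
}

def parse_arff_date(value, java_format="yyyy-MM-dd'T'HH:mm:ss"):
    """Parse an ARFF date value given its Java-style format string."""
    return datetime.strptime(value, JAVA_TO_STRPTIME[java_format])

d = parse_arff_date("1988-07-15", "yyyy-MM-dd")
print(d.year, d.month, d.day)  # 1988 7 15
```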
17. ### Sparse ARFF

Sparse Data Representations

- Some data sets contain many entries in the data matrix marked with a zero. We refer to these matrices as sparse.
- Saving a tuple such as (x-index, y-index, value) could be more space efficient than saving every entry in the matrix.
- The header remains the same; however, the data entries are different.

Dense Format

```
@DATA
0, X, 0, Y, "class A"
0, 0, W, 0, "class B"
```

Sparse Format

```
@DATA
{1 X, 3 Y, 4 "class A"}
{2 W, 4 "class B"}
```
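The dense-to-sparse conversion shown above is mechanical: keep only the nonzero entries, each prefixed by its zero-based attribute index. A quick sketch, with the slide's symbolic placeholders X, Y, W replaced by made-up numbers, since a converter has to test against zero:

```python
def dense_to_sparse(row):
    """Convert one dense ARFF data row to the sparse {index value, ...} form.

    Zero-valued entries are omitted; indices are zero-based, as in ARFF.
    """
    entries = [f"{i} {v}" for i, v in enumerate(row) if v != 0]
    return "{" + ", ".join(entries) + "}"

# The slide's dense rows, with X=7, Y=4, W=2 and the class as a quoted string.
print(dense_to_sparse([0, 7, 0, 4, '"class A"']))  # {1 7, 3 4, 4 "class A"}
print(dense_to_sparse([0, 0, 2, 0, '"class B"']))  # {2 2, 4 "class B"}
```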
18. ### The Weka GUI

The Weka GUI gives users a straightforward way to evaluate a classifier and run comparisons of multiple classifiers.

- Explorer: basic interface to Weka's tools
- Experimenter: set up experiments with multiple classifiers on multiple data sets
- KnowledgeFlow: pipeline experimental development, similar to Simulink
- Simple CLI: simple command line interface to the Weka tools