
JGS594 Lecture 16

Software Engineering for Machine Learning
Clustering III
(202203)

Javier Gonzalez-Sanchez

March 31, 2022

Transcript

  1. jgs SER 594 Software Engineering for Machine Learning

    Lecture 16: Clustering III. Dr. Javier Gonzalez-Sanchez, [email protected], javiergs.engineering.asu.edu | javiergs.com. PERALTA 230U. Office Hours: By appointment.
  2. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 4

    jgs Algorithms § K-Means: groups by distance between points; minimizes the square-error criterion. § DBSCAN (Density-Based Spatial Clustering of Applications with Noise): groups by distance between nearest points. § Simple EM (Expectation Maximization): finds the likelihood (probability) of an observation belonging to a cluster; maximizes the log-likelihood criterion.
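
The square-error criterion that K-Means minimizes can be made concrete with a small sketch. This is not the course's implementation (the slides do not show one); it is a minimal pure-Python version, and the names `k_means`, `sq_dist`, and the `initial_centroids` parameter are assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iter=100):
    """Minimal k-means sketch.

    Assumption: the caller supplies the starting centroids, which keeps the
    run deterministic. Returns (labels, centroids, sse), where sse is the
    square-error criterion: the sum of squared distances from each point to
    its assigned centroid.
    """
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: the assignments can no longer change
        centroids = new_centroids
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


# Hypothetical toy data: two well-separated groups in 2-D.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8)]
labels, centroids, sse = k_means(points, initial_centroids=[(1.0, 1.0), (5.0, 5.0)])
print(labels, centroids, sse)
```

K-means only reaches a local minimum of the criterion, so different starting centroids can give different clusters and a different square error; running it from several starts and keeping the lowest value is a common precaution.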
  3. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 24

    jgs Assignment | Part 1 § https://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/iris.arff § Iris flowers classification § Number of Instances: 150 § @attribute 'class' {Iris-setosa,Iris-versicolor,Iris-virginica} § K-means (3) § DBSCAN § EM § Evaluation: Likelihood Values § Confusion Matrix, Accuracy
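
The datasets are distributed as ARFF files, which consist of `@attribute` declarations followed by an `@data` section of comma-separated rows. A minimal reader could look like the sketch below. The function name `parse_arff` is hypothetical, and the sketch assumes dense rows, no commas inside quoted values, and no spaces inside attribute names; a full ARFF parser handles more cases.

```python
def parse_arff(text):
    """Parse a simple dense ARFF document into (attribute_names, rows).

    Assumptions (not guaranteed for every ARFF file): one row per line,
    values separated by commas, no commas inside quoted values, and no
    spaces inside attribute names. Values are returned as strings;
    '?' marks a missing value in ARFF and is kept as-is.
    """
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith('%'):  # skip blanks and % comments
            continue
        if not in_data:
            lower = line.lower()
            if lower.startswith('@attribute'):
                # The name is the second whitespace-separated token.
                name = line.split(None, 2)[1].strip("'\"")
                attributes.append(name)
            elif lower.startswith('@data'):
                in_data = True
            continue
        rows.append([v.strip().strip("'\"") for v in line.split(',')])
    return attributes, rows


# Tiny hand-written sample in ARFF syntax, NOT the contents of the real file.
sample = """% illustrative sample only
@relation sample
@attribute sepallength numeric
@attribute 'class' {Iris-setosa,Iris-versicolor}
@data
5.1,Iris-setosa
7.0,Iris-versicolor
"""
attributes, rows = parse_arff(sample)
features = [[float(v) for v in row[:-1]] for row in rows]  # assumes class is last
classes = [row[-1] for row in rows]
print(attributes, features, classes)
```

Keeping the class column separate from the features matters here: the clustering algorithms should see only the features, and the class is used afterwards for evaluation.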
  4. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 25

    jgs Assignment | Part 2 § https://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/labor.arff § The data was used to learn the description of an acceptable and an unacceptable contract. § Number of Instances: 57 § @attribute 'class' {'bad', 'good'} § K-means (2) § DBSCAN § EM § Evaluation: Likelihood Values § Confusion Matrix, Accuracy
  5. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 26

    jgs Assignment | Part 3 § Students' Grades Dataset (From Previous Quiz) § Students' grades § Number of Instances: ~150 § @attribute 'class' unknown § K-means (?) § DBSCAN § EM § Evaluation: Likelihood Values § Confusion Matrix, Accuracy
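
Each part lists "Likelihood Values" as an evaluation. One plausible reading, assumed here rather than stated in the slides, is the log-likelihood of the data under the mixture model that EM fits. The sketch below computes it for a one-dimensional Gaussian mixture with given parameters; the function names and the Gaussian assumption are illustrative, and fitting the parameters (the EM iterations themselves) is not shown.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, stds):
    """Probability that observation x belongs to each cluster (the E-step)."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def log_likelihood(data, weights, means, stds):
    """Sum over observations of log(sum_k weight_k * N(x | mean_k, std_k))."""
    total = 0.0
    for x in data:
        mixture = sum(w * gaussian_pdf(x, m, s)
                      for w, m, s in zip(weights, means, stds))
        total += math.log(mixture)
    return total


# Hypothetical 1-D data with two groups, scored under two candidate models.
data = [0.0, 0.1, 5.0, 5.1]
good_fit = log_likelihood(data, [0.5, 0.5], [0.05, 5.05], [1.0, 1.0])
poor_fit = log_likelihood(data, [0.5, 0.5], [2.5, 2.5], [1.0, 1.0])
print(good_fit, poor_fit, responsibilities(0.0, [0.5, 0.5], [0.05, 5.05], [1.0, 1.0]))
```

A higher (less negative) log-likelihood means the model explains the data better, which is why the model with components centered on the two groups scores above the one with both components in the middle.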
  6. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 27

    jgs Notes § Do not forget to separate the training and testing datasets. § Use your programming skills to calculate the confusion matrix and accuracy. § As usual, submit a paper including: A) Source code B) Results C) An explanation of your findings and conclusions § Academic Integrity 👀
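
The first two notes can be sketched in plain Python. The helper names below are hypothetical, and mapping each cluster to its majority class is one common convention for comparing cluster IDs with known labels; it is an assumption here, not something the slides prescribe.

```python
import random
from collections import Counter


def train_test_split(items, test_ratio=0.3, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]


def majority_class_per_cluster(cluster_ids, true_labels):
    """Map each cluster ID to the most frequent true class among its members."""
    by_cluster = {}
    for c, y in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up example: five instances, their true classes, and cluster IDs.
true_labels = ['good', 'good', 'bad', 'bad', 'bad']
cluster_ids = [0, 0, 1, 1, 0]
mapping = majority_class_per_cluster(cluster_ids, true_labels)
predicted = [mapping[c] for c in cluster_ids]
matrix = confusion_matrix(true_labels, predicted, classes=['bad', 'good'])
print(matrix, accuracy(matrix))

train, test = train_test_split(range(10), test_ratio=0.3, seed=0)
```

In an actual run, the cluster-to-class mapping would be learned on the training split and then applied to the testing split, so that the reported accuracy reflects instances the model did not see.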
  7. jgs SER 594 Software Engineering for Machine Learning

    Javier Gonzalez-Sanchez, Ph.D. [email protected] Spring 2022 Copyright. These slides can only be used as study material for the class CSE205 at Arizona State University. They cannot be distributed or used for another purpose.