
CSC570 Lecture 06

Applied Affective Computing
Clustering
(202304)

Javier Gonzalez-Sanchez

April 17, 2023

Transcript

  1. jgs CSC 570 Current Topics in Computer Science

    Applied Affective Computing
    Lecture 06: Clustering
    Dr. Javier Gonzalez-Sanchez
    [email protected]
    www.javiergs.com
    Building 14-227
    Office Hours: By appointment
  2. jgs Javier Gonzalez-Sanchez | CSC 309 | Winter 2023 |

    Algorithms

    § K-Means: based on the distance between points. Minimizes the square-error criterion.
    § DBSCAN (Density-Based Spatial Clustering of Applications with Noise): based on the distance between nearest points.
    § Simple EM (Expectation Maximization): finds the likelihood (probability) of an observation belonging to a cluster. Maximizes the log-likelihood criterion.
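To make the square-error criterion concrete, here is a minimal K-means sketch in plain Python. It is not Weka's implementation; the function names `kmeans` and `squared_error`, the sample points, and the choice of initial centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points. Returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of the points assigned to each centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids

def squared_error(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(squared_error(points, labels, centroids))
```

The result depends on the initial centroids, which is why real tools usually run several random initializations and keep the one with the lowest squared error.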
  3. Dataset :: weather.arff

    @relation weather
    @attribute outlook {sunny, overcast, rainy}
    @attribute temperature real
    @attribute humidity real
    @attribute windy {TRUE, FALSE}
    @attribute play {yes, no}
    @data
    sunny, 85, 85, FALSE, no
    sunny, 80, 90, TRUE, no
    overcast, 83, 86, FALSE, yes
    rainy, 70, 96, FALSE, yes
    rainy, 68, 80, FALSE, yes
    rainy, 65, 70, TRUE, no
    overcast, 64, 65, TRUE, yes
    sunny, 72, 95, FALSE, no
    sunny, 69, 70, FALSE, yes
    rainy, 75, 80, FALSE, yes
    sunny, 75, 70, TRUE, yes
    overcast, 72, 90, TRUE, yes
    overcast, 81, 75, FALSE, yes
    rainy, 71, 91, TRUE, no
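For anyone who wants to inspect weather.arff outside Weka, the following is a rough sketch of a reader for this small subset of the ARFF format. The function name `parse_arff` is hypothetical, and the parser deliberately ignores quoting, sparse rows, and missing values, so it should not be taken as a full ARFF implementation.

```python
def parse_arff(text):
    """Parse a small ARFF document into (attribute_names, rows).
    Handles only @relation, @attribute, @data, and comma-separated rows."""
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and ARFF comments
            continue
        lower = line.lower()
        if lower.startswith("@attribute"):
            attributes.append(line.split()[1])  # second token is the attribute name
        elif lower.startswith("@data"):
            in_data = True
        elif in_data:
            rows.append([value.strip() for value in line.split(",")])
    return attributes, rows

WEATHER = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 86, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
"""

attributes, rows = parse_arff(WEATHER)
play = [row[-1] for row in rows]  # the class attribute is the last column
print(attributes)
print(len(rows), play.count("yes"), play.count("no"))
```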
  5. Thoughts?

    Is this good, acceptable, or bad?
  6. Definition

    A summary of prediction results on a classification problem.

                    positive (0)   negative (1)
    positive (0)    TP             FP
    negative (1)    FN             TN
  7. Definition

    A summary of prediction results on a classification problem.

               YES (0)   NO (1)
    YES (0)    6         3
    NO (1)     3         2
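As a sketch of how the four cells can be counted, the code below compares actual and predicted labels one by one. The actual labels are the `play` column of weather.arff; the predicted labels are a hypothetical assignment chosen only so that the counts match the slide (6, 3, 3, 2), and the function name `confusion_counts` is likewise an assumption.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, FP, FN, TN for a binary problem, treating `positive` as the positive class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Actual class values (play) from weather.arff, in file order.
actual = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]
# Hypothetical predictions, for illustration only.
predicted = ["yes", "yes", "yes", "yes", "no", "no", "yes",
             "yes", "no", "yes", "yes", "no", "yes", "no"]

counts = confusion_counts(actual, predicted)
print(counts)
```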
  8. Definition

    § Accuracy: an overall measure of how often the model predicts correctly over the entire data set.
    § It is the sum of the elements on the main diagonal of the confusion matrix divided by the sum of all its entries.
  9. Accuracy

    Accuracy = (TP + TN) / (TP + TN + FP + FN)
    Accuracy = (6 + 2) / (6 + 2 + 3 + 3)
    Accuracy = 8 / (8 + 6)
    Accuracy = 8 / 14
    Accuracy ≈ 0.57

               YES (0)   NO (1)
    YES (0)    TP        FP
    NO (1)     FN        TN
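The same arithmetic as a small Python function, mainly to show where the parentheses belong. The function name `accuracy` is an illustrative choice; the counts are the ones on the slide.

```python
def accuracy(tp, fp, fn, tn):
    """Main-diagonal total (TP + TN) divided by the total of all four cells."""
    return (tp + tn) / (tp + tn + fp + fn)

acc = accuracy(tp=6, fp=3, fn=3, tn=2)
print(f"{acc:.2f}")  # prints 0.57
```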
  10. Thoughts?

    What is a good value? (Confusion matrix cells: TP, FP, FN, TN)
  11. Precision and Recall

    // How much can we trust the model when it predicts a Positive?
    Precision = TP / (TP + FP)
    Precision = 6 / (6 + 3)
    Precision = 6 / 9 ≈ 0.67

    // Measures the ability of the model to find all Positive units
    Recall = TP / (TP + FN)
    Recall = 6 / (6 + 3)
    Recall = 6 / 9 ≈ 0.67

               YES (0)   NO (1)
    YES (0)    TP        FP
    NO (1)     FN        TN
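A short sketch of both formulas in Python, using the counts from the slide. The function names and the choice to return 0.0 when a denominator is zero are assumptions for illustration, not something the slides specify.

```python
def precision(tp, fp):
    """Of everything predicted Positive, the fraction that really is Positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Of everything that really is Positive, the fraction the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

p = precision(tp=6, fp=3)
r = recall(tp=6, fn=3)
print(f"{p:.2f} {r:.2f}")  # prints 0.67 0.67
```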
  12. Thoughts?

    What is a good value? (Confusion matrix cells: TP, FP, FN, TN)
  13. Definition

    § F1-score: the harmonic mean of precision and recall.
    § It combines how much we can trust the model when it predicts a Positive (Precision) with the ability of the model to find all Positive units (Recall).
  14. Harmonic mean

    a = 7
    b = 3

    // if same units (big)
    Arithmetic mean = (7 + 3) / 2 = 5

    // if diverse units (small)
    Geometric mean = sqrt(7 * 3) = 4.58

    // ratios of diverse units (smaller)
    Harmonic mean = pow(sqrt(7 * 3), 2) / ((7 + 3) / 2) = 21 / 5 = 4.2
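The three means can be checked with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The last line verifies the relation used on the slide for two values: the harmonic mean equals the squared geometric mean divided by the arithmetic mean.

```python
import math
import statistics

values = [7, 3]

arithmetic = statistics.mean(values)            # (7 + 3) / 2 = 5
geometric = statistics.geometric_mean(values)   # sqrt(7 * 3), about 4.58
harmonic = statistics.harmonic_mean(values)     # 2 / (1/7 + 1/3) = 4.2

print(arithmetic, round(geometric, 2), round(harmonic, 2))

# For two values: harmonic = geometric^2 / arithmetic.
print(math.isclose(harmonic, geometric ** 2 / arithmetic))
```

Note the ordering harmonic ≤ geometric ≤ arithmetic, which is why the F1-score is pulled toward the smaller of precision and recall.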
  15. F1-score

    Precision = TP / (TP + FP)   // predicted
    Recall = TP / (TP + FN)      // real

    F1-score = 2 * Precision * Recall / (Precision + Recall)
    F1-score = 2 * (6/9) * (6/9) / ((6/9) + (6/9)) = 6/9 ≈ 0.67

               YES (0)   NO (1)
    YES (0)    TP        FP
    NO (1)     FN        TN
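A minimal sketch of the F1-score computed directly from the confusion matrix counts on the slide. The function name `f1_score` and the zero-division fallback are assumptions for illustration.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(tp=6, fp=3, fn=3)
print(f"{f1:.2f}")  # prints 0.67
```

Because precision and recall are equal here, their harmonic mean is the same value; the score only drops below both when the two differ.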
  16. Thoughts?

    What is a good value? (Confusion matrix cells: TP, FP, FN, TN)
  17. Definition

    What about non-binary classifiers? e.g. emotion recognition

    [Figure: confusion matrix extended from positive/negative to several classes (😀, 😡, ..., 🙁); cells labeled TP, FP, FN, TN]
  18. Definition

    [Figure: confusion matrix with more than two classes; cells labeled TP, FP, FN, TN, and X]

    Accuracy = (TP + TN) / (TP + TN + FP + FN + X)
  19. Definition

    [Figure: confusion matrix with more than two classes; cells labeled TP, FP, FN, TN, and X]

    // How much we can trust the model when it predicts a Positive
    Precision = TP / (TP + FP)

    // Measures the ability of the model to find all Positive units
    Recall = TP / (TP + FN)
  20. Definition

    [Figure: confusion matrix with more than two classes; cells labeled TP, FP, FN, TN, and X]

    Precision = TP / (TP + FP)   // predicted
    Recall = TP / (TP + FN)      // real
    F1-score = 2 * Precision * Recall / (Precision + Recall)
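One common way to apply these formulas beyond two classes is to treat each class in turn as the Positive one. The sketch below assumes rows are predicted classes and columns are actual classes (following the TP FP / FN TN layout of the binary matrix); the class names, the counts, and the function name `per_class_metrics` are hypothetical.

```python
def per_class_metrics(matrix, classes):
    """matrix[i][j] = number of instances predicted as class i whose actual class is j.
    Returns (accuracy, {class: (precision, recall, f1)})."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(classes)))  # main diagonal
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        fp = sum(matrix[i]) - tp                  # predicted i, actually another class
        fn = sum(row[i] for row in matrix) - tp   # actually i, predicted as another class
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        report[name] = (precision, recall, f1)
    return correct / total, report

# Hypothetical counts for a three-emotion classifier.
classes = ["happy", "angry", "sad"]
matrix = [
    [5, 1, 0],  # predicted happy
    [2, 3, 1],  # predicted angry
    [0, 1, 4],  # predicted sad
]

acc, report = per_class_metrics(matrix, classes)
print(f"accuracy = {acc:.2f}")
for name, (p, r, f1) in report.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Averaging the per-class F1 values (a macro average) gives a single number for the whole classifier, if one is needed.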
  21. K-means, DBSCAN, EM

    Actual class (play): no no yes yes yes no yes no yes yes yes yes yes no

    [Figure: cluster assignments for K-means, DBSCAN, and EM]
  22. K-means, DBSCAN, EM

    Actual class (play): no no yes yes yes no yes no yes yes yes yes yes no

    K-means: 8/14
    DBSCAN: 8/14
    EM: 10/14
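Clusters carry no class names, so a fraction such as 8/14 requires first deciding which cluster stands for which class. One way to do this, sketched below, is to try every one-to-one mapping of clusters to classes and keep the one with the most matches. The cluster assignment is hypothetical (it is not the output of any of the three algorithms), the function name `best_mapping_accuracy` is an assumption, and the sketch assumes there are no more clusters than classes.

```python
from itertools import permutations

def best_mapping_accuracy(clusters, labels):
    """Try every one-to-one mapping of cluster ids to class labels and
    return (correct, total, mapping) for the mapping with the most matches."""
    cluster_ids = sorted(set(clusters))
    classes = sorted(set(labels))
    best_correct, best_map = -1, None
    for perm in permutations(classes, len(cluster_ids)):
        mapping = dict(zip(cluster_ids, perm))
        correct = sum(mapping[c] == y for c, y in zip(clusters, labels))
        if correct > best_correct:
            best_correct, best_map = correct, mapping
    return best_correct, len(labels), best_map

# Actual class values (play) from weather.arff, in file order.
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]
# Hypothetical cluster ids, for illustration only.
clusters = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1]

correct, total, mapping = best_mapping_accuracy(clusters, labels)
print(f"{correct}/{total}", mapping)
```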
  23. Homework

    • Open or Closed Eyes vs. Brain
    • 5 diverse stimulation scenarios vs. Brain
    • 5 diverse stimulation scenarios vs. Affect

    Follow 2 approaches:
    a) Clustering as described today (EM, K-means, Density)
    b) Explore another solution to the best of your knowledge (Machine Learning, Data Mining, Statistics)

    Due: Monday (April 24)
  24. Thoughts?

    Clustering was easy. What about something more precise for our data (such as a RandomForest)?
  25. Just To Practice

    § https://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/iris.arff
    § The data is used to learn to distinguish three species of iris flowers.
    § Number of Instances: 150
    § @attribute 'class' {Iris-setosa,Iris-versicolor,Iris-virginica}
    § K-means (3)
    § DBSCAN
    § EM
    § Evaluation: Likelihood Values
    § Confusion Matrix, Accuracy
  26. ARFF Examples

    https://storm.cis.fordham.edu/~gweiss/data-mining/datasets.html
  27. Office Hours

    Tuesday and Thursday, 3 - 5 pm
    An appointment is required.
    Send me an email: [email protected]
  28. jgs CSC 570 Applied Affective Computing Javier Gonzalez-Sanchez, Ph.D. [email protected]

    Spring 2023 Copyright. These slides can only be used as study material for the class CSC308 at Cal Poly. They cannot be distributed or used for another purpose.