CSC570 Lecture 06

Applied Affective Computing
Clustering
(202304)

Javier Gonzalez-Sanchez

April 17, 2023

Transcript

  1. jgs
    CSC 570
    Current Topics in Computer Science
    Applied Affective Computing
    Lecture 06:
    Clustering
    Dr. Javier Gonzalez-Sanchez
    [email protected]
    www.javiergs.com
    Building 14-227
    Office Hours: By appointment

  2. jgs
    Previously …

  3. jgs
    Javier Gonzalez-Sanchez | CSC 309 | Winter 2023 | 3
    Dataset 2

  4. jgs
    Machine Learning
    EM

  5. jgs
    Algorithms
    § K-Means: groups points by their distance to cluster centroids; minimizes
    the squared-error criterion.
    § DBSCAN (Density-Based Spatial Clustering of Applications with Noise):
    groups points by the distance between neighboring points (density).
    § Simple EM (Expectation Maximization): estimates the probability that an
    observation belongs to each cluster; maximizes the log-likelihood criterion.
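
To make the squared-error criterion concrete, here is a minimal K-means sketch in plain Python. It is not Weka's implementation. It clusters the (temperature, humidity) pairs of the weather.arff data used later in this deck. The names `k_means` and `sum_squared_error`, the choice of k = 2, and seeding with the first k points are all assumptions made for illustration.

```python
from math import dist


def k_means(points, k, max_iter=100):
    """Plain Lloyd's algorithm; seeds with the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignment


def sum_squared_error(points, centroids, assignment):
    """The squared-error criterion that K-means tries to minimize."""
    return sum(dist(p, centroids[a]) ** 2 for p, a in zip(points, assignment))


# (temperature, humidity) pairs from weather.arff, in file order
points = [(85, 85), (80, 90), (83, 86), (70, 96), (68, 80), (65, 70), (64, 65),
          (72, 95), (69, 70), (75, 80), (75, 70), (72, 90), (81, 75), (71, 91)]

centroids, assignment = k_means(points, k=2)
sse = sum_squared_error(points, centroids, assignment)
print(centroids, assignment, round(sse, 2))
```

A different seeding can converge to a different local minimum, so the result of a sketch like this depends on the initial centroids.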

  6. jgs
    Weka GUI

  7. jgs
    Weka API
    Code

  8. jgs
    Weka API
    weka.jar

  9. jgs
    @relation weather
    @attribute outlook {sunny, overcast, rainy}
    @attribute temperature real
    @attribute humidity real
    @attribute windy {TRUE, FALSE}
    @attribute play {yes, no}
    @data
    sunny, 85, 85, FALSE, no
    sunny, 80, 90, TRUE, no
    overcast, 83, 86, FALSE, yes
    rainy, 70, 96, FALSE, yes
    rainy, 68, 80, FALSE, yes
    rainy, 65, 70, TRUE, no
    overcast, 64, 65, TRUE, yes
    sunny, 72, 95, FALSE, no
    sunny, 69, 70, FALSE, yes
    rainy, 75, 80, FALSE, yes
    sunny, 75, 70, TRUE, yes
    overcast, 72, 90, TRUE, yes
    overcast, 81, 75, FALSE, yes
    rainy, 71, 91, TRUE, no
    Dataset :: weather.arff
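
A minimal sketch of reading this ARFF text in Python. It assumes only the features used on this slide (`@relation`, `@attribute`, `@data`, comma-separated rows). It is not a complete ARFF parser, and `parse_arff` is a hypothetical helper, not part of Weka.

```python
def parse_arff(text):
    """Parse a small ARFF document into (relation, attribute names, rows)."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and ARFF comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            # "@attribute name type": keep only the name here
            attributes.append(line.split(None, 2)[1])
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


WEATHER = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
overcast, 83, 86, FALSE, yes
rainy, 70, 96, FALSE, yes
rainy, 68, 80, FALSE, yes
rainy, 65, 70, TRUE, no
overcast, 64, 65, TRUE, yes
sunny, 72, 95, FALSE, no
sunny, 69, 70, FALSE, yes
rainy, 75, 80, FALSE, yes
sunny, 75, 70, TRUE, yes
overcast, 72, 90, TRUE, yes
overcast, 81, 75, FALSE, yes
rainy, 71, 91, TRUE, no
"""

relation, attributes, rows = parse_arff(WEATHER)
play = [row[attributes.index("play")] for row in rows]
print(relation, attributes, len(rows), play.count("yes"), play.count("no"))
```

The class attribute `play` has 9 `yes` and 5 `no` instances, which is the distribution any evaluation against the class should add up to.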

  10. jgs
    Weka API

  11. jgs
    Output

  12. jgs
    @relation weather
    @attribute outlook {sunny, overcast, rainy}
    @attribute temperature real
    @attribute humidity real
    @attribute windy {TRUE, FALSE}
    @attribute play {yes, no}
    @data
    sunny, 85, 85, FALSE, no
    sunny, 80, 90, TRUE, no
    overcast, 83, 86, FALSE, yes
    rainy, 70, 96, FALSE, yes
    rainy, 68, 80, FALSE, yes
    rainy, 65, 70, TRUE, no
    overcast, 64, 65, TRUE, yes
    sunny, 72, 95, FALSE, no
    sunny, 69, 70, FALSE, yes
    rainy, 75, 80, FALSE, yes
    sunny, 75, 70, TRUE, yes
    overcast, 72, 90, TRUE, yes
    overcast, 81, 75, FALSE, yes
    rainy, 71, 91, TRUE, no
    Dataset :: weather.arff

  13. jgs
    Thoughts?
    Is this good, acceptable, or bad?

  14. jgs
    Confusion Matrix

  15. jgs
    Definition
    A summary of prediction results on a classification problem.

                      positive (0)   negative (1)
    positive (0)      TP             FP
    negative (1)      FN             TN

  16. jgs
    Definition
    A summary of prediction results on a classification problem.

                 YES (0)   NO (1)
    YES (0)      6         3
    NO (1)       3         2
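
As a sketch of how the four cells can be tallied, the Python below counts TP, FP, FN and TN from two label lists. The `actual` list is the play column of weather.arff. The `predicted` list is hypothetical: it was constructed only so that the tallies reproduce the counts 6, 3, 3, 2 on this slide, and it is not the real clusterer output. The helper name `confusion_counts` is also an assumption.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Tally TP, FP, FN, TN for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, really positive
            else:
                fp += 1  # predicted positive, really negative
        else:
            if a == positive:
                fn += 1  # predicted negative, really positive
            else:
                tn += 1  # predicted negative, really negative
    return tp, fp, fn, tn


# Actual play labels from weather.arff, in file order.
actual = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]
# HYPOTHETICAL predictions, chosen only so the tallies match the slide.
predicted = ["yes", "yes", "yes", "yes", "yes", "yes", "yes",
             "no", "yes", "yes", "no", "no", "no", "no"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp)  # TP FP
print(fn, tn)  # FN TN
```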

  17. jgs
    Accuracy

  18. jgs
    Definition
    § An overall measure of how often the model predicts correctly across the
    entire data set.
    § The sum of the elements on the main diagonal of the confusion matrix
    divided by the sum of all its entries.

  19. jgs
    Accuracy
    Accuracy = (TP + TN) / (TP + TN + FP + FN)
    Accuracy = (6 + 2) / (6 + 2 + 3 + 3)
    Accuracy = 8 / (8 + 6)
    Accuracy = 8 / 14
    Accuracy ≈ 0.57

                 YES (0)   NO (1)
    YES (0)      TP        FP
    NO (1)       FN        TN
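
The same arithmetic as a small Python sketch. The function name `accuracy` and the guard for an empty matrix are assumptions; the counts are the ones on this slide. The parentheses matter: numerator and denominator are each summed before dividing.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all instances that lie on the main diagonal (TP and TN)."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("empty confusion matrix")
    return (tp + tn) / total


value = accuracy(tp=6, fp=3, fn=3, tn=2)
print(f"{value:.2f}")
```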

  20. jgs
    Thoughts?
    What is a Good Value?
    TP FP
    FN TN

  21. jgs
    Precision and Recall

  22. jgs
    Precision and Recall
    // How much can we trust the model when it predicts a Positive?
    Precision = TP / (TP + FP)
    Precision = 6 / (6 + 3)
    Precision = 6 / 9 ≈ 0.67
    // Measures the ability of the model to find all Positive units
    Recall = TP / (TP + FN)
    Recall = 6 / (6 + 3)
    Recall = 6 / 9 ≈ 0.67

                 YES (0)   NO (1)
    YES (0)      TP        FP
    NO (1)       FN        TN
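
A short Python sketch of the two formulas, using the counts TP = 6, FP = 3, FN = 3 from the confusion matrix in this deck. The function names, and the convention of returning 0.0 when the denominator is zero, are assumptions rather than anything stated on the slide.

```python
def precision(tp, fp):
    """Of everything predicted Positive, how much really is Positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Of everything that really is Positive, how much was found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


p = precision(tp=6, fp=3)
r = recall(tp=6, fn=3)
print(f"precision={p:.3f} recall={r:.3f}")
```

The two values coincide here only because FP and FN happen to be equal; in general they differ.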

  23. jgs
    Thoughts?
    What is a Good Value?
    TP FP
    FN TN

  24. jgs
    F1-score

  25. jgs
    Definition
    § The harmonic mean of precision and recall.
    § A mixture of:
    how much we can trust the model when it predicts a Positive (Precision), and
    the ability of the model to find all Positive units (Recall).

  26. jgs
    Harmonic mean
    a = 7
    b = 3
    // if same units (big)
    Arithmetic mean = (7 + 3) / 2 = 5
    // if diverse units (small)
    Geometric mean = sqrt(7 * 3) ≈ 4.58
    // ratios of diverse units (smaller)
    Harmonic mean = pow(sqrt(7 * 3), 2) / ((7 + 3) / 2) = 21 / 5 = 4.2
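
The three means can be checked with a few lines of Python. This sketch uses the slide's values a = 7 and b = 3; the closed form 2ab / (a + b) for the harmonic mean of two numbers is a standard identity, not something written on the slide.

```python
from math import isclose, sqrt

a, b = 7, 3

arithmetic = (a + b) / 2        # 5.0
geometric = sqrt(a * b)         # about 4.58
harmonic = 2 * a * b / (a + b)  # 42 / 10

# For two numbers: harmonic = geometric squared divided by arithmetic.
same_value = geometric ** 2 / arithmetic

print(arithmetic, round(geometric, 2), round(harmonic, 2), round(same_value, 2))
```

For positive values the ordering harmonic ≤ geometric ≤ arithmetic always holds, which is why the harmonic mean penalizes a large gap between precision and recall.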

  27. jgs
    F1-score
    Precision = TP / (TP + FP)    // predicted
    Recall = TP / (TP + FN)       // real
    F1-score = 2 * Precision * Recall / (Precision + Recall)
    F1-score = 2 * 0.67 * 0.67 / (0.67 + 0.67) = 0.67

                 YES (0)   NO (1)
    YES (0)      TP        FP
    NO (1)       FN        TN
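
A sketch of the F1-score in Python, computed from the counts TP = 6, FP = 3, FN = 3 rather than from rounded intermediate values. The function name `f1_score` and the zero-denominator convention are assumptions. The second expression, 2·TP / (2·TP + FP + FN), is an algebraically equivalent form included as a cross-check.

```python
from math import isclose


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(tp=6, fp=3, fn=3)
cross_check = 2 * 6 / (2 * 6 + 3 + 3)
print(f"{f1:.3f}", isclose(f1, cross_check))
```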

  28. jgs
    Thoughts?
    What is a Good Value?
    TP FP
    FN TN

  29. jgs
    What about more than 2 categories?

  30. jgs
    Definition
    What about non-binary classifiers?
    e.g. emotion recognition

                      😀 (positive)   😡 (negative)   🙁 (negative)
    😀 (positive)     TP              FP              FP
    😡 (negative)     FN              TN              TN
    🙁 (negative)     FN              TN              TN

  31. jgs
    Definition

                 positive   negative   negative
    positive     TP         FP         FP
    negative     FN         TN         X
    negative     FN         X          TN

    Accuracy = (TP + TN) / (TP + TN + FP + FN + X)
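
With more than two classes, accuracy is still the diagonal divided by everything: every off-diagonal cell (FP, FN or X in the slide's notation) counts as an error. A minimal Python sketch follows. The 3x3 counts are hypothetical, invented only to have something to compute on, and `multiclass_accuracy` is an assumed name.

```python
def multiclass_accuracy(matrix):
    """Sum of the main diagonal divided by the sum of all cells."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# HYPOTHETICAL counts for three emotion classes (not from the lecture).
matrix = [
    [5, 1, 0],
    [2, 4, 1],
    [1, 2, 4],
]

acc = multiclass_accuracy(matrix)
print(f"{acc:.2f}")
```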

  32. jgs
    Definition

                 positive   negative   negative
    positive     TP         FP         FP
    negative     FN         TN         X
    negative     FN         X          TN

    // How much can we trust the model when it predicts a Positive?
    Precision = TP / (TP + FP)
    // Measures the ability of the model to find all Positive units
    Recall = TP / (TP + FN)

  33. jgs
    Definition

                 positive   negative   negative
    positive     TP         FP         FP
    negative     FN         TN         X
    negative     FN         X          TN

    Precision = TP / (TP + FP)    // predicted
    Recall = TP / (TP + FN)       // real
    F1-score = 2 * Precision * Recall / (Precision + Recall)
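
One way to apply these formulas to a multi-class matrix is to treat each class in turn as the positive one (one-vs-rest). The Python sketch below does that under stated assumptions: the 3x3 counts and the class names are hypothetical, the helper names are invented, and the matrix is read with TP and FP in the row of the positive class and FN in its column, following the slide's layout. Precision and recall never use TN, so how the X cells are treated does not change them.

```python
def one_vs_rest_counts(matrix, k):
    """TP, FP, FN, TN for class k, with FP along row k and FN along column k."""
    n = len(matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k][j] for j in range(n) if j != k)
    fn = sum(matrix[i][k] for i in range(n) if i != k)
    tn = sum(sum(row) for row in matrix) - tp - fp - fn
    return tp, fp, fn, tn


def precision_recall_f1(matrix, k):
    tp, fp, fn, _ = one_vs_rest_counts(matrix, k)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


labels = ["happy", "angry", "sad"]  # hypothetical class names
matrix = [                           # HYPOTHETICAL counts
    [5, 1, 0],
    [2, 4, 1],
    [1, 2, 4],
]

for k, label in enumerate(labels):
    p, r, f = precision_recall_f1(matrix, k)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Averaging the per-class F1 values (a macro average) is one common way to get a single number, though the slides do not prescribe it.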

  34. jgs
    Can we do better?

  35. jgs
    K-means, DBSCAN, EM
    Class (play) of the 14 instances, in file order:
    no, no, yes, yes, yes, no, yes, no, yes, yes, yes, yes, yes, no
    Columns compared against it: K-means | DBSCAN | EM

  36. jgs
    K-means, DBSCAN, EM
    Class (play) of the 14 instances, in file order:
    no, no, yes, yes, yes, no, yes, no, yes, yes, yes, yes, yes, no
    Result per algorithm: K-means 8/14 | DBSCAN 8/14 | EM 10/14
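
A fraction such as 8/14 comes from comparing cluster assignments with the known class labels. One simple convention, assumed here and not necessarily the one used to produce the numbers on this slide, is to label each cluster with the majority class among its members and count the instances that agree. In the Python sketch below the `labels` list is the play column of weather.arff, while the `clusters` list is hypothetical and is not the output of K-means, DBSCAN or EM.

```python
from collections import Counter


def majority_mapping(clusters, labels):
    """Map each cluster id to the most common class label inside it."""
    by_cluster = {}
    for c, y in zip(clusters, labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}


def count_matches(clusters, labels):
    """Number of instances whose cluster's majority class equals their class."""
    mapping = majority_mapping(clusters, labels)
    return sum(mapping[c] == y for c, y in zip(clusters, labels))


# Play column of weather.arff, in file order.
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]
# HYPOTHETICAL cluster ids, invented for illustration only.
clusters = [0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0]

mapping = majority_mapping(clusters, labels)
hits = count_matches(clusters, labels)
print(mapping, f"{hits}/{len(labels)}")
```

Note that a majority vote can assign the same class to two clusters; a stricter evaluation would require a one-to-one mapping between clusters and classes.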

  37. jgs
    Homework
    • Open or Closed Eyes VS. Brain
    • 5 diverse stimulation scenarios VS. Brain
    • 5 diverse stimulation scenarios VS. Affect
    Follow 2 approaches:
    a) Clustering as described today (EM, K-means, Density)
    b) Explore another solution to the best of your knowledge
    (Machine Learning, Data mining, Statistics)
    Due: Monday (April 24)

  38. jgs
    One Last Thing

  39. jgs
    For CSV files

  40. jgs
    For CSV files

  41. jgs
    Can we do better?

  42. jgs
    Thoughts?
    Clustering was easy.
    What about something more precise for our data
    (such as a RandomForest)?

  43. jgs
    Test Yourselves

  44. jgs
    Just To Practice
    § https://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/iris.arff
    § The data describes iris flowers by their sepal and petal measurements;
    the class is the species.
    § Number of Instances: 150
    § @attribute 'class' {Iris-setosa,Iris-versicolor,Iris-virginica}
    § K-means (3)
    § DBSCAN
    § EM
    § Evaluation: Likelihood Values
    § Confusion Matrix, Accuracy

  45. jgs
    ARFF Examples
    https://storm.cis.fordham.edu/~gweiss/data-mining/datasets.html

  46. jgs
    Questions

  47. jgs
    Office Hours
    Tuesday and Thursday 3 - 5 pm
    An appointment is required
    Send me an email – [email protected]

  48. jgs
    CSC 570 Applied Affective Computing
    Javier Gonzalez-Sanchez, Ph.D.
    [email protected]
    Spring 2023
    Copyright. These slides can only be used as study material for the class CSC308 at Cal Poly.
    They cannot be distributed or used for another purpose.
