Narrative of Iris

kilometer
September 19, 2020

in #88 Tokyo.R


Transcript

  1. Who!?
    Name: @kilometer
    Job: Post-Doc (Ph. D. in Engineering)
    Field: Behavioral Neurosci., Brain Imaging, Medical System
    R: ~ 10 years
  2. Iris Data

    > head(iris)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa

    > str(iris)
    'data.frame': 150 obs. of 5 variables:
     $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
     $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
     $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
     $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
     $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 ...
  3. Iris Data

    library(tidyverse)

    # Split names such as "Sepal.Length" at the dot:
    # the first part goes to `key`, the second becomes a value column.
    iris_long <- iris %>%
      pivot_longer(cols = -Species,
                   names_sep = "\\.",
                   names_to = c("key", ".value"))

    > iris_long
    ## # A tibble: 6 x 4
    ##   Species key   Length Width
    ##   <fct>   <chr>  <dbl> <dbl>
    ## 1 setosa  Sepal    5.1   3.5
    ## 2 setosa  Petal    1.4   0.2
    ## 3 setosa  Sepal    4.9   3
    ## 4 setosa  Petal    1.4   0.2
    ## 5 setosa  Sepal    4.7   3.2
  4. Iris Data

    # Width vs. length, colored by species; point shape marks sepal or petal.
    ggplot(data = iris_long) +
      aes(x = Width, y = Length, color = Species, shape = key) +
      geom_point()
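
    The long table used in this plot has one row per flower part (Sepal or Petal) with `Length` and `Width` columns. As a rough cross-check that does not need the tidyverse, the same layout can be sketched in base R. The names `flower_id` and `iris_long_base` are hypothetical and appear only in this sketch, not in the slides.

    ```r
    # Base-R sketch of the wide-to-long reshape (no tidyverse required).
    # `flower_id` and `iris_long_base` are illustrative names, not from the slides.
    n <- nrow(iris)
    iris_long_base <- rbind(
      data.frame(flower_id = seq_len(n), Species = iris$Species, key = "Sepal",
                 Length = iris$Sepal.Length, Width = iris$Sepal.Width),
      data.frame(flower_id = seq_len(n), Species = iris$Species, key = "Petal",
                 Length = iris$Petal.Length, Width = iris$Petal.Width)
    )

    # Interleave rows so each flower's Sepal row is followed by its Petal row.
    row_order <- order(iris_long_base$flower_id, iris_long_base$key != "Sepal")
    iris_long_base <- iris_long_base[row_order, c("Species", "key", "Length", "Width")]
    rownames(iris_long_base) <- NULL

    head(iris_long_base, 5)
    ```

    Each of the 150 flowers contributes two rows, so the result has 300 rows and the four columns Species, key, Length and Width.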
  5. Iris flowers
    Northern Blue flag (Iris versicolor)
    Gordon, D. & Robertson, E., from Wikipedia, CC BY-SA 3.0
  6. Iris flowers (Morphological classification of northern and sub-arctic blue flags)
    Iris setosa var. canadensis
    Iris setosa var. interior
    Iris setosa
    Iris versicolor
    Iris virginica
    Iris virginica var. shrevei
    Anderson, E., 1936, Ann Mo Bot Gard.
  7. The species problem
    Iris setosa var. canadensis: Genus (Iris), Specific name (setosa), Variety (canadensis)
    "Species are groups of actually or potentially interbreeding natural populations, which are reproductively isolated from other such groups." ---- by Ernst Mayr, 1942
    Mayr, E., 1942, Columbia Univ. Press
    Queiroz, K., 2005, PNAS
  8. The species problem
    SC (species criterion) 1-8: the times at which the daughter lineages acquire different properties relative to one another
    Queiroz, K., 2005, PNAS
  9. "The species Problem in Iris" ---- by Edgar Anderson, 1936
    "As a biological phenomenon the species problem is worthy of serious study as an end in itself."
    Anderson, E., 1936, Ann Mo Bot Gard.
  10. Iris flowers (Morphological classification of northern and sub-arctic blue flags)
    Iris setosa, Iris versicolor, Iris virginica
    Petals: setose (bearing bristles) / laminate (smooth)
    Sepals: minutely papillate at the base of blade / macroscopically pubescent at the base of blade
    Anderson, E., 1936, Ann Mo Bot Gard.
  11. Iris albicans
    'The Botanical Garden', detailing plants brought to Egypt after the campaigns of Tuthmosis III (around 1426 B.C.), Karnak Temple
    Farrar, L., 2016, Windgather Press, photo: en.wikipedia.org/wiki/Iris_albicans
  12. Iris, a Greek goddess
    ・Daughter of Thaumas (son of Pontus) & Electra (daughter of Oceanus)
    ・Messenger of Hera
    ・Goddess of the rainbow
    ・Goddess of the sky and sea
    Rainbow: bridge between Heaven and Earth (in ancient Greek, iris = rainbow, eiris = messenger)
    Koudu, H., 1953, Iwanami
  13. Iris flowers (Morphological classification of northern and sub-arctic blue flags)
    Iris setosa, Iris versicolor, Iris virginica
    Petals: setose (bearing bristles) / laminate (smooth)
    Sepals: minutely papillate at the base of blade / macroscopically pubescent at the base of blade
    Anderson, E., 1936, Ann Mo Bot Gard.
  14. Ideograph (Sepal, Petal): I. versicolor, I. virginica, I. virginica var. shrevei
    Anderson, E., 1936, Ann Mo Bot Gard.
  15. Q. The northern blue flags ...... study the minutiae of variation so intensively in these two species that one might demonstrate the way in which one species had evolved from the other, or from some common ancestor.
    A. Iris versicolor might vary greatly and that Iris virginica might vary greatly but that each remained itself. ...... The variation within could never be compounded into the variation between.
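  16. Fisher, R. A., 1936, Annals of Eugenics
    I. Discriminant functions
    When two or more populations have been measured in several characters x1, ..., x8, special interest attaches to the linear functions of the measurements by which the populations are best discriminated. At the author's suggestion, use has already been made of this in craniometry, (a) ......, and (b) ......, showing most distinctly a progressive or secular trend. In the present paper the application of the same principle is illustrated on a taxonomic problem, and some questions connected with the precision of the processes employed are also discussed.
  17. Fisher, R. A., 1936, Annals of Eugenics (literally, a yearbook of eugenics)
    I. Discriminant functions
    When two or more populations have been measured in several characters x1, ..., x8, special interest attaches to the linear functions of the measurements by which the populations are best discriminated. At the author's suggestion, use has already been made of this in craniometry, (a) ......, and (b) ......, showing most distinctly a progressive or secular trend. In the present paper the application of the same principle is illustrated on a taxonomic problem, and some questions connected with the precision of the processes employed are also discussed.
  18. Fisher, R. A., 1936, Annals of Eugenics (literally, a yearbook of eugenics)
    I. Discriminant functions
    When two or more populations have been measured in several characters x1, ..., x8, special interest attaches to the linear functions of the measurements by which the populations are best discriminated. At the author's suggestion, use has already been made of this in craniometry, (a) ......, and (b) ......, showing most distinctly a progressive or secular trend. In the present paper the application of the same principle is illustrated on a taxonomic problem, and some questions connected with the precision of the processes employed are also discussed.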
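
    To make "the linear function by which the populations are best discriminated" concrete, here is a minimal base-R sketch for two groups. It assumes the usual two-group form, with coefficients proportional to the inverse pooled within-group scatter matrix times the difference of group means. The choice of setosa versus versicolor and all variable names are assumptions for illustration, and the layout is not claimed to follow Fisher's own computation.

    ```r
    # Two-group linear discriminant, sketched in base R.
    # Group choice and variable names are illustrative assumptions.
    setosa     <- as.matrix(iris[iris$Species == "setosa", 1:4])
    versicolor <- as.matrix(iris[iris$Species == "versicolor", 1:4])

    # Pooled within-group sums of squares and products (scatter matrix).
    within_scatter <- crossprod(scale(setosa, scale = FALSE)) +
      crossprod(scale(versicolor, scale = FALSE))

    # Difference of the group mean vectors.
    mean_diff <- colMeans(versicolor) - colMeans(setosa)

    # Coefficients of the linear function (defined only up to a scale factor).
    w <- solve(within_scatter, mean_diff)

    # Project each flower onto the discriminant direction.
    score_setosa     <- as.vector(setosa %*% w)
    score_versicolor <- as.vector(versicolor %*% w)

    range(score_setosa)
    range(score_versicolor)
    ```

    If the two score ranges do not overlap, a single threshold on this one linear combination separates the two groups.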
  19. -- Publisher's comment --
    The work of eugenicists was often pervaded by prejudice against racial, ethnic and disabled groups.
    Publication of this material online is for scholarly research purposes and is not an endorsement or promotion of the views expressed in any of these articles or eugenics in general.
  20. The number of citations is one of the most popular indices of scientific research impact. Do you REALLY want to give this paper any more impact?
  21. When you use the iris data, you also become one of the characters in its narrative.
  22. "We do not guarantee that all the results we discuss for “the” Iris data really pertain to the same numerical inputs."
    Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems
  23. Specifically, two vectors in Iris Setosa were wrong:
    vector 35 in Fisher is (4.9, 3.1, 1.5, 0.2), but in the machine learning electronic database it had the coordinates (4.9, 3.1, 1.5, 0.1);
    and vector 38 in Fisher is (4.9, 3.6, 1.4, 0.1), but in the electronic database it was (4.9, 3.1, 1.5, 0.1).
    Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems
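
    A copy of the data can be compared against the two Fisher vectors quoted here. The sketch below checks R's built-in `iris`. It assumes the coordinates are ordered (sepal length, sepal width, petal length, petal width) and that rows 35 and 38 of `iris` correspond to setosa vectors 35 and 38; both are assumptions, as are the variable names.

    ```r
    # Fisher's values for setosa vectors 35 and 38, as quoted by Bezdek et al.
    fisher_35 <- c(4.9, 3.1, 1.5, 0.2)
    fisher_38 <- c(4.9, 3.6, 1.4, 0.1)

    # Assumed coordinate order; setosa occupies the first 50 rows of `iris`.
    measure_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
    row_35 <- unlist(iris[35, measure_cols], use.names = FALSE)
    row_38 <- unlist(iris[38, measure_cols], use.names = FALSE)

    # TRUE means this copy agrees with the quoted Fisher values for that vector.
    matches_35 <- isTRUE(all.equal(row_35, fisher_35))
    matches_38 <- isTRUE(all.equal(row_38, fisher_38))

    c(vector_35 = matches_35, vector_38 = matches_38)
    ```

    The same comparison can be run on any other copy loaded as a data frame with these column names.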
  24. "Better yet (and we know many of you will check our version this way), return to the source and take the values directly from Fisher’s paper."
    Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems
  25. "Better yet (and we know many of you will check our version this way), return to the source and take the values directly from Fisher’s paper."
    Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems
    -> Or, stop using the Iris data.
  26. 2. Anderson, E., "Species problem in iris.", 1936
    Fisher, R., Annals of Eugenics, 1936
    Should not be cited any more, because the number of citations is one measure of scientific impact.
  27. The only way to stop citing Fisher's paper is to not use the iris data. That would also solve the other annoying problem of checking for miscopying. Don't forget: when you use the iris data, you also become one of the characters in its narrative. We can stop using the iris data today. Actually, it's quite easy.
    5. My opinion