Narrative of Iris

kilometer
September 19, 2020

in #88 Tokyo.R

Transcript

  1. #88 2020.09.19 Narrative of iris data kilometer00

  2. Who!? (Who is this?)

  3. Who!?

    Name: @kilometer
    Job: Post-Doc (Ph.D. in Engineering)
    Field: Behavioral Neurosci., Brain Imaging, Medical System
    R: ~ 10 years
  4. Introduction of Iris data

  5. Iris Data

    > head(iris)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
    6          5.4         3.9          1.7         0.4  setosa

    > str(iris)
    'data.frame': 150 obs. of 5 variables:
     $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
     $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
     $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
     $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
     $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 ...
  6. Iris Data

    library(tidyverse)

    # Split names such as "Sepal.Length" at the dot: the first part goes
    # to `key`, the second part names the value column (Length / Width).
    iris_long <- iris %>%
      pivot_longer(cols = -Species,
                   names_sep = "\\.",
                   names_to = c("key", ".value"))

    > iris_long
    ## # A tibble: 6 x 4
    ##   Species key   Length Width
    ##   <fct>   <chr>  <dbl> <dbl>
    ## 1 setosa  Sepal    5.1   3.5
    ## 2 setosa  Petal    1.4   0.2
    ## 3 setosa  Sepal    4.9   3
    ## 4 setosa  Petal    1.4   0.2
    ## 5 setosa  Sepal    4.7   3.2
  7. Iris Data

    # Color by species, point shape by flower part (Sepal / Petal)
    ggplot(data = iris_long) +
      aes(x = Width, y = Length,
          color = Species, shape = key) +
      geom_point()
  8. “R for Data Science” (Wickham & Grolemund, 2017)

  9. “R for Data Science” (Wickham & Grolemund, 2017) Data

  10. “R for Data Science” (Wickham & Grolemund, 2017)

    Data, Hypothesis & observation, Objectives, Background
  11. “R for Data Science” (Wickham & Grolemund, 2017)

    Data, Hypothesis & observation, Objectives, Background
  12. Iris flowers

  13. Iris flowers

    Northern Blue flag (Iris versicolor), Gordon, D. & Robertson, E., from Wikipedia, CC BY-SA 3.0
  14. Iris flowers (Morphological classification of northern and sub-arctic blue flags)

    Iris setosa var. canadensis, Iris setosa var. interior, Iris setosa, Iris versicolor, Iris virginica, Iris virginica var. shrevei

    Anderson, E., 1936, Ann Mo Bot Gard.
  15. The species problem

    Iris (Genus) setosa (Specific name) var. canadensis (Variety)

    "Species are groups of actually or potentially interbreeding natural populations, which are reproductively isolated from other such groups." ---- by Ernst Mayr, 1942

    Mayr, E., 1942, Columbia Univ. Press; Queiroz, K., 2005, PNAS
  16. The species problem

    SC (species criterion) 1-8: the times at which the daughter lineages acquire different properties relative to one another

    Queiroz, K., 2005, PNAS
  17. "The Species Problem in Iris" ---- by Edgar Anderson, 1936

    "As a biological phenomenon the species problem is worthy of serious study as an end in itself."

    Anderson, E., 1936, Ann Mo Bot Gard.
  18. Iris flowers (Morphological classification of northern and sub-arctic blue flags)

    Species: Iris setosa, Iris versicolor, Iris virginica
    Petals: setose / laminate
    Sepals: minutely papillate at the base of blade / macroscopically pubescent at the base of blade

    Anderson, E., 1936, Ann Mo Bot Gard.
  19. The other Irises

  20. Iris albicans

    'The Botanical Garden' detailing plants brought to Egypt after the campaigns of Tuthmosis III (around 1426 B.C.), Karnak Temple

    Farrar, L., 2016, Windgather Press, photo: en.wikipedia.org/wiki/Iris_albicans
  21. Iris, a Greek goddess

    ・Daughter of Thaumas (son of Pontus) & Electra (daughter of Oceanus)
    ・Messenger of Hera
    ・Goddess of the rainbow
    ・Goddess of the sky and sea

    Rainbow: bridge between Heaven and Earth (in ancient Greek, iris = rainbow, eiris = messenger)

    Koudu, H., 1953, Iwanami
  22. Hera & Iris (ca. 480 B.C.)

    Labels: Hera, Iris, kerykeion, oinochoe (jug), wings, sakkos

    photo: https://www.theoi.com/Gallery/P21.6.html
  23. Iris in anatomy: the iris of the eye. Figures: en.wikipedia.org/wiki/Iris_(anatomy)

  24. Iris, as a symbol: Fleur-de-lis

    I. pseudacorus, I. florentina

    photos: en.wikipedia.org/wiki/Fleur-de-lis, wiki/Iris_pseudacorus, wiki/Iris_florentina
  25. Iris Encode

  26. Ramen Encode

  27. Encode Apple Real Apple Information Decode

  28. Divergence Real Info Data Apple Encoding

  29. Loss Symbol grounding problem Divergence Real Info Data Apple Encoding

  30. Anderson's Iris study

  31. "The Species Problem in Iris" ---- by Edgar Anderson, 1936

    Symbol grounding problem
  32. Iris flowers (Morphological classification of northern and sub-arctic blue flags)

    Species: Iris setosa, Iris versicolor, Iris virginica
    Petals: setose / laminate
    Sepals: minutely papillate at the base of blade / macroscopically pubescent at the base of blade

    Anderson, E., 1936, Ann Mo Bot Gard.
  33. Anderson, E., 1936, Ann Mo Bot Gard., towardsdatascience.com

  34. Ideograph: Sepal, Petal

    I. versicolor, I. virginica, I. virginica var. shrevei

    Anderson, E., 1936, Ann Mo Bot Gard.
  35. The northern blue flags

    Q. "...... study the minutiae of variation so intensively in these two species that one might demonstrate the way in which one species had evolved from the other, or from some common ancestor."

    A. "Iris versicolor might vary greatly and that Iris virginica might vary greatly but that each remained itself. ...... The variation within could never be compounded into the variation between."
  36. The iris data

  37. > ?iris

  38. None
  39. None
  40. None
  41. “R for Data Science” (Wickham & Grolemund, 2017)

    Data, Hypothesis & observation, Objectives, Background
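  42. Fisher, R. A., 1936, Annals of Eugenics

    I. Discriminant functions

    When two or more populations have been measured in several characters, x1, ..., xs, special interest attaches to certain linear functions of the measurements by which the populations are best discriminated. At the author's suggestion use has already been made of this fact in craniometry (a) ......, and (b) ...... showing most distinctly a progressive or secular trend. In the present paper the application of the same principle will be illustrated on a taxonomic problem; some questions connected with the precision of the processes employed will also be discussed.
  43. Fisher, R. A., 1936, Annals of Eugenics

    I. Discriminant functions

    When two or more populations have been measured in several characters, x1, ..., xs, special interest attaches to certain linear functions of the measurements by which the populations are best discriminated. At the author's suggestion use has already been made of this fact in craniometry (a) ......, and (b) ...... showing most distinctly a progressive or secular trend. In the present paper the application of the same principle will be illustrated on a taxonomic problem; some questions connected with the precision of the processes employed will also be discussed.

    (Annals of Eugenics)
  44. Fisher, R. A., 1936, Annals of Eugenics

    I. Discriminant functions

    When two or more populations have been measured in several characters, x1, ..., xs, special interest attaches to certain linear functions of the measurements by which the populations are best discriminated. At the author's suggestion use has already been made of this fact in craniometry (a) ......, and (b) ...... showing most distinctly a progressive or secular trend. In the present paper the application of the same principle will be illustrated on a taxonomic problem; some questions connected with the precision of the processes employed will also be discussed.

    (Annals of Eugenics)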
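
The abstract above is about finding a linear function of the measurements that best separates the populations. A minimal sketch of that idea for two groups follows, using base R only. It is the textbook two-group form (coefficients proportional to the inverse pooled covariance times the difference of the group means), not a reproduction of Fisher's own calculations. The data are simulated, and the group means, spreads, and sizes are illustrative assumptions.

```r
# Two-group linear discriminant on simulated data (base R only).
# All numbers below are illustrative assumptions, not values from any paper.
set.seed(1)
n <- 50
g1 <- cbind(x1 = rnorm(n, mean = 5.0, sd = 0.4),
            x2 = rnorm(n, mean = 1.5, sd = 0.2))
g2 <- cbind(x1 = rnorm(n, mean = 6.0, sd = 0.4),
            x2 = rnorm(n, mean = 4.0, sd = 0.2))

# Group means and pooled within-group covariance matrix
m1 <- colMeans(g1)
m2 <- colMeans(g2)
s_pooled <- ((n - 1) * cov(g1) + (n - 1) * cov(g2)) / (2 * n - 2)

# Coefficients of the linear function: w is proportional to S^{-1} (m1 - m2)
w <- solve(s_pooled, m1 - m2)

# Project every observation onto the discriminant axis
score1 <- drop(g1 %*% w)
score2 <- drop(g2 %*% w)

# Classify by the midpoint between the two projected group means
cutoff <- sum(w * (m1 + m2)) / 2
accuracy <- mean(c(score1 > cutoff, score2 < cutoff))

print(w)
print(accuracy)
```

Simulated data is used here on purpose, so the sketch does not depend on the iris data.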
  45. -- Publisher's comment --

    The work of eugenicists was often pervaded by prejudice against racial, ethnic and disabled groups.

    Publication of this material online is for scholarly research purposes and is not an endorsement or promotion of the views expressed in any of these articles or eugenics in general.
  46. The number of citations is one of the most popular indices

    of scientific research impact. Do you REALLY want to give this paper any more impact?
  47. When you use the iris data, you also become one

    of the characters in its narrative.
  48. Is your iris "the iris"?

  49. Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy

    Systems "We do not guarantee that all the results we discuss for “the” Iris data really pertain to the same numerical inputs."
  50. Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems

    "Specifically, two vectors in Iris Setosa were wrong: vector 35 in Fisher is (4.9, 3.1, 1.5, 0.2), but in the machine learning electronic database it had the coordinates (4.9, 3.1, 1.5, 0.1); and vector 38 in Fisher is (4.9, 3.6, 1.4, 0.1), but in the electronic database it was (4.9, 3.1, 1.5, 0.1)."
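
The two vectors quoted above can be checked against whichever copy of the data is at hand. The sketch below compares rows 35 and 38 of R's built-in `iris` with both sets of values. It assumes that the four coordinates in the quote follow the column order of R's `iris` (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width); `row_matches` is a hypothetical helper name introduced only for this illustration.

```r
# Values as quoted from Bezdek et al. (1999): Fisher's paper vs. the
# miscopied electronic database. Column order is assumed to match R's iris.
fisher_35  <- c(4.9, 3.1, 1.5, 0.2)
fisher_38  <- c(4.9, 3.6, 1.4, 0.1)
miscopy_35 <- c(4.9, 3.1, 1.5, 0.1)
miscopy_38 <- c(4.9, 3.1, 1.5, 0.1)

# Hypothetical helper: do the first four columns of one row equal `expected`?
row_matches <- function(data, row, expected) {
  isTRUE(all.equal(unlist(data[row, 1:4], use.names = FALSE), expected))
}

# One line per checked row: which version does this copy agree with?
check <- data.frame(
  row = c(35, 38),
  matches_fisher = c(row_matches(iris, 35, fisher_35),
                     row_matches(iris, 38, fisher_38)),
  matches_miscopy = c(row_matches(iris, 35, miscopy_35),
                      row_matches(iris, 38, miscopy_38))
)
print(check)
```

To check another copy, pass that data frame to `row_matches` instead of `iris`, provided its first four columns are in the same order.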
  51. http://archive.ics.uci.edu/ml/datasets/Iris

  52. http://archive.ics.uci.edu/ml/datasets/Iris

  53. None
  54. "Better yet (and we know many of you will check

    our version this way), return to the source and take the values directly from Fisher’s paper." Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems
  55. "Better yet (and we know many of you will check

    our version this way), return to the source and take the values directly from Fisher’s paper." Bezdek, J. C. et al., 1999, IEEE Transactions on Fuzzy Systems -> Or, stop using the Iris data.
  56. Movements

  57. None
  58. None
  59. None
  60. None
  61. Penguins?

  62. None
  63. https://allisonhorst.github.io/palmerpenguins/
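
As a rough sketch of what switching looks like, the scatter plot from the earlier slide can be redrawn with the penguins data. This assumes the `palmerpenguins` and `ggplot2` packages are installed (for example via `install.packages("palmerpenguins")`). The choice of bill depth and bill length as axes is only an illustration, and `has_pkgs` and `penguins_plot_data` are names made up for this example.

```r
# Only run the example when both packages are available.
has_pkgs <- requireNamespace("palmerpenguins", quietly = TRUE) &&
  requireNamespace("ggplot2", quietly = TRUE)

if (has_pkgs) {
  library(ggplot2)

  # Take the data explicitly from the package to avoid name clashes.
  penguins <- palmerpenguins::penguins

  # Keep the three columns used in the plot and drop incomplete rows.
  penguins_plot_data <- na.omit(
    penguins[, c("species", "bill_length_mm", "bill_depth_mm")]
  )

  # Same layering style as the iris plot: data, then aesthetics, then points.
  p <- ggplot(data = penguins_plot_data) +
    aes(x = bill_depth_mm, y = bill_length_mm, color = species) +
    geom_point()
  print(p)
} else {
  message("Install 'palmerpenguins' and 'ggplot2' to run this example.")
}
```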

  64. Summary

  65. 1. Data always has its own narrative.

    Data, Hypothesis & observation, Objectives, Background
  66. 2. Anderson, E., "The Species Problem in Iris", 1936; Fisher, R., Annals of Eugenics, 1936

    Fisher's paper should not be cited any more, because the number of citations is one measure of scientific impact.
  67. 3. There are several miscopied versions of the "iris" data (Original vs. Miscopy).
  68. 4. Community movement

  69. 5. My opinion

    The only way to stop citing Fisher's paper is to not use the iris data. That would also solve the other annoying problem of checking for miscopying. Don't forget: when you use the iris data, you also become one of the characters in its narrative. We can stop using the iris data today. Actually, it's quite easy.
  70. Enjoy!!! KTM