[Figure: labels "What you want to do", "Algorithm", "Program", "language", "environment", "Do it!"; includes a truncated C code listing]
Slide 4
Agenda
1. Preparation (Installing RStudio)
2. Invoking commands and finding the results
3. Importing data from Excel
4. Summarizing data
5. Visualizing data shape
6. Trying a few statistical methods
Slide 5
1. Preparation
Slide 6
Preparing R
‣ R
✓ The R engine
✓ Includes a minimal environment
‣ RStudio
✓ A more convenient environment
✓ For your daily use
✓ Requires the R engine
Slide 7
Installing R
https://www.r-project.org/about.html
Select a mirror site in Japan
CRAN page
Download R for (Mac) OS X → R-3.4.3.pkg
Download R for Windows → Install R for the first time → Download R 3.4.3 for Windo
Slide 8
Installing RStudio
https://www.rstudio.com
Download links are at the bottom of the page:
RStudio 1.1.414 - Windows Vista/7/8/10
RStudio 1.1.414 - Mac OS X 10.6+ (64-bit)
Slide 9
Launch RStudio
Check the version
Slide 10
2. Invoking commands and finding the results
Slide 11
Input here
Slide 12
R is a calculator
‣ Try entering mathematical expressions
> 1+1
> 100/3
> 100/3*3
‣ You can use parentheses
> (3+5)*7
> 3+5*7 # check the result
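The effect of the parentheses can be checked directly. The following is a minimal sketch in script form; the variable names `with_paren` and `without_paren` are only for illustration and do not appear in the slides.

```r
# Multiplication binds tighter than addition, so parentheses change the result.
with_paren    <- (3 + 5) * 7   # addition first: 8 * 7
without_paren <- 3 + 5 * 7     # multiplication first: 3 + 35

print(with_paren)     # 56
print(without_paren)  # 38

# 100/3*3 is evaluated left to right; all.equal() compares the result
# with 100 using a small numerical tolerance.
print(all.equal(100 / 3 * 3, 100))
```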
Slide 13
Save R commands to a File
‣ You can bundle several files into a "Project"
✓ Command file (script file)
✓ Data file
✓ Visualization
‣ File → New Project → New Directory → New Project
Slide 14
Save R commands to a File
‣ File → New File → R Script
‣ Type a command and press Ctrl+Enter to run it
‣ Results appear in the console
Slide 15
R is a high-level calculator
‣ Power
> 2^16
‣ Functions
> sin(pi/2) # sin(π/2)
> exp(1) # e^1
> factorial(10) # 10!
> choose(5,2) # 5C2
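As a quick check of what these functions return, here is a small sketch that prints each value and verifies `choose()` against its factorial definition. The variable name `by_hand` is an illustrative assumption.

```r
print(2^16)           # 65536
print(sin(pi / 2))    # 1
print(exp(1))         # Euler's number e, about 2.718282
print(factorial(10))  # 3628800
print(choose(5, 2))   # 10

# choose(n, k) is n! / (k! * (n - k)!), so it can be checked by hand.
by_hand <- factorial(5) / (factorial(2) * factorial(3))
print(by_hand == choose(5, 2))
```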
Slide 16
Try a graph
> plot(c(5,5,4,3,3,4,1,1))
> x=c(5,5,4,3,3,4,1,1) # variable definition
> plot(x) # simple!
> plot(x,type="b") # What is this data?
> plot(x,type="b",ylim=c(6,1))
> yr=2010:2017
> plot(yr,x,type="b",ylim=c(6,1))
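The same plot can be written as a short script. This is only a sketch: it uses the values from the first `plot()` call above, and the axis labels "Year" and "Value" are assumptions, since the slides do not say what the data represent.

```r
x  <- c(5, 5, 4, 3, 3, 4, 1, 1)  # values from the first plot() call above
yr <- 2010:2017                  # one year per value

# plot(yr, x) needs two vectors of equal length.
stopifnot(length(x) == length(yr))

# type = "b" draws both points and lines; ylim = c(6, 1) flips the y-axis
# so that smaller values appear higher up.
plot(yr, x, type = "b", ylim = c(6, 1),
     xlab = "Year", ylab = "Value")  # axis labels are illustrative assumptions
```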
Slide 17
In R, a variable (object) is a vector
> x # Just type name of variable
> x+10 # Check the result
> x+c(10,100)
> x[1] # The first value of vector
> x[c(1,3,5)] # the 1st, 3rd, 5th values
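The result of `x+c(10,100)` may be surprising at first: R recycles the shorter vector until it matches the length of the longer one. A minimal sketch, assuming `x` holds the eight values used in the first plot:

```r
x <- c(5, 5, 4, 3, 3, 4, 1, 1)

print(x + 10)          # 10 is added to every element
print(x + c(10, 100))  # c(10, 100) is recycled: +10, +100, +10, +100, ...

print(x[1])            # indexing starts at 1, not 0
print(x[c(1, 3, 5)])   # 5 4 3
```

Recycling works silently here because the length of `x` (8) is a multiple of 2; otherwise R still recycles but emits a warning.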
Slide 18
Close and re-open it
‣ Save script file
‣ Quit RStudio
‣ Check the script file in folder
‣ Re-open it by double-clicking the project file
Slide 19
3. Importing data from Excel
Slide 20
Import Excel data via CSV
‣ Download "carp-e.xlsx" from Bb9
‣ Open it in Excel and check it
‣ Save as "CSV (UTF-8)" → carp.csv
‣ On Windows, save as "CSV"
Slide 21
"Data frame"
> read.csv("carp.csv") # just print
> c=read.csv("carp.csv") # save in "c"
> c # print the content
‣ A "data frame" bundles several named vectors
> c$height # vector with name "height"
> c$height[3] # third value of the vector
> mean(c$height) # average of "height"
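To try the `$` and `[ ]` operations without the CSV file, a data frame can also be built by hand. The sketch below uses a hypothetical data frame `players` with made-up values; it is not the contents of carp.csv.

```r
# Made-up values for illustration only; not the contents of carp.csv.
players <- data.frame(
  height = c(180, 175, 188),  # cm
  weight = c(80, 72, 95)      # kg
)

print(players$height)        # the "height" column as a vector
print(players$height[3])     # 188
print(mean(players$height))  # 181
```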
Slide 22
"Data frame" (cont.)
> c$BMI=c$weight/(c$height/100)^2 # Calculate Body Mass Index
> c[c$BMI>30,] # Players whose BMI is larger than 30
BMI = Weight (kg) / Height (m)^2
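BMI is weight in kilograms divided by the square of height in meters, so a height recorded in centimeters has to be divided by 100 first. A minimal sketch with a hypothetical data frame `players` and made-up values (the name `heavy` is also illustrative):

```r
# Made-up values for illustration only.
players <- data.frame(
  height = c(180, 175, 170),  # cm
  weight = c(80, 72, 95)      # kg
)

# Convert cm to m, square it, and divide the weight by the result.
players$BMI <- players$weight / (players$height / 100)^2
print(round(players$BMI, 1))

# A logical condition in the row position keeps only the matching rows.
heavy <- players[players$BMI > 30, ]
print(heavy)
```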
Slide 23
Advanced 1: You can read an Excel file directly
‣ install.packages("readxl")
‣ library(readxl)
‣ carp=read_excel("carp.xlsx",1)
‣ dragons=read_excel("carp.xlsx",2)
Slide 24
4. Summarizing data
Slide 25
First, try summary()
✓ summary(c)
‣ Numerical data
✓ Min. (minimum), 1st Qu. (first quartile), Median, Mean, 3rd Qu. (third quartile), Max. (maximum)
‣ Categorical data
✓ Frequency
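The two kinds of output can be seen on a small example. This sketch uses a hypothetical data frame `players` with made-up values. One point to be aware of: since R 4.0.0, `read.csv()` and `data.frame()` keep text columns as character strings by default, so a column may need `factor()` before `summary()` reports frequencies.

```r
# Made-up values for illustration only.
players <- data.frame(
  height   = c(180, 175, 188, 183),
  position = c("pitcher", "catcher", "pitcher", "infielder")
)

# Convert the text column to a factor so summary() counts each category.
players$position <- factor(players$position)

print(summary(players$height))    # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
print(summary(players$position))  # frequency of each position
```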
Slide 26
Summary of a data group
> # summary of players whose BMI is larger than 26
> summary(c[c$BMI>26,])
> # summary of height by position
> tapply(c$height,c$position,summary)
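`tapply()` splits the first vector into groups defined by the second and applies the function to each group. A minimal sketch with a hypothetical data frame `players` and made-up values, using `mean` instead of `summary` to keep the output short:

```r
# Made-up values for illustration only.
players <- data.frame(
  height   = c(180, 175, 188, 183),
  position = c("pitcher", "catcher", "pitcher", "infielder")
)

# One mean height per position; the result is a named vector.
mean_by_pos <- tapply(players$height, players$position, mean)
print(mean_by_pos)

# Passing summary instead returns one summary per position.
print(tapply(players$height, players$position, summary))
```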
Slide 27
5. Visualizing data shape
Slide 28
On a Mac, you need to set a font to use Japanese in graphics
> # Specify Hiragino Kaku Gothic W3 as the font
> par(family = "HiraKakuProN-W3")
Among baseball players, is the proportion of left-handers significantly high?
Assuming the usual proportion is 0.1, try a binomial test.
> summary(c$throwing)
> binom.test(13,13+56,0.1)
> cp=c[c$position=="pitcher",]
> summary(cp$throwing)
> binom.test(11,11+23,0.1)
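The object returned by `binom.test()` can be stored and inspected. The sketch below uses the counts from the first call above (13 out of 13+56). The one-sided variant with `alternative = "greater"` is an addition not shown in the slides; it may be the more direct match for the question of whether the proportion is larger than 0.1.

```r
# Two-sided test, as in the slide: 13 left-handers out of 13 + 56 players.
res <- binom.test(13, 13 + 56, p = 0.1)
print(res$estimate)  # observed proportion, 13/69
print(res$p.value)   # two-sided p-value

# One-sided test (an assumption, not in the slides): is the proportion > 0.1?
res_greater <- binom.test(13, 13 + 56, p = 0.1, alternative = "greater")
print(res_greater$p.value)
```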
Slide 36
The advantages of R
‣ Designed for statistical computing
‣ Beautiful graphics
‣ Operations are recorded as a script (plain text), so they can be replayed easily
‣ New statistical methods continue to be implemented in R
‣ It's free! Open source!