Me
• Shoichi Kaji
• Tokyo, Japan
• PAUSE/GitHub: skaji
• Perl5: cpm, App::FatPacker::Simple, Mojo::SlackRTM
• Perl6: mi6, Frinfon, evalbot in Slack :)
Slide 3
Agenda
• What is cpm, and why?
• cpanm VS cpm
• The internals of cpm
• divide the installation process into pieces
• learn from the Go language
• Roadmap
Slide 4
Q: What is cpm?
Slide 5
A: It’s yet another
CPAN client
Slide 6
Why a new CPAN client?
• Yes, I always use cpanm to install CPAN
modules. It’s awesome!
• Because cpanm installs modules in series,
it takes quite a lot of time to install a module
that has many dependencies
Slide 7
I want to install
CPAN modules
as fast as possible
Slide 8
Why a new CPAN client?
• So I created cpm
• Actually cpm is not a new CPAN client,
but it uses cpanm in parallel,
so that it can install CPAN modules much
faster
Slide 9
How fast?
cpanm VS cpm
installing Plack
Slide 10
cpanm
Slide 11
cpm
Slide 12
cpanm: 30sec
cpm: 10sec
cpm is 3x faster than cpanm!
Slide 13
Why is cpm so fast?
- The internals of cpm -
Slide 14
First, let’s think simple
$ cat modules | xargs cpanm
Can we just use xargs to parallelize cpanm?
NO, WE CAN’T.
Slide 15
The problem with
$ cat modules | xargs cpanm
• The modules to be installed are not determined in advance.
• Even if you have a list of modules to be installed, the cpanm
workers will break unless you synchronize them
• So we have to
• (1) divide the installation process of a CPAN module into pieces
that can be executed individually
• (2) synchronize the cpanm workers in some way
Slide 16
(1) Divide the installation process
of CPAN modules
# resolve, fetch, configure and install stand for the four jobs
sub install_module {
    my $module = shift;

    # 1. resolve: query cpanmetadb
    my $dist_url = resolve($module);

    # 2. fetch (and extract): wget && tar xzf && read META.json
    my ($dir, @configure_deps) = fetch($dist_url);
    install_module($_) for @configure_deps;

    # 3. configure: perl Makefile.PL/Build.PL && read MYMETA.json
    my @deps = configure($dir);
    install_module($_) for @deps;

    # 4. install: make install (or ./Build install)
    install($dir);
}
I divided the process
into 4 jobs:
* resolve
* fetch
* configure
* install
which are
independent
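To make the four jobs a bit more concrete, here is a minimal Go sketch of a job as a unit of work that moves from one step to the next. The names JobKind, Job and nextKind are made up for this illustration and are not cpm's actual data structures. Dependency handling is left out on purpose.

```go
package main

import "fmt"

// JobKind names one of the four steps from the slide.
type JobKind int

const (
	Resolve JobKind = iota
	Fetch
	Configure
	Install
)

// String returns the step name used on the slide.
func (k JobKind) String() string {
	return [...]string{"resolve", "fetch", "configure", "install"}[k]
}

// Job is a hypothetical unit of work that a worker can run on its own.
type Job struct {
	Kind   JobKind
	Target string // module name, URL or directory, depending on Kind
}

// nextKind returns the step that follows k, and false after the last step.
func nextKind(k JobKind) (JobKind, bool) {
	if k == Install {
		return k, false
	}
	return k + 1, true
}

func main() {
	// Walk one module through all four steps, ignoring dependencies.
	job := Job{Kind: Resolve, Target: "Plack"}
	for {
		fmt.Printf("%s %s\n", job.Kind, job.Target)
		next, ok := nextKind(job.Kind)
		if !ok {
			break
		}
		job.Kind = next
	}
}
```

Because each step is a separate job, a scheduler can hand the fetch job of one module to one worker while another worker configures a different module.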
Slide 17
(2) synchronize
cpanm workers
Slide 18
Take a look at the Go language…
Go provides two concurrency primitives:
* goroutines
* channels
They are very simple but powerful.
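package main

import "fmt"

// work receives jobs from in and sends one result per job to out.
func work(in <-chan string, out chan<- string) {
    for {
        job := <-in
        // do work with job
        out <- "result of " + job
    }
}

func main() {
    in := make(chan string)
    out := make(chan string)
    go work(in, out) // start the worker as a goroutine
    in <- "job"
    result := <-out
    fmt.Println(result)
}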
Slide 19
Take a look at the Go language…
// Reuses work from the previous slide; needs import "fmt".
func main() {
    in1 := make(chan string)
    out1 := make(chan string)
    go work(in1, out1)

    in2 := make(chan string)
    out2 := make(chan string)
    go work(in2, out2)

    in1 <- "job1"
    in2 <- "job2"

    // select takes whichever result arrives first
    select {
    case result1 := <-out1:
        fmt.Println(result1) // do something with result1
    case result2 := <-out2:
        fmt.Println(result2) // do something with result2
    }
}
It is very easy to
add more workers
You can use select to
wait on multiple channels
simultaneously
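The select on the slide picks up only the first result that arrives. To collect the results of both workers, select can be put in a loop. The following is a minimal sketch of that idea. The function name collect and the "done: " prefix are made up for this example.

```go
package main

import "fmt"

// work sends back one result for every job it receives.
func work(in <-chan string, out chan<- string) {
	for job := range in {
		out <- "done: " + job
	}
}

// collect gives one job to each of two workers and gathers
// both results with select, in whatever order they arrive.
func collect(job1, job2 string) []string {
	in1, out1 := make(chan string), make(chan string)
	in2, out2 := make(chan string), make(chan string)
	go work(in1, out1)
	go work(in2, out2)

	in1 <- job1
	in2 <- job2

	var results []string
	for len(results) < 2 {
		select {
		case r := <-out1:
			results = append(results, r)
		case r := <-out2:
			results = append(results, r)
		}
	}

	// Closing the input channels lets both workers exit.
	close(in1)
	close(in2)
	return results
}

func main() {
	for _, r := range collect("job1", "job2") {
		fmt.Println(r)
	}
}
```

The order of the two results is not fixed, because select takes whichever channel is ready first.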
Slide 20
Can we apply this
idea to Perl5?
Slide 21
Of course, we can.
Slide 22
Go <-> Perl5
Go          Perl5
goroutine   fork(2)
channel     pipe(2)
select      select(2)
Slide 23
The internals of cpm
[Diagram: Master, select, three pipes, three cpanm workers]
cpanm worker
1. get job via pipe
2. work, work, work!
3. send result via pipe
Master
1. prepare pipes for
workers with pipe(2)
2. launch workers with
fork(2) and connect
them with pipes
3. loop {
calculate jobs and send
them to idle workers. if
all workers are busy,
wait for them and
receive results with
select(2)
}
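The master loop above can be sketched in Go, going back to the goroutine and channel analogy instead of fork(2), pipe(2) and select(2). This is only an illustration under simplifying assumptions, not cpm's implementation: the job list is fixed up front (cpm calculates new jobs from the results), all workers share one result channel, and the names result, worker and runMaster are made up for this example.

```go
package main

import "fmt"

// result reports which worker finished which job.
type result struct {
	worker int
	job    string
}

// worker handles jobs from in and reports each one on out.
func worker(id int, in <-chan string, out chan<- result) {
	for job := range in {
		// work, work, work!
		out <- result{worker: id, job: job}
	}
}

// runMaster sends jobs to idle workers and, when no worker is idle
// or no job is left to send, waits for a result. n must be at least 1.
func runMaster(jobs []string, n int) []string {
	ins := make([]chan string, n)
	out := make(chan result)
	idle := make([]int, 0, n)
	for i := range ins {
		ins[i] = make(chan string)
		go worker(i, ins[i], out)
		idle = append(idle, i)
	}

	var done []string
	pending := 0 // jobs sent but not yet reported back
	for len(jobs) > 0 || pending > 0 {
		if len(jobs) > 0 && len(idle) > 0 {
			w := idle[len(idle)-1]
			idle = idle[:len(idle)-1]
			ins[w] <- jobs[0]
			jobs = jobs[1:]
			pending++
			continue
		}
		// Block until some worker reports back, then mark it idle.
		r := <-out
		pending--
		idle = append(idle, r.worker)
		done = append(done, r.job)
	}

	for _, in := range ins {
		close(in) // let the workers exit
	}
	return done
}

func main() {
	jobs := []string{"resolve A", "resolve B", "fetch A", "fetch B", "configure A"}
	done := runMaster(jobs, 2)
	fmt.Println(len(done), "jobs finished")
}
```

The master never blocks on a busy worker: it sends only to workers it knows are idle, and otherwise waits for any result to come back.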
Slide 24
Roadmap
• Last year I talked with Tatsuhiko Miyagawa
about cpanm 2.0 (menlo)
• Then he said “why don’t you merge cpm into
cpanm itself?”
• I was very happy to hear that!
Slide 25
Roadmap
• So if you all find cpm useful and stable,
cpm should be merged into cpanm 2.0
• Before merging, there are some problems
that need to be resolved:
• The log file is very messy
• I would highly appreciate your feedback!