cpm at PerlCon 2019 - Speaker Deck

Slide 1

Slide 1 text

cpm Shoichi Kaji

Slide 2

Slide 2 text

Shoichi Kaji (SKAJI) author of cpm and mi6

Slide 3

Slide 3 text

Agenda
• 5 features of cpm
• The internals of cpm: why cpm is fast
• Toward cpm version 1.0

Slide 4

Slide 4 text

5 features of cpm

Slide 5

Slide 5 text

1. fast

Slide 6

Slide 6 text

… and this is why I created cpm

Slide 7

Slide 7 text

Let’s compare cpm with cpanm

Slide 8

Slide 8 text

https://raw.githubusercontent.com/skaji/images/master/cpm/cpanm-Plack.gif

Slide 9

Slide 9 text

https://raw.githubusercontent.com/skaji/images/master/cpm/cpm-Plack.gif

Slide 10

Slide 10 text

cpm is 3x faster than cpanm

Slide 11

Slide 11 text

2. self-contained

Slide 12

Slide 12 text

You can use a dependency-free, self-contained cpm, which works fine with a fresh perl 5.8.1+

Slide 13

Slide 13 text

› curl -sL https://git.io/cpm > cpm
› chmod +x cpm
› ./cpm -V
cpm 0.983 (./cpm)
This is a self-contained version, 0.983

Slide 14

Slide 14 text

language: perl
perl:
  - "5.30"
install:
  - curl -sL https://git.io/cpm | perl - install -g
script:
  - prove -l t

It is also easy to use cpm in CI such as Travis CI

Slide 15

Slide 15 text

3. flexible resolvers

Slide 16

Slide 16 text

• You can change cpm’s resolvers via the --resolver option
• Let’s say you have a DarkPAN that contains your private distributions, but not the whole of CPAN.
• In fact, other CPAN clients do not work well in such a case, but cpm does.

Slide 17

Slide 17 text

› cpm install \
    --resolver 02packages,http://your-darkpan \
    --resolver metadb \
    Module1 Module2 ...

Resolve against your DarkPAN first and, if that fails, fall back to the normal metadb resolver
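The fallback order described above can be thought of as a chain of resolvers tried in turn. The following is a minimal Python sketch of that idea only; it is not cpm's Perl implementation, and the `index_resolver` and `resolve` helpers as well as the `example.invalid` URLs are hypothetical names made up for illustration.

```python
from typing import Callable, Dict, List, Optional

# A resolver maps a module name to a distribution URL, or None if unknown.
Resolver = Callable[[str], Optional[str]]


def index_resolver(index: Dict[str, str]) -> Resolver:
    """Hypothetical resolver backed by an in-memory index."""
    return lambda module: index.get(module)


def resolve(module: str, resolvers: List[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first hit."""
    for resolver in resolvers:
        found = resolver(module)
        if found is not None:
            return found
    return None


# Stand-ins for a private DarkPAN index and a public index (made-up data).
darkpan = index_resolver({"My::Private": "http://darkpan.example.invalid/My-Private.tar.gz"})
public = index_resolver({"Plack": "http://cpan.example.invalid/Plack.tar.gz"})
chain = [darkpan, public]  # DarkPAN first, public index as the fallback
```

With this chain, `resolve("My::Private", chain)` is answered by the first resolver, while `resolve("Plack", chain)` falls through to the second.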

Slide 18

Slide 18 text

4. static install

Slide 19

Slide 19 text

• Leon Timmermans has proposed a new concept for installing CPAN distributions, called static-install
• It is much simpler, safer, and faster than the traditional approach (i.e., executing Makefile.PL)
• cpm supports static-install
• I wrote a blog post about static install

Slide 20

Slide 20 text

5. prebuilt

Slide 21

Slide 21 text

• cpm keeps builds in ~/.perl-cpm and never fetches or builds them again
• This is (of course!) inspired by Carmel
• This makes cpm even faster!

Slide 22

Slide 22 text

https://raw.githubusercontent.com/skaji/images/master/cpm/cpm-prebuilt.gif

Slide 23

Slide 23 text

5 features of cpm
• 1. fast
• 2. self-contained
• 3. flexible resolvers
• 4. static install
• 5. prebuilt

Slide 24

Slide 24 text

The internals of cpm: why cpm is fast

Slide 25

Slide 25 text

Programming paradigms
• To make a program fast, there are several programming paradigms to choose from
• Multi Thread
  • I don’t think it is a good idea to use threads in perl5. Oops.
• Event Driven
  • This is a good choice. But once we adopt an event loop, we can no longer use synchronous code. This means that we cannot rely on cpanminus code. Oops.
• Multi Process
  • Let’s use this

Slide 26

Slide 26 text

Multi Process
• Let’s use the multi-process paradigm. So the next questions are:
• Q1: How do we pass data from the master to the workers and vice versa?
• Q2: The master needs to know as soon as possible that workers have finished their jobs. How do we achieve this?
[Diagram: a master process with three workers]

Slide 27

Slide 27 text

IPC
• Q1: How do we pass data between processes?
• Idea 1: files
  • Other processes do not quickly detect that files have changed. It appears inotify is "slow".
• Idea 2: TCP/IP
  • A good choice. But because the master and worker processes are on the same host, we don’t necessarily have to use TCP/IP.
• Idea 3: pipes
  • Let’s use this. We should prepare 2 pipes, one for master -> worker and one for worker -> master
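The two-pipe arrangement can be illustrated with a single worker. This is a hedged Python sketch rather than cpm's Perl code; `spawn_worker` and `run_one` are hypothetical names, and upper-casing a string stands in for real work such as building a distribution.

```python
import os


def spawn_worker():
    """Fork one worker joined to the master by two pipes, one per direction."""
    job_r, job_w = os.pipe()        # master -> worker
    result_r, result_w = os.pipe()  # worker -> master
    pid = os.fork()
    if pid == 0:
        # Worker process: close the pipe ends that belong to the master.
        os.close(job_w)
        os.close(result_r)
        try:
            with os.fdopen(job_r) as jobs, os.fdopen(result_w, "w") as results:
                for line in jobs:  # loop ends when the master closes its end
                    results.write(line.strip().upper() + "\n")  # pretend work
                    results.flush()
        finally:
            os._exit(0)  # never fall back into the master's code path
    # Master process: close the pipe ends that belong to the worker.
    os.close(job_r)
    os.close(result_w)
    return pid, os.fdopen(job_w, "w"), os.fdopen(result_r)


def run_one(job):
    """Send one job (a single line of text) and wait for its result."""
    pid, to_worker, from_worker = spawn_worker()
    to_worker.write(job + "\n")
    to_worker.flush()
    result = from_worker.readline().strip()
    to_worker.close()    # EOF tells the worker to exit
    from_worker.close()
    os.waitpid(pid, 0)   # reap the worker
    return result
```

Each direction gets its own pipe because a pipe is one-way: the master only writes to the job pipe and only reads from the result pipe.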

Slide 28

Slide 28 text

select
• How does the master detect which workers are finished?
• Workers send their results to the master via a pipe, so a readable pipe also means that its worker has finished.
• So, if the master monitors the pipes with select(2), it can quickly detect which workers are finished
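A small Python sketch of this monitoring step follows. It is an illustration under assumptions, not cpm's code: `spawn` and `collect` are hypothetical helpers, and each worker just sleeps for a given time before reporting its name on its pipe.

```python
import os
import select
import time


def spawn(name, delay):
    """Fork a worker that 'works' for `delay` seconds, then reports its name."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(r)
        try:
            time.sleep(delay)            # pretend to do a job
            os.write(w, name.encode())   # the result doubles as "I am done"
        finally:
            os._exit(0)
    os.close(w)
    return pid, r


def collect(jobs):
    """`jobs` maps a worker name to its pretend duration in seconds."""
    workers = [spawn(name, delay) for name, delay in jobs.items()]
    pending = {r for _, r in workers}
    finished = []
    while pending:
        # Block until at least one result pipe becomes readable.
        ready, _, _ = select.select(list(pending), [], [])
        for fd in ready:
            finished.append(os.read(fd, 1024).decode())
            os.close(fd)
            pending.remove(fd)
    for pid, _ in workers:
        os.waitpid(pid, 0)  # reap the workers
    return finished
```

The master never polls or sleeps here; `select.select` returns as soon as any worker has written its result, whichever worker that happens to be.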

Slide 29

Slide 29 text

Wrap-up so far
• We adopt the multi-process paradigm
• Connect the master with the workers via 2 pipes
• The master monitors the pipes with select(2) so that it quickly detects which workers are finished
[Diagram: the master process connected to three workers by pipes, which it monitors with select]
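Putting the three points together, a master loop might look roughly like the Python sketch below. This is an assumption-laden illustration, not cpm's implementation or any real module's API: the `Worker` class, the `run` function and the JSON-lines framing are all hypothetical choices made for the example.

```python
import json
import os
import select


class Worker:
    """Hypothetical worker process joined to the master by two pipes."""

    def __init__(self, work):
        job_r, job_w = os.pipe()        # master -> worker
        result_r, result_w = os.pipe()  # worker -> master
        self.pid = os.fork()
        if self.pid == 0:
            os.close(job_w)
            os.close(result_r)
            try:
                with os.fdopen(job_r) as jobs, os.fdopen(result_w, "w") as out:
                    for line in jobs:  # one JSON document per line
                        out.write(json.dumps(work(json.loads(line))) + "\n")
                        out.flush()
            finally:
                os._exit(0)
        os.close(job_r)
        os.close(result_w)
        self.to_worker = os.fdopen(job_w, "w")
        self.from_worker = os.fdopen(result_r)

    def send(self, job):
        self.to_worker.write(json.dumps(job) + "\n")
        self.to_worker.flush()

    def recv(self):
        return json.loads(self.from_worker.readline())


def run(jobs, work, n_workers=3):
    """Hand jobs to idle workers and collect results as they arrive."""
    workers = [Worker(work) for _ in range(n_workers)]
    queue, idle, busy, results = list(jobs), list(workers), {}, []
    while queue or busy:
        while queue and idle:  # give every idle worker a job
            worker = idle.pop()
            worker.send(queue.pop(0))
            busy[worker.from_worker.fileno()] = worker
        # Wait until at least one busy worker has written a result.
        ready, _, _ = select.select(list(busy), [], [])
        for fd in ready:
            worker = busy.pop(fd)
            results.append(worker.recv())
            idle.append(worker)
    # Close every pipe before waiting: later workers inherited copies of
    # earlier workers' pipe ends, so EOF only arrives once all are closed.
    for worker in workers:
        worker.to_worker.close()
        worker.from_worker.close()
    for worker in workers:
        os.waitpid(worker.pid, 0)
    return results
```

For example, `run([1, 2, 3, 4, 5], lambda n: n * n)` returns the five squares in completion order, which can differ from submission order.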

Slide 30

Slide 30 text

Actual code of cpm
• The master "calculates" jobs
• Get ready (= finished) workers here (internally, we select on the pipes!)
• The connection between the master and workers is modularized as the Parallel::Pipes module
• And send a job to the ready worker
These mere 11 lines of code make cpm fast!

Slide 31

Slide 31 text

Toward cpm version 1.0
• Make distributions "first-class objects"
• Traditionally we install CPAN modules into one specific directory. So, after installation, we cannot see which distribution a module comes from, and it is hard to reuse distributions
• On the other hand, MIYAGAWA has introduced the concept of "central repositories" in his project Carmel
• Let’s keep each distribution separately
• Once cpm implements this, it can, for example, easily install only runtime dependencies

Slide 32

Slide 32 text

Wrap up
• cpm is a fast CPAN client
• it also has some other interesting features
• cpm uses the system calls fork/pipe/select effectively so that it installs CPAN modules fast
• cpm 1.0 will treat distributions as "first-class objects"

Slide 33

Slide 33 text

Thanks
• PerlCon 2019
• Perl Toolchain Summit 2017, 2018, 2019
• and You ❤