Operating a public cloud

Loic Lambiel
Head of Operations
Slide 2

#whoami
- Loic Lambiel
  – Email: [email protected]
- Passionate sysadmin
- I've been working in IT for over a decade, mainly as a consultant in the systems area
- Head of Operations @ exoscale
  – Responsible for the overall exoscale platform & product operation
Slide 3

Our Cloud recipe
To provide efficient public cloud services, we need:
- A cloud
  – Cloudstack backed, of course!
- A team
  – Devops, of course!
- A third-party tools ecosystem
  – Open Source, of course!
Slide 4

Our Cloudstack-based cloud
- Running version 4.0.x
- Linux KVM hypervisor (Ubuntu based)
- Local storage only, no clustering
  – Keeps it simple, scalable and fast
- Basic networking mode
  – Means one public IP per VM, no VLANs
  – Secured by security groups, AWS style
Slide 5

Operating also means people
- The team must be kept small
  – Team growth ratio != number of virtual machines hosted
- Devops doctrine applied
- Development is operations-aware
  – What impact does my code have on production?
- Operations is managed with development principles and tools
  – Revertable changes (git)
  – Documented inline and in commits (avoids rewrites and information loss)
  – Accountable: who did what
Slide 6
Slide 6 text
User
Management
console
Visibility
stack
Configura7on
management
system
User
backend
stack
Our
Cloudstack
Ecoystem
Slide 7

No content
Slide 8

Exoscale management console
- Homemade management console
- The native Cloudstack UI was not suitable for our needs
- Simple, fast & efficient
- Hooks into the Cloudstack API
- An "AWS-style console"
Slide 9

"User backend stack"
- Homemade user backend modules
- Provide the user database, billing/chargeback, ticketing and knowledge base
- Used by Cloudstack for user authentication
- Cloudstack does not offer any native billing/chargeback capability
- Cloudstack only provides usage data, which must be processed for billing
Slide 10

Cloudstack usage record sample (JSON)
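The sample record itself did not survive extraction. As a minimal sketch of turning such a record into a billable amount, the following uses field names in the style of CloudStack's listUsageRecords API; the values and the rate table are made up for illustration.

```python
import json

# A CloudStack-style usage record (field names follow the
# listUsageRecords API; the values here are made up).
record_json = """
{
  "account": "acme",
  "usagetype": 1,
  "description": "i-2-42-VM running time",
  "rawusage": "24",
  "usage": "24 Hours",
  "startdate": "2013-06-01T00:00:00+0200",
  "enddate": "2013-06-01T23:59:59+0200"
}
"""

# Hypothetical hourly rates per usage type; a real billing pipeline
# would look these up from the service offering.
RATES = {1: 0.05}  # usage type 1 = running VM, price per hour

def bill(record):
    """Turn one raw usage record into a billable amount."""
    hours = float(record["rawusage"])
    rate = RATES.get(record["usagetype"], 0.0)
    return round(hours * rate, 2)

record = json.loads(record_json)
print(record["account"], bill(record))  # acme 1.2
```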
Slide 11

Configuration management, why?
- FALSE: "It is only infrastructure, it does not change"
- Repetitive tasks are boring and cost time
- Small devops team
- Adding & managing more and more machines
  – Quickly, if required
- Deploy, maintain & enforce the same configuration everywhere
- Adjust continuously
Slide 12
Slide 12 text
Therefore
we
need
“good
ci&zens”
!
A
machine
should:
Automa&cally
deploy
itself
(Almost)
Find
its
iden&ty
seings
(name,
networks,...)
Install
the
necessary
packages
for
which
it
was
intended
Register
itself
to
all
tools
Live
along
its
peers
and
respect
regula&ons
Report
to
cityhall
if
anything
goes
wrong
Slide 13

Configuration management: Puppet
- We use the well-known open source configuration management tool Puppet
- Exoscale's Puppet codebase:
  – 40+ modules
  – Each application has its own module
  – Between 10 and 100 commits per week
Slide 14
Slide 14 text
Monitoring
vs
Visibility
!
Monitoring
is
part
of
visibility
– Tradi&onally:
service
up,
CPU,
RAM,
network
&
disk
I/O
!
Are
we
genera&ng
business
value
?
– Need
more
insight
into
applica&on
behavior
(who
using
what,
...)
Slide 15
Slide 15 text
Trends
!
If
it
moves,
graph
it
!
If
it
doesn't
move,
graph
it
in
case
it
starts
moving
!
If
it
breaks
once,
monitor
it
!
Ques&on,
adapt
and
modify
thresholds
con&nuously
Slide 16
Slide 16 text
What
is
different
in
the
cloud
?
!
Distributed
systems
!
Lots
of
moving
parts
!
Scale
!
Easy
tools
to
quickly
assess
produc&on
status
required
Slide 17
Slide 17 text
Visibility
stack:
Logstash
!
Open
Source
Log
collector
!
Collects
every
logs
on
the
machine
– Rsyslogd,
Cloudstack,
Nginx,
Load
balancer,
tomcat,
java
etc…
!
Configura&on
managed
by
puppet
!
Backed
with
Elas&c
search
cluster
Slide 18

Visibility stack: Elasticsearch / Kibana
- Open source distributed RESTful search and analytics system
- Distributed, NoSQL
- Data indexing is done through HTTP PUT, searching through HTTP GET
- We store all our logs in a central ES cluster
- Logs are kept only 24 hours locally on each server
- The open source Kibana is used as the search portal
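The PUT-to-index / GET-to-search shape can be sketched as two plain HTTP requests; the endpoint, index name and document fields below are made up, and the requests are only built here, not sent.

```python
import json
import urllib.request

ES = "http://localhost:9200"  # assumed Elasticsearch endpoint

# Index one log event with HTTP PUT (index, type and id are made up).
doc = {"host": "web-01", "program": "nginx", "message": "GET / 200"}
index_req = urllib.request.Request(
    url=f"{ES}/logs-2013.06.01/event/1",
    data=json.dumps(doc).encode(),
    method="PUT",
    headers={"Content-Type": "application/json"},
)

# Search it back with HTTP GET using the query-string search API.
search_req = urllib.request.Request(
    url=f"{ES}/logs-2013.06.01/_search?q=program:nginx",
    method="GET",
)

# urllib.request.urlopen(index_req) / urlopen(search_req) would
# perform the calls against a running cluster.
print(index_req.get_method(), index_req.full_url)
print(search_req.get_method(), search_req.full_url)
```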
Slide 19

No content
Slide 20

Visibility stack: Graphite
- Open source real-time graphing system
- Stores numeric time-series data
- Renders graphs of this data on demand
- Web dashboard available
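Feeding Graphite is simple: its Carbon daemon accepts a plaintext "metric value timestamp" line over TCP. A minimal sketch (the metric path and host are made up; the network write is commented out so the sketch runs offline):

```python
import socket
import time

CARBON = ("graphite.example.com", 2003)  # assumed carbon host/port

def graphite_line(path, value, ts=None):
    """Format one metric in Carbon's plaintext protocol:
    '<metric path> <value> <timestamp>\\n'."""
    ts = int(time.time()) if ts is None else ts
    return f"{path} {value} {ts}\n"

line = graphite_line("servers.web-01.nginx.requests", 1234, ts=1370000000)
print(line, end="")  # servers.web-01.nginx.requests 1234 1370000000

# Sending is a plain TCP write:
# with socket.create_connection(CARBON) as s:
#     s.sendall(line.encode())
```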
Slide 21

No content
Slide 22

Visibility stack: Riemann
- Open source monitoring system
- "Passive" monitoring: an event stream processor
- Good for monitoring distributed systems
- Well suited for infrastructure with lots of moving parts
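Riemann processes streams of events like {host, service, state, metric}. Riemann itself is configured in Clojure; purely to illustrate the stream-processing idea, here is a toy Python version (the events and the notify action are made up):

```python
alerts = []

def notify(event):
    """Stand-in for a real action, e.g. emailing or posting to IRC."""
    alerts.append(f"{event['host']}/{event['service']} is {event['state']}")

def stream(event):
    """A trivial 'stream': forward only non-ok events to notify()."""
    if event["state"] != "ok":
        notify(event)

events = [
    {"host": "web-01", "service": "nginx", "state": "ok", "metric": 0.2},
    {"host": "db-01", "service": "disk /", "state": "critical", "metric": 0.97},
]
for e in events:
    stream(e)

print(alerts)  # ['db-01/disk / is critical']
```

A real Riemann config composes many such streams (thresholds, rollups, rate calculations) over events arriving from every host, which is what makes it a good fit for distributed systems.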
Slide 23
Slide 23 text
Visibility
stack:
Collectd
!
Open
Source
Sta&s&cs
collector
with
many
exis&ng
plugins
!
Plugin
may
be
a
script
!
Collectd
@
exoscale
– Metrics
sent
to
graphite
– Metrics
sent
to
Riemann
– Metrics
sent
to
custom
dashboard
apps
– SNMP
polling
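"A plugin may be a simple script": collectd's exec plugin runs a program and reads PUTVAL lines from its stdout. A minimal sketch (the plugin name and the choice of metric are made up):

```python
#!/usr/bin/env python3
# Emit one metric in the PUTVAL format read by collectd's exec plugin:
#   PUTVAL "<host>/<plugin>/<type>" interval=<seconds> <time>:<value>
import os
import socket

HOSTNAME = socket.gethostname()
INTERVAL = 10  # seconds; should match the exec plugin's Interval

def putval(plugin, type_instance, value, ts="N"):
    # "N" means "now" to collectd; "gauge" is a standard types.db type.
    return (f'PUTVAL "{HOSTNAME}/{plugin}/gauge-{type_instance}" '
            f"interval={INTERVAL} {ts}:{value}")

# e.g. report the 1-minute load average as a gauge (Unix only)
load1 = os.getloadavg()[0]
print(putval("myscript", "load1", round(load1, 2)))
```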
Slide 24

Visibility stack: Command and control
- Private IRC server & bots used as a "control tower"
- Central view of our infrastructure:
  – Monitoring alarms
  – All git commits
- Ability to pilot our servers through IRC bots:
  – Puppet apply, apt-get, service restart, etc.
  – No need to log on to servers
  – Changes can be performed on a group of servers very quickly
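A toy sketch of the bot side of "pilot our servers through IRC": parse a channel command and fan it out to matching hosts. The command syntax and host list are made up; a real bot would run the action over SSH or a message queue.

```python
from fnmatch import fnmatch

HOSTS = ["web-01", "web-02", "db-01"]  # hypothetical inventory

def handle(line):
    """Parse e.g. '!run web-* puppet agent -t' into (targets, command)."""
    if not line.startswith("!run "):
        return None  # not a command for us
    _, pattern, command = line.split(" ", 2)
    # Glob-match the pattern against the inventory, so one IRC line
    # can target a whole group of servers at once.
    targets = [h for h in HOSTS if fnmatch(h, pattern)]
    return targets, command

targets, command = handle("!run web-* service nginx restart")
print(targets, "->", command)  # ['web-01', 'web-02'] -> service nginx restart
```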
Slide 25
Slide 25 text
CI:
Jenkins
!
Jenkins
is
an
Open
Source
Con&nuous
integra&on
server
!
Almost
every
of
our
apps
are
built
with
Jenkins
!
Applica&on
build
may
be
piloted
from
IRC
!
Linked
with
IRC
and
Git
!
We
build
cloudstack
with
jenkins
Slide 26
Slide 26 text
Management
thru
IRC
Slide 27

Visibility stack: Dashboards
- Extensive use of dashboards on TVs
- Graphs, network maps
- Monitoring alerts
- Custom dashboard application with Cloudstack metrics
  – Custom Collectd Python scripts
Slide 28
Slide 28 text
Prac&cal
use
case
!
Add
a
node
to
our
Web
portal
frontend
infrastructure
:
!
Allocate
IP/name
!
Define
machine
belonging
to
the
service
plaIorm
(by
a
fact)
!
“press
deploy”
!
Let
puppet
deploy
the
configura&on
and
applica&on
on
the
host
:
– Nginx
– Web
app
!
Let
puppet
reconfigure
load
balancing
to
add
this
node
in
the
farm
!
Watch
logs,
graph
and
traffic
to
this
new
host
in
real
&me
on
dashboards
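On the "via a fact" step: one common mechanism is a Facter external fact, a plain key=value file under /etc/facter/facts.d/ that Puppet manifests can branch on. The fact name and role value below are made up, and a temporary directory stands in for the real facts directory:

```python
import tempfile
from pathlib import Path

def write_role_fact(facts_dir, role):
    """Write a Facter external-fact file declaring this machine's role."""
    path = Path(facts_dir) / "role.txt"
    path.write_text(f"role={role}\n")
    return path

# In production this would target /etc/facter/facts.d/.
with tempfile.TemporaryDirectory() as d:
    fact_file = write_role_fact(d, "web-frontend")
    print(fact_file.read_text(), end="")  # role=web-frontend
```

Once the fact is in place, a single "press deploy" lets Puppet select the right classes (Nginx, the web app, load-balancer registration) from that one declaration.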
Slide 29
Slide 29 text
Conclusion
!
“just”
installing
Cloudstack
wasn’t
enough
for
us
!
We’ve
built
a
complete
ecosystem
around
Cloudstack
!
Massive
automa&on
is
the
key
!
Required
to
be
scalable
and
being
operated
by
small
a
team
!
Give
it
a
try
:
hsps://portal.exoscale.ch