ELK @ Leboncoin - Meetup Elastic France #14

LEBONCOIN: LESSONS LEARNED FROM DEPLOYING ELK

USAGE
HTTP logs
Search logs
Business application logs

EXAMPLE

VOLUME
About 200 GB of primary index data per day
400 GB with 1 replica
560 million documents per day
8000 documents per second
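
As a quick sanity check on these figures, the sketch below derives the average indexing rate and the average document size from the daily totals. It assumes decimal gigabytes (1 GB = 10^9 bytes), which the slide does not specify.

```ruby
# Back-of-the-envelope check of the volume figures on the slide.
# Assumption: 1 GB = 10**9 bytes (the slide does not say GB or GiB).
SECONDS_PER_DAY = 24 * 60 * 60

docs_per_day      = 560_000_000
primary_bytes_day = 200 * 10**9

avg_docs_per_sec  = docs_per_day / SECONDS_PER_DAY.to_f
avg_bytes_per_doc = primary_bytes_day / docs_per_day.to_f

puts format("average rate: %.0f docs/s", avg_docs_per_sec)
puts format("average size: %.0f bytes/doc", avg_bytes_per_doc)
```

The daily average comes out a little under 6500 documents per second, so the 8000 documents per second quoted above is probably a peak or rounded figure rather than a strict average.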

ARCHITECTURE: FIRST VERSION

CURRENT VERSION

TEST PLATFORM
5 nodes
RAM: 32 GB
CPU: dual Xeon X5450
Storage: 6 x 146 GB in RAID 0

PRODUCTION PLATFORM
6 nodes
RAM: 96 GB
CPU: dual Xeon E5-2640
Storage: 8 x 600 GB

PROBLEMS
Depending on the application, logs use different encodings:
    Received an event that has a different character encoding than you configured.
Solution: two different listening ports
    input {
      tcp { port => 5514 }
      tcp {
        port => 5515
        codec => plain { charset => 'ISO-8859-15' }
      }
    }
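
To illustrate why the charset has to be declared per port, here is a minimal plain-Ruby sketch of the underlying idea. It is not Logstash's own code, and the log line is a made-up example: the same byte is invalid when read as UTF-8 but decodes cleanly once it is declared as ISO-8859-15.

```ruby
# The same byte means different things depending on the declared charset.
# 0xA4 is the euro sign in ISO-8859-15; on its own it is not valid UTF-8.
raw = "prix: 10 \xA4".b   # raw bytes of a hypothetical log line

# Read as UTF-8 (the default assumption): the byte sequence is invalid.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
puts as_utf8.valid_encoding?      # false

# Declare the real charset, then transcode to UTF-8.
transcoded = raw.dup.force_encoding(Encoding::ISO_8859_15).encode(Encoding::UTF_8)
puts transcoded                   # prix: 10 €
puts transcoded.valid_encoding?   # true
```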

LOAD DISTRIBUTION
Only one of the two indexers receives the data
Solution: force rsyslog to open a new connection periodically
    RebindInterval 1000

MEMORY CONSUMPTION
The Elasticsearch heap blows up during certain aggregations
Solution: use "doc values"

Configuration in the mapping
    "request": {
      "type": "string",
      "doc_values": true
    }

TESTING FILTERS
By hand
    # vi test.conf
    input { stdin { } }
    output { stdout { codec => rubydebug } }
    filter {
      grok { ... }
      date { }
    }
    # bin/logstash -f test.conf
    May 18 21:21:50 hippocampe kernel: [27682.155590] smpboot: CPU 2 is now offline
    {
           "message" => "[27682.155590] smpboot: CPU 2 is now offline",
          "@version" => "1",
        "@timestamp" => "2015-05-20T20:55:41.612Z",
              "host" => "hippocampe",
         "timestamp" => "May 18 21:21:50",
         "logsource" => "hippocampe",
           "program" => "kernel"
    }

With rspec
    # encoding: utf-8
    require "logstash/devutils/rspec/spec_helper"
    require "logstash/filters/grok"

    describe "parsing syslog" do
      config <<-CONFIG
        filter {
          grok { ... }
          date { }
        }
      CONFIG

      sample "May 18 21:21:50 hippocampe kernel: [27682.155590] smpboot: CPU 2 is now offline" do
        insist { subject["logsource"] } == "hippocampe"
        insist { subject["@timestamp"] } == "2015-05-18T19:21:50.00Z"
      end
    end

    # bin/rspec test.rb
    ...
    Failures:
      1) parsing syslog "May 18 21:21:50 hippocampe kernel: [27682.155590] s..."
         Failure/Error: Unable to find matching line from backtrace
         Insist::Failure:
           Expected "2015-05-18T19:21:50.00Z", but got "2015-05-20T21:00:36.531Z"
    ...

With a generic test: a sample log line and the expected event
    May 18 21:21:50 hippocampe kernel: [27682.155590] smpboot: CPU 2 is now offline
    {
      "logsource": "hippocampe",
      "@timestamp": "2015-05-18T19:21:50.000Z",
      "@version": "1",
      "logsource": "hippocampe",
      "message": "[27682.155590] smpboot: CPU 2 is now offline\n",
      "program": "kernel",
      "timestamp": "May 18 21:21:50"
    }

    # PUPPETDIR=... bin/rspec generic.rb
    ...
    Failures:
      1) file #{file} "May 18 21:21:50 hippocampe kernel: [27682.155590] s..." when
         Failure/Error: Unable to find matching line from backtrace
         Expected equivalent JSON
         Diff:
         @@ -1,5 +1,5 @@
          {
         -  "@timestamp": "2015-05-20T21:25:05.533Z",
         +  "@timestamp": "2015-05-18T19:21:50.000Z",
            "@version": "1",
            "logsource": "hippocampe",
            "message": "[27682.155590] smpboot: CPU 2 is now offline\n",
         # ./iplog.rb:18:in `(root)'
         # /home/kermit/sys/logstash/logstash-1.5.0-rc3/lib/logstash/runner.rb
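
The helper behind the generic test is not shown on the slides. As a rough sketch of the idea only, the comparison can be done on parsed JSON, so that key order does not matter and only real differences are reported. The two events below are trimmed copies of the ones above; the variable names are illustrative.

```ruby
require "json"

# Expected event, as it would be stored next to the sample log line.
expected = JSON.parse(<<~EVENT)
  { "@timestamp": "2015-05-18T19:21:50.000Z", "@version": "1",
    "logsource": "hippocampe", "program": "kernel" }
EVENT

# Event produced when the date filter does not apply:
# @timestamp then holds the processing time instead of the log time.
actual = JSON.parse(<<~EVENT)
  { "program": "kernel", "logsource": "hippocampe",
    "@version": "1", "@timestamp": "2015-05-20T21:25:05.533Z" }
EVENT

# Hash comparison ignores key order, so only real differences show up.
differing = expected.keys.select { |key| expected[key] != actual[key] }
puts differing.inspect   # ["@timestamp"]
```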

WHO IS HARVESTING THE PHONE NUMBERS?
Terms aggregation
    {
      "query": { ... },
      "aggs": {
        "bot": {
          "terms": {
            "field": "remote",
            "size": 10
          }
        }
      }
    }

Significant terms
    {
      "query": { ... },
      "aggs": {
        "bot": {
          "significant_terms": {
            "field": "remote",
            "size": 10
          }
        }
      }
    }

    ...
    {
      "key": A,
      "doc_count": 378,
      "score": 0.12378025109264604,
      "bg_count": 476
    },
    {
      "key": B,
      "doc_count": 478,
      "score": 0.0922683992739354,
      "bg_count": 1019
    },
    ...
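
One way to read these buckets: A ranks above B even though B matches more documents, because a larger share of A's total traffic falls inside the query. The sketch below computes that share from the two buckets. It is only an intuition aid, not the scoring formula Elasticsearch applies, which also depends on the sizes of the query result set and of the whole index (not shown on the slide).

```ruby
# Buckets from the significant_terms response above
# (the keys A and B are anonymized on the slide).
buckets = [
  { key: "A", doc_count: 378, bg_count: 476 },
  { key: "B", doc_count: 478, bg_count: 1019 }
]

buckets.each do |bucket|
  # Share of this client's documents in the index that match the query.
  share = 100.0 * bucket[:doc_count] / bucket[:bg_count]
  puts format("%s: %.1f%% of its documents match the query", bucket[:key], share)
end
```

Roughly 79% of A's documents match the query, against about 47% for B, which is consistent with A receiving the higher score.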

QUESTIONS?


Clément Demonchy
