Slide 1

Slide 1 text

DATABASE 
 INDEXES #SperasoftTalks

Slide 2

Slide 2 text

Who We Are? We are a team of professionals specializing in game development, art production, online engineering and creation of amazing products. Our technology competencies include solid experience and background in delivering scalable platforms and online solutions. That serve millions of players all over the world and run beyond amazing games. Read more: http://www.sperasoft.com

Slide 3

Slide 3 text

•  RDBMS  store  data  only  in  Trees •  Index  is  a  tree  in  terms  of  data  structure •  a  Table  is  an  Index •  a  Clustered  Index  is  a  Table  itself •  a  Non-­‐clustered  Index  is  a  copy  of  data •  all  Non-­‐clustered  Indexes  refer  to  Clustered  one •  all  keys  in  Tree  Nodes  are  always  unique The Simple Truth

Slide 4

Slide 4 text

•  Oracle  Database •  SQL  Server •  IBM  DB2 •  MySQL •  PostgreSQL •  Sybase •  Informix What’s Common Between

Slide 5

Slide 5 text

RDBMS  is  a  type  of  Database  Management  System   that  stores  data  in  the  form  of  related  tables RDBMS  is  a  Database  Management  System  that  is   based  on  the  relaJonal  model  introduced  by  E.F.   Codd Data  is  stored  in  tables  and  the  relaJonships  among   the  data  are  also  stored  in  tables Relational Database Management Systems

Slide 6

Slide 6 text

•  Born  on  the  Isle  of  Portland  in  England   in  1923 •  Died  in  Florida  US  in  2003,  aged  79 •  MathemaJc •  Worked  for  IBM Edgar Frank “Ted” Codd

Slide 7

Slide 7 text

•  Introduced  “A  RelaJonal  Model  of  Data  for  Large   Shared  Data  Banks”  and  Alpha  database  language •  IBM  started  implemenJng  the  RelaJonal  model  and   introduced  another  language  named  SEQUEL Edgar Frank “Ted” Codd

Slide 8

Slide 8 text

•  Larry  Ellison  came  up  in  Jme  with  his   implementaJon  of  RelaJonal  model   and  the  language  –  Oracle  Database  and   SQL •  ANSI  started  making  SQL  standard Birth of Oracle

Slide 9

Slide 9 text

•  It’s  all  about  Table  RelaJons Relation Model Briefly

Slide 10

Slide 10 text

•  Database  contains  tables  (two  dimensional   arrays) •  Tables  have  relaJonships  enforced  by  Foreign   Key  constraints  (1-­‐to-­‐Many  relaJonship) •  NormalizaJon  of  tables  is  a  key  concept •  That’s  why  RDBMS  are  called  RelaJonal Relation Model

Slide 11

Slide 11 text

What’s Database Physically

Slide 12

Slide 12 text

•  Files  are  flat  in  nature FILE READING CURSOR 0 OFFSET All Tables Are Stored in a File

Slide 13

Slide 13 text

•  What’s  the  value  behind  relaJons? •  What  is  a  database  table? •  What  is  a  table  index? •  RelaJons  vs  How  data  is  stored What’s Actually Matter

Slide 14

Slide 14 text

Id User  Name Country City Age 1 Michael USA Boston 30 2 Jane USA Boston 24 3 Scoe USA NYC 18 4 Bob UK London 41 5 Prescoe UK London 35 •  Such  array  seems  to  be  a  table •  How  to  find  Users  from  Boston  faster? ArrayList  users  =  new  ArrayList(); How to Handle Millions of Users

Slide 15

Slide 15 text

Boston 1,  Michael,  USA,  Boston,  30 2,  Jane,  USA,  Boston,  24 NYC 3,  Scoe,  USA,  NYC,  18   London 4,  Bob,  UK,  London,  41 5,  Prescoe,  UK,  London,  35   Index is a Tree

Slide 16

Slide 16 text

ID  =  2 2,  Jane,  USA,  Boston,  24 ID  =  1 1,  Michael,  USA,  Boston,  30 ID  =  3 3,  Scoe,  USA,  NYC,  18   Can replace an initial array with Index

Slide 17

Slide 17 text

•  Key  values  in  a  Key  node  should  be  unique •  Otherwise  Trees  do  not  work What’s important to note

Slide 18

Slide 18 text

•  Indexes  are  Trees  in  terms  of  data  structure •  Trees  are  suitable  to  store  any  array  of  data  to   make  search  faster Returning to our sheep

Slide 19

Slide 19 text

•  All  RDBMS  store  data  as  Balanced  Trees •  The  concrete  implementaJon  of  B-­‐Tree  could   differ  from  vendor  to  vendor •  It  means  the  only  way  to  store  data  is  Tree •  No  excepJons  here  -­‐  table  is  a  tree,  index  is  a   tree Balanced Trees

Slide 20

Slide 20 text

What’s a Clustered Index

Slide 21

Slide 21 text

•  The  next  record  in  Clustered  Index  is  always   stored  aoer  the  previous  one RECORD  1 RECORD  2 1    |  Michael  |  USA  |  Boston  |  30 2    |  Jane  |  USA  |  Boston  |  24 The clustered index storage

Slide 22

Slide 22 text

Have  a  quesKon?   Like  this  deck?   Tweet  us  @SperasoR Like  deck  on  SlideShare.com/sperasoR

Slide 23

Slide 23 text

•  Clustered  Indexes •  Non-­‐clustered  indexes •  Both  could  be  unique  and  non-­‐unique •  Table  can  be  without  any  indexes •  How  is  that  comply  with  how  data  is  actually   stored? What SQL allows us to do

Slide 24

Slide 24 text

•  Unique  and  non-­‐unique •  CREATE  CLUSTERED  INDEX  [name]  ON   [table_name]  ([column1],  [column2]) •  CREATE  UNIQUE  CLUSTERED  INDEX  [name]   ON  [table_name]  ([column1],  [column2]) Clustered Indexes

Slide 25

Slide 25 text

•  Unique  and  non-­‐unique •  CREATE  NONCLUSTERED  INDEX  [name]  ON   [table_name]  ([column1],  [column2]) •  CREATE  UNIQUE  NONCLUSTERED  INDEX   [name]  ON  [table_name]  ([column1],   [column2]) None Clustered Indexes

Slide 26

Slide 26 text

ID  =  2 Jane,  USA,  Boston,  24 ID  =  1 Michael,  USA,  Boston,  30 ID  =  3 Scoe,  USA,  NYC,  18   Unique Clustered Index

Slide 27

Slide 27 text

•  We  know  Key  values  should  be  unique •  How  RDBMS  resolves  this  problem? Non-unique Clustered Index

Slide 28

Slide 28 text

•  SQL  Server  adds  4-­‐byte  uniquifier  to  each   duplicated  key  value •  Algorithms  could  differ  from  vendor  to  vendor •  But  the  principle  is  the  same  –  add  something   to  make  them  unique Non-unique Clustered Index

Slide 29

Slide 29 text

•  Just  omitng  Unique  keyword  makes  Key  values   bigger  (why  it’s  bad  realize  later) •  The  simple  truth  is  that  Each  table  should  have   Clustered  Index •  The  Clustered  Index  should  be  always  Unique •  The  situaJons  when  its  not  so  should  be  excepJonal Clustered Indexes

Slide 30

Slide 30 text

•  Such  tables  are  called  Heap  Tables •  How  are  they  stored  in  database  if  they  do  not   have  a  Key  value  specified? Tables without Clustered Index

Slide 31

Slide 31 text

•  Heap  Tables  are  also  stored  in  Trees •  What’s  in  a  Key  value  for  Tables  without   Clustered  Index? •  The  value  called  RID •  the  unique  idenJfier  which  refers  to  the   physical  locaJon  of  the  record  in  a  file No magic over here

Slide 32

Slide 32 text

•  There  is  no  meaningful  data  in  Keys •  Table  records  are  not  stored  physically  in   Keys’  order Why Heap Tables are so bad

Slide 33

Slide 33 text

•  Clustered  Index  has  the  actual  data  columns  in  Leaf-­‐ nodes •  What’s  in  Leaf-­‐node  of  Non-­‐clustered  index? •  Remember  that  Non-­‐clustered  Indexes  are   duplicated  data Non-clustered Indexes

Slide 34

Slide 34 text

Jane Lookup  value:  ID=2 Michael Lookup  value:  ID=1 Scoe Lookup  value:  ID=3 •  Leaf-­‐nodes  contain  the  lookup  values •  Lookup  value  is  Clustered  Index’s  Key Non-clustered Index

Slide 35

Slide 35 text

•  We  know  Key  values  should  be  unique •  How  non-­‐clustered  index’s  key  becomes   unique? Non-unique Non-clustered Index

Slide 36

Slide 36 text

•  SQL  Server  adds  Clustered  Index  Key  value  to   Non-­‐clustered  Index  Key  value  to  make  it   unique Jane,  2 Lookup  value:  ID=2 Michael,  1 Lookup  value:  ID=1 Scoe,  3 Lookup  value:  ID=3 Non-unique Non-clustered Index

Slide 37

Slide 37 text

•  from  SELECT  statement  the  WHERE  condiJon   is  taken •  based  on  the  Columns  in  WHERE  we  know   what  columns  we  search  by •  look  through  available  indexes  trying  to  find   the  appropriate  one,  starJng  from  Clustered •  found  out  non-­‐clustered  index  which  fits  best How indexes are used (1)

Slide 38

Slide 38 text

•  get  the  needed  Node  in  Non-­‐clustered  index •  get  the  Lookup  value  from  that  Node •  use  that  lookup  value  to  find  a  record  in   Clustered  index •  get  selected  columns  from  Clustered  index   (table  itself) How indexes are used (2)

Slide 39

Slide 39 text

•  Unique  Clustered  Index  on  Id  column •  Non-­‐unique  Non-­‐clustered  Index  on  City  column •  Select  UserName  from  tbl  where  City  =  ‘Boston’ Sample 1

Slide 40

Slide 40 text

•  Unique  Clustered  Index  on  Id  column •  Non-­‐unique  Non-­‐clustered  Index  on  City  column •  Select  Id  from  tbl  where  City  =  ‘Boston’ Sample 2

Slide 41

Slide 41 text

•  Unique  Clustered  Index  on  Id  column •  Non-­‐unique  Non-­‐clustered  Index  on  City  column •  Select  UserName  from  tbl  where  City  =  ‘Boston’   select  should  not  go  to  Clustered  Index Sample 3

Slide 42

Slide 42 text

•  Unique  Clustered  Index  on  Id,  UserName  column •  Select  Id  from  tbl  where  City  =  ‘Boston’  and   UserName  =  ‘Michael’ •  What  columns  Non-­‐unique  Non-­‐clustered  Index   would  include? Sample 1

Slide 43

Slide 43 text

WE  ARE  SPERASOFT   DELIVERING  AMAZING   PRODUCTS Follow  us:     @SperasoA     hCp://www.sperasoA.com