Who We Are?
We are a team of professionals specializing in game
development, art production, online engineering and
creation of amazing products.
Our technology competencies include solid experience
and background in delivering scalable platforms and
online solutions. That serve millions of players all over
the world and run beyond amazing games.
Read more: http://www.sperasoft.com
Slide 3
Slide 3 text
• RDBMS store data only in Trees
• Index is a tree in terms of data structure
• a Table is an Index
• a Clustered Index is a Table itself
• a Non-‐clustered Index is a copy of data
• all Non-‐clustered Indexes refer to Clustered one
• all keys in Tree Nodes are always unique
The Simple Truth
Slide 4
Slide 4 text
• Oracle Database
• SQL Server
• IBM DB2
• MySQL
• PostgreSQL
• Sybase
• Informix
What’s Common Between
Slide 5
Slide 5 text
RDBMS is a type of Database Management System
that stores data in the form of related tables
RDBMS is a Database Management System that is
based on the relaJonal model introduced by E.F.
Codd
Data is stored in tables and the relaJonships among
the data are also stored in tables
Relational Database Management Systems
Slide 6
Slide 6 text
• Born on the Isle of Portland in England
in 1923
• Died in Florida US in 2003, aged 79
• MathemaJc
• Worked for IBM
Edgar Frank “Ted” Codd
Slide 7
Slide 7 text
• Introduced “A RelaJonal Model of Data for Large
Shared Data Banks” and Alpha database language
• IBM started implemenJng the RelaJonal model and
introduced another language named SEQUEL
Edgar Frank “Ted” Codd
Slide 8
Slide 8 text
• Larry Ellison came up in Jme with his
implementaJon of RelaJonal model
and the language – Oracle Database and
SQL
• ANSI started making SQL standard
Birth of Oracle
Slide 9
Slide 9 text
• It’s all about Table RelaJons
Relation Model Briefly
Slide 10
Slide 10 text
• Database contains tables (two dimensional
arrays)
• Tables have relaJonships enforced by Foreign
Key constraints (1-‐to-‐Many relaJonship)
• NormalizaJon of tables is a key concept
• That’s why RDBMS are called RelaJonal
Relation Model
Slide 11
Slide 11 text
What’s Database Physically
Slide 12
Slide 12 text
• Files are flat in nature
FILE
READING
CURSOR
0
OFFSET
All Tables Are Stored in a File
Slide 13
Slide 13 text
• What’s the value behind relaJons?
• What is a database table?
• What is a table index?
• RelaJons vs How data is stored
What’s Actually Matter
Slide 14
Slide 14 text
Id
User Name
Country
City
Age
1
Michael
USA
Boston
30
2
Jane
USA
Boston
24
3
Scoe
USA
NYC
18
4
Bob
UK
London
41
5
Prescoe
UK
London
35
• Such array seems to be a table
• How to find Users from Boston faster?
ArrayList users = new ArrayList();
How to Handle Millions of Users
Slide 15
Slide 15 text
Boston
1, Michael, USA, Boston, 30
2, Jane, USA, Boston, 24
NYC
3, Scoe, USA, NYC, 18
London
4, Bob, UK, London, 41
5, Prescoe, UK, London, 35
Index is a Tree
Slide 16
Slide 16 text
ID = 2
2, Jane, USA, Boston, 24
ID = 1
1, Michael, USA, Boston, 30
ID = 3
3, Scoe, USA, NYC, 18
Can replace an initial array with Index
Slide 17
Slide 17 text
• Key values in a Key node should be unique
• Otherwise Trees do not work
What’s important to note
Slide 18
Slide 18 text
• Indexes are Trees in terms of data structure
• Trees are suitable to store any array of data to
make search faster
Returning to our sheep
Slide 19
Slide 19 text
• All RDBMS store data as Balanced Trees
• The concrete implementaJon of B-‐Tree could
differ from vendor to vendor
• It means the only way to store data is Tree
• No excepJons here -‐ table is a tree, index is a
tree
Balanced Trees
Slide 20
Slide 20 text
What’s a Clustered Index
Slide 21
Slide 21 text
• The next record in Clustered Index is always
stored aoer the previous one
RECORD 1
RECORD 2
1 | Michael | USA | Boston | 30
2 | Jane | USA | Boston | 24
The clustered index storage
Slide 22
Slide 22 text
Have a quesKon?
Like this deck?
Tweet us @SperasoR
Like deck on SlideShare.com/sperasoR
Slide 23
Slide 23 text
• Clustered Indexes
• Non-‐clustered indexes
• Both could be unique and non-‐unique
• Table can be without any indexes
• How is that comply with how data is actually
stored?
What SQL allows us to do
Slide 24
Slide 24 text
• Unique and non-‐unique
• CREATE CLUSTERED INDEX [name] ON
[table_name] ([column1], [column2])
• CREATE UNIQUE CLUSTERED INDEX [name]
ON [table_name] ([column1], [column2])
Clustered Indexes
Slide 25
Slide 25 text
• Unique and non-‐unique
• CREATE NONCLUSTERED INDEX [name] ON
[table_name] ([column1], [column2])
• CREATE UNIQUE NONCLUSTERED INDEX
[name] ON [table_name] ([column1],
[column2])
None Clustered Indexes
Slide 26
Slide 26 text
ID = 2
Jane, USA, Boston, 24
ID = 1
Michael, USA, Boston, 30
ID = 3
Scoe, USA, NYC, 18
Unique Clustered Index
Slide 27
Slide 27 text
• We know Key values should be unique
• How RDBMS resolves this problem?
Non-unique Clustered Index
Slide 28
Slide 28 text
• SQL Server adds 4-‐byte uniquifier to each
duplicated key value
• Algorithms could differ from vendor to vendor
• But the principle is the same – add something
to make them unique
Non-unique Clustered Index
Slide 29
Slide 29 text
• Just omitng Unique keyword makes Key values
bigger (why it’s bad realize later)
• The simple truth is that Each table should have
Clustered Index
• The Clustered Index should be always Unique
• The situaJons when its not so should be excepJonal
Clustered Indexes
Slide 30
Slide 30 text
• Such tables are called Heap Tables
• How are they stored in database if they do not
have a Key value specified?
Tables without Clustered Index
Slide 31
Slide 31 text
• Heap Tables are also stored in Trees
• What’s in a Key value for Tables without
Clustered Index?
• The value called RID
• the unique idenJfier which refers to the
physical locaJon of the record in a file
No magic over here
Slide 32
Slide 32 text
• There is no meaningful data in Keys
• Table records are not stored physically in
Keys’ order
Why Heap Tables are so bad
Slide 33
Slide 33 text
• Clustered Index has the actual data columns in Leaf-‐
nodes
• What’s in Leaf-‐node of Non-‐clustered index?
• Remember that Non-‐clustered Indexes are
duplicated data
Non-clustered Indexes
Slide 34
Slide 34 text
Jane
Lookup value: ID=2
Michael
Lookup value: ID=1
Scoe
Lookup value: ID=3
• Leaf-‐nodes contain the lookup values
• Lookup value is Clustered Index’s Key
Non-clustered Index
Slide 35
Slide 35 text
• We know Key values should be unique
• How non-‐clustered index’s key becomes
unique?
Non-unique Non-clustered Index
Slide 36
Slide 36 text
• SQL Server adds Clustered Index Key value to
Non-‐clustered Index Key value to make it
unique
Jane, 2
Lookup value: ID=2
Michael, 1
Lookup value: ID=1
Scoe, 3
Lookup value: ID=3
Non-unique Non-clustered Index
Slide 37
Slide 37 text
• from SELECT statement the WHERE condiJon
is taken
• based on the Columns in WHERE we know
what columns we search by
• look through available indexes trying to find
the appropriate one, starJng from Clustered
• found out non-‐clustered index which fits best
How indexes are used (1)
Slide 38
Slide 38 text
• get the needed Node in Non-‐clustered index
• get the Lookup value from that Node
• use that lookup value to find a record in
Clustered index
• get selected columns from Clustered index
(table itself)
How indexes are used (2)
Slide 39
Slide 39 text
• Unique Clustered Index on Id column
• Non-‐unique Non-‐clustered Index on City column
• Select UserName from tbl where City = ‘Boston’
Sample 1
Slide 40
Slide 40 text
• Unique Clustered Index on Id column
• Non-‐unique Non-‐clustered Index on City column
• Select Id from tbl where City = ‘Boston’
Sample 2
Slide 41
Slide 41 text
• Unique Clustered Index on Id column
• Non-‐unique Non-‐clustered Index on City column
• Select UserName from tbl where City = ‘Boston’
select should not go to Clustered Index
Sample 3
Slide 42
Slide 42 text
• Unique Clustered Index on Id, UserName column
• Select Id from tbl where City = ‘Boston’ and
UserName = ‘Michael’
• What columns Non-‐unique Non-‐clustered Index
would include?
Sample 1
Slide 43
Slide 43 text
WE ARE SPERASOFT
DELIVERING AMAZING
PRODUCTS
Follow us:
@SperasoA
hCp://www.sperasoA.com