Upgrade to Pro — share decks privately, control downloads, hide ads and more …

DBMS Architectures and Features - Lecture 7 - I...

DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007156ANR)

This lecture forms part of the course Introduction to Databases given at the Vrije Universiteit Brussel.

Beat Signer

March 29, 2019
Tweet

More Decks by Beat Signer

Other Decks in Education

Transcript

  1. 2 December 2005 Introduction to Databases DBMS Architectures and Features

    Prof. Beat Signer Department of Computer Science Vrije Universiteit Brussel beatsigner.com
  2. Beat Signer - Department of Computer Science - [email protected] 2

    March 27, 2019 DBMS Components Access Methods System Buffers Authorisation Control Integrity Checker Command Processor Program Object Code DDL Compiler File Manager Buffer Manager Recovery Manager Scheduler Query Optimiser Transaction Manager Query Compiler Queries Catalogue Manager DML Preprocessor Database Schema Application Programs Database Manager Data Manager DBMS Programmers Users DB Admins Based on 'Components of a DBMS', Database Systems, T. Connolly and C. Begg, Addison-Wesley 2010 Data, Indices and System Catalogue
  3. Beat Signer - Department of Computer Science - [email protected] 3

    March 27, 2019 DBMS Components ... ▪ DML preprocessor ▪ transforms embedded SQL statements into statements of the host language ▪ interacts with the query compiler to generate the appropriate host language code ▪ Query compiler ▪ transforms queries into a set of low-level instructions (query plan) which are forwarded to the database manager component ▪ DDL compiler ▪ converts a set of DDL statements into a set of tables ▪ tables and metadata are stored in the system catalogue (catalogue manager)
  4. Beat Signer - Department of Computer Science - [email protected] 4

    March 27, 2019 DBMS Components ... ▪ Catalogue manager ▪ provides access and manages the system catalogue ▪ used by most DBMS components ▪ Database manager ▪ processes user-submitted queries ▪ interfaces with application programs ▪ contains a set of components - query optimiser - transaction manager - ...
  5. Beat Signer - Department of Computer Science - [email protected] 5

    March 27, 2019 Database Manager Components ▪ Authorisation control ▪ checks whether as user has the necessary rights to execute a specific operation ▪ Command processor ▪ executes the steps of a given query plan handed over by the authorisation control component ▪ Integrity checker ▪ ensures that the operation is not going to violate any integrity constraints (e.g. key constraints) ▪ Query optimiser ▪ computes an optimal query execution strategy ▪ transforms the initial query plan into the best available sequence of operations on the given data
  6. Beat Signer - Department of Computer Science - [email protected] 6

    March 27, 2019 Database Manager Components ... ▪ Transaction manager ▪ processes any transaction-specific operations ▪ Scheduler ▪ manages the relative order in which transaction operations are executed ▪ Recovery manager ▪ deals with commits and aborts of transactions ▪ ensures that the database remains in a consistent state in case of failures ▪ Buffer manager ▪ transfers data between main memory and secondary storage
  7. Beat Signer - Department of Computer Science - [email protected] 7

    March 27, 2019 DBMS Architectures ▪ There is a wide variety of different DBMS architectures ▪ Teleprocessing ▪ File-Server Architecture ▪ Two-Tier Client-Server Architecture ▪ Three-Tier Client Server Architecture ▪ N-Tier Architecture ▪ Peer-to-Peer Architecture ▪ Distributed DBMS ▪ Service-Oriented Architecture ▪ Cloud Architecture ▪ ...
  8. Beat Signer - Department of Computer Science - [email protected] 8

    March 27, 2019 Teleprocessing ▪ Traditional multi-user system architecture ▪ single mainframe and multiple (dumb) terminals ▪ Heavy load on the central mainframe ▪ runs application programs and DBMS ▪ formats data for presentation on terminals Mainframe Terminal 1 Terminal 2 Terminal 3 Terminal n
  9. Beat Signer - Department of Computer Science - [email protected] 9

    March 27, 2019 Teleprocessing ... ▪ Tendency to replace expensive mainframes with network of personal computers (downsizing) ▪ file-server architectures ▪ client-server architectures ▪ ...
  10. Beat Signer - Department of Computer Science - [email protected] 10

    March 27, 2019 File-Server Architecture ▪ A file-server is a computer that is connected to a network and mainly serves as a shared storage ▪ e.g. for realising shared access to a database ▪ In a file-server architecture the processing is distributed over the network ▪ workstations (application and DBMS) request data (files) Workstation 2 (DBMS) Workstation n (DBMS) Workstation 1 (DBMS) File-Server Database data request file LAN or WAN
  11. Beat Signer - Department of Computer Science - [email protected] 11

    March 27, 2019 File-Server Architecture ... ▪ SQL Request Example ▪ Since the file-server is not SQL-aware, the Customer and Order relations (files) have to be transferred to the client ▪ Disadvantages ▪ heavy network traffic ▪ high total costs of ownership (TCO) - maintain a full instance of the DBMS on each client (workstation) ▪ complex integrity, concurrency and recovery control - multiple DBMSs may concurrently access the same shared file SELECT name, street FROM Customer, Order WHERE Order.customerID = Customer.customerID AND name = 'Max Frisch';
  12. Beat Signer - Department of Computer Science - [email protected] 12

    March 27, 2019 Two-Tier Client-Server Architecture ▪ Application consists of a client (first tier) and a server (second tier) that might run on different machines ▪ clear separation of concerns between client and server ▪ thin client vs. thick client - less or more application logic on the client side ▪ Supports decentralised business environments LAN or WAN Client 2 Client n Client 1 Server (DBMS) Database data request selected data
  13. Beat Signer - Department of Computer Science - [email protected] 13

    March 27, 2019 Two-Tier Client-Server Architecture ... ▪ Client (first tier) tasks ▪ presentation of data (user interface) ▪ business and data processing logic ▪ send database requests to the server and process the results ▪ Server (second tier) tasks ▪ manage (concurrent) database access (data services) - authorisation, integrity checks, query/update processing, recovery control, ... ▪ business logic (e.g. validation of data) ▪ Different possible client-server topologies ▪ single client and single server ▪ multiple clients and single server ▪ multiple clients and multiple servers
  14. Beat Signer - Department of Computer Science - [email protected] 14

    March 27, 2019 Two-Tier Client-Server Architecture ... ▪ Communication between client and server via inter- process communication (IPC) or over network ▪ special protocols such as ODBC (JDBC) introduced earlier in the course when discussing the Call Level Interface (CLI) ▪ Advantages ▪ increased performance - certain tasks performed in parallel and server can be tuned for DB processing ▪ reduced communication costs - only selected data is transferred ▪ reduced hardware costs - only server has to run a DBMS ▪ increased consistency/security through separation of concerns - constraint checking in a single place (server)
  15. Beat Signer - Department of Computer Science - [email protected] 15

    March 27, 2019 Two-Tier Client-Server Architecture ... ▪ Disadvantages ▪ limitations in terms of enterprise scalability with thousands of potential clients - significant client-side administration overhead • e.g. expensive deployment of new business and data application logic - thick client requires a considerable amount of resources (CPU, RAM, ...) to run applications efficiently
  16. Beat Signer - Department of Computer Science - [email protected] 16

    March 27, 2019 Three-Tier Client-Server Architecture ▪ In the 1990s, the three-tier client-server architecture was introduced to address the enterprise scalability problem ▪ e.g. driven by emerging web applications ▪ Application consists of a presentation tier (client), a logic tier (application server) and a data tier (database server) that might run on different platforms LAN or WAN Client 1 Database Server Database data request selected data Client 2 Client n Application Server
  17. Beat Signer - Department of Computer Science - [email protected] 17

    March 27, 2019 Three-Tier Client-Server Architecture ... ▪ Presentation tier (first tier) tasks ▪ presentation of data (user interface) ▪ basic input validation (thin client) ▪ send requests to the server and visualise results ▪ Logic tier (second tier / middle tier) tasks ▪ business logic ▪ data processing logic ▪ Data tier (third tier) tasks ▪ basic data validation ▪ manage (concurrent) database access (data services) - authorisation, integrity checks, query/update processing, recovery control, ...
  18. Beat Signer - Department of Computer Science - [email protected] 18

    March 27, 2019 Three-Tier Client-Server Architecture ... ▪ Advantages ▪ reduced costs for thin clients due to lower resource requirements (CPU, RAM, ...) - e.g. applications running in a web browser ▪ application logic is centralised in a single application server - reduces the software distribution problem (updates) that is present in two-tier client-server architectures ▪ increased modularity - easier to replace one tier without affecting the other tiers ▪ load balancing becomes easier with a clear separation between the core business logic and the database functionality
  19. Beat Signer - Department of Computer Science - [email protected] 19

    March 27, 2019 Web Information Systems (WIS) ▪ The three-tier architecture maps very naturally to web environments ▪ browser as a thin client, application server and database server ▪ The move towards thin browser clients has dramatically reduced the costs for software deployment Internet HTTP Request HTTP Response Client Application Server DB Server Database
  20. Beat Signer - Department of Computer Science - [email protected] 20

    March 27, 2019 N-Tier Architecture ▪ The three-tier architecture can be extended with additional intermediary tiers for increased flexibility ▪ e.g. separation between web server and application server in the previous web information system example - increases the flexibility for load balancing by introducing multiple web servers - only dynamic content delivered by the application server whereas static content is directly managed by the web server Internet HTTP Request HTTP Response Client Application Server HTML Pages Web Server DB Server Database
  21. Beat Signer - Department of Computer Science - [email protected] 21

    March 27, 2019 Peer-to-Peer Architecture ▪ Systems exchanging information and services in a peer- to-peer-like manner without a central authority ▪ no global schema → need for schema integration (matching) ▪ Data and service sharing ▪ no dedicated clients and servers ▪ sites may dynamically form new client/server relationships LAN or WAN Site 3 Site n Database 3 Site 2 Database 2 Site 1 Database 1 Database n
  22. Beat Signer - Department of Computer Science - [email protected] 22

    March 27, 2019 Middleware ▪ Software that connects (mediates) between software components or applications ▪ hide complexity of heterogenous and distributed components (e.g. servers) and provide a uniform interface ▪ There exist different types of middleware ▪ remote procedure call (RPC) - Java RMI - CORBA - XML RPC ▪ asynchronous publish/subscribe - subscribe for different types of messages ▪ SQL-oriented data access - open database connectivity (ODBC), JDBC, ... ▪ ...
  23. Beat Signer - Department of Computer Science - [email protected] 23

    March 27, 2019 Transaction Processing Monitor ▪ Complex applications can be built on top of several resource managers ▪ e.g. multiple DBMSs, operating systems, ... ▪ A Transaction Processing Monitor (TP Monitor) is a middleware component that provides uniform access to the services of a number of resource managers DB Server 1 Database 1 DB Server n Database n TP Monitor Service 1 Service r . . . Client 1 Client m Application Server
  24. Beat Signer - Department of Computer Science - [email protected] 24

    March 27, 2019 Transaction Processing Monitor ... ▪ A TP Monitor offers a number of advantages ▪ transaction routing - increase scalability by directing transactions to specific DBMSs ▪ distributed transaction management - manage transactions that require access to multiple heterogeneous DBMSs - e.g. based on XA standard for distributed transaction processing ▪ load balancing - balance requests across multiple DBMSs by directing to least loaded server ▪ funnelling - establish connections with DBMSs and funnel user requests through these connections thereby reducing the number of required connections ▪ increased reliability - if a DBMS fails, the TP monitor can resubmit the request to another DBMS or hold the transaction request until the DBMS becomes available and resubmit
  25. Beat Signer - Department of Computer Science - [email protected] 25

    March 27, 2019 Parallel Database Architectures ▪ Parallel machines (multiple processors) can be used to speed up the transaction processing ▪ Different models for parallel database architectures ▪ shared memory, shared disk, shared nothing and hierarchical Processor Memory Processor Processor Disk Disk shared memory Processor Processor Processor Disk Disk shared disk Memory Memory Memory Processor Disk shared nothing Memory Processor Disk Memory
  26. Beat Signer - Department of Computer Science - [email protected] 26

    March 27, 2019 Parallel Database Architectures ... ▪ Shared memory ▪ processors and disks have access to a shared memory via a bus ▪ very efficient communication between processors ▪ not scalable since memory bus becomes a bottleneck ▪ Shared disk ▪ all processors can access all disks via an interconnection network ▪ each processor has its own memory ▪ certain degree of fault tolerance if processor/memory fails ▪ also disks maybe have fault tolerance (e.g. RAID architecture) ▪ interconnection to the disk systems becomes bottleneck
  27. Beat Signer - Department of Computer Science - [email protected] 27

    March 27, 2019 Parallel Database Architectures ... ▪ Shared nothing ▪ each node consists of a processor, memory and one or more disks ▪ high-speed interconnection network between processors ▪ more scalable than shared memory or shared disk model ▪ increased communication costs for non-local disk access ▪ Hierarchical ▪ combines the different models (composition) ▪ top level is shared nothing between nodes - each node can be a shared memory or shared disk "subsystem"
  28. Beat Signer - Department of Computer Science - [email protected] 28

    March 27, 2019 Distributed DBMS (DDBMS) ▪ Distributed database ▪ logically related collection of shared data and metadata that is distributed over a network ▪ Distributed DBMS ▪ software system to manage the distributed database in a transparent way LAN or WAN Site 3 Site n Database Site 2 Database Site 1 Database Database
  29. Beat Signer - Department of Computer Science - [email protected] 29

    March 27, 2019 Distributed DBMS (DDBMS) ... ▪ Distinction between local and global transactions ▪ local transaction - accesses only data from the site from which the transaction was initiated ▪ global transaction - accesses data from several different sites ▪ Reasons for building a distributed DBMS ▪ data sharing - possibility to access data that resides at other sides ▪ autonomy - each site retains a certain degree of control over the local data ▪ availability - if one site fails the other sites may still be able to continue operating - data might be replicated at serveral sites for increased availability
  30. Beat Signer - Department of Computer Science - [email protected] 30

    March 27, 2019 Distributed DBMS (DDBMS) ... ▪ Reasons for building a distributed DBMS ... ▪ costs and scalability - use cluster of PCs instead of large mainframe systems ▪ integration of existing DBMS - coexistence of legacy systems with new applications ▪ dynamic organisational structure - mergers and acquisitions ▪ Implementation issues ▪ transactions have to be executed atomically accross different sites (two-phase commit protocol) - commit decision is left to a single coordinator ▪ distributed concurrency control - deadlock detection has to be carried out across multiple sites
  31. Beat Signer - Department of Computer Science - [email protected] 31

    March 27, 2019 Service-Oriented Architectures (SOA) ▪ Architecture that modularises functionality as interoperable services ▪ loose coupling of services ▪ service encapsulation ▪ interoperability between different operating systems and programming languages ▪ new services can be defined as a mashup of existing services ▪ Service-oriented database architecture (SODA) ▪ e.g. single SQL Server processes acting as service providers
  32. Beat Signer - Department of Computer Science - [email protected] 32

    March 27, 2019 Service-Oriented Architectures (SOA) ... ▪ Share business logic, data and processes via web service APIs ▪ Big Web Services - Universal Description, Discovery and Integration (UDDI) - Web Services Description Language (WSDL) - Simple Object Access Protocol (SOAP) ▪ RESTful Web Services ▪ Web services are based on established technologies such as the Extensible Markup Language (XML) ▪ Special service orchestration languages for the use of services ▪ e.g. Business Process Execution Language (BPEL)
  33. Beat Signer - Department of Computer Science - [email protected] 33

    March 27, 2019 Cloud Computing ▪ Internet-based computing with on-demand and pay-per- use access to shared resources, data and software ▪ Main characteristics ▪ web-based access (e.g. Web Service API or browser) ▪ pay only for the services that are actually used (pay-per-use) ▪ no initial investment (e.g. for resources) required The Cloud Google Microsoft Amazon Yahoo Client 1 Client 2 Client 3 Client n
  34. Beat Signer - Department of Computer Science - [email protected] 34

    March 27, 2019 Cloud Computing ... ▪ Cloud services ▪ infrastructure as a service (IaaS) - OS virtualisation - e.g. Amazon EC2, Rackspace, ... ▪ platform as a service (PaaS) - provide platform to run applications - e.g. Google App Engine, Windows Azure Platform, ... ▪ software as a service (SaaS) - provide software as a service over the Internet - e.g. Google Docs, ... ▪ Cloud service vendor gets some degree of control
  35. Beat Signer - Department of Computer Science - [email protected] 35

    March 27, 2019 Cloud Computing ... ▪ New challenges for database management in cloud computing ▪ cloud database server might be less reliable - might become difficult to guarantee a specific quality of service (QoS) for an application realised in the cloud ▪ backup, replication, ... ▪ Online or web-based databases ▪ store data in the cloud or on servers on the Internet
  36. Beat Signer - Department of Computer Science - [email protected] 36

    March 27, 2019 Cloud Data Service Example ▪ Amazon Simple Storage Service (Amazon S3) ▪ online storage service with unlimited storage space ▪ store objects (up to 5 TB in size) in buckets ▪ Web Service API ▪ Amazon SimpleDB ▪ distributed database written in Erlang ▪ offers a Web Service API ▪ makes use of S3 and EC2 ▪ on demand scaling ▪ non-relational data store - schemaless - hashtables with set of key value pairs - eventual consistency
  37. Beat Signer - Department of Computer Science - [email protected] 37

    March 27, 2019 Mobile DBMS ▪ Users want access to information on the move via mobile devices ▪ tourist information systems ▪ salesperson who is visiting their customers ▪ emergency services ▪ ... ▪ New requirements for mobile DBMSs ▪ small footprint databases that can run on mobile devices with limited resources - e.g. db4objects, http://www.db4o.com ▪ location-dependent queries ▪ context-aware queries
  38. Beat Signer - Department of Computer Science - [email protected] 38

    March 27, 2019 Mobile DBMS ... ▪ New requirements for mobile DBMSs ... ▪ communicate with centralised database server via wireless network or fixed Internet connections ▪ replicate data on a centralised server and on a mobile device - synchronisation challenges ▪ caching of data and transactions to cope with potential network connection failures ▪ opportunistic (peer-to-peer based) information exchange with other mobile DBMSs - e.g. dynamic P2P Bluetooth connections with other devices in range (proximity-based information exchange) ▪ security - which portion of a database can/should be replicated on a mobile device? ▪ ...
  39. Beat Signer - Department of Computer Science - [email protected] 39

    March 27, 2019 Homework ▪ Study the following chapter of the Database System Concepts book ▪ chapter 17 - sections 17.1-17.6 - Database-System Architectures
  40. Beat Signer - Department of Computer Science - [email protected] 40

    March 27, 2019 Exercise 7 ▪ Structured Query Language (SQL) ▪ JDBC
  41. Beat Signer - Department of Computer Science - [email protected] 41

    March 27, 2019 References ▪ A. Silberschatz, H. Korth and S. Sudarshan, Database System Concepts (Sixth Edition), McGraw-Hill, 2010 ▪ T. Connolly and C. Begg, Database Systems: A Practical Approach to Design, Implementation and Management (Fifth Edition), Addison-Wesley, 2010