Azure Storage - Learnings

Azure Storage underpins many services and is used extensively to provision machines and store data. Developers therefore need to understand its performance and availability challenges and make the right choices.

Govind Kanshi

November 06, 2014

Transcript

  1. Learn. Connect. Explore.

  2. Azure Storage
    Learnings from MTC engagements
    Govind Kanshi
    MTC Microsoft, India

  3. Agenda
    • Where is storage
    • Storage options
    • Factors
    • Availability/Performance/Elasticity/Cost

  4. Questions
    • Do websites use storage?
    • How much throughput can I get from disks
    • How do I upload data
    • How do I track issues

  5. Where is Azure storage
    More than 20 trillion stored objects; 2+ million requests/sec on average (Build 2014)
    • VM Store
    • Data disk
    • Logs
    • Media (photo/video/audio/docs)
    • Table store
    • SSD Disk
    CDP-B382

  6. Virtual Machine Storage Architecture
    Azure Virtual Machine
    • C:\ (OS Disk)
    • E:\, F:\, etc. (Data Disks)
    • D:\ (Temporary Disk)
    • Disk Cache
    • G:\, H:\, etc. (SMB Share)

  7. Azure Files
    Azure VM
    SMB 2.1
    Shared settings, diagnostic share
    Lift and Shift Applications
    Azure VM Azure VM

  8. Performance of storage
    • Disk
      • Linux
        • fio
      • Windows
        • SQLIO, CrystalDiskMark
    • Striped vs unstriped (same storage account)
      • RAID 1-0 – heavy data volume, double the cost
      • Windows – Storage Pool
    • Gates for performance
      • Stated throughput goal of Storage account
      • Machine Size
      • Connectivity to storage/pipe – max

  9. Scalability Targets – Partition
    Single Queue – Account Name + Queue Name
    • Up to 2,000 (1 KB) messages per second
    Single Table Partition – Account Name + Table Name + PartitionKey value
    • Up to 2,000 (1 KB) entities per second
    Single Blob – Account Name + Container Name + Blob Name
    • Up to 60 MB/sec
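
The per-partition targets above translate into simple capacity arithmetic. The following is a minimal sketch, assuming the 2014 figures from this slide; the function names and the `orders-N` queue naming scheme are hypothetical, chosen only for illustration.

```python
import math

# Per-partition targets quoted on the slide (2014 figures).
QUEUE_MSGS_PER_SEC = 2000      # single queue, 1 KB messages
TABLE_ENTITIES_PER_SEC = 2000  # single table partition, 1 KB entities
BLOB_MB_PER_SEC = 60           # single blob


def partitions_needed(target_per_sec: float, per_partition_limit: float) -> int:
    """Smallest number of partitions whose combined target covers the load."""
    if target_per_sec <= 0:
        return 0
    return math.ceil(target_per_sec / per_partition_limit)


def queue_for_message(message_id: int, queue_count: int, prefix: str = "orders") -> str:
    """Hypothetical naming scheme: spread messages evenly over N queues."""
    return f"{prefix}-{message_id % queue_count}"
```

For example, a workload of 9,000 messages per second would need `partitions_needed(9000, QUEUE_MSGS_PER_SEC)` queues, with `queue_for_message` deciding which queue each message goes to.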

  10. Think of performance
    • Performance gate(s)
      • (Storage) VM disk, attached disk
      • Per VM
    • Network
      • Gateway (200 Mbps) across VPN
      • ExpressRoute 1 Gbps – 10 Gbps
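
One way to reason about these gates is that end-to-end throughput is bounded by the narrowest one. The sketch below assumes all gates are expressed in Mbps and ignores protocol overhead; the function names and the gate labels in the usage note are illustrative, not part of any Azure API.

```python
def narrowest_gate(gates_mbps: dict[str, float]) -> tuple[str, float]:
    """Return the name and value of the smallest gate, i.e. the bottleneck."""
    name = min(gates_mbps, key=gates_mbps.get)
    return name, gates_mbps[name]


def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Ideal transfer time in hours, using decimal units (1 GB = 8,000 megabits)."""
    megabits = size_gb * 8 * 1000
    return megabits / link_mbps / 3600
```

With a single blob at 60 MB/sec (480 Mbps) behind a 200 Mbps VPN gateway, `narrowest_gate({"single blob": 480.0, "VPN gateway": 200.0})` picks the gateway, and `transfer_hours(1000, 200)` gives a rough lower bound for moving 1 TB over that link.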

  11. Blob Service - Best Practices
    • How do I upload a folder the fastest?
      • Upload multiple blobs simultaneously
    • How do I upload a blob the fastest?
      • Use parallel block upload
      • Aspera/send file over mail
    • Distribute the load across the namespace
    • Prefer a block upload size in the 1 - 4 MB range unless read requests are for small ranges
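
Parallel block upload can be sketched as: split the data into blocks, stage the blocks concurrently, then commit the ordered block list. The example below is a self-contained illustration against an in-memory stand-in, not the Azure client library; `InMemoryBlobStore`, `split_blocks` and `parallel_upload` are hypothetical names, and the two store methods only mirror the Put Block / Put Block List operations of the Blob service.

```python
import base64
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, upper end of the range the slide suggests


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield (block_id, chunk) pairs; IDs are equal-length base64 strings."""
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_id = base64.b64encode(f"{index:08d}".encode()).decode()
        yield block_id, data[offset:offset + block_size]


class InMemoryBlobStore:
    """Stand-in for a block blob: staged blocks become visible on commit."""

    def __init__(self):
        self.staged = {}
        self.committed = b""

    def put_block(self, block_id: str, chunk: bytes) -> None:
        self.staged[block_id] = chunk

    def put_block_list(self, block_ids: list) -> None:
        self.committed = b"".join(self.staged[b] for b in block_ids)


def parallel_upload(store, data: bytes, block_size: int = BLOCK_SIZE, workers: int = 8) -> None:
    blocks = list(split_blocks(data, block_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Blocks may finish in any order; list() surfaces any worker exception.
        list(pool.map(lambda b: store.put_block(*b), blocks))
    # The commit fixes the final order, independent of upload order.
    store.put_block_list([block_id for block_id, _ in blocks])
```

In a real client the two store methods would be network calls, and the worker count would be tuned against the performance gates discussed earlier.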

  12. Demo of striped/single/ssd

  13. Premium storage
    Up to 32 TB of storage per VM
    >50,000 IOPS per VM
    Less than 1ms read latency

  14. Storage Options
    Store | Cost | Latency | Size of entity | Size range | Availability | Data
    Ephemeral | Low | Low | type | 1 TB | 3 local, geo | Any
    Azure Storage (page) | Lowest | More | 512-byte pages | 1 TB | 3 local/geo | R/W data
    Azure Table | Low | More | | | |
    Azure Storage (block) | Low | More | 4 MB | 200 GB | 3 local/geo | Block data (backup delta), files
    SQL Azure | Flexible | Medium | Datatype | 1 GB-500 GB | 3 local, 1 geo, RO | Structured data
    Azure Redis | Flexible | Low | Datatype | Up to 53 GB | Master-slave | Cache
    DocumentDB | Flexible | Lowest | 256 KB | Xx Terabytes | Auto | Any (JSON)
    Azure Search | Flexible | Low | Datatype | GBs | Auto | Any
    HDInsight | Flexible | More | Custom | TBs | Auto | Any

  15. 3 Types of Durability offered for Azure Storage
    LRS
    • Stores 3 replicas of the data within a single zone (facility) in a single region
    • Provides data durability for disk, node and rack failures
    ZRS *
    • Available only for block blobs
    • Stores 3 replicas of the data across multiple zones (facilities). Designed to keep all 3 replicas across zones within a single region, but may span across two regions.
    • Provides additional durability to protect data against zone failures (e.g., fire in a facility)
    GRS
    • Stores 6 replicas of the data across two regions (3 in each region)
    • Provides additional durability to protect data against major regional disasters (e.g., tornado, hurricane, earthquake, etc.)

  16. Geo Redundant Storage (GRS)
    Data geo-replicated across regions hundreds of miles apart
    • Provide data durability in face of potential major regional disasters
    • Provided for Blob, Tables and Queues
    User chooses primary region during account creation
    • Each primary region has a predefined secondary region
    Asynchronous geo-replication
    • Off critical path of live requests
    Region pairs:
    • US West / US East
    • US North / US South
    • US Central / US East 2
    • Europe North / Europe West
    • Asia East / Asia South East
    • China North / China South
    • Japan East / Japan West
    • South Brazil / US South

  17. Read-Only Access to GRS (RA-GRS) – Scenarios
    • Read-only access to secondary data even if primary is unavailable
    • Access to an eventually consistent copy of the data in the other region
    • For these, the application semantics need to allow for eventually consistent reads
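
The read-from-secondary scenario can be sketched as a simple fallback. RA-GRS exposes the secondary under the account name with a `-secondary` suffix; everything else here (the function names, and the assumption that connection failures surface as `OSError`) is illustrative rather than how any particular client library does it.

```python
def secondary_host(account: str, service: str = "blob") -> str:
    """RA-GRS read endpoint: the account name with a '-secondary' suffix."""
    return f"{account}-secondary.{service}.core.windows.net"


def read_with_fallback(read_primary, read_secondary):
    """Try the primary; on failure serve a possibly stale copy from the secondary.

    Returns (value, source) so callers can tell when the data may be stale.
    """
    try:
        return read_primary(), "primary"
    except OSError:  # assumption: connection-level failures surface as OSError
        return read_secondary(), "secondary"
```

Returning the source alongside the value lets the application decide whether an eventually consistent answer is acceptable for that particular read.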

  18. Storage Analytics
    • Turn on storage analytics with retention on
    • Send client request id with data that needs to be tracked
    • Logs can be analyzed to retrieve information and aggregate based on it
    • Logs can be used to determine hot spots
    • Logs are not sorted in a blob and clock skew needs to be factored in
    • Look at minute & hourly metrics to understand usage and performance
    • Look for throttling errors
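
The log-analysis points above (unsorted entries, clock skew, hot spots, throttling) can be sketched on simplified records. This is not a parser for the real Storage Analytics log format: the `(request_start_time, request_status, object_key)` tuples, the function names and the 5-minute skew allowance are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical, simplified records: (request_start_time, request_status, object_key).


def sort_entries(entries):
    """Entries are not ordered within a log blob, so sort by timestamp first."""
    return sorted(entries, key=lambda e: e[0])


def hot_spots(entries, status: str = "ThrottlingError", top: int = 3):
    """Count throttled requests per object key; the largest counts are hot spots."""
    counts = Counter(key for _, s, key in entries if s == status)
    return counts.most_common(top)


def within_window(entries, start: datetime, end: datetime,
                  skew: timedelta = timedelta(minutes=5)):
    """Widen the search window on both sides to allow for clock skew."""
    return [e for e in entries if start - skew <= e[0] <= end + skew]
```

Filtering on a client request id would follow the same pattern, with one more field in each record.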

  19. Choosing the Right Authentication Method
    Symmetric Shared Key Authentication
    • Trusted service that owns the storage accounts
    Shared Access Signature (SAS)
    • 3rd party services
    • Mobile device applications
    • Restricted access for services
    • Allow client applications to directly communicate with Storage rather than scaling a proxy web service
    • Proxy used for authentication and providing SAS tokens
    Public (Blob service only)
    • CDN access
    • Content accessed via browsers

  20. Designing Your Service For Security (1 of 2)
    How to store Secret keys/Shared Access Signature (SAS) tokens?
    • Persist only encrypted key/token
    • Use cert to decrypt the encrypted key in the application
    • Certs available only on required nodes
    How to transfer SAS tokens?
    • Use HTTPS to transfer SAS tokens
    How often should I change my Secret keys/SAS tokens?
    • Automate the process to enable changing it frequently
    • Always be ready to revoke SAS tokens or change Secret keys/SAS tokens

  21. Designing Your Service For Security (2 of 2)
    How do I rotate Secret Keys/SAS tokens?
    • Two 512-bit keys provided.
    • Push configuration change to all services to use one of them
    • Other key can be changed using service management API
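
The two-key rotation described above can be sketched as a small simulation. `StorageAccountKeys` is an in-memory stand-in, not the service management API, and the three-step order in `rotate` is one reasonable reading of the slide rather than a documented procedure.

```python
import base64
import secrets


class StorageAccountKeys:
    """In-memory stand-in for an account with two regenerable 512-bit keys."""

    def __init__(self):
        self.keys = {"key1": self._new_key(), "key2": self._new_key()}

    @staticmethod
    def _new_key() -> str:
        # 64 random bytes = 512 bits, base64-encoded for transport.
        return base64.b64encode(secrets.token_bytes(64)).decode()

    def regenerate(self, name: str) -> str:
        self.keys[name] = self._new_key()
        return self.keys[name]


def rotate(account: StorageAccountKeys, services: dict, active: str) -> str:
    """Rotate without downtime; returns the key name services end up using."""
    standby = "key2" if active == "key1" else "key1"
    account.regenerate(standby)   # 1. refresh the key nobody is using
    for service in services:      # 2. push config: every service moves to it
        services[service] = standby
    account.regenerate(active)    # 3. the old key is now safe to invalidate
    return standby
```

Because only one key changes at a time, services always hold a key that is still valid while the configuration push is in flight.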

  22. Shared Access Signatures – Best Practices (1 of 2)
    Authenticate the service requesting SAS token
    • Use HTTPS
    Token provider and consumer need to agree on storage REST version
    • Semantics for SAS Token can change from version to version
    • Token generating service should be capable of generating multiple versions of tokens and consumer can select the version it can understand
    Clock Skew
    • Sufficient buffer for start time and end time because of clock skew
    • Avoid setting start time if access should start right away
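
The clock-skew advice can be sketched as a helper that computes the validity window for a token. It only produces the start and expiry timestamps and does not build or sign a SAS; the function name and the 15-minute buffer are assumptions, and all times are taken to be UTC.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

SKEW_BUFFER = timedelta(minutes=15)  # assumed buffer; tune to your environment
ISO_UTC = "%Y-%m-%dT%H:%M:%SZ"


def sas_window(issued_at: datetime, lifetime: timedelta,
               start: Optional[datetime] = None,
               buffer: timedelta = SKEW_BUFFER) -> Tuple[Optional[str], str]:
    """Return (start, expiry) as ISO 8601 UTC strings.

    If access should begin right away, pass no start: the start time is
    omitted entirely and only the expiry is padded for clock skew.
    """
    if start is None:
        return None, (issued_at + lifetime + buffer).strftime(ISO_UTC)
    # A scheduled window is padded on both sides.
    return ((start - buffer).strftime(ISO_UTC),
            (start + lifetime + buffer).strftime(ISO_UTC))
```

Omitting the start time avoids the case where a client whose clock runs slightly behind is rejected because the token is "not yet valid".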

  23. Summary
    • Azure Storage
      • Durable, Scalable and highly Available Cloud Storage
      • Auto load balances to meet scale needs
      • Performance from local, SSD, striped, provisioned
    • Storage Durability Options – LRS, ZRS, and GRS
    • RA-GRS
      • Provides Higher Availability as applications can read from secondary when primary is not available.
      • Client Library retries provide this capability out of the box
    • Details on Internals can be found in the SOSP paper:
      • “Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency”, ACM Symposium on Operating Systems Principles (SOSP), Oct. 2011

  24. What is New?
    • Increased Scale Targets for Storage Accounts
      • Each storage account can hold up to 500 TB for all regions
      • Increased BW for US regions per storage account
        • 10 Gbps Ingress and 20 Gbps Egress
    • Improved Versioning for Shared Access Signatures
    • Client Libraries & Tools
      • .NET Library Desktop, Phone and Runtime with support for Files and REST Version 2014-02-14
      • Java 1.0 RTM
      • Android 0.1 CTP
      • C++ Library CTP
      • AzCopy for Files CTP
      • PowerShell for Files CTP
    • Azure Files Preview

  25. Follow us online
    Facebook: facebook.com/MicrosoftDeveloper.India
    Twitter: twitter.com/msdevindia

  26. Your Feedback is Important
    Fill out evaluation of this session and help shape future events.
    OPTION 1
    OPTION 2
    OPTION 3: Feedback stations outside the hall
