Add value to your data using S3

by Marketing OGZ

Slide 1

Slide 1 text

Add value to your data using S3
Use Case: Arkivverket – Norwegian Public Digital Archive

Slide 2

Slide 2 text

Arkivverket - Intro
About Arkivverket
• Public Norwegian digital archive
• Preserving, securing & making available
• State archive & private archive
• 7 PB of indexed digital archive

Slide 3

Slide 3 text

Why S3 is the right storage API for digital archives
• Open storage API with 100s of advanced features
• Metadata oriented (allows custom metadata to be attached to the data object)
• Scale-out architecture (built on the x86 platform with affordable capacity drives)
• Security features (S3 Object Lock, IAM, versioning)
• Works across public cloud & private cloud

Slide 4

Slide 4 text

Arkivverket – High-Level Workflow
• Dataset received (physical or digital)
• Upload data object to S3 bucket
• Ingest application
• Generate data object & metadata
• Tag custom metadata to data object (x-amz-meta-*)
• Archive web app
• Search engine (object metadata)
• S3 API
• Data presentation (object data)
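The metadata tagging step in this workflow can be sketched in a few lines. This is only an illustration, not Arkivverket's ingest code: the function name `to_amz_meta_headers`, the sample field names, and the choice to replace spaces with hyphens are all assumptions. The sketch builds the `x-amz-meta-*` request headers that would accompany an upload; keys are lowercased because S3 stores user-defined metadata keys in lowercase.

```python
def to_amz_meta_headers(metadata):
    """Map plain key/value pairs to x-amz-meta-* request headers."""
    headers = {}
    for key, value in metadata.items():
        # Assumption: normalise keys to lowercase and replace spaces,
        # since HTTP header names cannot contain spaces.
        name = key.strip().lower().replace(" ", "-")
        if not name:
            raise ValueError("metadata key must not be empty")
        headers["x-amz-meta-" + name] = str(value)
    return headers


# Hypothetical archive record produced by an ingest application.
record = {"Collection": "State archive", "Year": 1905}
headers = to_amz_meta_headers(record)
```

When an SDK is used instead of raw HTTP, the prefix is typically added for you: boto3's `put_object`, for example, takes a `Metadata` dictionary with unprefixed keys.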

Slide 5

Slide 5 text

Metadata – Why is it so important?
BLOB - Image file stored as jpg/tif..
System metadata:
• Date - Object creation date.
• Content-Length - Object size in bytes.
• Last-Modified - Creation date or the last modified date, whichever is the latest.
User-defined metadata:
• x-amz-meta-Patient: Homer
• x-amz-meta-Age: 50
• x-amz-meta-"Scan of": Brain
• x-amz-meta-Scanner: Xray 1
• x-amz-meta-operator: Bart
• …
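To make the distinction between system and user-defined metadata concrete, the sketch below splits a set of object response headers into the two groups. The helper name `split_metadata` and the header values are illustrative assumptions modelled on the example above; no real S3 call is made.

```python
USER_PREFIX = "x-amz-meta-"


def split_metadata(headers):
    """Separate user-defined metadata from the remaining (system) headers."""
    system, user = {}, {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(USER_PREFIX):
            # Strip the prefix so only the custom key remains.
            user[lowered[len(USER_PREFIX):]] = value
        else:
            system[name] = value
    return system, user


# Sample headers modelled on the slide's example; values are illustrative.
response_headers = {
    "Content-Length": "2048",
    "Last-Modified": "Wed, 01 Jan 2020 00:00:00 GMT",
    "x-amz-meta-Patient": "Homer",
    "x-amz-meta-Age": "50",
    "x-amz-meta-Scanner": "Xray 1",
}
system, user = split_metadata(response_headers)
```

A search index built from the `user` dictionary can answer queries such as "all scans for patient Homer" without reading the image data itself.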

Slide 6

Slide 6 text

What does it look like?

Slide 10

Slide 10 text

Why did Arkivverket go from traditional NAS storage to S3?
• Metadata is KEY! – Keeping separate metadata DBs is not sufficient!
• De facto standard open API with a huge community (S3 is the most used object storage API)
• Future proof (Amazon is adding new advanced functionality)
• Ease of use (app developers already know the S3 API from cloud applications)
• Cost per PB/TB/GB/MB! – S3 runs on x86 servers with affordable capacity HDDs

Slide 11

Slide 11 text

How to scale based on metadata?
• Monitor S3 metadata requests (HeadObject).
• Number of objects × metadata requests per second = metadata performance / capacity need.
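The monitoring step could look roughly like the sketch below, which finds the peak number of HeadObject requests in any single second. It assumes the request log has already been parsed into `(epoch_second, operation)` pairs, and that operations are named as in S3 server access logs (`REST.HEAD.OBJECT`); the function name and sample records are hypothetical.

```python
from collections import Counter


def peak_head_rate(records):
    """Return the highest number of HeadObject requests seen in any one second."""
    per_second = Counter(
        second for second, operation in records if operation == "REST.HEAD.OBJECT"
    )
    return max(per_second.values(), default=0)


# Hypothetical, already-parsed log records: (epoch second, operation).
records = [
    (1000, "REST.HEAD.OBJECT"),
    (1000, "REST.GET.OBJECT"),
    (1000, "REST.HEAD.OBJECT"),
    (1001, "REST.HEAD.OBJECT"),
]
peak = peak_head_rate(records)
```

Tracking this peak alongside the object count over time gives the two inputs the sizing rule above relies on.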

Slide 12

Slide 12 text

Key takeaways
• Bring value and availability to your data with custom metadata
• Maximize the value of the S3 API by leveraging advanced functionality
• Offload your application layer with smart storage APIs
• Availability from everywhere with top-level security
