Ivan Subotic¹, Lukas Rosenthaler¹, Heiko Schuldt²
¹Digital Humanities Lab, ²Databases and Information Systems Group
University of Basel, Switzerland
{firstname.lastname}@unibas.ch
JCDL 2013, July 23, 2013
Tolerance and Failure Management
• Management of Complex Information Objects
• Scalability
• Openness and Extensibility
• Resource Discovery and Load Balancing
• Authentication, Authorization, and Auditing
Ivan Subotic (University of Basel) DISTARNET JCDL 2013, July 23 3 / 20
Archival System
3 The DISTARNET System Prototype
4 Evaluation
5 Conclusion and Future Work
Processes
• Self-Configuration
  + Node Joining Process
  + Periodic Neighbor-Node Checking Process
  + Automated Dynamic Replication Process
• Self-Healing
  + Periodic Integrity Checking Process
  + DAO Repairing Process
  + Node Lost Process
  + Reliable Copying Process
  + Data Format Migration Process
• Self-Optimization
  + State Dissemination Process
  + Resource Discovery
  + Parameter Optimization
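As a rough illustration of how the self-healing processes could fit together, the sketch below checks every stored copy of one DAO against a recorded checksum and restores damaged copies from an intact replica. The slides do not say how DISTARNET detects corruption: the SHA-256 digest, the in-memory `replicas` dictionary, the node names, and the function names are all assumptions made for this example only.

```python
import hashlib


def digest(data: bytes) -> str:
    """Checksum used to detect corruption (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def check_and_repair(replicas: dict, expected: str) -> list:
    """Verify every replica of one DAO and overwrite damaged copies.

    replicas maps a hypothetical node name to the bytes stored there.
    Returns the names of the nodes whose copy was repaired.
    """
    intact = [node for node, data in replicas.items() if digest(data) == expected]
    if not intact:
        raise RuntimeError("no intact replica left; DAO cannot be repaired")
    good_copy = replicas[intact[0]]
    repaired = []
    for node, data in list(replicas.items()):
        if digest(data) != expected:
            # Stands in for a reliable copy between nodes.
            replicas[node] = good_copy
            repaired.append(node)
    return repaired


original = b"image bytes"
replicas = {"node-a": original, "node-b": b"corrupted", "node-c": original}
print(check_and_repair(replicas, digest(original)))
```

In this toy run only `node-b` holds a damaged copy, so it is the only entry that gets overwritten.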
and Fault-Tolerance
• Distributed Infrastructure Faults
  + Node Loss: Periodic Neighbor-Node Checking Process, Node Lost Process, Automated Dynamic Replication Process
  + Node Dependability: Periodic Neighbor-Node Checking Process
• Content Faults
  + DAO Corruption / Destruction: Periodic Integrity Checking Process, DAO Repairing Process, Reliable Copying Process
  + DAO Representation Unreadable: Data Format Migration Process
• Node Engine Faults
  + Process Execution / Implementation dependent (more in next section)
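The fault-to-process assignment above can also be read in the other direction, which shows which processes cover more than one fault. The sketch below transcribes the mapping from the slide into a dictionary and inverts it; the variable and function names are made up for this example.

```python
from collections import defaultdict

# Fault -> handling processes, transcribed from the slide.
FAULT_HANDLERS = {
    "Node Loss": [
        "Periodic Neighbor-Node Checking Process",
        "Node Lost Process",
        "Automated Dynamic Replication Process",
    ],
    "Node Dependability": ["Periodic Neighbor-Node Checking Process"],
    "DAO Corruption / Destruction": [
        "Periodic Integrity Checking Process",
        "DAO Repairing Process",
        "Reliable Copying Process",
    ],
    "DAO Representation Unreadable": ["Data Format Migration Process"],
}


def faults_by_process(handlers: dict) -> dict:
    """Invert the mapping: process name -> list of faults it helps handle."""
    index = defaultdict(list)
    for fault, processes in handlers.items():
        for process in processes:
            index[process].append(fault)
    return dict(index)


print(faults_by_process(FAULT_HANDLERS)["Periodic Neighbor-Node Checking Process"])
```

Read this way, the Periodic Neighbor-Node Checking Process is the only process listed under two faults (Node Loss and Node Dependability).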
Test Setup
• 4 nodes running on one machine
• 1 JVM per node
Initial Network State
• Each node owns 1 collection (100 Image and Annotation DAOs)
• 3 replicas of each DAO in network
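To get a feel for the initial network state, the sketch below counts how many DAO copies each node would hold. It treats a collection as 100 DAOs and assumes a simple ring placement (the owner plus the next two nodes); the slides do not state the actual placement policy, and the node names are hypothetical.

```python
from collections import Counter

NODES = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical names
DAOS_PER_COLLECTION = 100  # one collection per node, as in the test setup
REPLICAS = 3


def placement(owner_index: int) -> list:
    """Owner plus the next REPLICAS - 1 nodes in ring order (assumed policy)."""
    return [NODES[(owner_index + k) % len(NODES)] for k in range(REPLICAS)]


copies = Counter()
for owner_index in range(len(NODES)):
    for _ in range(DAOS_PER_COLLECTION):
        copies.update(placement(owner_index))

print(dict(copies))          # copies stored on each node
print(sum(copies.values()))  # total copies in the network
```

Under these assumptions the network stores 4 × 100 × 3 = 1200 copies, 300 on each node, and every DAO still has at least two copies after any single node is lost.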
Distributed Infrastructure Fault Class
2. Content Corruption: Content Fault Class
3. Data Format Obsolescence: Content Fault Class (add new representation)
4. Multi-Failure: Mixed
PICP over Archive with 10 % data loss, DRP, RCP
  + DFMP running for 1 DAO
• Inter-Node Transfer Times
  + 100 Mb/s ≈ 12.5 MB/s ⇒ 8 seconds for 100 MB
  + 1 Gb/s ≈ 125 MB/s ⇒ 0.8 seconds for 100 MB
• Procedure
  + One node with data for F = 1
  + Measure execution time of process mix
  + Add calculated inter-node transfer times
  + Extrapolate under linear assumption
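The inter-node transfer times follow from converting the link rate from megabits to megabytes per second. A minimal sketch of that arithmetic, assuming 8 bits per byte and no protocol overhead (the function name is made up for this example):

```python
def transfer_seconds(size_mb: float, link_mbit_per_s: float) -> float:
    """Time to move size_mb megabytes over a link rated in megabits per second."""
    mbyte_per_s = link_mbit_per_s / 8  # 8 bits per byte, overhead ignored
    return size_mb / mbyte_per_s


for link_mbit in (100, 1000):
    print(link_mbit, "Mb/s:", transfer_seconds(100, link_mbit), "s for 100 MB")
```

This gives 8 seconds per 100 MB at 100 Mb/s and 0.8 seconds at 1 Gb/s, matching the figures on the slide.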
under 24 h • total duration over 24 h

Scaling Factor | 1 | 10 | 100 | 1’000 | 10’000 | 100’000 | 1’000’000
Image DAOs | 100 | 1’000 | 10’000 | 10^5 | 10^6 | 10^7 | 10^8
Archive Size | 10 GB | 100 GB | 1 TB | 10 TB | 100 TB | 1 PB | 10 PB
Duration w. 100 Mb/s | 129.415 s | 0.36 h | 3.59 h | 35.95 h | 15 d | 150 d | 1’495 d
Duration w. 1 Gb/s | 50.894 s | 0.14 h | 1.41 h | 14.14 h | 6 d | 59 d | 589 d

Assumptions / Results
• Conservative: execution under 24 h ⇒ ≈ 10 TB
• Relaxed: execution under 60 days ⇒ ≈ 1 PB
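The extrapolation can be reproduced by scaling the measured F = 1 durations linearly and checking them against a time budget. The sketch below uses only the values from the table; the function names are made up for this example, and the linear model is the slide's own stated assumption rather than a measured result.

```python
# F = 1 durations in seconds, taken from the first column of the table.
BASE_SECONDS = {"100 Mb/s": 129.415, "1 Gb/s": 50.894}
# Archive size per scaling factor, as listed in the table.
ARCHIVE_SIZE = {1: "10 GB", 10: "100 GB", 100: "1 TB", 1_000: "10 TB",
                10_000: "100 TB", 100_000: "1 PB", 1_000_000: "10 PB"}

HOUR = 3600
DAY = 24 * HOUR


def duration_seconds(base_seconds: float, factor: int) -> float:
    """Linear extrapolation: duration grows in proportion to the factor."""
    return base_seconds * factor


def largest_archive_within(link: str, budget_seconds: float):
    """Largest tabulated archive size whose extrapolated run fits the budget."""
    fitting = [f for f in ARCHIVE_SIZE
               if duration_seconds(BASE_SECONDS[link], f) <= budget_seconds]
    return ARCHIVE_SIZE[max(fitting)] if fitting else None


for link in BASE_SECONDS:
    print(link, largest_archive_within(link, DAY),
          largest_archive_within(link, 60 * DAY))
```

With these inputs the 1 Gb/s link fits 10 TB into 24 hours and 1 PB into 60 days, which appears to be the row the conservative and relaxed results refer to. The 100 Mb/s link reaches 1 TB and 100 TB for the same budgets.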
[Figure: bar chart of process-mix execution time (axis 0 s to 150 s) for 100 Mb/s and 1 Gb/s links. Segments: (1) PICP + DRP (Periodic Integrity Checking Process + DAO Repairing Process), (1) RCP (Reliable Copying Process), (2) DFMP (Data Format Migration Process), (2) RCP (Reliable Copying Process). Labelled shares: RCP 95.46 % at 100 Mb/s and 95.57 % at 1 Gb/s; the other labelled sums are 4.54 % and 4.43 %. Data labels: 1.881, 2.256, 4, 0.400.]