Building multi-million document applications with scalability in mind
What does it mean to work with a huge quantity of data? We know that sharding and replication are our friends, but there are also other things that we should consider. In this session I will show you how we have used Raven with scalability in mind.
class learning
• Dedicated software to fill in tax return statements and to run companies' accounting
• Leading company in offering services to accountants
only this year
• Thanks to a network of 35,000 professionals who use its software for free to fill in and send their clients' tax return statements
• 1st CAF in Italy as a private business
different fields that can be filled in by the user or automatically
• Some fields can be validated by rules
• Once correctly filled in, the statement will be sent to the Revenue Agency
associated objects that are treated as a unit for the purpose of data changes
• Document databases are the best way to save aggregates (Aggregate = Document)

“A data model is the model through which we perceive and manipulate our data.”
Martin Fowler, NoSQL Distilled
.NET Framework and the C# language
• We operate with sensitive data, so we need ACID transactions
• For the technical support provided by Managed Designs
section
• A save command will be sent to the server
• On the server the statement is loaded and, through an object mapper, it is filled with the data in the command
• All the rules are applied
• Finally, all the changes will be saved on the database

[Diagram: How it works. UI, command handling (command, result), thin data layer (DTOs, query), domain model (load, change and persist), query side and command side.]
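The command-side steps above (load, map, apply rules, save) can be sketched as follows. This is only an illustration: `SaveSectionCommand`, `Statement`, `SaveSectionHandler` and the dictionary used in place of the database are hypothetical names, not the types of the real application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command: the field values of one section, sent by the client.
public sealed class SaveSectionCommand
{
    public string StatementId { get; set; } = "";
    public Dictionary<string, string> Fields { get; } = new Dictionary<string, string>();
}

// Hypothetical aggregate root: field values plus the errors found by the rules.
public sealed class Statement
{
    public string Id { get; set; } = "";
    public Dictionary<string, string> Fields { get; } = new Dictionary<string, string>();
    public List<string> Errors { get; } = new List<string>();
}

public sealed class SaveSectionHandler
{
    private readonly Dictionary<string, Statement> _store; // stand-in for the database
    private readonly List<Action<Statement>> _rules;       // each rule may add errors

    public SaveSectionHandler(Dictionary<string, Statement> store, List<Action<Statement>> rules)
    {
        _store = store;
        _rules = rules;
    }

    public Statement Handle(SaveSectionCommand command)
    {
        // 1. Load the statement.
        var statement = _store[command.StatementId];

        // 2. Map the data in the command onto the statement.
        foreach (var pair in command.Fields)
            statement.Fields[pair.Key] = pair.Value;

        // 3. Apply all the rules.
        statement.Errors.Clear();
        foreach (var rule in _rules)
            rule(statement);

        // 4. Save the changes.
        _store[statement.Id] = statement;
        return statement;
    }
}
```

The handler returns the statement so that the caller can build a command result from it, for example from the `Errors` list.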
It persists all the changes made within the transaction in an atomic operation
• If there are mistakes or exceptions, no changes are saved

UnitOfWork + Repository: BeginUnitOfWork(), EndUnitOfWork()

    using (var tran = new TransactionScope())
    {
        tran.Complete();
    }
and data layer using a collection-like interface for accessing our aggregates
• Changes will be persisted on storage in a completely transparent way
• With RavenDB we don’t have to write a mapping layer

UnitOfWork + Repository: BeginUnitOfWork(), EndUnitOfWork()

    using (var tran = new TransactionScope())
    {
        Repository.GetById(id);
        tran.Complete();
    }

Repository.GetById(id)

    var s = session.Load<Statement>(id);
    // Changes occur in the aggregate root

Repository.Update()

    session.SaveChanges();
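To show how the two patterns fit together, here is a minimal, self-contained sketch. It does not use the RavenDB client: `InMemorySession` is a hypothetical stand-in that only mimics the load-then-save-changes behaviour of a document session, and `StatementRepository` and `UnitOfWork.Run` are illustrative names as well.

```csharp
using System;
using System.Collections.Generic;

public sealed class Statement
{
    public string Id { get; set; } = "";
    public decimal Income { get; set; }
    public Statement Clone() => (Statement)MemberwiseClone();
}

// Hypothetical stand-in for a document session: it hands out tracked copies
// and writes them back only when SaveChanges() is called.
public sealed class InMemorySession
{
    private readonly Dictionary<string, Statement> _committed;
    private readonly Dictionary<string, Statement> _tracked = new Dictionary<string, Statement>();

    public InMemorySession(Dictionary<string, Statement> committed) { _committed = committed; }

    public Statement Load(string id)
    {
        if (!_tracked.TryGetValue(id, out var entity))
        {
            entity = _committed[id].Clone();
            _tracked[id] = entity;
        }
        return entity;
    }

    public void SaveChanges()
    {
        foreach (var pair in _tracked)
            _committed[pair.Key] = pair.Value.Clone();
    }
}

// Collection-like access to the aggregates; no mapping layer in between.
public sealed class StatementRepository
{
    private readonly InMemorySession _session;
    public StatementRepository(InMemorySession session) { _session = session; }
    public Statement GetById(string id) => _session.Load(id);
}

public static class UnitOfWork
{
    // Runs the work against a fresh session and saves only if it does not throw.
    public static void Run(Dictionary<string, Statement> database, Action<StatementRepository> work)
    {
        var session = new InMemorySession(database);
        work(new StatementRepository(session));
        session.SaveChanges(); // not reached when the work throws
    }
}
```

In this sketch an exception inside the unit of work simply skips `SaveChanges()`, so the committed data is untouched. A real store would also have to guarantee that the save itself is atomic, which this in-memory version does not attempt.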
statement
• With an operation that is the reverse of saving, a view model is populated with only the data of the section to display
• The view model is then sent to the client serialized as JSON

[Diagram: How it works. UI, command handling (command, result), thin data layer (DTOs, query), domain model (load, change and persist), query side and command side.]
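The query side described above can be sketched like this. The shapes of `Statement`, `StatementField` and `SectionViewModel` are assumptions made for the example, and `System.Text.Json` is used here only as a convenient serializer; the slides do not say which one the application uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical field of a statement, tagged with the section it belongs to.
public sealed class StatementField
{
    public string Section { get; set; } = "";
    public string Name { get; set; } = "";
    public string Value { get; set; } = "";
}

public sealed class Statement
{
    public string Id { get; set; } = "";
    public List<StatementField> Fields { get; } = new List<StatementField>();
}

// Hypothetical view model: only the data of the section to display.
public sealed class SectionViewModel
{
    public string StatementId { get; set; } = "";
    public string Section { get; set; } = "";
    public Dictionary<string, string> Fields { get; set; } = new Dictionary<string, string>();
}

public static class SectionViewModelBuilder
{
    // Reverse of saving: copy one section out of the loaded statement.
    public static SectionViewModel Build(Statement statement, string section)
    {
        return new SectionViewModel
        {
            StatementId = statement.Id,
            Section = section,
            Fields = statement.Fields
                .Where(f => f.Section == section)
                .ToDictionary(f => f.Name, f => f.Value)
        };
    }

    // Serialize the view model for the client.
    public static string ToJson(SectionViewModel viewModel) => JsonSerializer.Serialize(viewModel);
}
```

Because the view model is a subset of a single document, it can be built from a plain load by id, without going through an index.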
in this case because:
1. All the data that we needed was a subset of the statement document
2. We did not want to handle the asynchronous update of the indexes
• We used indexes for reports, look-up lists and summary information
or a software solution to handle increased loads of work
• A solution that can scale out can usually grow to larger loads in a more convenient way

SCALE UP: growing by using stronger hardware
SCALE OUT: growing by adding more hardware
be consistent so they can be easily distributed without compromising performance
• Obviously this “indivisible whole of consistency” is an aggregate
• So, first of all, define your aggregate boundaries
to divide the data per accountant
• Each statement is prepared by a single accountant
• So we will not need to run queries on multiple servers at the same time
• From the support/back office point of view we need cross-accountant queries
• Accountant affinity requires us to keep “related” accountants as near as possible

[Diagram: Node 0, Node 1, ... Node N, each hosting one or more shard DBs.]
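One possible way to express "shard per accountant, with affinity" is a resolver that maps an accountant to a node through an optional affinity group. This is a sketch under assumptions: `AccountantShardResolver`, the affinity-group dictionary and the hash-based placement are all hypothetical and are not taken from the application or from the RavenDB sharding API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class AccountantShardResolver
{
    private readonly string[] _nodes;
    private readonly Dictionary<string, string> _affinityGroupByAccountant;

    public AccountantShardResolver(IEnumerable<string> nodes,
                                   Dictionary<string, string> affinityGroupByAccountant)
    {
        _nodes = nodes.ToArray();
        if (_nodes.Length == 0) throw new ArgumentException("At least one node is required.");
        _affinityGroupByAccountant = affinityGroupByAccountant;
    }

    // Normal case: every query for an accountant goes to exactly one node.
    public string ResolveNode(string accountantId)
    {
        // Related accountants share a group key, so they land on the same node.
        var key = _affinityGroupByAccountant.TryGetValue(accountantId, out var group)
            ? group
            : accountantId;
        return _nodes[(int)(StableHash(key) % (uint)_nodes.Length)];
    }

    // Support/back office case: cross-accountant queries fan out to every node.
    public IReadOnlyList<string> AllNodes => _nodes;

    // FNV-1a over the UTF-16 code units. string.GetHashCode() is avoided because
    // it is not guaranteed to be stable across processes.
    private static uint StableHash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in key)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return hash;
        }
    }
}
```

Note that plain modulo placement moves most keys when a node is added; an explicit lookup table from group to node would avoid that, at the cost of having to maintain it.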
be on the cutting edge
• The unavailability of the software in those moments can seriously harm our clients
• Master-slave replication can be crucial to ensure high availability of the application
Switch
– logical-slave becomes the logical-master
– logical-master switches to logical-slave
• VIP Switch the application to use the logical-master through Master-Master Replication
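The role switch can be pictured with a very small sketch. It is one reading of the slide, not the actual failover mechanism: `ReplicaPair` is a hypothetical type, and `WriteTarget` only stands for whatever address the application is pointed at.

```csharp
using System;

// Two servers replicate to each other; only their logical roles change.
public sealed class ReplicaPair
{
    public string LogicalMaster { get; private set; }
    public string LogicalSlave { get; private set; }

    public ReplicaPair(string logicalMaster, string logicalSlave)
    {
        LogicalMaster = logicalMaster;
        LogicalSlave = logicalSlave;
    }

    // The address the application should use: always the logical master.
    public string WriteTarget => LogicalMaster;

    // Switch: the logical slave becomes the logical master and vice versa.
    public void Switch()
    {
        var previousMaster = LogicalMaster;
        LogicalMaster = LogicalSlave;
        LogicalSlave = previousMaster;
    }
}
```

Since both servers already hold the data, the switch only changes which one the application talks to.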
be replicated at least to n servers in the replica set;
• Deploy RavenDB server on VMs on SAN;
– Each SAN shadow copies the VM at each change, amazingly fast;
– If HW fails, restore the SAN snapshot to a replicated SAN and move on;