PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript

Time series have great potential for querying, parallelisation and query composition because of the nature of the data itself. This is exactly why general-purpose stores will never be the best fit for time series data.
Continuum, a time series store, was built with the key features of time series data in mind. State-of-the-art algebraic structures can be used to build a flexible, composable query API, and compact data structures can be used to save space and reduce deserialisation effort when reading data from the database.
An adventure in building a modern time series database in C and Haskell, with a ClojureScript front end, that brings visibility into everything going on in your system to a whole new level.

αλεx π

July 03, 2015

Transcript

  1. ClojureWerkz: 35+ high-quality Clojure libraries; user reports from all over the world; 20+ active contributors; we value documentation.
  2. Time Series Data: LevelDB backend; tight combination of Haskell and C; optimising space and reads; flexible aggregates; parallel queries.
  3. Sequence ID: used to avoid timestamp resolution collisions; ensures sub-resolution order; the data is snapshotted on overflow or timeout; ensures idempotence.
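
The role of the sequence ID on slide 3 can be illustrated with a minimal sketch. It assumes a record key is a (timestamp, sequence ID) pair compared timestamp first; the names `Key`, `nextKey` and `assignKeys` are hypothetical and are not taken from Continuum. The overflow and timeout snapshotting mentioned on the slide is deliberately left out.

```haskell
import Data.Word (Word32, Word64)

-- Hypothetical key: the derived Ord compares the timestamp first and the
-- sequence ID second, so records sharing a timestamp keep arrival order.
data Key = Key
  { keyTimestamp :: !Word64
  , keySequence  :: !Word32
  } deriving (Eq, Ord, Show)

-- Next key for a record with timestamp t, given the previously issued key.
-- The sequence ID restarts at 0 whenever the timestamp changes.
-- Assumption: overflow of the sequence ID is not handled in this sketch.
nextKey :: Maybe Key -> Word64 -> Key
nextKey (Just (Key prevT prevS)) t
  | prevT == t = Key t (prevS + 1)
nextKey _ t = Key t 0

-- Assign keys to a batch of timestamps in arrival order.
assignKeys :: [Word64] -> [Key]
assignKeys = go Nothing
  where
    go _ [] = []
    go prev (t : ts) =
      let k = nextKey prev t
      in k : go (Just k) ts
```

Loaded in GHCi, `assignKeys [10, 10, 11]` gives three distinct, ordered keys even though two records share the timestamp 10.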
  4. Range Tables store the snapshotted ranges. [Diagram: ranges numbered 1 to 13]
  5. Full Table Scan. [Diagram: ranges 1 to 13 with Start and End markers]
  6. Open Range. [Diagram: ranges 1 to 13 with Start and End markers]
  7. “Between” Range. [Diagram: ranges 1 to 13 with Start and End markers]
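
The three query shapes on slides 5 to 7 can be sketched over an ordered map. This is only an illustration: it assumes integer timestamps as keys, reads "open range" as bounded at the start only, and treats "between" as inclusive at both ends. The `Range` type and `selectRange` are hypothetical names, and `Data.Map.Strict` from the containers package stands in for the real storage layer.

```haskell
import qualified Data.Map.Strict as Map

type Timestamp = Int

-- Hypothetical description of the three query shapes.
data Range
  = FullScan                     -- every stored entry
  | OpenFrom Timestamp           -- everything at or after the start
  | Between Timestamp Timestamp  -- inclusive on both ends
  deriving (Eq, Show)

-- Select the entries covered by a range, in ascending key order.
-- The predicates are monotone in the key, as the Antitone functions require.
selectRange :: Range -> Map.Map Timestamp v -> [(Timestamp, v)]
selectRange FullScan m = Map.toAscList m
selectRange (OpenFrom start) m =
  Map.toAscList (Map.dropWhileAntitone (< start) m)
selectRange (Between start end) m =
  Map.toAscList
    (Map.takeWhileAntitone (<= end) (Map.dropWhileAntitone (< start) m))
```

Because the keys are ordered, the open and between queries skip straight to the first matching key rather than scanning the whole table.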
  8. data Step a s = Yield a !s
                   | Skip !s
                   | Done

     data Stream a = ∃s. Stream (s → Step a s) s
  9. maps :: (a → b) → Stream a → Stream b
     maps f (Stream next0 s0) = Stream next s0
       where
         next !s = case next0 s of
           Done       → Done
           Skip s'    → Skip s'
           Yield x s' → Yield (f x) s'
  10. filters :: (a → Bool) → Stream a → Stream a
      filters p (Stream next0 s0) = Stream next s0
        where
          next !s = case next0 s of
            Done → Done
            Skip s' → Skip s'
            Yield x s'
              | p x       → Yield x s'
              | otherwise → Skip s'
  11. foldls :: (b → a → b) → b → Stream a → b
      foldls f z (Stream next s0) = loop z s0
        where
          loop z s = case next s of
            Yield x s' → loop (f z x) s'
            Skip s'    → loop z s'
            Done       → z
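
The stream definitions on slides 8 to 11 can be put together into a compilable sketch. It spells `∃s.` as `forall s.` under the `ExistentialQuantification` extension, which is how GHC expresses it. `fromList` and `sumOfEvenSquares` are hypothetical helpers added only for illustration, and the accumulator in `foldls` is made strict here, which the slide does not do.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ExistentialQuantification #-}

-- One step of a stream: emit a value, skip, or stop.
data Step a s
  = Yield a !s
  | Skip !s
  | Done

-- A stream hides its state type behind an existential.
data Stream a = forall s. Stream (s -> Step a s) s

-- Hypothetical helper (not on the slides): the list itself is the state.
fromList :: [a] -> Stream a
fromList = Stream next
  where
    next []       = Done
    next (x : xs) = Yield x xs

maps :: (a -> b) -> Stream a -> Stream b
maps f (Stream next0 s0) = Stream next s0
  where
    next !s = case next0 s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s' -> Yield (f x) s'

filters :: (a -> Bool) -> Stream a -> Stream a
filters p (Stream next0 s0) = Stream next s0
  where
    next !s = case next0 s of
      Done -> Done
      Skip s' -> Skip s'
      Yield x s'
        | p x       -> Yield x s'
        | otherwise -> Skip s'

-- Consume the stream with a strict left fold.
foldls :: (b -> a -> b) -> b -> Stream a -> b
foldls f z0 (Stream next s0) = loop z0 s0
  where
    loop !z !s = case next s of
      Yield x s' -> loop (f z x) s'
      Skip s'    -> loop z s'
      Done       -> z

-- Hypothetical pipeline: filter, map and fold in a single pass.
sumOfEvenSquares :: [Int] -> Int
sumOfEvenSquares =
  foldls (+) 0 . maps (\x -> x * x) . filters even . fromList
```

Loaded in GHCi, `sumOfEvenSquares [1 .. 10]` evaluates to 220. None of the combinators is recursive; only `foldls` loops, so no intermediate list is built between the stages.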
  12. data (Monoid b) => Fold a b = ∃x. Fold (x → a → x) x (x → b)
      -- fields, in order: step, initial, finalize

      Append

      class Monoid a where
        mempty  :: a            -- ^ Identity of 'mappend'
        mappend :: a -> a -> a  -- ^ An associative operation
  13. Count

      data Count = Count Int

      op_count :: ∃a. Fold a Count
      op_count = Fold (\i _ -> i + 1) 0 Count

      instance Monoid Count where
        mempty = Count 0
        mappend (Count a) (Count b) = Count $ a + b

      instance Aggregate Count Int where
        combine (Count a) = a
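
The `Fold` and `Count` slides can be sketched as a self-contained module for a current GHC. This is an approximation under stated assumptions: the deprecated datatype context `(Monoid b) =>` is dropped, a `Semigroup` instance is added because `Monoid` has required one since GHC 8.4, and `runFold` and `countChunks` are hypothetical helpers that are not on the slides. The `Aggregate` class is omitted because its definition is not shown.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')

-- A fold is a step function, an initial accumulator and a finalizer.
-- The accumulator type x is hidden from the caller.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

-- Hypothetical driver: run a fold over an in-memory list.
runFold :: Fold a b -> [a] -> b
runFold (Fold step initial finalize) = finalize . foldl' step initial

newtype Count = Count Int
  deriving (Eq, Show)

-- Partial counts merge by addition, which is associative.
instance Semigroup Count where
  Count a <> Count b = Count (a + b)

instance Monoid Count where
  mempty = Count 0

-- Counts elements of any type; the element itself is ignored.
opCount :: Fold a Count
opCount = Fold (\i _ -> i + 1) (0 :: Int) Count

-- Fold each chunk independently, then merge the partial results.
countChunks :: [[a]] -> Count
countChunks = mconcat . map (runFold opCount)
```

The monoid is what makes parallel queries possible in principle: each chunk could be folded on a separate worker, and associativity guarantees that merging the partial results gives the same answer as one sequential pass.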
  14. Mean

      data (Num a) => Mean a = Mean [a]

      op_mean :: (Integral a) => Fold a (Mean a)
      op_mean = Fold (flip (:)) [] Mean

      instance (Integral a) => Monoid (Mean a) where
        mempty = Mean []
        mappend (Mean a) (Mean b) = Mean $ a ++ b

      instance (Integral a) => Aggregate (Mean a) Double where
        combine (Mean []) = 0
        combine (Mean a)  = s / l
          where s = fromIntegral $ sum a
                l = fromIntegral $ length a
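
The `Mean` on slide 14 keeps every sample in a list until the final `combine`. As a point of comparison, here is a sketch of a different technique, not the one on the slide: the accumulator holds only a running sum and a count, which is still a monoid and gives the same mean. The names `MeanAcc`, `addSample`, `partialMean` and `meanOf` are hypothetical.

```haskell
-- Running sum and element count; enough to recover the mean later.
data MeanAcc = MeanAcc !Integer !Int
  deriving (Eq, Show)

-- Two partial results merge by adding sums and counts.
instance Semigroup MeanAcc where
  MeanAcc s1 n1 <> MeanAcc s2 n2 = MeanAcc (s1 + s2) (n1 + n2)

instance Monoid MeanAcc where
  mempty = MeanAcc 0 0

-- Step function: fold one sample into the accumulator.
addSample :: Integral a => MeanAcc -> a -> MeanAcc
addSample (MeanAcc s n) x = MeanAcc (s + toInteger x) (n + 1)

-- Accumulate one chunk of samples.
partialMean :: Integral a => [a] -> MeanAcc
partialMean = foldl addSample mempty

-- Finalizer: an empty accumulator yields 0, as on the slide.
meanOf :: MeanAcc -> Double
meanOf (MeanAcc _ 0) = 0
meanOf (MeanAcc s n) = fromIntegral s / fromIntegral n
```

The trade-off is that the list-based version keeps the raw samples available for other aggregates, while this one uses constant space per partial result.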
  15. Group

      op_groupBy :: (Ord a, Monoid b) => (r -> Maybe a) -> (Fold r b) -> Fold r (MapResult a b)
      op_groupBy groupFn (Fold f z0 e) =
        let subStep n Nothing  = return $! (f z0 n)
            subStep n (Just a) = return $! (f a n)
            localStep m record = maybe m
                                       (\r -> Map.alter (subStep record) r m)
                                       (groupFn record)
            done a = MapResult $ Map.map e a
        in Fold localStep Map.empty done
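
The grouping combinator on slide 15 can be sketched in a self-contained form. The sketch assumes the same three-field `Fold` as the earlier slides and returns a plain `Data.Map` instead of the slide's `MapResult` wrapper, whose definition is not shown. `groupByFold`, `runFold`, `countFold` and `hitsPerHost` are hypothetical names.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (foldl')
import qualified Data.Map.Strict as Map

data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

-- Hypothetical driver: run a fold over an in-memory list.
runFold :: Fold a b -> [a] -> b
runFold (Fold step initial finalize) = finalize . foldl' step initial

-- Keep one inner accumulator per key and finalize them all at the end.
-- Records for which the key function returns Nothing are skipped.
groupByFold :: Ord k => (r -> Maybe k) -> Fold r b -> Fold r (Map.Map k b)
groupByFold keyOf (Fold step initial finalize) =
  Fold localStep Map.empty (Map.map finalize)
  where
    localStep m record = case keyOf record of
      Nothing -> m
      Just k  -> Map.alter (Just . advance record) k m
    -- A new group starts from the inner fold's initial accumulator.
    advance record Nothing    = step initial record
    advance record (Just acc) = step acc record

-- Hypothetical inner fold: count records.
countFold :: Fold a Int
countFold = Fold (\n _ -> n + 1) 0 id

-- Hypothetical query: number of records per host name.
hitsPerHost :: [(String, Int)] -> Map.Map String Int
hitsPerHost = runFold (groupByFold (Just . fst) countFold)
```

Because the result of `groupByFold` is itself a `Fold`, it can be passed as the inner fold of another `groupByFold`, which is one way to read the nested aggregates mentioned on the following slide.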
  16. Other examples: several aggregates in one run; group by field, time or combination; nested aggregates of any type.
  17. data DbValue = DbString ByteString
                   | DbLong Integer
                   | DbInt Integer
                   | DbShort Integer
                   | DbByte Integer
                   | DbFloat Float
                   | DbDouble Double
                   deriving (Eq, Show, Ord, Generic)
  18. Layout

      0                8                16
      +----------------+----------------+----------------+ ...
      | field 1 offset | field 2 offset | field 3 offset | ...
      +----------------+----------------+----------------+ ...

      24             24 + offset    24 + offset    ...
      +--------------+--------------+--------------+ ...
      | field 1 data | field 2 data | field 3 data | ...
      +--------------+--------------+--------------+
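
The layout on slide 18 can be approximated with a short encoder and reader. The slide does not give the exact encoding, so everything here is an assumption: 8-byte big-endian offset slots, offsets measured from the start of the data section, and each field ending where the next one begins. `encodeRecord`, `readOffset` and `readField` are hypothetical names; the bytestring package is used for the byte handling.

```haskell
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word64)

-- Assumption: every offset slot in the header is 8 bytes wide.
slotSize :: Int
slotSize = 8

-- Header of per-field offsets followed by the concatenated field data.
encodeRecord :: [BS.ByteString] -> BS.ByteString
encodeRecord fields = BL.toStrict (B.toLazyByteString (header <> body))
  where
    -- Start offset of each field, relative to the data section.
    offsets = take (length fields) (scanl (+) 0 (map BS.length fields))
    header  = mconcat [B.word64BE (fromIntegral o) | o <- offsets]
    body    = mconcat (map B.byteString fields)

-- Decode the big-endian offset stored in header slot i.
readOffset :: BS.ByteString -> Int -> Int
readOffset record i = fromIntegral (BS.foldl' step (0 :: Word64) slot)
  where
    slot = BS.take slotSize (BS.drop (i * slotSize) record)
    step acc w = (acc `shiftL` 8) .|. fromIntegral w

-- Read field i without decoding any other field's data.
readField :: Int -> Int -> BS.ByteString -> Maybe BS.ByteString
readField fieldCount i record
  | i < 0 || i >= fieldCount = Nothing
  | otherwise =
      Just (BS.take (end - start) (BS.drop (dataStart + start) record))
  where
    dataStart = fieldCount * slotSize
    start = readOffset record i
    end
      | i + 1 < fieldCount = readOffset record (i + 1)
      | otherwise          = BS.length record - dataStart
```

With three fields the data section starts at byte 24, matching the diagram. Reading one field touches only two header slots and that field's bytes, which is one way to reduce deserialisation effort on reads.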
  19. Adding fields appends to the end; written data is unchanged or nullified; removed fields are ignored and unavailable.
  20. Snapshot consensus; rolling CRC of the data; asynchronous; no quorum for snapshot reads; parallel reads from snapshotted data.