• When a blocking I/O operation occurs, everything in that thread halts and waits; system resources potentially sit idle (a minimal example follows)
• Internally, that “blocking” operation is a big loop per operation asking “Are we there yet?”
• “Idleness” occurs because the thread waiting for I/O to complete is doing nothing while it waits
• This can vastly limit our ability to scale
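To make the first bullet concrete, a minimal sketch in plain Scala over java.net sockets (the host is a placeholder):

import java.net.Socket

val socket = new Socket("example.com", 80) // placeholder host
val in     = socket.getInputStream
val buf    = new Array[Byte](4096)
val n      = in.read(buf) // blocks this whole thread until bytes (or EOF) arrive
// “Are we there yet?” The thread is pinned here, doing no useful work.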
• Eschew blocking individually on each operation
• Find ways to work with the kernel to more efficiently manage multiple blocking resources in groups
• Free up threads which are no longer blocked waiting on I/O, letting them handle other requests while the I/O-bound requests idle
• By no longer blocking, and instead reusing threads while I/O waits, we need a way to handle “completion” events
• Asynchronous techniques (such as callbacks) allow us to achieve this (see the sketch below)
• “Step to the side of the line, and we’ll call you when your order is ready”
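A hedged sketch of the callback idea; AsyncConnection and get are hypothetical names for illustration, not a real API:

trait AsyncConnection {
  // returns immediately; invokes the callback when the reply arrives
  def get(key: String)(onComplete: Either[Throwable, String] => Unit): Unit
}

def lookup(conn: AsyncConnection): Unit =
  conn.get("user:42") {
    case Right(value) => println("Order ready: " + value) // “we’ll call you”
    case Left(err)    => println("Order failed: " + err.getMessage)
  }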
Blocking I/O locks us into some paradigms:
• Servers
  • A 1:1 ratio of threads to client connections (scale limited)
• Clients
  • Connection pools often larger than necessary, to account for connection resources that are unavailable while blocking
Asynchronous I/O can change our destiny:
• Servers
  • A <Many Client Connections>:<One Thread> ratio becomes possible (scale: C10k and beyond)
  • A thread waiting for an I/O response no longer needs to block the connection, allowing other threads to reuse that connection simultaneously
  • Operations such as a “write” can queue up until the resource is ready, returning the thread of execution immediately and calling back with results later
  • Of course, this makes dispatch and good concurrency even tougher (who said doing things right was ever easy?)
• Clients
  • Significantly reduced pool sizes
  • Connection resources can be leveraged by many more simultaneous threads
Java NIO, introduced in Java 1.4:
• Focuses on low-level I/O, as opposed to “Old” I/O’s high-level API
• ByteBuffers: http://www.kdgregory.com/index.php?page=java.byteBuffer
• Channels register with a Selector, and request a window to read and write
• Callback when ready
• Core units of work: Buffer, Channel and Selector
  • Buffers are contiguous memory slots, offering data transfer operations (a short sketch follows below)
  • Channel instances are “bulk” data wrappers around Buffers
  • A Selector is an event monitor which can watch multiple Channel instances in a single thread (the kernel coordinator)
  • This relocates the task of checking I/O status out of the “execution” threads
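A small sketch of Buffer mechanics using nothing beyond the standard java.nio API: fill, flip, drain.

import java.nio.ByteBuffer

val buf = ByteBuffer.allocate(64)  // a contiguous memory slot
buf.put("hello".getBytes("UTF-8")) // writing advances the position
buf.flip()                         // limit = position, position = 0: ready to read
val out = new Array[Byte](buf.remaining)
buf.get(out)                       // drain what was written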
I sketched out a few possible examples of NIO here
• Conclusion: for time & sanity, omit a code sample
• Here’s what you need to know:
  • Register “interests” with a Selector (read, write, etc.)
  • Write a Selector loop which checks for notification events
  • Dispatch incoming events (such as “read”)
  • Want to write? Tell the Selector you want to write
  • Eventually, when the Channel is available to write, an event will notify the Selector
  • Remove the “interested in writing” status
  • Dispatch “time to write” to the original caller
  • Write
  • Rinse, repeat (a minimal loop is sketched below anyway)
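Though the slides omit a sample, here is a minimal sketch of that loop over the standard java.nio.channels API (error handling elided; the port is arbitrary):

import java.net.InetSocketAddress
import java.nio.channels.{SelectionKey, Selector, ServerSocketChannel}

val selector = Selector.open()
val server   = ServerSocketChannel.open()
server.configureBlocking(false)
server.socket().bind(new InetSocketAddress(9000)) // arbitrary port
server.register(selector, SelectionKey.OP_ACCEPT) // register our “interest”

while (selector.select() >= 0) {       // wait for notification events
  val keys = selector.selectedKeys.iterator
  while (keys.hasNext) {
    val key = keys.next(); keys.remove()
    if (key.isAcceptable)    { /* accept, register OP_READ on the new channel */ }
    else if (key.isReadable) { /* dispatch the “read” event */ }
    else if (key.isWritable) {
      // write, then remove the “interested in writing” status
      key.interestOps(key.interestOps & ~SelectionKey.OP_WRITE)
    }
  }
}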
Netty supports multiple transports (nio, oio)
• Really, it’s just wrapping NIO to provide a higher-level abstraction
• Hides the Selector nonsense away
• A composable “Filter Pipeline” allows you to intercept multiple levels of input and output
• ChannelBuffers
  • Can composite multiple ChannelBuffers (a sketch follows below)
  • Organize individual pieces in one composite buffer
  • Support ByteBuffer, arrays, etc.
  • Avoid memory copies as much as possible
  • Direct memory is allocated (rumors of memory leaks abound)
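A sketch of compositing with Netty 3.x’s ChannelBuffers helper: wrappedBuffer builds a composite view over the pieces without copying them (copiedBuffer, by contrast, would allocate and copy).

import org.jboss.netty.buffer.ChannelBuffers

val header  = ChannelBuffers.wrappedBuffer(Array[Byte](0x01, 0x02))
val payload = ChannelBuffers.wrappedBuffer("hello".getBytes("UTF-8"))
// one logical buffer over both pieces; no memory copy involved
val message = ChannelBuffers.wrappedBuffer(header, payload)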
A MacGuffin: “A plot element that catches the viewers’ attention or drives the plot of a work of fiction” (sometimes “maguffin” or “McGuffin” as well)
• We aren’t writing fiction here, but we can use a MacGuffin to drive our story
• For our MacGuffin, we’ll examine asynchronous networking against a MongoDB server
• This is a project I’ve already spent a good bit of time on: Hammersmith
• A few focal points to lead our discussion:
  • Decoding & dispatching inbound messages
  • Handling errors & exceptions across threads, time, and space
  • “Follow up” operations which rely on server-side “same connection” context
  • Working with multi-state iterations which depend on I/O for operations... a.k.a. “database cursors”
• Network layers don’t know or care about your fancy application-layer protocol
• The kernel reads things off the network into a big buffer of bytes
• It’s up to us to figure out which parts of the bytes are relevant where
• Challenge: separate out individual writes and send them to the right place
• Conceptually, the “Law of Demeter” (loose coupling) helps here; doing it by hand, you have to be careful not to eat somebody else’s lunch
• NIO leaves you on your own
• Netty’s pipeline provides the “don’t eat my lunch” fix quite well, IMHO
class ReplyMessageDecoder extends LengthFieldBasedFrameDecoder(1024 * 1024 * 4, 0, 4, -4, 0) with Logging {

  protected override def decode(ctx: ChannelHandlerContext, channel: Channel, buffer: ChannelBuffer): AnyRef = {
    val frame = super.decode(ctx, channel, buffer).asInstanceOf[ChannelBuffer]
    if (frame == null) {
      // don't have the whole message yet; netty will retry later
      null
    } else {
      // we have one message (and nothing else) in the "frame" buffer
      MongoMessage.unapply(new ChannelBufferInputStream(frame)) match {
        case reply: ReplyMessage => reply
        case default =>
          // this should not happen
          throw new Exception("Unknown message type '%s' incoming from MongoDB; ignoring.".format(default))
      }
    }
  }
  // Because we return a new object rather than a buffer from decode(),
  // we can use slice() here according to the docs (the slice won't escape
  // the decode() method so it will be valid while we're using it)
  protected override def extractFrame(buffer: ChannelBuffer, index: Int, length: Int): ChannelBuffer = {
    buffer.slice(index, length)
  }
}
override def messageReceived(ctx: ChannelHandlerContext, e: MessageEvent) {
  val message = e.getMessage.asInstanceOf[MongoMessage]
  log.debug("Incoming Message received type %s", message.getClass.getName)
  message match {
    case reply: ReplyMessage => {
      // ...
• The asynchronous, callback-driven nature makes dispatching errors difficult: the “throw / catch” model becomes complicated
• Scala has a great construct to help with this: Either[L, R]
  • Pass a monad that can have one of two states: failure or success
  • By convention, “Left” is an error, “Right” is success
  • Node does a similar passing of error vs. result
  • There’s no special handling difference between Netty and NIO here
  • Implicit tricks for the lazy who want to just write a “success” block
sealed trait RequestFuture {
  type T
  val body: Either[Throwable, T] => Unit

  def apply(error: Throwable) = body(Left(error))
  def apply[A <% T](result: A) = body(Right(result.asInstanceOf[T]))

  protected[futures] var completed = false
}

/**
 * Will pass any *generated* _id along with any relevant getLastError information.
 * For an update, don't expect to get an ObjectId.
 */
trait WriteRequestFuture extends RequestFuture {
  type T <: (Option[AnyRef] /* ID Type */, WriteResult)
}

implicit def asWriteOp(f: Either[Throwable, (Option[AnyRef], WriteResult)] => Unit) =
  RequestFutures.write(f)
val handler = RequestFutures.write((result: Either[Throwable, (Option[AnyRef], WriteResult)]) => {
  result match {
    case Right((oid, wr)) => {
      ok = Some(true)
      id = oid
    }
    case Left(t) => {
      ok = Some(false)
      log.error(t, "Command Failed.")
    }
  }
})

mongo.insert(Document("foo" -> "bar", "bar" -> "baz"))(handler)
Databases like MySQL, MongoDB, etc. have contextual operations as “follow ups” to a write, which can only be called on the same connection as the write
• MySQL has last_insert_id() to fetch a generated autoincrement value
• MongoDB has getLastError() to check the success/failure of a write (and to explicitly specify consistency requirements)
• Somewhat easy in a synchronous framework:
  • Lock the connection out of the pool and keep it private
  • Don’t let anyone else touch it until you’re done
• In async, it’s harder
  • The only solution I’ve found is “ballot box stuffing” (sketched below)
  • A deliberate reversal of the “decoding problem”
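A hedged sketch of the “ballot box stuffing” idea; the byte arrays stand in for Hammersmith’s real message encoding, which isn’t shown here. The write and its getLastError follow-up are combined into one buffer and handed to the channel as a single write, so no other operation can interleave between them on that connection.

import org.jboss.netty.buffer.ChannelBuffers
import org.jboss.netty.channel.Channel

// encodedInsert / encodedGetLastError are pre-encoded wire messages
// (hypothetical names); the trick is the single, combined write
def stuffedWrite(channel: Channel,
                 encodedInsert: Array[Byte],
                 encodedGetLastError: Array[Byte]): Unit = {
  val combined = ChannelBuffers.wrappedBuffer(encodedInsert, encodedGetLastError)
  channel.write(combined) // both ops hit the server back-to-back on one connection
}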
With traditional iteration, we are working with a dual-state monad. There are two primary calls on an Iterator[A]:
• hasNext: Boolean
• next(): A
• In its pure and simple form, the Iterator[A] is prepopulated with all of its elements (a trivial example follows)
• If the buffer is non-empty, hasNext == true and next() returns another element
• When the buffer is empty, iteration halts completely because there’s nothing left to iterate: hasNext == false, and next() returns null, throws an exception, or similar
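The trivial, prepopulated case in plain Scala:

val it: Iterator[Int] = List(1, 2, 3).iterator
while (it.hasNext)   // buffer non-empty?
  println(it.next()) // yes: hand back an element
it.hasNext           // now false: iteration halts; nothing left to iterate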
In a simple database, a query would return a batch of all of the query results, populating an Iterator[DBRow]
• This maps nicely and simply onto the Iterator[A] monad, and everyone can chug along happily, not caring whether the I/O is synchronous or asynchronous
• In reality though, forcing a client to buffer all of the results of a large query is tremendously inefficient
  • Do you have enough memory on the client side for the entire result set?
  • With async, we may have many potentially large result sets buffered at once
• Many databases (MongoDB, MySQL, Oracle, etc.) instead use a multi-state result known as a “cursor”
  • Let the server buffer the memory and chunk up the batches
MongoDB queries return an initial batch of results
• If there are more results available on the server, a “Cursor ID” is returned with the batch
• The client can use getMore to fetch additional batches until the server’s results are exhausted
• Once exhausted, getMore will return a batch and a Cursor ID of 0 (indicating “no more results”)
• Try doing this cleanly without blocking...
We effectively have 3 states with a Cursor. The typical solution in a synchronous driver (sketched below):
• hasNext: Boolean
  • “Is the local buffer non-empty?” || “Are there more results on the server?”
• next: A
  • If the local buffer is non-empty, return an item
  • If more on the server, call getMore (smarter code may be “predictive” about this and prefetch ahead of need)
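A sketch of that typical synchronous shape; the names are hypothetical, not Hammersmith’s API. Note how next() can transparently block on a getMore round trip:

class SyncCursor[A](private var buffer: List[A],
                    private var cursorId: Long,
                    getMore: Long => (List[A], Long)) extends Iterator[A] {

  // local buffer non-empty, OR more results remain on the server
  def hasNext: Boolean = buffer.nonEmpty || cursorId != 0

  def next(): A = {
    if (buffer.isEmpty && cursorId != 0) {
      val (batch, nextId) = getMore(cursorId) // BLOCKING round trip to the server
      buffer = batch
      cursorId = nextId // the server hands back 0 once exhausted
    }
    val head = buffer.head
    buffer = buffer.tail
    head
  }
}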
Cursor implementations that block on getMore will put you in the weeds
• Typically, a small pool of threads is reused for multiple incoming reads (in Netty you must NEVER block on a receiver thread)
• Blocking for getMore will block all of the interleaved ops
• I hit this problem with Hammersmith
  • It initially led to heavy drinking
• John de Goes (@jdegoes) and Josh Suereth (@jsuereth) suggested Iteratees as a solution
  • Reading Haskell white papers and Scalaz code made my brain hurt...
  • ... As such, what follows is my interpretation, and any mistakes & stupidity are entirely my own
Instead of just hasNext: Boolean and next: A, with iteratees we can handle any number of states cleanly and asynchronously
• Introduce a higher-order function
  • Pass a function which takes an argument of “iteration state”
  • Return “iteration commands” based on the state
• The code is now asynchronous
  • If the response to “no more on the client, but the server has some” is “go get more”, no blocking I/O is needed
  • Pass a copy of the current function along with the “get more” command, and iteration continues after the buffer is replenished
sealed trait IterState
// “Here’s an entry”, have fun!
case class Entry[T: SerializableBSONObject](doc: T) extends IterState
// Client buffer empty, but more on server
case object Empty extends IterState
// Both client buffer and server are exhausted
case object EOF extends IterState

trait IterCmd
// I’m all done with this cursor - clean it up, shut it down, take out the trash
case object Done extends IterCmd
// Go get me an item to work on ... here’s a function to handle all states
case class Next(op: (IterState) => IterCmd) extends IterCmd
// Call getMore & retrieve another batch - here’s a function to handle all states
case class NextBatch(op: (IterState) => IterCmd) extends IterCmd
var x = 0
conn(integrationTestDBName).find("books")(Document.empty, Document.empty)((cursor: Cursor[Document]) => {
  def next(op: Cursor.IterState): Cursor.IterCmd = op match {
    case Cursor.Entry(doc) => {
      x += 1
      if (x < 100) Cursor.Next(next) else Cursor.Done
    }
    case Cursor.Empty => {
      if (x < 100) Cursor.NextBatch(next) else Cursor.Done
    }
    case Cursor.EOF => {
      Cursor.Done
    }
  }
  Cursor.iterate(cursor)(next)
})

x must eventually(5, 5.seconds)(be_==(100))
NIO.2 / AIO (JDK 7) provides a higher-level async API to NIO without going as high level as Netty does
• It brings in “AsynchronousSocketChannels”
  • Removes the need to select / poll by hand
  • Configurable timeouts
• Two options for how to get responses; both map sanely to Scala:
  • java.util.concurrent.Future
  • CompletionHandler
    • Easily mapped logically to Either[E, T] with implicits (see the sketch below)
• Probably not ‘prime time’ usable for library authors *yet*... due to the dependency on JDK 7
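A sketch (JDK 7 only) of adapting the standard CompletionHandler interface to the Either[E, T] style used earlier:

import java.nio.ByteBuffer
import java.nio.channels.{AsynchronousSocketChannel, CompletionHandler}

def readAsync(ch: AsynchronousSocketChannel, buf: ByteBuffer)
             (callback: Either[Throwable, Integer] => Unit): Unit =
  ch.read(buf, null, new CompletionHandler[Integer, AnyRef] {
    def completed(bytesRead: Integer, att: AnyRef) = callback(Right(bytesRead))
    def failed(cause: Throwable, att: AnyRef)      = callback(Left(cause))
  })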
LinkedIn: http://linkd.in/joinmongo
Download at mongodb.org
github.com/mongodb/casbah
github.com/bwmcadams/hammersmith
These slides will be online later at: http://speakerdeck.com/u/bwmcadams/
brendan@10gen.com (twitter: @rit)