• When a blocking I/O operation occurs, everything in that thread halts and waits; system resources potentially sit idle (a minimal example follows)
• Internally, that “blocking” operation is a big loop per operation asking “Are we there yet?”
• “Idleness” occurs because the thread waiting for I/O to complete is doing nothing while it waits
• This can vastly limit our ability to scale
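To make the first bullet concrete, a minimal sketch in plain Scala over java.net sockets (the host is a placeholder):

import java.net.Socket

val socket = new Socket("example.com", 80) // placeholder host
val in     = socket.getInputStream
val buf    = new Array[Byte](4096)
val n      = in.read(buf) // blocks this whole thread until bytes (or EOF) arrive
// “Are we there yet?” The thread is pinned here, doing no useful work.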
• Eschew blocking individually on each operation
• Find ways to work with the kernel to more efficiently manage multiple blocking resources in groups
• Free up threads which are no longer blocked waiting on I/O, letting them handle other requests while the I/O-bound requests idle
• By no longer blocking, and instead reusing threads while I/O waits, we need a way to handle “completion” events
• Asynchronous techniques (such as callbacks) allow us to achieve this (see the sketch below)
• “Step to the side of the line, and we’ll call you when your order is ready”
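A hedged sketch of the callback idea; AsyncConnection and get are hypothetical names for illustration, not a real API:

trait AsyncConnection {
  // returns immediately; invokes the callback when the reply arrives
  def get(key: String)(onComplete: Either[Throwable, String] => Unit): Unit
}

def lookup(conn: AsyncConnection): Unit =
  conn.get("user:42") {
    case Right(value) => println("Order ready: " + value) // “we’ll call you”
    case Left(err)    => println("Order failed: " + err.getMessage)
  }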
Blocking I/O locks us into some paradigms:
• Servers
  • A 1:1 ratio of threads to client connections (scale limited)
• Clients
  • Connection pools often larger than necessary, to account for connection resources that are unavailable while blocking
Asynchronous I/O can change our destiny:
• Servers
  • A <Many Client Connections>:<One Thread> ratio becomes possible (scale: C10k and beyond)
  • A thread waiting for an I/O response no longer needs to block the connection, allowing other threads to reuse that connection simultaneously
  • Operations such as a “write” can queue up until the resource is ready, returning the thread of execution immediately and calling back with results later
  • Of course, this makes dispatch and good concurrency even tougher (who said doing things right was ever easy?)
• Clients
  • Significantly reduced pool sizes
  • Connection resources can be leveraged by many more simultaneous threads
Java NIO, introduced in Java 1.4:
• Focuses on low-level I/O, as opposed to “Old” I/O’s high-level API
• ByteBuffers: http://www.kdgregory.com/index.php?page=java.byteBuffer
• Channels register with a Selector, and request a window to read and write
• Callback when ready
• Core units of work: Buffer, Channel and Selector
  • Buffers are contiguous memory slots, offering data transfer operations (a short sketch follows below)
  • Channel instances are “bulk” data wrappers around Buffers
  • A Selector is an event monitor which can watch multiple Channel instances in a single thread (the kernel coordinator)
  • This relocates the task of checking I/O status out of the “execution” threads
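A small sketch of Buffer mechanics using nothing beyond the standard java.nio API: fill, flip, drain.

import java.nio.ByteBuffer

val buf = ByteBuffer.allocate(64)  // a contiguous memory slot
buf.put("hello".getBytes("UTF-8")) // writing advances the position
buf.flip()                         // limit = position, position = 0: ready to read
val out = new Array[Byte](buf.remaining)
buf.get(out)                       // drain what was written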
I sketched out a few possible examples of NIO here
• Conclusion: for time & sanity, omit a code sample
• Here’s what you need to know:
  • Register “interests” with a Selector (read, write, etc.)
  • Write a Selector loop which checks for notification events
  • Dispatch incoming events (such as “read”)
  • Want to write? Tell the Selector you want to write
  • Eventually, when the Channel is available to write, an event will notify the Selector
  • Remove the “interested in writing” status
  • Dispatch “time to write” to the original caller
  • Write
  • Rinse, repeat (a minimal loop is sketched below anyway)
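Though the slides omit a sample, here is a minimal sketch of that loop over the standard java.nio.channels API (error handling elided; the port is arbitrary):

import java.net.InetSocketAddress
import java.nio.channels.{SelectionKey, Selector, ServerSocketChannel}

val selector = Selector.open()
val server   = ServerSocketChannel.open()
server.configureBlocking(false)
server.socket().bind(new InetSocketAddress(9000)) // arbitrary port
server.register(selector, SelectionKey.OP_ACCEPT) // register our “interest”

while (selector.select() >= 0) {       // wait for notification events
  val keys = selector.selectedKeys.iterator
  while (keys.hasNext) {
    val key = keys.next(); keys.remove()
    if (key.isAcceptable)    { /* accept, register OP_READ on the new channel */ }
    else if (key.isReadable) { /* dispatch the “read” event */ }
    else if (key.isWritable) {
      // write, then remove the “interested in writing” status
      key.interestOps(key.interestOps & ~SelectionKey.OP_WRITE)
    }
  }
}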
Netty supports multiple transports (nio, oio)
• Really, it’s just wrapping NIO to provide a higher-level abstraction
• Hides the Selector nonsense away
• A composable “Filter Pipeline” allows you to intercept multiple levels of input and output
• ChannelBuffers
  • Can composite multiple ChannelBuffers (a sketch follows below)
  • Organize individual pieces in one composite buffer
  • Support ByteBuffer, arrays, etc.
  • Avoid memory copies as much as possible
  • Direct memory is allocated (rumors of memory leaks abound)
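A sketch of compositing with Netty 3.x’s ChannelBuffers helper: wrappedBuffer builds a composite view over the pieces without copying them (copiedBuffer, by contrast, would allocate and copy).

import org.jboss.netty.buffer.ChannelBuffers

val header  = ChannelBuffers.wrappedBuffer(Array[Byte](0x01, 0x02))
val payload = ChannelBuffers.wrappedBuffer("hello".getBytes("UTF-8"))
// one logical buffer over both pieces; no memory copy involved
val message = ChannelBuffers.wrappedBuffer(header, payload)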
A MacGuffin: “A plot element that catches the viewers’ attention or drives the plot of a work of fiction” (sometimes “maguffin” or “McGuffin” as well)
• We aren’t writing fiction here, but we can use a MacGuffin to drive our story
• For our MacGuffin, we’ll examine asynchronous networking against a MongoDB server
• This is a project I’ve already spent a good bit of time on: Hammersmith
• A few focal points to lead our discussion:
  • Decoding & dispatching inbound messages
  • Handling errors & exceptions across threads, time, and space
  • “Follow up” operations which rely on server-side “same connection” context
  • Working with multi-state iterations which depend on I/O for operations... a.k.a. “database cursors”
• Network layers don’t know or care about your fancy application-layer protocol
• The kernel reads things off the network into a big buffer of bytes
• It’s up to us to figure out which parts of the bytes are relevant where
• Challenge: separate out individual writes and send them to the right place
• Conceptually, the “Law of Demeter” (loose coupling) helps here; doing it by hand, you have to be careful not to eat somebody else’s lunch
• NIO leaves you on your own
• Netty’s pipeline provides the “don’t eat my lunch” fix quite well, IMHO
class ReplyMessageDecoder extends LengthFieldBasedFrameDecoder(1024 * 1024 * 4, 0, 4, -4, 0) with Logging {

  protected override def decode(ctx: ChannelHandlerContext, channel: Channel, buffer: ChannelBuffer): AnyRef = {
    val frame = super.decode(ctx, channel, buffer).asInstanceOf[ChannelBuffer]
    if (frame == null) {
      // don't have the whole message yet; netty will retry later
      null
    } else {
      // we have one message (and nothing else) in the "frame" buffer
      MongoMessage.unapply(new ChannelBufferInputStream(frame)) match {
        case reply: ReplyMessage => reply
        case default =>
          // this should not happen
          throw new Exception("Unknown message type '%s' incoming from MongoDB; ignoring.".format(default))
      }
    }
  }
  // Because we return a new object rather than a buffer from decode(),
  // we can use slice() here according to the docs (the slice won't escape
  // the decode() method so it will be valid while we're using it)
  protected override def extractFrame(buffer: ChannelBuffer, index: Int, length: Int): ChannelBuffer = {
    buffer.slice(index, length)
  }
}
override def messageReceived(ctx: ChannelHandlerContext, e: MessageEvent) {
  val message = e.getMessage.asInstanceOf[MongoMessage]
  log.debug("Incoming Message received type %s", message.getClass.getName)
  message match {
    case reply: ReplyMessage => {
      // ...
• The asynchronous, callback-driven nature makes dispatching errors difficult: the “throw / catch” model becomes complicated
• Scala has a great construct to help with this: Either[L, R]
  • Pass a monad that can have one of two states: failure or success
  • By convention, “Left” is an error, “Right” is success
  • Node does a similar passing of error vs. result
  • There’s no special handling difference between Netty and NIO here
  • Implicit tricks for the lazy who want to just write a “success” block
sealed trait RequestFuture {
  type T
  val body: Either[Throwable, T] => Unit

  def apply(error: Throwable) = body(Left(error))
  def apply[A <% T](result: A) = body(Right(result.asInstanceOf[T]))

  protected[futures] var completed = false
}

/**
 * Will pass any *generated* _id along with any relevant getLastError information.
 * For an update, don't expect to get an ObjectId.
 */
trait WriteRequestFuture extends RequestFuture {
  type T <: (Option[AnyRef] /* ID Type */, WriteResult)
}

implicit def asWriteOp(f: Either[Throwable, (Option[AnyRef], WriteResult)] => Unit) =
  RequestFutures.write(f)
val handler = RequestFutures.write((result: Either[Throwable, (Option[AnyRef], WriteResult)]) => {
  result match {
    case Right((oid, wr)) => {
      ok = Some(true)
      id = oid
    }
    case Left(t) => {
      ok = Some(false)
      log.error(t, "Command Failed.")
    }
  }
})

mongo.insert(Document("foo" -> "bar", "bar" -> "baz"))(handler)
Databases like MySQL, MongoDB, etc. have contextual operations as “follow ups” to a write, which can only be called on the same connection as the write
• MySQL has last_insert_id() to fetch a generated autoincrement value
• MongoDB has getLastError() to check the success/failure of a write (and to explicitly specify consistency requirements)
• Somewhat easy in a synchronous framework:
  • Lock the connection out of the pool and keep it private
  • Don’t let anyone else touch it until you’re done
• In async, it’s harder
  • The only solution I’ve found is “ballot box stuffing” (sketched below)
  • A deliberate reversal of the “decoding problem”
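A hedged sketch of the “ballot box stuffing” idea; the byte arrays stand in for Hammersmith’s real message encoding, which isn’t shown here. The write and its getLastError follow-up are combined into one buffer and handed to the channel as a single write, so no other operation can interleave between them on that connection.

import org.jboss.netty.buffer.ChannelBuffers
import org.jboss.netty.channel.Channel

// encodedInsert / encodedGetLastError are pre-encoded wire messages
// (hypothetical names); the trick is the single, combined write
def stuffedWrite(channel: Channel,
                 encodedInsert: Array[Byte],
                 encodedGetLastError: Array[Byte]): Unit = {
  val combined = ChannelBuffers.wrappedBuffer(encodedInsert, encodedGetLastError)
  channel.write(combined) // both ops hit the server back-to-back on one connection
}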
With traditional iteration, we are working with a dual-state monad. There are two primary calls on an Iterator[A]:
• hasNext: Boolean
• next(): A
• In its pure and simple form, the Iterator[A] is prepopulated with all of its elements (a trivial example follows)
• If the buffer is non-empty, hasNext == true and next() returns another element
• When the buffer is empty, iteration halts completely because there’s nothing left to iterate: hasNext == false, and next() returns null, throws an exception, or similar
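The trivial, prepopulated case in plain Scala:

val it: Iterator[Int] = List(1, 2, 3).iterator
while (it.hasNext)   // buffer non-empty?
  println(it.next()) // yes: hand back an element
it.hasNext           // now false: iteration halts; nothing left to iterate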
In a simple database, a query would return a batch of all of the query results, populating an Iterator[DBRow]
• This maps nicely and simply onto the Iterator[A] monad, and everyone can chug along happily, not caring whether the I/O is synchronous or asynchronous
• In reality though, forcing a client to buffer all of the results of a large query is tremendously inefficient
  • Do you have enough memory on the client side for the entire result set?
  • With async, we may have many potentially large result sets buffered at once
• Many databases (MongoDB, MySQL, Oracle, etc.) instead use a multi-state result known as a “cursor”
  • Let the server buffer the memory and chunk up the batches
MongoDB queries return an initial batch of results
• If there are more results available on the server, a “Cursor ID” is returned with the batch
• The client can use getMore to fetch additional batches until the server’s results are exhausted
• Once exhausted, getMore will return a batch and a Cursor ID of 0 (indicating “no more results”)
• Try doing this cleanly without blocking...
We effectively have 3 states with a Cursor. The typical solution in a synchronous driver (sketched below):
• hasNext: Boolean
  • “Is the local buffer non-empty?” || “Are there more results on the server?”
• next: A
  • If the local buffer is non-empty, return an item
  • If more on the server, call getMore (smarter code may be “predictive” about this and prefetch ahead of need)
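A sketch of that typical synchronous shape; the names are hypothetical, not Hammersmith’s API. Note how next() can transparently block on a getMore round trip:

class SyncCursor[A](private var buffer: List[A],
                    private var cursorId: Long,
                    getMore: Long => (List[A], Long)) extends Iterator[A] {

  // local buffer non-empty, OR more results remain on the server
  def hasNext: Boolean = buffer.nonEmpty || cursorId != 0

  def next(): A = {
    if (buffer.isEmpty && cursorId != 0) {
      val (batch, nextId) = getMore(cursorId) // BLOCKING round trip to the server
      buffer = batch
      cursorId = nextId // the server hands back 0 once exhausted
    }
    val head = buffer.head
    buffer = buffer.tail
    head
  }
}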
Cursor implementations that block on getMore will put you in the weeds
• Typically, a small pool of threads is reused for multiple incoming reads (in Netty you must NEVER block on a receiver thread)
• Blocking for getMore will block all of the interleaved ops
• I hit this problem with Hammersmith
  • It initially led to heavy drinking
• John de Goes (@jdegoes) and Josh Suereth (@jsuereth) suggested Iteratees as a solution
  • Reading Haskell white papers and Scalaz code made my brain hurt...
  • ... As such, what follows is my interpretation, and any mistakes & stupidity are entirely my own
Instead of just hasNext: Boolean and next: A, with iteratees we can handle any number of states cleanly and asynchronously
• Introduce a higher-order function
  • Pass a function which takes an argument of “iteration state”
  • Return “iteration commands” based on the state
• The code is now asynchronous
  • If the response to “no more on the client, but the server has some” is “go get more”, no blocking I/O is needed
  • Pass a copy of the current function along with the “get more” command, and iteration continues after the buffer is replenished
sealed trait IterState
// “Here’s an entry”, have fun!
case class Entry[T: SerializableBSONObject](doc: T) extends IterState
// Client buffer empty, but more on server
case object Empty extends IterState
// Both client buffer and server are exhausted
case object EOF extends IterState

trait IterCmd
// I’m all done with this cursor - clean it up, shut it down, take out the trash
case object Done extends IterCmd
// Go get me an item to work on ... here’s a function to handle all states
case class Next(op: (IterState) => IterCmd) extends IterCmd
// Call getMore & retrieve another batch - here’s a function to handle all states
case class NextBatch(op: (IterState) => IterCmd) extends IterCmd
var x = 0
conn(integrationTestDBName).find("books")(Document.empty, Document.empty)((cursor: Cursor[Document]) => {
  def next(op: Cursor.IterState): Cursor.IterCmd = op match {
    case Cursor.Entry(doc) => {
      x += 1
      if (x < 100) Cursor.Next(next) else Cursor.Done
    }
    case Cursor.Empty => {
      if (x < 100) Cursor.NextBatch(next) else Cursor.Done
    }
    case Cursor.EOF => {
      Cursor.Done
    }
  }
  Cursor.iterate(cursor)(next)
})

x must eventually(5, 5.seconds)(be_==(100))
NIO.2 / AIO (JDK 7) provides a higher-level async API to NIO without going as high level as Netty does
• It brings in “AsynchronousSocketChannels”
  • Removes the need to select / poll by hand
  • Configurable timeouts
• Two options for how to get responses; both map sanely to Scala:
  • java.util.concurrent.Future
  • CompletionHandler
    • Easily mapped logically to Either[E, T] with implicits (see the sketch below)
• Probably not ‘prime time’ usable for library authors *yet*... due to the dependency on JDK 7
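A sketch (JDK 7 only) of adapting the standard CompletionHandler interface to the Either[E, T] style used earlier:

import java.nio.ByteBuffer
import java.nio.channels.{AsynchronousSocketChannel, CompletionHandler}

def readAsync(ch: AsynchronousSocketChannel, buf: ByteBuffer)
             (callback: Either[Throwable, Integer] => Unit): Unit =
  ch.read(buf, null, new CompletionHandler[Integer, AnyRef] {
    def completed(bytesRead: Integer, att: AnyRef) = callback(Right(bytesRead))
    def failed(cause: Throwable, att: AnyRef)      = callback(Left(cause))
  })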
LinkedIn: http://linkd.in/joinmongo
Download at mongodb.org
github.com/mongodb/casbah
github.com/bwmcadams/hammersmith
These slides will be online later at: http://speakerdeck.com/u/bwmcadams/
brendan@10gen.com (twitter: @rit)