Wednesday, September 10, 2014

Understanding Asynchronous I/O in Play Framework

The problem: Many web-based applications invoke external web services. Network I/O involves a significant amount of waiting. The waiting comes from two sources:

  • Service latency: The time it takes for the external service to complete the task.
  • Network latency: A thread doing a read may have to wait for data to become available on the network socket. A thread doing a write sometimes needs to wait when the write buffer is full and can no longer accept new data.

In a traditional web application each HTTP request is handled by an application thread. For example, if we have 50 concurrent requests we will have 50 threads created to process those requests. If the application code makes web service calls, many of these threads will simply be waiting for the I/O to finish. To serve more concurrent requests we will need to create more threads. Having a large number of threads increases memory overhead.

The asynchronous model

In this model we offload all the network I/O to a very small number of worker threads. These threads use the select system call (or equivalent calls such as epoll and kqueue) to efficiently wait for a large number of sockets to become readable or writable. Application threads are then used only to run application code. For example, let’s say that you have 50 concurrent requests, 40 of which are making web service calls and 10 of which are executing application code. The 40 web service calls can be handled by a small number of worker threads, say about 2. The 10 requests that need to run actual application logic will need 10 application threads. This way, we are serving 50 HTTP requests with 12 threads. This model scales very well. Again, assuming your application makes a lot of web service calls, you can support a very large number of HTTP requests using a small number of threads.

Java NIO wraps the select (or epoll or kqueue) call in the java.nio.channels.Selector class. If you are not familiar with how these system calls work, I highly recommend reading about them. The real magic of any asynchronous network I/O comes from these functions.
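
To make the selector idea concrete, here is a minimal Java sketch in which a single thread waits on 40 channels at once. It uses in-process java.nio.channels.Pipe objects as stand-ins for network sockets so that it runs without a server. The class name SelectorSketch and the method readAllWithOneThread are made up for illustration, and a real HTTP client layers connection handling and response parsing on top of a loop like this.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SelectorSketch {

    // Registers `channelCount` pipes with one Selector and returns how many
    // messages the calling thread received. Pipes stand in for sockets here.
    static int readAllWithOneThread(int channelCount) throws IOException {
        try (Selector selector = Selector.open()) {
            List<Pipe> pipes = new ArrayList<>();
            for (int i = 0; i < channelCount; i++) {
                Pipe pipe = Pipe.open();
                // A channel must be non-blocking before it can be registered.
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                pipes.add(pipe);
            }

            // Simulate responses arriving: one small message per pipe.
            for (int i = 0; i < channelCount; i++) {
                byte[] msg = ("response " + i).getBytes(StandardCharsets.UTF_8);
                pipes.get(i).sink().write(ByteBuffer.wrap(msg));
            }

            int received = 0;
            ByteBuffer buffer = ByteBuffer.allocate(64);
            while (received < channelCount) {
                selector.select(); // blocks until at least one channel is readable
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isReadable()) {
                        buffer.clear();
                        Pipe.SourceChannel source = (Pipe.SourceChannel) key.channel();
                        if (source.read(buffer) > 0) {
                            received++;
                            key.cancel(); // this sketch expects one message per pipe
                        }
                    }
                }
            }

            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
            return received;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Messages read by one thread: " + readAllWithOneThread(40));
    }
}
```

Running main should print "Messages read by one thread: 40". The point is that the thread blocks in one place only, the select() call, no matter how many channels it is tracking.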

How do I do asynchronous I/O?

For the asynchronous model to work, it is essential that you make web service calls the right way. For example, if you make HTTP calls the traditional way, through java.net.URL, the calling thread blocks and you will always get the synchronous model. The key is to do these things when an HTTP call is made:
  1. Ask a worker thread to add the HTTP operation to the collection of HTTP calls that it is already tracking.
  2. Return the application thread back to the pool of available threads.
  3. The worker thread will send the request and wait for response data to be available. It does so for all the sockets that it is tracking. As response data begins to trickle in it will parse the data and detect when the response is completed. It will then invoke the completion callback of the application in an application thread.

Doing all of this correctly is tricky, so you will almost always rely on a well-known library. In Play, the play.api.libs.ws.WS class does this. In Java, you can use a library such as AsyncHttpClient.
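
The three steps above can be sketched in plain Java with CompletableFuture. This is a simulation under stated assumptions, not how Play's WS or AsyncHttpClient is implemented: fakeGet, IO_WORKER, APP_POOL and the class name are hypothetical, and a scheduled 100 ms delay stands in for the real network wait that a selector thread would handle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class AsyncCallSketch {

    // Builds daemon threads with readable names so the output shows who ran what.
    static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return task -> {
            Thread t = new Thread(task, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
    }

    // One simulated I/O worker tracks every pending call (hypothetical stand-in).
    static final ScheduledExecutorService IO_WORKER =
            Executors.newSingleThreadScheduledExecutor(named("io-worker"));
    // Application threads only run callbacks, never wait on I/O.
    static final ExecutorService APP_POOL =
            Executors.newFixedThreadPool(2, named("app"));

    // Steps 1 and 2: hand the call to the worker and return immediately.
    static CompletableFuture<String> fakeGet(String url) {
        CompletableFuture<String> response = new CompletableFuture<>();
        // The delay simulates latency; no application thread is blocked meanwhile.
        IO_WORKER.schedule(() -> {
            response.complete("200 OK from " + url);
        }, 100, TimeUnit.MILLISECONDS);
        return response;
    }

    // Step 3: each completion callback runs on an application thread.
    static CompletableFuture<List<String>> callAll(List<String> urls) {
        List<CompletableFuture<String>> pending = urls.stream()
                .map(url -> fakeGet(url).thenApplyAsync(
                        body -> body + " [callback on " + Thread.currentThread().getName() + "]",
                        APP_POOL))
                .collect(Collectors.toList());
        // Combine the results without blocking, much like Future.sequence in Scala.
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
                .thenApply(done -> pending.stream()
                        .map(CompletableFuture::join) // already complete here
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println("Calling from " + Thread.currentThread().getName());
        callAll(Arrays.asList("svc1", "svc2", "svc3"))
                .join() // block only so the demo can print before the JVM exits
                .forEach(System.out::println);
    }
}
```

The calling thread is free as soon as callAll returns. All three simulated calls share the single io-worker thread, and the callbacks land on whichever app thread happens to be available.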

Watching Play in action

To observe how async I/O works in Play, we use this bit of controller code.
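// Imports assume Play 2.3; adjust them for your Play version.
import play.api.Play.current
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import play.api.libs.ws.WS
import play.api.mvc._
import scala.concurrent.Future

object Application extends Controller {
  def index = Action.async {
    // Completion callback: prints the thread that handles each response
    def onResult(r: Any): Unit = {
      println("Response completed. " + Thread.currentThread().getName)
    }

    // Call web services in parallel
    println("Sending out requests to WS. " + Thread.currentThread().getName)
    val svc1 = WS.url("http://example.com/svc1").get().map(onResult)
    val svc2 = WS.url("http://example.com/svc2").get().map(onResult)
    val svc3 = WS.url("http://example.com/svc3").get().map(onResult)

    // Complete the HTTP response once all web service calls have finished
    Future.sequence(Seq(svc1, svc2, svc3)).map { _ => Ok("We are done") }
  }
}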

Basically, we print out the name of the thread before making the web service calls and after the calls complete. The console output may look something like this:
Sending out requests to WS. play-akka.actor.default-dispatcher-6
Response completed. play-akka.actor.default-dispatcher-4
Response completed. play-akka.actor.default-dispatcher-6
Response completed. play-akka.actor.default-dispatcher-4

In Play, a dispatcher thread is what I called an application thread earlier. Dispatcher threads run the actual application logic. After each web service call completed, the system invoked the completion callback (onResult) on one of the available dispatcher threads.

To get a better idea of what the threads are doing we need a JVM monitoring tool. I will use jvisualvm, which comes with the JDK.

Here are some of the threads running in a freshly started JVM. I ran my test controller a few times using curl.


Note that a small number of network I/O worker threads have been created. An even smaller number of application (dispatcher) threads have been created.
What will happen if I hammer the controller concurrently from 50 users? If all goes well, the number of worker threads should remain more or less the same, while the dispatcher thread count will grow.

Let’s go!

ab -n 50 -c 50 http://localhost:9000/

I immediately saw 50 “Sending out requests to WS” messages printed on the console. Not all requests arrived at exactly the same time. Some of the dispatcher threads had already made the call to WS.url().get() and hence became free, so they were reused to serve another HTTP request. Below is a sample of the console output, where we can see dispatcher thread #19 being reused a lot.
Sending out requests to WS. play-akka.actor.default-dispatcher-18
Sending out requests to WS. play-akka.actor.default-dispatcher-19
Sending out requests to WS. play-akka.actor.default-dispatcher-14
Sending out requests to WS. play-akka.actor.default-dispatcher-19
Sending out requests to WS. play-akka.actor.default-dispatcher-19
Sending out requests to WS. play-akka.actor.default-dispatcher-14

According to jvisualvm, the number of dispatcher threads grew to about 12. The important thing to note here is that the number is not 50. In a traditional thread-per-request application this number would be 50.

Very quickly, all 50 requests will issue a total of 150 HTTP web service calls. According to jvisualvm, the number of I/O worker threads was only 8. (Note that these threads are used not only for the external web service calls; the Play web server also uses them to accept requests from the browser.) Play limits the maximum number of these worker threads. Even with, say, 8 threads the server can handle many thousands of connections (both incoming to the server and outgoing to the web services).

In summary, I saw about 20 threads used to handle 50 concurrent requests.
If you take a thread dump at any point, you should see the worker threads calling select.

"New I/O worker #12" daemon prio=5 tid=0x00007fc6ec383800 nid=0x7413 runnable [0x000000011a742000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked (a sun.nio.ch.Util$2)
- locked (a java.util.Collections$UnmodifiableSet)
- locked (a sun.nio.ch.KQueueSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:68)

The dispatcher threads will either be idle or running application code. Below is an idle thread.
"play-akka.actor.default-dispatcher-14" prio=5 tid=0x00007fc6eef8e000 nid=0x8b0f waiting on condition [0x000000011b49c000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for (a akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool)

Final word

Bottom line: asynchronous I/O allows you to serve a large number of HTTP requests with a small number of threads. This is true for applications that make a lot of external web service calls. It is essential that you use the proper method to make web service calls. In Play, use the WS class. Finally, always verify that you are indeed benefiting from the asynchronous model. Run a stress test and observe the threads like we did here. The total number of I/O worker and dispatcher threads should be well below the number of active HTTP requests.
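
If you would rather check thread counts from code than from jvisualvm, a small helper along these lines can group live threads by pool name. This is only a sketch: ThreadCensus, poolName and countByPool are hypothetical names, the name-stripping rule is an assumption based on the thread names shown in this post, and the helper only sees threads in the JVM that calls it, so it would have to be invoked from inside the application under test.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadCensus {

    // Strips a trailing counter such as "-14" or " #12" so that threads
    // belonging to the same pool are grouped under one name.
    static String poolName(String threadName) {
        return threadName.replaceAll("[-# ]*\\d+$", "");
    }

    // Counts the live threads in this JVM, grouped by pool name.
    static Map<String, Integer> countByPool() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(poolName(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByPool().forEach((pool, n) -> System.out.println(n + "\t" + pool));
    }
}
```

With the thread names from this post, the dispatcher threads would be grouped under "play-akka.actor.default-dispatcher" and the Netty workers under "New I/O worker". Comparing those two counts with the number of in-flight requests during a stress test gives the same check described above.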

Friday, September 5, 2014

Google Chrome, Mac OS X and Self-Signed SSL Certificates

Let's say you have a server with a self-signed SSL certificate for HTTPS. Every time you hit a page, you get a nasty error message. You ignore it once and it's fine for that browsing session. But when you restart, it's back. Unlike Firefox, there's no easy way to say "yes, I know what I'm doing, ignore this." This is an oversight I wish Chromium would correct, but until they do, we have to hack our way around it.

Caveat: these instructions are written for Mac OS X. PC instructions will be slightly different, as PCs don't have a keychain, and Google Chrome (unlike Firefox) uses the system keychain.

So here's how to get Google Chrome to play nicely with your self-signed SSL certificate:

  1. In the address bar, click the little lock with the X. This will bring up a small information screen. Click the button that says "Certificate Information." 
  2. Click and drag the image to your desktop. It looks like a little certificate. 
  3. Double-click it. This will bring up the Keychain Access utility. Enter your password to unlock it.
  4. Be sure you add the certificate to the System keychain, not the login keychain. Click "Always Trust," even though this doesn't seem to do anything.
  5. After it has been added, double-click it. You may have to authenticate again.
  6. Expand the "Trust" section.
  7. Set "When using this certificate" to "Always Trust".



Or, from the command line, run the following command, where site.crt is your certificate file:
sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain site.crt


That's it! Close Keychain Access and restart Chrome, and your self-signed certificate should be recognized now by the browser.

This is one thing I hope Google/Chromium fixes soon, as it should not be this difficult. Self-signed SSL certificates are used *a lot* in the business world, and someone who knows what they are doing should have an easier way to ignore this error than copying certificates around and manually adding them to the system keychain.