Monday, 25 June 2018

You don't necessarily want "async all the way up"!

Introduction

.Net introduced async methods in version 4.5 of the framework, providing some really easy-to-use functionality for writing asynchronous methods without the pain of managing threads, semaphores etc.

Unfortunately, it is often poorly understood: people either think of it as easy "multi-threading" and overuse it, or they avoid it because they think it is not thread-safe. Neither of these is correct. If understood properly, it can reduce latency in your app. Used incorrectly, it will make your app even slower and just move a bottleneck somewhere else.

What is Asynchronous?

Asynchronous execution means that the thread you are calling something from is not blocked while waiting for the operation to take place. In general terms, the work is carried out somewhere else (another thread, or the operating system and hardware) and your calling thread is kept free to continue. There are only two ways this will make your application better! Firstly, if the wait is external to your application e.g. a file or network call, the wait can be tied to an external event (usually from the kernel), so no thread in your application is taking up resources while it is waiting. Secondly, if you have a UI-based model, like Windows applications (and to some extent web applications in .Net), you allow the UI to remain responsive while something else is happening in the background. Only the UI thread is allowed to update the UI, so if it is blocked waiting for something else, the application appears to hang.

It should be obvious that if you start another thread but then park the calling thread waiting for the second thread, you are getting no benefit overall.
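To make that point concrete, here is a minimal sketch. SlowWork and MillisecondsBlocked are names I have made up for illustration, and Thread.Sleep stands in for whatever the slow operation really is. It times how long the caller is stuck when it runs the work directly and when it hands the work to another thread and then parks itself waiting for it:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Hypothetical stand-in for a slow operation.
    public static int SlowWork()
    {
        Thread.Sleep(200);
        return 42;
    }

    // Measures how long the calling thread is held up by the given call.
    public static long MillisecondsBlocked(Func<int> call)
    {
        var timer = Stopwatch.StartNew();
        call();
        return timer.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Run the work on the calling thread.
        long direct = MillisecondsBlocked(() => SlowWork());

        // Run the work on a pool thread, but block the caller until it is done.
        long viaThread = MillisecondsBlocked(() => Task.Run(() => SlowWork()).Result);

        Console.WriteLine($"direct: {direct} ms, via another thread: {viaThread} ms");
    }
}
```

Both figures should come out at roughly the length of the sleep: the second version uses an extra thread and the caller still waits just as long.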

History

Back in the early versions of .Net (1.x), asynchronous support was added in the form of the Asynchronous Programming Model (APM), which involved calling a BeginXX method and passing a callback method to be executed when the operation finished. The callback was passed an IAsyncResult, which it handed to the matching EndXX method to obtain the result.

This was only really a thin layer over the basic threading libraries and did not lead to readable code because you ended up with lots of callback methods (callback hell!) that then had to somehow keep the main operation moving as callbacks were fired, especially since they could come back in different orders.

In .Net 2, this was improved with a more event-driven approach called the Event-based Asynchronous Pattern (EAP), where you subscribed to various events and then called a method to start the async operation. The events would signal back to the caller that something had finished/happened/errored. Although it was slightly cleaner and perhaps more generic, it still didn't really solve the problem of orchestrating the events and dealing with the sequencing of async tasks.

It should also be noted that in both of these scenarios, writing your own async code was also not the neatest thing in the world, although there were some helper classes you could extend to get some functionality. One of the worst ways of writing code is to have more supporting code than main code and these were both a little guilty of that approach.

Async in .Net 4.5

This brings us to the latest and greatest model: the Task-based Asynchronous Pattern (TAP), which finally makes async a first-class citizen and massively simplifies both calling and implementing async code.

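The basic style uses two new built-in keywords, await and async, and the type System.Threading.Tasks.Task (which actually arrived slightly earlier, in .Net 4.0). A method might look like this, where ApiResponse is just a placeholder for whatever the call returns:

public async Task<ApiResponse> CallAPIAsync(ApiRequest request) {...}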

Firstly, a few basics. If the method is marked async, it MUST return a Task, a Task<> if there is a return value, or void (which is best kept for event handlers since the caller gets nothing to await). You do NOT have to mark the method async to return a Task, more on that later. Also, ending the method name with the word Async is not required but has been a convention, although that might change if everything starts becoming async.

If you want to call this method, you have a few options. None of them depend on the CallAPIAsync method being marked "async" (the caller doesn't care); they are all available simply because it returns a Task<>.

CallAPIAsync(req);     // Will not block but not recommended since Task is lost
var myTask = CallAPIAsync(req1);    // Will not block
var returnValue = await CallAPIAsync(req2);    // Will 'block'
var returnValue2 = await myTask;     // Will 'block'
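
To see the difference between holding the Task and awaiting it, here is a small sketch. CallApiAsync below is my own stand-in that just waits 100 ms with Task.Delay and echoes a string; it is not a real API. The task is started on one line and only awaited later, and IsCompleted shows that the call had not finished when control came back to us:

```csharp
using System;
using System.Threading.Tasks;

public static class CallingDemo
{
    // Hypothetical stand-in for CallAPIAsync: waits 100 ms, then echoes a value.
    public static async Task<string> CallApiAsync(string request)
    {
        await Task.Delay(100);
        return "response to " + request;
    }

    public static async Task<string> RunAsync()
    {
        Task<string> myTask = CallApiAsync("req1");  // starts the call and returns without waiting
        bool finishedAlready = myTask.IsCompleted;    // checked straight away, before the delay has elapsed
        string value = await myTask;                  // this method pauses here until the task completes
        return $"{finishedAlready} / {value}";
    }

    public static void Main()
    {
        // Main is synchronous here, so it blocks on the final result.
        Console.WriteLine(RunAsync().GetAwaiter().GetResult());
    }
}
```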

The bit that people struggle with is what actually happens when you call an async method and why not all Task methods are marked as async...

Calling an external async method

We will build up our example from the ground. Assume we are writing a method that calls an API written by someone else and that all of its methods are async. Even if they weren't, we wouldn't want to wait synchronously for the operation to complete.

// Response is a placeholder for whatever type the library call returns
public async Task<Response> MyApiMethod(Request request)
{
    return await ApiLibraryCallAsync(request);
}

First things first. We have three options here:

Use await

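If we want to use await, we are basically saying that the call is long-running but we can't do anything until the response comes back. What happens when we call await? Under the covers, the compiler rewrites the method into a state machine. If the task has not finished when we reach the await, the method returns to its caller at that point and the calling thread is released (NOT BLOCKED!). For a genuine I/O call, no thread is tied up while the operation is in flight. When it completes, the rest of the method is scheduled to run as a continuation: back on the UI thread in a desktop app, or typically on a thread pool thread otherwise. The calling thread cannot continue with its current request since that is awaiting, but it can go and pick up another request, button press etc.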

In a desktop app, you do not usually need massive parallelism but you want the UI to be able to update while it is waiting for the response, including possibly handling another button press etc. In web applications, however, this freeing of the calling thread could massively increase the throughput of the application, especially if 1) the latency on the external calls is significant and 2) there are plenty of other, faster requests that could be handled in the meantime. On the other hand, depending on the application, this model could queue up lots of external API calls, in which case you might simply move the bottleneck and not actually help overall.
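
As a rough illustration of the throughput point, the sketch below starts a batch of pretend requests that each await a 200 ms Task.Delay, which stands in for the slow external call. HandleRequestAsync and HandleManyAsync are hypothetical names of mine, and a real web server schedules requests in its own way, so treat this as a toy model only:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ThroughputDemo
{
    // Hypothetical request handler: the external call is simulated with Task.Delay.
    public static async Task<int> HandleRequestAsync(int id)
    {
        await Task.Delay(200);   // no thread is held while this is pending
        return id;
    }

    // Starts 'count' requests at once and returns the total elapsed milliseconds.
    public static async Task<long> HandleManyAsync(int count)
    {
        var timer = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(id => HandleRequestAsync(id)));
        return timer.ElapsedMilliseconds;
    }

    public static void Main()
    {
        long elapsed = HandleManyAsync(100).GetAwaiter().GetResult();
        Console.WriteLine($"100 requests handled in {elapsed} ms");
    }
}
```

Because the waits overlap, the total should be far closer to one delay than to a hundred of them. Whether a real application sees a similar gain depends on whether the thing being called can cope with that many calls at once.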

Return the task

If our method does nothing after it calls the external library, it doesn't have to assume that the call needs to be awaited there and could instead directly return the Task to the caller, allowing the caller (or its caller etc.) to await if it needs to:

public Task<Response> MyApiMethod(Request request)
{
    return ApiLibraryCallAsync(request);
}

This means that we should not mark the method async. If we do, and don't use await anywhere inside the method, Visual Studio will underline the method and warn us that it will run synchronously. Without async, the method still returns a Task<> and the calling code might do something like this:

var task = MyApiMethod(request);
DoSomethingElse();
var response = await task;

If we did this, the calling method would have to be async (because it uses await) but the point here is that MyApiMethod is not making assumptions about blocking. It could await if it had to do something else after the response came back.
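
Here is a self-contained sketch of that pattern, with some assumptions of mine: ApiLibraryCallAsync is simulated with a 300 ms Task.Delay, DoSomethingElse is simulated with a 300 ms sleep, and strings are used in place of the Request and Response types:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ReturnTaskDemo
{
    // Hypothetical library call, simulated with a 300 ms delay.
    public static async Task<string> ApiLibraryCallAsync(string request)
    {
        await Task.Delay(300);
        return "response to " + request;
    }

    // Not marked async: it simply hands the library's task back to the caller.
    public static Task<string> MyApiMethod(string request)
    {
        return ApiLibraryCallAsync(request);
    }

    // Stand-in for useful local work the caller can do in the meantime.
    public static void DoSomethingElse()
    {
        Thread.Sleep(300);
    }

    public static async Task<string> CallerAsync()
    {
        Task<string> task = MyApiMethod("request");  // the API call is now in flight
        DoSomethingElse();                            // runs while the call is pending
        return await task;                            // only now do we need the response
    }

    public static void Main()
    {
        var timer = Stopwatch.StartNew();
        string response = CallerAsync().GetAwaiter().GetResult();
        Console.WriteLine($"{response} after {timer.ElapsedMilliseconds} ms");
    }
}
```

The two 300 ms waits overlap, so the elapsed time printed should be nearer to one of them than to the sum of both.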

Use Task.Run()

There is an overhead to using async (which shouldn't be a surprise). Imagine you update your API one day and all of the methods have changed from Sync to Async. You would have to modify loads of methods: some of them will become async, some will return Tasks, and you might end up with "async all the way up" and rename loads of methods to end with "Async". As well as the code overhead, every method marked async has a hidden but necessary mechanism that runs every time it is called. The compiler generates a state machine for the method and each call creates a Task object; if the call actually has to wait, the state is saved and a continuation is scheduled to resume it later. No new thread is started and each call costs very little on its own, but if used heavily it can add noticeable time to the overall system.

Now, this might be unavoidable but you need to consider at what level the awaiting needs to take place, because you can use Task.Run() to let a synchronous method call an asynchronous one. If you have a call stack of, say, 5 levels from the initial request and each of those is async and each awaits the method it is calling, you create 5 tasks and 5 state machines, plus a chain of continuations, before you have even reached any benefit.

What if the code at the bottom doesn't always call a long-running method? You would still normally have to make the entire stack async so you basically make every request slower to improve the response of 1 request every, say, 100 or 1000. So here are some scenarios where you should not use async methods but instead consider using Task.Run():

  1. You only conditionally call the long-running method, or perhaps only once in every 1000 or more calls. Perhaps it is only called after the cache goes stale or if the app is restarted.
  2. You can never logically continue with the operation until the long-running operation has completed i.e. there is no reason to return the task to the caller. For example, if an encryption method must obtain a key from an external service, the encryption method cannot logically continue until the key is returned. Does it really make sense to signal to the caller that it can continue and get the result later? Imagine what the calling code would look like just to encrypt or decrypt something.
  3. You have no major desire to maximise the number of threads that can handle requests while your code is awaiting something. As long as the desktop UI thread can continue to run, you probably get little benefit from the extra complexity of async (although the performance penalty would also be lower due to fewer requests).

If you think these might be true (and you can always test your code to measure the performance difference), you should call the async method like this instead:

var result = Task.Run(() => MyApiMethod(request)).Result;

This waits for the task and returns the result but, most importantly, because there is no await, the method this code lives in does NOT have to be async. Note that this is a synchronous block, however: the thread in the method calling this will block in the normal way and go to sleep until MyApiMethod has finished. Any exception thrown inside the task reaches you wrapped in an AggregateException. You can call Wait() instead of Result if the method only returns a Task.
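
Below is a sketch of what such a synchronous wrapper might look like. MyApiMethod is again a made-up stand-in using Task.Delay, and CallSynchronously is a hypothetical name. The catch block is one possible way of unwrapping the AggregateException that Result throws on failure, so that callers see the original exception:

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical async API, simulated with a 100 ms delay.
    public static async Task<string> MyApiMethod(string request)
    {
        await Task.Delay(100);
        if (request == null) throw new ArgumentNullException(nameof(request));
        return "response to " + request;
    }

    // Synchronous wrapper: not async, so its callers do not need to be async either.
    // The calling thread blocks here until the task has finished.
    public static string CallSynchronously(string request)
    {
        try
        {
            return Task.Run(() => MyApiMethod(request)).Result;
        }
        catch (AggregateException ex) when (ex.InnerException != null)
        {
            // Result wraps failures in AggregateException; rethrow the original
            // exception while keeping its stack trace.
            ExceptionDispatchInfo.Capture(ex.InnerException).Throw();
            throw; // never reached, but the compiler needs every path to end
        }
    }

    public static void Main()
    {
        Console.WriteLine(CallSynchronously("request"));
    }
}
```

Remember that this trades the async plumbing for a blocked thread, so it only makes sense in the scenarios listed above.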

Conclusion

I realise I have only touched the surface of this subject but you should consider using async as a pattern when:
  1. The UI thread needs to be prevented from blocking by any "long-running" method e.g. over 0.5/1 second, at which point it would be noticeable/unacceptable.
  2. You want to maximise the use of request threads without them being blocked by long-running processes, such as in a web server that makes slow back-end calls, but only when other requests can be served while waiting, as opposed to simply creating a longer queue of long-running requests!
  3. You want to get some easy-to-use parallelism where you might e.g. start a long-running process and be able to do other tasks before waiting for the result.

You should not necessarily use the async pattern when:
  1. Your "long-running" calls are not particularly slow, perhaps anything less than 200 ms, at which point you might move the performance hit elsewhere. Most database calls and all memory-cache/redis type calls generally should NOT be async.
  2. You only use a low-level long-running method occasionally and it does not block the UI thread. Use Task.Run() instead and take the hit. Consider another background update pattern if necessary to avoid hitting the occasional user.
  3. Your only use of calling async methods is at the lowest level and nothing else happens in parallel, in which case Task.Run() will usually provide you the functionality without the trouble of putting async all the way up.

Test, test, test! If I have learned one thing about programming, it is that many things don't work as you assume they will.

I have surely missed many things, so please let me know in the comments if I need to change anything!
