Managing long running requests on ColdFusion

The Problem

We've been working on an application that needed significant integration with an external system, which is done via web services.

During our load testing we came across a major issue with the web services: when the application was under load, the ColdFusion application server crashed very badly.

The problem was caused by the fact that the web service calls typically took about 1 second to complete whereas pages that didn't need to use a web-service completed in about 100ms.

When we put the application under load, if requests that needed web services arrived at a sustained high rate then very soon all the running requests would be doing web service calls. This meant that all 25 Java threads were getting swallowed up by the "long running" web service calls.

This caused all the other requests to queue up and very shortly the application fell over in a heap.

The Solution

Initially we looked at using cflock tags to handle it, but this would essentially serialize all web service requests so that only one web service call would run at any one time. This meant that the application would not be able to handle the required load.

After a bit of a brainstorming session we came up with the idea of developing a semaphore type object which would limit the number of threads that could get tied up with long running web service requests.
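For illustration, a named exclusive lock around the call might look like the sketch below. This is not code from our application: the lock name webservice_call is made up for the example and the web service call is left as a placeholder.

```cfml
<cftry>
   <!--- Hypothetical lock name: every request using it queues behind a single holder --->
   <cflock name="webservice_call" type="exclusive" timeout="1">
      <!--- cfinvoke the web service here: only one request at a time gets this far --->
   </cflock>

   <cfcatch type="lock">
      <!--- The lock was not obtained within 1 second: treat the app as busy --->
   </cfcatch>
</cftry>
```

Every other request sits on its thread for up to the full timeout before giving up, so the lock limits concurrency to one and still ties up threads while they wait.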

Ideally it works like this:

  • A web-service request comes in
  • It requests a thread from the Semaphore object
  • If it gets one it runs the Web Service and then releases the thread

However, if the system is busy it works like this:

  • A web-service request comes in
  • It requests a thread from the Semaphore object
  • It doesn't get a thread so doesn't run the web service and quickly returns and lets the user know the app is busy
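
The two flows above can be sketched as a small counting component. This is a minimal illustration, not the Semaphore object from our library: the argument names, the slots array and the convention that acquireThread returns 0 when nothing is free are assumptions chosen to fit the usage shown in this post.

```cfml
<cfcomponent displayname="Semaphore" output="false">

   <cffunction name="init" access="public" returntype="any" output="false">
      <!--- Argument names are assumptions made for this sketch --->
      <cfargument name="name" type="string" required="true">
      <cfargument name="maxThreads" type="numeric" required="true">
      <cfargument name="logFile" type="string" required="false" default="">
      <cfset variables.name = arguments.name>
      <cfset variables.maxThreads = arguments.maxThreads>
      <cfset variables.logFile = arguments.logFile>
      <!--- slots[i] is true while slot i is in use --->
      <cfset variables.slots = arrayNew(1)>
      <cfset arraySet(variables.slots, 1, arguments.maxThreads, false)>
      <cfreturn this>
   </cffunction>

   <cffunction name="acquireThread" access="public" returntype="numeric" output="false">
      <cfset var i = 0>
      <cfset var slotID = 0>
      <!--- The lock only guards the bookkeeping, so it is held very briefly.
            If it cannot be obtained the body is skipped and 0 (busy) is returned. --->
      <cflock name="semaphore_#variables.name#" type="exclusive" timeout="1" throwontimeout="no">
         <cfloop from="1" to="#variables.maxThreads#" index="i">
            <cfif NOT variables.slots[i]>
               <cfset variables.slots[i] = true>
               <cfset slotID = i>
               <cfbreak>
            </cfif>
         </cfloop>
      </cflock>
      <!--- 0 means no slot was free: the caller should report "busy" --->
      <cfreturn slotID>
   </cffunction>

   <cffunction name="releaseThread" access="public" returntype="void" output="false">
      <cfargument name="slotID" type="numeric" required="true">
      <cflock name="semaphore_#variables.name#" type="exclusive" timeout="5">
         <cfif arguments.slotID GTE 1 AND arguments.slotID LTE variables.maxThreads>
            <cfset variables.slots[arguments.slotID] = false>
         </cfif>
      </cflock>
   </cffunction>

</cfcomponent>
```

Unlike a lock around the web service call itself, the lock here is released before the slow call starts, so up to maxThreads calls can run side by side and everything beyond that is turned away immediately.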

Usage

In order to use the semaphore you need to set it up in a shared scope - typically application or server scope. We use it in our webservice wrapper CFC which is stored in the application scope.

<cfset variables.instance.Semaphore = createObject("component", "Semaphore").init("unique_name", 15,"logfilename")>
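
As a sketch of the shared scope setup, the same call could be made once in Application.cfc so that every request counts against the same limit. The application name and the application.wsSemaphore key are hypothetical, and the example assumes Semaphore.cfc can be resolved from the application root.

```cfml
<cfcomponent output="false">

   <!--- Hypothetical application name --->
   <cfset this.name = "semaphore_demo">

   <cffunction name="onApplicationStart" access="public" returntype="boolean" output="false">
      <!--- One instance for the whole application, created once at startup --->
      <cfset application.wsSemaphore = createObject("component", "Semaphore").init("unique_name", 15, "logfilename")>
      <cfreturn true>
   </cffunction>

</cfcomponent>
```

If each request created its own instance, each would see an empty semaphore and the limit would never apply.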

Here is a code example of how this looks in practice:

<cfset threadID = variables.instance.Semaphore.acquireThread()>
<cfif threadID GT 0>
   <cftry>
      <!--- We handle all errors in here to ensure we release the thread afterwards --->
      <!--- cfinvoke the web service here --->

      <cfcatch type="any">
         <!--- Log errors but continue so we release the thread --->
      </cfcatch>
   </cftry>
   <!--- Make sure we release the thread - even if everything above explodes --->
   <cfset variables.instance.Semaphore.releaseThread(threadID)>
<cfelse>
   <!--- No thread was available: return quickly and let the user know the app is busy --->
</cfif>

Note: One issue we came across during implementation was that when a request was killed by the long running request timeout, releaseThread was never called: the web service code timed out and the request was terminated before it reached the releaseThread line. To work around this we implemented an internal garbage collection mechanism so that we didn't leak threads or, when we did, could recover from it.

This has allowed us to manage the number of threads by limiting the number of long running requests, so as the server gets busy it will now reject web service calls before it runs out of memory.
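
One way such a recovery mechanism could work is to record when each slot was taken and treat any slot held for too long as leaked. The sketch below is an assumption about the approach, not the code from our library: the ExpiringSemaphore name, the maxAgeMs argument and the acquiredAt array are all made up for the example.

```cfml
<cfcomponent displayname="ExpiringSemaphore" output="false">

   <cffunction name="init" access="public" returntype="any" output="false">
      <cfargument name="name" type="string" required="true">
      <cfargument name="maxThreads" type="numeric" required="true">
      <!--- Hypothetical setting: how long a slot may be held before it counts as leaked --->
      <cfargument name="maxAgeMs" type="numeric" required="true">
      <cfset variables.name = arguments.name>
      <cfset variables.maxThreads = arguments.maxThreads>
      <cfset variables.maxAgeMs = arguments.maxAgeMs>
      <!--- acquiredAt[i] is 0 when slot i is free, otherwise the tick count when it was taken --->
      <cfset variables.acquiredAt = arrayNew(1)>
      <cfset arraySet(variables.acquiredAt, 1, arguments.maxThreads, 0)>
      <cfreturn this>
   </cffunction>

   <cffunction name="acquireThread" access="public" returntype="numeric" output="false">
      <cfset var i = 0>
      <cfset var slotID = 0>
      <cfset var nowMs = getTickCount()>
      <cflock name="semaphore_#variables.name#" type="exclusive" timeout="1" throwontimeout="no">
         <cfloop from="1" to="#variables.maxThreads#" index="i">
            <!--- Take a free slot, or reclaim one held longer than maxAgeMs --->
            <cfif variables.acquiredAt[i] EQ 0 OR (nowMs - variables.acquiredAt[i]) GT variables.maxAgeMs>
               <cfset variables.acquiredAt[i] = nowMs>
               <cfset slotID = i>
               <cfbreak>
            </cfif>
         </cfloop>
      </cflock>
      <cfreturn slotID>
   </cffunction>

   <cffunction name="releaseThread" access="public" returntype="void" output="false">
      <cfargument name="slotID" type="numeric" required="true">
      <cflock name="semaphore_#variables.name#" type="exclusive" timeout="5">
         <cfif arguments.slotID GTE 1 AND arguments.slotID LTE variables.maxThreads>
            <cfset variables.acquiredAt[arguments.slotID] = 0>
         </cfif>
      </cflock>
   </cffunction>

</cfcomponent>
```

The maximum age needs to be comfortably longer than the request timeout. Otherwise a slow request that is still running could have its slot reclaimed, and its late releaseThread call would then free a slot that now belongs to another request.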

You can get the full code for the Semaphore object from our Opensource CF Library

Comments
Couldn't you use a named CFLOCK with a small timeout?
# Posted By Tom Chiverton | 6/13/08 12:30 PM
Hi Tom, That's what we thought about originally. But that wouldn't have had the effect we needed.

I'm presuming that you mean a named lock around the webservice call. If we did that we'd only be allowing a single web service request to run at any one time. That might stop our CF server from getting completely locked up (as it would bounce the other requests with the cflock timeout), but it has a number of problems:

1. Only one Webservice can happen at a time (we needed more than this)
2. Even with a 1 second timeout on cflock the threads will wait for up to one second to get a lock before returning. If you have 30 requests coming in per second you will have 25 threads waiting for locks very quickly.

With this solution up to 15 of our 25 threads will be used for web service calls and anything over that will be instantly returned with a busy message - which means the thread gets freed up very quickly.

Cheers,
Mark
# Posted By Mark Lynch | 6/13/08 3:36 PM
Interesting.
If something we're doing takes off, I might have to look into some way of throttling requests.

Another idea might just be to accept all requests straight away, and add them to a work queue that a separate process runs against - right ?
# Posted By Tom Chiverton | 6/13/08 4:01 PM
Hi Tom,

Putting the request into a queue is a very good way to do it - if you don't need the info back straight away. So if it is a really long running request, i.e. more than 10 seconds, then I would definitely suggest putting it off into a queue.

Just remember that if you are handling the queue with a CF scheduled task, it will eat into your available threads.

You are welcome to use the Semaphore code if it suits your needs.

Cheers,
Mark
# Posted By Mark Lynch | 6/14/08 4:03 AM