[coldbox-3.5.0-BE] cachebox on a cluster nlb environment

“Error”,“cfthread-167”,“03/20/12”,“15:18:22”,"CACHEBOXPROVIDER.REAP_B25C2404DFBD401A998024C27363EDDC: A timeout occurred while attempting to lock CacheBoxProvider.reap.1699375701. "
coldfusion.tagext.lang.LockTag$TimeoutException: A timeout occurred while attempting to lock CacheBoxProvider.reap.1699375701.

Windows Server 2003, IIS, ColdFusion 9.0.0

Testing environments are not clustered and so everything is normal in testing and didn’t realize the production servers were clustered and never built to a clustered environment so any informational nuggets would be helpful.

Thanks,
Jade

I don’t believe this is related to clustering.

Can you post your cache config?

Thanks,

Aaron

In the config folder I have a seperate file named CacheBox.cfc which I have always just left alone:

function configure(){
cacheBox = {
defaultCache = {
objectDefaultTimeout = 120
,objectDefaultLastAccessTimeout = 30
,useLastAccessTimeouts = true
,reapFrequency = 2
,freeMemoryPercentageThreshold = 0
,evictionPolicy = “LRU”
,evictCount = 1
,maxObjects = 300
,objectStore = “ConcurrentStore”
,coldboxEnabled = true
},

caches = {
template = {
provider = “coldbox.system.cache.providers.CacheBoxColdBoxProvider”
,properties = {
objectDefaultTimeout = 120
,objectDefaultLastAccessTimeout = 30
,useLastAccessTimeouts = true
,freeMemoryPercentageThreshold = 0
,reapFrequency = 2
,evictionPolicy = “LRU”
,evictCount = 2
,maxObjects = 300
,objectStore = “ConcurrentSoftReferenceStore”
}
}
}
};
}

Thanks for taking a look,
Jade

Also, can you post the stacktrace? It will help determine which object store has this issue.

Either the ConcurrentSoftReferenceStore or ConcurrentStore are having a hard time releasing a lock.

Thanks.

Here is the stacktrace, again, thanks for looking at this. One of the things on my bucket list was to dig in and start understand the nuts & bolts of the caching, however, it’s still on the list and CacheBox has always done a great job of not needing my attention.

“Error”,“cfthread-167”,“03/20/12”,“15:18:22”,"CACHEBOXPROVIDER.REAP_B25C2404DFBD401A998024C27363EDDC: A timeout occurred while attempting to lock CacheBoxProvider.reap.1699375701. "
coldfusion.tagext.lang.LockTag$TimeoutException: A timeout occurred while attempting to lock CacheBoxProvider.reap.1699375701.
at coldfusion.tagext.lang.LockTag.doStartTag(LockTag.java:220)
at cfCacheBoxProvider2ecfc296076663$func_REAP.runFunction(D:\JRun4\servers\93_CFMX2\cfusion.ear\cfusion.war\cf\ISRFW\coldbox\system\cache\providers\CacheBoxProvider.cfc:583)
at coldfusion.runtime.UDFMethod.invoke(UDFMethod.java:472)
at coldfusion.filter.SilentFilter.invoke(SilentFilter.java:47)
at coldfusion.runtime.UDFMethod$ReturnTypeFilter.invoke(UDFMethod.java:405)
at coldfusion.runtime.UDFMethod$ArgumentCollectionFilter.invoke(UDFMethod.java:368)
at coldfusion.filter.FunctionAccessFilter.invoke(FunctionAccessFilter.java:55)
at coldfusion.runtime.UDFMethod.runFilterChain(UDFMethod.java:321)
at coldfusion.runtime.UDFMethod.invoke(UDFMethod.java:220)
at coldfusion.runtime.CfJspPage._invokeUDF(CfJspPage.java:2582)
at coldfusion.runtime.CfJspPage._invoke(CfJspPage.java:2383)
at cfCacheBoxProvider2ecfc296076663$func_CFFUNCCFTHREAD_CFCACHEBOXPROVIDER2ECFC2960766632.runFunction(D:\JRun4\servers\93_CFMX2\cfusion.ear\cfusion.war\cf\ISRFW\coldbox\system\cache\providers\CacheBoxProvider.cfc:559)
at coldfusion.runtime.UDFMethod.invoke(UDFMethod.java:472)
at coldfusion.runtime.UDFMethod$ArgumentCollectionFilter.invoke(UDFMethod.java:368)
at coldfusion.filter.FunctionAccessFilter.invoke(FunctionAccessFilter.java:55)
at coldfusion.runtime.UDFMethod.runFilterChain(UDFMethod.java:321)
at coldfusion.runtime.UDFMethod.invoke(UDFMethod.java:220)
at coldfusion.runtime.UDFMethod.invokeCFThread(UDFMethod.java:201)
at coldfusion.thread.Task.invokeFunction(Task.java:274)
at coldfusion.thread.Task.run(Task.java:140)
at coldfusion.scheduling.ThreadPool.run(ThreadPool.java:201)
at coldfusion.scheduling.WorkerThread.run(WorkerThread.java:71)

And also I just was looking at logs again and noticed this occured several times 2 days ago (on the 20th) but I haven’t seen it again since.

As I have dealt with reinit patches I now understand the value/significance of the deploy interceptor and could this issue be rooted in reinit’ing 1 instance and not getting others reinit’d (blind dart tossing)

Jade

Yea reinit is a hard thing to do as it forces a reinit. Basically any threads waiting or contending or running are abruptly interrupted meaning stuff can go wrong. The deploy interceptor or using application stop on request end is much much safer.

Luis Majano
CEO
Ortus Solutions, Corp
www.ortussolutions.com

ColdBox Platform: http://www.coldbox.org
Linked In: http://www.linkedin.com/pub/3/731/483
Social: twitter.com/ortussolutions | twitter.com/coldbox | twitter.com/lmajano

Reinit is not quite the answer. I believe this is related to locking.

I’ve observed this on my production servers, but, have not had the time to debug. Performing an applicationStop() does fix the issue.

The server admins ended up restarting the instances the day this started so that’s probably why it stopped and I haven’t seen it since.

Thanks.

For me, it seems to start appearing after the application is about four days old.

Just found this thread - getting the same issue. Seems to happen after several days of uptime. Really hard to track down.