ColdBox Cache fails under high load

Hi All,

We have a fairly high-usage CB site (about 3 million requests per day)
running on CB 3.0 M3, with the default settings for CB caching. We
have recently been having weird issues: configuration settings
seemingly vanish, and VAR-scoped INSTANCE variables tied to cached
items die because they reference items that appear to have been
reaped. We kept seeing baffling errors in the log about calling
missing methods on a STRING object, or making array references to a
STRING object. After a while we were able to determine that the model
objects we were fetching and trying to execute methods on were
actually the STRING "_NOTFOUND_".

We did some research today to understand how GetModel() interacts with
the CacheManager.

We noticed that the BeanFactory GetModel() method first checks whether
the model object is in cache, and if so then calls the CacheManager
to return the bean from cache. However, the actual FETCH of the
object with a LOCK from cache occurs about 7 lines into the get() call
in the CacheManager. Our theory is that under heavy load there is a
race condition: the cache evicts the object AFTER the BeanFactory
checks for the object's existence in cache but BEFORE the LOCK
occurs. Has anyone else experienced anything like this?
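
To make the suspected window concrete, here is a minimal sketch of the
check-then-act pattern next to a single-fetch alternative. It is not
ColdBox code: the pool struct, the lock name, and the functions
cacheGet(), racyFetch(), safeFetch() and createModel() are all
hypothetical stand-ins, and the eviction is simulated inline rather
than done by a real reaper thread.

```cfml
<cfscript>
// Hypothetical stand-in for the cache; NOT ColdBox's CacheManager.
variables.NOT_FOUND = "_NOTFOUND_";
variables.pool = {};
variables.pool[ "myModel" ] = createModel( "myModel" );

// Hypothetical factory for a fresh model object.
function createModel( required string key ){
	var model = {};
	model.name = arguments.key;
	return model;
}

// Single locked read: returns the object, or the sentinel on a miss.
function cacheGet( required string key ){
	var result = variables.NOT_FOUND;
	lock name="demoCachePool" type="readonly" timeout="10" {
		if( structKeyExists( variables.pool, arguments.key ) ){
			result = variables.pool[ arguments.key ];
		}
	}
	return result;
}

// Racy: the existence check and the fetch are two separate steps.
function racyFetch( required string key ){
	if( structKeyExists( variables.pool, arguments.key ) ){
		// Simulate the reaper evicting the entry inside the window.
		structDelete( variables.pool, arguments.key );
		return cacheGet( arguments.key ); // hands back the sentinel string
	}
	return createModel( arguments.key );
}

// Safer: fetch once, then decide based on what actually came back.
function safeFetch( required string key ){
	var obj = cacheGet( arguments.key );
	if( isSimpleValue( obj ) AND obj EQ variables.NOT_FOUND ){
		obj = createModel( arguments.key );
		lock name="demoCachePool" type="exclusive" timeout="10" {
			variables.pool[ arguments.key ] = obj;
		}
	}
	return obj;
}

// The racy path returns a simple value (the sentinel), not a model.
writeOutput( isSimpleValue( racyFetch( "myModel" ) ) );
// The single-fetch path always returns a usable object.
writeOutput( isStruct( safeFetch( "myModel" ) ) );
</cfscript>
```

In this sketch, calling a method on the result of racyFetch() would
fail the same way our logs show, because the caller receives a string
instead of the model.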

Thanks!


That cache engine is no longer in development. Please try the latest milestone.

To enable the new CacheBox just download the latest M6 ColdBox and
remove the "CacheEngine" settings from your configuration.

You can further tweak the CacheBox configuration if you need to.

What version of CB are you on?

Simply upgrading to the newest ColdBox isn't necessarily an option.
Our app, as you know, is very large and mission critical. Upgrading
involves many, many hours of regression tests. In addition, having
compared the core code across M3, M5 and M6 (CacheBox), this specific
piece of code is unchanged as best I can tell:

if( getObjectPool().lookup(arguments.objectKey) ){
   // Get Object from cache
   refLocal.tmpObj = getObjectPool().get(arguments.objectKey);

Our problem doesn't seem to be that something is missing in the cache,
exactly. The problem is that at some point the value in the cache
becomes the internal string of "_NOTFOUND_". We strictly use
getModel() to fetch our objects. The beanFactory knows which objects
are cached. However, this spot of code does not actually look at
what's in there. In other places in the Coldbox code, you'll see
something like this:
    if ( isSimpleValue(refLocal.tmpObj) and
         refLocal.tmpObj EQ '_NOTFOUND_' ) ..... do something different.

In our case, the object *is* in the cache, and its value actually is
the string "_NOTFOUND_". We do not manually set that value anywhere.
Our supposition is that at some point ColdBox is storing that back
into the cache, since it is an internal string value built by ColdBox.
I'm not 100% certain, but I have also found other places in the code
where the special "_NOTFOUND_" value is represented by different
string values, such as "NOT_FOUND". I don't know whether that
discrepancy is what causes a "_NOTFOUND_" value to overwrite the
cached value.

Unfortunately, the top snippet of code only asks "Does it exist in
cache?" Neither the lookup() function nor get() actually examines the
value. If lookup() DID check whether the value is a simple value equal
to "_NOTFOUND_", it could return false, causing ColdBox to recreate
the model object, which is what is needed to resolve this overall
issue. I simply don't see that functionality changing with
M6/CacheBox.
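
As a rough sketch of what I mean, the lookup could treat a stored
sentinel as a miss and drop it. This is not ColdBox's implementation:
the pool struct, the lock name, and the validatingLookup() function
are hypothetical and only illustrate the idea.

```cfml
<cfscript>
// Hypothetical stand-in for the cache; not ColdBox's actual object pool.
variables.NOT_FOUND = "_NOTFOUND_";
variables.pool = {};
variables.pool[ "myModel" ] = variables.NOT_FOUND; // a "poisoned" entry

// Report a hit only when the key exists AND holds a real object.
function validatingLookup( required string key ){
	var found = false;
	lock name="demoCachePool" type="exclusive" timeout="10" {
		if( structKeyExists( variables.pool, arguments.key ) ){
			if( isSimpleValue( variables.pool[ arguments.key ] )
				AND variables.pool[ arguments.key ] EQ variables.NOT_FOUND ){
				// Drop the poisoned entry so the caller rebuilds the model.
				structDelete( variables.pool, arguments.key );
			} else {
				found = true;
			}
		}
	}
	return found;
}

// Reports a miss here, so the caller would recreate the model object.
writeOutput( validatingLookup( "myModel" ) );
</cfscript>
```

A check like this only handles a sentinel that is already stored in
the cache. Because lookup and get would still be two separate calls,
it does not by itself close the window between them.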

Any thoughts on that discourse?

- Will B.

Will,

That cache engine is no longer in development, and CacheBox has changed the way it retrieves objects; the internal services have also changed for the final release. I can see what you are saying about the BeanFactory lookup/get combination, and I am updating that to match the approach used by the handler/plugin services. However, I recommend migrating to the next release, as we have 2 more releases left before the final release.

I am curious about HOW the cache is being used: time-persisted services, singletons? How are the objects retrieved, etc.? The only way these objects are removed is if they are time-persisted services or you are doing your own setup, so I would like to see what you are doing.

Our load tests don't show the behavior you describe, but again, they run on CacheBox and not the archived cache engine.

Luis F. Majano
President
Ortus Solutions, Corp

ColdBox Platform: http://www.coldbox.org
Linked In: http://www.linkedin.com/pub/3/731/483
Blog: http://www.luismajano.com
IECFUG Manager: http://www.iecfug.com