Since we released our app on M4 a couple weeks ago we've been
experiencing performance issues where, under load, the server would
get bogged down and eventually max out on running requests and spin
until I restarted CF. We use the ColdBox cache for quite a few
short-lived and medium-lived objects. One of those objects is a large
result set used for security that is loaded into cache when a user
logs in (if it doesn't already exist). Generating it is pretty
intensive, but since it only happens about once per login it's ok.
When the server hung, SeeFusion showed me that pretty much every running
request was trying to generate this large security query over and over
again.
Stack traces showed me that the eviction policy was running over and
over and the CB cache debug panel showed my cache was way over the
limit of items (400), and a large number of those items were
expired but hadn't been reaped.
So, long story short, the reap() method in CB 3 cache manager from
alpha to M5 has a couple bugs. In previous versions (2.6.3), a copy
of the pool metadata was obtained with
getObjectPool().getpool_metadata() and that struct was used to look up
cache items and see if they were expired and clear them. The newer
version replaced that with the lookup() method which not only ignores
expired objects (so they were never getting cleared and remaining in
the cache causing constant evictions) but also registers a hit against
them (resetting lastAccessed and inflating the hit count which throws
off your eviction policy). Secondly, the clearKey() method used the
lookup() method as well which prevented it from even clearing anything
since lookup said it didn't exist.
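To make the two problems easier to see, here is a rough toy model in Python. It is not the ColdBox source (which is CFML); the class, the method names, and the metadata fields are all made up for illustration, and it assumes lookup() records its hit on the items it does find. It only shows why a reap that goes through lookup() can never remove an expired item, while a reap that walks a copy of the metadata can.

```python
class ToyCache:
    """Hypothetical stand-in for a cache manager; not ColdBox code."""

    def __init__(self):
        self.pool = {}      # key -> cached value
        self.metadata = {}  # key -> expiry time, hit count, last access

    def set(self, key, value, timeout, now):
        self.pool[key] = value
        self.metadata[key] = {"expires": now + timeout, "hits": 0, "last_accessed": now}

    def lookup(self, key, now):
        """Reports expired items as missing and charges a hit to live ones."""
        meta = self.metadata.get(key)
        if meta is None or meta["expires"] <= now:
            return False
        meta["hits"] += 1
        meta["last_accessed"] = now
        return True

    def clear_key_buggy(self, key, now):
        # Gated on lookup(), so an expired key "doesn't exist" and is never removed.
        if not self.lookup(key, now):
            return False
        del self.pool[key]
        del self.metadata[key]
        return True

    def clear_key(self, key):
        # Removes the key directly, without asking lookup() first.
        existed = key in self.metadata
        self.pool.pop(key, None)
        self.metadata.pop(key, None)
        return existed

    def reap_buggy(self, now):
        removed = 0
        for key in list(self.metadata):
            if self.lookup(key, now):
                # Found means not expired: nothing to clear, but the item
                # has just been charged a hit by the reaper itself.
                continue
            if self.clear_key_buggy(key, now):
                removed += 1
        return removed

    def reap_fixed(self, now):
        # Walk a copy of the metadata, as the older approach did.
        removed = 0
        for key, meta in dict(self.metadata).items():
            if meta["expires"] <= now:
                self.clear_key(key)
                removed += 1
        return removed


def demo():
    cache = ToyCache()
    cache.set("security_query", "rows", timeout=10, now=0)
    cache.set("settings", "cfg", timeout=1000, now=0)
    buggy_removed = cache.reap_buggy(now=60)
    hits_after_buggy = cache.metadata["settings"]["hits"]
    size_after_buggy = len(cache.pool)
    fixed_removed = cache.reap_fixed(now=60)
    return buggy_removed, hits_after_buggy, size_after_buggy, fixed_removed, len(cache.pool)
```

In this model the buggy reap removes nothing and bumps the hit count on the live item, while the metadata-based reap removes the one expired item and leaves the hit counts alone.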
Now, having said all that, from the looks of the code, M6 has cleared
all this up with the way the indexer's objectExists method works. (Very
happy to see the indexer caches metadata now, Luis. I was going to
suggest that be cached!)
So basically, I put all this out here in case anyone is using
milestone 5 or less of CB 3 and having memory/cache issues like what I
described above. Until you upgrade to the latest milestone, you can
fix the problem fairly easily with some changes to the cache manager.
Ask if you need more specifics.
Hopefully I'll be able to upgrade to M6 soon, but I've got to check my
release schedule since we like to give any new version of ColdBox a
while to sit on dev while we (hopefully) find any kinks. Of course,
our lack of load testing kept us from noticing this bug until we had
production traffic hitting the site.
Thanks!
~Brad