[cachebox] - Clearing All Caches (couchbase)

So here’s something I’m trying to find a way to handle and not sure how best to approach it.

We’re switching to using the couchbase provider with a standalone cachebox install, primarily so server and CF reboots won’t wipe out the cache. So I updated my afterCacheRegistration() listener to check if the cache data exists for each cache item before setting it, so if it gets returned from couchbase, it won’t get pulled from the database and set again.

However, there will be times when we need to purge all the cache items and reinit the whole thing, such as a recent SP fix I did that made a change to a large number of the items in cache. I tried doing a clearAll() followed by restarting the cachebox which will rerun the afterCacheRegistration() methods, but as far as I can tell, couchbase is still clearing out items when cachebox starts up again. So as a result, a lot of the items don’t get created.

Other than just pausing the page, which IMO is never a good solution, is there a way for cachebox to know when couchbase finishes clearing out all the caches?

Mary Jo

So, the clearAll() method in the provider uses the flush functionality of Couchbase. It works, but it comes with a giant warning label from the Couchbase folks, who basically don’t recommend you ever actually use it other than for testing/debugging.

The biggest issue with it is that it can be quite slow and it is completely asynchronous, so the code continues on immediately even though the actual flush operation might continue for the next minute or so on the server. I’ve never tried it, but you could call getSize() in a loop with a sleep and wait until it reports zero. You could also keep calling getKeys() until it comes back empty. The getKeys() call has been coded to not allow stale data, so in theory it will wait until the data is fresh.
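To make the polling idea concrete, here is a rough, untested-against-Couchbase sketch in Python. The `FakeAsyncCache` class and its `get_size()` method are hypothetical stand-ins for the provider's getSize(), not the real CacheBox API; the timeout is my own addition so the loop can't spin forever if the flush never finishes.

```python
import time


class FakeAsyncCache:
    """Hypothetical stand-in for a cache whose flush finishes in the background."""

    def __init__(self, keys, polls_until_empty=3):
        self._keys = set(keys)
        self._polls_left = polls_until_empty

    def get_size(self):
        # Simulate an asynchronous flush: items vanish only after a few polls.
        if self._polls_left > 0:
            self._polls_left -= 1
        else:
            self._keys.clear()
        return len(self._keys)


def wait_until_empty(cache, timeout=60.0, interval=0.5):
    """Poll the cache size until it reports zero or the timeout expires.

    Returns True if the cache emptied in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if cache.get_size() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The caller would only rebuild the startup items once `wait_until_empty(...)` returns True, and treat False as "flush didn't finish, don't trust the cache yet".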

However, that being said, I’m not certain that the internal flush function of Couchbase actually follows the concurrency rules. It may be more of a backdoor thing that is designed to never block the other cache operations. You’ll have to try it and see what it does.

So, the only other way to really handle this via the normal API would be to get a list of keys, then loop over them and delete each of them. Of course, even this is async by default, since Couchbase ALWAYS errs on the side of performance over data consistency. It is possible to force Couchbase’s internal delete to be synchronous, but we didn’t do that in our cache provider’s clear() method. A call to getKeys() afterwards should block until the deletes are complete, though.
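Roughly, the loop-and-delete approach could look like the Python sketch below. `DictCache`, `get_keys()` and `clear()` are hypothetical names standing in for the provider's getKeys() and clear(); the repeated passes are an assumption on my part, to cover deletes that haven't landed by the time the keys are listed again.

```python
class DictCache:
    """Hypothetical in-memory stand-in for the cache provider."""

    def __init__(self, items):
        self._items = dict(items)

    def get_keys(self):
        return list(self._items)

    def clear(self, key):
        # Returns True if the key existed and was removed.
        return self._items.pop(key, None) is not None


def clear_all_by_key(cache, max_passes=5):
    """Delete every key individually, re-listing keys to verify the result.

    Returns True once a key listing comes back empty, False if keys
    are still present after max_passes attempts.
    """
    for _ in range(max_passes):
        keys = cache.get_keys()
        if not keys:
            return True
        for key in keys:
            cache.clear(key)
    return not cache.get_keys()
```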

It is a little annoying that there’s not a super good way to clear an entire bucket synchronously, but Couchbase’s official stance on that is sort of, “Why would you want to do that??” If you were to ask them, they would probably tell you to change your app to always check for the item and set it if it doesn’t exist. Then your code won’t really care if a bucket clear is async; it will just put the item back as soon as it’s gone.
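In case it helps, the check-then-set pattern is basically this (Python sketch; `SimpleCache`, `get_or_load` and the `loader` callback are illustrative names I made up, not part of CacheBox):

```python
class SimpleCache:
    """Hypothetical in-memory stand-in for the cache provider."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value

    def clear(self, key):
        self._items.pop(key, None)


def get_or_load(cache, key, loader):
    """Return the cached value, loading and caching it if it is missing."""
    value = cache.get(key)
    if value is None:
        # Cache miss: fall back to the slow source (e.g. the database).
        value = loader(key)
        cache.set(key, value)
    return value
```

Every read goes through `get_or_load`, so an item that disappears in the middle of an async clear simply gets reloaded on the next request. The cost is that each miss hits the slow source.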

Thanks!

~Brad

ColdBox Platform Evangelist
Ortus Solutions, Corp

E-mail: brad@coldbox.org
ColdBox Platform: http://www.coldbox.org
Blog: http://www.codersrevolution.com

Yeah, see, that's the problem: I already do that. The issue is that the
client's database simply cannot handle the kind of load we get in that
case. Obviously we need to fix THAT issue, but until very recently we have
not been able to get the permission and resources needed to address the DB
structure and location that are such performance killers (at least the DB
structure, which is a mess). Fixing it is going to take some time. (I was
talking to a DBA friend of mine and he says I should submit some of the
things we're dealing with to WTF because they are THAT bad!) So in the
meantime we have to pre-load all our most heavily used searches on startup,
so that when the site comes out of maintenance mode CF doesn't bomb with
slow-running threads waiting on these SPs to run.

So clearAll() would only be used when we are doing a build and the site is
offline. However, it sounds like rolling my own method that would loop over
all the keys and clear them, and then rebuild the startup keys would be the
best way to go for now.

Thanks!

The tried and true solution is to bring the site down for maintenance. That’s how the big boys do it.

Well, any time we’d be doing this we would indeed be putting the site in maintenance mode. I was just hoping we could simplify the process of clearing out the various caches and rebuilding our startup items. After working with this all day though, I’m finding the flush() just does not seem to work well from cachebox at all, at least not on our dev box. I even tried implementing a clearMulti() to go through and clear all the keys individually, but that didn’t work any better for me. So I will just have to document the need to flush the buckets from the couchbase admin whenever we need to actually clear the cache, in the event I’m not the one doing it (which would be pretty rare, probably not ever as long as I’m still on the project!). That is probably the best way to do it anyway, although even the couchbase admin seems to not handle this well: the UI throws up errors when you do it, but eventually the items do get flushed.

It does give me another reason though to have several buckets to handle our caching so in the event I have to rebuild due to changes, I don’t have to flush the entire thing.

Mary Jo

Mary Jo,

That is one thing that caught me out too, although I did all my work locally with test servers. That got me thinking of something that may work even better, provided you have the space.

The moment you mentioned buckets, I had a thought: why can’t you keep a separate, clean bucket and swap the cache over to it? I know it’s a pain to remember to do it, but you could automate this in ColdBox to do it for you.

  1. Get old bucket
  2. Switch to new bucket
  3. Clear old bucket

I would think that the switch could be painless, and the bucket you switch to would already be clean because you clear it in code. Would that work in your case?
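Something like this is what I have in mind, sketched in Python with plain dicts standing in for buckets. `BucketSwitcher`, `swap_to` and the `warm_up` callback are all hypothetical names; the warm-up step before the switch is my own assumption, added so the startup items are in place before the new bucket goes live.

```python
class BucketSwitcher:
    """Hypothetical router that points the app at one of several buckets."""

    def __init__(self, buckets, active):
        self.buckets = buckets  # bucket name -> dict acting as the bucket
        self.active = active

    def current(self):
        return self.buckets[self.active]

    def swap_to(self, new_name, warm_up):
        old_name = self.active            # 1. Get old bucket
        warm_up(self.buckets[new_name])   # Preload the clean bucket (assumption)
        self.active = new_name            # 2. Switch to new bucket
        self.buckets[old_name].clear()    # 3. Clear old bucket; nothing reads it now
        return old_name
```

Because nothing reads the old bucket after step 2, it shouldn't matter how long its clear takes to finish.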

Thanks for the idea, but I’m not sure that would help, because I’m not seeing the flush happen on all my couchbase buckets when I try to initiate it from the CF side. It’s not strictly a timing issue, in which case your idea would work; the flush just doesn’t seem to happen consistently at all. I do have flushing enabled on all the buckets (it’s off by default), so I know that’s not the problem. The buckets with the most items in particular do not seem to want to be cleared… which of course are also the ones I am most likely to need to flush. But having to do it manually shouldn’t be that big a deal. Hopefully it will be a rare instance where I have to do it at all.

Mary Jo