[Linux-cachefs] Re: Using latest code

Daire Byrne Daire.Byrne at framestore.com
Tue Nov 18 19:16:53 UTC 2008


David,

> If you have a current copy of Linus's tree you can use as a reference to
> create a GIT repository off of, say /foo, then you can do:
>
>	git clone -l -s --reference /foo /foo /my/fscache/tree
>
> If not, then do this:
>
>	git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git /my/fscache/tree
>
> Then when you've created a tree to play with, do:
>
>	cd /my/fscache/tree
>	git reset --hard <commit-id>
>
> That'll wind the GIT tree back to where you want it to be.

Thank you - this is exactly what I was looking for. I'm still getting the hang of git and I just couldn't find any clues about the "reset --hard" bit!
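(For anyone else finding this thread: the wind-back step can be tried safely in a throwaway repository before touching a real tree. Everything below - the paths, file names and commit messages - is made up purely for illustration.)

```shell
set -e

# Build a throwaway repo with two commits.
dir=$(mktemp -d)
cd "$dir"
git init -q demo && cd demo
git config user.email test@example.com
git config user.name Test

echo one > file && git add file && git commit -qm first
first=$(git rev-parse HEAD)          # remember the first commit's id

echo two > file && git commit -qam second

# Wind the tree back to the first commit, as David describes.
# NOTE: --hard discards any uncommitted changes in the working tree.
git reset --hard "$first"
cat file   # prints "one" - the working tree matches the first commit
```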

>> Any chance you could provide some foolproof instructions for getting the
>> latest fscache code working?

> Well, I'm just trying to fix a bug in it, and then I'll release a bunch of
> patches that are built on top of James Morris's security tree next branch.  To
> get that, do:
>
>	git clone git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/security-testing-2.6-next.git /my/fscache/tree
>
> You'll need this to get the current credentials code that CacheFiles requires
> for working around security.

I'll check it out - thanks.

>> It is a pity this code is not kept up to date in the EL5 kernel.

> The stuff in the RHEL-5 kernel is somewhat out of date, and requires kABI
> changes to be brought up to date.

I'd happily compile my own kernel and disable kABI checking if I could easily get the latest and greatest FS-Cache code into an RHEL5 image. So many userspace things break going from 2.6.18 to 2.6.27 that I'll have to use Fedora 9 as a server or something.

>> Also is it normal to have the cachefilesd process chew 100% for long periods
>> of time? A quick strace suggests it is doing a lot of "culls" but the
>> filesystem is never near full enough for that to trigger.

> Is it culling things?  Or is it building up its cull table?  I suspect it's
> probably the latter.  Currently it does a tree scan of the cache's directory
> structure.  This has proven to be very slow and very CPU intensive.  What it
> requires is a couple of indices adding to the mix, but that leads to
> consistency issues over power failure:-/

To be honest, I think it was mainly due to the slow disk I was using for testing. Now that I'm using a RAID array, the process seems to be a lot better behaved. It still uses 100% of the CPU from time to time, but now for a few seconds instead of hours.

> When you say 'nfscache' do you mean 'NFS with FS-Cache', or do you mean
> something else entirely?

Yes, apologies - this feature has been called many things in the past and I still get confused between them all. What I want to do is mount an NFS export, over a slow WAN link, on a server at a remote office. That server would then re-export the NFS mount via Samba to the 20 or so local office workers. Basically a homemade version of one of these (or similar):

  http://www.gear6.com/cachefx

If we could "prime" the remote cache overnight, this would be much nicer than our current method of replicating data every day. A write-back cache (too dangerous?) could also make local writes at the remote office seem relatively quick. We are also quite interested in doing this with Lustre, so we eagerly await the completion of that ongoing work. Lustre is a lot less chatty over a WAN, so I expect it will be far more efficient than NFS.
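(For readers following along: the basic "NFS with FS-Cache" setup on a kernel with FS-Cache/CacheFiles support looks roughly like the sketch below. The directory, tag, thresholds and server names are illustrative, not a tested configuration.)

```shell
# Sketch only -- requires a kernel with FS-Cache/CacheFiles compiled in
# and the cachefilesd daemon installed. All values are illustrative.

# /etc/cachefilesd.conf would contain something like:
#   dir /var/cache/fscache   # backing directory for the cache
#   tag mycache
#   brun  10%                # culling turns off when free space exceeds this
#   bcull  7%                # culling starts when free space falls below this
#   bstop  3%                # caching stops when free space falls below this

service cachefilesd start

# Mount the remote export with the 'fsc' option so NFS uses FS-Cache:
mount -t nfs -o fsc fileserver:/export /mnt/export
```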

Daire



