• Kiran Kumar Modukuri's avatar
    fscache: Fix reference overput in fscache_attach_object() error handling · f29507ce
    Kiran Kumar Modukuri authored
    When a cookie is allocated that causes fscache_object structs to be
    allocated, those objects are initialised with the cookie pointer, but
    aren't blessed with a ref on that cookie unless the attachment is
    successfully completed in fscache_attach_object().
    If attachment fails because the parent object was dying or there was a
    collision, fscache_attach_object() returns without incrementing the cookie
    counter - but upon failure of this function, the object is released which
    then puts the cookie, whether or not a ref was taken on the cookie.
    Fix this by taking a ref on the cookie when it is assigned in
    fscache_object_init(), even when we're creating a root object.
    Analysis from Kiran Kumar:
    This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel
    BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277
    fscache cookie ref count updated incorrectly during fscache object
    allocation resulting in following Oops.
    kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
    kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
    Two threads are trying to do operate on a cookie and two objects.
    (1) One thread tries to unmount the filesystem and in process goes over a
        huge list of objects marking them dead and deleting the objects.
        cookie->usage is also decremented in following path:
           -> __fscache_relinquish_cookie
            ->BUG_ON(atomic_read(&cookie->usage) <= 0);
    (2) A second thread tries to lookup an object for reading data in following
        1) cachefiles_alloc_object
            -> fscache_object_init
               -> assign cookie, but usage not bumped.
        2) fscache_attach_object -> fails in cant_attach_object because the
             cookie's backing object or cookie's->parent object are going away
        3) fscache_put_object
            -> cachefiles_put_object
                   ->BUG_ON(atomic_read(&cookie->usage) <= 0);
    [NOTE from dhowells] It's unclear as to the circumstances in which (2) can
    take place, given that thread (1) is in nfs_kill_super(), however a
    conflicting NFS mount with slightly different parameters that creates a
    different superblock would do it.  A backtrace from Kiran seems to show
    that this is a possibility:
        kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
        RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
        Call Trace:
         __fscache_relinquish_cookie+0x87/0x120 [fscache]
         nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
         nfs_kill_super+0x29/0x40 [nfs]
    [Fix] Bump up the cookie usage in fscache_object_init, when it is first
    being assigned a cookie atomically such that the cookie is added and bumped
    up if its refcount is not zero.  Remove the assignment in
    I have run ~100 hours of NFS stress tests and not seen this bug recur.
    [Regression Potential]
     - Limited to fscache/cachefiles.
    Fixes: ccc4fc3d ("FS-Cache: Implement the cookie management part of the netfs API")
    Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
    Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
cache.c 10.8 KB