    xfs: flush removing page cache in xfs_reflink_remap_prep · 2c307174
    Dave Chinner authored
    
    
    On a sub-page block size filesystem, fsx is failing with a data
    corruption after a series of operations involving copying a file
    with the destination offset beyond EOF of the destination file:
    
    8093(157 mod 256): TRUNCATE DOWN        from 0x7a120 to 0x50000 ******WWWW
    8094(158 mod 256): INSERT 0x25000 thru 0x25fff  (0x1000 bytes)
    8095(159 mod 256): COPY 0x18000 thru 0x1afff    (0x3000 bytes) to 0x2f400
    8096(160 mod 256): WRITE    0x5da00 thru 0x651ff        (0x7800 bytes) HOLE
    8097(161 mod 256): COPY 0x2000 thru 0x5fff      (0x4000 bytes) to 0x6fc00
    
    The second copy here is beyond EOF, and it is to a sub-page (4k) but
    block-aligned (1k) offset. The clone runs the EOF zeroing, landing
    in a pre-existing post-eof delalloc extent. This zeroes the post-eof
    extents in the page cache just fine, dirtying the pages correctly.
    
    The problem is that xfs_reflink_remap_prep() now truncates the page
    cache over the range that it is copying to, and rounds that down
    to cover the entire start page. This removes the dirty page over the
    delalloc extent from the page cache without having written it back.
    Hence later, when the page cache is flushed, the page at offset
    0x6f000 has not been written back and hence exposes stale data,
    which fsx trips over less than 10 operations later.
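
    The rounding described above can be checked with a small userspace
    sketch. It is not kernel code: the helper names and the two size
    macros are assumptions made for illustration, using the 4k page and
    1k block sizes mentioned in this message. It shows that the second
    copy's destination offset 0x6fc00 is block aligned but not page
    aligned, and that rounding it down to a page boundary gives
    0x6f000, so the truncated range also covers 0x6f000-0x6fbff.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES  4096u	/* assumed 4k page size, per the text above */
#define BLOCK_SIZE_BYTES 1024u	/* assumed 1k filesystem block size */

/* Round off down to a multiple of unit; unit must be a power of two. */
static inline uint64_t round_down_to(uint64_t off, uint64_t unit)
{
	return off & ~(unit - 1);
}

/* Return nonzero if off is a multiple of unit (a power of two). */
static inline int is_aligned_to(uint64_t off, uint64_t unit)
{
	return (off & (unit - 1)) == 0;
}
```

    For the failing operation, round_down_to(0x6fc00, PAGE_SIZE_BYTES)
    yields 0x6f000, the page that fsx later finds stale data in.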
    
    Fix this by changing xfs_reflink_remap_prep() to use
    xfs_flush_unmap_range().
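
    Why flushing before removing matters can be seen in a toy model.
    This is a deliberately simplified userspace sketch, not the XFS
    implementation: every toy_* name is hypothetical, a single page
    stands in for the page cache, and a byte array stands in for the
    on-disk contents. It assumes only that a flush-and-unmap helper
    writes dirty pages back before dropping them, whereas a plain
    truncate drops them unwritten.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE 4096u

/* One cached page over one page of "disk"; all names are hypothetical. */
struct toy_file {
	bool cached;			/* page is present in the toy cache */
	bool dirty;			/* cached copy is newer than disk */
	unsigned char data[TOY_PAGE];	/* cached contents */
	unsigned char disk[TOY_PAGE];	/* stand-in for on-disk contents */
};

/* Zero the first len bytes through the cache and mark the page dirty. */
static void toy_zero_in_cache(struct toy_file *f, unsigned int len)
{
	if (len > TOY_PAGE)
		len = TOY_PAGE;
	if (!f->cached) {
		memcpy(f->data, f->disk, TOY_PAGE);
		f->cached = true;
	}
	memset(f->data, 0, len);
	f->dirty = true;
}

/* Drop the page without writeback: the zeroing done in cache is lost. */
static void toy_truncate_page(struct toy_file *f)
{
	f->cached = false;
	f->dirty = false;
}

/* Write the page back if dirty, then drop it: the zeroing reaches disk. */
static void toy_flush_unmap_page(struct toy_file *f)
{
	if (f->cached && f->dirty)
		memcpy(f->disk, f->data, TOY_PAGE);
	toy_truncate_page(f);
}
```

    In this model, zeroing 0xc00 bytes and then calling
    toy_truncate_page() leaves the old bytes on "disk", while
    toy_flush_unmap_page() leaves zeroes there. The real case involves
    a delalloc extent rather than already-allocated blocks, which the
    model does not attempt to represent.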
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>