1. 04 Nov, 2014 1 commit
    • Jaegeuk Kim's avatar
      f2fs: fix race conditon on truncation with inline_data · 1ce86bf6
      Jaegeuk Kim authored
      
      
      Let's consider the following scenario.
      
      blkaddr[0] inline_data i_size  i_blocks writepage           truncate
        NEW        X        4096        2    dirty page #0
        NEW        X         0                                    change i_size
        NEW        X         0          2    f2fs_write_inline_data
        NEW        X         0          2    get_dnode_of_data
        NEW        X         0          2    truncate_data_blocks_range
        NULL       O         0          1    memcpy(inline_data)
        NULL       O         0          1    f2fs_put_dnode
        NULL       O         0          1                         f2fs_truncate
        NULL       O         0          1                         get_dnode_of_data
        NULL       O         0          1                       *invalid block addr*
      
      This patch adds checking inline_data flag during f2fs_truncate not to refer
      corrupted block indices.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      1ce86bf6
  2. 07 Oct, 2014 2 commits
    • Jaegeuk Kim's avatar
      f2fs: support volatile operations for transient data · 02a1335f
      Jaegeuk Kim authored
      
      
      This patch adds support for volatile writes which keep data pages in memory
      until f2fs_evict_inode is called by iput.
      
      For instance, we can use this feature for the sqlite database as follows.
      While supporting atomic writes for main database file, we can keep its journal
      data temporarily in the page cache by the following sequence.
      
      1. open
       -> ioctl(F2FS_IOC_START_VOLATILE_WRITE);
      2. writes
       : keep all the data in the page cache.
      3. flush to the database file with atomic writes
        a. ioctl(F2FS_IOC_START_ATOMIC_WRITE);
        b. writes
        c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
      4. close
       -> drop the cached data
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      02a1335f
    • Jaegeuk Kim's avatar
      f2fs: support atomic writes · 88b88a66
      Jaegeuk Kim authored
      This patch introduces a very limited functionality for atomic write support.
      In order to support atomic write, this patch adds two ioctls:
       o F2FS_IOC_START_ATOMIC_WRITE
       o F2FS_IOC_COMMIT_ATOMIC_WRITE
      
      The database engine should be aware of the following sequence.
      1. open
       -> ioctl(F2FS_IOC_START_ATOMIC_WRITE);
      2. writes
        : all the written data will be treated as atomic pages.
      3. commit
       -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE);
        : this flushes all the data blocks to the disk, which will be shown all or
        nothing by f2fs recovery procedure.
      4. repeat to #2
      
      .
      
      The IO pattens should be:
      
        ,- START_ATOMIC_WRITE                  ,- COMMIT_ATOMIC_WRITE
       CP | D D D D D D | FSYNC | D D D D | FSYNC ...
                            `- COMMIT_ATOMIC_WRITE
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      88b88a66
  3. 30 Sep, 2014 2 commits
  4. 23 Sep, 2014 4 commits
    • Chao Yu's avatar
      f2fs: skip punching hole in special condition · 14cecc5c
      Chao Yu authored
      
      
      Now punching hole in directory is not supported in f2fs, so let's limit file
      type in punch_hole().
      
      In addition, in punch_hole if offset is exceed file size, we should skip
      punching hole.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      14cecc5c
    • Chao Yu's avatar
      f2fs: fix to truncate blocks past EOF in ->setattr · 09db6a2e
      Chao Yu authored
      
      
      By using FALLOC_FL_KEEP_SIZE in ->fallocate of f2fs, we can fallocate block past
      EOF without changing i_size of inode. These blocks past EOF will not be
      truncated in ->setattr as we truncate them only when change the file size.
      
      We should give a chance to truncate blocks out of filesize in setattr().
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      09db6a2e
    • Jaegeuk Kim's avatar
      f2fs: do not skip latest inode information · 19c9c466
      Jaegeuk Kim authored
      
      
      In f2fs_sync_file, if there is no written appended writes, it skips
      to write its node blocks.
      But, if there is up-to-date inode page, we should write it to update
      its metadata during the roll-forward recovery.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      19c9c466
    • Jaegeuk Kim's avatar
      f2fs: fix conditions to remain recovery information in f2fs_sync_file · 88bd02c9
      Jaegeuk Kim authored
      This patch revisited whole the recovery information during the f2fs_sync_file.
      
      In this patch, there are three information to make a decision.
      
      a) IS_CHECKPOINTED,	/* is it checkpointed before? */
      b) HAS_FSYNCED_INODE,	/* is the inode fsynced before? */
      c) HAS_LAST_FSYNC,	/* has the latest node fsync mark? */
      
      And, the scenarios for our rule are based on:
      
      [Term] F: fsync_mark, D: dentry_mark
      
      1. inode(x) | CP | inode(x) | dnode(F)
      2. inode(x) | CP | inode(F) | dnode(F)
      3. inode(x) | CP | dnode(F) | inode(x) | inode(F)
      4. inode(x) | CP | dnode(F) | inode(F)
      5. CP | inode(x) | dnode(F) | inode(DF)
      6. CP | inode(DF) | dnode(F)
      7. CP | dnode(F) | inode(DF)
      8. CP | dnode(F) | inode(x) | inode(DF)
      
      For example, #3, the three conditions should be changed as follows.
      
         inode(x) | CP | dnode(F) | inode(x) | inode(F)
      a)    x       o      o          o          o
      b)    x       x      x          x          o
      c)    x       o      o          x          o
      
      If f2fs_sync_file stops   ------^,
       it should write inode(F)    --------------^
      
      So, the need_inode_block_update should return true, since
       c) get_nat_flag(e, HAS_LAST_FSYNC), is false.
      
      For example, #8
      
      ,
            CP | alloc | dnode(F) | inode(x) | inode(DF)
      a)    o      x        x          x          x
      b)    x               x          x          o
      c)    o               o          x          o
      
      If f2fs_sync_file stops   -------^,
       it should write inode(DF)    --------------^
      
      Note that, the roll-forward policy should follow this rule, which means,
      if there are any missing blocks, we doesn't need to recover that inode.
      Signed-off-by: default avatarHuang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      88bd02c9
  5. 16 Sep, 2014 1 commit
    • Jaegeuk Kim's avatar
      f2fs: give an option to enable in-place-updates during fsync to users · c1ce1b02
      Jaegeuk Kim authored
      
      
      If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file
      only starts to try in-place-updates.
      And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it
      keeps out-of-order manner. Otherwise, it triggers in-place-updates.
      
      This may be used by storage showing very high random write performance.
      
      For example, it can be used when,
      
      Seq. writes (Data) + wait + Seq. writes (Node)
      
      is pretty much slower than,
      
      Rand. writes (Data)
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c1ce1b02
  6. 11 Sep, 2014 1 commit
  7. 09 Sep, 2014 2 commits
  8. 04 Sep, 2014 1 commit
  9. 21 Aug, 2014 2 commits
  10. 19 Aug, 2014 2 commits
  11. 04 Aug, 2014 1 commit
  12. 31 Jul, 2014 1 commit
  13. 30 Jul, 2014 2 commits
  14. 09 Jul, 2014 2 commits
  15. 23 Jun, 2014 2 commits
  16. 12 Jun, 2014 1 commit
    • Al Viro's avatar
      ->splice_write() via ->write_iter() · 8d020765
      Al Viro authored
      
      
      iter_file_splice_write() - a ->splice_write() instance that gathers the
      pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
      it to ->write_iter().  A bunch of simple cases coverted to that...
      
      [AV: fixed the braino spotted by Cyrill]
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      8d020765
  17. 07 Jun, 2014 1 commit
  18. 06 Jun, 2014 1 commit
  19. 07 May, 2014 5 commits
  20. 06 May, 2014 2 commits
  21. 07 Apr, 2014 2 commits
    • Kirill A. Shutemov's avatar
      mm: implement ->map_pages for page cache · f1820361
      Kirill A. Shutemov authored
      
      
      filemap_map_pages() is generic implementation of ->map_pages() for
      filesystems who uses page cache.
      
      It should be safe to use filemap_map_pages() for ->map_pages() if
      filesystem use filemap_fault() for ->fault().
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f1820361
    • Jaegeuk Kim's avatar
      f2fs: introduce f2fs_issue_flush to avoid redundant flush issue · 6b4afdd7
      Jaegeuk Kim authored
      
      
      Some storage devices show relatively high latencies to complete cache_flush
      commands, even though their normal IO speed is prettry much high. In such
      the case, it needs to merge cache_flush commands as much as possible to avoid
      issuing them redundantly.
      So, this patch introduces a mount option, "-o flush_merge", to mitigate such
      the overhead.
      
      If this option is enabled by user, F2FS merges the cache_flush commands and then
      issues just one cache_flush on behalf of them. Once the single command is
      finished, F2FS sends a completion signal to all the pending threads.
      
      Note that, this option can be used under a workload consisting of very intensive
      concurrent fsync calls, while the storage handles cache_flush commands slowly.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk.kim@samsung.com>
      6b4afdd7
  22. 20 Mar, 2014 2 commits
    • Jaegeuk Kim's avatar
      f2fs: skip unnecessary node writes during fsync · 479f40c4
      Jaegeuk Kim authored
      
      
      If multiple redundant fsync calls are triggered, we don't need to write its
      node pages with fsync mark continuously.
      
      So, this patch adds FI_NEED_FSYNC to track whether the latest node block is
      written with the fsync mark or not.
      If the mark was set, a new fsync doesn't need to write a node block.
      Otherwise, we should do a new node block with the mark for roll-forward
      recovery.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk.kim@samsung.com>
      479f40c4
    • Jaegeuk Kim's avatar
      f2fs: introduce fi->i_sem to protect fi's info · d928bfbf
      Jaegeuk Kim authored
      
      
      This patch introduces fi->i_sem to protect fi's info that includes xattr_ver,
      pino, i_nlink.
      This enables to remove i_mutex during f2fs_sync_file, resulting in performance
      improvement when a number of fsync calls are triggered from many concurrent
      threads.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk.kim@samsung.com>
      d928bfbf