1. 26 Feb, 2016 1 commit
    • Chao Yu's avatar
      f2fs: fix incorrect upper bound when iterating inode mapping tree · 80dd9c0e
      Chao Yu authored
      
      
      1. Inode mapping tree can index page in range of [0, ULONG_MAX], however,
      in some places, f2fs only search or iterate page in ragne of [0, LONG_MAX],
      result in miss hitting in page cache.
      
      2. filemap_fdatawait_range accepts range parameters in unit of bytes, so
      the max range it covers should be [0, LLONG_MAX], if we use [0, LONG_MAX]
      as range for waiting on writeback, big number of pages will not be covered.
      
      This patch corrects above two issues.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      80dd9c0e
  2. 23 Feb, 2016 11 commits
    • Yunlei He's avatar
      f2fs: avoid hungtask problem caused by losing wake_up · 0ff21646
      Yunlei He authored
      
      
      The D state of wait_on_all_pages_writeback should be waken by
      function f2fs_write_end_io when all writeback pages have been
      succesfully written to device. It's possible that wake_up comes
      between get_pages and io_schedule. Maybe in this case it will
      lost wake_up and still in D state even if all pages have been
      write back to device, and finally, the whole system will be into
      the hungtask state.
      
                      if (!get_pages(sbi, F2FS_WRITEBACK))
                               break;
      					<---------  wake_up
                      io_schedule();
      Signed-off-by: default avatarYunlei He <heyunlei@huawei.com>
      Signed-off-by: default avatarBiao He <hebiao6@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      0ff21646
    • Chao Yu's avatar
      f2fs: trace old block address for CoWed page · 7a9d7548
      Chao Yu authored
      
      
      This patch enables to trace old block address of CoWed page for better
      debugging.
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
      f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
      
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
      f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      7a9d7548
    • Shawn Lin's avatar
      f2fs: move sanity checking of cp into get_valid_checkpoint · 984ec63c
      Shawn Lin authored
      
      
      >From the function name of get_valid_checkpoint, it seems to return
      the valid cp or NULL for caller to check. If no valid one is found,
      f2fs_fill_super will print the err log. But if get_valid_checkpoint
      get one valid(the return value indicate that it's valid, however actually
      it is invalid after sanity checking), then print another similar err
      log. That seems strange. Let's keep sanity checking inside the procedure
      of geting valid cp. Another improvement we gained from this move is
      that even the large volume is supported, we check the cp in advanced
      to skip the following procedure if failing the sanity checking.
      Signed-off-by: default avatarShawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      984ec63c
    • Chao Yu's avatar
      f2fs: split journal cache from curseg cache · b7ad7512
      Chao Yu authored
      
      
      In curseg cache, f2fs caches two different parts:
       - datas of current summay block, i.e. summary entries, footer info.
       - journal info, i.e. sparse nat/sit entries or io stat info.
      
      With this approach, 1) it may cause higher lock contention when we access
      or update both of the parts of cache since we use the same mutex lock
      curseg_mutex to protect the cache. 2) current summary block with last
      journal info will be writebacked into device as a normal summary block
      when flushing, however, we treat journal info as valid one only in current
      summary, so most normal summary blocks contain junk journal data, it wastes
      remaining space of summary block.
      
      So, in order to fix above issues, we split curseg cache into two parts:
      a) current summary block, protected by original mutex lock curseg_mutex
      b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
      
      When loading curseg cache during ->mount, we store summary info and
      journal info into different caches; When doing checkpoint, we combine
      datas of two cache into current summary block for persisting.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      b7ad7512
    • Chao Yu's avatar
      f2fs: enhance IO path with block plug · e9f5b8b8
      Chao Yu authored
      
      
      Try to use block plug in more place as below to let process cache bios
      as much as possbile, in order to reduce lock overhead of queue in IO
      scheduler.
      1) sync_meta_pages
      2) ra_meta_pages
      3) f2fs_balance_fs_bg
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      e9f5b8b8
    • Chao Yu's avatar
      f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12
      Chao Yu authored
      
      
      Introduce a new structure f2fs_journal to wrap journal info in struct
      f2fs_summary_block for readability.
      
      struct f2fs_journal {
      	union {
      		__le16 n_nats;
      		__le16 n_sits;
      	};
      	union {
      		struct nat_journal nat_j;
      		struct sit_journal sit_j;
      		struct f2fs_extra_info info;
      	};
      } __packed;
      
      struct f2fs_summary_block {
      	struct f2fs_summary entries[ENTRIES_IN_SUM];
      	struct f2fs_journal journal;
      	struct summary_footer footer;
      } __packed;
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      dfc08a12
    • Yunlei He's avatar
      f2fs: fix missing skip pages info · d31c7c3f
      Yunlei He authored
      
      
      fix missing skip pages info in f2fs_writepages trace event.
      Signed-off-by: default avatarYunlei He <heyunlei@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      d31c7c3f
    • Chao Yu's avatar
      f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797
      Chao Yu authored
      
      
      f2fs use single bio buffer per type data (META/NODE/DATA) for caching
      writes locating in continuous block address as many as possible, after
      submitting, these writes may be still cached in bio buffer, so we have
      to flush cached writes in bio buffer by calling f2fs_submit_merged_bio.
      
      Unfortunately, in the scenario of high concurrency, bio buffer could be
      flushed by someone else before we submit it as below reasons:
      a) there is no space in bio buffer.
      b) add a request of different type (SYNC, ASYNC).
      c) add a discontinuous block address.
      
      For this condition, f2fs_submit_merged_bio will be devastating, because
      it could break the following merging of writes in bio buffer, split one
      big bio into two smaller one.
      
      This patch introduces f2fs_submit_merged_bio_cond which can do a
      conditional submitting with bio buffer, before submitting it will judge
      whether:
       - page in DATA type bio buffer is matching with specified page;
       - page in DATA type bio buffer is belong to specified inode;
       - page in NODE type bio buffer is belong to specified inode;
      If there is no eligible page in bio buffer, we will skip submitting step,
      result in gaining more chance to merge consecutive block IOs in bio cache.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      0c3a5797
    • Jaegeuk Kim's avatar
      f2fs: wait on page's writeback in writepages path · fa3d2bdf
      Jaegeuk Kim authored
      
      
      Likewise f2fs_write_cache_pages, let's do for node and meta pages too.
      Especially, for node blocks, we should do this before marking its fsync
      and dentry flags.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      fa3d2bdf
    • Shuoran Liu's avatar
      f2fs: introduce lifetime write IO statistics · 8f1dbbbb
      Shuoran Liu authored
      
      
      This patch introduces lifetime IO write statistics exposed to the sysfs interface.
      The write IO amount is obtained from block layer, accumulated in the file system and
      stored in the hot node summary of checkpoint.
      Signed-off-by: default avatarShuoran Liu <liushuoran@huawei.com>
      Signed-off-by: default avatarPengyang Hou <houpengyang@huawei.com>
      [Jaegeuk Kim: add sysfs documentation]
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8f1dbbbb
    • Jaegeuk Kim's avatar
      f2fs: use wait_for_stable_page to avoid contention · fec1d657
      Jaegeuk Kim authored
      
      
      In write_begin, if storage supports stable_page, we don't need to wait for
      writeback to update its contents.
      This patch introduces to use wait_for_stable_page instead of
      wait_on_page_writeback.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      fec1d657
  3. 11 Jan, 2016 1 commit
  4. 31 Dec, 2015 1 commit
    • Jaegeuk Kim's avatar
      f2fs: write pending bios when cp_error is set · 8d4ea29b
      Jaegeuk Kim authored
      When testing ioc_shutdown, put_super is able to be hanged by waiting for
      writebacking pages as follows.
      
      INFO: task umount:2723 blocked for more than 120 seconds.
            Tainted: G           O    4.4.0-rc3+ #8
      
      
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      umount          D ffff88000859f9d8     0  2723   2110 0x00000000
       ffff88000859f9d8 0000000000000000 0000000000000000 ffffffff81e11540
       ffff880078c225c0 ffff8800085a0000 ffff88007fc17440 7fffffffffffffff
       ffffffff818239f0 ffff88000859fb48 ffff88000859f9f0 ffffffff8182310c
      Call Trace:
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8182310c>] schedule+0x3c/0x90
       [<ffffffff81827fb9>] schedule_timeout+0x2d9/0x430
       [<ffffffff810e0f8f>] ? mark_held_locks+0x6f/0xa0
       [<ffffffff8111614d>] ? ktime_get+0x7d/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff8106a655>] ? kvm_clock_get_cycles+0x25/0x30
       [<ffffffff8111617c>] ? ktime_get+0xac/0x140
       [<ffffffff818239f0>] ? bit_wait+0x50/0x50
       [<ffffffff81822564>] io_schedule_timeout+0xa4/0x110
       [<ffffffff81823a25>] bit_wait_io+0x35/0x50
       [<ffffffff818235bd>] __wait_on_bit+0x5d/0x90
       [<ffffffff811b9e8b>] wait_on_page_bit+0xcb/0xf0
       [<ffffffff810d5f90>] ? autoremove_wake_function+0x40/0x40
       [<ffffffff811cf84c>] truncate_inode_pages_range+0x4bc/0x840
       [<ffffffff811cfc3d>] truncate_inode_pages_final+0x4d/0x60
       [<ffffffffc023ced5>] f2fs_evict_inode+0x75/0x400 [f2fs]
       [<ffffffff812639bc>] evict+0xbc/0x190
       [<ffffffff81263d19>] iput+0x229/0x2c0
       [<ffffffffc0241885>] f2fs_put_super+0x105/0x1a0 [f2fs]
       [<ffffffff8124756a>] generic_shutdown_super+0x6a/0xf0
       [<ffffffff812478f7>] kill_block_super+0x27/0x70
       [<ffffffffc0241290>] kill_f2fs_super+0x20/0x30 [f2fs]
       [<ffffffff81247b03>] deactivate_locked_super+0x43/0x70
       [<ffffffff81247f4c>] deactivate_super+0x5c/0x60
       [<ffffffff81268d2f>] cleanup_mnt+0x3f/0x90
       [<ffffffff81268dc2>] __cleanup_mnt+0x12/0x20
       [<ffffffff810ac463>] task_work_run+0x73/0xa0
       [<ffffffff810032ac>] exit_to_usermode_loop+0xcc/0xd0
       [<ffffffff81003e7c>] syscall_return_slowpath+0xcc/0xe0
       [<ffffffff81829ea2>] int_ret_from_sys_call+0x25/0x9f
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8d4ea29b
  5. 30 Dec, 2015 2 commits
  6. 17 Dec, 2015 2 commits
  7. 16 Dec, 2015 2 commits
  8. 15 Dec, 2015 3 commits
  9. 12 Oct, 2015 2 commits
    • Chao Yu's avatar
      f2fs: support lower priority asynchronous readahead in ra_meta_pages · 26879fb1
      Chao Yu authored
      
      
      Now, we use ra_meta_pages to reads continuous physical blocks as much as
      possible to improve performance of following reads. However, ra_meta_pages
      uses a synchronous readahead approach by submitting bio with READ, as READ
      is with high priority, it can not be used in the case of preloading blocks,
      and it's not sure when these RAed pages will be used.
      
      This patch supports asynchronous readahead in ra_meta_pages by tagging bio
      with READA flag in order to allow preloading.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      26879fb1
    • Chao Yu's avatar
      f2fs: don't tag REQ_META for temporary non-meta pages · 2b947003
      Chao Yu authored
      
      
      In recovery or checkpoint flow, we grab pages temperarily in meta inode's
      mapping for caching temperary data, actually, datas in these pages were
      not meta data of f2fs, but still we tag them with REQ_META flag. However,
      lower device like eMMC may do some optimization for data of such type.
      So in order to avoid wrong optimization, we'd better remove such flag
      for temperary non-meta pages.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      2b947003
  10. 09 Oct, 2015 3 commits
    • Jaegeuk Kim's avatar
      f2fs: merge meta writes as many possible · 6066d8cd
      Jaegeuk Kim authored
      
      
      This patch tries to merge IOs as many as possible when background flusher
      conducts flushing the dirty meta pages.
      
      [Before]
      
      ...
      2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124320, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124560, size = 32768
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95720, size = 987136
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123928, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123944, size = 8192
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123968, size = 45056
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124064, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 97648, size = 1007616
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123776, size = 8192
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123800, size = 32768
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 124624, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 99616, size = 921600
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123608, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123624, size = 77824
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123792, size = 4096
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 123864, size = 32768
      ...
      
      [After]
      
      ...
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 92168, size = 892928
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 93912, size = 753664
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 95384, size = 716800
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 96784, size = 712704
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104160, size = 364544
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 104872, size = 356352
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 105568, size = 278528
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106112, size = 319488
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 106736, size = 258048
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107240, size = 270336
      f2fs_submit_write_bio: dev = (8,18), WRITE_SYNC(MP), META, sector = 107768, size = 180224
      ...
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      6066d8cd
    • Jaegeuk Kim's avatar
      f2fs: introduce a periodic checkpoint flow · 60b99b48
      Jaegeuk Kim authored
      
      
      This patch introduces a periodic checkpoint feature.
      Note that, this is not enforcing to conduct checkpoints very strictly in terms
      of trigger timing, instead just hope to help user experiences.
      The default value is 60 seconds.
      Reviewed-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      60b99b48
    • Jaegeuk Kim's avatar
      f2fs: check end_io for metapages before making next checkpoint blocks · a7230d16
      Jaegeuk Kim authored
      
      
      This patch avoids to produce new checkpoint blocks before the previous meta
      pages were written completely.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      a7230d16
  11. 24 Aug, 2015 1 commit
  12. 20 Aug, 2015 1 commit
  13. 14 Aug, 2015 1 commit
    • Chao Yu's avatar
      f2fs: handle error of f2fs_iget correctly · 8c14bfad
      Chao Yu authored
      
      
      In recover_orphan_inode, whenever f2fs_iget fail, we will make kernel panic,
      but it's not reasonable, because f2fs_iget can fail due to a lot of reasons
      including out of memory.
      
      So we change error handling method as below:
      a) when finding no entry for the orphan inode, bug_on for catching bugs;
      b) for other reasons, report it to caller.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      8c14bfad
  14. 05 Aug, 2015 3 commits
  15. 04 Aug, 2015 2 commits
    • Chao Yu's avatar
      f2fs: cleanup write_orphan_inodes · bd936f84
      Chao Yu authored
      Previously, since 'commit 4531929e ("f2fs: move grabing orphan
      pages out of protection region")' was committed, in write_orphan_inodes(),
      we will grab all meta page in a batch before we use them under spinlock,
      so that we can avoid large time delay of grabbing meta pages under
      spinlock.
      
      Now, 'commit d6c67a4f
      
       ("f2fs: revmove spin_lock for
      write_orphan_inodes")' remove the spinlock in write_orphan_inodes,
      so there is no issue we describe above, we'd better recover to move
      the grab operation to original place for readability.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      bd936f84
    • Chao Yu's avatar
      f2fs: fix to record dirty page count for symlink · 5ac9f36f
      Chao Yu authored
      
      
      Dirty page can be exist in mapping of newly created symlink, but previously
      we did not maintain the counting of dirty page for symlink like we maintained
      for regular/directory, so the counting we lookuped should be wrong.
      
      This patch adds missed dirty page counting for symlink to fix this issue.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      5ac9f36f
  16. 01 Jun, 2015 1 commit
  17. 28 May, 2015 3 commits