1. 23 Feb, 2016 40 commits
    • f2fs: move sanity checking of cp into get_valid_checkpoint · 984ec63c
      Shawn Lin authored

      From its name, get_valid_checkpoint seems to return a valid cp, or
      NULL for the caller to check. If no valid cp is found,
      f2fs_fill_super prints the err log. But if get_valid_checkpoint
      returns a cp that looks valid yet fails the later sanity check,
      f2fs_fill_super prints another, similar err log. That seems strange.
      Let's keep the sanity checking inside the procedure of getting a valid
      cp. Another improvement gained from this move is that, even when large
      volumes are supported, we check the cp in advance and skip the
      following procedure if the sanity check fails.
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: slightly reorganize read_raw_super_block · 2b39e907
      Shawn Lin authored

      read_raw_super_block was introduced to help find the
      first valid superblock. Commit da554e48 ("f2fs:
      recovering broken superblock during mount") changed the
      behaviour to read both of them and check whether the
      recovery flag is needed. So the comment before this
      function is inconsistent with what it actually does.
      Also, the original code uses two labels to handle the err
      cases, which isn't very readable. So this patch amends
      the comment and slightly reorganizes the function.
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: reorder nat cache lock in cache_nat_entry · 1515aef0
      Chao Yu authored

      When looking up a nat entry in cache_nat_entry, if we miss the nat cache,
      we try to load nat entries a) from the journal of the current segment
      cache or b) from NAT pages for updating. During this process, the write
      lock of nat_tree_lock is held to avoid inconsistency between the nid
      cache and the nat cache caused by races among the nat entry shrinker,
      the checkpointer and the nat entry updater.

      But this approach is inefficient when updating the nat cache, because it
      serializes accesses to the journal cache and reads of NAT pages.

      Here, we reorder the lock and update flow as below to improve access
      concurrency:
      
       - get_node_info
        - down_read(nat_tree_lock)
        - lookup nat cache --- hit -> unlock & return
        - lookup journal cache --- hit -> unlock & goto update
        - up_read(nat_tree_lock)
      update:
        - down_write(nat_tree_lock)
        - cache_nat_entry
         - lookup nat cache --- nohit -> update
        - up_write(nat_tree_lock)
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: split journal cache from curseg cache · b7ad7512
      Chao Yu authored

      In curseg cache, f2fs caches two different parts:
       - data of the current summary block, i.e. summary entries, footer info.
       - journal info, i.e. sparse nat/sit entries or io stat info.

      With this approach, 1) it may cause higher lock contention when we access
      or update both parts of the cache, since we use the same mutex lock
      curseg_mutex to protect the cache. 2) the current summary block with the
      last journal info is written back to the device as a normal summary block
      when flushing. However, we treat journal info as valid only in the current
      summary, so most normal summary blocks contain junk journal data, which
      wastes the remaining space of the summary block.

      So, in order to fix the above issues, we split curseg cache into two parts:
      a) current summary block, protected by original mutex lock curseg_mutex
      b) journal cache, protected by newly introduced r/w semaphore journal_rwsem

      When loading curseg cache during ->mount, we store summary info and
      journal info into different caches; when doing checkpoint, we combine
      the data of the two caches into the current summary block for persisting.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: enhance IO path with block plug · e9f5b8b8
      Chao Yu authored

      Use block plug in more places, listed below, to let the process cache
      as many bios as possible and so reduce the lock overhead of the queue
      in the IO scheduler.
      1) sync_meta_pages
      2) ra_meta_pages
      3) f2fs_balance_fs_bg
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce f2fs_journal struct to wrap journal info · dfc08a12
      Chao Yu authored

      Introduce a new structure f2fs_journal to wrap journal info in struct
      f2fs_summary_block for readability.
      
      struct f2fs_journal {
      	union {
      		__le16 n_nats;
      		__le16 n_sits;
      	};
      	union {
      		struct nat_journal nat_j;
      		struct sit_journal sit_j;
      		struct f2fs_extra_info info;
      	};
      } __packed;
      
      struct f2fs_summary_block {
      	struct f2fs_summary entries[ENTRIES_IN_SUM];
      	struct f2fs_journal journal;
      	struct summary_footer footer;
      } __packed;
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: avoid unneeded memory allocation when {en/de}crypting symlink · 922ec355
      Chao Yu authored

      This patch aligns f2fs with the ext4 code: it removes unneeded memory
      allocation in the symlink creation and access paths.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: handle unexpected lack of encryption keys · ae108668
      Chao Yu authored

      This patch syncs f2fs with commit abdd438b ("ext4 crypto: handle
      unexpected lack of encryption keys") from ext4.

      Fix up attempts by users to try to write to a file when they don't
      have access to the encryption key.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: make sure the encryption info is initialized on opendir(2) · ed3360ab
      Chao Yu authored

      This patch syncs f2fs with commit 6bc445e0 ("ext4 crypto: make
      sure the encryption info is initialized on opendir(2)") from ext4.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: support revoking atomic written pages · 28bc106b
      Chao Yu authored

      f2fs supports atomic write with the following semantics:
      1. open db file
      2. ioctl start atomic write
      3. (write db file) * n
      4. ioctl commit atomic write
      5. close db file

      With this flow we can avoid the file becoming corrupted on an abnormal
      power cut, because we hold the transaction's data in referenced pages
      linked in the inmem_pages list of the inode without setting them dirty,
      so the data won't be persisted unless we commit it in step 4.

      But we should still hold the journal db file in memory by using volatile
      write, because our 'atomic write support' semantics are incomplete: in
      step 4 we could fail to submit all dirty data of the transaction, and
      once part of the dirty data has been committed to storage, a checkpoint
      followed by an abnormal power cut will leave the db file corrupted forever.

      So this patch improves the atomic write flow by adding a revoking flow:
      once an internal error occurs during committing, it gives us another
      chance to revoke the partially submitted data of the current transaction,
      which makes the committing operation more like an atomic one.

      If we're not lucky and the revoking operation fails, EAGAIN is reported
      to the user, suggesting either recovery with the held journal file or
      retrying the current transaction.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
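The commit-then-revoke idea can be illustrated with a small user-space model. This is a sketch under assumptions, not the kernel implementation: storage is a plain array, and `struct inmem_page`, `commit_atomic`, the `fail_at` fault-injection parameter and the `revoke_ok` switch are all hypothetical names introduced here. It shows the three outcomes the message describes: a clean commit, a failed commit that is fully revoked, and a failed revoke reported as EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_BLOCKS 4	/* toy file size in blocks, an assumption */

struct inmem_page {
	size_t index;	/* block index inside the toy file */
	int new_data;	/* staged contents of the transaction */
	int old_data;	/* previous contents, saved for revoking */
	int written;	/* set once the page has reached storage */
};

/*
 * Try to commit n staged pages. fail_at injects an I/O error on the
 * fail_at-th write (negative: never fail); revoke_ok == 0 simulates a
 * revoke that itself fails. Returns 0, a negative errno from the write,
 * or -EAGAIN when the partial commit could not be rolled back.
 */
int commit_atomic(int *storage, struct inmem_page *pages, size_t n,
		  int fail_at, int revoke_ok)
{
	size_t i;
	int err = 0;

	for (i = 0; i < n; i++) {
		assert(pages[i].index < NR_BLOCKS);
		pages[i].old_data = storage[pages[i].index];
		if (fail_at >= 0 && i == (size_t)fail_at) {
			err = -EIO;
			break;
		}
		storage[pages[i].index] = pages[i].new_data;
		pages[i].written = 1;
	}
	if (!err)
		return 0;

	if (!revoke_ok)
		return -EAGAIN;	/* caller must recover from its journal or retry */

	/* Revoke: restore already written pages, newest first. */
	while (i-- > 0) {
		if (pages[i].written) {
			storage[pages[i].index] = pages[i].old_data;
			pages[i].written = 0;
		}
	}
	return err;
}
```

In this model a caller that sees -EAGAIN knows storage may hold a partial transaction, which is why the commit message recommends keeping the journal file available.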
    • f2fs: split drop_inmem_pages from commit_inmem_pages · 29b96b54
      Chao Yu authored

      Split drop_inmem_pages from commit_inmem_pages for code readability,
      and prepare for the following modification.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: avoid garbage lengths in dentries · 7d9dfa1d
      Jaegeuk Kim authored

      This patch eliminates garbage name lengths in dentries so that readdir
      returns correct results.

      For example, if a valid dentry consists of:
       bitmap : 1   1 1 1
       len    : 32  0 x 0,

      readdir can start at the second bit_pos, which has len = 0.
      Or it can start at the third bit_pos, which has garbage.

      In both cases, we should avoid trying to fill dentries.
      So this patch not only removes any garbage length, but also avoids
      entering the zero-length case in readdir.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
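A toy model of the slot walk may make the example concrete. It is a sketch with assumed names and sizes: `SLOT_LEN`, `NR_SLOTS`, `slots_for` and `fill_dentries` are illustrative, the bitmap is stored as one byte per slot, and an 8-byte slot is assumed so that a 32-byte name covers four slots as in the example above. Continuation slots carry a zero length, so a walk that starts in the middle of an entry skips them instead of emitting a bogus dentry.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_LEN 8	/* bytes of name per slot; an assumption of this sketch */
#define NR_SLOTS 8	/* slots in the toy dentry block */

/* Number of slots a name of the given length occupies. */
size_t slots_for(size_t name_len)
{
	return (name_len + SLOT_LEN - 1) / SLOT_LEN;
}

/*
 * Walk the block from start_pos and record the slot index of every
 * entry that would be emitted. Returns the number of entries found.
 */
size_t fill_dentries(const unsigned char *bitmap,
		     const unsigned short *name_len,
		     size_t start_pos, size_t *out, size_t max_out)
{
	size_t pos = start_pos;
	size_t n = 0;

	while (pos < NR_SLOTS && n < max_out) {
		if (!bitmap[pos]) {		/* free slot */
			pos++;
			continue;
		}
		if (name_len[pos] == 0) {	/* continuation slot: skip it */
			pos++;
			continue;
		}
		out[n++] = pos;
		pos += slots_for(name_len[pos]);
	}
	return n;
}
```

Starting at slot 1 of a block whose first entry is 32 bytes long, the walk passes over slots 1 to 3 and resumes at the next real entry.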
    • f2fs crypto: sync with ext4's fname padding · a263669f
      Jaegeuk Kim authored

      This patch fixes the incorrect adoption of fname padding.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: use correct errno · 60b286c4
      Jaegeuk Kim authored

      This patch fixes a misused error number.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: add missing locking for keyring_key access · 745e8490
      Jaegeuk Kim authored

      This patch adopts:
      	ext4 crypto: add missing locking for keyring_key access
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: check for too-short encrypted file names · 1dafa51d
      Jaegeuk Kim authored

      This patch adopts:
      	ext4 crypto: check for too-short encrypted file names

      An encrypted file name should never be shorter than 16 bytes, the
      AES block size.  The 3.10 crypto layer will oops and crash the kernel
      if ciphertext shorter than the block size is passed to it.

      Fortunately, in modern kernels the crypto layer will not crash the
      kernel in this scenario, but nevertheless, it represents a corrupted
      directory, and we should detect it and mark the file system as
      corrupted so that e2fsck can fix this.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: f2fs_page_crypto() doesn't need a encryption context · ce855a3b
      Jaegeuk Kim authored

      This patch adopts:
      	ext4 crypto: ext4_page_crypto() doesn't need a encryption context

      Since ext4_page_crypto() doesn't need an encryption context (at least
      not any more), this allows us to simplify a number of function signatures
      and also allows us to avoid needing to allocate a context in
      ext4_block_write_begin().  It also means we no longer need a separate
      ext4_decrypt_one() function.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: fix spelling typo in comment · 0fac2d50
      Jaegeuk Kim authored

      This patch adopts:
      	ext4 crypto: fix spelling typo in comment
      Signed-off-by: Laurent Navet <laurent.navet@gmail.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs crypto: replace some BUG_ON()'s with error checks · 66aa3e12
      Jaegeuk Kim authored

      This patch adopts:
      	ext4 crypto: replace some BUG_ON()'s with error checks
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: increase i_size to avoid missing data · 8ef2af45
      Jaegeuk Kim authored

      When finsert runs with dirty pages, we should increase i_size right away.
      Otherwise, the moved page can be dropped by the following
      filemap_write_and_wait_range before i_size is updated.
      Specifically, this happens through
      	if ((page->index >= end_index + 1) || !offset)
      		goto out;
      in f2fs_write_data_page.

      This should resolve the below xfstests/091 failure reported by Dave.
      
      $ diff -u tests/generic/091.out /home/dave/src/xfstests-dev/results//f2fs/generic/091.out.bad
      --- tests/generic/091.out       2014-01-20 16:57:33.000000000 +1100
      +++ /home/dave/src/xfstests-dev/results//f2fs/generic/091.out.bad       2016-02-08 15:21:02.701375087 +1100
      @@ -1,7 +1,18 @@
       QA output created by 091
       fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
      -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
      -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
      -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
      -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
      -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
      +mapped writes DISABLED
      +skipping insert range behind EOF
      +skipping insert range behind EOF
      +truncating to largest ever: 0x11e00
      +dowrite: write: Invalid argument
      +LOG DUMP (7 total operations):
      +1(  1 mod 256): SKIPPED (no operation)
      +2(  2 mod 256): SKIPPED (no operation)
      +3(  3 mod 256): FALLOC   0x2e0f2 thru 0x3134a  (0x3258 bytes) PAST_EOF
      +4(  4 mod 256): SKIPPED (no operation)
      +5(  5 mod 256): SKIPPED (no operation)
      +6(  6 mod 256): TRUNCATE UP    from 0x0 to 0x11e00
      +7(  7 mod 256): WRITE    0x73400 thru 0x79fff  (0x6c00 bytes) HOLE
      +Log of operations saved to "/mnt/test/junk.fsxops"; replay with --replay-ops
      +Correct content saved for comparison
      +(maybe hexdump "/mnt/test/junk" vs "/mnt/test/junk.fsxgood")
      Reported-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: preallocate blocks for buffered aio writes · 24b84912
      Jaegeuk Kim authored

      This patch preallocates data blocks for buffered aio writes.
      With this patch, we can avoid redundant locking and unlocking of node pages
      for consecutive aio requests.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move dio preallocation into f2fs_file_write_iter · b439b103
      Jaegeuk Kim authored

      This patch moves preallocation code for direct IOs into f2fs_file_write_iter.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix missing skip pages info · d31c7c3f
      Yunlei He authored

      Fix the missing skipped-pages info in the f2fs_writepages trace event.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce f2fs_submit_merged_bio_cond · 0c3a5797
      Chao Yu authored

      f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache as
      many writes to contiguous block addresses as possible. After submission,
      these writes may still be cached in the bio buffer, so we have to flush
      them by calling f2fs_submit_merged_bio.

      Unfortunately, under high concurrency, the bio buffer could be flushed
      by someone else before we submit it, for the following reasons:
      a) there is no space left in the bio buffer.
      b) a request of a different type (SYNC, ASYNC) is added.
      c) a discontinuous block address is added.

      In this situation, calling f2fs_submit_merged_bio is harmful, because
      it could break the subsequent merging of writes in the bio buffer and
      split one big bio into two smaller ones.

      This patch introduces f2fs_submit_merged_bio_cond, which submits the bio
      buffer conditionally. Before submitting, it checks whether:
       - a page in the DATA type bio buffer matches the specified page;
       - a page in the DATA type bio buffer belongs to the specified inode;
       - a page in the NODE type bio buffer belongs to the specified inode;
      If there is no eligible page in the bio buffer, we skip the submitting
      step, which gives us more chances to merge consecutive block IOs in the
      bio cache.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
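The conditional submit can be pictured with a small model. Everything in this sketch is an assumption made for illustration: `struct bio_buffer`, `has_merged_page`, `submit_merged_bio` and `submit_merged_bio_cond` are simplified stand-ins with invented signatures, and a page is reduced to an inode number plus an index. The model flushes the buffer only when it holds the page being asked about, or a page of the given inode when no page is specified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIO_MAX 4	/* toy capacity of the bio buffer */

struct page {
	unsigned long ino;	/* owning inode number */
	unsigned long index;	/* page index within the file */
};

struct bio_buffer {
	const struct page *pages[BIO_MAX];	/* pages merged so far */
	size_t nr;				/* number of merged pages */
	unsigned int submits;			/* how many flushes happened */
};

/* Does the buffer hold the given page, or (if page is NULL) any page of ino? */
bool has_merged_page(const struct bio_buffer *b, unsigned long ino,
		     const struct page *page)
{
	for (size_t i = 0; i < b->nr; i++) {
		if (page && b->pages[i] == page)
			return true;
		if (!page && b->pages[i]->ino == ino)
			return true;
	}
	return false;
}

/* Unconditional flush: always ends the current merge window. */
void submit_merged_bio(struct bio_buffer *b)
{
	if (b->nr) {
		b->nr = 0;
		b->submits++;
	}
}

/* Conditional flush: leave the buffer alone unless it holds our data. */
bool submit_merged_bio_cond(struct bio_buffer *b, unsigned long ino,
			    const struct page *page)
{
	if (!has_merged_page(b, ino, page))
		return false;
	submit_merged_bio(b);
	return true;
}
```

A caller whose page was already flushed by someone else leaves the buffer untouched, so unrelated writes keep merging.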
    • f2fs: fix conflict on page->private usage · d48dfc20
      Jaegeuk Kim authored

      This patch fixes a conflict over the page->private value between
      f2fs_trace_pid and atomic pages.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: flush bios to handle cp_error in put_super · 17c19120
      Jaegeuk Kim authored

      Sometimes, if cp_error is set, pages remain under writeback, resulting in
      a kernel hang in put_super.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: wait on page's writeback in writepages path · fa3d2bdf
      Jaegeuk Kim authored

      As in f2fs_write_cache_pages, let's wait on writeback for node and meta
      pages too. In particular, for node blocks, we should do this before
      marking their fsync and dentry flags.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • Sheng Yong's avatar
      479c8bc4
    • f2fs: speed up handling holes in fiemap · da85985c
      Chao Yu authored

      This patch makes f2fs_map_blocks return the next potential page
      offset, skipping hole regions in the inode's indirect tree, and
      uses it to speed up fiemap when handling big holes.
      
      Test method:
      xfs_io -f /mnt/f2fs/file  -c "pwrite 1099511627776 4096"
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      
      Before:
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      /mnt/f2fs/file:
       EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
         0: [0..2147483647]:         hole             2147483648
         1: [2147483648..2147483655]: 81920..81927         8   0x1
      
      real    3m3.518s
      user    0m0.000s
      sys     3m3.456s
      
      After:
      time xfs_io -f /mnt/f2fs/file -c "fiemap -v"
      /mnt/f2fs/file:
       EXT: FILE-OFFSET              BLOCK-RANGE      TOTAL FLAGS
         0: [0..2147483647]:         hole             2147483648
         1: [2147483648..2147483655]: 81920..81927         8   0x1
      
      real    0m0.008s
      user    0m0.000s
      sys     0m0.008s
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce get_next_page_offset to speed up SEEK_DATA · 3cf45747
      Chao Yu authored

      When seeking data in ->llseek, if we encounter a big hole that covers
      several dnode pages, we continue seeking from the index of the first
      page of the next dnode page, so at most we skip searching
      (ADDRS_PER_BLOCK - 1) pages.

      However, this is still not efficient: if an indirect or double-indirect
      pointer is NULL, there is no dnode page in the subtree that the pointer
      points to, so it is unnecessary to search that whole region.

      This patch introduces get_next_page_offset to calculate the next page
      offset based on the current searching level and the max searching level
      returned from get_dnode_of_data. With this, we can skip searching the
      entire area whose indirect or double-indirect node block does not exist.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
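The skip arithmetic can be sketched as rounding up to the next boundary of the missing subtree. This is a simplified illustration, not the patch itself: `next_page_offset` and its parameters are hypothetical, `base` stands for the first page offset covered by the tree region being searched, and the two constants are illustrative values chosen for the example. The further above the leaf level the search stopped, the larger the unit that can be skipped.

```c
#include <assert.h>

/* Illustrative fan-out values; assumptions of this sketch. */
#define ADDRS_PER_BLOCK 1018UL	/* data pointers per direct node */
#define NIDS_PER_BLOCK  1018UL	/* node pointers per indirect node */

/*
 * pgofs:     page offset where the lookup found nothing
 * base:      first page offset covered by this region of the tree
 * cur_level: level at which the lookup stopped (1 = direct node)
 * max_level: depth needed to reach a data pointer for pgofs
 *            (0 means the pointer lives in the inode itself)
 */
unsigned long next_page_offset(unsigned long pgofs, unsigned long base,
			       int cur_level, int max_level)
{
	unsigned long skipped_unit = ADDRS_PER_BLOCK;

	if (max_level == 0)
		return pgofs + 1;

	assert(pgofs >= base);
	assert(cur_level >= 1 && cur_level <= max_level);

	/* Each missing level multiplies the hole size by the node fan-out. */
	while (max_level-- > cur_level)
		skipped_unit *= NIDS_PER_BLOCK;

	/* Round up to the start of the next unit, relative to base. */
	return ((pgofs - base) / skipped_unit + 1) * skipped_unit + base;
}
```

With a missing direct node the search jumps by one node's worth of pages; with a missing indirect node it jumps by the whole subtree below it.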
    • f2fs: remove unneeded pointer conversion · 81ca7350
      Chao Yu authored

      There are redundant pointer conversions in the following call stack:
       - at position a, inode is converted to f2fs_inode_info.
       - at position b, f2fs_inode_info is converted back to inode.
      
       - truncate_blocks(inode,..)
        - fi = F2FS_I(inode)		---a
        - ADDRS_PER_PAGE(node_page, fi)
         - addrs_per_inode(fi)
          - inode = &fi->vfs_inode	---b
          - f2fs_has_inline_xattr(inode)
           - fi = F2FS_I(inode)
           - is_inode_flag_set(fi,..)
      
      In order to avoid the unneeded conversion, alter ADDRS_PER_PAGE and
      addrs_per_inode to accept a parameter of type inode pointer.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
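The round trip between the two pointer types can be shown in user space. The structures below are cut-down stand-ins, and the flag bit and the two address counts are illustrative assumptions; only the embedding of `vfs_inode` inside the filesystem-private structure mirrors what the call stack above relies on. Passing the inode pointer straight through means the conversion happens once, at the point where the private fields are actually needed.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures. */
struct inode {
	unsigned long i_ino;
};

struct f2fs_inode_info {
	unsigned long flags;		/* filesystem-private state */
	struct inode vfs_inode;		/* embedded generic inode */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct f2fs_inode_info *F2FS_I(struct inode *inode)
{
	return container_of(inode, struct f2fs_inode_info, vfs_inode);
}

#define FI_INLINE_XATTR		0x1UL	/* hypothetical flag bit */
#define ADDRS_IN_INODE		923U	/* illustrative value */
#define INLINE_XATTR_ADDRS	50U	/* illustrative value */

/* Takes the inode directly, so callers never convert back and forth. */
unsigned int addrs_per_inode(struct inode *inode)
{
	if (F2FS_I(inode)->flags & FI_INLINE_XATTR)
		return ADDRS_IN_INODE - INLINE_XATTR_ADDRS;
	return ADDRS_IN_INODE;
}
```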
    • f2fs: simplify __allocate_data_blocks · 5b8db7fa
      Chao Yu authored

      This patch uses the existing function f2fs_map_blocks to simplify the
      implementation of __allocate_data_blocks.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: simplify f2fs_map_blocks · 4fe71e88
      Chao Yu authored

      In f2fs_map_blocks, we use duplicated code to handle the mapping of the
      first block and of the following blocks, which is unnecessary. This patch
      simplifies f2fs_map_blocks to avoid the copied code.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: introduce lifetime write IO statistics · 8f1dbbbb
      Shuoran Liu authored

      This patch introduces lifetime write IO statistics exposed through the
      sysfs interface. The write IO amount is obtained from the block layer,
      accumulated in the file system and stored in the hot node summary of the
      checkpoint.
      Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
      Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
      [Jaegeuk Kim: add sysfs documentation]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: give scheduling point in shrinking path · 6fe2bc95
      Jaegeuk Kim authored

      We need to give the task a chance to be rescheduled while shrinking slab
      entries.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: improve shrink performance of extent nodes · 201ef5e0
      Hou Pengyang authored

      In the worst case, we need to scan the whole radix tree and every rb-tree
      to free the victim extent_nodes when shrinking.

      Pengyang initially introduced a victim_list to record the victim
      extent_nodes and free them by just scanning a list.

      Later, Chao Yu enhanced the original patch to reduce the memory footprint
      by removing the victim list.

      The policy of lru list shrinking becomes:
      1) lock lru list's lock
      2) trylock extent tree's lock
      3) remove extent node from lru list
      4) unlock lru list's lock
      5) do shrink
      6) repeat 1) to 5)
      Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
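The six-step policy can be modelled with POSIX mutexes. The sketch below makes several assumptions: the lru list is a singly linked list, "do shrink" only decrements a counter, and when the trylock fails the loop simply stops, which is a simplification chosen for this example rather than something the message specifies. `struct lru` and `shrink_lru` are hypothetical names. What it shows is that the tree lock is only ever tried, never waited for, while the list lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct extent_tree {
	pthread_mutex_t lock;		/* stands in for the extent tree's lock */
	unsigned int node_cnt;		/* nodes still attached to this tree */
};

struct extent_node {
	struct extent_tree *et;		/* tree the node belongs to */
	struct extent_node *next;	/* next node on the lru list */
};

struct lru {
	pthread_mutex_t lock;		/* stands in for the lru list's lock */
	struct extent_node *head;	/* least recently used node first */
};

/* Shrink up to nr_shrink nodes; returns how many were released. */
unsigned int shrink_lru(struct lru *lru, unsigned int nr_shrink)
{
	unsigned int freed = 0;

	while (freed < nr_shrink) {
		struct extent_node *en;

		pthread_mutex_lock(&lru->lock);			/* step 1 */
		en = lru->head;
		if (!en) {
			pthread_mutex_unlock(&lru->lock);
			break;
		}
		if (pthread_mutex_trylock(&en->et->lock) != 0) {	/* step 2 */
			/* Tree is busy: never block while holding the list lock. */
			pthread_mutex_unlock(&lru->lock);
			break;
		}
		lru->head = en->next;				/* step 3 */
		en->next = NULL;
		pthread_mutex_unlock(&lru->lock);		/* step 4 */

		en->et->node_cnt--;				/* step 5 */
		pthread_mutex_unlock(&en->et->lock);
		freed++;
	}							/* step 6 */
	return freed;
}
```

Because only one node is detached per pass under the list lock, no separate victim list is needed to remember what to free.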
    • f2fs: don't set cached_en if it will be freed · 42926744
      Jaegeuk Kim authored

      If en has an empty list pointer, it will be freed soon, so we don't need
      to set cached_en to it.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: move extent_node list operations being coupled with rbtree operation · 43a2fa18
      Jaegeuk Kim authored

      This patch moves extent_node list operations to be handled together with
      its rbtree operations.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: reconstruct the code to free an extent_node · a03f01f2
      Hou Pengyang authored

      There are three steps to free an extent node:
      1) list_del_init, 2) __detach_extent_node, 3) kmem_cache_free

      In path f2fs_destroy_extent_tree, the order is 1->2->3,
      but in path f2fs_update_extent_tree_range, it is 2->1->3.

      This patch unifies the order to 1->2->3 in all paths.
      It makes sense because the next patch introduces a victim list in the
      path shrink_extent_tree, where we can check whether the extent_node is in
      the victim list by calling list_empty(). So it is necessary to put 1) first.
      Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
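Why step 1 makes the list_empty() check possible can be seen with a minimal re-implementation of the list helpers. The helpers below follow the usual circular doubly linked list semantics but are written from scratch for this sketch; `release_extent_node` and the two boolean fields are hypothetical stand-ins for the detach and free steps. After list_del_init the node points at itself, so list_empty() on the node reports that it is on no list.

```c
#include <assert.h>
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry and leave it pointing at itself. */
void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

struct extent_node {
	struct list_head list;
	bool attached;	/* stands in for membership in the rb-tree */
	bool freed;	/* stands in for kmem_cache_free */
};

/* Release in the unified order 1 -> 2 -> 3. */
void release_extent_node(struct extent_node *en)
{
	list_del_init(&en->list);	/* 1) off the list first */
	en->attached = false;		/* 2) detach from the tree */
	en->freed = true;		/* 3) free the object */
}
```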
    • f2fs: use wq_has_sleeper for cp_wait wait_queue · 7c506896
      Jaegeuk Kim authored

      We need to use wq_has_sleeper, which includes smp_mb, to handle cp_wait
      concurrency.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>