1. 04 Dec, 2015 1 commit
  2. 20 Oct, 2015 2 commits
  3. 13 Oct, 2015 1 commit
    • f2fs crypto: fix racing of accessing encrypted page among different competitors · 08b39fbd
      Chao Yu authored
      
      We use different page caches (normally the inode's page cache for R/W
      and the meta inode's page cache for GC) to cache the same physical block
      that belongs to an encrypted inode. Writeback of these two page
      caches should be mutually exclusive, but the writeback state is not
      handled well, so the following races are possible:
      
      a)
      kworker:				f2fs_gc:
       - f2fs_write_data_pages
        - f2fs_write_data_page
         - do_write_data_page
          - write_data_page
           - f2fs_submit_page_mbio
      (page#1 in inode's page cache was queued
      in f2fs bio cache and is ready to be
      written to new blkaddr)
      					 - gc_data_segment
      					  - move_encrypted_block
      					   - pagecache_get_page
      					(page#2 in meta inode's page cache
      					was cached with the invalid data
      					of the physical block located at the
      					new blkaddr)
      					   - f2fs_submit_page_mbio
      					(page#1 was submitted, later, page#2
      					with invalid data will be submitted)
      
      b)
      f2fs_gc:
       - gc_data_segment
        - move_encrypted_block
         - f2fs_submit_page_mbio
      (page#1 in meta inode's page cache was
      queued in f2fs bio cache and is ready
      to be written to new blkaddr)
      					user thread:
      					 - f2fs_write_begin
      					  - f2fs_submit_page_bio
      					(we submit the request to block layer
      					to update page#2 in inode's page cache
      					with physical block located in new
      					blkaddr, so here we may read garbage
      					data from new blkaddr since GC hasn't
      					written back page#1 yet)
      
      This patch fixes the above potential races for encrypted inodes.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      08b39fbd
  4. 12 Oct, 2015 3 commits
  5. 09 Oct, 2015 5 commits
    • f2fs: do not skip dentry block writes · 90b803e6
      Jaegeuk Kim authored
      
      
      Previously, we skipped dentry block writes when wbc is SYNC_NONE, there is
      no memory pressure, and the number of dirty pages is small.

      But we did not skip normal data writes, so skipping dentry block writes had
      little impact on overall performance.
      Moreover, when many dir inodes have only one dentry block, skipping these
      writes makes kworker fall into an infinite loop trying to write blocks.

      So this patch removes the skipping of data writes.
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      90b803e6
    • f2fs: use correct flag in f2fs_map_blocks() · 46c9e141
      Chao Yu authored
      We introduced F2FS_GET_BLOCK_READ in commit e2b4e2bc ("f2fs: fix
      incorrect mapping for bmap"), but forgot to use this flag in the right
      place; fix it.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      46c9e141
    • f2fs: fix to handle io error in ->direct_IO · f9811703
      Chao Yu authored
      The following oops was reported when testing generic/019 of
      xfstests:
      
       ------------[ cut here ]------------
       kernel BUG at /home/yuchao/git/f2fs-dev/segment.c:882!
       invalid opcode: 0000 [#1] SMP
       Modules linked in: zram lz4_compress lz4_decompress f2fs(O) ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4
      nf_def
       CPU: 2 PID: 25441 Comm: fio Tainted: G           O    4.3.0-rc1+ #6
      
      
       Hardware name: Hewlett-Packard HP Z220 CMT Workstation/1790, BIOS K51 v01.61 05/16/2013
       task: ffff8803f4e85580 ti: ffff8803fd61c000 task.ti: ffff8803fd61c000
       RIP: 0010:[<ffffffffa0784981>]  [<ffffffffa0784981>] new_curseg+0x321/0x330 [f2fs]
       RSP: 0018:ffff8803fd61f918  EFLAGS: 00010246
       RAX: 00000000000007ed RBX: 0000000000000224 RCX: 000000000000001f
       RDX: 0000000000000800 RSI: ffffffffffffffff RDI: ffff8803f56f4300
       RBP: ffff8803fd61f978 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000024 R11: ffff8800d23bbd78 R12: ffff8800d0ef0000
       R13: 0000000000000224 R14: 0000000000000000 R15: 0000000000000001
       FS:  00007f827ff85700(0000) GS:ffff88041ea80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffffff600000 CR3: 00000003fef17000 CR4: 00000000001406e0
       Stack:
        000007ea00000002 0000000100000001 ffff8803f6456248 000007ed0000002b
        0000000000000224 ffff880404d1aa20 ffff8803fd61f9c8 ffff8800d0ef0000
        ffff8803f6456248 0000000000000001 00000000ffffffff ffffffffa078f358
       Call Trace:
        [<ffffffffa0785b87>] allocate_segment_by_default+0x1a7/0x1f0 [f2fs]
        [<ffffffffa078322c>] allocate_data_block+0x17c/0x360 [f2fs]
        [<ffffffffa0779521>] __allocate_data_block+0x131/0x1d0 [f2fs]
        [<ffffffffa077a995>] f2fs_direct_IO+0x4b5/0x580 [f2fs]
        [<ffffffff811510ae>] generic_file_direct_write+0xae/0x160
        [<ffffffff811518f5>] __generic_file_write_iter+0xd5/0x1f0
        [<ffffffff81151e07>] generic_file_write_iter+0xf7/0x200
        [<ffffffff81319e38>] ? apparmor_file_permission+0x18/0x20
        [<ffffffffa0768480>] ? f2fs_fallocate+0x1190/0x1190 [f2fs]
        [<ffffffffa07684c6>] f2fs_file_write_iter+0x46/0x90 [f2fs]
        [<ffffffff8120b4fe>] aio_run_iocb+0x1ee/0x290
        [<ffffffff81700f7e>] ? mutex_lock+0x1e/0x50
        [<ffffffff8120a1d7>] ? aio_read_events+0x207/0x2b0
        [<ffffffff8120b913>] do_io_submit+0x373/0x630
        [<ffffffff8120a4f6>] ? SyS_io_getevents+0x56/0xb0
        [<ffffffff8120bbe0>] SyS_io_submit+0x10/0x20
        [<ffffffff81703857>] entry_SYSCALL_64_fastpath+0x12/0x6a
       Code: 45 c8 48 8b 78 10 e8 9f 23 bf e0 41 8b 8c 24 cc 03 00 00 89 c7 31 d2 89 c6 89 d8 29 df f7 f1 29 d1 39 cf 0f 83 be fd ff ff eb
       RIP  [<ffffffffa0784981>] new_curseg+0x321/0x330 [f2fs]
        RSP <ffff8803fd61f918>
       ---[ end trace 2e577d7f711ddb86 ]---
      
      The reason is that generic/019 triggers an artificial IO error in the
      block layer through debugfs. After that, prefree segments are no longer
      freed, because we always skip gc and checkpoint once an IO error has
      occurred.

      Meanwhile, fio with the aio engine generates a large number of direct IOs,
      which keep allocating space in free segments until we run out of them.
      Eventually this results in a panic in new_curseg, as no free segment can
      be found.

      So this patch changes direct_IO to return EIO for this condition.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f9811703
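The failure mode described in this commit can be modelled in user space. The following is a minimal sketch, not the actual patch: `fs_sketch`, `alloc_block_sketch` and `direct_io_sketch` are hypothetical names, and treating every allocation as one unit of free space is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the allocator state; not the f2fs_sb_info layout. */
struct fs_sketch {
	bool cp_error;          /* set once an IO error stopped gc/checkpoint */
	unsigned int free_units; /* free space left, one unit per allocation */
};

/* Refuse to allocate once the error flag is set: prefree space is no
 * longer reclaimed, so the free pool could only shrink. */
static int alloc_block_sketch(struct fs_sketch *fs)
{
	assert(fs != NULL);
	if (fs->cp_error)
		return -EIO;
	if (fs->free_units == 0)
		return -ENOSPC;
	fs->free_units--;
	return 0;
}

/* Direct IO entry point: hand the allocation error back to the caller
 * instead of allocating until nothing is left. */
static int direct_io_sketch(struct fs_sketch *fs, unsigned int nr_blocks)
{
	for (unsigned int i = 0; i < nr_blocks; i++) {
		int err = alloc_block_sketch(fs);

		if (err)
			return err;
	}
	return 0;
}
```

With the error flag set, `direct_io_sketch` returns `-EIO` and leaves the free pool untouched, which is the behaviour the commit message asks for.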
    • f2fs: reorganize f2fs_map_blocks · 973163fc
      Chao Yu authored
      
      
      In this patch, we reorganize f2fs_map_blocks to make the block mapping
      flow clearer by using the following structure:
      
      /* check status of mapping */
      
      if (unmapped) {
      	/* blkaddr == NULL_ADDR || blkaddr == NEW_ADDR */
      
      	if (create) {
      		/* write path, handle dio write case here */
      		alloc_and_map;
      	} else {
      		/*
      		 * handle read cases from all call paths:
      		 *     1. generic read;
      		 *     2. dio read;
      		 *     3. fiemap;
      		 *     4. bmap
      		 */
      	}
      }
      
      /* map buffer_header */
      
      Besides, this patch handles a missing case correctly for dio write:
      when __allocate_data_blocks failed, f2fs_map_blocks did not allocate
      blocks correctly for preallocated blocks and returned an unmapped
      buffer head, which made the dio write fail.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      973163fc
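The branch structure outlined in this commit can be written as a small runnable classifier. This is a sketch only: `classify_mapping` and the `enum map_action` values are hypothetical names, and the two constants mirror the f2fs convention that NULL_ADDR is block 0 and NEW_ADDR is all ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch constants following the f2fs convention for special addresses. */
#define NULL_ADDR_SKETCH 0u
#define NEW_ADDR_SKETCH  UINT32_MAX

/* Hypothetical outcome labels for the three branches of the outline. */
enum map_action { MAP_EXISTING, ALLOC_AND_MAP, LEAVE_UNMAPPED };

static enum map_action classify_mapping(uint32_t blkaddr, bool create)
{
	bool unmapped = blkaddr == NULL_ADDR_SKETCH ||
			blkaddr == NEW_ADDR_SKETCH;

	if (!unmapped)
		return MAP_EXISTING;   /* go straight to mapping the buffer head */
	if (create)
		return ALLOC_AND_MAP;  /* write path, including dio write */
	return LEAVE_UNMAPPED;         /* generic read, dio read, fiemap, bmap */
}
```

For instance, `assert(classify_mapping(NEW_ADDR_SKETCH, true) == ALLOC_AND_MAP)` holds, while the same address with `create == false` stays unmapped.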
    • f2fs: fix overflow of size calculation · 9edcdabf
      Chao Yu authored
      
      
      There is a potential overflow when calculating the size of an object: we
      left shift index by PAGE_CACHE_SHIFT bits, and if the type of index is only
      32 bits wide, as on a 32-bit architecture, the left shift overflows,
      e.g.:
      
      pgoff_t index =  0xFFFFFFFF;
      loff_t size = index << PAGE_CACHE_SHIFT;
      size: 0xFFFFF000
      
      So we should cast index to a 64-bit type to avoid this issue.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      9edcdabf
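The truncation in this commit can be reproduced in user space by standing in `uint32_t` for a 32-bit pgoff_t. This is a minimal sketch: the function names are hypothetical, and the shift of 12 assumes 4 KiB pages, which is what the 0xFFFFF000 result in the commit message implies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so the page shift is 12. */
#define PAGE_SHIFT_SKETCH 12

/* Models a 32-bit pgoff_t: the shift is evaluated in 32 bits, so the
 * high bits are lost before the value is widened to 64 bits. */
static uint64_t size_truncated(uint32_t index)
{
	uint32_t shifted = index << PAGE_SHIFT_SKETCH;

	return shifted;
}

/* Widening before the shift keeps every bit of the byte offset. */
static uint64_t size_widened(uint32_t index)
{
	return (uint64_t)index << PAGE_SHIFT_SKETCH;
}
```

With `index = 0xFFFFFFFF`, `size_truncated` yields 0xFFFFF000 as in the commit message, while `size_widened` yields 0xFFFFFFFF000.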
  6. 22 Aug, 2015 1 commit
    • f2fs: fix incorrect mapping for bmap · e2b4e2bc
      Chao Yu authored
      
      
      The test steps are as follows:
      1. touch file
      2. truncate -s $((1024*1024)) file
      3. fallocate -o 0 -l $((1024*1024)) file
      4. fibmap.f2fs file
      
      The fibmap.f2fs result shown below is not correct:
      
      file_pos   start_blk     end_blk        blks
             0    -937166132    -937166132           1
          4096    -937166132    -937166132           1
          8192    -937166132    -937166132           1
         12288    -937166132    -937166132           1
         16384    -937166132    -937166132           1
         20480    -937166132    -937166132           1
      ...
       1040384    -937166132    -937166132           1
       1044480    -937166132    -937166132           1
      
      This is because f2fs_map_blocks returns with no error when it meets
      a hole or a preallocated block, and the caller __get_data_block then maps
      an uninitialized variable value to bh->b_blocknr.

      Unfortunately, generic_block_bmap neither checks the return value of
      get_data() nor checks the mapping info of the buffer_head, so it returns
      a random block address.
      
      After fixing the issue, the result is correct:
      
      file_pos   start_blk     end_blk        blks
             0           0           0         256
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e2b4e2bc
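The uninitialized-value problem in this commit can be illustrated with a small user-space model. It is a sketch under stated assumptions, not the kernel fix: every type and function name is hypothetical, and a table entry of 0 stands for a hole or a preallocated block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures. */
struct map_result_sketch {
	uint32_t pblk;  /* physical block, valid only when mapped is true */
	bool mapped;
};

struct bh_sketch {
	uint64_t b_blocknr;
	bool b_mapped;
};

/* Returns 0 (no error) for a hole and leaves pblk untouched in that
 * case, like the lookup described in the commit message. */
static int map_blocks_sketch(const uint32_t *table, uint32_t lblk,
			     struct map_result_sketch *m)
{
	m->mapped = false;
	if (table[lblk] != 0) {
		m->pblk = table[lblk];
		m->mapped = true;
	}
	return 0;
}

/* Copy the address only when the block is really mapped, so a hole is
 * reported as block 0 instead of whatever pblk happened to contain. */
static int get_block_sketch(const uint32_t *table, uint32_t lblk,
			    struct bh_sketch *bh)
{
	struct map_result_sketch m;
	int err;

	assert(table != NULL && bh != NULL);
	bh->b_blocknr = 0;
	bh->b_mapped = false;

	err = map_blocks_sketch(table, lblk, &m);
	if (err)
		return err;
	if (m.mapped) {
		bh->b_blocknr = m.pblk;
		bh->b_mapped = true;
	}
	return 0;
}
```

The key design choice is that the caller initializes its result and checks the mapped flag, rather than trusting a zero return value to mean that an address was filled in.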
  7. 20 Aug, 2015 1 commit
    • f2fs: handle failed bio allocation · 740432f8
      Jaegeuk Kim authored
      
      
      As the comment of bio_alloc_bioset quoted below explains, allocation is only
      guaranteed when one bio is allocated at a time. f2fs can allocate multiple
      bios at the same time, so we can't guarantee that a bio is always allocated.
      
      "
       *   When @bs is not NULL, if %__GFP_WAIT is set then bio_alloc will always be
       *   able to allocate a bio. This is due to the mempool guarantees. To make this
       *   work, callers must never allocate more than 1 bio at a time from this pool.
       *   Callers that need to allocate more than 1 bio must always submit the
       *   previously allocated bio for IO before attempting to allocate a new one.
       *   Failure to do so can cause deadlocks under memory pressure.
      "
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      740432f8
  8. 13 Aug, 2015 1 commit
  9. 11 Aug, 2015 2 commits
    • f2fs: remove inmem radix tree · decd36b6
      Chao Yu authored
      Previously, we used a radix tree to index all registered page entries for
      atomic files, but now we only use the radix tree to see whether the current
      page is indexed or not, since the other user of the radix tree is gone as of
      commit 042b7816 ("f2fs: remove unnecessary call to invalidate inmemory pages").

      So in this patch, we use a more efficient way:
      introduce a macro ATOMIC_WRITTEN_PAGE and set it as the page private
      value to indicate the page's indexing status. This way, we save
      memory and lookup time.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      decd36b6
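The idea of replacing an index lookup with a marker stored in the page itself can be sketched as follows. This is an illustrative model only: `atomic_page_sketch`, the helper names and the marker value are assumptions, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed marker value, chosen only for this sketch. */
#define ATOMIC_WRITTEN_SKETCH 0x0000ffffUL

/* Hypothetical page model with a single private word. */
struct atomic_page_sketch {
	unsigned long private_val;
};

/* Registering a page stores the marker; no separate index is needed. */
static void register_inmem_page_sketch(struct atomic_page_sketch *p)
{
	assert(p != NULL);
	p->private_val = ATOMIC_WRITTEN_SKETCH;
}

/* The membership test is one comparison instead of a tree lookup. */
static bool is_atomic_written_sketch(const struct atomic_page_sketch *p)
{
	return p->private_val == ATOMIC_WRITTEN_SKETCH;
}

/* Dropping the page clears the marker again. */
static void drop_inmem_page_sketch(struct atomic_page_sketch *p)
{
	p->private_val = 0;
}
```

The trade-off is that the private word can no longer hold anything else while the page is registered, in exchange for constant-time checks and no per-page tree node.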
    • f2fs: report EINVAL for unalignment direct IO · c15e8599
      Chao Yu authored
      
      
      We ran the ltp testcases on f2fs and got a TFAIL in diotest4; the detailed
      result is as follows:
      
      dio04
      
      <<<test_start>>>
      tag=dio04 stime=1432278894
      cmdline="diotest4"
      contacts=""
      analysis=exit
      <<<test_output>>>
      diotest4    1  TPASS  :  Negative Offset
      diotest4    2  TPASS  :  removed
      diotest4    3  TFAIL  :  diotest4.c:129: write allows odd count.returns 1: Success
      diotest4    4  TFAIL  :  diotest4.c:183: Odd count of read and write
      diotest4    5  TPASS  :  Read beyond the file size
      ......
      
      The result of ext4 in the same environment:
      
      dio04
      
      <<<test_start>>>
      tag=dio04 stime=1432259643
      cmdline="diotest4"
      contacts=""
      analysis=exit
      <<<test_output>>>
      diotest4    1  TPASS  :  Negative Offset
      diotest4    2  TPASS  :  removed
      diotest4    3  TPASS  :  Odd count of read and write
      diotest4    4  TPASS  :  Read beyond the file size
      ......
      
      The reason is that when triggering DIO in f2fs, we return zero
      in ->direct_IO if the writer's buffer offset, file offset and transfer size
      are not aligned to the block size of the filesystem, which falls back to
      buffered write instead of returning -EINVAL.

      This patch fixes that problem by returning the correct error number for the
      above case, and by removing the condition in check_direct_IO so that
      the verification is enabled for direct readers too.

      Besides, Jaegeuk Kim pointed out that there are exceptional cases where we
      should always make direct-io fall back to buffered write, such as dio on an
      encrypted file.
      Signed-off-by: Yunlei He <heyunlei@huawei.com>
      [Chao Yu make small change and add detail description in commit message]
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c15e8599
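The alignment rule from this commit can be expressed as a single mask test. The function below is a hypothetical user-space sketch, not the body of check_direct_IO; it assumes the block size is a power of two.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Returns 0 when the file offset, the user buffer address and the
 * transfer length are all multiples of blocksize, -EINVAL otherwise.
 * blocksize must be a non-zero power of two. */
static int check_dio_alignment_sketch(uint64_t offset, uintptr_t buf,
				      uint64_t len, uint32_t blocksize)
{
	uint64_t mask = (uint64_t)blocksize - 1;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	/* Any low bit set in any of the three values means misalignment. */
	if ((offset | (uint64_t)buf | len) & mask)
		return -EINVAL;
	return 0;
}
```

A one-byte write, as in the "odd count" case of diotest4, gives `check_dio_alignment_sketch(0, 4096, 1, 4096) == -EINVAL`, whereas a fully aligned request returns 0.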
  10. 06 Aug, 2015 1 commit
  11. 05 Aug, 2015 5 commits
  12. 04 Aug, 2015 12 commits
    • f2fs: expose f2fs_write_cache_pages · 8f46dcae
      Chao Yu authored
      
      
      If there are gced dirty pages and normal dirty pages in the mapping
      of one inode, we might write them back alternately with discontinuous
      block addresses, resulting in low performance.

      This patch introduces f2fs_write_cache_pages with code copied from
      write_cache_pages in mm/page-writeback.c.

      In this function, we refactor the flow into two steps:
      1) writeback all cold type pages.
      2) writeback all non-cold type pages.

      With this method, f2fs writes back dirty pages of the same
      temperature in batches, so the written blocks have more
      contiguous addresses and can be merged as much as possible
      in the f2fs bio cache. It also reduces the chance of submitting
      small IOs from the block layer.
      
      Test environment: 8g nokia sd card (very old sd card, but it shows
      better effect when testing with this patch, and with a 32g kingston
      sd card, I didn't see much more improvement).
      
      Test step:
      1. touch testfile;
      2. truncate -s 512K testfile;
      3. write all pages with odd index;
      4. trigger gc by ioctl;
      5. write all pages with even index;
      6. time fsync testfile.
      
      before:
      real	0m0.402s
      user	0m0.000s
      sys	0m0.000s
      
      after:
      real	0m0.143s
      user	0m0.004s
      sys	0m0.004s
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      8f46dcae
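The two-step writeback order described in this commit can be sketched as a pure ordering function. The names below are hypothetical and the model ignores locking, tagging and everything else write_cache_pages does; it only shows that one pass over cold pages followed by one pass over the rest groups pages by temperature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dirty-page descriptor. */
struct dirty_page_sketch {
	unsigned long index;
	bool cold;  /* true for pages that were moved by gc */
};

/* Writes page indexes to out in writeback order: all cold pages first,
 * then all non-cold pages. out must have room for n entries.
 * Returns the number of entries written, which is always n. */
static size_t writeback_order_sketch(const struct dirty_page_sketch *pages,
				     size_t n, unsigned long *out)
{
	size_t written = 0;

	assert(pages != NULL && out != NULL);
	for (int pass = 0; pass < 2; pass++) {
		bool want_cold = (pass == 0);

		for (size_t i = 0; i < n; i++)
			if (pages[i].cold == want_cold)
				out[written++] = pages[i].index;
	}
	return written;
}
```

In the test scenario of the commit, the odd-index pages become cold after gc, so four pages with indexes 0..3 would be written back in the order 1, 3, 0, 2 rather than alternating between temperatures.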
    • f2fs: maintain extent cache in separated file · a28ef1f5
      Chao Yu authored
      
      
      This patch moves extent cache related code from data.c into extent_cache.c.
      Since the extent cache is an independent feature and its code is not related
      to the rest of data.c, it is better to maintain it in a separate place.
      
      There is no functionality change, but several small coding style fixes
      including:
      * rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
      * rename misspelled word 'untill' to 'until';
      * remove unneeded 'return' in the end of f2fs_destroy_extent_tree().
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a28ef1f5
    • f2fs: don't try to split extents shorter than F2FS_MIN_EXTENT_LEN · 3c7df87d
      Fan Li authored
      
      
      Since only the parts of an extent that are longer than F2FS_MIN_EXTENT_LEN
      are kept in the extent cache after a split, extents already shorter than
      F2FS_MIN_EXTENT_LEN don't need to be split at all.
      Signed-off-by: Fan Li <fanofcode.li@samsung.com>
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      3c7df87d
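The early exit described in this commit can be sketched with a simplified extent that only tracks a file offset and a length. Everything here is an assumption made for illustration: the names are hypothetical, the threshold of 16 is a placeholder for F2FS_MIN_EXTENT_LEN, and treating "long enough" as `>=` is a choice of this sketch.

```c
#include <assert.h>

/* Assumed threshold; stands in for F2FS_MIN_EXTENT_LEN. */
#define MIN_EXTENT_LEN_SKETCH 16u

/* Simplified extent: file offset and length in blocks, no block address. */
struct extent_sketch {
	unsigned int fofs;
	unsigned int len;
};

/* Remove the block at file offset fofs from extent e and store the
 * remaining parts worth caching in out (at most two).
 * Returns how many parts were kept. */
static int split_extent_sketch(struct extent_sketch e, unsigned int fofs,
			       struct extent_sketch out[2])
{
	unsigned int front, back;
	int kept = 0;

	assert(fofs >= e.fofs && fofs < e.fofs + e.len);

	/* Early exit: no part of a short extent can reach the threshold. */
	if (e.len < MIN_EXTENT_LEN_SKETCH)
		return 0;

	front = fofs - e.fofs;
	back = e.fofs + e.len - fofs - 1;
	if (front >= MIN_EXTENT_LEN_SKETCH)
		out[kept++] = (struct extent_sketch){ e.fofs, front };
	if (back >= MIN_EXTENT_LEN_SKETCH)
		out[kept++] = (struct extent_sketch){ fofs + 1, back };
	return kept;
}
```

The early return does not change the result, since both remaining parts are strictly shorter than the original extent; it only skips the work of computing them.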
    • f2fs: fix to update page flag · 90d4388a
      Chao Yu authored
      
      
      This patch updates page flags (e.g. the Uptodate and cold flags) in
      ->write_begin.

      Otherwise, the page stays non-uptodate when we write an entire
      page, and the cold data flag of the page is not cleared when a gced
      page is rewritten.
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      90d4388a
    • f2fs: shrink unreferenced extent_caches first · 7023a1ad
      Jaegeuk Kim authored
      
      
      If an extent_tree entry has a zero reference count, we can drop it from the
      cache with higher priority than entries that are currently referenced.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      7023a1ad
    • f2fs: enhance multithread performance · bb96a8d5
      Chao Yu authored
      
      
      In ->writepages, we use the writepages mutex lock to serialize all block
      address allocation and page submission pairs from different inodes.
      This makes the delayed dirty pages of one inode get written
      as contiguously as possible.

      But there is one problem: we do not submit the current cached bio inside
      the region protected by the writepages mutex lock, so there is a small
      chance that we submit another thread's bio as shown below, resulting in
      more bio splits.
      
      thread 1			thread 2
      ->writepages
        lock(writepages)
        ->write_cache_pages
        unlock(writepages)
      				  lock(writepages)
      				  ->write_cache_pages
        ->f2fs_submit_merged_bio
      				    ->writepage
      				  unlock(writepages)
      
      fs_mark-6535  [002] ....  2242.270230: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5766152, size = 524288
      fs_mark-6536  [000] ....  2242.270361: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767176, size = 4096
      fs_mark-6536  [000] ....  2242.270370: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, NODE, sector = 8138112, size = 4096
      fs_mark-6535  [002] ....  2242.270776: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767184, size = 516096
      
      This may increase the time spent in the block layer and may cause
      larger IO latency.

      This patch moves the submit operation into the region protected by the
      writepages mutex lock to avoid bio splits when concurrent writeback is
      intensive.
      
      my test environment: virtual machine,
      intel cpu i5 2500, 8GB size memory, 4GB size ramdisk
      
      time fs_mark  -t  16  -L  1  -s  524288  -S  1  -d  /mnt/f2fs/
      
      before:
      real	0m4.244s
      user	0m0.088s
      sys	0m12.336s
      
      after:
      real	0m3.822s
      user	0m0.072s
      sys	0m10.760s
      Signed-off-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      bb96a8d5
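The interleaving shown in this commit can be replayed deterministically without real threads. The sketch below is a hypothetical single-threaded model: `bio_cache_sketch` stands for a shared bio cache with one pending bio, and the two schedules are written out by hand rather than produced by a scheduler.

```c
#include <assert.h>

/* Hypothetical model of a shared bio cache with one pending bio. */
struct bio_cache_sketch {
	int owner;    /* writer id of the pending pages, -1 when empty */
	int pages;    /* pages queued in the pending bio */
	int bios[2];  /* bios submitted so far, per writer id (0 or 1) */
};

/* Submit whatever is pending and account it to the pages' owner. */
static void submit_pending_sketch(struct bio_cache_sketch *c)
{
	if (c->pages > 0)
		c->bios[c->owner]++;
	c->owner = -1;
	c->pages = 0;
}

/* Queue n pages for a writer. Pages of another writer are not
 * contiguous, so a pending bio of a different owner goes out first. */
static void queue_pages_sketch(struct bio_cache_sketch *c, int writer, int n)
{
	assert(writer == 0 || writer == 1);
	if (c->owner != writer)
		submit_pending_sketch(c);
	c->owner = writer;
	c->pages += n;
}

/* Submit after unlock: writer 0's late submit flushes writer 1's
 * partially built bio. Returns the number of bios used by writer 1. */
static int bios_submit_outside_lock_sketch(void)
{
	struct bio_cache_sketch c = { -1, 0, { 0, 0 } };

	queue_pages_sketch(&c, 0, 128);  /* writer 0, under the lock */
	queue_pages_sketch(&c, 1, 1);    /* writer 1 takes the lock */
	submit_pending_sketch(&c);       /* writer 0 submits, lock not held */
	queue_pages_sketch(&c, 1, 127);  /* writer 1 continues */
	submit_pending_sketch(&c);       /* writer 1 submits */
	return c.bios[1];
}

/* Submit before unlock: each writer's pages leave in a single bio. */
static int bios_submit_inside_lock_sketch(void)
{
	struct bio_cache_sketch c = { -1, 0, { 0, 0 } };

	queue_pages_sketch(&c, 0, 128);
	submit_pending_sketch(&c);       /* still under writer 0's lock */
	queue_pages_sketch(&c, 1, 128);
	submit_pending_sketch(&c);
	return c.bios[1];
}
```

In this model the second writer needs two bios when the submit happens outside the lock and one bio when it happens inside, which mirrors the extra small bio visible in the trace above.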
    • f2fs: check the largest extent at look-up time · 84bc926c
      Jaegeuk Kim authored
      
      
      Because of the extent shrinker or other -ENOMEM scenarios, we cannot guarantee
      that the largest extent is cached in the tree all the time.

      Instead of relying on the entries in extent_tree, we can simply check the
      largest extent cached in the extent tree at look-up time.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      84bc926c
    • f2fs: use extent_cache by default · 3e72f721
      Jaegeuk Kim authored
      
      
      We don't need to handle the duplicate extent information.
      
      The integrated rule is:
       - update on-disk extent with largest one tracked by in-memory extent_cache
       - destroy extent_tree for the truncation case
       - drop per-inode extent_cache by shrinker
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      3e72f721
    • f2fs: shrink extent_cache entries · 554df79e
      Jaegeuk Kim authored
      
      
      This patch registers shrinking extent_caches.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      554df79e
    • f2fs: set cached_en after checking finally · 244f4fc1
      Jaegeuk Kim authored
      
      
      This patch relocates the setting of cached_en so that it is covered by
      spin_lock and is set only once, after all checks have completed.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      244f4fc1
    • f2fs: update on-disk extents even under extent_cache · cbe91923
      Jaegeuk Kim authored
      
      
      Previously, f2fs_update_extent_cache() updated the in-memory extent_cache all
      the time, and only preserved its up-to-date extent in the on-disk one during
      f2fs_evict_inode.
      
      But, in the following scenario:
      
      1. mount
      2. open & write an extent X
      3. f2fs_evict_inode; on-disk extent is X
      4. open & update the extent X with Y
      5. sync; trigger checkpoint
      6. power-cut
      
      after power-on, f2fs should serve extent Y, but we have an on-disk extent X.
      
      This causes a failure on xfstests/311.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      cbe91923
    • f2fs: fix wrong block address calculation for a split extent · 7a2cb678
      Jaegeuk Kim authored
      
      
      This patch fixes the wrong calculation of the block address field when an
      extent is split.
      Reviewed-by: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      7a2cb678
  13. 29 Jul, 2015 1 commit
    • block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig authored
      
      
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-b...
      4246a0b6
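The benefit of keeping the errno inside the bio can be shown with a miniature model of chained completion. This is a sketch under stated assumptions: `bio_sketch` and `bio_endio_sketch` are hypothetical and carry only the fields needed here, and the rule that the first child error wins is a choice of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of a bio. */
struct bio_sketch {
	int bi_error;               /* 0 or a negative errno */
	struct bio_sketch *parent;  /* NULL for a top-level bio */
	int completed;
};

/* Complete a bio. The error is stored in the bio itself, so it is still
 * there after queuing, and the first child error reaches the parent. */
static void bio_endio_sketch(struct bio_sketch *bio)
{
	assert(bio != NULL);
	bio->completed = 1;
	if (bio->parent && !bio->parent->bi_error)
		bio->parent->bi_error = bio->bi_error;
}
```

Unlike a single up-to-date flag, the field can carry any errno, so a parent whose child failed with `-ENOSPC` reports `-ENOSPC` rather than a generic `-EIO`.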
  14. 25 Jul, 2015 1 commit
    • f2fs: call set_page_dirty to attach i_wb for cgroup · 6282adbf
      Jaegeuk Kim authored
      The cgroup attaches inode->i_wb via mark_inode_dirty and when set_page_writeback
      is called, __inc_wb_stat() updates i_wb's stat.
      
      So we need to explicitly call set_page_dirty->__mark_inode_dirty prior to
      writing back any pages.
      
      This patch should resolve the following kernel panic reported by Andreas Reis.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=101801
      
      --- Comment #2 from Andreas Reis <andreas.reis@gmail.com> ---
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      IP: [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90
      PGD 2951ff067 PUD 2df43f067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 7 PID: 10356 Comm: gcc Tainted: G        W       4.2.0-1-cu #1
      
      
      Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS
      T01 02/03/2015
      task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
      RIP: 0010:[<ffffffff8149deea>]  [<ffffffff8149deea>]
      __percpu_counter_add+0x1a/0x90
      RSP: 0018:ffff880295143ac8  EFLAGS: 00010082
      RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
      RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
      RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
      R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
      R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
      FS:  00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Stack:
       0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
       ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
       0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
      Call Trace:
       [<ffffffff811cc91e>] __test_set_page_writeback+0xde/0x1d0
       [<ffffffff813fee87>] do_write_data_page+0xe7/0x3a0
       [<ffffffff813faeea>] gc_data_segment+0x5aa/0x640
       [<ffffffff813fb0b8>] do_garbage_collect+0x138/0x150
       [<ffffffff813fb3fe>] f2fs_gc+0x1be/0x3e0
       [<ffffffff81405541>] f2fs_balance_fs+0x81/0x90
       [<ffffffff813ee357>] f2fs_unlink+0x47/0x1d0
       [<ffffffff81239329>] vfs_unlink+0x109/0x1b0
       [<ffffffff8123e3d7>] do_unlinkat+0x287/0x2c0
       [<ffffffff8123ebc6>] SyS_unlink+0x16/0x20
       [<ffffffff81942e2e>] entry_SYSCALL_64_fastpath+0x12/0x71
      Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
      89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e <48> 8b 47 20 48 63 ca
      65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
      RIP  [<ffffffff8149deea>] __percpu_counter_add+0x1a/0x90
       RSP <ffff880295143ac8>
      CR2: 00000000000000a8
      ---[ end trace 5132449a58ed93a3 ]---
      note: gcc[10356] exited with preempt_count 2
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      6282adbf
  15. 01 Jun, 2015 1 commit
  16. 28 May, 2015 2 commits
    • f2fs crypto: add encryption support in read/write paths · 4375a336
      Jaegeuk Kim authored
      
      
      This patch adds encryption support in read and write paths.
      
      Note that in f2fs we need to consider the cleaning operation: during
      cleaning, we must avoid encrypting and decrypting written blocks.
      So this patch implements move_encrypted_block().
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      4375a336
    • f2fs crypto: activate encryption support for fs APIs · fcc85a4d
      Jaegeuk Kim authored
      
      
      This patch activates the following APIs for encryption support.
      
      The rules quoted by ext4 are:
       - An unencrypted directory may contain encrypted or unencrypted files
         or directories.
       - All files or directories in a directory must be protected using the
         same key as their containing directory.
       - Encrypted inode for regular file should not have inline_data.
       - Encrypted symlink and directory may have inline_data and inline_dentry.
      
      This patch activates the following APIs.
      1. f2fs_link              : validate context
      2. f2fs_lookup            :      ''
      3. f2fs_rename            :      ''
      4. f2fs_create/f2fs_mkdir : inherit its dir's context
      5. f2fs_direct_IO         : do buffered io for regular files
      6. f2fs_open              : check encryption info
      7. f2fs_file_mmap         :      ''
      8. f2fs_setattr           :      ''
      9. f2fs_file_write_iter   :      ''           (Called by sys_io_submit)
      10. f2fs_fallocate        : do not support fcollapse
      11. f2fs_evict_inode      : free_encryption_info
      Signed-off-by: Michael Halcrow <mhalcrow@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      fcc85a4d