- 12 Apr, 2016 2 commits
-
-
Jens Axboe authored
We don't have any drivers left using it, so kill it off. Update documentation to use the newer blk_queue_write_cache(). Signed-off-by:
Jens Axboe <axboe@fb.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
Jens Axboe authored
Add an internal helper and flag for setting whether a queue has write back caching, or write through (or none). Add a sysfs file to show this as well, and make it changeable from user space. This will replace the (awkward) blk_queue_flush() interface that drivers currently use to inform the block layer of write cache state and capabilities. Signed-off-by:
Jens Axboe <axboe@fb.com> Reviewed-by:
Christoph Hellwig <hch@lst.de>
-
- 04 Apr, 2016 1 commit
-
-
Kirill A. Shutemov authored
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 11 Feb, 2016 2 commits
-
-
Keith Busch authored
The new queue limit is not used by the majority of block drivers, and should be initialized to 0 for the driver's requested settings to be used. Signed-off-by:
Keith Busch <keith.busch@intel.com> Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by:
Sagi Grimberg <sagig@mellanox.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
Keith Busch authored
The new queue limit is not used by the majority of block drivers, and should be initialized to 0 for the driver's requested settings to be used. Signed-off-by:
Keith Busch <keith.busch@intel.com> Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by:
Sagi Grimberg <sagig@mellanox.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 26 Nov, 2015 1 commit
-
-
Martin K. Petersen authored
Commit 4f258a46 ("sd: Fix maximum I/O size for BLOCK_PC requests") had the unfortunate side-effect of removing an implicit clamp to BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer code. This caused problems for some SMR drives. Debugging this issue revealed a few problems with the existing infrastructure since the block layer didn't know how to deal with device-imposed limits, only limits set by the I/O controller. - Introduce a new queue limit, max_dev_sectors, which is used by the ULD to signal the maximum sectors for a REQ_TYPE_FS request. - Ensure that max_dev_sectors is correctly stacked and taken into account when overriding max_sectors through sysfs. - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer values for later processing. - In sd_revalidate() set the queue's max_dev_sectors based on the MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value is not reported, fall back to a cap based on the CDB TRANSFER LENGTH field size. - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits VPD--if reported and sane--to signal the preferred device transfer size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS. - blk_limits_max_hw_sectors() is no longer used and can be removed. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581Reviewed-by:
Christoph Hellwig <hch@lst.de> Tested-by: sweeneygj@gmx.com Tested-by:
Arzeets <anatol.pomozov@gmail.com> Tested-by:
David Eisner <david.eisner@oriel.oxon.org> Tested-by:
Mario Kicherer <dev@kicherer.org> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com>
-
- 19 Aug, 2015 1 commit
-
-
Keith Busch authored
The SG_GAPS queue flag caused checks for bio vector alignment against PAGE_SIZE, but the device may have different constraints. This patch adds a queue limits so a driver with such constraints can set to allow requests that would have been unnecessarily split. The new gaps check takes the request_queue as a parameter to simplify the logic around invoking this function. This new limit makes the queue flag redundant, so removing it and all usage. Device-mappers will inherit the correct settings through blk_stack_limits(). Signed-off-by:
Keith Busch <keith.busch@intel.com> Reviewed-by:
Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 18 Aug, 2015 1 commit
-
-
Jeff Moyer authored
This reverts commit 34b48db6. That commit caused performance regressions for streaming I/O workloads on a number of different storage devices, from SATA disks to external RAID arrays. It also managed to trip up some buggy firmware in at least one drive, causing data corruption. The next patch will bump the default max_sectors_kb value to 1280, which will accommodate a 10-data-disk stripe write with chunk size 128k. In the testing I've done using iozone, fio, and aio-stress, a value of 1280 does not show a big performance difference from 512. This will hopefully still help the software RAID setup that Christoph saw the original performance gains with while still not regressing other storage configurations. Signed-off-by:
Jeff Moyer <jmoyer@redhat.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 13 Aug, 2015 1 commit
-
-
Kent Overstreet authored
As generic_make_request() is now able to handle arbitrarily sized bios, it's no longer necessary for each individual block driver to define its own ->merge_bvec_fn() callback. Remove every invocation completely. Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: drbd-user@lists.linbit.com Cc: Jiri Kosina <jkosina@suse.cz> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@kernel.org> Cc: ceph-devel@vger.kernel.org Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Neil Brown <neilb@suse.de> Cc: linux-raid@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits) Acked-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com> [dpark: also remove ->merge_bvec_fn() in dm-thin as well as dm-era-target, and resolve merge conflicts] Signed-off-by:
Dongsu Park <dpark@posteo.net> Signed-off-by:
Ming Lin <ming.l@ssi.samsung.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 12 Aug, 2015 1 commit
-
-
Martin K. Petersen authored
Commit bcdb247c ("sd: Limit transfer length") clamped the maximum size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK LIMITS VPD. This had the unfortunate effect of also limiting the maximum size of non-filesystem requests sent to the device through sg/bsg. Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue limit directly. Also update the comment in blk_limits_max_hw_sectors() to clarify that max_hw_sectors defines the limit for the I/O controller only. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Reported-by:
Brian King <brking@linux.vnet.ibm.com> Tested-by:
Brian King <brking@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 3.17+ Signed-off-by:
James Bottomley <JBottomley@Odin.com>
-
- 17 Jul, 2015 1 commit
-
-
Jens Axboe authored
Lots of devices support huge discard sizes these days. Depending on how the device handles them internally, huge discards can introduce massive latencies (hundreds of msec) on the device side. We have a sysfs file, discard_max_bytes, that advertises the max hardware supported discard size. Make this writeable, and split the settings into a soft and hard limit. This can be set from 'discard_granularity' and up to the hardware limit. Add a new sysfs file, 'discard_max_hw_bytes', that shows the hw set limit. Reviewed-by:
Jeff Moyer <jmoyer@redhat.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 31 Mar, 2015 1 commit
-
-
Mike Snitzer authored
Linux 3.19 commit 69c953c8 ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n") caused blk_stack_limits() to not properly stack queue_limits for stacked devices (e.g. DM). Fix this regression by establishing lcm_not_zero() and switching blk_stack_limits() over to using it. DM uses blk_set_stacking_limits() to establish the initial top-level queue_limits that are then built up based on underlying devices' limits using blk_stack_limits(). In the case of optimal_io_size (io_opt) blk_set_stacking_limits() establishes a default value of 0. With commit 69c953c8, lcm(0, n) is no longer n, which compromises proper stacking of the underlying devices' io_opt. Test: $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536 $ cat /sys/block/sde/queue/optimal_io_size 786432 $ dmsetup create node --table "0 100 linear /dev/sde 0" Before this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 0 After this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 786432 Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.19+ Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 21 Oct, 2014 1 commit
-
-
Christoph Hellwig authored
Set max_sectors to the value the drivers provides as hardware limit by default. Linux had proper I/O throttling for a long time and doesn't rely on a artifically small maximum I/O size anymore. By not limiting the I/O size by default we remove an annoying tuning step required for most Linux installation. Note that both the user, and if absolutely required the driver can still impose a limit for FS requests below max_hw_sectors_kb. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 09 Oct, 2014 1 commit
-
-
Mike Snitzer authored
The math in both blk_stack_limits() and queue_limit_alignment_offset() assume that a block device's io_min (aka minimum_io_size) is always a power-of-2. Fix the math such that it works for non-power-of-2 io_min. This issue (of alignment_offset != 0) became apparent when testing dm-thinp with a thinp blocksize that matches a RAID6 stripesize of 1280K. Commit fdfb4c8c ("dm thin: set minimum_io_size to pool's data block size") unlocked the potential for alignment_offset != 0 due to the dm-thin-pool's io_min possibly being a non-power-of-2. Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 10 Jun, 2014 1 commit
-
-
Jens Axboe authored
With commit 762380ad added support for chunk sizes and no merging across them, it broke the rule of always allowing adding of a single page to an empty bio. So relax the restriction a bit to allow for that, similarly to what we have always done. This fixes a crash with mkfs.xfs and 512b sector sizes on NVMe. Reported-by:
Keith Busch <keith.busch@intel.com> Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 05 Jun, 2014 1 commit
-
-
Jens Axboe authored
Some drivers have different limits on what size a request should optimally be, depending on the offset of the request. Similar to dividing a device into chunks. Add a setting that allows the driver to inform the block layer of such a chunk size. The block layer will then prevent merging across the chunks. This is needed to optimally support NVMe with a non-zero stripe size. Signed-off-by:
Jens Axboe <axboe@fb.com>
-
- 08 Jan, 2014 1 commit
-
-
Kent Overstreet authored
Now that we've got code for raid5/6 stripe awareness, bcache just needs to know about the stripes and when writing partial stripes is expensive - we probably don't want to enable this optimization for raid1 or 10, even though they have stripes. So add a flag to queue_limits. Signed-off-by:
Kent Overstreet <kmo@daterainc.com>
-
- 08 Nov, 2013 1 commit
-
-
Mike Snitzer authored
Without this patch all DM devices will default to BLK_MAX_SEGMENT_SIZE (65536) even if the underlying device(s) have a larger value -- this is due to blk_stack_limits() using min_not_zero() when stacking the max_segment_size limit. 1073741824 before patch: 65536 after patch: 1073741824 Reported-by:
Lukasz Flis <l.flis@cyfronet.pl> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # v3.3+ Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 31 Oct, 2013 1 commit
-
-
Santosh Shilimkar authored
The blk_queue_bounce_limit() API parameter 'dma_mask' is actually the maximum address the device can handle rather than a dma_mask. Rename it accordingly to avoid it being interpreted as dma_mask. No functional change. The idea is to fix the bad assumptions about dma_mask wherever it could be miss-interpreted. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by:
Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk>
-
- 14 Dec, 2012 1 commit
-
-
Shaohua Li authored
In MD raid case, discard granularity might not be power of 2, for example, a 4-disk raid5 has 3*chunk_size discard granularity. Correct the calculation for such cases. Reported-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Shaohua Li <shli@fusionio.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 20 Sep, 2012 1 commit
-
-
Martin K. Petersen authored
The WRITE SAME command supported on some SCSI devices allows the same block to be efficiently replicated throughout a block range. Only a single logical block is transferred from the host and the storage device writes the same data to all blocks described by the I/O. This patch implements support for WRITE SAME in the block layer. The blkdev_issue_write_same() function can be used by filesystems and block drivers to replicate a buffer across a block range. This can be used to efficiently initialize software RAID devices, etc. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 01 Aug, 2012 1 commit
-
-
Mike Snitzer authored
blk_set_stacking_limits is intended to allow stacking drivers to build up the limits of the stacked device based on the underlying devices' limits. But defaulting 'max_sectors' to BLK_DEF_MAX_SECTORS (1024) doesn't allow the stacking driver to inherit a max_sectors larger than 1024 -- due to blk_stack_limits' use of min_not_zero. It is now clear that this artificial limit is getting in the way so change blk_set_stacking_limits's max_sectors to UINT_MAX (which allows stacking drivers like dm-multipath to inherit 'max_sectors' from the underlying paths). Reported-by:
Vijay Chauhan <vijay.chauhan@netapp.com> Tested-by:
Vijay Chauhan <vijay.chauhan@netapp.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 11 Jan, 2012 1 commit
-
-
Martin K. Petersen authored
Stacking driver queue limits are typically bounded exclusively by the capabilities of the low level devices, not by the stacking driver itself. This patch introduces blk_set_stacking_limits() which has more liberal metrics than the default queue limits function. This allows us to inherit topology parameters from bottom devices without manually tweaking the default limits in each driver prior to calling the stacking function. Since there is now a clear distinction between stacking and low-level devices, blk_set_default_limits() has been modified to carry the more conservative values that we used to manually set in blk_queue_make_request(). Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- 18 May, 2011 1 commit
-
-
Martin K. Petersen authored
In some cases we would end up stacking discard_zeroes_data incorrectly. Fix this by enabling the feature by default for stacking drivers and clearing it for low-level drivers. Incorporating a device that does not support dzd will then cause the feature to be disabled in the stacking driver. Also ensure that the maximum discard value does not overflow when exported in sysfs and return 0 in the alignment and dzd fields for devices that don't support discard. Reported-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 06 May, 2011 1 commit
-
-
shaohua.li@intel.com authored
flush request isn't queueable in some drives. Add a flag to let driver notify block layer about this. We can optimize flush performance with the knowledge. Stable: 2.6.39 only Cc: stable@kernel.org Signed-off-by:
Shaohua Li <shaohua.li@intel.com> Acked-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 18 Apr, 2011 1 commit
-
-
Jens Axboe authored
MD can't use this since it really requires us to be able to keep more than a single piece of state for the unplug. Commit 048c9374 added the required support for MD, so get rid of this now unused code. This reverts commit f7566457. Conflicts: block/blk-core.c Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 12 Apr, 2011 1 commit
-
-
Jens Axboe authored
MD would like to know when a queue is unplugged, so it can flush it's bitmap writes. Add such a callback. Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 10 Mar, 2011 1 commit
-
-
Jens Axboe authored
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 03 Mar, 2011 1 commit
-
-
Vivek Goyal authored
There does not seem to be a clear convention whether q->queue_lock is initialized or not when blk_cleanup_queue() is called. In the past it was not necessary but now blk_throtl_exit() takes up queue lock by default and needs queue lock to be available. In fact elevator_exit() code also has similar requirement just that it is less stringent in the sense that elevator_exit() is called only if elevator is initialized. Two problems have been noticed because of ambiguity about spin lock status. - If a driver calls blk_alloc_queue() and then soon calls blk_cleanup_queue() almost immediately, (because some other driver structure allocation failed or some other error happened) then blk_throtl_exit() will run into issues as queue lock is not initialized. Loop driver ran into this issue recently and I noticed error paths in md driver too. Similar error paths should exist in other drivers too. - If some driver provided external spin lock and zapped the lock before blk_cleanup_queue(), then it can lead to issues. So this patch initializes the default queue lock at queue allocation time. block throttling code is one of the users of queue lock and it is initialized at the queue allocation time, so it makes sense to initialize ->queue_lock also to internal lock. A driver can overide that lock later. This will take care of the issue where a driver does not have to worry about initializing the queue lock to default before calling blk_cleanup_queue() Signed-off-by:
Vivek Goyal <vgoyal@redhat.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 17 Dec, 2010 2 commits
-
-
Mike Snitzer authored
Implement blk_limits_max_hw_sectors() and make blk_queue_max_hw_sectors() a wrapper around it. DM needs this to avoid setting queue_limits' max_hw_sectors and max_sectors directly. dm_set_device_limits() now leverages blk_limits_max_hw_sectors() logic to establish the appropriate max_hw_sectors minimum (PAGE_SIZE). Fixes issue where DM was incorrectly setting max_sectors rather than max_hw_sectors (which caused dm_merge_bvec()'s max_hw_sectors check to be ineffective). Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Martin K. Petersen authored
When stacking devices, a request_queue is not always available. This forced us to have a no_cluster flag in the queue_limits that could be used as a carrier until the request_queue had been set up for a metadevice. There were several problems with that approach. First of all it was up to the stacking device to remember to set queue flag after stacking had completed. Also, the queue flag and the queue limits had to be kept in sync at all times. We got that wrong, which could lead to us issuing commands that went beyond the max scatterlist limit set by the driver. The proper fix is to avoid having two flags for tracking the same thing. We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the block layer merging functions. The queue_limit 'no_cluster' is turned into 'cluster' to avoid double negatives and to ease stacking. Clustering defaults to being enabled as before. The queue flag logic is removed from the stacking function, and explicitly setting the cluster flag is no longer necessary in DM and MD. Reported-by:
Ed Lin <ed.lin@promise.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 13 Oct, 2010 1 commit
-
-
Martin K. Petersen authored
Physical block size was declared unsigned int to accomodate the maximum size reported by READ CAPACITY(16). Make sure we use the right type in the related functions. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Acked-by:
Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 01 Oct, 2010 2 commits
-
-
Malahal Naineni authored
The bounce_pfn of the request queue in 64 bit systems is set to the current max_low_pfn. Adding more memory later makes this incorrect. Memory allocated beyond this boot time max_low_pfn appear to require bounce buffers (bounce buffers are actually not allocated but used in calculating segments that may result in "over max segments limit" errors). Signed-off-by:
Malahal Naineni <malahal@us.ibm.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Jens Axboe authored
Revert "block: set the bounce_pfn to the actual DMA limit rather than to max memory" This reverts commit c49825fa. Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 24 Sep, 2010 1 commit
-
-
Malahal Naineni authored
The bounce_pfn of the request queue in 64 bit systems is set to the current max_low_pfn. Adding more memory later makes this incorrect. Memory allocated beyond this boot time max_low_pfn appear to require bounce buffers (bounce buffers are actually not allocated but used in calculating segments that may result in "over max segments limit" errors). Signed-off-by:
Malahal Naineni <malahal@us.ibm.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 10 Sep, 2010 3 commits
-
-
Martin K. Petersen authored
Some controllers have a hardware limit on the number of protection information scatter-gather list segments they can handle. Introduce a max_integrity_segments limit in the block layer and provide a new scsi_host_template setting that allows HBA drivers to provide a value suitable for the hardware. Add support for honoring the integrity segment limit when merging both bios and requests. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@carl.home.kernel.dk>
-
Martin K. Petersen authored
We have several users of min_not_zero, each of them using their own definition. Move the define to kernel.h. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@carl.home.kernel.dk>
-
Tejun Heo authored
Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA requests. Deprecate barrier. All REQ_HARDBARRIERs are failed with -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler blk_queue_flush(). blk_queue_flush() takes combinations of REQ_FLUSH and FUA. If a device has write cache and can flush it, it should set REQ_FLUSH. If the device can handle FUA writes, it should also set REQ_FUA. All blk_queue_ordered() users are converted. * ORDERED_DRAIN is mapped to 0 which is the default value. * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH. * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Boaz Harrosh <bharrosh@panasas.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 07 Aug, 2010 1 commit
-
-
James Bottomley authored
Reviewed-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
- 30 Mar, 2010 1 commit
-
-
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by:
Tejun Heo <tj@kernel.org> Guess-its-ok-by:
Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-