- 24 Jan, 2014 19 commits
-
-
Kevin Wolf authored
This is going to become the bdrv_co_do_preadv() equivalent for writes. In this patch, however, only a function taking byte offsets is created; it doesn't align anything yet. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Waiting for all COR (copy-on-read) requests to complete first and calling the throttling function only afterwards means that the request could be delayed, and we would still need to wait for a COR request even if it was issued only after the throttled write request. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This separates the part of bdrv_co_do_writev() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Similar to bdrv_pread(), which aligns a byte-aligned request to 512-byte sectors, bdrv_co_do_preadv() takes a byte-aligned request and aligns it to the alignment specified in bs->request_alignment. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
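The widening of a byte-granular request to a larger alignment can be illustrated with a minimal sketch. This is not the QEMU implementation; the AlignedRequest type and the align_request() helper are hypothetical names introduced only for illustration, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: a byte request widened to an alignment. */
typedef struct AlignedRequest {
    uint64_t offset;    /* start, rounded down to the alignment */
    uint64_t bytes;     /* length including head and tail padding */
    uint64_t head_pad;  /* bytes added in front of the original request */
    uint64_t tail_pad;  /* bytes added behind the original request */
} AlignedRequest;

/* Widen [offset, offset + bytes) so that both ends are multiples of
 * align. Assumption: align is a power of two, e.g. 512 or 4096. */
AlignedRequest align_request(uint64_t offset, uint64_t bytes, uint64_t align)
{
    AlignedRequest r;
    uint64_t end = offset + bytes;
    uint64_t aligned_end;

    assert(align != 0 && (align & (align - 1)) == 0);

    r.offset = offset & ~(align - 1);                 /* round start down */
    aligned_end = (end + align - 1) & ~(align - 1);   /* round end up */
    r.bytes = aligned_end - r.offset;
    r.head_pad = offset - r.offset;
    r.tail_pad = aligned_end - end;
    return r;
}
```

For a read, the caller would issue the widened request into a bounce buffer and then copy out only the bytes between head_pad and bytes - tail_pad.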
-
Kevin Wolf authored
This separates the part of bdrv_co_do_readv() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Paolo Bonzini authored
Add a bs->request_alignment field that contains the required offset/length alignment for I/O requests and fill it in the raw block drivers. Use ioctls if possible, else see what alignment it takes for O_DIRECT to succeed. While at it, also expose the memory alignment requirements, which may be (and in practice are) different from the disk alignment requirements. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Paolo Bonzini authored
The alignment field is now set to the value that is promised to the guest, rather than required by the host. The next patches will make QEMU aware of the host-provided values, so make this clear. The alignment is also not about memory buffers but about the sectors on the disk, so change the documentation of the field. At this point, the field is set by the device emulation, but completely ignored by the block layer. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
bs->buffer_alignment is set by the device emulation and contains the logical block size of the guest device. This isn't something that the block layer should know, and even less something to use for determining the right alignment of buffers to be used for the host. The new BlockLimits field opt_mem_alignment tells the qemu block layer the optimal alignment to be used so that no bounce buffer must be used in the driver. This patch may change the buffer alignment from 4k to 512 for all callers that used qemu_blockalign() with the top-level image format BlockDriverState. The value was never propagated to other levels in the tree, so in particular raw-posix never required anything else than 512. While on disks with 4k sectors direct I/O requires a 4k alignment, memory may still be okay when aligned to 512 byte boundaries. This is what must have happened in practice, because otherwise this would already have failed earlier. Therefore I don't expect regressions even with this intermediate state. Later, raw-posix can implement the hook and expose a different memory alignment requirement. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
For an O_DIRECT request to succeed, it's not only necessary that all base addresses in the qiov are aligned, but also that each length in it is aligned. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
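The condition described above, that every base address and every length in the vector must be aligned, can be sketched as follows. This is only an illustration using the POSIX struct iovec; iov_is_aligned() is a hypothetical name and not the function touched by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Return true only if every element of the vector has both its base
 * address and its length aligned to 'align' bytes. Checking the base
 * addresses alone would not be enough for O_DIRECT. */
bool iov_is_aligned(const struct iovec *iov, int niov, size_t align)
{
    assert(align != 0);

    for (int i = 0; i < niov; i++) {
        if ((uintptr_t)iov[i].iov_base % align != 0) {
            return false;   /* misaligned buffer address */
        }
        if (iov[i].iov_len % align != 0) {
            return false;   /* misaligned element length */
        }
    }
    return true;
}
```

A caller that gets false back would typically fall back to a bounce buffer that satisfies both constraints.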
-
Kevin Wolf authored
When reopening with different flags, or when backing files disappear from the chain, the limits may change. Make sure they get updated in these cases. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoît Canet <benoit@irqsave.net>
-
Kevin Wolf authored
When there is a format driver between the backend, it's not guaranteed that exposing the opt_transfer_length for the format driver results in the optimal requests (because of fragmentation etc.), but it can't make things worse, so let's just do it. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoît Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This function separates filling the BlockLimits from bdrv_open(), which allows it to be called from other operations that may change the limits (e.g. modifications to the backing file chain or bdrv_reopen). Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
bdrv_commit() could return 0 or 1 on success, depending on whether or not the last sector was allocated in the overlay and whether the overlay format had a .bdrv_make_empty callback. Most callers ignored it, but qemu-img commit would print an error message while the operation actually succeeded. Also clean up the handling of I/O errors to return the real error code instead of -EIO. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Jeff Cody authored
Currently, if an image file is logically larger than its backing file, committing it via 'qemu-img commit' will fail. For instance, if we have a base image with a virtual size 10G, and a snapshot image of size 20G, then committing the snapshot offline with 'qemu-img commit' will likely fail. This will automatically attempt to resize the base image, if the snapshot image to be committed is larger. Signed-off-by:
Jeff Cody <jcody@redhat.com> Reviewed-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com>
There were two candidate ways to implement named node manipulation:
    1) { 'command': 'block_passwd', 'data': {'*device': 'str', '*node-name': 'str', 'password': 'str'} }
    2) { 'command': 'block_passwd', 'data': {'device': 'str', '*device-is-node': 'bool', 'password': 'str'} }
Luiz proposed 1 and said 2 was an abuse of the QMP interface, and proposed rewriting the QMP block interface for 2.0. What Luiz does not like in 1 is that two fields are optional but one of them must be specified, which leads to an abuse of the QMP semantics. Kevin argued that 2 was a clear abuse of the device field and would not be practical when skimming a log file, because the user would read "device" and think that a device is being manipulated when it is in fact a node name. The documentation of 1 makes it pretty clear to the user what to do. Kevin argued that all bs are nodes, including device ones, so 2 does not make sense. Kevin also argued that rewriting the QMP block interface would not make the current one disappear. Kevin pushed the argument that making the QAPI generator compatible with the semantics of the operation would need a rewrite that no one has done yet. A vote was held on the list to choose the version to use, and 1 won. For reference, the complete thread is: "[Qemu-devel] [PATCH V4 4/7] qmp: Allow to change password on names block driver states." Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Add the minimum of code to prepare for the following patches. Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- 22 Jan, 2014 7 commits
-
-
Peter Feiner authored
When a backing file is opened such that (1) a protocol is directly used as the block driver and (2) the block driver has bdrv_file_open, bdrv_open_backing_file segfaults. The problem arises because bdrv_open_common returns without setting bd->backing_hd->file. To effect (1), you seem to have to use the -F flag in qemu-img. There are several block drivers that satisfy (2), such as "file" and "nbd". Here are some concrete examples:
    #!/bin/bash
    echo Test file format
    ./qemu-img create -f file base.file 1m
    ./qemu-img create -f qcow2 -F file -o backing_file=base.file\
        file-overlay.qcow2
    ./qemu-img convert -O raw file-overlay.qcow2 file-convert.raw
    echo Test nbd format
    SOCK=$PWD/nbd.sock
    ./qemu-img create -f raw base.raw 1m
    ./qemu-nbd -t -k $SOCK base.raw &
    trap "kill $!" EXIT
    while ! test -e $SOCK; do sleep 1; done
    ./qemu-img create -f qcow2 -F nbd -o backing_file=nbd:unix:$SOCK\
        nbd-overlay.qcow2
    ./qemu-img convert -O raw nbd-overlay.qcow2 nbd-convert.raw
Without this patch, the two qemu-img convert commands segfault. This is a regression that was introduced in v1.7 by dbecebdd. Signed-off-by:
Peter Feiner <peter@gridcentric.ca> Reviewed-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
It should be possible to use a format as a driver for a file which in turn requires another file, i.e., nesting file formats. Allowing nested file formats results in e.g. qcow2 BlockDriverStates never being directly passed to bdrv_open_common() from bdrv_file_open(), but instead being handed through bdrv_open(). This changes the error message when trying to give a filename to qcow2, i.e. trying to use it as a driver for the protocol level. Therefore, change the reference output of I/O test 051 accordingly. Signed-off-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
Using bdrv_open_image() instead of bdrv_file_open() directly in bdrv_open() is easier. Signed-off-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
Add a common function for opening images to be used for block drivers specified through BlockdevRefs in an option QDict. It differs from bdrv_file_open() in two ways: first, this function may invoke bdrv_open() instead, allowing auto-detection of the driver to be used; and second, it automatically extracts the BlockdevRef from the option QDict. Signed-off-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
blkdebug and blkverify will, in order to retain compatibility, not support the field "file" implicitly through bdrv_open(). In order to be able to use those drivers without giving a filename anyway, it is necessary to be able to have block devices without files implicitly opened by bdrv_open(). This is the case if there is neither a file name, nor a reference to an existing block device to use as a file, nor any options specific to the file. Signed-off-by:
Max Reitz <mreitz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
With that now being possible, bdrv_open() should try to extract a block device reference from the options and pass it to bdrv_file_open(). Signed-off-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Kevin Wolf <kwolf@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Max Reitz authored
Allow specifying a reference to an existing block device (by name) for bdrv_file_open() instead of a filename and/or options. Signed-off-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Kevin Wolf <kwolf@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- 13 Dec, 2013 1 commit
-
-
Peter Lieven authored
While testing with 4k LUNs, a bad target implementation triggered an -EIO in iscsi_get_block_status, but the error was never caught, resulting in an infinite loop. CC: qemu-stable@nongnu.org Signed-off-by:
Peter Lieven <pl@kamp.de> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
- 06 Dec, 2013 1 commit
-
-
Stefan Hajnoczi authored
Since cc0681c4 ("block: Enable the new throttling code in the block layer.") bdrv_drain_all() no longer spins. The code used to look as follows:
    do {
        busy = qemu_aio_wait();
        /* FIXME: We do not have timer support here, so this is effectively
         * a busy wait.
         */
        QTAILQ_FOREACH(bs, &bdrv_states, list) {
            while (qemu_co_enter_next(&bs->throttled_reqs)) {
                busy = true;
            }
        }
    } while (busy);
Note that throttle requests are kicked but I/O throttling limits are still in effect. The loop spins until the vm_clock time allows the request to make progress and complete. The new throttling code introduced bdrv_start_throttled_reqs(). This function not only kicks throttled requests but also temporarily disables throttling so requests can run. The outdated FIXME comment can be removed. Also drop the busy = true assignment since we overwrite it immediately afterwards. Reviewed-by:
Alex Bligh <alex@alex.org.uk> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- 04 Dec, 2013 1 commit
-
-
Max Reitz authored
Leaving the backing file open although it is not needed anymore can cause problems if it is opened through a block driver which allows exclusive access only and if the create function of the block driver used for the top image (the one being created) tries to close and reopen the image file (which will include opening the backing file a second time). In particular, this will happen with a backing file opened through qemu-nbd and using qcow2 as the top image file format (which reopens the image to flush it to disk). In addition, the BlockDriverState in bdrv_img_create() is used for the backing file only; it should therefore be made local to the respective block. Signed-off-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- 03 Dec, 2013 6 commits
-
-
Paolo Bonzini authored
Right now, bdrv_co_do_write_zeroes will only try to align the beginning of the request. However, it is simpler for many formats to expect the block layer to separate both the head *and* the tail. This makes sure that the format's bdrv_co_write_zeroes function will be called with aligned sector_num and nb_sectors for the bulk of the request. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
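The head/bulk/tail separation described above can be sketched with a small stand-alone helper. This is an illustration under stated assumptions, not the code in bdrv_co_do_write_zeroes: the ZeroSplit type and split_zero_request() are hypothetical names, and sectors are treated as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how many sectors fall into each part. */
typedef struct ZeroSplit {
    int64_t head;  /* unaligned sectors before the first boundary */
    int64_t bulk;  /* middle part; starts aligned, length is a multiple of align */
    int64_t tail;  /* unaligned sectors after the last boundary */
} ZeroSplit;

/* Split [sector_num, sector_num + nb_sectors) so that the bulk part
 * starts on a multiple of 'align' and has an aligned length. */
ZeroSplit split_zero_request(int64_t sector_num, int64_t nb_sectors,
                             int64_t align)
{
    ZeroSplit s = { 0, 0, 0 };
    int64_t misalign, rest;

    assert(align > 0 && sector_num >= 0 && nb_sectors >= 0);

    misalign = sector_num % align;
    if (misalign != 0) {
        s.head = align - misalign;       /* distance to the next boundary */
        if (s.head > nb_sectors) {
            s.head = nb_sectors;         /* request ends before the boundary */
        }
    }
    rest = nb_sectors - s.head;
    s.tail = rest % align;               /* leftover behind the last boundary */
    s.bulk = rest - s.tail;
    return s;
}
```

With this split, only the bulk part needs to reach the format's zero-write callback with aligned arguments; the head and tail can be handled as ordinary small requests.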
-
Paolo Bonzini authored
Similar to write_zeroes, let the generic code receive an ENOTSUP for discard operations. Since bdrv_discard has advisory semantics, we can just swallow the error. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
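Swallowing ENOTSUP for an advisory operation can be sketched as below. The names advisory_discard(), discard_fn and the three stub drivers are hypothetical and exist only for this illustration; the real block layer code is structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver callback: returns 0 or a negative errno value. */
typedef int (*discard_fn)(long long sector_num, int nb_sectors);

/* Stub drivers used only to exercise the wrapper. */
int discard_ok(long long sector_num, int nb_sectors)
{
    (void)sector_num; (void)nb_sectors;
    return 0;
}

int discard_unsupported(long long sector_num, int nb_sectors)
{
    (void)sector_num; (void)nb_sectors;
    return -ENOTSUP;
}

int discard_io_error(long long sector_num, int nb_sectors)
{
    (void)sector_num; (void)nb_sectors;
    return -EIO;
}

/* Discard is advisory: a driver that cannot do it is not an error for
 * the caller, so -ENOTSUP becomes success. Other errors are passed on. */
int advisory_discard(discard_fn driver_discard, long long sector_num,
                     int nb_sectors)
{
    int ret;

    if (driver_discard == NULL) {
        return 0;               /* no callback at all: nothing to do */
    }
    ret = driver_discard(sector_num, nb_sectors);
    if (ret == -ENOTSUP) {
        return 0;               /* swallow "not supported" */
    }
    return ret;
}
```

Real I/O errors such as -EIO are deliberately left untouched, since only the "not supported" case is harmless to ignore.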
-
Paolo Bonzini authored
This will be used by the SCSI layer. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
Paolo Bonzini authored
Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
Paolo Bonzini authored
This lets bdrv_co_do_rw receive flags, so that it can be used for zero writes. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
Paolo Bonzini authored
bdrv_co_discard is only covering drivers which have a .bdrv_co_discard() implementation, but not those with .bdrv_aio_discard(). Not very nice, and easy to avoid. Suggested-by:
Kevin Wolf <kwolf@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
Peter Lieven <pl@kamp.de> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
- 29 Nov, 2013 5 commits
-
-
Kevin Wolf authored
If you open an image temporarily just because you want to check its size or get it flushed, there's no real reason to open the whole backing file chain. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
In the case of snapshot=on, don't rely on the backing file path in the temporary image any more, but override the backing file with the given set of options. This way, block drivers that don't use a file name can be accessed with snapshot=on, for example:
    -drive file.driver=nbd,file.host=localhost,snapshot=on
Which becomes internally something like:
    file.filename=/tmp/vl.AWQZCu,backing.file.driver=nbd,backing.file.host=localhost
Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Fam Zheng authored
This adds a "remove_break" command, which is the reverse of the blkdebug command "break": it removes all breakpoints with the given tag and resumes all the requests. Signed-off-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Stefan Hajnoczi <stefanha@redhat.com>
-
Fam Zheng authored
We have multiple dirty bitmaps in BDS now, so switch QAPI to allow querying them (BlockInfo.dirty_bitmaps), and also drop the old BlockInfo.dirty. Signed-off-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Fam Zheng authored
Previously a BlockDriverState had only one dirty bitmap, so only one caller (e.g. a block job) could keep track of writing. This changes the dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller; the lifecycle is managed with these new functions:
    bdrv_create_dirty_bitmap
    bdrv_release_dirty_bitmap
Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap. In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument is added to these functions, since each caller has its own dirty bitmap:
    bdrv_get_dirty
    bdrv_dirty_iter_init
    bdrv_get_dirty_count
bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will internally walk the list of all dirty bitmaps and set them one by one. Signed-off-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
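The idea of one bitmap per caller, with a single "set dirty" entry point that walks all of them, can be sketched as follows. This is a simplified stand-in, not QEMU's BdrvDirtyBitmap or HBitmap: the DirtyBitmap and Device types and all function names are hypothetical, and each bitmap covers only 64 sectors to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-caller bitmap covering sectors 0..63. */
typedef struct DirtyBitmap {
    uint64_t bits;
    struct DirtyBitmap *next;
} DirtyBitmap;

/* Hypothetical device state holding the list of bitmaps. */
typedef struct Device {
    DirtyBitmap *bitmaps;
} Device;

/* Create a clean bitmap for one caller and link it into the device. */
DirtyBitmap *create_dirty_bitmap(Device *dev)
{
    DirtyBitmap *bm = calloc(1, sizeof(*bm));
    if (bm != NULL) {
        bm->next = dev->bitmaps;
        dev->bitmaps = bm;
    }
    return bm;
}

/* Unlink and free one caller's bitmap; other bitmaps are unaffected. */
void release_dirty_bitmap(Device *dev, DirtyBitmap *bm)
{
    for (DirtyBitmap **p = &dev->bitmaps; *p != NULL; p = &(*p)->next) {
        if (*p == bm) {
            *p = bm->next;
            free(bm);
            return;
        }
    }
}

/* A write marks the sector dirty in every registered bitmap. */
void set_dirty(Device *dev, unsigned sector)
{
    assert(sector < 64);
    for (DirtyBitmap *bm = dev->bitmaps; bm != NULL; bm = bm->next) {
        bm->bits |= UINT64_C(1) << sector;
    }
}

/* Queries take the caller's own bitmap as an argument. */
bool get_dirty(const DirtyBitmap *bm, unsigned sector)
{
    assert(sector < 64);
    return (bm->bits >> sector) & 1;
}
```

A bitmap created after a write does not see that write, which is why each caller, such as a block job, tracks only the changes made during its own lifetime.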
-