path: root/block/raw-posix.c
Age | Commit message | Author | Files | Lines
2010-12-17 | raw-posix: add discard support | Christoph Hellwig | 1 | -0/+45
Add support to discard blocks in a raw image residing on an XFS filesystem by calling the XFS_IOC_UNRESVSP64 ioctl to punch holes. Support for other hole punching mechanisms can be added when they become available. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-11-26 | raw-posix: raw_pwrite comment fixup | Christoph Hellwig | 1 | -1/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-11-04 | block: Allow bdrv_flush to return errors | Kevin Wolf | 1 | -2/+2
This changes bdrv_flush to return 0 on success and -errno in case of failure. It's a requirement for implementing proper error handling in users of bdrv_flush. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
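The return convention described above (0 on success, -errno on failure) can be sketched as follows. This is a minimal illustration and not QEMU's code: `flush_fd` and `flush_status` are hypothetical names, and plain `fsync()` stands in for whatever the driver actually calls.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: flush fd, returning 0 on success or -errno. */
static int flush_fd(int fd)
{
    if (fsync(fd) < 0) {
        return -errno;
    }
    return 0;
}

/* Hypothetical caller-side helper: turn that result into a message. */
static const char *flush_status(int ret)
{
    assert(ret <= 0);   /* contract: 0 or a negated errno value */
    return ret == 0 ? "ok" : strerror(-ret);
}
```

A caller can then write `int ret = flush_fd(fd); if (ret < 0) { /* report flush_status(ret) */ }` instead of silently ignoring a failed flush.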
2010-10-23 | qemu-timer: move commonly used timer code to qemu-timer-common | Blue Swirl | 1 | -6/+6
Move timer init functions to a new file, qemu-timer-common.c. Make other critical timer functions inlined to preserve performance in qemu-timer.c, also move muldiv64() (used by the inline functions) to qemu-timer.h. Adjust block/raw-posix.c and simpletrace.c to use get_clock() directly. Remove a similar/duplicate definition in qemu-tool.c. Adjust hw/omap_clk.c to include qemu-timer.h because muldiv64() is used there. After this change, tracing can be used also for user code and simpletrace on Win32. Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Acked-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-09-21 | raw-posix: handle > 512 byte alignment correctly | Christoph Hellwig | 1 | -33/+46
Replace the hardcoded handling of 512 byte alignment with bs->buffer_alignment to handle larger sector size devices correctly. Note that we cannot rely on it being initialized in bdrv_open, so deal with the worst case there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
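As a rough sketch of what replacing a hardcoded 512 with a per-device alignment means, the check below takes the alignment as a parameter. The function name `request_is_aligned` is hypothetical, and the `alignment` argument merely plays the role that bs->buffer_alignment plays in the commit; the real code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the buffer address, file offset and
 * length are all multiples of `alignment` (a power of two, e.g. 512
 * or 4096), i.e. the request could be issued with O_DIRECT as-is.
 */
static bool request_is_aligned(const void *buf, uint64_t offset,
                               size_t len, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    size_t mask = alignment - 1;
    return ((uintptr_t)buf & mask) == 0 &&
           (offset & mask) == 0 &&
           (len & mask) == 0;
}
```

A request that passes with an alignment of 512 can still fail with 4096, which is why a fixed 512 is not enough for devices with larger sectors.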
2010-09-08 | raw-posix: improve detection of scsi-generic devices | Bernhard Kohl | 1 | -2/+8
Allow symbolic links which point to /dev/sgX devices. Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-08 | raw-posix: Don't use file name for host_cdrom detection on Linux | Kevin Wolf | 1 | -3/+0
On Linux, we have code to detect CD-ROMs using an ioctl. We shouldn't lose anything but false positives by removing the check for a /dev/cd* path. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-08-03 | block: Fix bdrv_has_zero_init | Kevin Wolf | 1 | -4/+9
Assuming that any image on a block device is not properly zero-initialized is actually wrong: Only raw images have this problem. Any other image format shouldn't care about it, they initialize everything properly themselves. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-25 | block: Replace u_int8_t, u_int16_t, u_int32_t, u_int64_t by standard int types | Stefan Weil | 1 | -1/+1
There is no need to have a second set of integral types. Replace them by the standard types from stdint.h. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-07-06 | raw-posix: Fix test for host CD-ROM | Markus Armbruster | 1 | -11/+6
raw_pread_aligned() retries up to two times if the block device backs a virtual CD-ROM (a drive with media=cdrom and if=ide, scsi, xen or none). This makes no sense. Whether retrying reads can correct read errors can only depend on what we're reading, not on how the result gets used. We need to check whether we're reading from a physical CD-ROM or floppy here. I doubt retrying is useful even then. Left for another day.
Impact:
* Virtual CD-ROM backed by host_cdrom behaves the same.
* Virtual CD-ROM backed by file or host_device no longer retries.
* A drive backed by host_cdrom now retries even if it's not a virtual CD-ROM.
* Any drive backed by host_floppy now retries.
While there, clean up gratuitous use of goto. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-04 | Cleanup: raw-posix.c: Be more consistent using BDRV_SECTOR_SIZE instead of 512 | Jes Sorensen | 1 | -9/+11
Clean up raw-posix.c to be more consistent using BDRV_SECTOR_SIZE instead of hard coded 512 values. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03 | raw-posix: Use pread/pwrite instead of lseek+read/write | Stefan Hajnoczi | 1 | -33/+4
This patch combines the lseek+read/write calls to use pread/pwrite instead. This will result in fewer system calls and is already used by AIO. Thanks to Jan Kiszka <jan.kiszka@siemens.com> for identifying excessive lseek and Christoph Hellwig <hch@lst.de> for confirming that this approach should work. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
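The difference between the two patterns can be sketched like this. Both helpers are hypothetical illustrations (the names `read_at_lseek` and `read_at_pread` do not come from the patch); they return the number of bytes read or -errno.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Old pattern: two system calls, and the shared file offset is moved. */
static ssize_t read_at_lseek(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        return -errno;
    }
    ssize_t ret = read(fd, buf, len);
    return ret < 0 ? -errno : ret;
}

/* New pattern: one system call, and the file offset is left untouched. */
static ssize_t read_at_pread(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    ssize_t ret = pread(fd, buf, len, offset);
    return ret < 0 ? -errno : ret;
}
```

Besides saving a system call per request, the `pread()` form does not depend on or modify the descriptor's current position, so concurrent users of the same descriptor cannot disturb each other's offsets.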
2010-05-03 | block: Open the underlying image file in generic code | Kevin Wolf | 1 | -5/+5
Format drivers shouldn't need to bother with things like file names, but rather just get an open BlockDriverState for the underlying protocol. This patch introduces this behaviour for bdrv_open implementation. For protocols which need to access the filename to open their file/device/connection/... a new callback bdrv_file_open is introduced which doesn't get an underlying file opened. For now, also some of the more obscure formats use bdrv_file_open because they open() the file themselves instead of using the block.c functions. They need to be fixed in later patches. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03 | block: separate raw images from the file protocol | Christoph Hellwig | 1 | -5/+10
We're running into various problems because the "raw" file access, which is used internally by the various image formats, is entangled with the "raw" image format, which maps the VM view 1:1 to a file system. This patch renames the raw file backends to the file protocol, which is treated like other protocols (e.g. nbd and http), and adds a new "raw" image format which is just a wrapper around calls to the underlying protocol.
The patch is surprisingly simple: besides changing the probing logic in block.c to only look for image formats when using bdrv_open and renaming the old raw protocols to file, there's almost nothing in there. For creating images, a new bdrv_create_file is introduced which guesses the protocol to use. This allows using qemu-img create -f raw (or just using the default) for both files and host devices. Converting the other format drivers to use this function to create their images is left for later patches.
The only issues still open are in the handling of the host devices. Firstly, in current qemu we can specify the host* format names on various command lines accepting images, but the new code can't do that without adding some translation. Second, the layering breaks the no_zero_init flag in the BlockDriver used by qemu-img. I'm not happy how this is done per-driver instead of per-state, so I'll prepare a separate patch to clean this up.
There's some more cleanup opportunity after this patch, e.g. using separate lists and registration functions for image formats vs protocols and maybe even host drivers, but this can be done at a later stage. Also there's a check for protocol in bdrv_open for the BDRV_O_SNAPSHOT case that I don't quite understand, but which I fear won't work as expected - possibly even before this patch.
Note that this patch requires various recent block patches from Kevin and me, which should all be in his block queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23 | block: split raw_getlength | Christoph Hellwig | 1 | -23/+42
Split up the raw_getlength into separate generic, solaris and BSD versions to reduce the ifdef maze a bit. The BSD variant still is a complete maze, but to clean it up properly we'd need some people using the BSD variants to figure out what code is used for what variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-10 | raw-posix: don't assign bs->read_only | Christoph Hellwig | 1 | -1/+0
bdrv_open already takes care of this for us. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-03-27 | raw-posix: Better error return values for hdev_create | Kevin Wolf | 1 | -3/+3
Now that we output an error message according to the returned error code in qemu-img, let's return the real error codes. "Input/output error" for everything isn't helpful. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-01-26 | block/raw-posix: Abort on pread beyond end of non-growable file | Kevin Wolf | 1 | -1/+5
This shouldn't happen under any normal circumstances. However, it looks like it's possible to achieve this with corrupted images. Without this patch raw_pread is hanging in an endless loop in such cases. The patch is not affecting growable files, for which such reads happen in normal use cases. raw_pread_aligned already handles these cases and won't return zero in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 | win32: pair qemu_memalign() with qemu_vfree() | Herve Poussineau | 1 | -1/+1
Win32 suffers from a very big memory leak when dealing with SCSI devices. Each read/write request allocates memory with qemu_memalign (ie VirtualAlloc) but frees it with qemu_free (ie free). Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks. Signed-off-by: Herve Poussineau <hpoussin@reactos.org> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 | block: kill BDRV_O_CREAT | Christoph Hellwig | 1 | -5/+1
The BDRV_O_CREAT option is unused inside qemu and partially duplicates the bdrv_create method. Remove it, and the -C option to qemu-io which isn't used in qemu-iotests anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-20 | Clean-up a little bit the RW related bits of BDRV_O_FLAGS. BDRV_O_RDONLY gone (and so is BDRV_O_ACCESS). | Naphtali Sprei | 1 | -1/+1
Default value for bdrv_flags (0/zero) is READ-ONLY. Need to explicitly request READ-WRITE. Instead of using the field 'readonly' of the BlockDriverState struct for passing the request, pass the request in the flags parameter to the function. Signed-off-by: Naphtali Sprei <nsprei@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-19 | raw-posix: Detect legacy floppy via ioctl on linux | Cole Robinson | 1 | -2/+19
Current legacy floppy detection is hardcoded based on source file name. Make this smarter on linux by attempting a floppy specific ioctl.
v2: Give ioctl check higher priority than filename check. s/IDE/legacy/
v3: Actually initialize 'prio' variable. Check for ioctl success rather than absence of specific failure.
v4: Explicitly mention that change is linux specific.
Signed-off-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-19 | raw-posix: Detect CDROM via ioctl on linux | Cole Robinson | 1 | -2/+18
Current CDROM detection is hardcoded based on source file name. Make this smarter on linux by attempting a CDROM specific ioctl. This makes '-cdrom /dev/sr0' succeed with no media present.
v2: Give ioctl check higher priority than filename check.
v3: Actually initialize 'prio' variable. Check for ioctl success rather than absence of specific failure.
v4: Explicitly mention that change is linux specific.
Signed-off-by: Cole Robinson <crobinso@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 | Don't leak file descriptors | Kevin Wolf | 1 | -1/+1
We're leaking file descriptors to child processes. Set FD_CLOEXEC on file descriptors that don't need to be passed to children to stop this misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
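Setting FD_CLOEXEC on an already-open descriptor is a two-step `fcntl()` operation. The sketch below shows the general pattern; `set_cloexec` is a hypothetical name chosen for illustration and is not claimed to match the helper the patch uses.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: mark fd close-on-exec so that it is not
 * inherited across exec() by child processes.
 * Returns 0 on success or -errno on failure.
 */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);   /* read current descriptor flags */
    if (flags < 0) {
        return -errno;
    }
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
        return -errno;
    }
    return 0;
}
```

Reading the flags first and OR-ing in FD_CLOEXEC preserves any other descriptor flags that may already be set.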
2009-12-03 | qemu-img: There is more than one host device driver | Kevin Wolf | 1 | -0/+4
I haven't heard yet of anyone using qemu-img to copy an image to a real floppy, but it's a valid use case. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-29 | Add support for GNU/kFreeBSD | Aurelien Jarno | 1 | -8/+8
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2009-10-30 | Remove aio_ctx from paio_* interface | Kevin Wolf | 1 | -6/+4
The context parameter in paio_submit isn't used anyway, so there is no reason why block drivers should need to remember it. This also avoids passing a Linux AIO context to paio_submit (which doesn't do any harm as long as the parameter is unused, but it is highly confusing). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 | raw/linux-aio: Also initialize POSIX AIO | Kevin Wolf | 1 | -0/+4
When using Linux AIO raw still falls back to POSIX AIO sometimes, so we should initialize it. Not initializing it happens to work if POSIX AIO is used by another drive, or if the format is not specified (probing the format uses POSIX AIO) or by pure luck (e.g. it doesn't seem to happen any more with qcow2 since we have re-added synchronous qcow2 functions). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-05 | block/raw: Add create_options for host_device | Kevin Wolf | 1 | -6/+10
Today host_devices have a create function, so they also need a create_options field to prevent qemu-img from complaining. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 | block: add aio_flush operation | Christoph Hellwig | 1 | -0/+17
Instead of stalling the VCPU while serving a cache flush, try to do it asynchronously. Use our good old helper thread pool to issue an asynchronous fdatasync for raw-posix. Note that while Linux AIO implements a fdatasync operation, it is not useful for us because it isn't actually implemented in an asynchronous fashion. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-11 | block: use fdatasync instead of fsync if possible | Christoph Hellwig | 1 | -1/+1
If we are flushing the caches for our image files we only care about the data (including the metadata required for accessing it) but not things like timestamp updates. So try to use fdatasync instead of fsync to implement the flush operations. Unfortunately many operating systems still do not support fdatasync, so we add a qemu_fdatasync wrapper that uses fdatasync if available as per the _POSIX_SYNCHRONIZED_IO feature macro or fsync otherwise. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-28 | Don't compile aio code if CONFIG_LINUX_AIO is undefined | Stefan Weil | 1 | -1/+7
This patch fixes linker errors when building QEMU without Linux AIO support. It is based on suggestions from malc and Kevin Wolf. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 | raw-posix: add Linux native AIO support | Christoph Hellwig | 1 | -5/+25
Now that we have a nicer interface to work against we can add Linux native AIO support. It's an extremely thin layer just setting up an iocb for the io_submit system call in the submission path, and registering an eventfd with the qemu poll handler to complete the iocbs directly from there. This started out based on Anthony's earlier AIO patch, but after an estimated 42,000 rewrites and just as many build system changes there's not much left of it. To enable native kernel aio use the aio=native sub-command on the drive command line. I have also added an option to qemu-io to test the aio support without needing a guest. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 | raw-posix: refactor AIO support | Christoph Hellwig | 1 | -236/+39
Currently the raw-posix.c code contains a lot of knowledge about the asynchronous I/O scheme that is mostly implemented in posix-aio-compat.c. All this code does not really belong here and is getting a bit in the way of implementing native AIO on Linux. So instead move all the guts of the AIO implementation into posix-aio-compat.c (which might need a better name, btw).
There's now a very small interface between the AIO providers and raw-posix.c:
- an init routine is called from raw_open_common to return an AIO context for this drive. An AIO implementation may either re-use one context for all drives, or use a different one for each as the Linux native AIO support will do.
- a submit routine is called from the aio_readv/writev methods to submit an AIO request.
There are no indirect calls involved in this interface as we need to decide which one to call manually. We will only call the Linux AIO native init function if we were requested to by vl.c, and we will only call the native submit function if we are asked to and the request is properly aligned. That's also the reason why the alignment check actually does the inverse move and now goes into raw-posix.c.
The old posix-aio-compat.h header is removed now that most of its content is private to posix-aio-compat.c, and instead a new block/raw-posix-aio.h header is created containing only the tiny interface between raw-posix.c and the AIO implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-24 | make pthreads mandatory | Christoph Hellwig | 1 | -27/+1
As requested by Anthony make pthreads mandatory. This means we will always have AIO available on posix hosts, and it will also allow enabling the I/O thread unconditionally once it's ready. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-27 | rename HOST_BSD to CONFIG_BSD | Juan Quintela | 1 | -2/+2
Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-20 | Fix most warnings (errors with -Werror) when debugging is enabled | Blue Swirl | 1 | -0/+1
I used the following command to enable debugging: perl -p -i -e 's/^\/\/#define DEBUG/#define DEBUG/g' * */* */*/* Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-07-16 | raw-posix: Handle errors in raw_create | Stefan Weil | 1 | -5/+12
In qemu-iotests, some large images are created using qemu-img. Without checks for errors, qemu-img will just create an empty image, and later read / write tests will fail. With the patch, failures during image creation are detected and reported. Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
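The kind of checking this commit describes can be illustrated with a small stand-alone function. It is only a sketch: `create_image` is a hypothetical name, the open flags and mode are assumptions, and the real raw_create may differ in detail. The point is that every step that can fail is checked and the failure is propagated as -errno.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical sketch: create (or truncate) a raw image of `size`
 * bytes. Returns 0 on success or -errno from the first failing step.
 */
static int create_image(const char *path, off_t size)
{
    int result = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        return -errno;
    }
    if (ftruncate(fd, size) != 0) {
        result = -errno;          /* e.g. -EFBIG or -ENOSPC for huge images */
    }
    if (close(fd) != 0 && result == 0) {
        result = -errno;          /* keep the first error if there was one */
    }
    return result;
}
```

Without the `ftruncate()` check, a failed resize would leave an empty file behind and the error would only surface later as confusing read/write failures.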
2009-07-09 | Substitute O_DSYNC with O_SYNC or O_FSYNC when needed. | G 3 | 1 | -0/+4
Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-09 | Revert "support colon in filenames" | Anthony Liguori | 1 | -1/+0
This reverts commit 707c0dbc97cddfe8d2441b8259c6c526d99f2dd8. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 | block-raw: Allow pread beyond the end of growable images | Kevin Wolf | 1 | -0/+11
When using O_DIRECT, qcow2 snapshots didn't work any more for me. In the process of creating the snapshot, qcow2 tries to pwrite some new information (e.g. new L1 table) which will often end up being after the old end of the image file. Now pwrite tries to align things and reads the old contents of the file, read returns 0 because there is nothing to read after the end of file and pwrite is stuck in an endless loop. This patch allows to pread beyond the end of an image file. Whenever the given offset is after the end of the image file, the read succeeds and fills the buffer with zeros. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
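The behaviour described above (a read past the end of the file succeeds and yields zeros, instead of returning 0 bytes forever) can be sketched as follows. `pread_zero_fill` is a hypothetical name, and this loop is an illustration of the idea rather than the code added by the patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical sketch: read exactly len bytes at offset. Whatever lies
 * beyond end-of-file is reported as zeros, so a caller that loops
 * until it has len bytes always terminates.
 * Returns len on success or -errno on a real I/O error.
 */
static ssize_t pread_zero_fill(int fd, void *buf, size_t len, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL && offset >= 0);
    while (done < len) {
        ssize_t ret = pread(fd, p + done, len - done, offset + (off_t)done);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;                     /* interrupted, retry */
            }
            return -errno;
        }
        if (ret == 0) {                       /* hit end-of-file */
            memset(p + done, 0, len - done);  /* pad the rest with zeros */
            break;
        }
        done += (size_t)ret;
    }
    return (ssize_t)len;
}
```

The `ret == 0` branch is the important one: without it, a caller that retries until the buffer is full would spin forever once the offset is past the end of the file.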
2009-06-29 | support colon in filenames | Ram Pai | 1 | -0/+1
Problem: It is impossible to feed filenames with the character colon because qemu interprets such names as a protocol. For example, the filename scsi:0 is interpreted as a protocol by name "scsi". This patch allows the user to escape colon characters. For example, the above filename can now be expressed either as 'scsi\:0' or as file:scsi:0. Anything following the "file:" tag is interpreted verbatim. However, if the "file:" tag is omitted, then any colon characters in the string must be escaped using a backslash. Here are a couple of examples:
scsi\:0\:abc is a local file scsi:0:abc
http\://myweb is a local file by name http://myweb
file:scsi:0:abc is a local file scsi:0:abc
file:http://myweb is a local file by name http://myweb
Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-17 | Fix opening of read only raw images | Blue Swirl | 1 | -16/+15
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-06-16 | raw-posix: Remove O_RDWR when attempting to open a file read-only | Avi Kivity | 1 | -0/+1
When we open a file, we first attempt to open it read-write, then fall back to read-only. Unfortunately we reuse the flags from the previous attempt, so both attempts try to open the file with write permissions, and fail. Fix by clearing the O_RDWR flag from the previous attempt. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
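The bug and its fix can be illustrated with a small stand-alone version of the "try read-write, then fall back to read-only" logic. `open_rw_or_ro` is a hypothetical name, and masking with `O_ACCMODE` is this sketch's way of clearing the access mode; the commit itself only says that O_RDWR is cleared before the second attempt.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical sketch: open path read-write if possible, otherwise
 * read-only. *writable reports which mode was obtained.
 * Returns the file descriptor, or -errno if both attempts fail.
 */
static int open_rw_or_ro(const char *path, int *writable)
{
    assert(path != NULL && writable != NULL);

    int flags = O_RDWR;
    int fd = open(path, flags);
    if (fd >= 0) {
        *writable = 1;
        return fd;
    }

    /* The fix: drop write access from the flags before retrying.
     * Reusing `flags` unchanged would just repeat the failed open. */
    flags = (flags & ~O_ACCMODE) | O_RDONLY;
    fd = open(path, flags);
    if (fd < 0) {
        return -errno;
    }
    *writable = 0;
    return fd;
}
```

If the second `open()` reused the original flags, both attempts would ask for write permission and a read-only file could never be opened, which is the failure the commit message describes.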
2009-06-16 | raw-posix: open flags use BDRV_ namespace, not posix namespace | Avi Kivity | 1 | -1/+1
The flags argument to raw_common_open() contains bits defined by the BDRV_O_* namespace, not the posix O_* namespace. Adjust to use the correct constants. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-15 | raw-posix: cleanup ioctl methods | Christoph Hellwig | 1 | -34/+8
Rename raw_ioctl and raw_aio_ioctl to hdev_ioctl and hdev_aio_ioctl as they are only used for the host device. Also only add them to the method table for the cases where we need them (generic hdev if linux and linux CDROM) instead of declaring stubs and always adding them. Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-06-15 | block: add bdrv_probe_device method | Christoph Hellwig | 1 | -0/+47
Add a bdrv_probe_device method to all BlockDriver instances implementing host devices to move matching of host device types into the actual drivers. For now we keep exactly the old matching behaviour based on the device names, although we really should have better detection methods based on device information in the future. Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-06-15 | raw-posix: split hdev drivers | Christoph Hellwig | 1 | -260/+304
Instead of declaring one BlockDriver for all host devices, declare one for each type: a generic one for normal disk devices, a Linux floppy driver and a CDROM driver for Linux and FreeBSD. This gets rid of a lot of messy ifdefs and switching based on the type in the various removable device methods. block.c grows a new method to find the correct host device driver based on OS-specific criteria, which will move into the actual drivers in a later patch in this series. Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-06-15 | raw-posix: add a raw_open_common helper | Christoph Hellwig | 1 | -29/+19
raw_open and hdev_open contain the same basic logic. Add a new raw_open_common helper containing the guts of the open routine and call it from raw_open and hdev_open. We use the new open_flags field in BDRVRawState to allow passing additional open flags to raw_open_common from both. Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-06-15 | raw-posix: always store open flags | Christoph Hellwig | 1 | -26/+21
Both the Linux floppy and the FreeBSD CDROM host device need to store the open flags so that they can re-open the device later. Store the open flags unconditionally to remove the ifdef mess and simplify the calling conventions for the later patches in the series. Signed-off-by: Christoph Hellwig <hch@lst.de>