summaryrefslogtreecommitdiff
path: root/block/sheepdog.c
AgeCommit message (Collapse)AuthorFilesLines
2013-03-19sheepdog: show error message for halt statusLiu Yuan1-0/+2
Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when number of alive nodes is less than that of copies specified by users. This will return 0x19 to QEMU client which currently doesn't recognize it. This patch adds an error description when QEMU client receives it, other than plainly printing 'Invalid error code' Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15sheepdog: set io_flush handler in do_co_reqMORITA Kazutaka1-2/+11
If an io_flush handler is not set, qemu_aio_wait doesn't invoke callbacks. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15sheepdog: use non-blocking fd in coroutine contextMORITA Kazutaka1-4/+2
Using a blocking socket in the coroutine context reduces the chance of switching to other work. This patch makes the sheepdog driver use a non-blocking fd always. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04sheepdog: add support for connecting to unix domain socketMORITA Kazutaka1-12/+70
This patch adds support for a unix domain socket for a connection between qemu and local sheepdog server. You can use the unix domain socket with the following syntax: $ qemu sheepdog+unix:///<vdiname>?socket=<socket path>[#snapid] Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04sheepdog: use inet_connect to simplify connect codeMORITA Kazutaka1-81/+30
This uses the form "<host>:<port>" for the representation of the sheepdog server to use inet_connect. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04sheepdog: accept URIsMORITA Kazutaka1-34/+105
The URI syntax is consistent with the NBD and Gluster syntax. The syntax is sheepdog[+tcp]://[host:port]/vdiname[#snapid|#tag] Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-04move socket_set_nodelay to osdep.cMORITA Kazutaka1-10/+1
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-02-01sheepdog: pass vdi_id to sheep daemon for sd_close()Liu Yuan1-2/+3
Sheep daemon needs vdi_id to identify which vdi is closed to release resources such as object cache. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15sheepdog: clean up sd_aio_setup()Liu Yuan1-6/+5
The last two parameters of sd_aio_setup() are never used, so remove them. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15sheepdog: multiplex the rw FD to flush cacheLiu Yuan1-45/+37
This will reduce sockfds connected to the sheep server to one, which simply the future hacks. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-14sheepdog: implement direct write semanticsLiu Yuan1-30/+40
Sheepdog supports both writeback/writethrough write but has not yet supported DIRECTIO semantics which bypass the cache completely even if Sheepdog daemon is set up with cache enabled. Suppose cache is enabled on Sheepdog daemon size, the new cache control is cache=writeback # enable the writeback semantics for write cache=writethrough # enable the emulated writethrough semantics for write cache=directsync # disable cache competely Guest WCE toggling on the run time to toggle writeback/writethrough is also supported. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-02sheepdog: pass oid directly to send_pending_req()Liu Yuan1-1/+1
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-02sheepdog: don't update inode when create_and_write failsLiu Yuan1-4/+5
For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19misc: move include files to include/qemu/Paolo Bonzini1-3/+3
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-11-14aio: rename AIOPool to AIOCBInfoStefan Hajnoczi1-2/+2
Now that AIOPool no longer keeps a freelist, it isn't really a "pool" anymore. Rename it to AIOCBInfo and make it const since it no longer needs to be modified. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-12sheepdog: use bool for boolean variablesMORITA Kazutaka1-35/+35
This improves readability. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-10-05sheepdog: avoid a few buffer overrunsJim Meyering1-12/+22
* parse_vdiname: Use pstrcpy, not strncpy, when the destination buffer must be NUL-terminated. * sd_open: Likewise, avoid buffer overrun. * do_sd_create: Likewise. Leave the preceding memset, since pstrcpy does not NUL-fill, and filename needs that. * sd_snapshot_create: Add a comment/question. * find_vdi_name: Remove a useless memset. * sd_snapshot_goto: Remove a useless memset. Use pstrcpy to NUL-terminate, because find_vdi_name requires that its vdi arg (filename parameter) be NUL-terminated. It seems ok not to NUL-fill the buffer. Do the same for snapid: remove useless memset-0 (instead, zero tag[0]). Use pstrcpy, not strncpy. * sd_snapshot_list: Use pstrcpy, not strncpy to write into the ->name member. Each must be NUL-terminated. Acked-by: Kevin Wolf <kwolf@redhat.com> Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-24block: do not parse BDRV_O_CACHE_WB in block driversJeff Cody1-8/+6
Block drivers should ignore BDRV_O_CACHE_WB in .bdrv_open flags, and in the bs->open_flags. This patch removes the code, leaving the behaviour behind as if BDRV_O_CACHE_WB was set. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-12sheepdog: fix savevm and loadvmMORITA Kazutaka1-1/+2
This patch sets data to be sent to Sheepdog correctly and fixes savevm and loadvm operations on a Sheepdog image. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-22sheepdog: don't leak socket file descriptor upon connection failureJim Meyering1-0/+1
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-17sheepdog: do not blindly memset all read buffersChristoph Hellwig1-19/+18
Only buffers that map to unallocated blocks need to be zeroed. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-17sheepdog: always use coroutine-based network functionsMORITA Kazutaka1-66/+47
This reduces some code duplication. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09Merge remote-tracking branch 'mjt/mjt-iov2' into stagingAnthony Liguori1-3/+3
* mjt/mjt-iov2: rewrite iov_send_recv() and move it to iov.c cleanup qemu_co_sendv(), qemu_co_recvv() and friends export iov_send_recv() and use it in iov_send() and iov_recv() rename qemu_sendv to iov_send, change proto and move declarations to iov.h change qemu_iovec_to_buf() to match other to,from_buf functions consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent allow qemu_iovec_from_buffer() to specify offset from which to start copying consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset() rewrite iov_* functions change iov_* function prototypes to be more appropriate virtio-serial-bus: use correct lengths in control_out() message Conflicts: tests/Makefile Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09sheepdog: traverse pending_list from the first for each timeMORITA Kazutaka1-6/+16
The pending list can be modified in other coroutine context sd_co_rw_vector, so we need to traverse the list from the first again after we send the pending request. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09sheepdog: split outstanding list into inflight and pendingMORITA Kazutaka1-25/+24
outstanding_list_head is used for both pending and inflight requests. This patch splits it and improves readability. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09sheepdog: make sure we don't free aiocb before sending all requestsMORITA Kazutaka1-13/+16
This patch increments the pending counter before sending requests, and make sures that aiocb is not freed while sending them. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09sheepdog: use coroutine based socket functions in coroutine contextMORITA Kazutaka1-2/+8
This removes blocking network I/Os in coroutine context. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09sheepdog: restart I/O when socket becomes ready in do_co_req()MORITA Kazutaka1-0/+14
Currently, no one reenters the yielded coroutine. This fixes it. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09sheepdog: fix dprintf format stringsMORITA Kazutaka1-4/+4
This fixes warnings about dprintf format in debug mode. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15sheepdog: add coroutine_fn markers to coroutine functionsMORITA Kazutaka1-4/+5
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-11cleanup qemu_co_sendv(), qemu_co_recvv() and friendsMichael Tokarev1-3/+3
The same as for non-coroutine versions in previous patches: rename arguments to be more obvious, change type of arguments from int to size_t where appropriate, and use common code for send and receive paths (with one extra argument) since these are exactly the same. Use common iov_send_recv() directly. qemu_co_sendv(), qemu_co_recvv(), and qemu_co_recv() are now trivial #define's merely adding one extra arg. qemu_co_sendv() and qemu_co_recvv() callers are converted to different argument order and extra `iov_cnt' argument. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-05-30sheepdog: fix return value of do_load_save_vm_stateMORITA Kazutaka1-5/+5
bdrv_save_vmstate and bdrv_load_vmstate should return the vmstate size on success, and -errno on error. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: use heap instead of stack for BDRVSheepdogStateMORITA Kazutaka1-13/+22
bdrv_create() is called in coroutine context now, so we cannot use more stack than 1 MB in the function if we use ucontext coroutine. This patch allocates BDRVSheepdogState, whose size is 4 MB, on the heap in sd_create(). Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: return -errno on errorMORITA Kazutaka1-32/+46
On error, BlockDriver APIs should return -errno instead of -1. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25sheepdog: mark image as snapshot when tag is specifiedMORITA Kazutaka1-1/+1
When a snapshot tag is specified in the filename, the opened image is a snapshot. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07sheepdog: switch to writethrough mode if cluster doesn't support flushMORITA Kazutaka1-0/+8
This is necessary for qemu to work with the older version of Sheepdog which doesn't support SD_OP_FLUSH_VDI. Signed-off-by: MORITA Kazutaka <morita.kazutaka@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-19aio: remove process_queue callback and qemu_aio_process_queuePaolo Bonzini1-6/+5
Both unused after the previous patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05sheepdog: fix send req helpersLiu Yuan1-0/+2
We should return if reading of the header fails. Cc: Kevin Wolf <kwolf@redhat.com> Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05sheepdog: implement SD_OP_FLUSH_VDI operationLiu Yuan1-14/+128
Flush operation is supposed to flush the write-back cache of sheepdog cluster. By issuing flush operation, we can assure the Guest of data reaching the sheepdog cluster storage. Cc: Kevin Wolf <kwolf@redhat.com> Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09sheepdog: fix co_recv coroutine contextMORITA Kazutaka1-0/+3
The co_recv coroutine has two things that will try to enter it: 1. The select(2) read callback on the sheepdog socket. 2. The aio_add_request() blocking operations, including a coroutine mutex. This patch fixes it by setting NULL to co_recv before sending data. In future, we should make the sheepdog driver fully coroutine-based and simplify request handling. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-01-13prepare for future GPLv2+ relicensingPaolo Bonzini1-0/+3
All files under GPLv2 will get GPLv2+ changes starting tomorrow. event_notifier.c and exec-obsolete.h were only ever touched by Red Hat employees and can be relicensed now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-22move corking functions to osdep.cPaolo Bonzini1-18/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22sheepdog: move coroutine send/recv function to generic codePaolo Bonzini1-209/+21
Outside coroutines, avoid busy waiting on EAGAIN by temporarily making the socket blocking. The API of qemu_recvv/qemu_sendv is slightly different from do_readv/do_writev because they do not handle coroutines. It returns the number of bytes written before encountering an EAGAIN. The specificity of yielding on EAGAIN is entirely in qemu-coroutine.c. Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-05block: Add coroutine_fn marker to coroutine functionsDong Xu Wang1-2/+2
Looks better when reviewing these source files. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-22sheepdog: Avoid deadlock in error pathDong Xu Wang1-0/+2
s->lock should be unlocked before leaving add_aio_request. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21sheepdog: add coroutine_fn markersPaolo Bonzini1-7/+7
This makes the following patch easier to review. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-14sheepdog: correct spellingDong Xu Wang1-1/+1
Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-08-24sheepdog: use coroutinesMORITA Kazutaka1-57/+93
This makes the sheepdog block driver support bdrv_co_readv/writev instead of bdrv_aio_readv/writev. With this patch, Sheepdog network I/O becomes fully asynchronous. The block driver yields back when send/recv returns EAGAIN, and is resumed when the sheepdog network connection is ready for the operation. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-20Use glib memory allocation and free functionsAnthony Liguori1-26/+26
qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>