summaryrefslogtreecommitdiff
path: root/util/oslib-posix.c
AgeCommit message (Collapse)AuthorFilesLines
2016-04-22util: align memory allocations to 2M on AArch64Christoffer Dall1-1/+2
For KVM to use Transparent Huge Pages (THP) we have to ensure that the alignment of the userspace address of the KVM memory slot and the IPA that the guest sees for a memory region have the same offset from the 2M huge page size boundary. One way to achieve this is to always align the IPA region at a 2M boundary and ensure that the mmap alignment is also at 2M. Unfortunately, we were only doing this for __arm__, not for __aarch64__, so add this simple condition. This fixes a performance regression using KVM/ARM on AArch64 platforms that showed a performance penalty of more than 50%, introduced by the following commit: 9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10) We were only lucky before the above commit, because we were allocating large regions and naturally getting a 2M alignment on those allocations then. Cc: qemu-stable@nongnu.org Reported-by: Shih-Wei Li <shihwei@cs.columbia.edu> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: wrapped long line] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-24Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell1-0/+2
* Log filtering from Alex and Peter * Chardev fix from Marc-André * config.status tweak from David * Header file tweaks from Markus, myself and Veronia (Outreachy candidate) * get_ticks_per_sec() removal from Rutuja (Outreachy candidate) * Coverity fix from myself * PKE implementation from myself, based on rth's XSAVE support # gpg: Signature made Thu 24 Mar 2016 20:15:11 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (28 commits) target-i386: implement PKE for TCG config.status: Pass extra parameters char: translate from QIOChannel error to errno exec: fix error handling in file_ram_alloc cputlb: modernise the debug support qemu-log: support simple pid substitution for logs target-arm: dfilter support for in_asm qemu-log: dfilter-ise exec, out_asm, op and opt_op qemu-log: new option -dfilter to limit output qemu-log: Improve the "exec" TB execution logging qemu-log: Avoid function call for disabled qemu_log_mask logging qemu-log: correct help text for -d cpu tcg: pass down TranslationBlock to tcg_code_gen util: move declarations out of qemu-common.h Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND hw: explicitly include qemu-common.h and cpu.h include/crypto: Include qapi-types.h or qemu/bswap.h instead of qemu-common.h isa: Move DMA_transfer_handler from qemu-common.h to hw/isa/isa.h Move ParallelIOArg from qemu-common.h to sysemu/char.h Move QEMU_ALIGN_*() from qemu-common.h to qemu/osdep.h ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: scripts/clean-includes
2016-03-22util: move declarations out of qemu-common.hVeronia Bahaa1-0/+1
Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22include/qemu/osdep.h: Don't include qapi/error.hMarkus Armbruster1-0/+1
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22Remove unneeded include statements for setjmp.hStefan Weil1-1/+0
As soon as setjmp.h is included from qemu/osdep.h, those old include statements are no longer needed. Add also setjmp.h to the list in scripts/clean-includes. Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-02-16oslib-posix.c: Move workaround for OSX daemon() deprecation to osdep.hPeter Maydell1-9/+0
The right place for "work around issues with system headers" code is osdep.h. Move the workaround for OSX's stdlib.h emitting a deprecation warning for daemon() to that header. This also fixes a problem where running clean-includes on oslib-posix.c would erroneously remove the #include <stdlib.h> from it, breaking the workaround. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-04util: Clean up includesPeter Maydell1-2/+1
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org
2015-12-02util/mmap-alloc: fix hugetlb support on ppc64Michael S. Tsirkin1-23/+1
Since commit 8561c9244ddf1122d "exec: allocate PROT_NONE pages on top of RAM", it is no longer possible to back guest RAM with hugepages on ppc64 hosts: mmap(NULL, 285212672, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3fff57000000 mmap(0x3fff57000000, 268435456, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 19, 0) = -1 EBUSY (Device or resource busy) This is because on ppc64, Linux fixes a page size for a virtual address at mmap time, so we can't switch a range of memory from anonymous small pages to hugetlbs with MAP_FIXED. See commit d0f13e3c20b6fb73ccb467bdca97fa7cf5a574cd ("[POWERPC] Introduce address space "slices"") in Linux history for the details. Detect this and create the PROT_NONE mapping using the same fd. Naturally, this makes the guard page bigger with hugetlbfs. Based on patch by Greg Kurz. Acked-by: Rik van Riel <riel@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-22Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into stagingPeter Maydell1-24/+4
vhost, pc, virtio features, fixes, cleanups New features: VT-d support for devices behind a bridge vhost-user migration support Signed-off-by: Michael S. Tsirkin <mst@redhat.com> # gpg: Signature made Thu 22 Oct 2015 12:39:19 BST using RSA key ID D28D5469 # gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" # gpg: aka "Michael S. Tsirkin <mst@redhat.com>" * remotes/mst/tags/for_upstream: (37 commits) hw/isa/lpc_ich9: inject the SMI on the VCPU that is writing to APM_CNT i386: keep cpu_model field in MachineState uptodate vhost: set the correct queue index in case of migration with multiqueue piix: fix resource leak reported by Coverity seccomp: add memfd_create to whitelist vhost-user-test: check ownership during migration vhost-user-test: add live-migration test vhost-user-test: learn to tweak various qemu arguments vhost-user-test: wrap server in TestServer struct vhost-user-test: remove useless static check vhost-user-test: move wait_for_fds() out vhost: add migration block if memfd failed vhost-user: use an enum helper for features mask vhost user: add rarp sending after live migration for legacy guest vhost user: add support of live migration net: add trace_vhost_user_event vhost-user: document migration log vhost: use a function for each call vhost-user: add a migration blocker vhost-user: send log shm fd along with log_base ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-21exec: factor out duplicate mmap codeMichael S. Tsirkin1-24/+4
Anonymous and file-backed RAM allocation are now almost exactly the same. Reduce code duplication by moving RAM mmap code out of oslib-posix.c and exec.c. Reported-by: Marc-André Lureau <mlureau@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-20osdep: add qemu_fork() wrapper for safely handling signalsDaniel P. Berrange1-0/+71
When using regular fork() the child process of course inherits all the parents' signal handlers. If the child then proceeds to close() any open file descriptors, it may break some of those registered signal handlers. The child generally does not want to ever run any of the signal handlers that the parent may have installed in the short time before it exec's. The parent may also have blocked various signals which the child process will want enabled. This introduces a wrapper qemu_fork() that takes care to sanitize signal handling across fork. Before forking it blocks all signals in the parent thread. After fork returns, the parent unblocks the signals and carries on as usual. The child, however, resets all the signal handlers back to their defaults before it unblocks signals. The child process can now exec the binary in a "clean" signal environment. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-10-01oslib: allocate PROT_NONE pages on top of RAMMichael S. Tsirkin1-4/+4
This inserts a read and write protected page between RAM and QEMU memory. This makes it harder to exploit QEMU bugs resulting from buffer overflows in devices using variants of cpu_physical_memory_map, dma_memory_map etc. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01oslib: rework anonimous RAM allocationMichael S. Tsirkin1-2/+10
At the moment we first allocate RAM, sometimes more than necessary for alignment reasons. We then free the extra RAM. Rework this to avoid the temporary allocation: reserve the range by mapping it with PROT_NONE, then use just the necessary range with MAP_FIXED. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-22util: allow \n to terminate password inputDaniel P. Berrange1-1/+2
The qemu_read_password() method looks for \r to terminate the reading of the a password. This is what will be seen when reading the password from a TTY. When scripting though, it is useful to be able to send the password via a pipe, in which case we must look for \n to terminate password input. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-05-22util: move read_password method out of qemu-img into osdep/oslibDaniel P. Berrange1-0/+66
The qemu-img.c file has a read_password() method impl that is used to prompt for passwords on the console, with impls for POSIX and Windows. This will be needed by qemu-io.c too, so move it into the QEMU osdep/oslib files where it can be shared without code duplication Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10oslib-posix: Fix compiler warning (-Wclobbered) and simplify the codeStefan Weil1-2/+2
gcc reports this warning with -Wclobbered: util/oslib-posix.c: In function ‘os_mem_prealloc’: util/oslib-posix.c:374:49: error: argument ‘memory’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] Fix this and simplify the code by using an existing macro. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-11-23memory: expose alignment used for allocating RAM as MemoryRegion APIIgor Mammedov1-1/+4
introduce memory_region_get_alignment() that returns underlying memory block alignment or 0 if it's not relevant/implemented for backend. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-11-02util: Improve os_mem_prealloc error messageMichal Privoznik1-1/+2
Currently, when the preallocating guest memory process fails, a not so helpful error message is printed out: # virsh start migt10 error: Failed to start domain migt10 error: internal error: process exited while connecting to monitor: os_mem_prealloc: failed to preallocate pages From the error message it's not clear at the first glance where the problem lies. However, changing the error message might give users a clue. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15block: Introduce qemu_try_blockalign()Kevin Wolf1-6/+10
This function returns NULL instead of aborting when an allocation fails. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-07-10oslib-posix: Fix new compiler error with -WclobberedStefan Weil1-14/+16
Newer versions of gcc report a warning (or an error with -Werror) when compiler option -Wclobbered (or -Wextra) is active: util/oslib-posix.c:372:12: error: variable ‘hpagesize’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] The rewritten code fixes this warning: variable 'hpagesize' is now set and used in a block without any call of sigsetjmp or similar functions. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-19memory: move preallocation code out of exec.cPaolo Bonzini1-0/+73
So that backends can use it. Since we need the page size for efficiency, move code to compute it out of translate-all.c and into util/oslib-win32.c. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-03-13oslib-posix: Fix build on FreeBSDAndreas Färber1-0/+4
Commit 10f5bff622cad71645e22c027b77ac31e51008ef (util: Split out exec_dir from os_find_datadir) moved code from os-posix.c to util/oslib-posix.c but forgot to move a FreeBSD #include alongside, needed for CTL_KERN among others. Cc: Fam Zheng <famz@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de> Message-id: 1394717279-23406-1-git-send-email-andreas.faerber@web.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-20util: Split out exec_dir from os_find_datadirFam Zheng1-0/+54
With this change, main() calls qemu_init_exec_dir and uses argv[0] to init exec_dir. The saved value can be retrieved with qemu_get_exec_dir later. It will be reused by module loading. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-24qemu_memalign: Allow small alignmentsKevin Wolf1-0/+5
The functions used by qemu_memalign() require an alignment that is at least sizeof(void*). Adjust it if it is too small. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-22osdep: add qemu_set_tty_echo()Stefan Hajnoczi1-0/+18
Using stdin with readline.c requires disabling echo and line buffering. Add a portable wrapper to set the terminal attributes under Linux and Windows. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-02util: add socket_set_fast_reuse function which will replace setting SO_REUSEADDRSebastian Ottlik1-0/+12
If a socket is closed it remains in TIME_WAIT state for some time. On operating systems using BSD sockets the endpoint of the socket may not be reused while in this state unless SO_REUSEADDR was set on the socket. On windows on the other hand the default behaviour is to allow reuse (i.e. identical to SO_REUSEADDR on other operating systems) and setting SO_REUSEADDR on a socket allows it to be bound to a endpoint even if the endpoint is already used by another socket independently of the other sockets state. This can even result in undefined behaviour. Many sockets used by QEMU should not block the use of their endpoint after being closed while they are still in TIME_WAIT state. Currently QEMU sets SO_REUSEADDR for such sockets, which can lead to problems on Windows. This patch introduces the function socket_set_fast_reuse that should be used instead of setting SO_REUSEADDR when fast socket reuse is desired and behaves correctly on all operating systems. As a failure of this function can only be caused by bad QEMU internal errors, an assertion handles these situations. The return value is still passed on, to minimize changes in client code and prevent unused variable warnings if NDEBUG is defined. Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-09-12exec: Don't abort when we can't allocate guest memoryMarkus Armbruster1-3/+1
We abort() on memory allocation failure. abort() is appropriate for programming errors. Maybe most memory allocation failures are programming errors, maybe not. But guest memory allocation failure isn't, and aborting when the user asks for more memory than we can provide is not nice. exit(1) instead, and do it in just one place, so the error message is consistent. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1375276272-15988-8-git-send-email-armbru@redhat.com Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-05-30osdep: add qemu_get_local_state_pathname()Laszlo Ersek1-0/+9
This function returns ${prefix}/var/RELATIVE_PATHNAME on POSIX-y systems, and <CSIDL_COMMON_APPDATA>/RELATIVE_PATHNAME on Win32. http://msdn.microsoft.com/en-us/library/bb762494.aspx [...] This folder is used for application data that is not user specific. For example, an application can store a spell-check dictionary, a database of clip art, or a log file in the CSIDL_COMMON_APPDATA folder. [...] Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14osdep: introduce qemu_anon_ram_free to free qemu_anon_ram_alloc-ed memoryPaolo Bonzini1-0/+8
We switched from qemu_memalign to mmap() but then we don't modify qemu_vfree() to do a munmap() over free(). Which we cannot do because qemu_vfree() frees memory allocated by qemu_{mem,block}align. Introduce a new function that does the munmap(), luckily the size is available in the RAMBlock. Reported-by: Amos Kong <akong@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Amos Kong <akong@redhat.com> Message-id: 1368454796-14989-3-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-05-14osdep, kvm: rename low-level RAM allocation functionsPaolo Bonzini1-2/+2
This is preparatory to the introduction of a separate freeing API. Reported-by: Amos Kong <akong@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Amos Kong <akong@redhat.com> Message-id: 1368454796-14989-2-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-16migration: initialize RAM to zeroPaolo Bonzini1-17/+18
Using qemu_memalign only leaves the RAM zero by chance, because libc will usually use mmap to satisfy our huge requests. But memory will not be zero when using MALLOC_PERTURB_ with a nonzero value. In the case of incoming migration, this breaks a recently-introduced invariant (commit f1c7279, migration: do not sent zero pages in bulk stage, 2013-03-26). To fix this, use mmap ourselves to get a well-aligned, always zero block for the RAM. Mmap-ed memory is easy to "trim" at the sides. This also removes the need to do something special on valgrind (see commit c2a8238a, Support running QEMU on Valgrind, 2011-10-31), thus effectively reverts that patch. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1365522223-20153-1-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-02oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()Stefan Hajnoczi1-2/+2
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets. Rename to qemu_set_nonblock() just like qemu_set_cloexec(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-03-05oslib-posix: Align to permit transparent hugepages on ARM LinuxPeter Maydell1-1/+1
ARM Linux (like x86-64 Linux) can use transparent hugepages for KVM if memory blocks are 2MiB aligned; set QEMU_VMALLOC_ALIGN accordingly. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-26bsd-user: avoid conflict with qemu_vmallocBlue Swirl1-3/+0
Rename qemu_vmalloc() to bsd_vmalloc(), adjust the only user. Remove #ifdeffery in oslib-posix.c. Tested-by: Andreas Färber <andreas.faerber@web.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-12build: move libqemuutil.a components to util/Paolo Bonzini1-0/+228
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>