1. 19 Mar, 2015 7 commits
    • Merge tag 'nios2-fixes-v4.0-rc5' of git://git.rocketboards.org/linux-socfpga-next · 18eda522
      Linus Torvalds authored
      Pull two arch/nios2 fixes from Ley Foon Tan:
       - Remove ucontext.h from exported arch headers
       - nios2: mm: do not invoke OOM killer on kernel fault OOM
      
      * tag 'nios2-fixes-v4.0-rc5' of git://git.rocketboards.org/linux-socfpga-next:
        nios2: mm: do not invoke OOM killer on kernel fault OOM
        nios2: Remove ucontext.h from exported arch headers
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · a93fc153
      Linus Torvalds authored
      Pull IDE fix from David Miller:
       "Just one fix to convert a by-hand conversion of jiffies to msecs, from
        Nicholas McGuire"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
        ide_tape: convert jiffies with jiffies_to_msecs
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 22283c82
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
      
       1) Some command cases of semtimedop() not even handled due to miscoded
          comparison on sparc64.  From Rob Gardner.
      
       2) Due to two bugs, /proc/kcore wasn't working properly on sparc.
      
       3) Make sure fatal traps stop all running cpus, from Dave Kleikamp.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Fix /proc/kcore
        sparc: semtimedop() unreachable due to comparison error
        sparc: io_64.h: Replace io function-link macros
        sparc64: fatal trap should stop all cpus
        arch: sparc: kernel: starfire.c: Remove unused function
        arch: sparc: kernel: traps_64.c: Remove some unused functions
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 47226fe1
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix packet header offset calculation in _decode_session6(), from
          Hajime Tazaki.
      
       2) Fix route leak in error paths of xfrm_lookup(), from Huaibin Wang.
      
       3) Be sure to clear state properly when scans fail in iwlwifi mvm code,
          from Luciano Coelho.
      
       4) iwlwifi tries to stop scans that aren't actually running, also from
          Luciano Coelho.
      
       5) mac80211 should drop mesh frames that are not encrypted, fix from
          Bob Copeland.
      
       6) Add new device ID to b43 wireless driver for BCM432228 chips, from
          Rafał Miłecki.
      
       7) Fix accidental addition of members after variable sized array in
          struct tc_u_hnode, from WANG Cong.
      
       8) Don't re-enable interrupts until after we call napi_complete() in
          ibmveth and WIZnet drivers, from Yongbae Park.
      
       9) Fix regression in vlan tag handling of fec driver, from Fugang Duan.
      
      10) If a network namespace change fails during rtnl_newlink(), we don't
          unwind the device registry properly.
      
      11) Fix two TCP regressions, from Neal Cardwell:
        - Don't allow snd_cwnd_cnt to accumulate huge values due to missing
          test in tcp_cong_avoid_ai().
        - Restore CUBIC back to advancing cwnd by 1.5x packets per RTT.
      
      12) Fix performance regression in xen-netback involving push TX
          notifications, from David Vrabel.
      
      13) __skb_tstamp_tx() can be called with a NULL sk pointer, do not
          dereference blindly.  From Willem de Bruijn.
      
      14) Fix potential stack overflow in RDS protocol stack, from Arnd
          Bergmann.
      
      15) VXLAN_VID_MASK used incorrectly in new remote checksum offload
          support of VXLAN driver.  Fix from Alexey Kodanev.
      
      16) Fix too small netlink SKB allocation in inet_diag layer, from Eric
          Dumazet.
      
      17) ieee80211_check_combinations() does not count interfaces correctly,
          from Andrei Otcheretianski.
      
      18) Hardware feature determination in bnx2x driver references a piece of
          software state that actually isn't initialized yet, fix from Michal
          Schmidt.
      
      19) inet_csk_wait_for_connect() needs a sched_annotate_sleep()
          annotation, from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
        Revert "net: cx82310_eth: use common match macro"
        net/mlx4_en: Set statistics bitmap at port init
        IB/mlx4: Saturate RoCE port PMA counters in case of overflow
        net/mlx4_en: Fix off-by-one in ethtool statistics display
        IB/mlx4: Verify net device validity on port change event
        act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome
        Revert "smc91x: retrieve IRQ and trigger flags in a modern way"
        inet: Clean up inet_csk_wait_for_connect() vs. might_sleep()
        ip6_tunnel: fix error code when tunnel exists
        netdevice.h: fix ndo_bridge_* comments
        bnx2x: fix encapsulation features on 57710/57711
        mac80211: ignore CSA to same channel
        nl80211: ignore HT/VHT capabilities without QoS/WMM
        mac80211: ask for ECSA IE to be considered for beacon parse CRC
        mac80211: count interfaces correctly for combination checks
        isdn: icn: use strlcpy() when parsing setup options
        rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg()
        caif: fix MSG_OOB test in caif_seqpkt_recvmsg()
        bridge: reset bridge mtu after deleting an interface
        can: kvaser_usb: Fix tx queue start/stop race conditions
        ...
    • ide_tape: convert jiffies with jiffies_to_msecs · 84215964
      Nicholas Mc Guire authored
      Use jiffies_to_msecs for converting jiffies as it handles all of the corner
      cases reliably and also helps readability. The printk format is fixed up
      as jiffies_to_msecs returns unsigned int not unsigned long.
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
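      The conversion described in this commit can be sketched in userspace C.
      This is a simplified illustration only, not the kernel's implementation:
      sketch_jiffies_to_msecs() and SKETCH_MSEC_PER_SEC are hypothetical names,
      and the tick rate is passed as a parameter here, whereas in the kernel HZ
      is a build-time constant.

```c
#include <assert.h>

#define SKETCH_MSEC_PER_SEC 1000UL

/*
 * Hypothetical userspace stand-in for a jiffies-to-milliseconds helper.
 * The result is an unsigned int, which is why a format string printing
 * it should use %u rather than %lu.
 */
unsigned int sketch_jiffies_to_msecs(unsigned long j, unsigned long hz)
{
	assert(hz > 0);

	/* hz divides 1000 evenly: each tick is a whole number of ms. */
	if (hz <= SKETCH_MSEC_PER_SEC && SKETCH_MSEC_PER_SEC % hz == 0)
		return (unsigned int)((SKETCH_MSEC_PER_SEC / hz) * j);

	/* hz is a multiple of 1000: several ticks per ms, round up. */
	if (hz > SKETCH_MSEC_PER_SEC && hz % SKETCH_MSEC_PER_SEC == 0) {
		unsigned long per_ms = hz / SKETCH_MSEC_PER_SEC;

		return (unsigned int)((j + per_ms - 1) / per_ms);
	}

	/* Anything else: widen first to reduce the risk of wrap-around. */
	return (unsigned int)(((unsigned long long)j * SKETCH_MSEC_PER_SEC) / hz);
}
```

      A hand-written "j * 1000 / HZ" covers only the last branch and can wrap
      where unsigned long is 32 bits wide; keeping the cases in one helper is
      the readability gain the commit message refers to.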
    • Revert "net: cx82310_eth: use common match macro" · 8d006e01
      Ondrej Zary authored
      This reverts commit 11ad714b because
      it breaks cx82310_eth.
      
      The custom USB_DEVICE_CLASS macro matches
      bDeviceClass, bDeviceSubClass and bDeviceProtocol
      but the common USB_DEVICE_AND_INTERFACE_INFO matches
      bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
      not specified.
      Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
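      The difference between the two kinds of match can be sketched as follows.
      All struct and function names here are hypothetical, and the IDs and
      descriptor values are made up for illustration; they are not the real
      cx82310 descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of the descriptor fields involved. */
struct sketch_usb_dev {
	uint16_t id_vendor, id_product;
	uint8_t dev_class, dev_subclass, dev_protocol;	/* bDevice*    */
	uint8_t if_class, if_subclass, if_protocol;	/* bInterface* */
};

/* One match-table entry: IDs plus a class/subclass/protocol triple. */
struct sketch_match {
	uint16_t id_vendor, id_product;
	uint8_t cls, subcls, proto;
};

/* Device-level match: the triple is compared with the bDevice* fields. */
bool sketch_match_device_class(const struct sketch_usb_dev *d,
			       const struct sketch_match *m)
{
	assert(d != NULL && m != NULL);
	return d->id_vendor == m->id_vendor && d->id_product == m->id_product &&
	       d->dev_class == m->cls && d->dev_subclass == m->subcls &&
	       d->dev_protocol == m->proto;
}

/* Interface-level match: the same triple is compared with bInterface*. */
bool sketch_match_interface_info(const struct sketch_usb_dev *d,
				 const struct sketch_match *m)
{
	assert(d != NULL && m != NULL);
	return d->id_vendor == m->id_vendor && d->id_product == m->id_product &&
	       d->if_class == m->cls && d->if_subclass == m->subcls &&
	       d->if_protocol == m->proto;
}

/* Made-up device that carries class information at device level only. */
const struct sketch_usb_dev sketch_dev = {
	0x1234, 0x5678, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
};
const struct sketch_match sketch_entry = { 0x1234, 0x5678, 0xff, 0xff, 0x00 };
```

      With these values the device-level match succeeds and the interface-level
      match fails, which is the kind of breakage the revert addresses.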
    • sparc: Fix /proc/kcore · 3c08158e
      David S. Miller authored
      /proc/kcore investigates the "System RAM" elements in /proc/iomem to
      initialize its memory tables.  Therefore we have to register them
      before it tries to do so.  kcore uses device_initcall() so let's
      use arch_initcall() for the registry.
      
      Also we need ARCH_PROC_KCORE_TEXT to get the virtual addresses of
      the kernel image correct.
      Reported-by: David Ahern <david.ahern@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Mar, 2015 10 commits
  3. 17 Mar, 2015 15 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c5861658
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes from all around the place:
      
         - a KASLR related revert where we ran out of time to get a fix - this
           represents a substantial portion of the diffstat,
      
         - two FPU fixes,
      
         - two x86 platform fixes: an ACPI reduced-hw fix and a NumaChip fix,
      
         - an entry code fix,
      
         - and a VDSO build fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "x86/mm/ASLR: Propagate base load address calculation"
        x86/fpu: Drop_fpu() should not assume that tsk equals current
        x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig()
        x86/apic/numachip: Fix sibling map with NumaChip
        x86/platform, acpi: Bypass legacy PIC and PIT in ACPI hardware reduced mode
        x86/asm/entry/32: Fix user_mode() misuses
        x86/vdso: Fix the build on GCC5
    • Merge branches 'perf-urgent-for-linus' and 'timers-urgent-for-linus' of... · 13326e5a
      Linus Torvalds authored
      Merge branches 'perf-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
      
      Pull perf and timer fixes from Ingo Molnar:
       "Two small perf fixes:
         - kernel side context leak fix
         - tooling crash fix
      
        And two clocksource driver fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Fix context leak in put_event()
        perf annotate: Fix fallback to unparsed disassembler line
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clockevents: sun5i: Fix setup_irq init sequence
        clocksource: efm32: Fix a NULL pointer dereference
    • HID: wacom: check for wacom->shared before following the pointer · e2c7d887
      Benjamin Tissoires authored
      486b908d (HID: wacom: do not send pen events before touch is up/forced out)
      introduces a kernel oops when plugging a tablet without touch.
      
      wacom->shared is NULL for these devices, so this leads to a NULL pointer
      dereference.

      Change the condition to make it clear that what we need is for
      wacom->shared to be non-NULL.
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    • Revert "smc91x: retrieve IRQ and trigger flags in a modern way" · 8d7d9cca
      Robert Jarzmik authored
      The commit breaks the legacy platforms, i.e. those not using device tree
      and setting up the interrupt resources with a flag to activate edge
      detection. The issue was found on the zylonite platform.

      The reason is that zylonite uses platform resources to pass the interrupt number
      and the irq flags (here IORESOURCE_IRQ_HIGHEDGE). It expects the driver to
      request the irq with these flags, which in turn sets up the irq as high edge
      triggered.

      After the patch, this was supposed to be taken care of with:
        irq_resflags = irqd_get_trigger_type(irq_get_irq_data(ndev->irq));

      But irq_resflags is 0 for legacy platforms, while for example in
      arch/arm/mach-pxa/zylonite.c, in struct resource smc91x_resources[] the
      irq flag is specified. This breaks zylonite because the interrupt is not
      set up as edge triggered, and the hardware doesn't provide interrupts.
      Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() · cb7cf8a3
      Eric Dumazet authored
      I got the following trace with the current net-next kernel:
      
      [14723.885290] WARNING: CPU: 26 PID: 22658 at kernel/sched/core.c:7285 __might_sleep+0x89/0xa0()
      [14723.885325] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810e8734>] prepare_to_wait_exclusive+0x34/0xa0
      [14723.885355] CPU: 26 PID: 22658 Comm: netserver Not tainted 4.0.0-dbg-DEV #1379
      [14723.885359]  ffffffff81a223a8 ffff881fae9e7ca8 ffffffff81650b5d 0000000000000001
      [14723.885364]  ffff881fae9e7cf8 ffff881fae9e7ce8 ffffffff810a72e7 0000000000000000
      [14723.885367]  ffffffff81a57620 000000000000093a 0000000000000000 ffff881fae9e7e64
      [14723.885371] Call Trace:
      [14723.885377]  [<ffffffff81650b5d>] dump_stack+0x4c/0x65
      [14723.885382]  [<ffffffff810a72e7>] warn_slowpath_common+0x97/0xe0
      [14723.885386]  [<ffffffff810a73e6>] warn_slowpath_fmt+0x46/0x50
      [14723.885390]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
      [14723.885393]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
      [14723.885396]  [<ffffffff810e8734>] ? prepare_to_wait_exclusive+0x34/0xa0
      [14723.885399]  [<ffffffff810ccdc9>] __might_sleep+0x89/0xa0
      [14723.885403]  [<ffffffff81581846>] lock_sock_nested+0x36/0xb0
      [14723.885406]  [<ffffffff815829a3>] ? release_sock+0x173/0x1c0
      [14723.885411]  [<ffffffff815ea1f7>] inet_csk_accept+0x157/0x2a0
      [14723.885415]  [<ffffffff810e8900>] ? abort_exclusive_wait+0xc0/0xc0
      [14723.885419]  [<ffffffff8161b96d>] inet_accept+0x2d/0x150
      [14723.885424]  [<ffffffff8157db6f>] SYSC_accept4+0xff/0x210
      [14723.885428]  [<ffffffff8165a451>] ? retint_swapgs+0xe/0x44
      [14723.885431]  [<ffffffff810f4c5d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
      [14723.885437]  [<ffffffff81369c0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [14723.885441]  [<ffffffff8157ef40>] SyS_accept+0x10/0x20
      [14723.885444]  [<ffffffff81659872>] system_call_fastpath+0x12/0x17
      [14723.885447] ---[ end trace ff74cd83355b1873 ]---
      
      In commit 26cabd31, Peter added a sched_annotate_sleep() in sk_wait_event().

      Is the following patch needed as well?

      An alternative would be to use sk_wait_event() from inet_csk_wait_for_connect().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip6_tunnel: fix error code when tunnel exists · 37355565
      Nicolas Dichtel authored
      After commit 2b0bb01b, the kernel returns -ENOBUFS when the user tries to add
      an existing tunnel with the ioctl API:
      $ ip -6 tunnel add ip6tnl1 mode ip6ip6 dev eth1
      add tunnel "ip6tnl0" failed: No buffer space available

      This is confusing; the right error is EEXIST.

      This patch also slightly changes the error codes returned:
       - ENOBUFS -> ENOMEM
       - ENOENT -> ENODEV
      
      Fixes: 2b0bb01b ("ip6_tunnel: Return an error when adding an existing tunnel.")
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      Reported-by: Pierre Cheynier <me@pierre-cheynier.net>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
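      The error-code convention discussed in this commit can be illustrated
      with a toy registry. This is a sketch under stated assumptions, not the
      ip6_tunnel code: sketch_tunnel_add() and the fixed-size table are
      hypothetical, a full table stands in for a failed allocation, and the
      -EINVAL guard exists only to keep the sketch safe.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_TUNNELS 4
#define SKETCH_NAME_LEN    16

/* Toy registry standing in for a table of tunnel devices. */
static char sketch_names[SKETCH_MAX_TUNNELS][SKETCH_NAME_LEN];
static int sketch_count;

/* Returns 0 on success or a negative errno value. */
int sketch_tunnel_add(const char *name)
{
	int i;

	assert(name != NULL);
	if (strlen(name) >= SKETCH_NAME_LEN)
		return -EINVAL;		/* sketch-only guard, not from the commit */

	for (i = 0; i < sketch_count; i++)
		if (strcmp(sketch_names[i], name) == 0)
			return -EEXIST;	/* the tunnel is already there */

	if (sketch_count == SKETCH_MAX_TUNNELS)
		return -ENOMEM;		/* stand-in for a failed allocation */

	strcpy(sketch_names[sketch_count++], name);
	return 0;
}
```

      Adding the same name twice returns -EEXIST, so a caller can report that
      the tunnel already exists instead of a misleading buffer-space error.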
    • netdevice.h: fix ndo_bridge_* comments · ad41faa8
      Nicolas Dichtel authored
      The argument 'flags' was missing in ndo_bridge_setlink().
      ndo_bridge_dellink() was missing.
      
      Fixes: 407af329 ("bridge: Add netlink interface to configure vlans on bridge ports")
      Fixes: add511b3 ("bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink")
      CC: Vlad Yasevich <vyasevic@redhat.com>
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'regulator-fix-v4.0-rc4' of... · 8e6e44fb
      Linus Torvalds authored
      Merge tag 'regulator-fix-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "The two main fixes here from Javier and Doug both fix issues seen on
        the Exynos-based ARM Chromebooks with reference counting of GPIO
        regulators over system suspend.  The GPIO enable code didn't properly
        take account of this case (a full analysis is in Doug's commit log).
      
        This is fixed by both fixing the reference counting directly and by
        making the resume code skip enables it doesn't need to do.  We could
        skip the change in the resume code but it's a very simple change and
        adds extra robustness against problems in other drivers"
      
      * tag 'regulator-fix-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: tps65910: Add missing #include <linux/of.h>
        regulator: core: Fix enable GPIO reference counting
        regulator: Only enable disabled regulators on resume
    • Merge tag 'regmap-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 529d2eb6
      Linus Torvalds authored
      Pull regmap fixes from Mark Brown:
       "A few things here:
      
         - a change from Lars to fix insertion of cache values at the start of
           rather than end of a rbtree block.  This hadn't been noticed before
           since almost everything lists registers in ascending order.
      
         - a fix from Takashi for spurious warnings during cache sync with
           read once registers, a problem which can be very noticeable on
           devices that it affects.
      
         - a fix from Valentin for a tightening of the oneshot IRQ request
           interface which would have broken affected devices"
      
      * tag 'regmap-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: regcache-rbtree: Fix present bitmap resize
        regmap: Skip read-only registers in regcache_sync()
        regmap-irq: set IRQF_ONESHOT flag to ensure IRQ request
    • Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · 4d272f90
      Linus Torvalds authored
      Pull virtio fixes from Rusty Russell:
       "Not entirely surprising: the ongoing QEMU work on virtio 1.0 has
        revealed more minor issues with our virtio 1.0 drivers just introduced
        in the kernel.
      
        (I would normally use my fixes branch for this, but there were a batch
        of them...)"
      
      * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        virtio_mmio: fix access width for mmio
        uapi/virtio_scsi: allow overriding CDB/SENSE size
        virtio_mmio: generation support
        virtio_rpmsg: set DRIVER_OK before using device
        9p/trans_virtio: fix hot-unplug
        virtio-balloon: do not call blocking ops when !TASK_RUNNING
        virtio_blk: fix comment for virtio 1.0
        virtio_blk: typo fix
        virtio_balloon: set DRIVER_OK before using device
        virtio_console: avoid config access from irq
        virtio_console: init work unconditionally
    • Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 2fc67756
      Linus Torvalds authored
      Pull kvm fixes from Marcelo Tosatti:
       "KVM bug fixes (ARM and x86)"
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        arm/arm64: KVM: Keep elrsr/aisr in sync with software model
        KVM: VMX: Set msr bitmap correctly if vcpu is in guest mode
        arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create()
        kvm: x86: i8259: return initialized data on invalid-size read
        arm64: KVM: Fix outdated comment about VTCR_EL2.PS
        arm64: KVM: Do not use pgd_index to index stage-2 pgd
        arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting
        kvm: move advertising of KVM_CAP_IRQFD to common code
    • pagemap: do not leak physical addresses to non-privileged userspace · ab676b7d
      Kirill A. Shutemov authored
      As pointed out by a recent post [1] on exploiting DRAM physical imperfections,
      /proc/PID/pagemap exposes sensitive information which can be used to mount
      attacks.

      This prevents anybody without CAP_SYS_ADMIN from reading the pagemap.
      
      [1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
      
      [ Eventually we might want to do something more fine-grained, but for now
        this is the simple model.   - Linus ]
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Seaborn <mseaborn@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
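      To illustrate why pagemap entries are sensitive: per the kernel's pagemap
      documentation, each 64-bit entry carries the page frame number in bits
      0-54 and a "page present" flag in bit 63. The decoder below is a minimal
      sketch with hypothetical helper names; it only interprets an entry value
      and does not read /proc/PID/pagemap itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PM_PRESENT  (UINT64_C(1) << 63)		/* bit 63 */
#define SKETCH_PM_PFN_MASK ((UINT64_C(1) << 55) - 1)	/* bits 0-54 */

/* True if the entry describes a page that is present in RAM. */
bool sketch_pagemap_present(uint64_t entry)
{
	return (entry & SKETCH_PM_PRESENT) != 0;
}

/*
 * Physical address behind vaddr, derived from its pagemap entry.
 * Returns 0 when the page is not present.
 */
uint64_t sketch_pagemap_phys(uint64_t entry, uint64_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	if (!sketch_pagemap_present(entry))
		return 0;
	return (entry & SKETCH_PM_PFN_MASK) * page_size + vaddr % page_size;
}
```

      Any process able to read its own entries could translate virtual to
      physical addresses this way, which is the information the commit stops
      exposing to callers without CAP_SYS_ADMIN.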
    • Merge tag 'asoc-fix-v4.0-rc4' of... · 3fc6c5a1
      Takashi Iwai authored
      Merge tag 'asoc-fix-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
      
      ASoC: Fixes for v4.0
      
      As well as the usual collection of driver specific fixes there are a few
      more generic things:
      
       - Lots of fixes from Takashi for drivers using the wrong field in the
         control union to communicate with userspace, leading to potential
         errors on 64 bit systems.
       - A fix from Lars for locking of the lists of devices we maintain,
         mostly only likely to trigger during device probe and removal.
    • livepatch: Fix subtle race with coming and going modules · 8cb2c2dc
      Petr Mladek authored
      There is a notifier that handles live patches for coming and going modules.
      It takes klp_mutex lock to avoid races with coming and going patches but
      it does not keep the lock all the time. Therefore the following races are
      possible:
      
        1. The notifier is called sometime in MODULE_STATE_COMING. The module
           is visible to find_module() in this state all the time. It means that
           a new patch can be registered and enabled even before the notifier is
           called. This might create a wrong order of stacked patches, see below
           for an example.

        2. A new patch could still see the module in the GOING state even after
           the notifier has been called. It will try to initialize the related
           object structures but the module could disappear at any time. This
           would leave a mess in the structures. It might even cause an invalid
           memory access.
      
      This patch solves the problem by adding a boolean variable into struct module.
      The value is true after the coming and before the going handler is called.
      New patches need to be applied when the value is true and they need to ignore
      the module when the value is false.
      
      Note that we need to know state of all modules on the system. The races are
      related to new patches. Therefore we do not know what modules will get
      patched.
      
      Also note that we could not simply ignore going modules. The code from the
      module could be called even in the GOING state until mod->exit() finishes.
      If we start supporting patches with semantic changes between function
      calls, we need to apply new patches to any still usable code.
      See below for an example.
      
      Finally note that the patch solves only the situation when a new patch is
      registered. There are no such problems when the patch is being removed.
      It does not matter who disables the patch first, whether the normal
      disable_patch() or the module notifier. There is nothing to do
      once the patch is disabled.
      
      Alternative solutions:
      ======================
      
      + reject new patches when a patched module is coming or going; this is ugly
      
      + wait with adding new patch until the module leaves the COMING and GOING
        states; this might be dangerous and complicated; we would need to release
        kgr_lock in the middle of the patch registration to avoid a deadlock
        with the coming and going handlers; also we might need a waitqueue for
        each module which seems to be even bigger overhead than the boolean
      
      + stop modules from entering COMING and GOING states; wait until modules
        leave these states when they are already there; looks complicated; we would
        need to ignore the module that asked to stop the others to avoid a deadlock;
        also it is unclear what to do when two modules asked to stop others and
        both are in COMING state (situation when two new patches are applied)
      
      + always register/enable new patches and fix up the potential mess (registered
        patches order) in klp_module_init(); this is nasty and prone to regressions
        in the future development
      
      + add another MODULE_STATE where the kallsyms are visible but the module is not
        used yet; this looks too complex; the module states are checked on "many"
        locations
      
      Example of patch stacking breakage:
      ===================================
      
      The notifier could _not_ _simply_ ignore already initialized module objects.
      For example, let's have three patches (P1, P2, P3) for functions a() and b()
      where a() is from vmcore and b() is from a module M. Something like:
      
      	a()	b()
      P1	a1()	b1()
      P2	a2()	b2()
      P3	a3()	b3()
      
      If you load the module M after all patches are registered and enabled,
      the ftrace ops for functions a() and b() list the functions in this
      order:

      	ops_a->func_stack -> list(a3,a2,a1)
      	ops_b->func_stack -> list(b3,b2,b1)

      so the pointer to b3() is the first and will be used.
      
      Then you might have the following scenario. Let's start with the state when
      patches P1 and P2 are registered and enabled but the module M is not loaded.
      The ftrace ops for b() then do not exist. Then we get into the following race:
      
      CPU0					CPU1
      
      load_module(M)
      
        complete_formation()
      
        mod->state = MODULE_STATE_COMING;
        mutex_unlock(&module_mutex);
      
      					klp_register_patch(P3);
      					klp_enable_patch(P3);
      
      					# STATE 1
      
        klp_module_notify(M)
          klp_module_notify_coming(P1);
          klp_module_notify_coming(P2);
          klp_module_notify_coming(P3);
      
      					# STATE 2
      
      The ftrace ops for a() and b() then look like this:
      
        STATE1:
      
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b3);
      
        STATE2:
      	ops_a->func_stack -> list(a3,a2,a1);
      	ops_b->func_stack -> list(b2,b1,b3);
      
      Therefore, b2() is used for the module but a3() is used for vmcore
      because they were the last added.
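
      The two orderings above can be reproduced with a small userspace
      simulation. This is a sketch only: struct sketch_func_stack and the
      helpers are hypothetical stand-ins for the ftrace ops func_stack, and
      each new entry is added at the head, as the lists above are drawn.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_STACK_MAX 8

/* Stand-in for a func_stack: index 0 is the function that gets used. */
struct sketch_func_stack {
	const char *funcs[SKETCH_STACK_MAX];
	size_t len;
};

/* Add a replacement function at the head of the stack. */
int sketch_push(struct sketch_func_stack *s, const char *func)
{
	assert(s != NULL && func != NULL);
	if (s->len == SKETCH_STACK_MAX)
		return -1;
	memmove(&s->funcs[1], &s->funcs[0], s->len * sizeof(s->funcs[0]));
	s->funcs[0] = func;
	s->len++;
	return 0;
}

/* The function currently in effect, or NULL for an empty stack. */
const char *sketch_active(const struct sketch_func_stack *s)
{
	assert(s != NULL);
	return s->len ? s->funcs[0] : NULL;
}

/* M is loaded after P1..P3 are enabled: patches are applied in order. */
const char *sketch_expected_order(void)
{
	struct sketch_func_stack b = { { NULL }, 0 };

	sketch_push(&b, "b1");
	sketch_push(&b, "b2");
	sketch_push(&b, "b3");
	return sketch_active(&b);
}

/*
 * The race: P3 patches b() while M is still COMING (STATE 1), then the
 * notifier applies P1 and P2 and skips the already initialized P3 (STATE 2).
 */
const char *sketch_racy_order(void)
{
	struct sketch_func_stack b = { { NULL }, 0 };

	sketch_push(&b, "b3");
	sketch_push(&b, "b1");
	sketch_push(&b, "b2");
	return sketch_active(&b);
}
```

      sketch_expected_order() ends with b3 at the head, while
      sketch_racy_order() ends with b2, matching the list(b2,b1,b3) shown for
      STATE2.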
      
      Example of the race with going modules:
      =======================================
      
      CPU0					CPU1
      
      delete_module()  #SYSCALL
      
         try_stop_module()
           mod->state = MODULE_STATE_GOING;
      
         mutex_unlock(&module_mutex);
      
      					klp_register_patch()
      					klp_enable_patch()
      
      					# safe place to switch universe
      
      					b()     # from module that is going
      					  a()   # from core (patched)
      
         mod->exit();
      
      Note that the function b() can be called until we call mod->exit().
      
      If we do not apply the patch to b() because the module is in MODULE_STATE_GOING,
      b() will call the patched a() with modified semantics and things might go wrong.
      
      [jpoimboe@redhat.com: use one boolean instead of two]
      Signed-off-by: Petr Mladek <pmladek@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    • virtio_mmio: fix access width for mmio · 704a0b5f
      Michael S. Tsirkin authored
      Going over the virtio mmio code, I noticed that it doesn't correctly
      access modern device config values using "natural" accessors: it uses
      readb to get/set them byte by byte, while the virtio 1.0 spec explicitly states:
      
      	4.2.2.2 Driver Requirements: MMIO Device Register Layout
      
      	...
      
      	The driver MUST only use 32 bit wide and aligned reads and writes to
      	access the control registers described in table 4.1.
      	For the device-specific configuration space, the driver MUST use
      	8 bit wide accesses for 8 bit wide fields, 16 bit wide and aligned
      	accesses for 16 bit wide fields and 32 bit wide and aligned accesses for
      	32 and 64 bit wide fields.
      
      Borrow code from virtio_pci_modern to do this correctly.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
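      The access-width rule quoted from the spec can be sketched as a dispatch
      on field size. This is a userspace illustration with hypothetical names,
      not the driver code: a byte array stands in for the device configuration
      space, an access counter stands in for bus transactions, and alignment
      checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_cfg {
	const uint8_t *base;	/* little-endian bytes of a fake config space */
	unsigned int accesses;	/* number of accesses issued so far */
};

uint8_t sketch_read8(struct sketch_cfg *c, size_t off)
{
	c->accesses++;
	return c->base[off];
}

uint16_t sketch_read16(struct sketch_cfg *c, size_t off)
{
	c->accesses++;
	return (uint16_t)(c->base[off] | ((uint16_t)c->base[off + 1] << 8));
}

uint32_t sketch_read32(struct sketch_cfg *c, size_t off)
{
	c->accesses++;
	return (uint32_t)c->base[off] |
	       ((uint32_t)c->base[off + 1] << 8) |
	       ((uint32_t)c->base[off + 2] << 16) |
	       ((uint32_t)c->base[off + 3] << 24);
}

/* Read one field with an access width that matches the field size. */
int sketch_cfg_get(struct sketch_cfg *c, size_t off, void *buf, size_t len)
{
	uint8_t b;
	uint16_t w;
	uint32_t l;
	uint64_t q;

	assert(c != NULL && buf != NULL);
	switch (len) {
	case 1:
		b = sketch_read8(c, off);
		memcpy(buf, &b, sizeof(b));
		return 0;
	case 2:
		w = sketch_read16(c, off);
		memcpy(buf, &w, sizeof(w));
		return 0;
	case 4:
		l = sketch_read32(c, off);
		memcpy(buf, &l, sizeof(l));
		return 0;
	case 8:
		/* 64 bit fields are read as two 32 bit accesses. */
		q = sketch_read32(c, off);
		q |= (uint64_t)sketch_read32(c, off + 4) << 32;
		memcpy(buf, &q, sizeof(q));
		return 0;
	default:
		return -1;	/* no byte-by-byte fallback in this sketch */
	}
}

/* Sample config bytes used to exercise the sketch. */
const uint8_t sketch_bytes[8] = {
	0x78, 0x56, 0x34, 0x12, 0xef, 0xcd, 0xab, 0x89
};
```

      Reading a 32 bit field issues one access and a 64 bit field issues two,
      instead of the four or eight single-byte reads a byte-by-byte loop
      would issue.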
  4. 16 Mar, 2015 8 commits