1. 26 Feb, 2013 9 commits
    • NeilBrown's avatar
      md/raid1,raid10: fix deadlock with freeze_array() · ee0b0244
      NeilBrown authored
      When raid1/raid10 needs to fix a read error, it first drains
      all pending requests by calling freeze_array().
      This calls flush_pending_writes() if it needs to sleep,
      but some writes may be pending in a per-process plug rather
      than in the per-array request queue.
      
      When raid1{,0}_unplug() moves the request from the per-process
      plug to the per-array request queue (from which
      flush_pending_writes() can flush them), it needs to wake up
      freeze_array(), or freeze_array() will never flush them and so
      it will block forever.
      
      So add the requires wake_up() calls.
      
      This bug was introduced by commit
         f54a9d0e
      for raid1 and a similar commit for RAID10, and so has been present
      since linux-3.6.  As the bug causes a deadlock I believe this fix is
      suitable for -stable.
      
      Cc: stable@vger.kernel.org (3.6.y 3.7.y 3.8.y)
      Reported-by: default avatarTregaron Bayly <tbayly@bluehost.com>
      Tested-by: default avatarTregaron Bayly <tbayly@bluehost.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      ee0b0244
    • NeilBrown's avatar
      md/raid0: improve error message when converting RAID4-with-spares to RAID0 · f96c9f30
      NeilBrown authored
      Mentioning "bad disk number -1" exposes irrelevant internal detail.
      Just say they are inactive and must be removed.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      f96c9f30
    • NeilBrown's avatar
      md: raid0: fix error return from create_stripe_zones. · 58ebb34c
      NeilBrown authored
      Create_stripe_zones returns an error slightly differently to
      raid0_run and to raid0_takeover_*.
      
      The error returned used by the second was wrong and an error would
      result in mddev->private being set to NULL and sooner or later a
      crash.
      
      So never return NULL, return ERR_PTR(err), not NULL from
      create_stripe_zones.
      
      This bug has been present since 2.6.35 so the fix is suitable
      for any kernel since then.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      58ebb34c
    • NeilBrown's avatar
      md: fix two bugs when attempting to resize RAID0 array. · a6468539
      NeilBrown authored
      You cannot resize a RAID0 array (in terms of making the devices
      bigger), but the code doesn't entirely stop you.
      So:
      
       disable setting of the available size on each device for
       RAID0 and Linear devices.  This must not change as doing so
       can change the effective layout of data.
      
       Make sure that the size that raid0_size() reports is accurate,
       but rounding devices sizes to chunk sizes.  As the device sizes
       cannot change now, this isn't so important, but it is best to be
       safe.
      
      Without this change:
        mdadm --grow /dev/md0 -z max
        mdadm --grow /dev/md0 -Z max
        then read to the end of the array
      
      can cause a BUG in a RAID0 array.
      
      These bugs have been present ever since it became possible
      to resize any device, which is a long time.  So the fix is
      suitable for any -stable kerenl.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      a6468539
    • Jonathan Brassow's avatar
      DM RAID: Add support for MD's RAID10 "far" and "offset" algorithms · fe5d2f4a
      Jonathan Brassow authored
      DM RAID:  Add support for MD's RAID10 "far" and "offset" algorithms
      
      Until now, dm-raid.c only supported the "near" algorthm of MD's RAID10
      implementation.  This patch adds support for the "far" and "offset"
      algorithms, but only with the improved redundancy that is brought with
      the introduction of the 'use_far_sets' bit, which shifts copied stripes
      according to smaller sets vs the entire array.  That is, the 17th bit
      of the 'layout' variable that defines the RAID10 implementation will
      always be set.   (More information on how the 'layout' variable selects
      the RAID10 algorithm can be found in the opening comments of
      drivers/md/raid10.c.)
      Signed-off-by: default avatarJonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      fe5d2f4a
    • Jonathan Brassow's avatar
      MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 2) · 9a3152ab
      Jonathan Brassow authored
      MD RAID10:  Improve redundancy for 'far' and 'offset' algorithms (part 2)
      
      This patch addresses raid arrays that have a number of devices that cannot
      be evenly divided by 'far_copies'.  (E.g. 5 devices, far_copies = 2)  This
      case must be handled differently because it causes that last set to be of
      a different size than the rest of the sets.  We must compute a new modulo
      for this last set so that copied chunks are properly wrapped around.
      
      Example use_far_sets=1, far_copies=2, near_copies=1, devices=5:
                      "far" algorithm
              dev1 dev2 dev3 dev4 dev5
      	==== ==== ==== ==== ====
      	[ A   B ] [ C    D   E ]
              [ G   H ] [ I    J   K ]
                          ...
              [ B   A ] [ E    C   D ] --> nominal set of 2 and last set of 3
              [ H   G ] [ K    I   J ]     []'s show far/offset sets
      Signed-off-by: default avatarJonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      9a3152ab
    • Jonathan Brassow's avatar
      MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1) · 475901af
      Jonathan Brassow authored
      The MD RAID10 'far' and 'offset' algorithms make copies of entire stripe
      widths - copying them to a different location on the same devices after
      shifting the stripe.  An example layout of each follows below:
      
      	        "far" algorithm
      	dev1 dev2 dev3 dev4 dev5 dev6
      	==== ==== ==== ==== ==== ====
      	 A    B    C    D    E    F
      	 G    H    I    J    K    L
      	            ...
      	 F    A    B    C    D    E  --> Copy of stripe0, but shifted by 1
      	 L    G    H    I    J    K
      	            ...
      
      		"offset" algorithm
      	dev1 dev2 dev3 dev4 dev5 dev6
      	==== ==== ==== ==== ==== ====
      	 A    B    C    D    E    F
      	 F    A    B    C    D    E  --> Copy of stripe0, but shifted by 1
      	 G    H    I    J    K    L
      	 L    G    H    I    J    K
      	            ...
      
      Redundancy for these algorithms is gained by shifting the copied stripes
      one device to the right.  This patch proposes that array be divided into
      sets of adjacent devices and when the stripe copies are shifted, they wrap
      on set boundaries rather than the array size boundary.  That is, for the
      purposes of shifting, the copies are confined to their sets within the
      array.  The sets are 'near_copies * far_copies' in size.
      
      The above "far" algorithm example would change to:
      	        "far" algorithm
      	dev1 dev2 dev3 dev4 dev5 dev6
      	==== ==== ==== ==== ==== ====
      	 A    B    C    D    E    F
      	 G    H    I    J    K    L
      	            ...
      	 B    A    D    C    F    E  --> Copy of stripe0, shifted 1, 2-dev sets
      	 H    G    J    I    L    K      Dev sets are 1-2, 3-4, 5-6
      	            ...
      
      This has the affect of improving the redundancy of the array.  We can
      always sustain at least one failure, but sometimes more than one can
      be handled.  In the first examples, the pairs of devices that CANNOT fail
      together are:
      	(1,2) (2,3) (3,4) (4,5) (5,6) (1, 6) [40% of possible pairs]
      In the example where the copies are confined to sets, the pairs of
      devices that cannot fail together are:
      	(1,2) (3,4) (5,6)                    [20% of possible pairs]
      
      We cannot simply replace the old algorithms, so the 17th bit of the 'layout'
      variable is used to indicate whether we use the old or new method of computing
      the shift.  (This is similar to the way the 16th bit indicates whether the
      "far" algorithm or the "offset" algorithm is being used.)
      
      This patch only handles the cases where the number of total raid disks is
      a multiple of 'far_copies'.  A follow-on patch addresses the condition where
      this is not true.
      Signed-off-by: default avatarJonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      475901af
    • Jonathan Brassow's avatar
      MD RAID10: Minor non-functional code changes · 4c0ca26b
      Jonathan Brassow authored
      Changes include assigning 'addr' from 's' instead of 'sector' to be
      consistent with the way the code does it just a few lines later and
      using '%=' vs a conditional and subtraction.
      Signed-off-by: default avatarJonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      4c0ca26b
    • Joe Lawrence's avatar
      md: raid1,10: Handle REQ_WRITE_SAME flag in write bios · c8dc9c65
      Joe Lawrence authored
      Set mddev queue's max_write_same_sectors to its chunk_sector value (before
      disk_stack_limits merges the underlying disk limits.)  With that in place,
      be sure to handle writes coming down from the block layer that have the
      REQ_WRITE_SAME flag set.  That flag needs to be copied into any newly cloned
      write bio.
      Signed-off-by: default avatarJoe Lawrence <joe.lawrence@stratus.com>
      Acked-by: default avatar"Martin K. Petersen" <martin.petersen@oracle.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      c8dc9c65
  2. 21 Feb, 2013 1 commit
  3. 18 Feb, 2013 3 commits
  4. 15 Feb, 2013 9 commits
  5. 14 Feb, 2013 4 commits
    • David S. Miller's avatar
      sunvdc: Fix off-by-one in generic_request(). · f4d96054
      David S. Miller authored
      The 'operations' bitmap corresponds one-for-one with the operation
      codes, no adjustment is necessary.
      Reported-by: default avatarMark Kettenis <mark.kettenis@xs4all.nl>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f4d96054
    • Tomi Valkeinen's avatar
      omapdrm: fix the dependency to omapdss · 91e83ffd
      Tomi Valkeinen authored
      omapdrm uses "select" in Kconfig to enable omapdss. This doesn't work
      correctly, as "select" forces omapdss to be enabled in the config even
      if it normally could not be enabled because of missing Kconfig
      dependencies.
      
      This causes a build break on ARM, when using allyesconfig:
      
      drivers/video/omap2/dss/dss.c: In function 'dss_calc_clock_div':
      drivers/video/omap2/dss/dss.c:572:20: error: 'CONFIG_OMAP2_DSS_MIN_FCK_PER_PCK' undeclared (first use in this function)
      drivers/video/omap2/dss/dss.c:572:20: note: each undeclared identifier is reported only once for each function it appears in
      
      Instead of using select, this patch changes omapdrm to use "depend
      on".
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
      91e83ffd
    • NeilBrown's avatar
      OMAPDSS: add FEAT_DPI_USES_VDDS_DSI to omap3630_dss_feat_list · eb91e79b
      NeilBrown authored
      commit 195e672a
         OMAPDSS: DPI: Remove cpu_is_xxxx checks
      
      made the mistake of assuming that cpu_is_omap34xx() is exclusive of
      other cpu_is_* predicates whereas it includes cpu_is_omap3630().
      
      So on an omap3630, code that was previously enabled by
        if (cpu_is_omap34xx())
      is now disabled as
        dss_has_feature(FEAT_DPI_USES_VDDS_DSI)
      fails.
      
      So add FEAT_DPI_USES_VDDS_DSI to omap3630_dss_feat_list.
      
      Cc: Chandrabhanu Mahapatra <cmahapatra@ti.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
      eb91e79b
    • Satoru Takeuchi's avatar
      efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter · 1de63d60
      Satoru Takeuchi authored
      There was a serious problem in samsung-laptop that its platform driver is
      designed to run under BIOS and running under EFI can cause the machine to
      become bricked or can cause Machine Check Exceptions.
      
          Discussion about this problem:
          https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
          https://bugzilla.kernel.org/show_bug.cgi?id=47121
      
          The patches to fix this problem:
          efi: Make 'efi_enabled' a function to query EFI facilities
          83e68189
      
          samsung-laptop: Disable on EFI hardware
          e0094244
      
      Unfortunately this problem comes back again if users specify "noefi" option.
      This parameter clears EFI_BOOT and that driver continues to run even if running
      under EFI. Refer to the document, this parameter should clear
      EFI_RUNTIME_SERVICES instead.
      
      Documentation/kernel-parameters.txt:
      ===============================================================================
      ...
      	noefi		[X86] Disable EFI runtime services support.
      ...
      ===============================================================================
      
      Documentation/x86/x86_64/uefi.txt:
      ===============================================================================
      ...
      - If some or all EFI runtime services don't work, you can try following
        kernel command line parameters to turn off some or all EFI runtime
        services.
      	noefi		turn off all EFI runtime services
      ...
      ===============================================================================
      Signed-off-by: default avatarSatoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
      Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      1de63d60
  6. 13 Feb, 2013 14 commits
    • Cyril Roelandt's avatar
      xen: remove redundant NULL check before unregister_and_remove_pcpu(). · 4f8c8527
      Cyril Roelandt authored
      unregister_and_remove_pcpu on a NULL pointer is a no-op, so the NULL check in
      sync_pcpu can be removed.
      Signed-off-by: default avatarCyril Roelandt <tipecaml@gmail.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      4f8c8527
    • Jan Beulich's avatar
      x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS. · 13d2b4d1
      Jan Beulich authored
      This fixes CVE-2013-0228 / XSA-42
      
      Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user
      in 32bit PV guest can use to crash the > guest with the panic like this:
      
      -------------
      general protection fault: 0000 [#1] SMP
      last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
      Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
      iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
      xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4
      mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last
      unloaded: scsi_wait_scan]
      
      Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1
      EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0
      EIP is at xen_iret+0x12/0x2b
      EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010
      ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0
       DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069
      Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000)
      Stack:
       00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000
      Call Trace:
      Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00
      8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40
      10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02
      EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0
      general protection fault: 0000 [#2]
      ---[ end trace ab0d29a492dcd330 ]---
      Kernel panic - not syncing: Fatal exception
      Pid: 1250, comm: r Tainted: G      D    ---------------
      2.6.32-356.el6.i686 #1
      Call Trace:
       [<c08476df>] ? panic+0x6e/0x122
       [<c084b63c>] ? oops_end+0xbc/0xd0
       [<c084b260>] ? do_general_protection+0x0/0x210
       [<c084a9b7>] ? error_code+0x73/
      -------------
      
      Petr says: "
       I've analysed the bug and I think that xen_iret() cannot cope with
       mangled DS, in this case zeroed out (null selector/descriptor) by either
       xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT
       entry was invalidated by the reproducer. "
      
      Jan took a look at the preliminary patch and came up a fix that solves
      this problem:
      
      "This code gets called after all registers other than those handled by
      IRET got already restored, hence a null selector in %ds or a non-null
      one that got loaded from a code or read-only data descriptor would
      cause a kernel mode fault (with the potential of crashing the kernel
      as a whole, if panic_on_oops is set)."
      
      The way to fix this is to realize that the we can only relay on the
      registers that IRET restores. The two that are guaranteed are the
      %cs and %ss as they are always fixed GDT selectors. Also they are
      inaccessible from user mode - so they cannot be altered. This is
      the approach taken in this patch.
      
      Another alternative option suggested by Jan would be to relay on
      the subtle realization that using the %ebp or %esp relative references uses
      the %ss segment.  In which case we could switch from using %eax to %ebp and
      would not need the %ss over-rides. That would also require one extra
      instruction to compensate for the one place where the register is used
      as scaled index. However Andrew pointed out that is too subtle and if
      further work was to be done in this code-path it could escape folks attention
      and lead to accidents.
      Reviewed-by: default avatarPetr Matousek <pmatouse@redhat.com>
      Reported-by: default avatarPetr Matousek <pmatouse@redhat.com>
      Reviewed-by: default avatarAndrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      13d2b4d1
    • David S. Miller's avatar
      sparc64: Fix get_user_pages_fast() wrt. THP. · 89a77915
      David S. Miller authored
      Mostly mirrors the s390 logic, as unlike x86 we don't need the
      SetPageReferenced() bits.
      
      On sparc64 we also lack a user/privileged bit in the huge PMDs.
      
      In order to make this work for THP and non-THP builds, some header
      file adjustments were necessary.  Namely, provide the PMD_HUGE_* bit
      defines and the pmd_large() inline unconditionally rather than
      protected by TRANSPARENT_HUGEPAGE.
      Reported-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      89a77915
    • David S. Miller's avatar
      sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE. · b9156ebb
      David S. Miller authored
      This got missed in the cleanups done for the S390 THP
      support.
      
      CC: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9156ebb
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 323a72d8
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "This is primarily to get those r8169 reverts sorted, but other fixes
        have accumulated meanwhile.
      
         1) Revert two r8169 changes to fix suspend/resume for some users,
            from Francois Romieu.
      
         2) PCI dma mapping errors in atl1c are not checked for and this cause
            hard crashes for some users, from Xiong Huang.
      
         3) In 3.8.x we merged the removal of the EXPERIMENTAL dependency for
            'dlm' but the same patch for 'sctp' got lost somewhere, resulting
            in the potential for build errors since there are cross
            dependencies.  From Kees Cook.
      
         4) SCTP's ipv6 socket route validation makes boolean tests
            incorrectly, fix from Daniel Borkmann.
      
         5) mac80211 does sizeof(ptr) instead of (sizeof(ptr) * nelem), from
            Cong Ding.
      
         6) arp_rcv() can crash on shared non-linear packets, from Eric
            Dumazet.
      
         7) Avoid crashes in macvtap by setting ->gso_type consistently in
            ixgbe, qlcnic, and bnx2x drivers.  From Michael S Tsirkin and
            Alexander Duyck.
      
         8) Trinity fuzzer spots infinite loop in __skb_recv_datagram(), fix
            from Eric Dumazet.
      
         9) STP protocol frames should use high packet priority, otherwise an
            overloaded bridge can get stuck.  From Stephen Hemminger.
      
        10) The HTB packet scheduler was converted some time ago to store
            internal timestamps in nanoseconds, but we don't convert back into
            psched ticks for the user during dumps.  Fix from Jiri Pirko.
      
        11) mwl8k channel table doesn't set the .band field properly,
            resulting in NULL pointer derefs.  Fix from Jonas Gorski.
      
        12) mac80211 doesn't accumulate channels properly during a scan so we
            can downgrade heavily to a much less desirable connection speed.
            Fix from Johannes Berg.
      
        13) PHY probe failure in stmmac can result in resource leaks and
            double MDIO registery later, from Giuseppe CAVALLARO.
      
        14) Correct ipv6 checksumming in ip6t_NPT netfilter module, also fix
            address prefix mangling, from YOSHIFUJI Hideaki."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
        net, sctp: remove CONFIG_EXPERIMENTAL
        net: sctp: sctp_v6_get_dst: fix boolean test in dst cache
        batman-adv: Fix NULL pointer dereference in DAT hash collision avoidance
        net/macb: fix race with RX interrupt while doing NAPI
        atl1c: add error checking for pci_map_single functions
        htb: fix values in opt dump
        ixgbe: Only set gso_type to SKB_GSO_TCPV4 as RSC does not support IPv6
        net: fix infinite loop in __skb_recv_datagram()
        net: qmi_wwan: add Yota / Megafon M100-1 4g modem
        mwl8k: fix band for supported channels
        bridge: set priority of STP packets
        mac80211: fix channel selection bug
        arp: fix possible crash in arp_rcv()
        bnx2x: set gso_type
        qlcnic: set gso_type
        ixgbe: fix gso type
        stmmac: mdio register has to fail if the phy is not found
        stmmac: fix macro used for debugging the xmit
        Revert "r8169: enable internal ASPM and clock request settings".
        Revert "r8169: enable ALDPS for power saving".
        ...
      323a72d8
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42976ad0
      Linus Torvalds authored
      Pull x86 fixes from Peter Anvin:
       "One (hopefully) last batch of x86 fixes.  You asked for the patch by
        patch justifications, so here they are:
      
            x86, MCE: Retract most UAPI exports
      
         This one unexports from userspace a bunch of definitions which should
         never have been exported.  We really don't want to create an
         accidental legacy here.
      
            x86, doc: Add a bootloader ID for OVMF
      
         This is a documentation-only patch, just recording the official
         assignment of a boot loader ID.
      
            x86: Do not leak kernel page mapping locations
      
         Security: avoid making it needlessly easy for user space to probe the
         kernel memory layout.
      
            x86/mm: Check if PUD is large when validating a kernel address
      
         Prevent failures using /proc/kcore when using 1G pages.
      
            x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
      
         Works around a BIOS problem causing boot failures on affected hardware."
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Check if PUD is large when validating a kernel address
        x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems
        x86, doc: Add a bootloader ID for OVMF
        x86: Do not leak kernel page mapping locations
        x86, MCE: Retract most UAPI exports
      42976ad0
    • Wolfram Sang's avatar
      MAINTAINERS: change my email and repos · 14d77c4d
      Wolfram Sang authored
      Change to my private email, change to my shiny new kernel.org repos,
      and drop outdated entry from the former maintainer. Drop my PCA entry,
      too, since it belongs to the I2C realm anyhow.
      Signed-off-by: default avatarWolfram Sang <wolfram@the-dreams.de>
      14d77c4d
    • Rafael J. Wysocki's avatar
      PCI/PM: Clean up PME state when removing a device · 249bfb83
      Rafael J. Wysocki authored
      Devices are added to pci_pme_list when drivers use pci_enable_wake()
      or pci_wake_from_d3(), but they aren't removed from the list unless
      the driver explicitly disables wakeup.  Many drivers never disable
      wakeup, so their devices remain on the list even after they are
      removed, e.g., via hotplug.  A subsequent PME poll will oops when
      it tries to touch the device.
      
      This patch disables PME# on a device before removing it, which removes
      the device from pci_pme_list.  This is safe even if the device never
      had PME# enabled.
      
      This oops can be triggered by unplugging a Thunderbolt ethernet adapter
      on a Macbook Pro, as reported by Daniel below.
      
      [bhelgaas: changelog]
      Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.comReported-and-tested-by: default avatarDaniel J Blueman <daniel@quora.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org
      249bfb83
    • Kees Cook's avatar
      net, sctp: remove CONFIG_EXPERIMENTAL · 3bdb1a44
      Kees Cook authored
      This config item has not carried much meaning for a while now and is
      almost always enabled by default. As agreed during the Linux kernel
      summit, remove it.
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3bdb1a44
    • Daniel Borkmann's avatar
      net: sctp: sctp_v6_get_dst: fix boolean test in dst cache · e9c0dfba
      Daniel Borkmann authored
      We walk through the bind address list and try to get the best source
      address for a given destination. However, currently, we take the
      'continue' path of the loop when an entry is invalid (!laddr->valid)
      *and* the entry state does not equal SCTP_ADDR_SRC (laddr->state !=
      SCTP_ADDR_SRC).
      
      Thus, still, invalid entries with SCTP_ADDR_SRC might not 'continue'
      as well as valid entries with SCTP_ADDR_{NEW, SRC, DEL}, with a possible
      false baddr and matchlen as a result, causing in worst case dst route
      to be false or possibly NULL.
      
      This test should actually be a '||' instead of '&&'. But lets fix it
      and make this a bit easier to read by having the condition the same way
      as similarly done in sctp_v4_get_dst.
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9c0dfba
    • Pau Koning's avatar
      batman-adv: Fix NULL pointer dereference in DAT hash collision avoidance · 816cd5b8
      Pau Koning authored
      An entry in DAT with the hashed position of 0 can cause a NULL pointer
      dereference when the first entry is checked by batadv_choose_next_candidate.
      This first candidate automatically has the max value of 0 and the max_orig_node
      of NULL. Not checking max_orig_node for NULL in batadv_is_orig_node_eligible
      will lead to a NULL pointer dereference when checking for the lowest address.
      
      This problem was added in 785ea114
      ("batman-adv: Distributed ARP Table - create DHT helper functions").
      Signed-off-by: default avatarPau Koning <paukoning@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      816cd5b8
    • Nicolas Ferre's avatar
      net/macb: fix race with RX interrupt while doing NAPI · 8770e91a
      Nicolas Ferre authored
      When interrupts are disabled, an RX condition can occur but
      it is not reported when enabling interrupts again. We need to check
      RSR and use napi_reschedule() if condition is met.
      Signed-off-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8770e91a
    • Huang, Xiong's avatar
      atl1c: add error checking for pci_map_single functions · ac574804
      Huang, Xiong authored
      it is reported that code hit DMA-API errors on 3.8-rc6+,
      (see https://bugzilla.redhat.com/show_bug.cgi?id=908436, and
           https://bugzilla.redhat.com/show_bug.cgi?id=908550)
      
      this patch just adds error handler for
          pci_map_single and skb_frag_dma_map.
      Signed-off-by: default avatarxiong <xiong@qca.qualcomm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ac574804
    • Mel Gorman's avatar
      x86/mm: Check if PUD is large when validating a kernel address · 0ee364eb
      Mel Gorman authored
      A user reported the following oops when a backup process reads
      /proc/kcore:
      
       BUG: unable to handle kernel paging request at ffffbb00ff33b000
       IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
       [...]
      
       Call Trace:
        [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
        [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
        [<ffffffff81151687>] vfs_read+0xc7/0x130
        [<ffffffff811517f3>] sys_read+0x53/0xa0
        [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
      
      Investigation determined that the bug triggered when reading
      system RAM at the 4G mark. On this system, that was the first
      address using 1G pages for the virt->phys direct mapping so the
      PUD is pointing to a physical address, not a PMD page.
      
      The problem is that the page table walker in kern_addr_valid() is
      not checking pud_large() and treats the physical address as if
      it was a PMD.  If it happens to look like pmd_none then it'll
      silently fail, probably returning zeros instead of real data. If
      the data happens to look like a present PMD though, it will be
      walked resulting in the oops above.
      
      This patch adds the necessary pud_large() check.
      
      Unfortunately the problem was not readily reproducible and now
      they are running the backup program without accessing
      /proc/kcore so the patch has not been validated but I think it
      makes sense.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reviewed-by: default avatarRik van Riel <riel@redhat.coM>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: stable@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0ee364eb