1. 22 Sep, 2014 6 commits
    • md/raid1: update next_resync under resync_lock. · c2fd4c94
      NeilBrown authored
      raise_barrier() uses next_resync as part of its calculations, so it
      really should be updated first, instead of afterwards.
      
      next_resync is always used under resync_lock, so update it under
      resync_lock too, just before it is used.  That is safest.
      
      This could cause normal IO and resync IO to interact badly, so
      it is suitable for -stable.
      
      Fixes: 79ef3a8a
      cc: stable@vger.kernel.org (v3.13+)
      Signed-off-by: NeilBrown <neilb@suse.de>
      c2fd4c94
    • md/raid1: Don't use next_resync to determine how far resync has progressed · 23554960
      NeilBrown authored
      next_resync is (approximately) the location for the next resync request.
      However it does *not* reliably determine the earliest location
      at which resync might be happening.
      This is because resync requests can complete out of order, and
      we only limit the number of current requests, not the distance
      from the earliest pending request to the latest.
      
      mddev->curr_resync_completed is a reliable indicator of the earliest
      position at which resync could be happening.   It is updated less
      frequently, but is actually reliable which is more important.
      
      So use it to determine if a write request is before the region
      being resynced and so safe from conflict.
      
      This error can allow resync IO to interfere with normal IO which
      could lead to data corruption. Hence: stable.
      
      Fixes: 79ef3a8a
      cc: stable@vger.kernel.org (v3.13+)
      Signed-off-by: NeilBrown <neilb@suse.de>
      23554960
    • md/raid1: make sure resync waits for conflicting writes to complete. · 2f73d3c5
      NeilBrown authored
      The resync/recovery process for raid1 was recently changed
      so that writes could happen in parallel with resync providing
      they were in different regions of the device.
      
      There is a problem though:  While a write request will always
      wait for conflicting resync to complete, a resync request
      will *not* always wait for conflicting writes to complete.
      
      Two changes are needed to fix this:
      
      1/ raise_barrier (which waits until it is safe to do resync)
         must wait until current_window_requests is zero
      2/ wait_barrier (which waits at the start of a new write request)
         must update current_window_requests if the request could
         possibly conflict with a concurrent resync.
      
      As concurrent writes and resync can lead to data loss,
      this patch is suitable for -stable.
      
      Fixes: 79ef3a8a
      Cc: stable@vger.kernel.org (v3.13+)
      Cc: majianpeng <majianpeng@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      2f73d3c5
    • md/raid1: clean up request counts properly in close_sync() · 669cc7ba
      NeilBrown authored
      If there are outstanding writes when close_sync is called,
      the change to ->start_next_window might cause them to
      decrement the wrong counter when they complete.  Fix this
      by merging the two counters into the one that will be decremented.
      
      Having an incorrect value in a counter can cause raise_barrier()
      to hang, so this is suitable for -stable.
      
      Fixes: 79ef3a8a
      cc: stable@vger.kernel.org (v3.13+)
      Signed-off-by: NeilBrown <neilb@suse.de>
      669cc7ba
    • md/raid1: be more cautious where we read-balance during resync. · c6d119cf
      NeilBrown authored
      commit 79ef3a8a made
      it possible for reads to happen concurrently with resync.
      This means that we need to be more careful where read_balancing
      is allowed during resync - we can no longer be sure that any
      resync that has already started will definitely finish.
      
      So keep read_balancing to before recovery_cp, which is conservative
      but safe.
      
      This bug makes it possible to read from a device that doesn't
      have up-to-date data, so it can cause data corruption.
      So it is suitable for any kernel since 3.11.
      
      Fixes: 79ef3a8a
      cc: stable@vger.kernel.org (v3.13+)
      Signed-off-by: NeilBrown <neilb@suse.de>
      c6d119cf
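The rule in the commit above (keep read balancing to before recovery_cp) can be sketched roughly as follows. This is a minimal illustration, not the kernel's code: the helper name `may_read_balance` and the `sector_t` typedef are hypothetical stand-ins, and the sketch assumes a read is described by a start sector and a length in sectors.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in for the kernel type (assumption) */

/*
 * Hypothetical helper: allow read balancing only when the whole read
 * [start, start + len) lies before recovery_cp, i.e. inside the region
 * that is known to be in sync on every mirror.
 */
static bool may_read_balance(sector_t start, sector_t len, sector_t recovery_cp)
{
    return start + len <= recovery_cp;
}
```

A read that ends exactly at recovery_cp is still allowed; one that crosses it is not, which is the conservative side of the trade-off the commit message describes.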
    • md/raid1: intialise start_next_window for READ case to avoid hang · f0cc9a05
      NeilBrown authored
      r1_bio->start_next_window is not initialised in the READ
      case, so allow_barrier may incorrectly decrement
         conf->current_window_requests
      which can cause raise_barrier() to block forever.
      
      Fixes: 79ef3a8a
      cc: stable@vger.kernel.org (v3.13+)
      Reported-by: Brassow Jonathan <jbrassow@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      f0cc9a05
  2. 08 Sep, 2014 4 commits
    • Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · d030671f
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "This pull request includes Alban's patch to disallow '\n' in cgroup
        names.
      
        Two other patches from Li to fix a possible oops when cgroup
        destruction races against other file operations and one from Vivek to
        fix a unified hierarchy devel behavior"
      
      * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: check cgroup liveliness before unbreaking kernfs
        cgroup: delay the clearing of cgrp->kn->priv
        cgroup: Display legacy cgroup files on default hierarchy
        cgroup: reject cgroup names with '\n'
      d030671f
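As a rough illustration of the first fix in this pull (disallowing '\n' in cgroup names), a name check might look like the sketch below. The function name `check_cgroup_name` is hypothetical and the code is not taken from the kernel; it only shows the idea of rejecting a name that contains a newline.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical validator: returns 0 if the name is acceptable and
 * -EINVAL if it contains a newline character.
 */
static int check_cgroup_name(const char *name)
{
    if (strchr(name, '\n'))
        return -EINVAL;
    return 0;
}
```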
    • Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · 6a5c75ce
      Linus Torvalds authored
      Pull percpu fixes from Tejun Heo:
       "One patch to fix a failure path in the alloc path.  The bug is
        dangerous but probably not too likely to actually trigger in the wild
        given that there hasn't been any report yet.
      
        The other two are low impact fixes"
      
      * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
        percpu: free percpu allocation info for uniprocessor system
        percpu: perform tlb flush after pcpu_map_pages() failure
        percpu: fix pcpu_alloc_pages() failure path
      6a5c75ce
    • Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · cfa7c641
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Two patches are to add PCI IDs for ICH9 and all others are device
        specific fixes.  Nothing too interesting"
      
      * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ahci_xgene: Fix the link down in first attempt for the APM X-Gene SoC AHCI SATA host controller driver.
        ahci_xgene: Skip the PHY and clock initialization if already configured by the firmware.
        ahci: add pcid for Marvel 0x9182 controller
        ata: Disabling the async PM for JMicron chip 363/361
        ata_piix: Add Device IDs for Intel 9 Series PCH
        ahci: Add Device IDs for Intel 9 Series PCH
        ata: ahci_tegra: Read calibration fuse
      cfa7c641
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b531f5dd
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix skb leak in mac802154, from Martin Townsend
      
       2) Use select not depends on NF_NAT for NFT_NAT, from Pablo Neira
          Ayuso
      
       3) Fix union initializer bogosity in vxlan, from Gerhard Stenzel
      
       4) Fix RX checksum configuration in stmmac driver, from Giuseppe
          CAVALLARO
      
       5) Fix TSO with non-accelerated VLANs in e1000, e1000e, bna, ehea,
          i40e, i40evf, mvneta, and qlge, from Vlad Yasevich
      
       6) Fix capability checks in phy_init_eee(), from Giuseppe CAVALLARO
      
       7) Try high order allocations more sanely for SKBs, specifically if a
          high order allocation fails, fall back directly to zero order pages
          rather than iterating down one order at a time.  From Eric Dumazet
      
       8) Fix a memory leak in openvswitch, from Li RongQing
      
       9) amd-xgbe initializes wrong spinlock, from Thomas Lendacky
      
      10) RTNL locking was busted in setsockopt for anycast and multicast, fix
          from Sabrina Dubroca
      
      11) Fix peer address refcount leak in ipv6, from Nicolas Dichtel
      
      12) DocBook typo fixes, from Masanari Iida
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits)
        ipv6: restore the behavior of ipv6_sock_ac_drop()
        amd-xgbe: Enable interrupts for all management counters
        amd-xgbe: Treat certain counter registers as 64 bit
        greth: moved TX ring cleaning to NAPI rx poll func
        cnic : Cleanup CONFIG_IPV6 & VLAN check
        net: treewide: Fix typo found in DocBook/networking.xml
        bnx2x: Fix link problems for 1G SFP RJ45 module
        3c59x: avoid panic in boomerang_start_xmit when finding page address:
        netfilter: add explicit Kconfig for NETFILTER_XT_NAT
        ipv6: use addrconf_get_prefix_route() to remove peer addr
        ipv6: fix a refcnt leak with peer addr
        net-timestamp: only report sw timestamp if reporting bit is set
        drivers/net/fddi/skfp/h/skfbi.h: Remove useless PCI_BASE_2ND macros
        l2tp: fix race while getting PMTU on PPP pseudo-wire
        ipv6: fix rtnl locking in setsockopt for anycast and multicast
        VMXNET3: Check for map error in vmxnet3_set_mc
        openvswitch: distinguish between the dropped and consumed skb
        amd-xgbe: Fix initialization of the wrong spin lock
        openvswitch: fix a memory leak
        netfilter: fix missing dependencies in NETFILTER_XT_TARGET_LOG
        ...
      b531f5dd
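Item 7 in the list above describes an allocation strategy: if a high order allocation fails, fall back directly to zero order pages rather than stepping down one order at a time. A minimal sketch of that strategy follows. All names here (`alloc_pages_fn`, `alloc_with_fallback`, `only_order0`) are hypothetical, and the stub allocator exists only to make the example self-contained; none of this is the kernel's SKB code.

```c
#include <stddef.h>

/* Hypothetical allocator callback: returns NULL if 2^order pages are unavailable. */
typedef void *(*alloc_pages_fn)(unsigned int order);

/*
 * Try the preferred order once; on failure go straight to order 0
 * instead of retrying with order - 1, order - 2, and so on.
 * *got_order reports which order was finally requested.
 */
static void *alloc_with_fallback(alloc_pages_fn alloc, unsigned int preferred,
                                 unsigned int *got_order)
{
    if (preferred > 0) {
        void *p = alloc(preferred);

        if (p) {
            *got_order = preferred;
            return p;
        }
    }
    *got_order = 0;
    return alloc(0);
}

/* Example stub for illustration: only single pages are available. */
static char one_page[4096];
static unsigned int attempts;

static void *only_order0(unsigned int order)
{
    attempts++;
    return order == 0 ? one_page : NULL;
}
```

With the stub above, a request for order 3 costs exactly two allocation attempts (order 3, then order 0) instead of four.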
  3. 07 Sep, 2014 13 commits
  4. 06 Sep, 2014 17 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 2b12164b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "A smattering of bug fixes across most architectures"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        powerpc/kvm/cma: Fix panic introduces by signed shift operation
        KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
        KVM: s390/mm: Fix storage key corruption during swapping
        arm/arm64: KVM: Complete WFI/WFE instructions
        ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU
        KVM: s390/mm: try a cow on read only pages for key ops
        KVM: s390: Fix user triggerable bug in dead code
      2b12164b
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 56c22854
      Linus Torvalds authored
      Pull ARM SoC fixes from Kevin Hilman:
       "Another round of fixes from arm-soc land, which are mostly DT fixes
        for:
      
         - OMAP: handful of DT fixes for devices on newly supported hardware
         - davinci: fix 2nd EDMA channel
         - ux500: extend previous pinctrl fix to another board
         - at91: clock registration fixes, compatibility string precision
      
        And one more fix for event cleanup in drivers/bus/arm-ccn"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        bus: arm-ccn: Move event cleanup routine
        ARM: at91/dt: rm9200: fix usb clock definition
        ARM: at91: rm9200: fix clock registration
        ARM: at91/dt: sam9g20: set at91sam9g20 pllb driver
        ARM: dts: dra7-evm: Add vtt regulator support
        ARM: dts: dra7-evm: Fix spi1 mux documentation
        ARM: dts: am43x-epos-evm: Disable QSPI to prevent conflict with GPMC-NAND
        ARM: OMAP2+: gpmc: Don't complain if wait pin is used without r/w monitoring
        ARM: dts: am43xx-epos-evm: Don't use read/write wait monitoring
        ARM: dts: am437x-gp-evm: Don't use read/write wait monitoring
        ARM: dts: am437x-gp-evm: Use BCH16 ECC scheme instead of BCH8
        ARM: dts: am43x-epos-evm: Use BCH16 ECC scheme instead of BCH8
        ARM: dts: am4372: fix USB regs size
        ARM: dts: am437x-gp: switch i2c0 to 100KHz
        ARM: dts: dra7-evm: Fix 8th NAND partition's name
        ARM: dts: dra7-evm: Fix i2c3 pinmux and frequency
        ARM: ux500: disable msp2 node on Snowball
        ARM: edma: Fix configuration parsing for SoCs with multiple eDMA3 CC
        ARM: dts: set 'ti,set-rate-parent' for dpll4_m5x2 clock
      56c22854
    • Merge tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 11e97398
      Linus Torvalds authored
      Pull xfs fixes from Dave Chinner:
       "The fixes all address recently discovered data corruption issues.
      
        The original Direct IO issue was discovered by Chris Mason @ Facebook
        on a production workload which mixed buffered reads with direct IO
        reads and writes to the same file.  The fix for that exposed other issues
        with page invalidation (exposed by millions of fsx operations) failing
        due to dirty buffers beyond EOF.
      
        Finally, the collapse_range code could also cause problems due to
        racing writeback changing the extent map while it was being shifted
        around.  The commits for that problem are simple mitigation fixes that
        prevent the problem from occurring.  A more robust fix for 3.18 that
        addresses the underlying problem is currently being worked on by
        Brian.
      
        Summary of fixes:
         - a direct IO read/buffered read data corruption
         - the associated fallout from the DIO data corruption fix
         - collapse range bugs that are potential data corruption issues"
      
      * tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
        xfs: trim eofblocks before collapse range
        xfs: xfs_file_collapse_range is delalloc challenged
        xfs: don't log inode unless extent shift makes extent modifications
        xfs: use ranged writeback and invalidation for direct IO
        xfs: don't zero partial page cache pages during O_DIRECT writes
        xfs: don't zero partial page cache pages during O_DIRECT writes
        xfs: don't dirty buffers beyond EOF
      11e97398
    • Merge tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd · 925e0ea4
      Linus Torvalds authored
      Pull mtd fixes from Brian Norris:
       "Two trivial MTD updates for 3.17-rc4:
      
         - a tiny comment tweak, to kill a bunch of DocBook warnings added
           during the merge window
      
         - a small fixup to the OTP routines' error handling"
      
      * tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd:
        mtd: nand: fix DocBook warnings on nand_sdr_timings doc
        mtd: cfi_cmdset_0002: check return code for get_chip()
      925e0ea4
    • timekeeping: Update timekeeper before updating vsyscall and pvclock · 9bf2419f
      Thomas Gleixner authored
      The update_walltime() code works on the shadow timekeeper to make the
      seqcount protected region as short as possible. But that update to the
      shadow timekeeper does not update all timekeeper fields because it's
      sufficient to do that once before it becomes live. One of these fields
      is tkr.base_mono. That stays stale in the shadow timekeeper unless an
      operation happens which copies the real timekeeper to the shadow.
      
      The update function is called after the update calls to vsyscall and
      pvclock. While not correct, it did not cause any problems because none
      of the invoked update functions used base_mono.
      
      commit cbcf2dd3 (x86: kvm: Make kvm_get_time_and_clockread()
      nanoseconds based) changed that in the kvm pvclock update function, so
      the stale base_mono value got used and caused kvm-clock to malfunction.
      
      Put the update where it belongs and fix the issue.
      Reported-by: Chris J Arges <chris.j.arges@canonical.com>
      Reported-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      9bf2419f
    • compat: nanosleep: Clarify error handling · 849151dd
      Thomas Gleixner authored
      The error handling in compat_sys_nanosleep() is correct, but
      completely non-obvious. Document it and restrict it to the
      -ERESTART_RESTARTBLOCK return value for clarity.
      Reported-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      849151dd
    • Merge branch 'amd-xgbe-net' · bc55dc63
      David S. Miller authored
      Tom Lendacky says:
      
      ====================
      amd-xgbe: AMD XGBE driver fixes 2014-09-05
      
      The following series of patches includes fixes to the driver.
      
      - Proper access to 64 bit management counter registers
      - Enable all management counter registers to generate an interrupt when
        the counter threshold is reached
      
      This patch series is based on net.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc55dc63
    • amd-xgbe: Enable interrupts for all management counters · a3ba7c98
      Lendacky, Thomas authored
      As the management counters reach a threshold they will generate an
      interrupt so the value can be saved and the counter reset. The
      current code does not enable this interrupt on all counters. This
      can result in inaccurate statistics.
      
      Update the code to enable all the counters to generate an interrupt
      when their thresholds are exceeded.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a3ba7c98
    • amd-xgbe: Treat certain counter registers as 64 bit · 60265108
      Lendacky, Thomas authored
      Even if the management counters are configured to be 32 bit register
      values, the [rt]xoctetcount_gb and [rt]xoctetcount_g counters are
      always 64 bit counter registers.  Since they are not being treated as
      64 bit values, these statistics are being reported incorrectly (ifconfig,
      ethtool, etc.).
      
      Update the routines used to read the registers to access the "hi"
      register (an offset of 4 from the "lo" register) to create a 64 bit
      value for these 64 bit counters.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60265108
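The commit above reads the "hi" register at an offset of 4 from the "lo" register and combines the two into a 64 bit value. A minimal sketch of that combination is shown below. The register file is modelled as a plain array and the helper names `read_reg32` and `read_counter64` are hypothetical; the real driver uses its own MMIO accessors.

```c
#include <stdint.h>

/*
 * Hypothetical register file: regs[] models 32-bit registers,
 * indexed by byte offset / 4.
 */
static uint32_t read_reg32(const uint32_t *regs, unsigned int offset)
{
    return regs[offset / 4];
}

/* Combine "lo" at lo_offset and "hi" at lo_offset + 4 into one 64-bit value. */
static uint64_t read_counter64(const uint32_t *regs, unsigned int lo_offset)
{
    uint64_t lo = read_reg32(regs, lo_offset);
    uint64_t hi = read_reg32(regs, lo_offset + 4);

    return (hi << 32) | lo;
}
```

Widening both halves to `uint64_t` before shifting matters: shifting a 32-bit value left by 32 would be undefined behaviour in C.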
    • greth: moved TX ring cleaning to NAPI rx poll func · e1743a16
      Daniel Hellstrom authored
      This patch does not affect the 10/100 GRETH MAC.
      
      Previously, all GBit GRETH TX descriptor ring cleaning was done in
      start_xmit(); when the descriptor list became full, it activated
      the TX interrupt to start the NAPI rx poll function to do TX ring
      cleaning.
      
      With this patch the TX descriptor ring is always cleaned from
      the NAPI rx poll function, triggered via TX or RX interrupt.
      Otherwise TX frames could be sent without being reported to the
      stack as sent. On the 10/100 GRETH this is not an issue since the
      SKB is copied and aligned into private buffers so that the SKB can
      be freed directly in start_xmit().
      Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1743a16
    • cnic : Cleanup CONFIG_IPV6 & VLAN check · c99d667e
      Anish Bhatt authored
      If ipv6 support is compiled as a module, the cnic module cannot be
      built-in, because it depends on ipv6.  Make this check cleaner via
      Kconfig.
      
      Use the simpler IS_ENABLED for the CONFIG_VLAN_8021Q check.
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Acked-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c99d667e
    • ahci_xgene: Fix the link down in first attempt for the APM X-Gene SoC AHCI... · 0babe614
      Suman Tripathi authored
      ahci_xgene: Fix the link down in first attempt for the APM X-Gene SoC AHCI SATA host controller driver.
      
      Due to HW errata the APM X-Gene AHCI SATA host controller reports link
      down even if the device presence is detected. This issue is due to speed
      negotiation failure. This patch implements the algorithm to retry the
      COMRESET if PxSTAT register reports device presence detected but
      PHY communication not established. The maximum retry attempts are 3.
      
      This patch also fixes the code to match the algorithm for printing
      a warning message if the disparity error still exists after link up.
      Signed-off-by: Loc Ho <lho@apm.com>
      Signed-off-by: Suman Tripathi <stripathi@apm.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      0babe614
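The retry algorithm described in the commit above can be sketched as a bounded loop: reissue COMRESET only while the device is detected but PHY communication is not established. Everything named here (`struct link_status`, `comreset_fn`, `bring_up_link`, `flaky_comreset`) is hypothetical. The sketch also assumes that "maximum retry attempts are 3" means one initial attempt plus up to three retries, which the commit message does not spell out.

```c
#include <stdbool.h>

#define MAX_LINK_RETRIES 3 /* maximum from the commit message; exact counting is an assumption */

struct link_status {
    bool device_detected; /* device presence detected */
    bool phy_established; /* PHY communication established */
};

/* Hypothetical callback that issues one COMRESET and reports the result. */
typedef struct link_status (*comreset_fn)(void *ctx);

/*
 * Returns true when the link comes up. Retries only in the errata case:
 * presence detected but PHY communication not established.
 */
static bool bring_up_link(comreset_fn do_comreset, void *ctx)
{
    int retries = 0;

    for (;;) {
        struct link_status st = do_comreset(ctx);

        if (st.phy_established)
            return true;  /* link is up */
        if (!st.device_detected)
            return false; /* nothing attached, retrying cannot help */
        if (retries++ >= MAX_LINK_RETRIES)
            return false; /* retries exhausted */
    }
}

/* Example stub: the PHY comes up on the third COMRESET; ctx counts calls. */
static struct link_status flaky_comreset(void *ctx)
{
    int *calls = ctx;
    struct link_status st = { true, false };

    *calls += 1;
    st.phy_established = (*calls >= 3);
    return st;
}
```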
    • ahci_xgene: Skip the PHY and clock initialization if already configured by the firmware. · 0bed13be
      Suman Tripathi authored
      This patch implements the feature to skip the PHY and clock
      initialization if it is already configured by the firmware.
      Signed-off-by: Loc Ho <lho@apm.com>
      Signed-off-by: Suman Tripathi <stripathi@apm.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      0bed13be
    • net: treewide: Fix typo found in DocBook/networking.xml · e793c0f7
      Masanari Iida authored
      This patch fixes spelling typos found in DocBook/networking.xml.
      Because networking.xml is generated from comments in the source,
      the typos have to be fixed in the comments within the source.
      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e793c0f7
    • bnx2x: Fix link problems for 1G SFP RJ45 module · 6e9e5644
      Yaniv Rosner authored
      When a 1G SFP RJ45 module is detected, the driver must reset the Tx laser
      in order to prevent link issues. As part of this change, link_attr_sync
      was relocated from vars to params.
      Signed-off-by: Yaniv Rosner <Yaniv.Rosner@qlogic.com>
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e9e5644
    • 3c59x: avoid panic in boomerang_start_xmit when finding page address: · 98ea232c
      Neil Horman authored
      This bug was reported on a very old kernel (RHEL6, 2.6.32-491.el6):
      
      BUG: unable to handle kernel paging request at 00800000
      IP: [<c04107b5>] nommu_map_page+0x15/0x110
      *pdpt = 000000003454f001 *pde = 000000003f03d067
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/system/cpu/online
      Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs p4_clockmod
      ipv6 ppdev parport_pc parport microcode iTCO_wdt iTCO_vendor_support 3c59x mii
      dcdbas serio_raw snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device
      snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 sg lpc_ich mfd_core ext4
      jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix
      radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash
      dm_log dm_mod [last unloaded: mperf]
      
      Pid: 4219, comm: nfsd Not tainted 2.6.32-491.el6.i686 #1 Dell Computer
      Corporation OptiPlex GX240               /OptiPlex GX240
      EIP: 0060:[<c04107b5>] EFLAGS: 00010246 CPU: 0
      EIP is at nommu_map_page+0x15/0x110
      EAX: 00000000 EBX: c0a83480 ECX: 00000000 EDX: 00800000
      ESI: 00000000 EDI: f70e7860 EBP: e2d09b54 ESP: e2d09b24
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process nfsd (pid: 4219, ti=e2d08000 task=e2ceaaa0 task.ti=e2d08000)
      Stack:
       00000056 00000000 0000000e c65efd38 00000020 00000296 00000206 00000206
      <0> c050c850 c0a83480 e2cef154 00000001 e2d09ba8 f8fcd585 00000510 00000001
      <0> 00000000 00000000 f5172200 f8fdac00 0039ef8c f5277020 f70e7860 00000510
      Call Trace:
       [<c050c850>] ? page_address+0xd0/0xe0
       [<f8fcd585>] ? boomerang_start_xmit+0x3b5/0x520 [3c59x]
       [<c07b2975>] ? dev_hard_start_xmit+0xe5/0x400
       [<f9182b00>] ? ip6_output_finish+0x0/0xf0 [ipv6]
       [<c07ca053>] ? sch_direct_xmit+0x113/0x180
       [<c07d5588>] ? nf_hook_slow+0x68/0x120
       [<c07b2ea5>] ? dev_queue_xmit+0x1b5/0x290
       [<f9182b6d>] ? ip6_output_finish+0x6d/0xf0 [ipv6]
       [<f9184cb8>] ? ip6_xmit+0x3e8/0x490 [ipv6]
       [<f91ab9f9>] ? inet6_csk_xmit+0x289/0x2f0 [ipv6]
       [<c07f6451>] ? tcp_transmit_skb+0x431/0x7f0
       [<c07a403f>] ? __alloc_skb+0x4f/0x140
       [<c07f85a2>] ? tcp_write_xmit+0x1c2/0xa50
       [<c07f90b1>] ? __tcp_push_pending_frames+0x31/0xe0
       [<c07ea47a>] ? tcp_sendpage+0x44a/0x4b0
       [<c07ea030>] ? tcp_sendpage+0x0/0x4b0
       [<c079be1e>] ? kernel_sendpage+0x4e/0x90
       [<f8457bb9>] ? svc_send_common+0xc9/0x120 [sunrpc]
       [<f8457c85>] ? svc_sendto+0x75/0x1f0 [sunrpc]
       [<c060d0d9>] ? _atomic_dec_and_lock+0x59/0x90
       [<f87d55d0>] ? nfs3svc_encode_readres+0x0/0xc0 [nfsd]
       [<f845876d>] ? svc_authorise+0x2d/0x40 [sunrpc]
       [<f87d4410>] ? nfs3svc_release_fhandle+0x0/0x10 [nfsd]
       [<f8455721>] ? svc_process_common+0xf1/0x5a0 [sunrpc]
       [<f8457e86>] ? svc_tcp_sendto+0x36/0xa0 [sunrpc]
       [<f8461778>] ? svc_send+0x98/0xd0 [sunrpc]
       [<f87c698c>] ? nfsd+0xac/0x140 [nfsd]
       [<c04470e0>] ? complete+0x40/0x60
       [<f87c68e0>] ? nfsd+0x0/0x140 [nfsd]
       [<c04802ac>] ? kthread+0x7c/0xa0
       [<c0480230>] ? kthread+0x0/0xa0
       [<c0409f9f>] ? kernel_thread_helper+0x7/0x10
      Code: 8d b6 00 00 00 00 eb f8 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 55 89 e5
      83 ec 30 89 75 f8 31 f6 89 7d fc 89 c7 89 c8 89 5d f4 <8b> 1a 8b 4d 08 c1 eb 19
      c1 e3 04 8b 9b c0 29 c7 c0 83 e3 fc 29
      
      But the problem seems to still exist upstream.  It seems on 32 bit kernels
      page_address() can return a NULL value in some circumstances, and the
      pci_map_single api isn't prepared to handle that (on this system it results in a
      bogus pointer dereference in nommu_map_page).
      
      The fix is pretty easy: if we convert the 3c59x driver to use the more
      convenient skb_frag_dma_map api we don't need to find the virtual address of the
      page at all, and the page gets mapped to the hardware properly.  Verified to fix the
      problem as described by the reporter.
      
      Applies to the net tree
      
      Change Notes:
      
      v2) Converted PCI_DMA_TODEVICE to DMA_TO_DEVICE.  Thanks Dave!
      
      v3) Actually Run git commit after making changes to v2 :)
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: klassert@mathematik.tu-chemnitz.de
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98ea232c
    • netfilter: add explicit Kconfig for NETFILTER_XT_NAT · 84a59ca5
      Pablo Neira Ayuso authored
      Paul Bolle reports that 'select NETFILTER_XT_NAT' from the IPV4 and IPV6
      NAT tables becomes a no-op since there is no Kconfig switch for it. Add the
      Kconfig switch to resolve this problem.
      
      Fixes: 8993cf8e netfilter: move NAT Kconfig switches out of the iptables scope
      Reported-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      84a59ca5