1. 16 Apr, 2014 5 commits
    • cifs: Wait for writebacks to complete before attempting write. · c11f1df5
      Sachin Prabhu authored
      Problem reported in Red Hat bz 1040329 for strict writes where we cache
      only when we hold oplock and write direct to the server when we don't.
      
      When we receive an oplock break, we first change the oplock value for
      the inode in cifsInodeInfo->oplock to indicate that we no longer hold
      the oplock before we enqueue a task to flush changes to the backing
      device. Once we have completed flushing the changes, we return the
      oplock to the server.
      
      There are two ways in which data corruption can occur here:
      1) While we flush changes to the backing device as part of the oplock
      break, we can have processes write to the file. These writes check for
      the oplock, find none and attempt to write directly to the server.
      These direct writes made while we are flushing from cache could be
      overwritten by data being flushed from the cache causing data
      corruption.
      2) While a thread runs in cifs_strict_writev, the machine could receive
      and process an oplock break after the thread has checked the oplock and
      found that it allows us to cache and before we have made changes to the
      cache. In that case, we end up with a dirty page in cache when we
      shouldn't have any. This will be flushed later and will overwrite all
      subsequent writes to the part of the file represented by this page.
      
      Before making any writes to the server, we need to confirm that we are
      not in the process of flushing data to the server and if we are, we
      should wait until the process is complete before we attempt the write.
      We should also wait for existing writes to complete before we process
      an oplock break request which changes oplock values.
      
      We add a version-specific downgrade_oplock() operation to allow for
      differences in the oplock values set for the different smb versions.
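
      A minimal userspace sketch of the synchronization described above,
      assuming POSIX threads. The names inode_sync, writer_enter(),
      writer_exit(), oplock_break_begin() and oplock_break_end() are
      hypothetical and are not the identifiers used by the patch. The sketch
      only illustrates that writers wait while a break is pending, and that
      the break waits for in-flight writers:

```c
#include <pthread.h>
#include <stdbool.h>

struct inode_sync {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	int  writers;        /* writes currently in flight */
	bool break_pending;  /* an oplock break is being processed */
};

void inode_sync_init(struct inode_sync *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cond, NULL);
	s->writers = 0;
	s->break_pending = false;
}

/* Called before a write: block while an oplock break is in progress. */
void writer_enter(struct inode_sync *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->break_pending)
		pthread_cond_wait(&s->cond, &s->lock);
	s->writers++;
	pthread_mutex_unlock(&s->lock);
}

/* Called after a write: wake a break handler waiting for writers to drain. */
void writer_exit(struct inode_sync *s)
{
	pthread_mutex_lock(&s->lock);
	if (--s->writers == 0)
		pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

/* Oplock break: stop new writers, then wait for existing ones to finish.
 * The caller downgrades the oplock and flushes the cache afterwards. */
void oplock_break_begin(struct inode_sync *s)
{
	pthread_mutex_lock(&s->lock);
	s->break_pending = true;
	while (s->writers > 0)
		pthread_cond_wait(&s->cond, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

/* Flush finished: let blocked writers proceed. */
void oplock_break_end(struct inode_sync *s)
{
	pthread_mutex_lock(&s->lock);
	s->break_pending = false;
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
}
```

      In this model a writer brackets each write with writer_enter() and
      writer_exit(), and the break handler changes the oplock value and
      flushes between oplock_break_begin() and oplock_break_end().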
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
      Signed-off-by: Steve French <smfrench@gmail.com>
      c11f1df5
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 0f689a33
      Linus Torvalds authored
      Pull s390 patches from Martin Schwidefsky:
       "An update to the oops output with additional information about the
        crash.  The renameat2 system call is enabled.  Two patches in regard
        to the PTR_ERR_OR_ZERO cleanup.  And a bunch of bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/sclp_cmd: replace PTR_RET with PTR_ERR_OR_ZERO
        s390/sclp: replace PTR_RET with PTR_ERR_OR_ZERO
        s390/sclp_vt220: Fix kernel panic due to early terminal input
        s390/compat: fix typo
        s390/uaccess: fix possible register corruption in strnlen_user_srst()
        s390: add 31 bit warning message
        s390: wire up sys_renameat2
        s390: show_registers() should not map user space addresses to kernel symbols
        s390/mm: print control registers and page table walk on crash
        s390/smp: fix smp_stop_cpu() for !CONFIG_SMP
        s390: fix control register update
      0f689a33
    • Merge tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · 7d38cc02
      Linus Torvalds authored
      Pull itanium erratum fix from Tony Luck:
       "Small workaround for a rare, but annoying, erratum #237"
      
      * tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
        [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)
      7d38cc02
    • [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237) · c0b5a64d
      Tony Luck authored
      April 2014 Itanium processor specification update:
      
      http://www.intel.com/content/www/us/en/processors/itanium/itanium-specification-update.html
      
      describes this erratum:
      
      =========================================================================
      237. Under a complex set of conditions, store to load forwarding for a
      sub 8-byte load may complete incorrectly
      
      Problem: A load instruction may complete incorrectly when a code sequence
      using 4-byte or smaller load and store operations to the same address
      is executed in combination with specific timing of all the following
      concurrent conditions: store to load forwarding, alignment checking
      enabled, a mis-predicted branch, and complex cache utilization activity.
      
      Implication: The affected sub 8-byte instruction may complete
      incorrectly resulting in unpredictable system behavior. There is an
      extremely low probability of exposure due to the significant number of
      complex microarchitectural concurrent conditions required to encounter
      the erratum.
      
      Workaround: Set PSR.ac = 0 to completely avoid the erratum. Disabling
      Hyper-Threading will significantly reduce exposure to the conditions
      that contribute to encountering the erratum.
      
      Status: See the Summary Table of Changes for the affected steppings.
      =========================================================================
      
      [Table of changes essentially lists all models from McKinley to Tukwila]
      
      The PSR.ac bit controls whether the processor will always generate
      an unaligned reference trap (0x5a00) for a misaligned data access
      (when PSR.ac=1) or if it will let the access succeed when running
      on a cpu that implements logic to handle some unaligned accesses.
      
      Way back in 2008 in commit b704882e
        [IA64] Rationalize kernel mode alignment checking
      we made the decision to always enable strict checking. We were
      already doing so in trap/interrupt context because the common
      preamble code set this bit - but the rest of supervisor code
      (and by inheritance user code) ran with PSR.ac=0.
      
      We now reverse that decision and set PSR.ac=0 everywhere in the
      kernel (also inherited by user processes). This will avoid the
      erratum using the method described in the Itanium specification
      update.  Net effect for users is that the processor will handle
      unaligned access when it can (typically with a tiny performance
      bubble in the pipeline ... but much less invasive than taking a
      trap and having the OS perform the access).
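
      Independently of the PSR.ac setting, portable code can avoid depending
      on how the hardware treats misaligned accesses. A minimal sketch (the
      helper name load_u32_unaligned() is hypothetical and not part of this
      patch) that reads a 32-bit value from an arbitrary byte offset through
      memcpy() instead of dereferencing a misaligned pointer:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from a possibly misaligned address.
 * memcpy() makes no alignment assumption about its source pointer. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof v);
	return v;
}
```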
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      c0b5a64d
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 10ec34fc
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix BPF filter validation of netlink attribute accesses, from
          Mathias Krause.
      
       2) Netfilter conntrack generation seqcount not initialized properly,
          from Andrey Vagin.
      
       3) Fix comparison mask computation on big-endian in nft_cmp_fast(),
          from Patrick McHardy.
      
       4) Properly limit MTU over ipv6, from Eric Dumazet.
      
       5) Fix seccomp system call argument population on 32-bit, from Daniel
          Borkmann.
      
       6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead
          skb->mac_len needs to be used.  From Vlad Yasevich.
      
       7) We have several cases of using socket based communications to
          implement a tunnel.  For example, some tunnels are encapsulations
          over UDP so we use an internal kernel UDP socket to do the
          transmits.
      
          These tunnels should behave just like other software devices and
          pass the packets on down to the next layer.
      
          Most importantly we want the top-level socket (eg TCP) that created
          the traffic to be charged for the SKB memory.
      
          However, once you get into the IP output path, we have code that
          assumed that whatever was attached to skb->sk is an IP socket.
      
          To keep the top-level socket being charged for the SKB memory,
          whilst satisfying the needs of the IP output path, we now pass in an
          explicit 'sk' argument.
      
          From Eric Dumazet.
      
       8) ping_init_sock() leaks group info, from Xiaoming Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        cxgb4: use the correct max size for firmware flash
        qlcnic: Fix MSI-X initialization code
        ip6_gre: don't allow to remove the fb_tunnel_dev
        ipv4: add a sock pointer to dst->output() path.
        ipv4: add a sock pointer to ip_queue_xmit()
        driver/net: cosa driver uses udelay incorrectly
        at86rf230: fix __at86rf230_read_subreg function
        at86rf230: remove check if AVDD settled
        net: cadence: Add architecture dependencies
        net: Start with correct mac_len in skb_network_protocol
        Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
        cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
        net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
        seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
        qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
        qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
        qlcnic: Fix PVID configuration on eSwitch port.
        qlcnic: Fix max ring count calculation
        qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
        qlcnic: Fix panic due to uninitialized delayed_work struct in use.
        ...
      10ec34fc
  2. 15 Apr, 2014 9 commits
  3. 14 Apr, 2014 26 commits
    • Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · 55101e2d
      Linus Torvalds authored
      Pull KVM fixes from Marcelo Tosatti:
       - Fix for guest triggerable BUG_ON (CVE-2014-0155)
       - CR4.SMAP support
       - Spurious WARN_ON() fix
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: remove WARN_ON from get_kernel_ns()
        KVM: Rename variable smep to cr4_smep
        KVM: expose SMAP feature to guest
        KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
        KVM: Add SMAP support when setting CR4
        KVM: Remove SMAP bit from CR4_RESERVED_BITS
        KVM: ioapic: try to recover if pending_eoi goes out of range
        KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)
      55101e2d
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · dafe344d
      Linus Torvalds authored
      Pull bcm2835 crypto fix from Herbert Xu:
       "This fixes a potential boot crash on bcm2835 due to the recent change
        that now causes hardware RNGs to be accessed on registration"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        hwrng: bcm2835 - fix oops when rng h/w is accessed during registration
      dafe344d
    • user namespace: fix incorrect memory barriers · e79323bd
      Mikulas Patocka authored
      smp_read_barrier_depends() can be used if there is data dependency between
      the readers - i.e. if the read operation after the barrier uses address
      that was obtained from the read operation before the barrier.
      
      In this file, there is only control dependency, no data dependency, so the
      use of smp_read_barrier_depends() is incorrect. The code could fail in the
      following way:
      * the cpu predicts that idx < extents is true and starts executing the
        body of the for loop
      * the cpu fetches map->extent[0].first and map->extent[0].count
      * the cpu fetches map->nr_extents
      * the cpu verifies that idx < extents is true, so it commits the
        instructions in the body of the for loop
      
      The problem is that in this scenario, the cpu read map->extent[0].first
      and map->nr_extents in the wrong order. We need a full read memory barrier
      to prevent it.
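
      The required ordering can be illustrated in userspace with C11 atomics.
      This is only an analogue, not the kernel code: the struct and function
      names below are hypothetical, a single writer is assumed, and the
      acquire load stands in for the read barrier between reading the extent
      count and reading the extents themselves:

```c
#include <stdatomic.h>
#include <stdint.h>

#define MAX_EXTENTS 5

struct extent {
	uint32_t first, lower_first, count;
};

struct id_map {
	atomic_uint   nr_extents;
	struct extent extent[MAX_EXTENTS];
};

/* Writer: fill in the slot first, then publish the new count (release). */
int map_add(struct id_map *m, struct extent e)
{
	unsigned n = atomic_load_explicit(&m->nr_extents, memory_order_relaxed);

	if (n >= MAX_EXTENTS)
		return -1;
	m->extent[n] = e;
	atomic_store_explicit(&m->nr_extents, n + 1, memory_order_release);
	return 0;
}

/* Reader: the acquire load keeps the extent reads from being satisfied
 * before the count is read, which a control dependency alone does not. */
uint32_t map_id_down(struct id_map *m, uint32_t id)
{
	unsigned n = atomic_load_explicit(&m->nr_extents, memory_order_acquire);

	for (unsigned idx = 0; idx < n; idx++) {
		uint32_t first = m->extent[idx].first;
		uint32_t last = first + m->extent[idx].count - 1;

		if (id >= first && id <= last)
			return id - first + m->extent[idx].lower_first;
	}
	return (uint32_t)-1;	/* no mapping found */
}
```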
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e79323bd
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 00cbc3dc
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains three Netfilter fixes for your net tree,
      they are:
      
      * Fix missing generation sequence initialization which results in a splat
        if lockdep is enabled, it was introduced in the recent works to improve
        nf_conntrack scalability, from Andrey Vagin.
      
      * Don't flush the GRE keymap list in nf_conntrack when the pptp helper is
        disabled otherwise this crashes due to a double release, from Andrey
        Vagin.
      
      * Fix nf_tables cmp fast in big endian, from Patrick McHardy.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00cbc3dc
    • net: Start with correct mac_len in skb_network_protocol · 1e785f48
      Vlad Yasevich authored
      Sometimes, when the packet arrives at skb_mac_gso_segment()
      its skb->mac_len already accounts for some of the mac length
      headers in the packet.  This seems to happen when forwarding
      through an OpenSSL tunnel.
      
      When we start looking for any vlan headers in skb_network_protocol()
      we seem to ignore any of the already known mac headers and start
      with an ETH_HLEN.  This results in an incorrect offset, dropped
      TSO frames and general slowness of the connection.
      
      We can start counting from the known skb->mac_len
      and return at least that much if all mac level headers
      are known and accounted for.
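
      A simplified userspace sketch of the idea, not the kernel function: the
      helper network_header_offset() and its parameters are hypothetical. It
      walks VLAN tags starting from the already known mac_len rather than
      from a fixed Ethernet header length, and returns at least mac_len when
      no further tags are found:

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8
#define VLAN_HLEN    4

/* Returns the offset of the network header, or -1 if the frame is truncated.
 * proto is the protocol found so far, mac_len the bytes already accounted for. */
int network_header_offset(const uint8_t *frame, size_t len, uint16_t proto,
			  size_t mac_len, uint16_t *inner_proto)
{
	size_t depth = mac_len;

	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		if (depth + VLAN_HLEN > len)
			return -1;
		/* VLAN header: 2 bytes TCI, then 2 bytes encapsulated protocol */
		proto = (uint16_t)(frame[depth + 2] << 8 | frame[depth + 3]);
		depth += VLAN_HLEN;
	}
	*inner_proto = proto;
	return (int)depth;
}
```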
      
      Fixes: 53d6471c (net: Account for all vlan headers in skb_mac_gso_segment)
      CC: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Martin Filip <nexus+kernel@smoula.net>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e785f48
    • Marcelo Tosatti
      b351c39c
    • KVM: Rename variable smep to cr4_smep · 66386ade
      Feng Wu authored
      Rename variable smep to cr4_smep, which can better reflect the
      meaning of the variable.
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      66386ade
    • KVM: expose SMAP feature to guest · de935ae1
      Feng Wu authored
      This patch exposes SMAP feature to guest
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      de935ae1
    • KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode · e1e746b3
      Feng Wu authored
      SMAP is disabled if CPU is in non-paging mode in hardware.
      However KVM always uses paging mode to emulate guest non-paging
      mode with TDP. To emulate this behavior, SMAP needs to be
      manually disabled when guest switches to non-paging mode.
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      e1e746b3
    • KVM: Add SMAP support when setting CR4 · 97ec8c06
      Feng Wu authored
      This patch adds SMAP handling logic when setting CR4 for guests
      
      Thanks a lot to Paolo Bonzini for his suggestion to use the branchless
      way to detect SMAP violation.
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      97ec8c06
    • KVM: Remove SMAP bit from CR4_RESERVED_BITS · 56d6efc2
      Feng Wu authored
      This patch removes SMAP bit from CR4_RESERVED_BITS.
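
      A hedged sketch of what removing a bit from a reserved mask means for a
      guest CR4 write. CR4.SMAP is architecturally bit 21; the two masks and
      the helper cr4_write_ok() below are illustrative assumptions and do not
      reproduce KVM's actual CR4_RESERVED_BITS definition:

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMAP (UINT64_C(1) << 21)	/* architectural bit position */

/* Illustrative masks only: every bit above SMAP is treated as reserved,
 * and SMAP itself is reserved in the first mask but not in the second. */
#define RESERVED_WITH_SMAP    (~((CR4_SMAP << 1) - 1) | CR4_SMAP)
#define RESERVED_WITHOUT_SMAP (~((CR4_SMAP << 1) - 1))

/* Returns true if the guest's CR4 write should be accepted. */
bool cr4_write_ok(uint64_t cr4, uint64_t reserved, bool guest_has_smap)
{
	if (cr4 & reserved)
		return false;	/* a reserved bit is set: reject the write */
	if ((cr4 & CR4_SMAP) && !guest_has_smap)
		return false;	/* SMAP is not exposed to this guest */
	return true;
}
```

      With the first mask a guest can never set CR4.SMAP; with the second it
      can, provided the feature is exposed to it.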
      Signed-off-by: Feng Wu <feng.wu@intel.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      56d6efc2
    • Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" · 362d5204
      Daniel Borkmann authored
      This reverts commit ef2820a7 ("net: sctp: Fix a_rwnd/rwnd management
      to reflect real state of the receiver's buffer") as it introduced a
      serious performance regression on SCTP over IPv4 and IPv6, though not as
      dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs.
      
      Current state:
      
      [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60
      iperf version 3.0.1 (10 January 2014)
      Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
      Time: Fri, 11 Apr 2014 17:56:21 GMT
      Connecting to host 192.168.241.3, port 5201
            Cookie: Lab200slot2.1397238981.812898.548918
      [  4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201
      Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
      [ ID] Interval           Transfer     Bandwidth
      [  4]   0.00-1.09   sec  20.8 MBytes   161 Mbits/sec
      [  4]   1.09-2.13   sec  10.8 MBytes  86.8 Mbits/sec
      [  4]   2.13-3.15   sec  3.57 MBytes  29.5 Mbits/sec
      [  4]   3.15-4.16   sec  4.33 MBytes  35.7 Mbits/sec
      [  4]   4.16-6.21   sec  10.4 MBytes  42.7 Mbits/sec
      [  4]   6.21-6.21   sec  0.00 Bytes    0.00 bits/sec
      [  4]   6.21-7.35   sec  34.6 MBytes   253 Mbits/sec
      [  4]   7.35-11.45  sec  22.0 MBytes  45.0 Mbits/sec
      [  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
      [  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
      [  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
      [  4]  11.45-12.51  sec  16.0 MBytes   126 Mbits/sec
      [  4]  12.51-13.59  sec  20.3 MBytes   158 Mbits/sec
      [  4]  13.59-14.65  sec  13.4 MBytes   107 Mbits/sec
      [  4]  14.65-16.79  sec  33.3 MBytes   130 Mbits/sec
      [  4]  16.79-16.79  sec  0.00 Bytes    0.00 bits/sec
      [  4]  16.79-17.82  sec  5.94 MBytes  48.7 Mbits/sec
      (etc)
      
      [root@Lab200slot2 ~]#  iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60
      iperf version 3.0.1 (10 January 2014)
      Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
      Time: Fri, 11 Apr 2014 19:08:41 GMT
      Connecting to host 2001:db8:0:f101::1, port 5201
            Cookie: Lab200slot2.1397243321.714295.2b3f7c
      [  4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201
      Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test
      [ ID] Interval           Transfer     Bandwidth
      [  4]   0.00-1.00   sec   169 MBytes  1.42 Gbits/sec
      [  4]   1.00-2.00   sec   201 MBytes  1.69 Gbits/sec
      [  4]   2.00-3.00   sec   188 MBytes  1.58 Gbits/sec
      [  4]   3.00-4.00   sec   174 MBytes  1.46 Gbits/sec
      [  4]   4.00-5.00   sec   165 MBytes  1.39 Gbits/sec
      [  4]   5.00-6.00   sec   199 MBytes  1.67 Gbits/sec
      [  4]   6.00-7.00   sec   163 MBytes  1.36 Gbits/sec
      [  4]   7.00-8.00   sec   174 MBytes  1.46 Gbits/sec
      [  4]   8.00-9.00   sec   193 MBytes  1.62 Gbits/sec
      [  4]   9.00-10.00  sec   196 MBytes  1.65 Gbits/sec
      [  4]  10.00-11.00  sec   157 MBytes  1.31 Gbits/sec
      [  4]  11.00-12.00  sec   175 MBytes  1.47 Gbits/sec
      [  4]  12.00-13.00  sec   192 MBytes  1.61 Gbits/sec
      [  4]  13.00-14.00  sec   199 MBytes  1.67 Gbits/sec
      (etc)
      
      After patch:
      
      [root@Lab200slot2 ~]#  iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60
      iperf version 3.0.1 (10 January 2014)
      Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64
      Time: Mon, 14 Apr 2014 16:40:48 GMT
      Connecting to host 192.168.240.3, port 5201
            Cookie: Lab200slot2.1397493648.413274.65e131
      [  4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201
      Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
      [ ID] Interval           Transfer     Bandwidth
      [  4]   0.00-1.00   sec   240 MBytes  2.02 Gbits/sec
      [  4]   1.00-2.00   sec   239 MBytes  2.01 Gbits/sec
      [  4]   2.00-3.00   sec   240 MBytes  2.01 Gbits/sec
      [  4]   3.00-4.00   sec   239 MBytes  2.00 Gbits/sec
      [  4]   4.00-5.00   sec   245 MBytes  2.05 Gbits/sec
      [  4]   5.00-6.00   sec   240 MBytes  2.01 Gbits/sec
      [  4]   6.00-7.00   sec   240 MBytes  2.02 Gbits/sec
      [  4]   7.00-8.00   sec   239 MBytes  2.01 Gbits/sec
      
      With the revert applied, SCTP performance is back to normal on latest
      upstream for both IPv4 and IPv6 and has the same throughput as the 3.4.2
      test kernel; throughput is steady and interval reports are smooth again.
      
      Fixes: ef2820a7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer")
      Reported-by: Peter Butler <pbutler@sonusnet.com>
      Reported-by: Dongsheng Song <dongsheng.song@gmail.com>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Tested-by: Peter Butler <pbutler@sonusnet.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
      Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      362d5204
    • cxgb4: Save the correct mac addr for hw-loopback connections in the L2T · bfae2324
      Steve Wise authored
      Hardware needs the local device mac address to support hw loopback for
      rdma loopback connections.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfae2324
    • net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W · 8c482cdc
      Daniel Borkmann authored
      While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has
      been wrongly decoded by commit a8fc9277 ("sk-filter: Add ability to
      get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS
      although it should have been decoded as BPF_LD|BPF_W|BPF_ABS.
      
      In practice, this should not have much side-effect though, as such
      conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse
      operation PR_GET_SECCOMP will only return the current seccomp mode, but
      not the filter itself. Since the transition to the new BPF infrastructure,
      it's also not used anymore, so we can simply remove this as it's
      unreachable.
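
      The difference between the two opcodes can be made concrete with the
      classic BPF encoding constants. The constants below carry the values
      used by the classic BPF encoding; the helper bpf_load_width() is a
      hypothetical illustration and not part of the patch:

```c
#include <stdint.h>

/* Classic BPF instruction class, size and mode fields. */
#define BPF_LD   0x00
#define BPF_W    0x00	/* 32-bit word */
#define BPF_H    0x08	/* 16-bit half word */
#define BPF_B    0x10	/* 8-bit byte */
#define BPF_ABS  0x20
#define BPF_SIZE(code) ((code) & 0x18)

/* Number of bytes a load opcode reads, or 0 for an unknown size field. */
unsigned bpf_load_width(uint16_t code)
{
	switch (BPF_SIZE(code)) {
	case BPF_W: return 4;
	case BPF_H: return 2;
	case BPF_B: return 1;
	default:    return 0;
	}
}
```

      Decoding as BPF_LD|BPF_B|BPF_ABS therefore describes a one-byte load,
      where BPF_LD|BPF_W|BPF_ABS describes the intended four-byte load.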
      
      Fixes: a8fc9277 ("sk-filter: Add ability to get socket filter program (v2)")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c482cdc
    • seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF · 2eac7648
      Daniel Borkmann authored
      Linus reports that on 32-bit x86 Chromium throws the following seccomp
      resp. audit log messages:
      
        audit: type=1326 audit(1397359304.356:28108): auid=500 uid=500
      gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
      pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0
      syscall=172 compat=0 ip=0xb2dd9852 code=0x30000
      
        audit: type=1326 audit(1397359304.356:28109): auid=500 uid=500
      gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
      pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5
      compat=0 ip=0xb2dd9852 code=0x50000
      
      These audit messages are being triggered via audit_seccomp() through
      __secure_computing() in seccomp mode (BPF) filter with seccomp return
      codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO)
      during filter runtime. Moreover, Linus reports that x86_64 Chromium
      seems fine.
      
      The underlying issue that explains this is that the implementation of
      populate_seccomp_data() is wrong. Our seccomp data structure sd that
      is being shared with user ABI is:
      
        struct seccomp_data {
          int nr;
          __u32 arch;
          __u64 instruction_pointer;
          __u64 args[6];
        };
      
      Therefore, a simple cast to 'unsigned long *' for storing the value of
      the syscall argument via syscall_get_arguments() is just wrong as on
      32-bit x86 (or any other 32bit arch), it would result in storing a0-a5
      at wrong offsets in args[] member, and thus i) could leak stack memory
      to user space and ii) tampers with the logic of seccomp BPF programs
      that read out and check for syscall arguments:
      
        syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);
      
      Tested on 32-bit x86 with Google Chrome, unfortunately only via remote
      test machine through slow ssh X forwarding, but it fixes the issue on
      my side. So fix it up by storing args in type-correct variables; gcc
      is clever and optimizes the copy away in other cases, e.g. x86_64.
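
      A userspace sketch of the type-correct copy, under stated assumptions:
      the struct mirrors the layout quoted above with fixed-width types, and
      get_arguments_sketch() is a hypothetical stand-in for
      syscall_get_arguments(). The arguments are first read into an array of
      unsigned long and then widened element by element, so each one lands
      in its own 64-bit slot regardless of the width of unsigned long:

```c
#include <stdint.h>

struct seccomp_data_sketch {
	int      nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Hypothetical stand-in for syscall_get_arguments(): copies six registers. */
static void get_arguments_sketch(const unsigned long *regs, unsigned long *out)
{
	for (int i = 0; i < 6; i++)
		out[i] = regs[i];
}

void populate_args(struct seccomp_data_sketch *sd, const unsigned long *regs)
{
	unsigned long args[6];

	get_arguments_sketch(regs, args);
	/* Widen per element instead of casting &sd->args[0] to unsigned long *. */
	for (int i = 0; i < 6; i++)
		sd->args[i] = args[i];
}
```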
      
      Fixes: bd4cf0ed ("net: filter: rework/optimize internal BPF interpreter's instruction set")
      Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2eac7648
    • Merge branch 'qlcnic' · 14ed4a5b
      David S. Miller authored
      Shahed Shaikh says:
      
      ====================
      qlcnic: Bug fixes
      
      This patch series contains following bug fixes -
      
      * Send INIT_NIC_FUNC mailbox command as first mailbox
      * Fix a panic because of uninitialized delayed_work.
      * Fix inconsistent calculation of max rings count.
      * Fix PVID configuration issue. Driver needs to clear older
        PVID before adding new one.
      * Fix QLogic application/driver interface by packing vNIC information
        array.
      * Fix a crash when user tries to disable SR-IOV while VFs are
        still assigned to VMs.
      
      Please apply to net.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      14ed4a5b
    • qlcnic: Do not disable SR-IOV when VFs are assigned to VMs · 696f1943
      Manish Chopra authored
      o Disabling SR-IOV while VFs are assigned to VMs crashes the host,
        so return -EPERM when the user requests to disable SR-IOV through
        pci sysfs while VFs are still assigned to VMs.
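
      The guard amounts to an early return. A minimal sketch with a
      hypothetical device model (sriov_sketch and sriov_disable() are
      illustrative names, not the driver's code):

```c
#include <errno.h>

struct sriov_sketch {
	int num_vfs;		/* VFs currently enabled */
	int vfs_assigned;	/* VFs currently handed to VMs */
};

/* Refuse to tear down SR-IOV while any VF is still in use by a VM. */
int sriov_disable(struct sriov_sketch *dev)
{
	if (dev->vfs_assigned > 0)
		return -EPERM;
	dev->num_vfs = 0;
	return 0;
}
```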
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      696f1943
    • qlcnic: Fix QLogic application/driver interface for virtual NIC configuration · 4f030227
      Jitendra Kalsaria authored
      o The application expects the vNIC number as the array index, but the
      driver interface returns the configuration in array index form.

      o Pack the vNIC information array in the buffer such that the application
      can access it using the vNIC number as the array index.
      Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f030227
    • qlcnic: Fix PVID configuration on eSwitch port. · a78b6da8
      Jitendra Kalsaria authored
      Clear older PVID before adding a newer PVID to the eSwitch port
      Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a78b6da8
    • qlcnic: Fix max ring count calculation · 7b546842
      Shahed Shaikh authored
      Do not read max rings count from qlcnic_get_nic_info(). Use driver defined
      values for 82xx adapters. In case of 83xx adapters, use minimum of firmware
      provided and driver defined values.
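
      The rule above can be sketched as a small helper. The enum, the
      DRIVER_MAX_RINGS value and the function name are hypothetical and only
      illustrate the selection logic:

```c
#include <stdint.h>

enum adapter_family { ADAPTER_82XX, ADAPTER_83XX };

#define DRIVER_MAX_RINGS 8	/* hypothetical driver-defined limit */

/* 82xx: always the driver-defined value.
 * 83xx: the smaller of the firmware-provided and driver-defined values. */
uint8_t max_rings(enum adapter_family family, uint8_t fw_max)
{
	if (family == ADAPTER_82XX)
		return DRIVER_MAX_RINGS;
	return fw_max < DRIVER_MAX_RINGS ? fw_max : DRIVER_MAX_RINGS;
}
```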
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7b546842
    • qlcnic: Fix to send INIT_NIC_FUNC as first mailbox. · 4d52e1e8
      Sucheta Chakraborty authored
      o INIT_NIC_FUNC should be the first mailbox command sent. Send the DCB
        capability and parameter query commands after it.
      Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4d52e1e8
    • qlcnic: Fix panic due to uninitialized delayed_work struct in use. · 463518a0
      Sucheta Chakraborty authored
      o An AEN event was being received before the delayed_work struct and
        its handlers were initialized, which resulted in a crash. This patch
        fixes it.
      Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      463518a0
    • Merge branch 'be2net' · 677df2f4
      David S. Miller authored
      Sathya Perla says:
      
      ====================
      be2net: patch set
      
      Patch 1/2 is a v2 of a patch that was submitted earlier (as a part of a
      different patch-set). v2 incorporates a suggestion given by David Laight
      for how long to poll for pending TX completions while disabling a device.
      
      Patch 2/2 fixes a crash in be_remove()->be_close()
      path after be2net has aborted an EEH error recovery
      due to a permanent failure.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      677df2f4
    • be2net: Fix invocation of be_close() after be_clear() · e1ad8e33
      Kalesh AP authored
      In the EEH error recovery path, when a permanent failure occurs,
      we clean up adapter structure (i.e. destroy queues etc) by calling
      be_clear() and return PCI_ERS_RESULT_DISCONNECT.
      After this the stack tries to remove device from bus and calls
      be_remove() which invokes netdev_unregister()->be_close().
      be_close() operating on destroyed queues results in a
      NULL dereference.
      
      This patch fixes this problem by introducing a flag to keep track
      of the setup state.
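
      A minimal sketch of the flag-based guard, assuming a hypothetical
      adapter model (adapter_sketch, FLAG_SETUP_DONE and the three functions
      are illustrative names, not the driver's identifiers). close becomes a
      no-op once clear has destroyed the queues:

```c
#include <stdbool.h>
#include <stddef.h>

#define FLAG_SETUP_DONE 0x1u	/* hypothetical flag name */

struct adapter_sketch {
	unsigned flags;
	int *tx_queue;	/* NULL once the queues are destroyed */
	int  storage;	/* backing store standing in for a real queue */
};

void adapter_setup(struct adapter_sketch *a)
{
	a->storage = 0;
	a->tx_queue = &a->storage;
	a->flags |= FLAG_SETUP_DONE;
}

void adapter_clear(struct adapter_sketch *a)
{
	a->tx_queue = NULL;
	a->flags &= ~FLAG_SETUP_DONE;
}

/* Returns true if the queues were touched, false if setup was not done. */
bool adapter_close(struct adapter_sketch *a)
{
	if (!(a->flags & FLAG_SETUP_DONE))
		return false;	/* queues already destroyed: nothing to do */
	*a->tx_queue = 0;	/* would be a NULL dereference without the check */
	return true;
}
```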
      Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1ad8e33
    • be2net: Fix to reap TX compls till HW doesn't respond for some time · 1a3d0717
      Vasundhara Volam authored
      be_close() currently waits for a max of 200ms to receive all pending
      TX compls. This timeout value was roughly calculated based on 10G
      transmission speeds and the TX queue depth. This timeout may not be
      enough when the link is operating at lower speeds or in multi-channel/SR-IOV
      configs with TX-rate limiting setting.
      
      It is hard to calculate a "proper timeout value" that works in all
      configurations.  This patch solves this problem by continuing to reap
      TX completions till the HW is completely silent for a period of 10ms or
      a HW error is detected.
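
      The scheme can be sketched as a polling loop with a silence counter.
      This is a simulation under stated assumptions: hw_sketch and
      reap_once() are a hypothetical device model, each loop iteration
      stands for roughly 1 ms, and real code would sleep or delay between
      polls:

```c
#include <stdbool.h>

#define SILENCE_MS 10	/* stop after this many consecutive idle polls */

struct hw_sketch {
	int  pending;	/* completions the hardware has yet to deliver */
	bool error;	/* set when a hardware error is detected */
};

/* Hypothetical: reap up to budget completions and return how many. */
static int reap_once(struct hw_sketch *hw, int budget)
{
	int n = hw->pending < budget ? hw->pending : budget;

	hw->pending -= n;
	return n;
}

/* Reap until the HW is silent for SILENCE_MS polls or reports an error.
 * Returns the simulated elapsed time in milliseconds. */
int drain_tx(struct hw_sketch *hw)
{
	int silent_ms = 0, elapsed_ms = 0;

	while (!hw->error && silent_ms < SILENCE_MS) {
		if (reap_once(hw, 4) > 0)
			silent_ms = 0;	/* activity seen: restart the window */
		else
			silent_ms++;
		elapsed_ms++;
	}
	return elapsed_ms;
}
```

      Unlike a fixed 200ms deadline, the loop's total duration adapts to how
      long the hardware keeps producing completions.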
      
      v2: implements the new scheme (as suggested by David Laight) instead of
      just waiting longer than 200ms for reaping all completions.
      Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a3d0717
    • net/mlx4_core: Defer VF initialization till PF is fully initialized · e1a5ddc5
      Amir Vadai authored
      The fix in commit [1] is not sufficient, since a deferred VF
      initialization could happen after pci_enable_sriov() has finished but
      before the PF is fully initialized.
      We need to prevent VFs from initializing until the PF is fully ready and
      the comm channel is operational.
      
      [1] - 97989356 "net/mlx4_core: mlx4_init_slave() shouldn't access comm
            channel before PF is ready"
      
      CC: Stuart Hayes <Stuart_Hayes@Dell.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1a5ddc5