Commits · 1be9a950c646c9092fb3618197f7b6bfb50e82aa · Kirill Smelkov / linux

23 Jul, 2014 1 commit

net: sctp: inherit auth_capable on INIT collisions · 1be9a950

Daniel Borkmann authored Jul 22, 2014

Jason reported an oops caused by SCTP on his ARM machine with
SCTP authentication enabled:

Internal error: Oops: 17 [#1] ARM
CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
task: c6eefa40 ti: c6f52000 task.ti: c6f52000
PC is at sctp_auth_calculate_hmac+0xc4/0x10c
LR is at sg_init_table+0x20/0x38
pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
sp : c6f538e8  ip : 00000000  fp : c6f53924
r10: c6f50d80  r9 : 00000000  r8 : 00010000
r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 06f28000  DAC: 00000015
Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
Stack: (0xc6f538e8 to 0xc6f54000)
[...]
Backtrace:
[<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
[<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
[<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
[<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
[<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
[<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
[<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
[<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)

While we already had various kind of bugs in that area
ec0223ec ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
we/peer is AUTH capable") and b14878cc ("net: sctp: cache
auth_enable per endpoint"), this one is a bit of a different
kind.

Giving a bit more background on why SCTP authentication is
needed can be found in RFC4895:

  SCTP uses 32-bit verification tags to protect itself against
  blind attackers. These values are not changed during the
  lifetime of an SCTP association.

  Looking at new SCTP extensions, there is the need to have a
  method of proving that an SCTP chunk(s) was really sent by
  the original peer that started the association and not by a
  malicious attacker.

To cause this bug, we're triggering an INIT collision between
peers; normal SCTP handshake where both sides intent to
authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
parameters that are being negotiated among peers:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895 says that each endpoint therefore knows its own random
number and the peer's random number *after* the association
has been established. The local and peer's random number along
with the shared key are then part of the secret used for
calculating the HMAC in the AUTH chunk.

Now, in our scenario, we have 2 threads with 1 non-blocking
SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
sctp_bindx(3), listen(2) and connect(2) against each other,
thus the handshake looks similar to this, e.g.:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
  -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
  ...

Since such collisions can also happen with verification tags,
the RFC4895 for AUTH rather vaguely says under section 6.1:

  In case of INIT collision, the rules governing the handling
  of this Random Number follow the same pattern as those for
  the Verification Tag, as explained in Section 5.2.4 of
  RFC 2960 [5]. Therefore, each endpoint knows its own Random
  Number and the peer's Random Number after the association
  has been established.

In RFC2960, section 5.2.4, we're eventually hitting Action B:

  B) In this case, both sides may be attempting to start an
     association at about the same time but the peer endpoint
     started its INIT after responding to the local endpoint's
     INIT. Thus it may have picked a new Verification Tag not
     being aware of the previous Tag it had sent this endpoint.
     The endpoint should stay in or enter the ESTABLISHED
     state but it MUST update its peer's Verification Tag from
     the State Cookie, stop any init or cookie timers that may
     running and send a COOKIE ACK.

In other words, the handling of the Random parameter is the
same as behavior for the Verification Tag as described in
Action B of section 5.2.4.

Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
side effect interpreter, and in fact it properly copies over
peer_{random, hmacs, chunks} parameters from the newly created
association to update the existing one.

Also, the old asoc_shared_key is being released and based on
the new params, sctp_auth_asoc_init_active_key() updated.
However, the issue observed in this case is that the previous
asoc->peer.auth_capable was 0, and has *not* been updated, so
that instead of creating a new secret, we're doing an early
return from the function sctp_auth_asoc_init_active_key()
leaving asoc->asoc_shared_key as NULL. However, we now have to
authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).

That in fact causes the server side when responding with ...

  <------------------ AUTH; COOKIE-ACK -----------------

... to trigger a NULL pointer dereference, since in
sctp_packet_transmit(), it discovers that an AUTH chunk is
being queued for xmit, and thus it calls sctp_auth_calculate_hmac().

Since the asoc->active_key_id is still inherited from the
endpoint, and the same as encoded into the chunk, it uses
asoc->asoc_shared_key, which is still NULL, as an asoc_key
and dereferences it in ...

  crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)

... causing an oops. All this happens because sctp_make_cookie_ack()
called with the *new* association has the peer.auth_capable=1
and therefore marks the chunk with auth=1 after checking
sctp_auth_send_cid(), but it is *actually* sent later on over
the then *updated* association's transport that didn't initialize
its shared key due to peer.auth_capable=0. Since control chunks
in that case are not sent by the temporary association which
are scheduled for deletion, they are issued for xmit via
SCTP_CMD_REPLY in the interpreter with the context of the
*updated* association. peer.auth_capable was 0 in the updated
association (which went from COOKIE_WAIT into ESTABLISHED state),
since all previous processing that performed sctp_process_init()
was being done on temporary associations, that we eventually
throw away each time.

The correct fix is to update to the new peer.auth_capable
value as well in the collision case via sctp_assoc_update(),
so that in case the collision migrated from 0 -> 1,
sctp_auth_asoc_init_active_key() can properly recalculate
the secret. This therefore fixes the observed server panic.

Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1be9a950

22 Jul, 2014 7 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 15ba2236

Linus Torvalds authored Jul 21, 2014

Pull networking fixes from David Miller:

 1) Null termination fix in dns_resolver got the pointer dereferncing
    wrong, fix from Ben Hutchings.

 2) ip_options_compile() has a benign but real buffer overflow when
    parsing options.  From Eric Dumazet.

 3) Table updates can crash in netfilter's nftables if none of the state
    flags indicate an actual change, from Pablo Neira Ayuso.

 4) Fix race in nf_tables dumping, also from Pablo.

 5) GRE-GRO support broke the forwarding path because the segmentation
    state was not fully initialized in these paths, from Jerry Chu.

 6) sunvnet driver leaks objects and potentially crashes on module
    unload, from Sowmini Varadhan.

 7) We can accidently generate the same handle for several u32
    classifier filters, fix from Cong Wang.

 8) Several edge case bug fixes in fragment handling in xen-netback,
    from Zoltan Kiss.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
  ipv4: fix buffer overflow in ip_options_compile()
  batman-adv: fix TT VLAN inconsistency on VLAN re-add
  batman-adv: drop QinQ claim frames in bridge loop avoidance
  dns_resolver: Null-terminate the right string
  xen-netback: Fix pointer incrementation to avoid incorrect logging
  xen-netback: Fix releasing header slot on error path
  xen-netback: Fix releasing frag_list skbs in error path
  xen-netback: Fix handling frag_list on grant op error path
  net_sched: avoid generating same handle for u32 filters
  net: huawei_cdc_ncm: add "subclass 3" devices
  net: qmi_wwan: add two Sierra Wireless/Netgear devices
  wan/x25_asy: integer overflow in x25_asy_change_mtu()
  net: ppp: fix creating PPP pass and active filters
  net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
  sunvnet: clean up objects created in vnet_new() on vnet_exit()
  r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
  net-gre-gro: Fix a bug that breaks the forwarding path
  netfilter: nf_tables: 64bit stats need some extra synchronization
  netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
  netfilter: nf_tables: safe RCU iteration on list when dumping
  ...

15ba2236

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 89faa06e

Linus Torvalds authored Jul 21, 2014

Pull sparc fix from David Miller:
 "Need to hook up the new renameat2 system call"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Hook up renameat2 syscall.

89faa06e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · 14867719

Linus Torvalds authored Jul 21, 2014

Pull IDE fixes from David Miller:
 - fix interrupt registry for some Atari IDE chipsets.
 - adjust Kconfig dependencies for x86_32 specific chips.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
  ide: Fix SC1200 dependencies
  ide: Fix CS5520 and CS5530 dependencies
  m68k/atari - ide: do not register interrupt if host->get_lock is set

14867719

Merge tag 'trace-fixes-v3.16-rc6' of... · 8dcc3be2

Linus Torvalds authored Jul 21, 2014

Merge tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull trace fix from Steven Rostedt:
 "Tony Luck found that using the "uptime" trace clock that uses jiffies
  as a counter was converted to nanoseconds (silly), and after 1 hour 11
  minutes and 34 seconds, this monotonic clock would wrap, causing havoc
  with the tracing system and making the clock useless.

  He converted that clock to use jiffies_64 and made it into a counter
  instead of nanosecond conversions, and displayed the clock with the
  straight jiffy count, which works much better than it did in the past"

* tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix wraparound problems in "uptime" trace clock

8dcc3be2

sparc: Hook up renameat2 syscall. · 26053926
David S. Miller authored Jul 21, 2014
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
26053926

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge · 850717ef

David S. Miller authored Jul 21, 2014

Antonio Quartulli says:

====================
pull request [net]: batman-adv 20140721

here you have two fixes that we have been testing for quite some time
(this is why they arrived a bit late in the rc cycle).

Patch 1) ensures that BLA packets get dropped and not forwarded to the
mesh even if they reach batman-adv within QinQ frames. Forwarding them
into the mesh means messing up with the TT database of other nodes which
can generate all kind of unexpected behaviours during route computation.

Patch 2) avoids a couple of race conditions triggered upon fast VLAN
deletion-addition. Such race conditions are pretty dangerous because
they not only create inconsistencies in the TT database of the nodes
in the network, but such scenario is also unrecoverable (unless
nodes are rebooted).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

850717ef

ipv4: fix buffer overflow in ip_options_compile() · 10ec9472

Eric Dumazet authored Jul 21, 2014

There is a benign buffer overflow in ip_options_compile spotted by
AddressSanitizer[1] :

Its benign because we always can access one extra byte in skb->head
(because header is followed by struct skb_shared_info), and in this case
this byte is not even used.

[28504.910798] ==================================================================
[28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
[28504.913170] Read of size 1 by thread T15843:
[28504.914026]  [<ffffffff81802f91>] ip_options_compile+0x121/0x9c0
[28504.915394]  [<ffffffff81804a0d>] ip_options_get_from_user+0xad/0x120
[28504.916843]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
[28504.918175]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
[28504.919490]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
[28504.920835]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
[28504.922208]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
[28504.923459]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
[28504.924722]
[28504.925106] Allocated by thread T15843:
[28504.925815]  [<ffffffff81804995>] ip_options_get_from_user+0x35/0x120
[28504.926884]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
[28504.927975]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
[28504.929175]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
[28504.930400]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
[28504.931677]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
[28504.932851]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
[28504.934018]
[28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
[28504.934377]  of 40-byte region [ffff880026382800, ffff880026382828)
[28504.937144]
[28504.937474] Memory state around the buggy address:
[28504.938430]  ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
[28504.939884]  ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.941294]  ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
[28504.942504]  ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.943483]  ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
[28504.945573]                         ^
[28504.946277]  ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.094949]  ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.096114]  ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.097116]  ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.098472]  ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.099804] Legend:
[28505.100269]  f - 8 freed bytes
[28505.100884]  r - 8 redzone bytes
[28505.101649]  . - 8 allocated bytes
[28505.102406]  x=1..7 - x allocated bytes + (8-x) redzone bytes
[28505.103637] ==================================================================

[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernelSigned-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10ec9472

21 Jul, 2014 24 commits

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 67dd8f35

Linus Torvalds authored Jul 21, 2014

Pull media fixes from Mauro Carvalho Chehab:
 "A series of driver fixes:
   - fix DVB-S tuning with tda1071
   - fix tuner probe on af9035 when the device has a bad eeprom
   - some fixes for the new si2168/2157 drivers
   - one Kconfig build fix (for omap4iss)
   - fixes at vpif error path
   - don't lock saa7134 ioctl at driver's base core level, as it now
     uses V4L2 and VB2 locking schema
   - fix audio at hdpvr driver
   - fix the aspect ratio at the digital timings table
   - one new USB ID (at gspca_pac7302): Genius i-Look 317 webcam"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] gspca_pac7302: Add new usb-id for Genius i-Look 317
  [media] tda10071: fix returned symbol rate calculation
  [media] tda10071: fix spec inversion reporting
  [media] tda10071: add missing DVB-S2/PSK-8 FEC AUTO
  [media] tda10071: force modulation to QPSK on DVB-S
  [media] hdpvr: fix two audio bugs
  [media] davinci: vpif: missing unlocks on error
  [media] af9035: override tuner id when bad value set into eeprom
  [media] saa7134: use unlocked_ioctl instead of ioctl
  [media] media: v4l2-core: v4l2-dv-timings.c: Cleaning up code wrong value used in aspect ratio
  [media] si2168: firmware download fix
  [media] si2157: add one missing parenthesis
  [media] si2168: add one missing parenthesis
  [media] staging: tighten omap4iss dependencies

67dd8f35

Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 6890ad4b

Linus Torvalds authored Jul 21, 2014

Pull block fixes from Jens Axboe:
 "Final block fixes for 3.16

  Four small fixes that should go into 3.16, have been queued up for a
  bit and delayed due to vacation and other euro duties.  But here they
  are.  The pull request contains:

   - Fix for a reported crash with shared tagging on SCSI from Christoph

   - A regression fix for drbd.  From Lars Ellenberg.

   - Hooking up the compat ioctl for BLKZEROOUT, which requires no
     translation.  From Mikulas.

- A fix for a regression where we woud crash on queue exit if the
  root_blkg is gone/not there. From Tejun"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: provide compat ioctl for BLKZEROOUT
  blkcg: don't call into policy draining if root_blkg is already gone
  drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
  block: don't assume last put of shared tags is for the host

6890ad4b

Merge branch 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · d6e6c48e

Linus Torvalds authored Jul 21, 2014

Pull libata fixes from Tejun Heo:
 "Late libata fixes.

  The most important one is from Kevin Hao which makes sure that libata
  only allocates tags inside the max tag number the controller supports.
  libata always had this problem but the recent tag allocation change
  and addition of support for sata_fsl which only supports queue depth
  of 16 exposed the issue.

  Hans de Goede agreed to become the maintainer of libahci_platform
  which is under higher than usual development pressure from all the new
  controllers popping up from the ARM world"

* 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
  drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
  libata: EH should handle AMNF error condition as a media error
  libata: support the ata host which implements a queue depth less than 32
  MAINTAINERS: Add Hans de Goede as ahci-platform maintainer

d6e6c48e

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 5b2b9d77

Linus Torvalds authored Jul 21, 2014

Pull kvm fixes from Paolo Bonzini:
 "These are mostly PPC changes for 3.16-new things.  However, there is
  an x86 change too and it is a regression from 3.14.  As it only
  affects nested virtualization and there were other changes in this
  area in 3.16, I am not nominating it for 3.15-stable"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Check for nested events if there is an injectable interrupt
  KVM: PPC: RTAS: Do byte swaps explicitly
  KVM: PPC: Book3S PR: Fix ABIv2 on LE
  KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
  PPC: Add _GLOBAL_TOC for 32bit
  KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value
  KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute

5b2b9d77

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 80d6191e

Linus Torvalds authored Jul 21, 2014

Pull s390 fixes from Martin Schwidefsky:
 "A couple of last minute bug fixes for 3.16, including a fix for ptrace
  to close a hole which allowed a user space program to write to the
  kernel address space"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: fix restore of invalid floating-point-control
  s390/zcrypt: improve device probing for zcrypt adapter cards
  s390/ptrace: fix PSW mask check
  s390/MSI: Use standard mask and unmask funtions
  s390/3270: correct size detection with the read-partition command
  s390: require mvcos facility, not tod clock steering facility

80d6191e

tracing: Fix wraparound problems in "uptime" trace clock · 58d4e21e

Tony Luck authored Jul 18, 2014

The "uptime" trace clock added in:

    commit 8aacf017
    tracing: Add "uptime" trace clock that uses jiffies

has wraparound problems when the system has been up more
than 1 hour 11 minutes and 34 seconds. It converts jiffies
to nanoseconds using:
        (u64)jiffies_to_usecs(jiffy) * 1000ULL
but since jiffies_to_usecs() only returns a 32-bit value, it
truncates at 2^32 microseconds.  An additional problem on 32-bit
systems is that the argument is "unsigned long", so fixing the
return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
system).

Avoid these problems by using jiffies_64 as our basis, and
not converting to nanoseconds (we do convert to clock_t because
user facing API must not be dependent on internal kernel
HZ values).

Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com

Cc: stable@vger.kernel.org # 3.10+
Fixes: 8aacf017 "tracing: Add "uptime" trace clock that uses jiffies"
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

58d4e21e

batman-adv: fix TT VLAN inconsistency on VLAN re-add · 35df3b29

Antonio Quartulli authored May 08, 2014

When a VLAN interface (on top of batX) is removed and
re-added within a short timeframe TT does not have enough
time to properly cleanup. This creates an internal TT state
mismatch as the newly created softif_vlan will be
initialized from scratch with a TT client count of zero
(even if TT entries for this VLAN still exist). The
resulting TT messages are bogus due to the counter / tt
client listing mismatch, thus creating inconsistencies on
every node in the network

To fix this issue destroy_vlan() has to not free the VLAN
object immediately but it has to be kept alive until all the
TT entries for this VLAN have been removed. destroy_vlan()
still removes the sysfs folder so that the user has the
feeling that everything went fine.

If the same VLAN is re-added before the old object is free'd,
then the latter is resurrected and re-used.

Implement such behaviour by increasing the reference counter
of a softif_vlan object every time a new local TT entry for
such VLAN is created and remove the object from the list
only when all the TT entries have been destroyed.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

35df3b29

batman-adv: drop QinQ claim frames in bridge loop avoidance · d46b6bfa

Simon Wunderlich authored Jun 23, 2014

Since bridge loop avoidance only supports untagged or simple 802.1q
tagged VLAN claim frames, claim frames with stacked VLAN headers (QinQ)
should be detected and dropped. Transporting the over the mesh may cause
problems on the receivers, or create bogus entries in the local tt
tables.
Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

d46b6bfa

dns_resolver: Null-terminate the right string · 640d7efe

Ben Hutchings authored Jul 21, 2014

*_result[len] is parsed as *(_result[len]) which is not at all what we
want to touch here.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 84a7c0b1 ("dns_resolver: assure that dns_query() result is null-terminated")
Signed-off-by: David S. Miller <davem@davemloft.net>

640d7efe

Linux 3.16-rc6 · 9a3c4145
Linus Torvalds authored Jul 20, 2014

9a3c4145

Merge branch 'xen-netback' · 653bbf19

David S. Miller authored Jul 20, 2014

Zoltan Kiss says:

====================
xen-netback: Fixing up xenvif_tx_check_gop

This series fixes a lot of bugs on the error path around this function, which
were introduced with my grant mapping series in 3.15. They apply to the latest
net tree, but probably to net-next as well without any modification.
I'll post an another series which applies to 3.15 stable, as the problem was
first discovered there. The only difference is that the "queue" variable name is
replaced to "vif".
====================
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>

653bbf19

xen-netback: Fix pointer incrementation to avoid incorrect logging · d8cfbfc4

Zoltan Kiss authored Jul 18, 2014

Due to this pointer is increased prematurely, the error log contains rubbish.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>

d8cfbfc4

xen-netback: Fix releasing header slot on error path · 1b860da0

Zoltan Kiss authored Jul 18, 2014

This patch makes this function aware that the first frag and the header might
share the same ring slot. That could happen if the first slot is bigger than
PKT_PROT_LEN. Due to this the error path might release that slot twice or never,
depending on the error scenario.
xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>

1b860da0

xen-netback: Fix releasing frag_list skbs in error path · b42cc6e4

Zoltan Kiss authored Jul 18, 2014

When the grant operations failed, the skb is freed up eventually, and it tries
to release the frags, if there is any. For the main skb nr_frags is set to 0 to
avoid this, but on the frag_list it iterates through the frags array, and tries
to call put_page on the page pointer which contains garbage at that time.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>

b42cc6e4

xen-netback: Fix handling frag_list on grant op error path · 1a998d3e

Zoltan Kiss authored Jul 18, 2014

The error handling for skb's with frag_list was completely wrong, it caused
double unmap attempts to happen if the error was on the first skb. Move it to
the right place in the loop.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>

1a998d3e

net_sched: avoid generating same handle for u32 filters · 7801db8a

Cong Wang authored Jul 17, 2014

When kernel generates a handle for a u32 filter, it tries to start
from the max in the bucket. So when we have a filter with the max (fff)
handle, it will cause kernel always generates the same handle for new
filters. This can be shown by the following command:

	tc qdisc add dev eth0 ingress
	tc filter add dev eth0 parent ffff: protocol ip pref 770 handle 800::fff u32 match ip protocol 1 0xff
	tc filter add dev eth0 parent ffff: protocol ip pref 770 u32 match ip protocol 1 0xff
	...

we will get some u32 filters with same handle:

 # tc filter show dev eth0 parent ffff:
filter protocol ip pref 770 u32
filter protocol ip pref 770 u32 fh 800: ht divisor 1
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
  match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
  match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
  match 00010000/00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
  match 00010000/00ff0000 at 8

handles should be unique. This patch fixes it by looking up a bitmap,
so that can guarantee the handle is as unique as possible. For compatibility,
we still start from 0x800.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7801db8a

Merge tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · b7a68369

Linus Torvalds authored Jul 20, 2014

Pull more IIO driver fixes from Greg KH:
 "Here are two IIO driver fixes for 3.16-rc6 that resolve some reported
  issues"

* tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: mma8452: Use correct acceleration units.
  iio:core: Handle error when mask type is not separate

b7a68369

Merge tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · caa7c4e1

Linus Torvalds authored Jul 20, 2014

Pull USB fixes from Greg KH:
 "Here are two USB patches that resolve some reported issues, one with
  an odd HUB, and one in the chipidea driver"

* tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: Check if port status is equal to RxDetect
  usb: chipidea: udc: Disable auto ZLP generation on ep0

caa7c4e1

Merge tag 'driver-core-3.16-rc6' of... · f47d5bb0

Linus Torvalds authored Jul 20, 2014

Merge tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
 "Here is a single driver core fix that reverts an older patch that has
  been causing a number of reported problems with the platform devices.

  This revert has been in linux-next for a while with no reported issues"

* tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  platform_get_irq: Revert to platform_get_resource if of_irq_get fails

f47d5bb0

Merge tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · fa24615f

Linus Torvalds authored Jul 20, 2014

Pull char/misc fix from Greg KH:
 "Here's a single hyper-v driver fix for a reported issue"

* tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: hv_fcopy: fix a race condition for SMP guest

fa24615f

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 5556ea4d

Linus Torvalds authored Jul 20, 2014

Pull intel drm fixes from Dave Airlie:
 "Intel fixes came in late, but since I debugged one of them I'll send
  them on,

  Two reverts, a quirk and one warn regression"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/i915: reverse dp link param selection, prefer fast over wide again"
  drm/i915: Track the primary plane correctly when reassigning planes
  drm/i915: Ignore VBT backlight presence check on HP Chromebook 14
  Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"

5556ea4d

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · cfad81ce

Linus Torvalds authored Jul 20, 2014

Pull UML fixes from Richard Weinberger:
 "Four fixes, all discovered by Trinity"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: segv: Save regs only in case of a kernel mode fault
  um: Fix hung task in fix_range_common()
  um: Ensure that a stub page cannot get unmapped
  Revert "um: Fix wait_stub_done() error handling"

cfad81ce

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · da83fc6e

Linus Torvalds authored Jul 20, 2014

Pull btrfs fixes from Chris Mason:
 "We have two more fixes in my for-linus branch.

  I was hoping to also include a fix for a btrfs deadlock with
  compression enabled, but we're still nailing that one down"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: test for valid bdev before kobj removal in btrfs_rm_device
  Btrfs: fix abnormal long waiting in fsync

da83fc6e

Merge tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 90d51d56

Linus Torvalds authored Jul 20, 2014

Pull NFS client fixes from Trond Myklebust:
 "Apologies for the relative lateness of this pull request, however the
  commits fix some issues with the NFS read/write code updates in
  3.16-rc1 that can cause serious Oopsing when using small r/wsize.  The
  delay was mainly due to extra testing to make sure that the fixes
  behave correctly.

  Highlights include;
   - Stable fix for an NFSv3 posix ACL regression
   - Multiple fixes for regressions to the NFS generic read/write code:
     - Fix page splitting bugs that come into play when a small
       rsize/wsize read/write needs to be sent again (due to error
       conditions or page redirty)
     - Fix nfs_wb_page_cancel, which is called by the "invalidatepage"
       method
   - Fix 2 compile warnings about unused variables
   - Fix a performance issue affecting unstable writes"

* tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Don't reset pg_moreio in __nfs_pageio_add_request
  NFS: Remove 2 unused variables
  nfs: handle multiple reqs in nfs_wb_page_cancel
  nfs: handle multiple reqs in nfs_page_async_flush
  nfs: change find_request to find_head_request
  nfs: nfs_page should take a ref on the head req
  nfs: mark nfs_page reqs with flag for extra ref
  nfs: only show Posix ACLs in listxattr if actually present

90d51d56

20 Jul, 2014 4 commits

um: segv: Save regs only in case of a kernel mode fault · bb6a1b2e

Richard Weinberger authored Jul 20, 2014

...otherwise me lose user mode regs and the resulting
stack trace is useless.
Signed-off-by: Richard Weinberger <richard@nod.at>

bb6a1b2e

um: Fix hung task in fix_range_common() · 468f6597

Richard Weinberger authored Jul 20, 2014

If do_ops() fails we have to release current->mm->mmap_sem
otherwise the failing task will never terminate.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

468f6597

um: Ensure that a stub page cannot get unmapped · 284e6d39

Richard Weinberger authored Jul 20, 2014

Trinity discovered an execution path such that a task
can unmap his stub page.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

284e6d39

Revert "um: Fix wait_stub_done() error handling" · ae5db6d1

Richard Weinberger authored Jul 20, 2014

This reverts commit 0974a9ca.
The real for for that issue is to release current->mm->mmap_sem in
fix_range_common().
Signed-off-by: Richard Weinberger <richard@nod.at>

ae5db6d1

19 Jul, 2014 4 commits

btrfs: test for valid bdev before kobj removal in btrfs_rm_device · 0bfaa9c5

Eric Sandeen authored Jul 07, 2014

commit 99994cde btrfs: dev delete should remove sysfs entry
added a btrfs_kobj_rm_device, which dereferences device->bdev...
right after we check whether device->bdev might be NULL.

I don't honestly know if it's possible to have a NULL device->bdev
here, but assuming that it is (given the test), we need to move
the kobject removal to be under that test.

(Coverity spotted this)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>

0bfaa9c5

Btrfs: fix abnormal long waiting in fsync · 98ce2ded

Liu Bo authored Jul 17, 2014

xfstests generic/127 detected this problem.

With commit 7fc34a62, now fsync will only flush
data within the passed range.  This is the cause of the above problem,
-- btrfs's fsync has a stage called 'sync log' which will wait for all the
ordered extents it've recorded to finish.

In xfstests/generic/127, with mixed operations such as truncate, fallocate,
punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
mmap, and then msync.  And I find that msync will wait for quite a long time
(about 20s in my case), thanks to ftrace, it turns out that the previous
fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,
there can be some ordered extents created but not getting corresponding pages
flushed, then they're left in memory until we fsync which runs into the
stage 'sync log', and fsync will just wait for the system writeback thread
to flush those pages and get ordered extents finished, so the latency is
inevitable.

This adds a flush similar to btrfs_start_ordered_extent() in
btrfs_wait_logged_extents() to fix that.
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>

98ce2ded

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d0571909

Linus Torvalds authored Jul 19, 2014

Pull locking fixes from Thomas Gleixner:
 "The locking department delivers:

   - A rather large and intrusive bundle of fixes to address serious
     performance regressions introduced by the new rwsem / mcs
     technology.  Simpler solutions have been discussed, but they would
     have been ugly bandaids with more risk than doing the right thing.

   - Make the rwsem spin on owner technology opt-in for architectures
     and enable it only on the known to work ones.

   - A few fixes to the lockdep userspace library"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER
  locking/mutex: Disable optimistic spinning on some architectures
  locking/rwsem: Reduce the size of struct rw_semaphore
  locking/rwsem: Rename 'activity' to 'count'
  locking/spinlocks/mcs: Micro-optimize osq_unlock()
  locking/spinlocks/mcs: Introduce and use init macro and function for osq locks
  locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead
  locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()
  locking/rwsem: Allow conservative optimistic spinning when readers have lock
  tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire
  tools/liblockdep: Remove debug print left over from development
  tools/liblockdep: Fix comparison of a boolean value with a value of 2

d0571909

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d1743b81

Linus Torvalds authored Jul 19, 2014

Pull scheduler fix from Thomas Gleixner:
 "Prevent a possible divide by zero in the debugging code"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix possible divide by zero in avg_atom() calculation

d1743b81