1. 05 May, 2015 15 commits
    • Tejun Heo's avatar
      writeback: fix possible underflow in write bandwidth calculation · 8e9be18f
      Tejun Heo authored
      commit c72efb65 upstream.
      
      From 1ebf33901ecc75d9496862dceb1ef0377980587c Mon Sep 17 00:00:00 2001
      From: Tejun Heo <tj@kernel.org>
      Date: Mon, 23 Mar 2015 00:08:19 -0400
      
      2f800fbd ("writeback: fix dirtied pages accounting on redirty")
      introduced account_page_redirty() which reverts stat updates for a
      redirtied page, making BDI_DIRTIED no longer monotonically increasing.
      
      bdi_update_write_bandwidth() uses the delta in BDI_DIRTIED as the
      basis for bandwidth calculation.  While unlikely, since the above
      patch, the newer value may be lower than the recorded past value and
      underflow the bandwidth calculation leading to a wild result.
      
      Fix it by subtracing min of the old and new values when calculating
      delta.  AFAIK, there hasn't been any report of it happening but the
      resulting erratic behavior would be non-critical and temporary, so
      it's possible that the issue is happening without being reported.  The
      risk of the fix is very low, so tagged for -stable.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Greg Thelen <gthelen@google.com>
      Fixes: 2f800fbd ("writeback: fix dirtied pages accounting on redirty")
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8e9be18f
    • Vineet Gupta's avatar
      ARC: SA_SIGINFO ucontext regs off-by-one · 218bb3b8
      Vineet Gupta authored
      commit 6914e1e3 upstream.
      
      The regfile provided to SA_SIGINFO signal handler as ucontext was off by
      one due to pt_regs gutter cleanups in 2013.
      
      Before handling signal, user pt_regs are copied onto user_regs_struct and copied
      back later. Both structs are binary compatible. This was all fine until
      commit 2fa91904 (ARC: pt_regs update #2) which removed the empty stack slot
      at top of pt_regs (corresponding to first pad) and made the corresponding
      fixup in struct user_regs_struct (the pad in there was moved out of
      @scratch - not removed altogether as it is part of ptrace ABI)
      
       struct user_regs_struct {
      +       long pad;
              struct {
      -               long pad;
                      long bta, lp_start, lp_end,....
              } scratch;
       ...
       }
      
      This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
      signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
      which is what this commit does.
      
      This problem was hidden for 2 years, because both save/restore, despite
      using wrong location, were using the same location. Only an interim
      inspection (reproducer below) exposed the issue.
      
           void handle_segv(int signo, siginfo_t *info, void *context)
           {
       	ucontext_t *uc = context;
      	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);
      
      	printf("regs %x %x\n",               <=== prints 7 8 (vs. 8 9)
                     regs->scratch.r8, regs->scratch.r9);
           }
      
           int main()
           {
      	struct sigaction sa;
      
      	sa.sa_sigaction = handle_segv;
      	sa.sa_flags = SA_SIGINFO;
      	sigemptyset(&sa.sa_mask);
      	sigaction(SIGSEGV, &sa, NULL);
      
      	asm volatile(
      	"mov	r7, 7	\n"
      	"mov	r8, 8	\n"
      	"mov	r9, 9	\n"
      	"mov	r10, 10	\n"
      	:::"r7","r8","r9","r10");
      
      	*((unsigned int*)0x10) = 0;
           }
      
      Fixes: 2fa91904 "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      218bb3b8
    • Sergei Antonov's avatar
      hfsplus: fix B-tree corruption after insertion at position 0 · b5ef8af2
      Sergei Antonov authored
      commit 98cf21c6 upstream.
      
      Fix B-tree corruption when a new record is inserted at position 0 in the
      node in hfs_brec_insert().  In this case a hfs_brec_update_parent() is
      called to update the parent index node (if exists) and it is passed
      hfs_find_data with a search_key containing a newly inserted key instead
      of the key to be updated.  This results in an inconsistent index node.
      The bug reproduces on my machine after an extents overflow record for
      the catalog file (CNID=4) is inserted into the extents overflow B-tree.
      Because of a low (reserved) value of CNID=4, it has to become the first
      record in the first leaf node.
      
      The resulting first leaf node is correct:
      
        ----------------------------------------------------
        | key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
        ----------------------------------------------------
      
      But the parent index key0 still contains the previous key CNID=123:
      
        -----------------------
        | key0.CNID=123 | ... |
        -----------------------
      
      A change in hfs_brec_insert() makes hfs_brec_update_parent() work
      correctly by preventing it from getting fd->record=-1 value from
      __hfs_brec_find().
      
      Along the way, I removed duplicate code with unification of the if
      condition.  The resulting code is equivalent to the original code
      because node is never 0.
      
      Also hfs_brec_update_parent() will now return an error after getting a
      negative fd->record value.  However, the return value of
      hfs_brec_update_parent() is not checked anywhere in the file and I'm
      leaving it unchanged by this patch.  brec.c lacks error checking after
      some other calls too, but this issue is of less importance than the one
      being fixed by this patch.
      Signed-off-by: default avatarSergei Antonov <saproj@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Reviewed-by: default avatarVyacheslav Dubeyko <slava@dubeyko.com>
      Acked-by: default avatarHin-Tak Leung <htl10@users.sourceforge.net>
      Cc: Anton Altaparmakov <aia21@cam.ac.uk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b5ef8af2
    • Gu Zheng's avatar
      mm/memory hotplug: postpone the reset of obsolete pgdat · 4d751774
      Gu Zheng authored
      commit b0dc3a34 upstream.
      
      Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
      stress condition:
      
        BUG: unable to handle kernel paging request at 0000000000025f60
        IP: next_online_pgdat+0x1/0x50
        PGD 0
        Oops: 0000 [#1] SMP
        ACPI: Device does not support D3cold
        Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
        CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G           O 3.10.15-5885-euler0302 #1
        Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
        Workqueue: events vmstat_update
        task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
        RIP: 0010: next_online_pgdat+0x1/0x50
        RSP: 0018:ffffa800d32afce8  EFLAGS: 00010286
        RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
        RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
        RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
        R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
        R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
        FS:  0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
          refresh_cpu_vm_stats+0xd0/0x140
          vmstat_update+0x11/0x50
          process_one_work+0x194/0x3d0
          worker_thread+0x12b/0x410
          kthread+0xc6/0xd0
          ret_from_fork+0x7c/0xb0
      
      The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
      try_offline_node, which will reset all the content of pgdat to 0, as the
      pgdat is accessed lock-free, so that the users still using the pgdat
      will panic, such as the vmstat_update routine.
      
      process A:				offline node XX:
      
      vmstat_updat()
         refresh_cpu_vm_stats()
           for_each_populated_zone()
             find online node XX
           cond_resched()
      					offline cpu and memory, then try_offline_node()
      					node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
             zone = next_zone(zone)
               pg_data_t *pgdat = zone->zone_pgdat;  // here pgdat is NULL now
                 next_online_pgdat(pgdat)
                   next_online_node(pgdat->node_id);  // NULL pointer access
      
      So the solution here is postponing the reset of obsolete pgdat from
      try_offline_node() to hotadd_new_pgdat(), and just resetting
      pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
      0 to avoid breaking pointer information in pgdat.
      Signed-off-by: default avatarGu Zheng <guz.fnst@cn.fujitsu.com>
      Reported-by: default avatarXishi Qiu <qiuxishi@huawei.com>
      Suggested-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4d751774
    • Leon Yu's avatar
      mm: fix anon_vma->degree underflow in anon_vma endless growing prevention · ed3d98af
      Leon Yu authored
      commit 3fe89b3e upstream.
      
      I have constantly stumbled upon "kernel BUG at mm/rmap.c:399!" after
      upgrading to 3.19 and had no luck with 4.0-rc1 neither.
      
      So, after looking into new logic introduced by commit 7a3ef208 ("mm:
      prevent endless growth of anon_vma hierarchy"), I found chances are that
      unlink_anon_vmas() is called without incrementing dst->anon_vma->degree
      in anon_vma_clone() due to allocation failure.  If dst->anon_vma is not
      NULL in error path, its degree will be incorrectly decremented in
      unlink_anon_vmas() and eventually underflow when exiting as a result of
      another call to unlink_anon_vmas().  That's how "kernel BUG at
      mm/rmap.c:399!" is triggered for me.
      
      This patch fixes the underflow by dropping dst->anon_vma when allocation
      fails.  It's safe to do so regardless of original value of dst->anon_vma
      because dst->anon_vma doesn't have valid meaning if anon_vma_clone()
      fails.  Besides, callers don't care dst->anon_vma in such case neither.
      
      Also suggested by Michal Hocko, we can clean up vma_adjust() a bit as
      anon_vma_clone() now does the work.
      
      [akpm@linux-foundation.org: tweak comment]
      Fixes: 7a3ef208 ("mm: prevent endless growth of anon_vma hierarchy")
      Signed-off-by: default avatarLeon Yu <chianglungyu@gmail.com>
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ed3d98af
    • Joe Perches's avatar
      selinux: fix sel_write_enforce broken return value · cf781cbc
      Joe Perches authored
      commit 6436a123 upstream.
      
      Return a negative error value like the rest of the entries in this function.
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      [PM: tweaked subject line]
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      cf781cbc
    • Catalin Marinas's avatar
      arm64: Use the reserved TTBR0 if context switching to the init_mm · 930792ab
      Catalin Marinas authored
      commit e53f21bc upstream.
      
      The idle_task_exit() function may call switch_mm() with next ==
      &init_mm. On arm64, init_mm.pgd cannot be used for user mappings, so
      this patch simply sets the reserved TTBR0.
      Reported-by: default avatarJon Medhurst (Tixy) <tixy@linaro.org>
      Tested-by: default avatarJon Medhurst (Tixy) <tixy@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      930792ab
    • Peter Zijlstra's avatar
      perf: Fix irq_work 'tail' recursion · 464ab131
      Peter Zijlstra authored
      commit d525211f upstream.
      
      Vince reported a watchdog lockup like:
      
      	[<ffffffff8115e114>] perf_tp_event+0xc4/0x210
      	[<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
      	[<ffffffff810b7f10>] lock_release+0x130/0x260
      	[<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
      	[<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
      	[<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
      	[<ffffffff811f71ce>] send_sigio+0xae/0x100
      	[<ffffffff811f72b7>] kill_fasync+0x97/0xf0
      	[<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
      	[<ffffffff8115d103>] perf_pending_event+0x33/0x60
      	[<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
      	[<ffffffff8114e448>] irq_work_run+0x18/0x40
      	[<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
      	[<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
      
      Which is caused by an irq_work generating new irq_work and therefore
      not allowing forward progress.
      
      This happens because processing the perf irq_work triggers another
      perf event (tracepoint stuff) which in turn generates an irq_work ad
      infinitum.
      
      Avoid this by raising the recursion counter in the irq_work -- which
      effectively disables all software events (including tracepoints) from
      actually triggering again.
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Tested-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      464ab131
    • Markos Chandras's avatar
      net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} · 4235c2c4
      Markos Chandras authored
      commit 87f966d9 upstream.
      
      On a MIPS Malta board, tons of fifo underflow errors have been observed
      when using u-boot as bootloader instead of YAMON. The reason for that
      is that YAMON used to set the pcnet device to SRAM mode but u-boot does
      not. As a result, the default Tx threshold (64 bytes) is now too small to
      keep the fifo relatively used and it can result to Tx fifo underflow errors.
      As a result of which, it's best to setup the SRAM on supported controllers
      so we can always use the NOUFLO bit.
      
      Cc: <netdev@vger.kernel.org>
      Cc: <linux-kernel@vger.kernel.org>
      Cc: Don Fry <pcnet32@frontier.com>
      Signed-off-by: default avatarMarkos Chandras <markos.chandras@imgtec.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4235c2c4
    • Tyrel Datwyler's avatar
      powerpc/pseries: Little endian fixes for post mobility device tree update · eeab505b
      Tyrel Datwyler authored
      commit f6ff0414 upstream.
      
      We currently use the device tree update code in the kernel after resuming
      from a suspend operation to re-sync the kernels view of the device tree with
      that of the hypervisor. The code as it stands is not endian safe as it relies
      on parsing buffers returned by RTAS calls that thusly contains data in big
      endian format.
      
      This patch annotates variables and structure members with __be types as well
      as performing necessary byte swaps to cpu endian for data that needs to be
      parsed.
      Signed-off-by: default avatarTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Cc: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      eeab505b
    • Uwe Kleine-König's avatar
      spi: trigger trace event for message-done before mesg->complete · 6b101400
      Uwe Kleine-König authored
      commit 391949b6 upstream.
      
      With spidev the mesg->complete callback points to spidev_complete.
      Calling this unblocks spidev_sync and so spidev_sync_write finishes. As
      the struct spi_message just read is a local variable in
      spidev_sync_write and recording the trace event accesses this message
      the recording is better done first. The same can happen for
      spidev_sync_read.
      
      This fixes an oops observed on a 3.14-rt system with spidev activity
      after
      
      	echo 1 > /sys/kernel/debug/tracing/events/spi/enable
      
      .
      Signed-off-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6b101400
    • Radim Krčmář's avatar
      KVM: nVMX: mask unrestricted_guest if disabled on L0 · 63f1e745
      Radim Krčmář authored
      commit 0790ec17 upstream.
      
      If EPT was enabled, unrestricted_guest was allowed in L1 regardless of
      L0.  L1 triple faulted when running L2 guest that required emulation.
      
      Another side effect was 'WARN_ON_ONCE(vmx->nested.nested_run_pending)'
      in L0's dmesg:
        WARNING: CPU: 0 PID: 0 at arch/x86/kvm/vmx.c:9190 nested_vmx_vmexit+0x96e/0xb00 [kvm_intel] ()
      
      Prevent this scenario by masking SECONDARY_EXEC_UNRESTRICTED_GUEST when
      the host doesn't have it enabled.
      
      Fixes: 78051e3b ("KVM: nVMX: Disable unrestricted mode if ept=0")
      Tested-By: default avatarKashyap Chamarthy <kchamart@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      [ kamal: backport to 3.13-stable: context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      63f1e745
    • Ameya Palande's avatar
      mfd: kempld-core: Fix callback return value check · a367d8c0
      Ameya Palande authored
      commit c8648508 upstream.
      
      On success, callback function returns 0. So invert the if condition
      check so that we can break out of loop.
      Signed-off-by: default avatarAmeya Palande <2ameya@gmail.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a367d8c0
    • Sudip Mukherjee's avatar
      nbd: fix possible memory leak · 333fe40e
      Sudip Mukherjee authored
      commit ff6b8090 upstream.
      
      we have already allocated memory for nbd_dev, but we were not
      releasing that memory and just returning the error value.
      Signed-off-by: default avatarSudip Mukherjee <sudip@vectorindia.org>
      Acked-by: default avatarPaul Clements <Paul.Clements@SteelEye.com>
      Signed-off-by: default avatarMarkus Pargmann <mpa@pengutronix.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      333fe40e
    • Tejun Heo's avatar
      writeback: add missing INITIAL_JIFFIES init in global_update_bandwidth() · 78bb3718
      Tejun Heo authored
      commit 7d70e154 upstream.
      
      global_update_bandwidth() uses static variable update_time as the
      timestamp for the last update but forgets to initialize it to
      INITIALIZE_JIFFIES.
      
      This means that global_dirty_limit will be 5 mins into the future on
      32bit and some large amount jiffies into the past on 64bit.  This
      isn't critical as the only effect is that global_dirty_limit won't be
      updated for the first 5 mins after booting on 32bit machines,
      especially given the auxiliary nature of global_dirty_limit's role -
      protecting against global dirty threshold's sudden dips; however, it
      does lead to unintended suboptimal behavior.  Fix it.
      
      Fixes: c42843f2 ("writeback: introduce smoothed global dirty limit")
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      78bb3718
  2. 01 May, 2015 5 commits
    • Ben Hutchings's avatar
      tcp: Fix crash in TCP Fast Open · 7d157c2e
      Ben Hutchings authored
      Commit 355a901e ("tcp: make connect() mem charging friendly")
      changed tcp_send_syn_data() to perform an open-coded copy of the 'syn'
      skb rather than using skb_copy_expand().
      
      The open-coded copy does not cover the skb_shared_info::gso_segs
      field, so in the new skb it is left set to 0.  When this commit was
      backported into stable branches between 3.10.y and 3.16.7-ckty
      inclusive, it triggered the BUG() in tcp_transmit_skb().
      
      Since Linux 3.18 the GSO segment count is kept in the
      tcp_skb_cb::tcp_gso_segs field and tcp_send_syn_data() does copy the
      tcp_skb_cb structure to the new skb, so mainline and newer stable
      branches are not affected.
      
      Set skb_shared_info::gso_segs to the correct value of 1.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      [ kamal: pre-3.18-stable only; no upstream commit ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7d157c2e
    • Jann Horn's avatar
      fs: take i_mutex during prepare_binprm for set[ug]id executables · ee212ab1
      Jann Horn authored
      commit 8b01fc86 upstream.
      
      This prevents a race between chown() and execve(), where chowning a
      setuid-user binary to root would momentarily make the binary setuid
      root.
      
      This patch was mostly written by Linus Torvalds.
      Signed-off-by: default avatarJann Horn <jann@thejh.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [ luis: backported to 3.16:
        - relaced task_no_new_privs() by current->no_new_privs
        - replaced READ_ONCE() by ACCESS_ONCE() ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ee212ab1
    • Nadav Amit's avatar
      KVM: x86: Fix lost interrupt on irr_pending race · 078be168
      Nadav Amit authored
      commit f210f757 upstream.
      
      apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR is
      set.  If this assumption is broken and apicv is disabled, the injection of
      interrupts may be deferred until another interrupt is delivered to the guest.
      Ultimately, if no other interrupt should be injected to that vCPU, the pending
      interrupt may be lost.
      
      commit 56cc2406 ("KVM: nVMX: fix "acknowledge interrupt on exit" when APICv
      is in use") changed the behavior of apic_clear_irr so irr_pending is cleared
      after setting APIC_IRR vector. After this commit, if apic_set_irr and
      apic_clear_irr run simultaneously, a race may occur, resulting in APIC_IRR
      vector set, and irr_pending cleared. In the following example, assume a single
      vector is set in IRR prior to calling apic_clear_irr:
      
      apic_set_irr				apic_clear_irr
      ------------				--------------
      apic->irr_pending = true;
      					apic_clear_vector(...);
      					vec = apic_search_irr(apic);
      					// => vec == -1
      apic_set_vector(...);
      					apic->irr_pending = (vec != -1);
      					// => apic->irr_pending == false
      
      Nonetheless, it appears the race might even occur prior to this commit:
      
      apic_set_irr				apic_clear_irr
      ------------				--------------
      apic->irr_pending = true;
      					apic->irr_pending = false;
      					apic_clear_vector(...);
      					if (apic_search_irr(apic) != -1)
      						apic->irr_pending = true;
      					// => apic->irr_pending == false
      apic_set_vector(...);
      
      Fixing this issue by:
      1. Restoring the previous behavior of apic_clear_irr: clear irr_pending, call
         apic_clear_vector, and then if APIC_IRR is non-zero, set irr_pending.
      2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      078be168
    • Peter Hurley's avatar
      n_tty: Fix read buffer overwrite when no newline · 1c11b289
      Peter Hurley authored
      commit fb5ef9e7 upstream.
      
      BugLink: http://bugs.launchpad.net/bugs/1381005
      
      In canon mode, the read buffer head will advance over the buffer tail
      if the input > 4095 bytes without receiving a line termination char.
      
      Discard additional input until a line termination is received.
      Before evaluating for overflow, the 'room' value is normalized for
      I_PARMRK and 1 byte is reserved for line termination (even in !icanon
      mode, in case the mode is switched). The following table shows the
      transform:
      
       actual buffer |  'room' value before overflow calc
        space avail  |    !I_PARMRK    |    I_PARMRK
       --------------------------------------------------
            0        |       -1        |       -1
            1        |        0        |        0
            2        |        1        |        0
            3        |        2        |        0
            4+       |        3        |        1
      
      When !icanon or when icanon and the read buffer contains newlines,
      normalized 'room' values of -1 and 0 are clamped to 0, and
      'overflow' is 0, so read_head is not adjusted and the input i/o loop
      exits (setting no_room if called from flush_to_ldisc()). No input
      is discarded since the reader does have input available to read
      which ensures forward progress.
      
      When icanon and the read buffer does not contain newlines and the
      normalized 'room' value is 0, then overflow and room are reset to 1,
      so that the i/o loop will process the next input char normally
      (except for parity errors which are ignored). Thus, erasures, signalling
      chars, 7-bit mode, etc. will continue to be handled properly.
      
      If the input char processed was not a line termination char, then
      the canon_head index will not have advanced, so the normalized 'room'
      value will now be -1 and 'overflow' will be set, which indicates the
      read_head can safely be reset, effectively erasing the last char
      processed.
      
      If the input char processed was a line termination, then the
      canon_head index will have advanced, so 'overflow' is cleared to 0,
      the read_head is not reset, and 'room' is cleared to 0, which exits
      the i/o loop (because the reader now have input available to read
      which ensures forward progress).
      
      Note that it is possible for a line termination to be received, and
      for the reader to copy the line to the user buffer before the
      input i/o loop is ready to process the next input char. This is
      why the i/o loop recomputes the room/overflow state with every
      input char while handling overflow.
      
      Finally, if the input data was processed without receiving
      a line termination (so that overflow is still set), the pty
      driver must receive a write wakeup. A pty writer may be waiting
      to write more data in n_tty_write() but without unthrottling
      here that wakeup will not arrive, and forward progress will halt.
      (Normally, the pty writer is woken when the reader reads data out
      of the buffer and more space become available).
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      (backported from commit fb5ef9e7)
      Signed-off-by: default avatarJoseph Salisbury <joseph.salisbury@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1c11b289
    • Peter Hurley's avatar
      n_tty: Merge .receive_buf() flavors · 9c4da38d
      Peter Hurley authored
      commit 5c32d123 upstream.
      
      N_TTY's direct and flow-controlled flavors of the .receive_buf()
      method are nearly identical; fold together.
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [ kamal: 3.13-stable prereq for
        fb5ef9e7 n_tty: Fix read buffer overwrite when no newline ]
      Cc: Joseph Salisbury <joseph.salisbury@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      9c4da38d
  3. 10 Apr, 2015 18 commits
  4. 08 Apr, 2015 2 commits