Commits · f014e54c21bf0ee03bac265d84feffbb7e233e89 · Kirill Smelkov / linux

24 May, 2015 18 commits

net: sctp: fix memory leak in auth key management · f014e54c

Daniel Borkmann authored Nov 10, 2014

commit 4184b2a7 upstream.

A very minimal and simple user space application allocating an SCTP
socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing
the socket again will leak the memory containing the authentication
key from user space:

unreferenced object 0xffff8800837047c0 (size 16):
  comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
  hex dump (first 16 bytes):
    01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811c88d8>] __kmalloc+0xe8/0x270
    [<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp]
    [<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp]
    [<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp]
    [<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20
    [<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0
    [<ffffffff816e58a9>] system_call_fastpath+0x12/0x17
    [<ffffffffffffffff>] 0xffffffffffffffff

This is bad because of two things, we can bring down a machine from
user space when auth_enable=1, but also we would leave security sensitive
keying material in memory without clearing it after use. The issue is
that sctp_auth_create_key() already sets the refcount to 1, but after
allocation sctp_auth_set_key() does an additional refcount on it, and
thus leaving it around when we free the socket.

Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit 3af10169)
Signed-off-by: Willy Tarreau <w@1wt.eu>

f014e54c

isofs: Fix unchecked printing of ER records · 09b5d759

Jan Kara authored Dec 18, 2014

commit 4e202462 upstream

We didn't check length of rock ridge ER records before printing them.
Thus corrupted isofs image can cause us to access and print some memory
behind the buffer with obvious consequences.
Reported-and-tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>

09b5d759

isofs: Fix infinite looping over CE entries · 08313e26

Jan Kara authored Dec 15, 2014

commit f54e18f1 upstream

Rock Ridge extensions define so called Continuation Entries (CE) which
define where is further space with Rock Ridge data. Corrupted isofs
image can contain arbitrarily long chain of these, including a one
containing loop and thus causing kernel to end in an infinite loop when
traversing these entries.

Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.
Reported-by: P J P <ppandit@redhat.com>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>

08313e26

netfilter: conntrack: disable generic tracking for known protocols · 8e421332

Florian Westphal authored Sep 26, 2014

commit db29a950 upstream

Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

 [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 2.6.32: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

8e421332

splice: Apply generic position and size checks to each write · 7e6536a2

Ben Hutchings authored Jan 29, 2015

We need to check the position and size of file writes against various
limits, using generic_write_check().  This was not being done for
the splice write path.  It was fixed upstream by commit 8d020765
("->splice_write() via ->write_iter()") but we can't apply that.

CVE-2014-7822
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

7e6536a2

serial: samsung: wait for transfer completion before clock disable · 17f35338

Robert Baldyga authored Nov 24, 2014

This patch adds waiting until transmit buffer and shifter will be empty
before clock disabling.

Without this fix it's possible to have clock disabled while data was
not transmited yet, which causes unproper state of TX line and problems
in following data transfers.

Cc: stable@vger.kernel.org
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1ff383a4)
Signed-off-by: Willy Tarreau <w@1wt.eu>

17f35338

x86: Conditionally update time when ack-ing pending irqs · d7522130

Shai Fultheim authored Apr 20, 2012

commit 42fa4250 upstream.

On virtual environments, apic_read could take a long time. As a
result, under certain conditions the ack pending loop may exit
without any queued irqs left, but after more than one second. A
warning will be printed needlessly in this case.

If the loop is about to exit regardless of max_loops, don't
update it.
Signed-off-by: Shai Fultheim <shai@scalemp.com>
[ rebased and reworded the commit message]
Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1334873552-31346-1-git-send-email-ido@wizery.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit c9f1417b)
Signed-off-by: Willy Tarreau <w@1wt.eu>

d7522130

x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization · c069a4f2

Andy Lutomirski authored Mar 05, 2015

commit 956421fb upstream.

'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and
the related state make sense for 'ret_from_sys_call'.  This is
entirely the wrong check.  TS_COMPAT would make a little more
sense, but there's really no point in keeping this optimization
at all.

This fixes a return to the wrong user CS if we came from int
0x80 in a 64-bit task.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net
[ Backported from tip:x86/asm. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit 159891c0)
Signed-off-by: Willy Tarreau <w@1wt.eu>

c069a4f2

x86, cpu, amd: Add workaround for family 16h, erratum 793 · ebcbe139

Borislav Petkov authored Jan 15, 2014

commit 3b564968 upstream

This adds the workaround for erratum 793 as a precaution in case not
every BIOS implements it.  This addresses CVE-2013-6885.

Erratum text:

[Revision Guide for AMD Family 16h Models 00h-0Fh Processors,
document 51810 Rev. 3.04 November 2013]

793 Specific Combination of Writes to Write Combined Memory Types and
Locked Instructions May Cause Core Hang

Description

Under a highly specific and detailed set of internal timing
conditions, a locked instruction may trigger a timing sequence whereby
the write to a write combined memory type is not flushed, causing the
locked instruction to stall indefinitely.

Potential Effect on System

Processor core hang.

Suggested Workaround

BIOS should set MSR
C001_1020[15] = 1b.

Fix Planned

No fix planned

[ hpa: updated description, fixed typo in MSR name ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnicTested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
[bwh: Backported to 3.2:
 - Adjust filename
 - Venkatesh Srinivas pointed out we should use {rd,wr}msrl_safe() to
   avoid crashing on KVM.  This was fixed upstream by commit 8f86a737
   ("x86, AMD: Convert to the new bit access MSR accessors") but that's too
   much trouble to backport.  Here we must use {rd,wr}msrl_amd_safe().]
Signed-off-by: Willy Tarreau <w@1wt.eu>

ebcbe139

ASLR: fix stack randomization on 64-bit systems · ba455d81

Hector Marco-Gisbert authored Feb 14, 2015

commit 4e7c22d4 upstream

The issue is that the stack for processes is not properly randomized on 64 bit
architectures due to an integer overflow.

The affected function is randomize_stack_top() in file "fs/binfmt_elf.c":

static unsigned long randomize_stack_top(unsigned long stack_top)
{
         unsigned int random_variable = 0;

         if ((current->flags & PF_RANDOMIZE) &&
                 !(current->personality & ADDR_NO_RANDOMIZE)) {
                 random_variable = get_random_int() & STACK_RND_MASK;
                 random_variable <<= PAGE_SHIFT;
         }
         return PAGE_ALIGN(stack_top) + random_variable;
         return PAGE_ALIGN(stack_top) - random_variable;
}

Note that, it declares the "random_variable" variable as "unsigned int". Since
the result of the shifting operation between STACK_RND_MASK (which is
0x3fffff on x86_64, 22 bits) and PAGE_SHIFT (which is 12 on x86_64):

random_variable <<= PAGE_SHIFT;

then the two leftmost bits are dropped when storing the result in the
"random_variable". This variable shall be at least 34 bits long to hold the
(22+12) result.

These two dropped bits have an impact on the entropy of process stack.
Concretely, the total stack entropy is reduced by four: from 2^28 to 2^30 (One
fourth of expected entropy).

This patch restores back the entropy by correcting the types involved in the
operations in the functions randomize_stack_top() and stack_maxrandom_size().

The successful fix can be tested with:
$ for i in `seq 1 10`; do cat /proc/self/maps | grep stack; done
7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0                          [stack]
7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0                          [stack]
7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0                          [stack]
7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0                          [stack]
...

Once corrected, the leading bytes should be between 7ffc and 7fff, rather
than always being 7fff.

CVE-2015-1593
Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: Ismael Ripoll <iripoll@upv.es>
[kees: rebase, fix 80 char, clean up commit message, add test example, cve]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>

ba455d81

x86_64, vdso: Fix the vdso address randomization algorithm · 7f36f1ac

Andy Lutomirski authored Dec 19, 2014

commit 394f56fe upstream

The theory behind vdso randomization is that it's mapped at a random
offset above the top of the stack.  To avoid wasting a page of
memory for an extra page table, the vdso isn't supposed to extend
past the lowest PMD into which it can fit.  Other than that, the
address should be a uniformly distributed address that meets all of
the alignment requirements.

The current algorithm is buggy: the vdso has about a 50% probability
of being at the very end of a PMD.  The current algorithm also has a
decent chance of failing outright due to incorrect handling of the
case where the top of the stack is near the top of its PMD.

This fixes the implementation.  The paxtest estimate of vdso
"randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
don't know what the paxtest code is actually calculating.)

It's worth noting that this algorithm is inherently biased: the vdso
is more likely to end up near the end of its PMD than near the
beginning.  Ideally we would either nix the PMD sharing requirement
or jointly randomize the vdso and the stack to reduce the bias.

In the mean time, this is a considerable improvement with basically
no risk of compatibility issues, since the allowed outputs of the
algorithm are unchanged.

As an easy test, doing this:

for i in `seq 10000`
  do grep -P vdso /proc/self/maps |cut -d- -f1
done |sort |uniq -d

used to produce lots of output (1445 lines on my most recent run).
A tiny subset looks like this:

7fffdfffe000
7fffe01fe000
7fffe05fe000
7fffe07fe000
7fffe09fe000
7fffe0bfe000
7fffe0dfe000

Note the suspicious fe000 endings.  With the fix, I get a much more
palatable 76 repeated addresses.
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[bwh: Backported to 2.6.32:
 - The whole file is only built for x86_64; adjust context and comment for this
 - We don't have align_vdso_addr()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

7f36f1ac

x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit · b23c0b06

Andy Lutomirski authored Dec 05, 2014

commit 29fa6825 upstream

paravirt_enabled has the following effects:

 - Disables the F00F bug workaround warning.  There is no F00F bug
   workaround any more because Linux's standard IDT handling already
   works around the F00F bug, but the warning still exists.  This
   is only cosmetic, and, in any event, there is no such thing as
   KVM on a CPU with the F00F bug.

 - Disables 32-bit APM BIOS detection.  On a KVM paravirt system,
   there should be no APM BIOS anyway.

 - Disables tboot.  I think that the tboot code should check the
   CPUID hypervisor bit directly if it matters.

 - paravirt_enabled disables espfix32.  espfix32 should *not* be
   disabled under KVM paravirt.

The last point is the purpose of this patch.  It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests.  Fixes CVE-2014-8134.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
[bwh: Backported to 2.6.32: adjust indentation, context]
Signed-off-by: Willy Tarreau <w@1wt.eu>

b23c0b06

x86/tls: Don't validate lm in set_thread_area() after all · c52fdba6

Andy Lutomirski authored Dec 17, 2014

commit 3fb2f423 upstream.

It turns out that there's a lurking ABI issue.  GCC, when
compiling this in a 32-bit program:

struct user_desc desc = {
	.entry_number    = idx,
	.base_addr       = base,
	.limit           = 0xfffff,
	.seg_32bit       = 1,
	.contents        = 0, /* Data, grow-up */
	.read_exec_only  = 0,
	.limit_in_pages  = 1,
	.seg_not_present = 0,
	.useable         = 0,
};

will leave .lm uninitialized.  This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.

Revert the .lm check in set_thread_area().  The value never did
anything in the first place.

Fixes: 0e58af4e ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.netSigned-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit c759a579)
Signed-off-by: Willy Tarreau <w@1wt.eu>

c52fdba6

x86/tls: Disallow unusual TLS segments · 18cb16aa

Andy Lutomirski authored Dec 04, 2014

commit 0e58af4e upstream.

Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.

For completeness, block attempts to set the L bit.  (Prior to
this patch, the L bit would have been silently dropped.)

This is an ABI break.  I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.

Note to stable maintainers: this is a hardening patch that fixes
no known bugs.  Given the possibility of ABI issues, this
probably shouldn't be backported quickly.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: security@kernel.org <security@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit fbc3c534)
Signed-off-by: Willy Tarreau <w@1wt.eu>

18cb16aa

x86, tls: Interpret an all-zero struct user_desc as "no segment" · 1f50d3c7

Andy Lutomirski authored Jan 22, 2015

commit 3669ef9f upstream.

The Witcher 2 did something like this to allocate a TLS segment index:

        struct user_desc u_info;
        bzero(&u_info, sizeof(u_info));
        u_info.entry_number = (uint32_t)-1;

        syscall(SYS_set_thread_area, &u_info);

Strictly speaking, this code was never correct.  It should have set
read_exec_only and seg_not_present to 1 to indicate that it wanted
to find a free slot without putting anything there, or it should
have put something sensible in the TLS slot if it wanted to allocate
a TLS entry for real.  The actual effect of this code was to
allocate a bogus segment that could be used to exploit espfix.

The set_thread_area hardening patches changed the behavior, causing
set_thread_area to return -EINVAL and crashing the game.

This changes set_thread_area to interpret this as a request to find
a free slot and to leave it empty, which isn't *quite* what the game
expects but should be close enough to keep it working.  In
particular, using the code above to allocate two segments will
allocate the same segment both times.

According to FrostbittenKing on Github, this fixes The Witcher 2.

If this somehow still causes problems, we could instead allocate
a limit==0 32-bit data segment, but that seems rather ugly to me.

Fixes: 41bdc785 x86/tls: Validate TLS entries to protect espfix
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/0cb251abe1ff0958b8e468a9a9a905b80ae3a746.1421954363.git.luto@amacapital.netSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit 3175b4cb)
Signed-off-by: Willy Tarreau <w@1wt.eu>

1f50d3c7

x86, tls, ldt: Stop checking lm in LDT_empty · 598b6280

Andy Lutomirski authored Jan 22, 2015

commit e30ab185 upstream.

32-bit programs don't have an lm bit in their ABI, so they can't
reliably cause LDT_empty to return true without resorting to memset.
They shouldn't need to do this.

This should fix a longstanding, if minor, issue in all 64-bit kernels
as well as a potential regression in the TLS hardening code.

Fixes: 41bdc785 x86/tls: Validate TLS entries to protect espfix
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/72a059de55e86ad5e2935c80aa91880ddf19d07c.1421954363.git.luto@amacapital.netSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit f62570cb)
Signed-off-by: Willy Tarreau <w@1wt.eu>

598b6280

x86/tls: Validate TLS entries to protect espfix · 85e01300

Andy Lutomirski authored Dec 04, 2014

commit 41bdc785 upstream

Installing a 16-bit RW data segment into the GDT defeats espfix.
AFAICT this will not affect glibc, Wine, or dosemu at all.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: security@kernel.org <security@kernel.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

85e01300

x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs · f0d8cc6f

Andy Lutomirski authored Nov 24, 2014

commit 7ddc6a21 upstream.

These functions can be executed on the int3 stack, so kprobes
are dangerous. Tracing is probably a bad idea, too.

Fixes: b645af2d ("x86_64, traps: Rework bad_iret")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/50e33d26adca60816f3ba968875801652507d0c4.1416870125.git.luto@amacapital.netSigned-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2:
 - Use __kprobes instead of NOKPROBE_SYMBOL()
 - Don't use __visible]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit 8ea4c465)
Signed-off-by: Willy Tarreau <w@1wt.eu>

f0d8cc6f

13 Dec, 2014 22 commits

Linux 2.6.32.65 · 3bd0d1ad
Willy Tarreau authored Dec 13, 2014
```
Signed-off-by: Willy Tarreau <w@1wt.eu>
```
3bd0d1ad

proc connector: Delete spurious memset in proc_exit_connector() · 6eef90fc

Ben Hutchings authored Dec 07, 2014

Upstream commit e727ca82 ("proc connector: fix info leaks")
changed many functions that don't exist in 2.6.32.y.  When it was
cherry-picked into 2.6.32.61, one extra memset() calls was inserted
into proc_exit_connector().  This results in clearing the cpu
field of exit events.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

6eef90fc

cciss: Fix misapplied "cciss: fix info leak in cciss_ioctl32_passthru()" · a99c4d9b

Ben Hutchings authored Dec 07, 2014

Upstream commit 58f09e00 was applied to the wrong function when
cherry-picked for 2.6.32.61.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

a99c4d9b

block: Fix blk_execute_rq_nowait() dead queue handling · d22e4c78

Muthukumar Ratty authored Dec 07, 2014

commit e81ca6fe upstream.

If the queue is dead blk_execute_rq_nowait() doesn't invoke the done()
callback function. That will result in blk_execute_rq() being stuck
in wait_for_completion(). Avoid this by initializing rq->end_io to the
done() callback before we check the queue state. Also, make sure the
queue lock is held around the invocation of the done() callback. Found
this through source code review.
Signed-off-by: Muthukumar Ratty <muthur@gmail.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
[bwh: Backported to 2.6.32: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

d22e4c78

block: add missing blk_queue_dead() checks · 3b77fc36

Tejun Heo authored Dec 07, 2014

commit 8ba61435 upstream.

blk_insert_cloned_request(), blk_execute_rq_nowait() and
blk_flush_plug_list() either didn't check whether the queue was dead
or did it without holding queue_lock.  Update them so that dead state
is checked while holding queue_lock.

AFAICS, this plugs all holes (requeue doesn't matter as the request is
transitioning atomically from in_flight to queued).
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[bwh: Backported to 2.6.32:
 - Drop inapplicable changes to queue_unplugged() and
   blk_flush_plug_list()
 - We don't have blk_queue_dead() so open-code it
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

3b77fc36

md/raid6: Fix misapplied backport in 2.6.32.64 · d4946518

Ben Hutchings authored Dec 07, 2014

Upstream commit 9c4bdf69 ("md/raid6: avoid data corruption during
recovery of double-degraded RAID6") changes handle_stripe(), but we
have separate functions for RAID5 and RAID6 and need to apply the
change to handle_stripe6().  When cherry-picked, the change was
wrongly applied to handle_stripe5().
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

d4946518

sctp: Fix double-free introduced by bad backport in 2.6.32.62 · 7185a719

Ben Hutchings authored Dec 07, 2014

One deletion was omitted from the backport of upstream commit c485658b
("net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk").
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

7185a719

vlan: Don't propagate flag changes on down interfaces. · 04b872de

Matthijs Kooijman authored Dec 07, 2014

commit deede2fa upstream.

When (de)configuring a vlan interface, the IFF_ALLMULTI ans IFF_PROMISC
flags are cleared or set on the underlying interface. So, if these flags
are changed on a vlan interface that is not up, the flags underlying
interface might be set or cleared twice.

Only propagating flag changes when a device is up makes sure this does
not happen. It also makes sure that an underlying device is not set to
promiscuous or allmulti mode for a vlan device that is down.
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: This is a dependency of commit d2615bf4 ("net: core: Always
 propagate flag changes to interfaces"), already backported in 2.6.32.62]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>

04b872de

ttusb-dec: buffer overflow in ioctl · 44023d2f

Dan Carpenter authored Nov 24, 2014

commit dc0ab1ddeb0c5f5eb3f37a72eadb394792b3c40d upstream

We need to add a limit check here so we don't overflow the buffer.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
(backported from commit f2e323ec)
CVE-2014-8884
BugLink: http://bugs.launchpad.net/bugs/1395187Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

44023d2f

mac80211: fix fragmentation code, particularly for encryption · ed99f577

Johannes Berg authored Nov 24, 2014

commit a722a419815ed203b519151f9556859ff256638b upstream

The "new" fragmentation code (since my rewrite almost 5 years ago)
erroneously sets skb->len rather than using skb_trim() to adjust
the length of the first fragment after copying out all the others.
This leaves the skb tail pointer pointing to after where the data
originally ended, and thus causes the encryption MIC to be written
at that point, rather than where it belongs: immediately after the
data.

The impact of this is that if software encryption is done, then
 a) encryption doesn't work for the first fragment, the connection
    becomes unusable as the first fragment will never be properly
    verified at the receiver, the MIC is practically guaranteed to
    be wrong
 b) we leak up to 8 bytes of plaintext (!) of the packet out into
    the air

This is only mitigated by the fact that many devices are capable
of doing encryption in hardware, in which case this can't happen
as the tail pointer is irrelevant in that case. Additionally,
fragmentation is not used very frequently and would normally have
to be configured manually.

Fix this by using skb_trim() properly.

Cc: stable@vger.kernel.org
Fixes: 2de8e0d9 ("mac80211: rewrite fragmentation")
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
(backported from commit 338f977f)
CVE-2014-8709
BugLink: http://bugs.launchpad.net/bugs/1392013Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

ed99f577

net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet · 4e5bfd77

Daniel Borkmann authored Nov 24, 2014

commit a6e1af6f5e9e75095ae1b004f6f00082e5fc4d2d upstream

An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
in the form of:

  ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>

While the INIT chunk parameter verification dissects through many things
in order to detect malformed input, it misses to actually check parameters
inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set primary
IP address' parameter in ASCONF, which has as a subparameter an address
parameter.

So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
and thus sctp_get_af_specific() returns NULL, too, which we then happily
dereference unconditionally through af->from_addr_param().

The trace for the log:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
PGD 0
Oops: 0000 [#1] SMP
[...]
Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa01e9c62>]  [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
[...]
Call Trace:
 <IRQ>
 [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
 [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
 [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
 [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
 [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
 [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
 [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
 [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[...]

A minimal way to address this is to check for NULL as we do on all
other such occasions where we know sctp_get_af_specific() could
possibly return with NULL.

Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e40607cb)
CVE-2014-7841
BugLink: http://bugs.launchpad.net/bugs/1392820Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

4e5bfd77

udf: Avoid infinite loop when processing indirect ICBs · 0973cb31

Jan Kara authored Sep 23, 2014

commit 541d302ee5c46336cbad333222bc278b76cc1c42 upstream

We did not implement any bound on number of indirect ICBs we follow when
loading inode. Thus corrupted medium could cause kernel to go into an
infinite loop, possibly causing a stack overflow.

Fix the possible stack overflow by removing recursion from
__udf_read_inode() and limit number of indirect ICBs we follow to avoid
infinite loops.
Signed-off-by: Jan Kara <jack@suse.cz>
(back ported from commit c03aa9f6)
[ luis: adjusted context and replaced udf_err() by printk() ]
CVE-2014-6410
BugLink: http://bugs.launchpad.net/bugs/1370042Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

0973cb31

net: sctp: fix remote memory pressure from excessive queueing · b77b8578

Daniel Borkmann authored Nov 11, 2014

commit b2d33ef40de7161c23032106284959ae75bdf3cd upstream

This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
  [...]
  ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>

... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed
ASCONFs and have increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP does
only do verification on a chunk by chunk basis, as an SCTP
packet is nothing more than just a container of a stream of
chunks which it eats up one by one.

We could run into the case that we receive a packet with a
malformed tail, above marked as trailing JUNK. All previous
chunks are here goodformed, so the stack will eat up all
previous chunks up to this point. In case JUNK does not fit
into a chunk header and there are no more other chunks in
the input queue, or in case JUNK contains a garbage chunk
header, but the encoded chunk length would exceed the skb
tail, or we came here from an entirely different scenario
and the chunk has pdiscard=1 mark (without having had a flush
point), it will happen, that we will excessively queue up
the association's output queue (a correct final chunk may
then turn it into a response flood when flushing the
queue ;)): I ran a simple script with incremental ASCONF
serial numbers and could see the server side consuming
excessive amount of RAM [before/after: up to 2GB and more].

The issue at heart is that the chunk train basically ends
with !end_of_packet and !singleton markers and since commit
2e3216cd ("sctp: Follow security requirement of responding
with 1 packet") therefore preventing an output queue flush
point in sctp_do_sm() -> sctp_cmd_interpreter() on the input
chunk (chunk = event_arg) even though local_cork is set,
but its precedence has changed since then. In the normal
case, the last chunk with end_of_packet=1 would trigger the
queue flush to accommodate possible outgoing bundling.

In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So, above JUNK will
not enter the state machine and instead be released and exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point being missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.

One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 26b87c78)
CVE-2014-3688
BugLink: http://bugs.launchpad.net/bugs/1386393Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

b77b8578

net: sctp: fix panic on duplicate ASCONF chunks · eec4c818

Daniel Borkmann authored Nov 11, 2014

commit 12b18fdbbda747ad22a7684413a6559b681e503c upstream

When receiving a e.g. semi-good formed connection scan in the
form of ...

-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
---------------- ASCONF_a; ASCONF_b ----------------->

... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!

The problem is that good-formed ASCONF chunks that we reply with
ASCONF_ACK chunks are cached per serial. Thus, when we receive a
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process them again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk instead. So far, so good.

Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():

While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd
changed the precedence, so that as long as we get bundled, incoming
chunks we try possible bundling on outgoing queue as well. Before
this commit, we would just flush the output queue.

Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork, and thus flush, thus
crashing the kernel.

Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.

Joint work with Vlad Yasevich.

Fixes: 2e3216cd ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b69040d8)
CVE-2014-3687
BugLink: http://bugs.launchpad.net/bugs/1386392Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

eec4c818

USB: whiteheat: Added bounds checking for bulk command response · 476b176f

James Forshaw authored Sep 17, 2014

commit c5fd4126151855330280ea9382684980afcfdd03 upstream

This patch fixes a potential security issue in the whiteheat USB driver
which might allow a local attacker to cause kernel memory corrpution. This
is due to an unchecked memcpy into a fixed size buffer (of 64 bytes). On
EHCI and XHCI busses it's possible to craft responses greater than 64
bytes leading a buffer overflow.
Signed-off-by: James Forshaw <forshaw@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(backported from commit 6817ae22)
CVE-2014-3185
BugLink: http://bugs.launchpad.net/bugs/1370036Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>

476b176f

ALSA: control: Fix replacing user controls · 0a592dac

Lars-Peter Clausen authored Jun 18, 2014

(commit 82262a46 upstream)

There are two issues with the current implementation for replacing user
controls. The first is that the code does not check if the control is actually a
user control and neither does it check if the control is owned by the process
that tries to remove it. That allows userspace applications to remove arbitrary
controls, which can cause a user after free if a for example a driver does not
expect a control to be removed from under its feed.

The second issue is that on one hand when a control is replaced the
user_ctl_count limit is not checked and on the other hand the user_ctl_count is
increased (even though the number of user controls does not change). This allows
userspace, once the user_ctl_count limit as been reached, to repeatedly replace
a control until user_ctl_count overflows. Once that happens new controls can be
added effectively bypassing the user_ctl_count limit.

Both issues can be fixed by instead of open-coding the removal of the control
that is to be replaced to use snd_ctl_remove_user_ctl(). This function does
proper permission checks as well as decrements user_ctl_count after the control
has been removed.

Note that by using snd_ctl_remove_user_ctl() the check which returns -EBUSY at
beginning of the function if the control already exists is removed. This is not
a problem though since the check is quite useless, because the lock that is
protecting the control list is released between the check and before adding the
new control to the list, which means that it is possible that a different
control with the same settings is added to the list after the check. Luckily
there is another check that is done while holding the lock in snd_ctl_add(), so
we'll rely on that to make sure that the same control is not added twice.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[wt: fixes CVE-2014-4654 & CVE-2014-4655]
Signed-off-by: Willy Tarreau <w@1wt.eu>

0a592dac

ALSA: control: Don't access controls outside of protected regions · c99bd4fc

Lars-Peter Clausen authored Jun 18, 2014

(commit fd9f26e4 upstream)

A control that is visible on the card->controls list can be freed at any time.
This means we must not access any of its memory while not holding the
controls_rw_lock. Otherwise we risk a use after free access.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[wt: fixes CVE-2014-4653]
Signed-off-by: Willy Tarreau <w@1wt.eu>

c99bd4fc

net/l2tp: don't fall back on UDP [get|set]sockopt · eb406993

Sasha Levin authored Jul 14, 2014

(commit 3cf521f7 upstream)

The l2tp [get|set]sockopt() code has fallen back to the UDP functions
for socket option levels != SOL_PPPOL2TP since day one, but that has
never actually worked, since the l2tp socket isn't an inet socket.

As David Miller points out:

  "If we wanted this to work, it'd have to look up the tunnel and then
   use tunnel->sk, but I wonder how useful that would be"

Since this can never have worked so nobody could possibly have depended
on that functionality, just remove the broken code and return -EINVAL.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: James Chapman <jchapman@katalix.com>
Acked-by: David Miller <davem@davemloft.net>
Cc: Phil Turnbull <phil.turnbull@oracle.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[geissert: adjust file paths and context for 2.6.32]
[wt: fixes CVE-2014-4943]
Signed-off-by: Willy Tarreau <w@1wt.eu>

eb406993

x86_64, traps: Rework bad_iret · 1f449561

Andy Lutomirski authored Nov 22, 2014

It's possible for iretq to userspace to fail.  This can happen because
of a bad CS, SS, or RIP.

Historically, we've handled it by fixing up an exception from iretq to
land at bad_iret, which pretends that the failed iret frame was really
the hardware part of #GP(0) from userspace.  To make this work, there's
an extra fixup to fudge the gs base into a usable state.

This is suboptimal because it loses the original exception.  It's also
buggy because there's no guarantee that we were on the kernel stack to
begin with.  For example, if the failing iret happened on return from an
NMI, then we'll end up executing general_protection on the NMI stack.
This is bad for several reasons, the most immediate of which is that
general_protection, as a non-paranoid idtentry, will try to deliver
signals and/or schedule from the wrong stack.

This patch throws out bad_iret entirely.  As a replacement, it augments
the existing swapgs fudge into a full-blown iret fixup, mostly written
in C.  It's should be clearer and more correct.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b645af2d)
[wt: notes for backport to 2.6.32:
  - _ASM_EXTABLE was open-coded.
  - removed unneeded CFI_ENDPROC
  - removed __visible (introduced in 2.6.37-rc1, not needed here)
/wt]
Signed-off-by: Willy Tarreau <w@1wt.eu>

1f449561

x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C · 70443551

Andy Lutomirski authored Nov 22, 2014

There's nothing special enough about the espfix64 double fault fixup to
justify writing it in assembly.  Move it to C.

This also fixes a bug: if the double fault came from an IST stack, the
old asm code would return to a partially uninitialized stack frame.

Fixes: 3891a04aSigned-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit af726f21)
[wt: backport notes for 2.6.32 :
  - Adaptations to entry_64.S in declaration of do_double_fault.
  - no exception_enter() in 2.6.32. Seems to be only for context tracking.
/wt]
Signed-off-by: Willy Tarreau <w@1wt.eu>

70443551

x86_64, traps: Stop using IST for #SS · 0e62192b

Andy Lutomirski authored Nov 22, 2014

On a 32-bit kernel, this has no effect, since there are no IST stacks.

On a 64-bit kernel, #SS can only happen in user code, on a failed iret
to user space, a canonical violation on access via RSP or RBP, or a
genuine stack segment violation in 32-bit kernel code.  The first two
cases don't need IST, and the latter two cases are unlikely fatal bugs,
and promoting them to double faults would be fine.

This fixes a bug in which the espfix64 code mishandles a stack segment
violation.

This saves 4k of memory per CPU and a tiny bit of code.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 6f442be2)
[wt: no CONFIG_TRACING on 2.6.32, Fixes CVE-2014-9090]
Signed-off-by: Willy Tarreau <w@1wt.eu>

0e62192b

x86/espfix/xen: Fix allocation of pages for paravirt page tables · f890caed

Boris Ostrovsky authored Jul 09, 2014

commit 8762e509 upstream.

init_espfix_ap() is currently off by one level when informing hypervisor
that allocated pages will be used for ministacks' page tables.

The most immediate effect of this on a PV guest is that if
'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
will refuse to use it for a page table (which it shouldn't be anyway). This will
result in warnings by both Xen and Linux.

More importantly, a subsequent write to that page (again, by a PV guest) is
likely to result in fatal page fault.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.comReviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from 3.2 commit 060e7f67)
Signed-off-by: Willy Tarreau <w@1wt.eu>

f890caed