1. 23 Jul, 2014 12 commits
    • Linus Torvalds's avatar
      Merge tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 355cb093
      Linus Torvalds authored
      Pull slab fix from Mike Snitzer:
       "This fixes the broken duplicate slab name check in
        kmem_cache_sanity_check() that has been repeatedly reported (as
        recently as today against Fedora rawhide).
      
        Pekka seemed to have it staged for a late 3.15-rc in his 'slab/urgent'
        branch but never sent a pull request, see:
            https://lkml.org/lkml/2014/5/23/648"
      
      * tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        slab_common: fix the check for duplicate slab names
      355cb093
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew Morton) · ed4a1084
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "10 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: hugetlb: fix copy_hugetlb_page_range()
        simple_xattr: permit 0-size extended attributes
        mm/fs: fix pessimization in hole-punching pagecache
        shmem: fix splicing from a hole while it's punched
        shmem: fix faulting into a hole, not taking i_mutex
        mm: do not call do_fault_around for non-linear fault
        sh: also try passing -m4-nofpu for SH2A builds
        zram: avoid lockdep splat by revalidate_disk
        mm/rmap.c: fix pgoff calculation to handle hugepage correctly
        coredump: fix the setting of PF_DUMPCORE
      ed4a1084
    • Naoya Horiguchi's avatar
      mm: hugetlb: fix copy_hugetlb_page_range() · 0253d634
      Naoya Horiguchi authored
      Commit 4a705fef ("hugetlb: fix copy_hugetlb_page_range() to handle
      migration/hwpoisoned entry") changed the order of
      huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
      in some workloads like hugepage-backed heap allocation via libhugetlbfs.
      This patch fixes it.
      
      The test program for the problem is shown below:
      
        $ cat heap.c
        #include <unistd.h>
        #include <stdlib.h>
        #include <string.h>
      
        #define HPS 0x200000
      
        int main() {
        	int i;
        	char *p = malloc(HPS);
        	memset(p, '1', HPS);
        	for (i = 0; i < 5; i++) {
        		if (!fork()) {
        			memset(p, '2', HPS);
        			p = malloc(HPS);
        			memset(p, '3', HPS);
        			free(p);
        			return 0;
        		}
        	}
        	sleep(1);
        	free(p);
        	return 0;
        }
      
        $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap
      
      Fixes 4a705fef ("hugetlb: fix copy_hugetlb_page_range() to handle
      migration/hwpoisoned entry"), so is applicable to -stable kernels which
      include it.
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Reported-by: default avatarGuillaume Morin <guillaume@morinfr.org>
      Suggested-by: default avatarGuillaume Morin <guillaume@morinfr.org>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>	[2.6.37+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0253d634
    • Hugh Dickins's avatar
      simple_xattr: permit 0-size extended attributes · 4e66d445
      Hugh Dickins authored
      If a filesystem uses simple_xattr to support user extended attributes,
      LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
      memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
      values of zero size.  Fix that off-by-one (but apparently no filesystem
      needs them yet).
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Jeff Layton <jlayton@poochiereds.net>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4e66d445
    • Hugh Dickins's avatar
      mm/fs: fix pessimization in hole-punching pagecache · 792ceaef
      Hugh Dickins authored
      I wanted to revert my v3.1 commit d0823576 ("mm: pincer in
      truncate_inode_pages_range"), to keep truncate_inode_pages_range() in
      synch with shmem_undo_range(); but have stepped back - a change to
      hole-punching in truncate_inode_pages_range() is a change to
      hole-punching in every filesystem (except tmpfs) that supports it.
      
      If there's a logical proof why no filesystem can depend for its own
      correctness on the pincer guarantee in truncate_inode_pages_range() - an
      instant when the entire hole is removed from pagecache - then let's
      revisit later.  But the evidence is that only tmpfs suffered from the
      livelock, and we have no intention of extending hole-punch to ramfs.  So
      for now just add a few comments (to match or differ from those in
      shmem_undo_range()), and fix one silliness noticed in d0823576...
      
      Its "index == start" addition to the hole-punch termination test was
      incomplete: it opened a way for the end condition to be missed, and the
      loop go on looking through the radix_tree, all the way to end of file.
      Fix that pessimization by resetting index when detected in inner loop.
      
      Note that it's actually hard to hit this case, without the obsessive
      concurrent faulting that trinity does: normally all pages are removed in
      the initial trylock_page() pass, and this loop finds nothing to do.  I
      had to "#if 0" out the initial pass to reproduce bug and test fix.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      792ceaef
    • Hugh Dickins's avatar
      shmem: fix splicing from a hole while it's punched · b1a36650
      Hugh Dickins authored
      shmem_fault() is the actual culprit in trinity's hole-punch starvation,
      and the most significant cause of such problems: since a page faulted is
      one that then appears page_mapped(), needing unmap_mapping_range() and
      i_mmap_mutex to be unmapped again.
      
      But it is not the only way in which a page can be brought into a hole in
      the radix_tree while that hole is being punched; and Vlastimil's testing
      implies that if enough other processors are busy filling in the hole,
      then shmem_undo_range() can be kept from completing indefinitely.
      
      shmem_file_splice_read() is the main other user of SGP_CACHE, which can
      instantiate shmem pagecache pages in the read-only case (without holding
      i_mutex, so perhaps concurrently with a hole-punch).  Probably it's
      silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
      ought to be safe, but might bring surprises - not a change to be rushed.
      
      shmem_read_mapping_page_gfp() is an internal interface used by
      drivers/gpu/drm GEM (and next by uprobes): it should be okay.  And
      shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
      called internally by the kernel (perhaps for a stacking filesystem,
      which might rely on holes to be reserved): it's unclear whether it could
      be provoked to keep hole-punch busy or not.
      
      We could apply the same umbrella as now used in shmem_fault() to
      shmem_file_splice_read() and the others; but it looks ugly, and use over
      a range raises questions - should it actually be per page? can these get
      starved themselves?
      
      The origin of this part of the problem is my v3.1 commit d0823576
      ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
      into shmem.c.  It seemed like a nice idea at the time, to ensure
      (barring RCU lookup fuzziness) that there's an instant when the entire
      hole is empty; but the indefinitely repeated scans to ensure that make
      it vulnerable.
      
      Revert that "enhancement" to hole-punch from shmem_undo_range(), but
      retain the unproblematic rescanning when it's truncating; add a couple
      of comments there.
      
      Remove the "indices[0] >= end" test: that is now handled satisfactorily
      by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
      to be worth avoiding here.
      
      But if we do not always loop indefinitely, we do need to handle the case
      of swap swizzled back to page before shmem_free_swap() gets it: add a
      retry for that case, as suggested by Konstantin Khlebnikov; and for the
      case of page swizzled back to swap, as suggested by Johannes Weiner.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Suggested-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.1+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b1a36650
    • Hugh Dickins's avatar
      shmem: fix faulting into a hole, not taking i_mutex · 8e205f77
      Hugh Dickins authored
      Commit f00cdc6d ("shmem: fix faulting into a hole while it's
      punched") was buggy: Sasha sent a lockdep report to remind us that
      grabbing i_mutex in the fault path is a no-no (write syscall may already
      hold i_mutex while faulting user buffer).
      
      We tried a completely different approach (see following patch) but that
      proved inadequate: good enough for a rational workload, but not good
      enough against trinity - which forks off so many mappings of the object
      that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
      into serious starvation when concurrent faults force the puncher to fall
      back to single-page unmap_mapping_range() searches of the i_mmap tree.
      
      So return to the original umbrella approach, but keep away from i_mutex
      this time.  We really don't want to bloat every shmem inode with a new
      mutex or completion, just to protect this unlikely case from trinity.
      So extend the original with wait_queue_head on stack at the hole-punch
      end, and wait_queue item on the stack at the fault end.
      
      This involves further use of i_lock to guard against the races: lockdep
      has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
      i_lock around wake_up_bit(), which is comparable to what we do here.
      i_lock is more convenient, but we could switch to shmem's info->lock.
      
      This issue has been tagged with CVE-2014-4171, which will require commit
      f00cdc6d and this and the following patch to be backported: we
      suggest to 3.1+, though in fact the trinity forkbomb effect might go
      back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
      not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
      Anyone running trinity on 3.0 and earlier? I don't think we need care.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.1+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8e205f77
    • Konstantin Khlebnikov's avatar
      mm: do not call do_fault_around for non-linear fault · c118678b
      Konstantin Khlebnikov authored
      Ingo Korb reported that "repeated mapping of the same file on tmpfs
      using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
      the process exits".
      
      He bisected the bug to d7c17551 ("mm: implement ->map_pages for
      shmem/tmpfs"), although the bug was actually added by commit
      8c6e50b0 ("mm: introduce vm_ops->map_pages()").
      
      The problem is caused by calling do_fault_around for a _non-linear_
      fault.  In this case pgoff is shifted and might become negative during
      calculation.
      
      Faulting around non-linear page-fault makes no sense and breaks the
      logic in do_fault_around because pgoff is shifted.
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: default avatarIngo Korb <ingo.korb@tu-dortmund.de>
      Tested-by: default avatarIngo Korb <ingo.korb@tu-dortmund.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Ning Qu <quning@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[3.15.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c118678b
    • Geert Uytterhoeven's avatar
      sh: also try passing -m4-nofpu for SH2A builds · b1923b55
      Geert Uytterhoeven authored
      When compiling a SH2A kernel (e.g.  se7206_defconfig or rsk7203_defconfig)
      using sh4-linux-gcc, linking fails with:
      
        net/built-in.o: In function `__sk_run_filter':
        net/core/filter.c:566: undefined reference to `__fpscr_values'
        net/core/filter.c:269: undefined reference to `__fpscr_values'
        ...
        net/built-in.o:net/core/filter.c:580: more undefined references to `__fpscr_values' follow
      
      This happens because sh4-linux-gcc doesn't support the "-m2a-nofpu",
      which is thus filtered out by "$(call cc-option, ...)".
      
      As compiling using sh4-linux-gcc is useful for compile coverage, also
      try passing "-m4-nofpu" (which is presumably filtered out when using a
      real sh2a-linux toolchain) to disable the generation of FPU instructions
      and references to __fpscr_values[].
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Tony Breeds <tony@bakeyournoodle.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b1923b55
    • Minchan Kim's avatar
      zram: avoid lockdep splat by revalidate_disk · b4c5c609
      Minchan Kim authored
      Sasha reported lockdep warning [1] introduced by [2].
      
      It could be fixed by doing disk revalidation out of the init_lock.  It's
      okay because disk capacity change is protected by init_lock so that
      revalidate_disk always sees up-to-date value so there is no race.
      
      [1] https://lkml.org/lkml/2014/7/3/735
      [2] zram: revalidate disk after capacity change
      
      Fixes 2e32baea ("zram: revalidate disk after capacity change").
      Signed-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: "Alexander E. Patrakov" <patrakov@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b4c5c609
    • Naoya Horiguchi's avatar
      mm/rmap.c: fix pgoff calculation to handle hugepage correctly · a0f7a756
      Naoya Horiguchi authored
      I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
      anonymous hugepage with mbind() in the kernel v3.16-rc3.  This is
      because pgoff's calculation in rmap_walk_anon() fails to consider
      compound_order() only to have an incorrect value.
      
      This patch introduces page_to_pgoff(), which gets the page's offset in
      PAGE_CACHE_SIZE.
      
      Kirill pointed out that page cache tree should natively handle
      hugepages, and in order to make hugetlbfs fit it, page->index of
      hugetlbfs page should be in PAGE_CACHE_SIZE.  This is beyond this patch,
      but page_to_pgoff() contains the point to be fixed in a single function.
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a0f7a756
    • Silesh C V's avatar
      coredump: fix the setting of PF_DUMPCORE · aed8adb7
      Silesh C V authored
      Commit 079148b9 ("coredump: factor out the setting of PF_DUMPCORE")
      cleaned up the setting of PF_DUMPCORE by removing it from all the
      linux_binfmt->core_dump() and moving it to zap_threads().But this ended
      up clearing all the previously set flags.  This causes issues during
      core generation when tsk->flags is checked again (eg.  for PF_USED_MATH
      to dump floating point registers).  Fix this.
      Signed-off-by: default avatarSilesh C V <svellattu@mvista.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: <stable@vger.kernel.org>	[3.10+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      aed8adb7
  2. 22 Jul, 2014 8 commits
    • Mike Snitzer's avatar
      Merge branch 'slab/urgent' of... · 45ccaf47
      Mike Snitzer authored
      Merge branch 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux into for-3.16-rcX
      45ccaf47
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 15ba2236
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Null termination fix in dns_resolver got the pointer dereferncing
          wrong, fix from Ben Hutchings.
      
       2) ip_options_compile() has a benign but real buffer overflow when
          parsing options.  From Eric Dumazet.
      
       3) Table updates can crash in netfilter's nftables if none of the state
          flags indicate an actual change, from Pablo Neira Ayuso.
      
       4) Fix race in nf_tables dumping, also from Pablo.
      
       5) GRE-GRO support broke the forwarding path because the segmentation
          state was not fully initialized in these paths, from Jerry Chu.
      
       6) sunvnet driver leaks objects and potentially crashes on module
          unload, from Sowmini Varadhan.
      
       7) We can accidently generate the same handle for several u32
          classifier filters, fix from Cong Wang.
      
       8) Several edge case bug fixes in fragment handling in xen-netback,
          from Zoltan Kiss.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
        ipv4: fix buffer overflow in ip_options_compile()
        batman-adv: fix TT VLAN inconsistency on VLAN re-add
        batman-adv: drop QinQ claim frames in bridge loop avoidance
        dns_resolver: Null-terminate the right string
        xen-netback: Fix pointer incrementation to avoid incorrect logging
        xen-netback: Fix releasing header slot on error path
        xen-netback: Fix releasing frag_list skbs in error path
        xen-netback: Fix handling frag_list on grant op error path
        net_sched: avoid generating same handle for u32 filters
        net: huawei_cdc_ncm: add "subclass 3" devices
        net: qmi_wwan: add two Sierra Wireless/Netgear devices
        wan/x25_asy: integer overflow in x25_asy_change_mtu()
        net: ppp: fix creating PPP pass and active filters
        net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
        sunvnet: clean up objects created in vnet_new() on vnet_exit()
        r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
        net-gre-gro: Fix a bug that breaks the forwarding path
        netfilter: nf_tables: 64bit stats need some extra synchronization
        netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
        netfilter: nf_tables: safe RCU iteration on list when dumping
        ...
      15ba2236
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 89faa06e
      Linus Torvalds authored
      Pull sparc fix from David Miller:
       "Need to hook up the new renameat2 system call"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Hook up renameat2 syscall.
      89faa06e
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide · 14867719
      Linus Torvalds authored
      Pull IDE fixes from David Miller:
       - fix interrupt registry for some Atari IDE chipsets.
       - adjust Kconfig dependencies for x86_32 specific chips.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
        ide: Fix SC1200 dependencies
        ide: Fix CS5520 and CS5530 dependencies
        m68k/atari - ide: do not register interrupt if host->get_lock is set
      14867719
    • Linus Torvalds's avatar
      Merge tag 'trace-fixes-v3.16-rc6' of... · 8dcc3be2
      Linus Torvalds authored
      Merge tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull trace fix from Steven Rostedt:
       "Tony Luck found that using the "uptime" trace clock that uses jiffies
        as a counter was converted to nanoseconds (silly), and after 1 hour 11
        minutes and 34 seconds, this monotonic clock would wrap, causing havoc
        with the tracing system and making the clock useless.
      
        He converted that clock to use jiffies_64 and made it into a counter
        instead of nanosecond conversions, and displayed the clock with the
        straight jiffy count, which works much better than it did in the past"
      
      * tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix wraparound problems in "uptime" trace clock
      8dcc3be2
    • David S. Miller's avatar
      sparc: Hook up renameat2 syscall. · 26053926
      David S. Miller authored
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      26053926
    • David S. Miller's avatar
      Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge · 850717ef
      David S. Miller authored
      Antonio Quartulli says:
      
      ====================
      pull request [net]: batman-adv 20140721
      
      here you have two fixes that we have been testing for quite some time
      (this is why they arrived a bit late in the rc cycle).
      
      Patch 1) ensures that BLA packets get dropped and not forwarded to the
      mesh even if they reach batman-adv within QinQ frames. Forwarding them
      into the mesh means messing up with the TT database of other nodes which
      can generate all kind of unexpected behaviours during route computation.
      
      Patch 2) avoids a couple of race conditions triggered upon fast VLAN
      deletion-addition. Such race conditions are pretty dangerous because
      they not only create inconsistencies in the TT database of the nodes
      in the network, but such scenario is also unrecoverable (unless
      nodes are rebooted).
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      850717ef
    • Eric Dumazet's avatar
      ipv4: fix buffer overflow in ip_options_compile() · 10ec9472
      Eric Dumazet authored
      There is a benign buffer overflow in ip_options_compile spotted by
      AddressSanitizer[1] :
      
      Its benign because we always can access one extra byte in skb->head
      (because header is followed by struct skb_shared_info), and in this case
      this byte is not even used.
      
      [28504.910798] ==================================================================
      [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
      [28504.913170] Read of size 1 by thread T15843:
      [28504.914026]  [<ffffffff81802f91>] ip_options_compile+0x121/0x9c0
      [28504.915394]  [<ffffffff81804a0d>] ip_options_get_from_user+0xad/0x120
      [28504.916843]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.918175]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.919490]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.920835]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.922208]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.923459]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.924722]
      [28504.925106] Allocated by thread T15843:
      [28504.925815]  [<ffffffff81804995>] ip_options_get_from_user+0x35/0x120
      [28504.926884]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.927975]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.929175]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.930400]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.931677]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.932851]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.934018]
      [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
      [28504.934377]  of 40-byte region [ffff880026382800, ffff880026382828)
      [28504.937144]
      [28504.937474] Memory state around the buggy address:
      [28504.938430]  ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.939884]  ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.941294]  ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.942504]  ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.943483]  ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.945573]                         ^
      [28504.946277]  ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.094949]  ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.096114]  ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.097116]  ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.098472]  ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.099804] Legend:
      [28505.100269]  f - 8 freed bytes
      [28505.100884]  r - 8 redzone bytes
      [28505.101649]  . - 8 allocated bytes
      [28505.102406]  x=1..7 - x allocated bytes + (8-x) redzone bytes
      [28505.103637] ==================================================================
      
      [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernelSigned-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10ec9472
  3. 21 Jul, 2014 20 commits