1. 09 Apr, 2015 28 commits
    • drm/radeon: do a posting read in r600_set_irq · 3030c151
      Alex Deucher authored
      commit 9d1393f2 upstream.
      
      To make sure the writes go through the pci bridge.
      
      bug:
      https://bugzilla.kernel.org/show_bug.cgi?id=90741

      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: do a posting read in r100_set_irq · a1a62b8f
      Alex Deucher authored
      commit f957063f upstream.
      
      To make sure the writes go through the pci bridge.
      
      bug:
      https://bugzilla.kernel.org/show_bug.cgi?id=90741

      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: do a posting read in evergreen_set_irq · 60a74c2e
      Alex Deucher authored
      commit c320bb5f upstream.
      
      To make sure the writes go through the pci bridge.
      
      bug:
      https://bugzilla.kernel.org/show_bug.cgi?id=90741

      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: fix DRM_IOCTL_RADEON_CS oops · 26c65937
      Tommi Rantala authored
      commit a28b2a47 upstream.
      
      Passing zeroed drm_radeon_cs struct to DRM_IOCTL_RADEON_CS produces the
      following oops.
      
      Fix by always calling INIT_LIST_HEAD() to avoid the crash in list_sort().
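The difference between a zeroed and an initialised list head can be sketched in userspace. This is a minimal illustration, not the driver code: `struct list_head` is re-declared locally, and `init_list_head()`, `list_is_empty()`, `struct parser` and `parser_init()` are hypothetical stand-ins for the kernel helpers and the CS parser state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head (simplified). */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list points at itself; a zeroed head has NULL links instead. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static int list_is_empty(const struct list_head *head)
{
	/* List walkers follow these links, so NULL here means a crash later. */
	assert(head->next != NULL);
	return head->next == head;
}

/* Hypothetical parser state holding one list. */
struct parser {
	struct list_head validated;
};

/* Initialise the list unconditionally, even when there is nothing to parse. */
static void parser_init(struct parser *p)
{
	memset(p, 0, sizeof(*p));
	init_list_head(&p->validated);
}
```

With the head initialised on every path, cleanup code that sorts or walks the list sees a valid empty list instead of NULL links.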
      
      ----------------------------------
      
       #include <stdint.h>
       #include <fcntl.h>
       #include <unistd.h>
       #include <sys/ioctl.h>
       #include <drm/radeon_drm.h>
      
       static const struct drm_radeon_cs cs;
      
       int main(int argc, char **argv)
       {
               return ioctl(open(argv[1], O_RDWR), DRM_IOCTL_RADEON_CS, &cs);
       }
      
      ----------------------------------
      
      [ttrantal@test2 ~]$ ./main /dev/dri/card0
      [   46.904650] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [   46.905022] IP: [<ffffffff814d6df2>] list_sort+0x42/0x240
      [   46.905022] PGD 68f29067 PUD 688b5067 PMD 0
      [   46.905022] Oops: 0002 [#1] SMP
      [   46.905022] CPU: 0 PID: 2413 Comm: main Not tainted 4.0.0-rc1+ #58
      [   46.905022] Hardware name: Hewlett-Packard HP Compaq dc5750 Small Form Factor/0A64h, BIOS 786E3 v02.10 01/25/2007
      [   46.905022] task: ffff880058e2bcc0 ti: ffff880058e64000 task.ti: ffff880058e64000
      [   46.905022] RIP: 0010:[<ffffffff814d6df2>]  [<ffffffff814d6df2>] list_sort+0x42/0x240
      [   46.905022] RSP: 0018:ffff880058e67998  EFLAGS: 00010246
      [   46.905022] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [   46.905022] RDX: ffffffff81644410 RSI: ffff880058e67b40 RDI: ffff880058e67a58
      [   46.905022] RBP: ffff880058e67a88 R08: 0000000000000000 R09: 0000000000000000
      [   46.905022] R10: ffff880058e2bcc0 R11: ffffffff828e6ca0 R12: ffffffff81644410
      [   46.905022] R13: ffff8800694b8018 R14: 0000000000000000 R15: ffff880058e679b0
      [   46.905022] FS:  00007fdc65a65700(0000) GS:ffff88006d600000(0000) knlGS:0000000000000000
      [   46.905022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   46.905022] CR2: 0000000000000000 CR3: 0000000058dd9000 CR4: 00000000000006f0
      [   46.905022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   46.905022] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
      [   46.905022] Stack:
      [   46.905022]  ffff880058e67b40 ffff880058e2bcc0 ffff880058e67a78 0000000000000000
      [   46.905022]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [   46.905022]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [   46.905022] Call Trace:
      [   46.905022]  [<ffffffff81644a65>] radeon_cs_parser_fini+0x195/0x220
      [   46.905022]  [<ffffffff81645069>] radeon_cs_ioctl+0xa9/0x960
      [   46.905022]  [<ffffffff815e1f7c>] drm_ioctl+0x19c/0x640
      [   46.905022]  [<ffffffff810f8fdd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [   46.905022]  [<ffffffff810f90ad>] ? trace_hardirqs_on+0xd/0x10
      [   46.905022]  [<ffffffff8160c066>] radeon_drm_ioctl+0x46/0x80
      [   46.905022]  [<ffffffff81211868>] do_vfs_ioctl+0x318/0x570
      [   46.905022]  [<ffffffff81462ef6>] ? selinux_file_ioctl+0x56/0x110
      [   46.905022]  [<ffffffff81211b41>] SyS_ioctl+0x81/0xa0
      [   46.905022]  [<ffffffff81dc6312>] system_call_fastpath+0x12/0x17
      [   46.905022] Code: 48 89 b5 10 ff ff ff 0f 84 03 01 00 00 4c 8d bd 28 ff ff
      ff 31 c0 48 89 fb b9 15 00 00 00 49 89 d4 4c 89 ff f3 48 ab 48 8b 46 08 <48> c7
      00 00 00 00 00 48 8b 0e 48 85 c9 0f 84 7d 00 00 00 c7 85
      [   46.905022] RIP  [<ffffffff814d6df2>] list_sort+0x42/0x240
      [   46.905022]  RSP <ffff880058e67998>
      [   46.905022] CR2: 0000000000000000
      [   47.149253] ---[ end trace 09576b4e8b2c20b8 ]---

      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • IB/core: Avoid leakage from kernel to user space · 71f1579b
      Eli Cohen authored
      commit 377b5134 upstream.
      
      Clear the reserved field of struct ib_uverbs_async_event_desc which is
      copied to user space.
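The general pattern behind this kind of fix can be sketched as follows. The struct below is a hypothetical descriptor with an illustrative layout, and `fill_event_desc()` is an invented helper; neither is taken from the uverbs code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event descriptor; the layout is illustrative only. */
struct event_desc {
	uint64_t element;
	uint32_t event_type;
	uint32_t reserved;	/* carries no meaning, but is still copied out */
};

/* Fill a descriptor destined for user space without leaking stale bytes. */
static void fill_event_desc(struct event_desc *d, uint64_t element,
			    uint32_t event_type)
{
	assert(d != NULL);
	memset(d, 0, sizeof(*d));	/* clears reserved and any padding */
	d->element = element;
	d->event_type = event_type;
}
```

Zeroing the whole object first is one common choice; assigning `reserved = 0` explicitly would also work when the struct has no padding.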

      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mm: thp: give transparent hugepage code a separate copy_page · 1ef840f5
      Dave Hansen authored
      commit 30b0a105 upstream.
      
      Right now, the migration code in migrate_page_copy() uses copy_huge_page()
      for hugetlbfs and thp pages:
      
             if (PageHuge(page) || PageTransHuge(page))
                      copy_huge_page(newpage, page);
      
      So, yay for code reuse.  But:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
      
      and a non-hugetlbfs page has no page_hstate().  This works 99% of the
      time because page_hstate() determines the hstate from the page order
      alone.  Since the page order of a THP page matches the default hugetlbfs
      page order, it works.
      
      But, if you change the default huge page size on the boot command-line
      (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
      so page_hstate() returns null and copy_huge_page() oopses pretty fast
      since copy_huge_page() dereferences the hstate:
      
        void copy_huge_page(struct page *dst, struct page *src)
        {
              struct hstate *h = page_hstate(src);
              if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
        ...
      
      Mel noticed that the migration code is really the only user of these
      functions.  This moves all the copy code over to migrate.c and makes
      copy_huge_page() work for THP by checking for it explicitly.
      
      I believe the bug was introduced in commit b32967ff ("mm: numa: Add
      THP migration for the NUMA working set scanning fault case")
      
      [akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi]
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mm, hugetlb: define page_hstate for !HUGETLB_PAGE · 04240adf
      Jiri Slaby authored
      This is a single hunk introduced later in the upstream commit
      cb900f41 (mm, hugetlb: convert
      hugetlbfs to use split pmd lock). We need page_hstate even for
      !HUGETLB_PAGE case for the next patch (mm: thp: give transparent
      hugepage code a separate copy_page).

      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • include/linux/hugetlb.h: make isolate_huge_page() an inline · a5ff308f
      Naoya Horiguchi authored
      commit f40386a4 upstream.
      
      With CONFIG_HUGETLBFS=n:
      
        mm/migrate.c: In function `do_move_page_to_node_array':
        include/linux/hugetlb.h:140:33: warning: statement with no effect [-Wunused-value]
         #define isolate_huge_page(p, l) false
                                         ^
        mm/migrate.c:1170:4: note: in expansion of macro `isolate_huge_page'
            isolate_huge_page(page, &pagelist);

      Reported-by: Borislav Petkov <bp@alien8.de>
      Tested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/mpc85xx: Add ranges to etsec2 nodes · 429c93f0
      Scott Wood authored
      commit bb344ca5 upstream.
      
      Commit 746c9e9f "of/base: Fix PowerPC address parsing hack" limited
      the applicability of the workaround whereby a missing ranges is treated
      as an empty ranges.  This workaround was hiding a bug in the etsec2
      device tree nodes, which have children with reg, but did not have
      ranges.

      Signed-off-by: Scott Wood <scottwood@freescale.com>
      Reported-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • HID: add ALWAYS_POLL quirk for a Logitech 0xc007 · aa838de9
      oliver@neukum.org authored
      commit a4154577 upstream.
      
      This device disconnects every 60s without X

      Signed-off-by: Oliver Neukum <oliver@neukum.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc64: Fix several bugs in memmove(). · 97f5ebbc
      David S. Miller authored
      [ Upstream commit 2077cef4 ]
      
      Firstly, handle zero length calls properly.  Believe it or not there
      are a few of these happening during early boot.
      
      Next, we can't just drop to a memcpy() call in the forward copy case
      where dst <= src.  The reason is that the cache initializing stores
      used in the Niagara memcpy() implementations can end up clearing out
      cache lines before we've sourced their original contents completely.
      
      For example, considering NG4memcpy, the main unrolled loop begins like
      this:
      
           load   src + 0x00
           load   src + 0x08
           load   src + 0x10
           load   src + 0x18
           load   src + 0x20
           store  dst + 0x00
      
      Assume dst is 64 byte aligned and let's say that dst is src - 8 for
      this memcpy() call.  That store at the end there is the one to the
      first line in the cache line, thus clearing the whole line, which thus
      clobbers "src + 0x28" before it even gets loaded.
      
      To avoid this, just fall through to a simple copy only mildly
      optimized for the case where src and dst are 8 byte aligned and the
      length is a multiple of 8 as well.  We could get fancy and call
      GENmemcpy() but this is good enough for how this thing is actually
      used.
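The fallback described above can be sketched in portable C. This is an assumption-laden illustration, not the sparc64 assembly: `forward_copy()` is a hypothetical name, and the sketch only covers the forward case where dst <= src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Forward copy for the overlapping case dst <= src.  Every unit is loaded
 * before the store that could overwrite it, so no source byte is lost.
 */
static void *forward_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert((uintptr_t)d <= (uintptr_t)s);

	if (len == 0)	/* zero-length calls are valid and copy nothing */
		return dst;

	if ((((uintptr_t)d | (uintptr_t)s | len) & 7) == 0) {
		/* Mildly optimised path: pointers and length are multiples of 8. */
		for (; len; len -= 8, d += 8, s += 8) {
			uint64_t word;

			memcpy(&word, s, sizeof(word));	/* load first */
			memcpy(d, &word, sizeof(word));	/* then store */
		}
	} else {
		while (len--)
			*d++ = *s++;
	}
	return dst;
}
```

The per-word `memcpy()` calls go through a local temporary, so they never see overlapping arguments even when the outer buffers overlap.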

      Reported-by: David Ahern <david.ahern@oracle.com>
      Reported-by: Bob Picco <bpicco@meloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc: Touch NMI watchdog when walking cpus and calling printk · 14b7dc59
      David Ahern authored
      [ Upstream commit 31aaa98c ]
      
      With the increase in the number of CPUs, calls to functions that dump
      output to the console (e.g., arch_trigger_all_cpu_backtrace) can take
      a long time to complete. If IRQs are disabled, the NMI watchdog
      eventually kicks in and creates more havoc. Avoid this by telling the
      NMI watchdog everything is ok.

      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc: perf: Make counting mode actually work · 6b74b2d8
      David Ahern authored
      [ Upstream commit d51291cb ]
      
      Currently perf-stat (aka, counting mode) does not work:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.585665      task-clock (msec)         #    0.580 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.054 M/sec
         <not supported>      cycles
         <not supported>      stalled-cycles-frontend
         <not supported>      stalled-cycles-backend
         <not supported>      instructions
         <not supported>      branches
         <not supported>      branch-misses
      
             0.002735100 seconds time elapsed
      
      The reason is that state is never reset (stays with PERF_HES_UPTODATE set).
      Add a call to sparc_pmu_enable_event during the added_event handling.
      Clean up the encoding since pmu_start calls sparc_pmu_enable_event which
      does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call
      to sparc_perf_event_set_period can be removed as well.
      
      With this patch:
      
      $ perf stat ls
      ...
       Performance counter stats for 'ls':
      
                1.552890      task-clock (msec)         #    0.552 CPUs utilized
                      24      context-switches          #    0.015 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      86      page-faults               #    0.055 M/sec
               5,748,997      cycles                    #    3.702 GHz
         <not supported>      stalled-cycles-frontend:HG
         <not supported>      stalled-cycles-backend:HG
               1,684,362      instructions:HG           #    0.29  insns per cycle
                 295,133      branches:HG               #  190.054 M/sec
                  28,007      branch-misses:HG          #    9.49% of all branches
      
             0.002815665 seconds time elapsed

      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Acked-by: Bob Picco <bob.picco@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc: perf: Remove redundant perf_pmu_{en|dis}able calls · 18d8d2dc
      David Ahern authored
      [ Upstream commit 5b0d4b55 ]
      
      perf_pmu_disable is called by core perf code before pmu->del and the
      enable function is called by core perf code afterwards. No need to
      call again within sparc_pmu_del.
      
      Ditto for pmu->add and sparc_pmu_add.

      Signed-off-by: David Ahern <david.ahern@oracle.com>
      Acked-by: Bob Picco <bob.picco@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc: semtimedop() unreachable due to comparison error · 028ba5ba
      Rob Gardner authored
      [ Upstream commit 53eb2516 ]
      
      A bug was reported that the semtimedop() system call was always
      failing with ENOSYS.
      
      Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
      the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
      from getting through to the semaphore ops switch statement.
      
      This is corrected by changing the comparison to "call <= SEMTIMEDOP".
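The effect of the bound can be sketched with a small dispatcher. Only the two constants come from the text above; `dispatch_sem()` and its `upper_bound` parameter are hypothetical and exist only to compare the old and new comparisons side by side.

```c
#include <errno.h>

/* Values quoted in the commit message. */
#define SEMCTL      3
#define SEMTIMEDOP  4

/*
 * Returns 0 if the call number reaches the semaphore switch statement,
 * -ENOSYS otherwise.  upper_bound is the right-hand side of the guard.
 */
static int dispatch_sem(int call, int upper_bound)
{
	if (call <= upper_bound) {
		switch (call) {
		case SEMCTL:
		case SEMTIMEDOP:
			return 0;
		default:
			return -ENOSYS;
		}
	}
	return -ENOSYS;
}
```

With `upper_bound` set to `SEMCTL`, a `SEMTIMEDOP` call never enters the switch; with `SEMTIMEDOP` as the bound it does.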
      
      Orabug: 20633375
      Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • sparc32: destroy_context() and switch_mm() needs to disable interrupts. · 7cd15d5e
      Andreas Larsson authored
      [ Upstream commit 66d0f7ec ]
      
      Load balancing can be triggered in the critical sections protected by
      srmmu_context_spinlock in destroy_context() and switch_mm() and can hang
      the cpu waiting for the rq lock of another cpu that in turn has called
      switch_mm() and is hanging on srmmu_context_spinlock, leading to deadlock.
      
      So, disable interrupts while taking srmmu_context_spinlock in
      destroy_context() and switch_mm() so we don't deadlock.
      
      See also commit 77b838fa ("[SPARC64]: destroy_context() needs to disable
      interrupts.")

      Signed-off-by: Andreas Larsson <andreas@gaisler.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • tcp: make connect() mem charging friendly · d06381e8
      Eric Dumazet authored
      [ Upstream commit 355a901e ]
      
      While working on sk_forward_alloc problems reported by Denys
      Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
      sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
      sk_forward_alloc is negative while connect is in progress.
      
      We can fix this by calling regular sk_stream_alloc_skb() both for the
      SYN packet (in tcp_connect()) and the syn_data packet in
      tcp_send_syn_data()
      
      Then, tcp_send_syn_data() can avoid copying syn_data as we simply
      can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
      
      Instead of open coding memcpy_fromiovecend(), simply use this helper.
      
      This leaves clean fast clone skbs in the socket write queue.
      
      This was tested against our fastopen packetdrill tests.

      Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour · ee1887e4
      Catalin Marinas authored
      [ Upstream commit 91edd096 ]
      
      Commit db31c55a (net: clamp ->msg_namelen instead of returning an
      error) introduced the clamping of msg_namelen when the unsigned value
      was larger than sizeof(struct sockaddr_storage). This caused a
      msg_namelen of -1 to be valid. The native code was subsequently fixed by
      commit dbb490b9 (net: socket: error on a negative msg_namelen).
      
      In addition, the native code sets msg_namelen to 0 when msg_name is
      NULL. This was done in commit 6a2a2b3a (net:socket: set msg_namelen
      to 0 if msg_name is passed as NULL in msghdr struct from userland) and
      subsequently updated by 08adb7da (fold verify_iovec() into
      copy_msghdr_from_user()).
      
      This patch brings the get_compat_msghdr() in line with
      copy_msghdr_from_user().
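The three rules described above (reject a negative length, force the length to 0 for a NULL name, clamp oversized lengths) can be sketched as one helper. `sanitize_namelen()` is a hypothetical name, and returning -EINVAL for the negative case is an assumption of this sketch rather than something stated in the text.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns the name length to use, or a negative
 * errno value when the supplied length is invalid.
 */
static int sanitize_namelen(const void *name, int namelen)
{
	if (namelen < 0)
		return -EINVAL;	/* assumed error code for a negative length */
	if (name == NULL)
		return 0;	/* no name buffer: the length is forced to 0 */
	if ((size_t)namelen > sizeof(struct sockaddr_storage))
		return (int)sizeof(struct sockaddr_storage);	/* clamp */
	return namelen;
}
```

Checking the sign before clamping matters: comparing -1 as an unsigned value against the storage size would otherwise clamp it and let it through as valid.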
      
      Fixes: db31c55a (net: clamp ->msg_namelen instead of returning an error)
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • tcp: fix tcp fin memory accounting · e5354126
      Josh Hunt authored
      [ Upstream commit d22e1537 ]
      
      tcp_send_fin() does not account for the memory it allocates properly, so
      sk_forward_alloc can be negative in cases where we've sent a FIN:
      
      ss example output (ss -amn | grep -B1 f4294):
      tcp    FIN-WAIT-1 0      1            192.168.0.1:45520         192.0.2.1:8080
      	skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0)
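The large `f` value in the ss output is a negative number printed as unsigned. A small sketch of the conversion, assuming the field is a 32-bit quantity (`as_signed32()` is a hypothetical helper):

```c
#include <stdint.h>

/* Reinterpret a 32-bit value printed as unsigned as a signed quantity. */
static int64_t as_signed32(uint32_t printed)
{
	/* Values at or above 2^31 are negative in two's complement. */
	return printed >= UINT32_C(0x80000000)
		? (int64_t)printed - (INT64_C(1) << 32)
		: (int64_t)printed;
}
```

Applied to the sample above, 4294966016 becomes -1280, which has the same magnitude as the `w1280` field on that line.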

      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ipv6: fix backtracking for throw routes · 7d5a6af3
      Steven Barth authored
      [ Upstream commit 73ba57bf ]
      
      For throw routes to trigger evaluation of other policy rules,
      EAGAIN needs to be propagated up to fib_rules_lookup,
      similar to how it's done for IPv4.
      
      A simple testcase for verification is:
      
      ip -6 rule add lookup 33333 priority 33333
      ip -6 route add throw 2001:db8::1
      ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
      ip route get 2001:db8::1

      Signed-off-by: Steven Barth <cyrus@openwrt.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Revert "net: cx82310_eth: use common match macro" · 9e3bce09
      Ondrej Zary authored
      [ Upstream commit 8d006e01 ]
      
      This reverts commit 11ad714b because
      it breaks cx82310_eth.
      
      The custom USB_DEVICE_CLASS macro matches
      bDeviceClass, bDeviceSubClass and bDeviceProtocol
      but the common USB_DEVICE_AND_INTERFACE_INFO matches
      bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
      not specified.

      Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() · 9b79f2c1
      Al Viro authored
      [ Upstream commit 7d985ed1 ]
      
      [I would really like an ACK on that one from dhowells; it appears to be
      quite straightforward, but...]
      
      MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as a matter of
      fact, neither the kernel users of rxrpc nor the syscalls ever set that bit
      in there.  It gets passed via flags; in fact, another such check in the same
      function is done correctly - as flags & MSG_PEEK.
      
      It had been that way (effectively disabled) for 8 years, though, so the patch
      needs beating up - that case had never been tested.  If it is correct, it's
      -stable fodder.

      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • caif: fix MSG_OOB test in caif_seqpkt_recvmsg() · e0704a3c
      Al Viro authored
      [ Upstream commit 3eeff778 ]
      
      It should be checking flags, not msg->msg_flags.  It's ->sendmsg()
      instances that need to look for that in ->msg_flags, ->recvmsg() ones
      (including the other ->recvmsg() instance in that file, as well as
      unix_dgram_recvmsg() this one claims to be imitating) check in flags.
      Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
      in receive") back in 2010, so it goes quite a while back.
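The distinction between the two flag sources can be sketched in userspace. `struct demo_msg` and both functions are hypothetical stand-ins for a receive handler, written only to contrast the correct test with the buggy one.

```c
#include <sys/socket.h>

/* Hypothetical stand-in for the message header passed to a receive handler. */
struct demo_msg {
	int msg_flags;	/* per the text above, not where receive requests arrive */
};

/* Correct test: the caller's request flags arrive in the flags argument. */
static int recv_sees_oob(const struct demo_msg *msg, int flags)
{
	(void)msg;
	return (flags & MSG_OOB) != 0;
}

/* Buggy variant: looks at msg->msg_flags, which the caller never set. */
static int recv_sees_oob_buggy(const struct demo_msg *msg, int flags)
{
	(void)flags;
	return (msg->msg_flags & MSG_OOB) != 0;
}
```

When the caller requests `MSG_OOB` through `flags`, only the first function notices; the buggy variant silently treats the call as an ordinary receive.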

      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • inet_diag: fix possible overflow in inet_diag_dump_one_icsk() · da65a31a
      Eric Dumazet authored
      [ Upstream commit c8e2c80d ]
      
      inet_diag_dump_one_icsk() allocates an skb that is too small.
      
      Add inet_sk_attr_size() helper right before inet_sk_diag_fill()
      so that it can be updated if/when new attributes are added.
      
      iproute2/ss currently does not use this dump_one() interface,
      which might explain why nobody noticed this problem yet.

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • rds: avoid potential stack overflow · 4ac02a11
      Arnd Bergmann authored
      [ Upstream commit f862e07c ]
      
      The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
      on the stack in order to pass a pair of addresses. This happens to just
      fit within the 1024 byte stack size warning limit on x86, but just
      exceeds that limit on ARM, which gives us this warning:
      
      net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      As the use of this large variable is basically bogus, we can rearrange
      the code to not do that. Instead of passing an rds socket into
      rds_iw_get_device, we now just pass the two addresses that we have
      available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
      to create two address structures on the stack there.

      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • net: sysctl_net_core: check SNDBUF and RCVBUF for min length · ed56585d
      Alexey Kodanev authored
      [ Upstream commit b1cb59cf ]
      
      sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
      set to incorrect values. Given that 'struct sk_buff' allocates from
      rcvbuf, an incorrectly set buffer length could result in memory
      allocation failures. For example, set them as follows:
      
          # sysctl net.core.rmem_default=64
            net.core.rmem_default = 64
          # sysctl net.core.wmem_default=64
            net.core.wmem_default = 64
          # ping localhost -s 1024 -i 0 > /dev/null
      
      This could result in the following failure:
      
      skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
      head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
      kernel BUG at net/core/skbuff.c:102!
      invalid opcode: 0000 [#1] SMP
      ...
      task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
      RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
      RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
      RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
      RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
      FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
      Stack:
       ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
       ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
       ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
      Call Trace:
       [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
       [<ffffffff81556f56>] sock_write_iter+0x146/0x160
       [<ffffffff811d9612>] new_sync_write+0x92/0xd0
       [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
       [<ffffffff811da499>] SyS_write+0x59/0xd0
       [<ffffffff81651532>] system_call_fastpath+0x12/0x17
      Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
            00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
            eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
      RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
      RSP <ffff88003ae8bc68>
      Kernel panic - not syncing: Fatal exception
      
      Moreover, the possible minimum is 1, so we can get another kernel panic:
      ...
      BUG: unable to handle kernel paging request at ffff88013caee5c0
      IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
      ...

      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ALSA: hda - Fix regression of HD-audio controller fallback modes · 901077ae
      Takashi Iwai authored
      commit a1f3f1ca upstream.
      
      The commit [63e51fd7: ALSA: hda - Don't take unresponsive D3
      transition too serious] introduced a conditional fallback behavior to
      the HD-audio controller depending on the flag set.  However, it
      introduced a silly bug, too, that the flag was evaluated in a reverse
      way.  This resulted in a regression of HD-audio controller driver
      where it can't go to the fallback mode at communication errors.
      
      Unfortunately (or fortunately?) this didn't come up until recently
      because the affected code path is error handling that happens only
      on an unstable hardware chip.  Most recent chips work stably, thus
      they didn't hit this problem.  Now we've got a regression report with
      a VIA chip, which indeed seems to require the fallback to the
      polling mode, and finally the bug was revealed.
      
      The fix is a one-liner to remove the wrong logical NOT in the check.
      (Lesson learned - be careful about double negation.)
      
      The fix should be backported to stable, but the patch won't be
      applicable to 3.13 or earlier because of the code splits.  The stable
      fix patches for earlier kernels will be posted later manually.
      
      [... and this is the manual patch -- tiwai]
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94021
      Fixes: 63e51fd7 ('ALSA: hda - Don't take unresponsive D3 transition too serious')
      Cc: <stable@vger.kernel.org> # v3.11-3.13
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mm/hugetlb: fix getting refcount 0 page in hugetlb_fault() · c5d351e1
      Naoya Horiguchi authored
      commit 0f792cf9 upstream.
      
      When running the test which causes the race as shown in the previous patch,
      we can hit the BUG "get_page() on refcount 0 page" in hugetlb_fault().
      
      This race happens when the pte turns into a migration entry just after the
      first check of is_hugetlb_entry_migration() in hugetlb_fault() returned false.
      To fix this, we need to check pte_present() again after huge_ptep_get().
      
      This patch also reorders taking ptl and doing pte_page(), because
      pte_page() should be done under ptl.  Due to this reordering, we need to use
      trylock_page() in the page != pagecache_page case to respect locking order.
      
      Fixes: 66aebce7 ("hugetlb: fix race condition in hugetlb_fault()")
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: <stable@vger.kernel.org>	[3.2+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz> [backport to 3.12]
  2. 16 Mar, 2015 12 commits
    • Linux 3.12.39 · 0b8aecf1
      Jiri Slaby authored
    • clk-gate: fix bit # check in clk_register_gate() · 13538bca
      Sergei Shtylyov authored
      commit 2e9dcdae upstream.
      
      If the CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit
      index should be no higher than 15; however, the corresponding check is
      off by one.
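      
      The bound follows from the hiword-mask register layout, in which the gate
      bits occupy the lower 16 bits and the upper 16 bits act as a write-enable
      mask. The following is a minimal user-space sketch of that constraint, not
      the kernel code itself: GATE_HIWORD_MASK, gate_bit_valid() and
      hiword_write_value() are hypothetical names, and the 31-bit limit for
      ordinary gates is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's CLK_GATE_HIWORD_MASK flag. */
#define GATE_HIWORD_MASK 0x1u

/* Returns true when bit_idx is usable for a gate with the given flags. */
bool gate_bit_valid(unsigned int flags, unsigned int bit_idx)
{
    if (flags & GATE_HIWORD_MASK)
        return bit_idx <= 15;   /* bits 16..31 hold the write-enable mask */
    return bit_idx <= 31;       /* assumption: plain 32-bit gate register */
}

/* Value written to a hiword-mask register to set or clear one gate bit. */
uint32_t hiword_write_value(unsigned int bit_idx, bool set)
{
    /* The matching bit in the upper half enables the write for this gate. */
    uint32_t val = UINT32_C(1) << (bit_idx + 16);

    if (set)
        val |= UINT32_C(1) << bit_idx;
    return val;
}
```

      With an index of 16, the mask bit would have to sit at position 32, which
      does not exist in a 32-bit register; that is why 15 is the highest
      acceptable index.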
      
      Fixes: 04577994 ("clk: gate: add CLK_GATE_HIWORD_MASK")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Michael Turquette <mturquette@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ath5k: fix spontaneus AR5312 freezes · cb3f9ca3
      Sergey Ryazanov authored
      commit 8bfae4f9 upstream.
      
      Sometimes, while the CPU is under load and ath5k is resetting the
      wireless interface, the whole WiSoC freezes completely. A set of tests
      shows that using an atomic delay function while waiting for the
      interface reset helps to avoid such freezes.
      
      The easiest way to reproduce this issue: create a station interface,
      start a continuous scan with wpa_supplicant and load the CPU with
      something. Or just create multiple station interfaces and put them all
      in continuous scan.
      
      This patch partially reverts commit 1846ac3d ("ath5k: Use
      usleep_range where possible"), which replaced the initial udelay()
      with usleep_range().
      
      I do not know the actual source of this issue, but it looks like the HW
      freeze is caused by a transaction on the internal SoC bus while the
      wireless block is in the reset state.
      
      I should also note that I do not know how many chips are affected, but I
      did not see this issue with chips other than AR5312.
      
      CC: Jiri Slaby <jirislaby@gmail.com>
      CC: Nick Kossifidis <mickflemm@gmail.com>
      CC: Luis R. Rodriguez <mcgrof@do-not-panic.com>
      Fixes: 1846ac3d ("ath5k: Use usleep_range where possible")
      Reported-by: Christophe Prevotaux <c.prevotaux@rural-networks.com>
      Tested-by: Christophe Prevotaux <c.prevotaux@rural-networks.com>
      Tested-by: Eric Bree <ebree@nltinc.com>
      Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • NFSv4: Don't call put_rpccred() under the rcu_read_lock() · 168d2e5e
      Trond Myklebust authored
      commit 7c0af9ff upstream.
      
      put_rpccred() can sleep.
      
      Fixes: 8f649c37 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ACPI / video: Load the module even if ACPI is disabled · fec983fe
      Chris Wilson authored
      commit 6e17cb12 upstream.
      
      i915.ko depends upon the acpi/video.ko module and so refuses to load if
      ACPI is disabled at runtime, for example when the BIOS is broken beyond
      repair. acpi/video provides an optional service for i915.ko, so we
      should just allow the modules to load but do nothing, in order to let
      the machines boot correctly.
      
      Reported-by: Bill Augur <bill-auger@programmer.net>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Acked-by: Aaron Lu <aaron.lu@intel.com>
      [ rjw: Fixed up the new comment in acpi_video_init() ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: fix 1 RB harvest config setup for TN/RL · 8245bceb
      Alex Deucher authored
      commit dbfb00c3 upstream.
      
      The logic was reversed from what the hw actually exposed.
      Fixes graphics corruption in certain harvest configurations.
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh · 84f9532f
      Alex Deucher authored
      commit 3d2d98ee upstream.
      
      Just in case it hasn't been calculated for the mode.
      
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • HID: fixup the conflicting keyboard mappings quirk · 36530280
      Jiri Kosina authored
      commit 8e7b3410 upstream.
      
      The ignore check that got added in 6ce901eb ("HID: input: fix confusion
      on conflicting mappings") needs to properly check for VARIABLE reports
      as well (ARRAY reports should be ignored), otherwise legitimate keyboards
      might break.
      
      Fixes: 6ce901eb ("HID: input: fix confusion on conflicting mappings")
      Reported-by: Fredrik Hallenberg <megahallon@gmail.com>
      Reported-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • HID: input: fix confusion on conflicting mappings · 955c3838
      David Herrmann authored
      commit 6ce901eb upstream.
      
      On a PC-101/103/104 keyboard (American layout) the 'Enter' key and its
      neighbours look like this:
      
                 +---+ +---+ +-------+
                 | 1 | | 2 | |   5   |
                 +---+ +---+ +-------+
                   +---+ +-----------+
                   | 3 | |     4     |
                   +---+ +-----------+
      
      On a PC-102/105 keyboard (European layout) it looks like this:
      
                 +---+ +---+ +-------+
                 | 1 | | 2 | |       |
                 +---+ +---+ +-+  4  |
                   +---+ +---+ |     |
                   | 3 | | 5 | |     |
                   +---+ +---+ +-----+
      
      (Note that the number of keys is the same, but key '5' is moved down and
       the shape of key '4' is changed. Keys '1' to '3' are exactly the same.)
      
      The keys 1-4 report the same scan-code in HID in both layouts, even though
      the keysym they produce is usually different depending on the XKB-keymap
      used by user-space.
      However, key '5' (US 'backslash'/'pipe') reports 0x31 for the upper layout
      and 0x32 for the lower layout, as defined by the HID spec. This is highly
      confusing as the linux-input API uses a single keycode for both.
      
      So far, this was never a problem as there never has been a keyboard with
      both of those keys present at the same time. It would have to look
      something like this:
      
                 +---+ +---+ +-------+
                 | 1 | | 2 | |  x31  |
                 +---+ +---+ +-------+
                   +---+ +---+ +-----+
                   | 3 | |x32| |  4  |
                   +---+ +---+ +-----+
      
      HID can represent such a keyboard, but the linux-input API cannot.
      Furthermore, any user-space mapping would be confused by this and,
      luckily, no-one ever produced such hardware.
      
      Now, the HID input layer fixed this mess by mapping both 0x31 and 0x32 to
      the same keycode (KEY_BACKSLASH==0x2b). As only one of the two physical
      keys is present on any given piece of hardware, this works just fine.
      
      Let's introduce hardware-vendors into this:
      -------------------------------------------
      
      Unfortunately, it seems way too expensive to produce a different device for
      American and European layouts. Therefore, hardware-vendors put both keys
      (0x31 and 0x32) on the same keyboard, but only one of them is hooked up
      to the physical button, the other one is 'dead'.
      This means, they can use the same hardware, with a different button-layout
      and automatically produce the correct HID events for American *and*
      European layouts. This is unproblematic for normal keyboards, as the
      'dead' key will never report any KEY-DOWN events. But RollOver keyboards
      send the whole matrix on each key-event, allowing n-key roll-over mode.
      This means, we get a 0x31 and 0x32 event on each key-press. One of them
      will always be 0, the other reports the real state. As we map both to the
      same keycode, we will get spurious key-events, even though the real
      key-state never changed.
      
      The easiest way would be to blacklist 'dead' keys and never handle those.
      We could simply read the 'country' tag of USB devices and blacklist either
      key according to the layout. But... hardware vendors... want the same
      device for all countries and thus many of them set 'country' to 0 for all
      devices. Meh..
      
      So we have to deal with this properly. As we cannot know which of the keys
      is 'dead', we either need a heuristic and track those keys, or we simply
      make use of our value-tracking for HID fields. We simply ignore HID events
      for absolute data if the data didn't change. As HID tracks events on the
      HID level, we haven't done the keycode translation yet. Therefore, the
      'dead' key is tracked independently of the real key, and any events
      on it will be ignored.
      
      This patch simply discards any HID events for absolute data if it didn't
      change compared to the last report. We need to ignore relative and
      buffered-byte reports for obvious reasons. But those cannot be affected by
      this bug, so we're fine.
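      
      The filtering idea can be sketched in user space as follows. This is only
      an illustration under stated assumptions, not the hid-input code: struct
      hid_state, hid_filter_event() and usage_to_keycode() are hypothetical
      names, and values are tracked per HID usage before any keycode
      translation, as the text above describes.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_USAGES 256

/* Hypothetical per-device state: last reported value of each HID usage. */
struct hid_state {
    int32_t last[NUM_USAGES];
};

/* Both usages translate to one keycode (0x2b); 0 means unmapped here. */
unsigned int usage_to_keycode(uint8_t usage)
{
    return (usage == 0x31 || usage == 0x32) ? 0x2b : 0;
}

/*
 * Returns true if the event should be passed on to the input layer.
 * An absolute value that did not change since the last report is dropped,
 * so the 'dead' usage (always 0) never produces a key event.
 */
bool hid_filter_event(struct hid_state *st, uint8_t usage, int32_t value)
{
    if (st->last[usage] == value)
        return false;           /* unchanged: ignore */
    st->last[usage] = value;    /* remember the new state */
    return true;
}
```

      With a rollover keyboard that reports 0x31 and 0x32 on every key-event,
      only the usage whose value actually changed gets through, so the shared
      keycode no longer sees spurious releases from the 'dead' one.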
      
      Preferably, we'd do this filtering on the HID-core level. But this might
      break a lot of custom drivers, if they do not follow the HID specs.
      Therefore, we do this late in hid-input just before we inject it into the
      input layer (which does the exact same filtering, but on the keycode
      level).
      
      If this turns out to break some devices, we might have to limit filtering
      to EV_KEY events. But let's try to do the Right Thing first, and properly
      filter any absolute data that didn't change.
      
      This patch is tagged for 'stable' as it fixes a lot of n-key RollOver
      hardware. We might wanna wait with backporting for a while, before we know
      it doesn't break anything else, though.
      
      Reported-by: Adam Goode <adam@spicenitz.org>
      Reported-by: Fredrik Hallenberg <megahallon@gmail.com>
      Tested-by: Fredrik Hallenberg <megahallon@gmail.com>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • dm snapshot: fix a possible invalid memory access on unload · dde15ae3
      Mikulas Patocka authored
      commit 22aa66a3 upstream.
      
      When the snapshot target is unloaded, snapshot_dtr() waits until
      pending_exceptions_count drops to zero.  Then, it destroys the snapshot.
      Therefore, the function that decrements pending_exceptions_count
      should not touch the snapshot structure after the decrement.
      
      pending_complete() calls free_pending_exception(), which decrements
      pending_exceptions_count, and then it performs up_write(&s->lock) and
      calls retry_origin_bios(), which dereferences s->origin.  These two
      memory accesses to the fields of the snapshot may touch the dm_snapshot
      structure after it is freed.
      
      This patch moves the call to free_pending_exception() to the end of
      pending_complete(), so that the snapshot will not be destroyed while
      pending_complete() is in progress.
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • dm: fix a race condition in dm_get_md · 0a21fde3
      Mikulas Patocka authored
      commit 2bec1f4a upstream.
      
      The function dm_get_md finds a device mapper device with a given dev_t,
      increases the reference count and returns the pointer.
      
      dm_get_md calls dm_find_md. dm_find_md takes _minor_lock, finds the
      device, tests that the device doesn't have the DMF_DELETING or DMF_FREEING
      flag, drops _minor_lock and returns a pointer to the device. dm_get_md then
      calls dm_get. dm_get calls BUG if the device has the DMF_FREEING flag,
      otherwise it increments the reference count.
      
      There is a possible race condition - after dm_find_md exits and before
      dm_get is called, there are no locks held, so the device may disappear or
      DMF_FREEING flag may be set, which results in BUG.
      
      To fix this bug, we need to call dm_get while we hold _minor_lock. This
      patch renames dm_find_md to dm_get_md and changes it so that it calls
      dm_get while holding the lock.
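      
      The pattern of taking the reference before the lookup lock is dropped can
      be sketched in user space as below. All names here (struct mapped_dev,
      minors, get_md, the spinlock helpers) are hypothetical stand-ins chosen
      for illustration, and a C11 atomic_flag spinlock takes the place of
      _minor_lock; this is not the device-mapper implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MINORS 4

struct mapped_dev {
    bool freeing;   /* stands in for DMF_FREEING / DMF_DELETING */
    int holders;    /* reference count */
};

static atomic_flag minor_lock = ATOMIC_FLAG_INIT;  /* stands in for _minor_lock */
static struct mapped_dev *minors[MAX_MINORS];      /* lookup table by minor */

static void lock(void)   { while (atomic_flag_test_and_set(&minor_lock)) { } }
static void unlock(void) { atomic_flag_clear(&minor_lock); }

/*
 * Look up a device and take a reference in one critical section, so the
 * device cannot start being freed between the flag test and the increment.
 */
struct mapped_dev *get_md(unsigned int minor)
{
    struct mapped_dev *md = NULL;

    lock();
    if (minor < MAX_MINORS && minors[minor] && !minors[minor]->freeing) {
        md = minors[minor];
        md->holders++;      /* reference taken before the lock is dropped */
    }
    unlock();
    return md;
}
```

      A teardown path that sets the freeing flag under the same lock then either
      sees the new reference or prevents it from being taken; there is no
      unlocked window in between.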
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • dm io: reject unsupported DISCARD requests with EOPNOTSUPP · d5a59035
      Darrick J. Wong authored
      commit 37527b86 upstream.
      
      I created a dm-raid1 device backed by a device that supports DISCARD
      and another device that does NOT support DISCARD with the following
      dm configuration:
      
       # echo '0 2048 mirror core 1 512 2 /dev/sda 0 /dev/sdb 0' | dmsetup create moo
       # lsblk -D
       NAME         DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
       sda                 0        4K       1G         0
       `-moo (dm-0)        0        4K       1G         0
       sdb                 0        0B       0B         0
       `-moo (dm-0)        0        4K       1G         0
      
      Notice that the mirror device /dev/mapper/moo advertises DISCARD
      support even though one of the mirror halves doesn't.
      
      If I issue a DISCARD request (via fstrim, mount -o discard, or ioctl
      BLKDISCARD) through the mirror, kmirrord gets stuck in an infinite
      loop in do_region() when it tries to issue a DISCARD request to sdb.
      The problem is that when we call do_region() against sdb, num_sectors
      is set to zero because q->limits.max_discard_sectors is zero.
      Therefore, "remaining" never decreases and the loop never terminates.
      
      To fix this: before entering the loop, check for the combination of a
      REQ_DISCARD request and a device with no discard support, and return
      -EOPNOTSUPP to avoid hanging up the mirror device.
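      
      The hang and the guard can be illustrated with a simplified user-space
      loop. This is a sketch under assumptions rather than the do_region() code:
      issue_discard() and its parameters are hypothetical, and each loop
      iteration merely counts a request instead of submitting one.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Split a discard of 'remaining' sectors into chunks of at most
 * 'max_discard_sectors'. Returns 0 on success and stores the number of
 * chunks in *nr_requests, or -EOPNOTSUPP if the device cannot discard.
 */
int issue_discard(uint64_t remaining, uint64_t max_discard_sectors,
                  unsigned int *nr_requests)
{
    *nr_requests = 0;

    /* The guard: with a limit of 0 the loop below could never progress. */
    if (max_discard_sectors == 0)
        return -EOPNOTSUPP;

    while (remaining) {
        uint64_t num_sectors = remaining < max_discard_sectors ?
                               remaining : max_discard_sectors;

        remaining -= num_sectors;   /* would stay unchanged if num_sectors were 0 */
        (*nr_requests)++;
    }
    return 0;
}
```

      Without the early return, a limit of zero yields num_sectors == 0 on every
      pass, which matches the never-decreasing "remaining" described above.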
      
      This bug was found by the unfortunate coincidence of pvmove and a
      discard operation in the RHEL 6.5 kernel; upstream is also affected.
      
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>