1. 11 Feb, 2014 27 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 16e5a2ed
      Linus Torvalds authored
      Pull networking updates from David Miller:
      
       1) Fix flexcan build on big endian, from Arnd Bergmann
      
       2) Correctly attach cpsw to GPIO bitbang MDIO driver, from Stefan Roese
      
       3) udp_add_offload has to use GFP_ATOMIC since it can be invoked from
          non-sleepable contexts.  From Or Gerlitz
      
       4) vxlan_gro_receive() does not iterate over all possible flows
          properly, fix also from Or Gerlitz
      
       5) CAN core doesn't use a proper SKB destructor when it hooks up
          sockets to SKBs.  Fix from Oliver Hartkopp
      
       6) ip_tunnel_xmit() can use an uninitialized route pointer, fix from
          Eric Dumazet
      
       7) Fix address family assignment in IPVS, from Michal Kubecek
      
       8) Fix ath9k build on ARM, from Sujith Manoharan
      
       9) Make sure fail_over_mac only applies for the correct bonding modes,
          from Ding Tianhong
      
      10) The udp offload code doesn't use RCU correctly, from Shlomo Pongratz
      
      11) Handle gigabit features properly in generic PHY code, from Florian
          Fainelli
      
      12) Don't blindly invoke link operations in
          rtnl_link_get_slave_info_data_size, they are optional.  Fix from
          Fernando Luis Vazquez Cao
      
      13) Add USB IDs for Netgear Aircard 340U, from Bjørn Mork
      
      14) Handle netlink packet padding properly in openvswitch, from Thomas
          Graf
      
      15) Fix oops when deleting chains in nf_tables, from Patrick McHardy
      
      16) Fix RX stalls in xen-netback driver, from Zoltan Kiss
      
      17) Fix deadlock in mac80211 stack, from Emmanuel Grumbach
      
      18) inet_nlmsg_size() forgets to consider ifa_cacheinfo, fix from Geert
          Uytterhoeven
      
      19) tg3_change_mtu() can deadlock, fix from Nithin Sujir
      
      20) Fix regression in setting SCTP local source addresses on accepted
          sockets, caused by some generic ipv6 socket changes.  Fix from
          Matija Glavinic Pecotic
      
      21) IPPROTO_* must be pure defines, otherwise module aliases don't get
          constructed properly.  Fix from Jan Moskyto
      
      22) IPV6 netconsole setup doesn't work properly unless an explicit
          source address is specified, fix from Sabrina Dubroca
      
      23) Use __GFP_NORETRY for high order skb page allocations in
          sock_alloc_send_pskb and skb_page_frag_refill.  From Eric Dumazet
      
      24) Fix a regression added in netconsole over bridging, from Cong Wang
      
      25) TCP uses an artificial offset of 1ms for SRTT, but this doesn't jibe
          well with TCP pacing which needs the SRTT to be accurate.  Fix from
          Eric Dumazet
      
      26) Several cases of missing header file includes from Rashika Kheria
      
      27) Add ZTE MF667 device ID to qmi_wwan driver, from Raymond Wanyoike
      
      28) TCP Small Queues doesn't handle nonagle properly in some corner
          cases, fix from Eric Dumazet
      
      29) Remove extraneous read_unlock in bond_enslave, whoops.  From Ding
          Tianhong
      
      30) Fix 9p trans_virtio handling of vmalloc buffers, from Richard Yao
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (136 commits)
        6lowpan: fix lockdep splats
        alx: add missing stats_lock spinlock init
        9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
        bonding: remove unwanted bond lock for enslave processing
        USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support
        tcp: tsq: fix nonagle handling
        bridge: Prevent possible race condition in br_fdb_change_mac_address
        bridge: Properly check if local fdb entry can be deleted when deleting vlan
        bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port
        bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address
        bridge: Fix the way to check if a local fdb entry can be deleted
        bridge: Change local fdb entries whenever mac address of bridge device changes
        bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address
        bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr
        bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
        tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min
        net: vxge: Remove unused device pointer
        net: qmi_wwan: add ZTE MF667
        3c59x: Remove unused pointer in vortex_eisa_cleanup()
        net: fix 'ip rule' iif/oif device rename
        ...
    • 6lowpan: fix lockdep splats · 20e7c4e8
      Eric Dumazet authored
      When a device ndo_start_xmit() calls again dev_queue_xmit(),
      lockdep can complain because dev_queue_xmit() is re-entered and the
      spinlocks protecting tx queues share a common lockdep class.
      
      Same issue was fixed for bonding/l2tp/ppp in commits
      
      0daa2303 ("[PATCH] bonding: lockdep annotation")
      49ee4920 ("bonding: set qdisc_tx_busylock to avoid LOCKDEP splat")
      23d3b8bf ("net: qdisc busylock needs lockdep annotations ")
      303c07db ("ppp: set qdisc_tx_busylock to avoid LOCKDEP splat ")
      Reported-by: Alexander Aring <alex.aring@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Alexander Aring <alex.aring@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • alx: add missing stats_lock spinlock init · 3e5ccc29
      John Greene authored
      Trivial fix for init time stack trace occurring in
      alx_get_stats64 upon start up. Should have been part of
      commit adding the spinlock:
      f1b6b106 alx: add alx_get_stats64 operation
      Signed-off-by: John Greene <jogreene@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers · b6f52ae2
      Richard Yao authored
      The 9p-virtio transport does zero copy on things larger than 1024 bytes
      in size. It accomplishes this by returning the physical addresses of
      pages to the virtio-pci device. At present, the translation is usually a
      bit shift.
      
      That approach produces an invalid page address when we read/write to
      vmalloc buffers, such as those used for Linux kernel modules. Any
      attempt to load a Linux kernel module from 9p-virtio produces the
      following stack.
      
      [<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510
      [<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0
      [<ffffffff814839dd>] p9_client_read+0x15d/0x240
      [<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0
      [<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20
      [<ffffffff811c84e7>] v9fs_file_read+0x37/0x70
      [<ffffffff8114e3fb>] vfs_read+0x9b/0x160
      [<ffffffff81153571>] kernel_read+0x41/0x60
      [<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180
      
      Subsequently, QEMU will die printing:
      
      qemu-system-x86_64: virtio: trying to map MMIO memory
      
      This patch enables 9p-virtio to correctly handle this case. This not
      only enables us to load Linux kernel modules off virtfs, but also
      enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
      
      Special thanks to both Avi Kivity and Alexander Graf for their
      interpretation of QEMU backtraces. Without their guidance, tracking down
      this bug would have taken much longer. Also, special thanks to Linus
      Torvalds for his insightful explanation of why this should use
      is_vmalloc_addr() instead of is_vmalloc_or_module_addr():
      
      https://lkml.org/lkml/2014/2/8/272
      
      Signed-off-by: Richard Yao <ryao@gentoo.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: remove unwanted bond lock for enslave processing · 6b8790b5
      dingtianhong authored
      Bond enslave processing no longer holds bond->lock, so releasing
      the unlocked rwlock triggers a warning message.  Remove the
      unwanted read_unlock(&bond->lock).
      
      Cc: Jay Vosburgh <fubar@us.ibm.com>
      Cc: Veaceslav Falico <vfalico@redhat.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Acked-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Liu Junliang's avatar
    • Merge branch 'akpm' (patches from Andrew Morton) · 6792dfe3
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "A bunch of fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        ocfs2: check existence of old dentry in ocfs2_link()
        ocfs2: update inode size after zeroing the hole
        ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size
        mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED
        smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls
        mm: fix page leak at nfs_symlink()
        slub: do not assert not having lock in removing freed partial
        gitignore: add all.config
        ocfs2: fix ocfs2_sync_file() if filesystem is readonly
        drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zero
        fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem
        xen: properly account for _PAGE_NUMA during xen pte translations
        mm/slub.c: list_lock may not be held in some circumstances
        drivers/md/bcache/extents.c: use %zi to format size_t
        vmcore: prevent PT_NOTE p_memsz overflow during header update
        drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS)
        Documentation/: update 00-INDEX files
        checkpatch: fix detection of git repository
        get_maintainer: fix detection of git repository
        drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context()
    • ocfs2: check existence of old dentry in ocfs2_link() · 0e048316
      Xue jiufei authored
      The linkat system call first calls user_path_at() to check that the
      old dentry exists, and then calls vfs_link()->ocfs2_link() to do the
      actual work.  A race can occur when node A creates a hard link to a
      file while node B removes it.
      
               Node A                          Node B
      user_path_at()
        ->ocfs2_lookup(),
      find old dentry exist
                                      rm file, add inode say inodeA
                                      to orphan_dir
      
      call ocfs2_link(),create a
      hard link for inodeA.
      
                                      rm the link, add inodeA to orphan_dir
                                      again
      
      When the orphan_scan work starts, it calls ocfs2_queue_orphans() to do
      the main work.  It first traverses the entries in orphan_dir, linking all
      inodes in this orphan_dir into a list that looks like this:
      
      	inodeA->inodeB->...->inodeA
      
      When traversing this list, it falls into a loop, calling iput() again
      and again, and finally triggers BUG_ON(inode->i_state & I_CLEAR).
      Signed-off-by: joyce <xuejiufei@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: update inode size after zeroing the hole · c7d2cbc3
      Junxiao Bi authored
      fs-writeback will release dirty pages without the page lock whose
      offsets are beyond the inode size; the release happens in
      block_write_full_page_endio().  If the inode size is not updated, dirty
      pages in file holes may be released before they are flushed to disk, so
      the holes will contain non-zero data and sparse files will fail their
      md5sum check.
      
      To reproduce the bug, find a big sparse file with many holes, such as a
      VM image file.  Its actual size should be bigger than the available
      memory so that writeback runs more frequently.  Tar it with the -S
      option, then repeatedly untar it and check its md5sum until you get a
      wrong md5sum.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Younger Liu <younger.liu@huawei.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size · d62e74be
      Younger Liu authored
      The scenario is as follows:
      
      - Create a small file and fallocate a large amount of disk space for it
        with the FALLOC_FL_KEEP_SIZE option.
      
      - ftruncate the file back to its original size.  The free disk space
        does not change back.  This is a real bug that is fixed in this
        patch.
      
      In order to solve the issue above, we modified ocfs2_setattr(): if
      attr->ia_size != i_size_read(inode), it calls ocfs2_truncate_file() and
      truncates disk space to attr->ia_size.
      Signed-off-by: Younger Liu <younger.liu@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Tested-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Reviewed-by: Jensen <shencanquan@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED · 8d547ff4
      Naoya Horiguchi authored
      mce-test detected a test failure when injecting error to a thp tail
      page.  This is because we take page refcount of the tail page in
      madvise_hwpoison() while the fix in commit a3e0f9e4
      ("mm/memory-failure.c: transfer page count from head page to tail page
      after split thp") assumes that we always take refcount on the head page.
      
      When a real memory error happens we take refcount on the head page where
      memory_failure() is called without MF_COUNT_INCREASED set, so it seems
      to me that testing memory error on thp tail page using madvise makes
      little sense.
      
      This patch cancels moving refcount in !MF_COUNT_INCREASED for valid
      testing.
      
      [akpm@linux-foundation.org: s/&&/&/]
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Chen Gong <gong.chen@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[3.9+: a3e0f9e4]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls · fb37bb04
      Paul Gortmaker authored
      Use what we already do for arch_disable_smp_support() to fix these:
      
        arch/x86/kernel/smpboot.c:1155:6: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
        arch/x86/kernel/smpboot.c:1160:6: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
        kernel/cpu.c:512:13: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
        kernel/cpu.c:516:13: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix page leak at nfs_symlink() · a0b54add
      Rafael Aquini authored
      Changes in commit a0b8cab3 ("mm: remove lru parameter from
      __pagevec_lru_add and remove parts of pagevec API") have introduced a
      call to add_to_page_cache_lru() which causes a leak in nfs_symlink() as
      now the page gets an extra refcount that is not dropped.
      
      Jan Stancek observed and reported the leak effect while running test8
      from Connectathon Testsuite.  After several iterations over the test
      case, which creates several symlinks on a NFS mountpoint, the test
      system was quickly getting into an out-of-memory scenario.
      
      This patch fixes the page leak by dropping that extra refcount
      add_to_page_cache_lru() is grabbing.
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org>	[3.11.x+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: do not assert not having lock in removing freed partial · 1e4dd946
      Steven Rostedt authored
      Vladimir reported the following issue:
      
      Commit c65c1877 ("slub: use lockdep_assert_held") requires
      remove_partial() to be called with n->list_lock held, but free_partial()
      called from kmem_cache_close() on cache destruction does not follow this
      rule, leading to a warning:
      
        WARNING: CPU: 0 PID: 2787 at mm/slub.c:1536 __kmem_cache_shutdown+0x1b2/0x1f0()
        Modules linked in:
        CPU: 0 PID: 2787 Comm: modprobe Tainted: G        W    3.14.0-rc1-mm1+ #1
        Hardware name:
         0000000000000600 ffff88003ae1dde8 ffffffff816d9583 0000000000000600
         0000000000000000 ffff88003ae1de28 ffffffff8107c107 0000000000000000
         ffff880037ab2b00 ffff88007c240d30 ffffea0001ee5280 ffffea0001ee52a0
        Call Trace:
          __kmem_cache_shutdown+0x1b2/0x1f0
          kmem_cache_destroy+0x43/0xf0
          xfs_destroy_zones+0x103/0x110 [xfs]
          exit_xfs_fs+0x38/0x4e4 [xfs]
          SyS_delete_module+0x19a/0x1f0
          system_call_fastpath+0x16/0x1b
      
      His solution was to add a spinlock in order to quiet lockdep.  Although
      there would be no contention to adding the lock, that lock also requires
      disabling of interrupts which will have a larger impact on the system.
      
      Instead of adding a spinlock to a location where it is not needed for
      lockdep, make a __remove_partial() function that does not test if the
      list_lock is held, as no one should have it due to it being freed.
      
      Also added a __add_partial() function that does not do the lock
      validation either, as it is not needed for the creation of the cache.
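
The split between a checked helper and an unchecked one can be illustrated in plain C. This is a minimal userspace sketch, not the slub code: `struct node`, its `list_lock_held` flag and the name `remove_partial_unchecked` are hypothetical stand-ins (the changelog's `__remove_partial()` is renamed here because double-underscore identifiers are reserved in userspace C), and `assert()` plays the role that `lockdep_assert_held()` plays in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-node partial list and its lock state. */
struct node {
	bool list_lock_held;		/* models what lockdep would track */
	unsigned long nr_partial;	/* number of slabs on the partial list */
};

/*
 * Unchecked helper: does the list bookkeeping without verifying the lock.
 * Intended for teardown paths where no other user can reach the node.
 */
void remove_partial_unchecked(struct node *n)
{
	n->nr_partial--;
}

/* Checked wrapper: ordinary callers must hold the lock. */
void remove_partial(struct node *n)
{
	assert(n->list_lock_held);	/* stands in for lockdep_assert_held() */
	remove_partial_unchecked(n);
}
```

A teardown path would call `remove_partial_unchecked()` directly, while every other caller keeps going through `remove_partial()` and so keeps the lock check.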
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Reported-by: Vladimir Davydov <vdavydov@parallels.com>
      Suggested-by: David Rientjes <rientjes@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Vladimir Davydov <vdavydov@parallels.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gitignore: add all.config · 25fba9be
      Borislav Petkov authored
      This is used by kbuild to load preset Kconfig options.  We need to
      ignore it, otherwise git clean kills it.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix ocfs2_sync_file() if filesystem is readonly · a987c7ca
      Younger Liu authored
      If filesystem is readonly, there is no need to flush drive's caches or
      force any uncommitted transactions.
      
      [akpm@linux-foundation.org: return -EROFS, not 0]
      Signed-off-by: Younger Liu <younger.liucn@gmail.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zero · 79040cad
      Prarit Bhargava authored
      If you do
      
        echo 0 > /sys/module/edac_core/parameters/edac_mc_poll_msec
      
      the following stack trace is output because the edac module is not
      designed to poll with a timeout of zero.
      
        WARNING: CPU: 12 PID: 0 at lib/list_debug.c:33 __list_add+0xac/0xc0()
        list_add corruption. prev->next should be next (ffff8808291dd1b8), but was           (null). (prev=ffff8808286fe3f8).
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 12 PID: 0 Comm: swapper/12 Not tainted 3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Call Trace:
         <IRQ>
          __list_add+0xac/0xc0
          __internal_add_timer+0xab/0x130
          internal_add_timer+0x17/0x40
          mod_timer_pinned+0xca/0x170
          intel_pstate_timer_func+0x28a/0x380
          call_timer_fn+0x36/0x100
          run_timer_softirq+0x1ff/0x2f0
          __do_softirq+0xf5/0x2e0
          irq_exit+0x10d/0x120
          smp_apic_timer_interrupt+0x45/0x60
          apic_timer_interrupt+0x6d/0x80
         <EOI>
          cpuidle_idle_call+0xb9/0x1f0
          arch_cpu_idle+0xe/0x30
          cpu_startup_entry+0x9e/0x240
          start_secondary+0x1e4/0x290
      
        kernel BUG at kernel/timer.c:1084!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 12 PID: 0 Comm: swapper/12 Tainted: G        W    3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Call Trace:
         <IRQ>
          run_timer_softirq+0x245/0x2f0
          __do_softirq+0xf5/0x2e0
          irq_exit+0x10d/0x120
          smp_apic_timer_interrupt+0x45/0x60
          apic_timer_interrupt+0x6d/0x80
         <EOI>
          cpuidle_idle_call+0xb9/0x1f0
          arch_cpu_idle+0xe/0x30
          cpu_startup_entry+0x9e/0x240
          start_secondary+0x1e4/0x290
        RIP   cascade+0x93/0xa0
      
        WARNING: CPU: 36 PID: 1154 at kernel/workqueue.c:1461 __queue_delayed_work+0xed/0x1a0()
        Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm ixgbe e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ptp sb_edac iTCO_vendor_support pps_core mdio ipmi_devintf edac_core ioatdma microcode shpchp lpc_ich pcspkr i2c_i801 dca mfd_core ipmi_si wmi ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod sr_mod cdrom crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt isci i2c_algo_bit drm_kms_helper ttm drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 36 PID: 1154 Comm: kworker/u481:3 Tainted: G        W    3.13.0+ #1
        Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
        Workqueue: edac-poller edac_mc_workq_function [edac_core]
        Call Trace:
          dump_stack+0x45/0x56
          warn_slowpath_common+0x7d/0xa0
          warn_slowpath_null+0x1a/0x20
          __queue_delayed_work+0xed/0x1a0
          queue_delayed_work_on+0x27/0x50
          edac_mc_workq_function+0x72/0xa0 [edac_core]
          process_one_work+0x17b/0x460
          worker_thread+0x11b/0x400
          kthread+0xd2/0xf0
          ret_from_fork+0x7c/0xb0
      
      This patch adds a range check in the edac_mc_poll_msec code to check for 0.
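
A range check of this kind might look roughly like the following. This is a userspace sketch under stated assumptions, not the actual edac code: the function name `set_poll_msec` is hypothetical, `strtol()` stands in for the kernel's string parsing, and rejecting negative values as well as zero is an assumption of the sketch rather than something the changelog states.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a module-parameter setter. Parses a polling
 * interval in milliseconds and stores it in *poll_msec only if it is
 * usable. Returns 0 on success and -EINVAL otherwise.
 */
int set_poll_msec(const char *val, int *poll_msec)
{
	char *end;
	long l;

	assert(poll_msec != NULL);	/* a NULL output pointer is a caller bug */
	if (!val)
		return -EINVAL;

	errno = 0;
	l = strtol(val, &end, 0);
	if (errno || end == val)
		return -EINVAL;		/* overflow, or no digits at all */
	if (*end == '\n')
		end++;			/* "echo" appends a newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing junk */
	if (l <= 0 || l > INT_MAX)
		return -EINVAL;		/* zero is the case the poller cannot handle */

	*poll_msec = (int)l;
	return 0;
}
```

With this check, writing "0" leaves the previous interval untouched and reports an error instead of arming a zero-delay timer.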
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem · 96c7a2ff
      Eric W. Biederman authored
      Recently, due to a spike in connections per second, memcached on 3
      separate boxes triggered the OOM killer from accept.  At the time the
      OOM killer was triggered there was 4GB out of 36GB free in zone 1.  The
      problem was that alloc_fdtable was allocating an order 3 page (32KiB) to
      hold a bitmap, and there was sufficient fragmentation that the largest
      page available was 8KiB.
      
      I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious
      but I do agree that order 3 allocations are very likely to succeed.
      
      There are always pathologies where order > 0 allocations can fail when
      there are copious amounts of free memory available.  Using the pigeon
      hole principle it is easy to show that it requires 1 page more than 50%
      of the pages being free to guarantee an order 1 (8KiB) allocation will
      succeed, 1 page more than 75% of the pages being free to guarantee an
      order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of
      the pages being free to guarantee an order 3 allocation will succeed.
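
The pigeonhole bound can be checked numerically. The following is a minimal sketch, not kernel code: it assumes memory is divided into naturally aligned blocks of 2^order pages and that the total page count is a multiple of the block size. The function name `max_free_without_block` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Worst case for an order-n request: every naturally aligned block of
 * 2^n pages contains exactly one allocated page. Then (2^n - 1) out of
 * every 2^n pages are free, yet no block is entirely free.
 *
 * Returns the largest number of free pages (out of total_pages) that
 * can coexist with zero free order-n blocks. One more free page than
 * this forces at least one whole block to be free.
 */
size_t max_free_without_block(size_t total_pages, unsigned int order)
{
	size_t block = (size_t)1 << order;

	assert(total_pages % block == 0);	/* keeps the arithmetic exact */
	return (total_pages / block) * (block - 1);
}
```

For 1024 pages this yields 512, 768 and 896 free pages for orders 1, 2 and 3, which is 50%, 75% and 87.5% of memory, matching the figures in the paragraph above.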
      
      A server churning memory with a lot of small requests and replies, like
      memcached, is a common case that, if anything, will skew the odds
      against large pages being available.
      
      Therefore let's not give external applications a practical way to kill
      linux server applications, and specify __GFP_NORETRY to the kmalloc in
      alloc_fdmem.  Unless I am misreading the code, by the time the code
      reaches should_alloc_retry in __alloc_pages_slowpath (where
      __GFP_NORETRY becomes significant), we have already tried everything
      reasonable to allocate a page and the only thing left to do is wait.  So
      not waiting and falling back to vmalloc immediately seems like the
      reasonable thing to do even if there wasn't a chance of triggering the
      OOM killer.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • xen: properly account for _PAGE_NUMA during xen pte translations · a9c8e4be
      Mel Gorman authored
      Steven Noonan forwarded a user's report where they had a problem starting
      vsftpd on a Xen paravirtualized guest, with this in dmesg:
      
        BUG: Bad page map in process vsftpd  pte:8000000493b88165 pmd:e9cc01067
        page:ffffea00124ee200 count:0 mapcount:-1 mapping:     (null) index:0x0
        page flags: 0x2ffc0000000014(referenced|dirty)
        addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping:          (null) index:7f97eea74
        CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1
        Call Trace:
          dump_stack+0x45/0x56
          print_bad_pte+0x22e/0x250
          unmap_single_vma+0x583/0x890
          unmap_vmas+0x65/0x90
          exit_mmap+0xc5/0x170
          mmput+0x65/0x100
          do_exit+0x393/0x9e0
          do_group_exit+0xcc/0x140
          SyS_exit_group+0x14/0x20
          system_call_fastpath+0x1a/0x1f
        Disabling lock debugging due to kernel taint
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1
      
      The issue could not be reproduced under an HVM instance with the same
      kernel, so it appears to be exclusive to paravirtual Xen guests.  He
      bisected the problem to commit 1667918b ("mm: numa: clear numa
      hinting information on mprotect") that was also included in 3.12-stable.
      
      The problem was related to how xen translates ptes because it was not
      accounting for the _PAGE_NUMA bit.  This patch splits pte_present to add
      a pteval_present helper for use by xen so both bare metal and xen use
      the same code when checking if a PTE is present.
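
The idea of one shared presence check can be sketched as follows. This is an illustration only, assuming that a NUMA-hinting entry has its present bit cleared and a separate NUMA bit set. The `SKETCH_*` bit positions and the `present_bit_only` helper are hypothetical and do not reproduce the kernel's definitions, which depend on architecture and kernel version.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Illustrative bit positions only, chosen for this sketch. */
#define SKETCH_PAGE_PRESENT	((pteval_t)1 << 0)
#define SKETCH_PAGE_NUMA	((pteval_t)1 << 6)
#define SKETCH_PAGE_PROTNONE	((pteval_t)1 << 8)

static_assert((SKETCH_PAGE_PRESENT & SKETCH_PAGE_NUMA) == 0,
	      "flag bits must be distinct");

/* A check that looks only at the present bit misses NUMA-hinting entries. */
int present_bit_only(pteval_t val)
{
	return (val & SKETCH_PAGE_PRESENT) != 0;
}

/* Shared helper: an entry counts as present if any of the three bits is set. */
int pteval_present(pteval_t val)
{
	return (val & (SKETCH_PAGE_PRESENT | SKETCH_PAGE_PROTNONE |
		       SKETCH_PAGE_NUMA)) != 0;
}
```

An entry carrying only the NUMA bit is treated as absent by `present_bit_only()` but as present by `pteval_present()`, which is the kind of disagreement the changelog describes between the xen translation path and bare metal.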
      
      [mgorman@suse.de: wrote changelog, proposed minor modifications]
      [akpm@linux-foundation.org: fix typo in comment]
      Reported-by: Steven Noonan <steven@uplinklabs.net>
      Tested-by: Steven Noonan <steven@uplinklabs.net>
      Signed-off-by: Elena Ufimtseva <ufimtseva@gmail.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reviewed-by: David Vrabel <david.vrabel@citrix.com>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: <stable@vger.kernel.org>	[3.12+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a9c8e4be
    • David Rientjes's avatar
      mm/slub.c: list_lock may not be held in some circumstances · 255d0884
      David Rientjes authored
      Commit c65c1877 ("slub: use lockdep_assert_held") incorrectly
      required that add_full() and remove_full() hold n->list_lock.  The lock
      is only taken when kmem_cache_debug(s), since that's the only time it
      actually does anything.
      
      Require that the lock only be taken under such a condition.
      Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      255d0884
    • Geert Uytterhoeven's avatar
      drivers/md/bcache/extents.c: use %zi to format size_t · bd180b4e
      Geert Uytterhoeven authored
        drivers/md/bcache/extents.c: In function `btree_ptr_bad_expensive':
        drivers/md/bcache/extents.c:196: warning: format `%li' expects type `long int', but argument 4 has type `size_t'
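
The warning arises because `size_t` is not `long` on every architecture. A minimal sketch of the portable form follows; `format_size` is a hypothetical helper, and it uses `%zu`, the unsigned counterpart of the `%zi` named in the commit title. The `z` length modifier is what makes the conversion match `size_t`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a size_t portably. "%li" assumes the argument is a long,
 * which does not hold for size_t on every target; the "z" length
 * modifier tells printf-style functions to expect a size_t-sized value.
 */
int format_size(char *buf, size_t buflen, size_t value)
{
	assert(buf != NULL && buflen > 0);
	return snprintf(buf, buflen, "%zu", value);
}
```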
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bd180b4e
    • Greg Pearson's avatar
      vmcore: prevent PT_NOTE p_memsz overflow during header update · 38dfac84
      Greg Pearson authored
      Currently, update_note_header_size_elf64() and
      update_note_header_size_elf32() will add the size of a PT_NOTE entry to
      real_sz even if that causes real_sz to exceed max_sz.  This patch
      corrects the while loop logic in those routines to ensure that does not
      happen and prints a warning if a PT_NOTE entry is dropped.  If zero
      PT_NOTE entries are found or this condition is encountered because the
      only entry was dropped, a warning is printed and an error is returned.
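
The corrected loop logic can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: it walks an array of note headers rather than a raw buffer, and `struct toy_note`, `toy_align4` and `toy_note_real_size` are hypothetical names. It assumes the name and descriptor of each note are padded to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an ELF note header (assumption for the sketch). */
struct toy_note {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Round up to the next multiple of 4. */
static uint64_t toy_align4(uint64_t x)
{
	return (x + 3) & ~(uint64_t)3;
}

/*
 * Sum note sizes until the next entry would push the total past max_sz.
 * That entry and everything after it are dropped. Returns 0 and stores
 * the total in *real_sz, or -1 if no entry fits at all.
 */
int toy_note_real_size(const struct toy_note *notes, size_t count,
		       uint64_t max_sz, uint64_t *real_sz)
{
	uint64_t total = 0;
	size_t i;

	assert(real_sz != NULL);
	for (i = 0; i < count; i++) {
		uint64_t sz = sizeof(struct toy_note) +
			      toy_align4(notes[i].n_namesz) +
			      toy_align4(notes[i].n_descsz);

		/* total never exceeds max_sz, so this cannot underflow. */
		if (sz > max_sz - total)
			break;	/* entry would overflow max_sz: drop it */
		total += sz;
	}
	if (total == 0)
		return -1;
	*real_sz = total;
	return 0;
}
```

The check happens before the addition, so the running total is never allowed to exceed the limit even for an entry with implausibly large sizes.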
      
      One possible negative side effect of exceeding the max_sz limit is an
      allocation failure in merge_note_headers_elf64() or
      merge_note_headers_elf32() which would produce console output such as
      the following while booting the crash kernel.
      
        vmalloc: allocation failure: 14076997632 bytes
        swapper/0: page allocation failure: order:0, mode:0x80d2
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-gbp1 #7
        Call Trace:
          dump_stack+0x19/0x1b
          warn_alloc_failed+0xf0/0x160
          __vmalloc_node_range+0x19e/0x250
          vmalloc_user+0x4c/0x70
          merge_note_headers_elf64.constprop.9+0x116/0x24a
          vmcore_init+0x2d4/0x76c
          do_one_initcall+0xe2/0x190
          kernel_init_freeable+0x17c/0x207
          kernel_init+0xe/0x180
          ret_from_fork+0x7c/0xb0
      
        Kdump: vmcore not initialized
      
        kdump: dump target is /dev/sda4
        kdump: saving to /sysroot//var/crash/127.0.0.1-2014.01.28-13:58:52/
        kdump: saving vmcore-dmesg.txt
        Cannot open /proc/vmcore: No such file or directory
        kdump: saving vmcore-dmesg.txt failed
        kdump: saving vmcore
        kdump: saving vmcore failed
      
      This type of failure has been seen on a four socket prototype system
      with certain memory configurations.  Most PT_NOTE sections have a single
      entry similar to:
      
        n_namesz = 0x5
        n_descsz = 0x150
        n_type   = 0x1
      
      Occasionally, a second entry is encountered with very large n_namesz and
      n_descsz sizes:
      
        n_namesz = 0x80000008
        n_descsz = 0x510ae163
        n_type   = 0x80000008
      
      The source of these extra entries is not yet known.  They seem bogus, but
      they shouldn't cause the crash dump to fail.
      Signed-off-by: Greg Pearson <greg.pearson@hp.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      38dfac84
    • Alexey Khoroshilov's avatar
      drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS) · a3eb7fbb
      Alexey Khoroshilov authored
      i2o_cfg_compat_ioctl(I2OGETIOPS) locks i2o_cfg_mutex and then calls
      i2o_cfg_ioctl(I2OGETIOPS) that locks i2o_cfg_mutex as well.  A deadlock
      is guaranteed.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a3eb7fbb
    • Henrik Austad's avatar
      Documentation/: update 00-INDEX files · 3cf8ca1c
      Henrik Austad authored
      Some of the 00-INDEX files are somewhat outdated and some folders do
      not contain 00-INDEX at all.  Only outdated (with the notable exception
      of spi) indexes are touched here; the 169 folders without 00-INDEX have
      not been touched.
      
      New 00-INDEX
       - spi/* was added in a series of commits dating back to 2006
      
      Added files (missing in (*/)00-INDEX)
       - dmatest.txt was added by commit 851b7e16 ("dmatest: run test via
         debugfs")
       - this_cpu_ops.txt was added by commit a1b2a555 ("percpu: add
         documentation on this_cpu operations")
       - ww-mutex-design.txt was added by commit 040a0a37 ("mutex: Add
         support for wound/wait style locks")
       - bcache.txt was added by commit cafe5635 ("bcache: A block layer
         cache")
       - kernel-per-CPU-kthreads.txt was added by commit 49717cb4
         ("kthread: Document ways of reducing OS jitter due to per-CPU
         kthreads")
       - phy.txt was added by commit ff764963 ("drivers: phy: add generic
         PHY framework")
       - block/null_blk was added by commit 12f8f4fc ("null_blk:
         documentation")
       - module-signing.txt was added by commit 3cafea30 ("Add
         Documentation/module-signing.txt file")
       - assoc_array.txt was added by commit 3cb98950 ("Add a generic
         associative array implementation.")
       - arm/IXP4xx was part of the initial repo
       - arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28
         ("ARM: mcpm: introduce helpers for platform coherency exit/setup")
       - arm/firmware.txt was added by commit 7366b92a ("ARM: Add
         interface for registering and calling firmware-specific operations")
       - arm/kernel_mode_neon.txt was added by commit 2afd0a05 ("ARM:
         7825/1: document the use of NEON in kernel mode")
       - arm/tcm.txt was added by commit bc581770 ("ARM: 5580/2: ARM TCM
         (Tightly-Coupled Memory) support v3")
       - arm/vlocks.txt was added by commit 9762f12d ("ARM: mcpm: Add
         baremetal voting mutexes")
       - blackfin/gptimers-example.c, Makefile was added by commit
         4b60779d ("Blackfin: add an example showing how to use the
         gptimers API")
       - devicetree/usage-model.txt was added by commit 31134efc ("dt:
         Linux DT usage model documentation")
       - fb/api.txt was added by commit fb21c2f4 ("fbdev: Add FOURCC-based
         format configuration API")
       - fb/sm501.txt was added by commit e6a04980 ("video, sm501: add
         edid and commandline support")
       - fb/udlfb.txt was added by commit 96f8d864 ("fbdev: move udlfb out
         of staging.")
       - filesystems/Makefile was added by commit 1e0051ae
         ("Documentation/fs/: split txt and source files")
       - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit
         8a4c6e19 ("nfsd: document kernel interfaces for nfsd
         configuration")
       - ide/warm-plug-howto.txt was added by commit f74c9141 ("ide: add
         warm-plug support for IDE devices (take 2)")
       - laptops/Makefile was added by commit d49129ac
         ("Documentation/laptop/: split txt and source files")
       - leds/leds-blinkm.txt was added by commit b54cf35a ("LEDS: add
         BlinkM RGB LED driver, documentation and update MAINTAINERS")
       - leds/ledtrig-oneshot.txt was added by commit 5e417281 ("leds: add
         oneshot trigger")
       - leds/ledtrig-transient.txt was added by commit 44e1e9f8 ("leds:
         add new transient trigger for one shot timer activation")
       - m68k/README.buddha was part of the initial repo
       - networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits
         40839129, c4e84bde, 5a4faa87
       - networking/Makefile was added by commit 3794f3e8 ("docsrc: build
         Documentation/ sources")
       - networking/i40evf.txt was added by commit 105bf2fe ("i40evf: add
         driver to kernel build system")
       - networking/ipsec.txt was added by commit b3c6efbc ("xfrm: Add
         file to document IPsec corner case")
       - networking/mac80211-auth-assoc-deauth.txt was added by commit
         3cd7920a ("mac80211: add auth/assoc/deauth flow diagram")
       - networking/netlink_mmap.txt was added by commit 5683264c
         ("netlink: add documentation for memory mapped I/O")
       - networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e1
         ("netfilter: doc: add nf_conntrack sysctl api documentation")
       - networking/team.txt was added by commit 3d249d4c ("net: introduce
         ethernet teaming device")
       - networking/vxlan.txt was added by commit d342894c ("vxlan:
         virtual extensible lan")
       - power/runtime_pm.txt was added by commit 5e928f77 ("PM: Introduce
         core framework for run-time PM of I/O devices (rev.  17)")
       - power/charger-manager.txt was added by commit 3bb3dbbd
         ("power_supply: Add initial Charger-Manager driver")
       - RCU/lockdep-splat.txt was added by commit d7bd2d68 ("rcu:
         Document interpretation of RCU-lockdep splats")
       - s390/kvm.txt was added by commit 5ecee4ba ("KVM: s390: API
         documentation")
       - s390/qeth.txt was added by commit b4d72c08 ("qeth: bridgeport
         support - basic control")
       - scheduler/sched-bwc.txt was added by commit 88ebc08e ("sched: Add
         documentation for bandwidth control")
       - scsi/advansys.txt was added by commit 4bd6d7f3 ("[SCSI] advansys:
         Move documentation to Documentation/scsi")
       - scsi/bfa.txt was added by commit 1ec90174 ("[SCSI] bfa: add
         readme file")
       - scsi/bnx2fc.txt was added by commit 12b8fc10 ("[SCSI] bnx2fc: Add
         driver documentation")
       - scsi/cxgb3i.txt was added by commit c3673464 ("[SCSI] cxgb3i: Add
         cxgb3i iSCSI driver.")
       - scsi/hpsa.txt was added by commit 992ebcf1 ("[SCSI] hpsa: Add
         hpsa.txt to Documentation/scsi")
       - scsi/link_power_management_policy.txt was added by commit
         ca77329f ("[libata] Link power management infrastructure")
       - scsi/osd.txt was added by commit 78e0c621 ("[SCSI] osd:
         Documentation for OSD library")
       - scsi/scsi-parameter.txt was created/moved by commit 163475fb
         ("Documentation: move SCSI parameters to their own text file")
       - serial/driver was part of the initial repo
       - serial/n_gsm.txt was added by commit 323e8412 ("n_gsm: add a
         documentation")
       - timers/Makefile was added by commit 3794f3e8 ("docsrc: build
         Documentation/ sources")
       - virt/kvm/s390.txt was added by commit d9101fca ("KVM: s390:
         diagnose call documentation")
       - vm/split_page_table_lock was added by commit 49076ec2 ("mm:
         dynamically allocate page->ptl if it cannot be embedded to struct
         page")
       - w1/slaves/w1_ds28e04 was added by commit fbf7f7b4 ("w1: Add
         1-wire slave device driver for DS28E04-100")
       - w1/masters/omap-hdq was added by commit e0a29382 ("hdq:
         documentation for OMAP HDQ")
       - x86/early-microcode.txt was added by commit 0d91ea86 ("x86, doc:
         Documentation for early microcode loading")
       - x86/earlyprintk.txt was added by commit a1aade47 ("x86/doc:
         mini-howto for using earlyprintk=dbgp")
       - x86/entry_64.txt was added by commit 8b4777a4 ("x86-64: Document
         some of entry_64.S")
       - x86/pat.txt was added by commit d27554d8 ("x86: PAT
         documentation")
      
      Moved files
       - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
         commit 37b83046 ("ARM: kuser: move interface documentation out of
         the source code")
       - efi-stub.txt was moved out of x86/ and down into Documentation/ in
         commit 4172fe2f ("EFI stub documentation updates")
       - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit
         efcfed9b ("Move hp_accel to drivers/platform/x86")
       - commit 5616c23a ("x86: doc: move x86-generic documentation from
         Doc/x86/i386"):
         * x86/usb-legacy-support.txt
         * x86/boot.txt
         * x86/zero_page.txt
       - power/video_extension.txt was moved to acpi in commit 70e66e4d
         ("ACPI / video: move video_extension.txt to Documentation/acpi")
      
      Removed files (left in 00-INDEX)
       - memory.txt was removed by commit 00ea8990 ("memory.txt: remove
         stray information")
       - gpio.txt was moved to gpio/ in commit fd8e198c ("Documentation:
         gpiolib: document new interface")
       - networking/DLINK.txt was removed by commit 168e06ae
         ("drivers/net: delete old parallel port de600/de620 drivers")
       - serial/hayes-esp.txt was removed by commit f53a2ade ("tty: esp:
         remove broken driver")
       - s390/TAPE was removed by commit 9e280f66 ("[S390] remove tape
         block docu")
       - vm/locking was removed by commit 57ea8171 ("mm: documentation:
         remove hopelessly out-of-date locking doc")
       - laptops/acer-wmi.txt was removed by commit 02003667 ("acer-wmi:
         Delete out-of-date documentation")
      
      Typos/misc issues
       - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit
         030d794b ("SUNRPC: Use gssproxy upcall for server RPCGSS
         authentication.")
       - commit b88cf73d ("net: add missing entries to
         Documentation/networking/00-INDEX")
         * generic-hdlc.txt was added as generic_hdlc.txt
         * spider_net.txt was added as spider-net.txt
       - w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139 ("w1: add
         1-wire master driver for i.MX27 / i.MX31")
       - s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a
         ("[S390] Add Documentation/s390/00-INDEX.")
      Signed-off-by: Henrik Austad <henrik@austad.us>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	[rcu bits]
      Acked-by: Rob Landley <rob@landley.net>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: James Bottomley <JBottomley@parallels.com>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3cf8ca1c
    • Richard Genoud's avatar
      checkpatch: fix detection of git repository · 3645e328
      Richard Genoud authored
      Since git v1.7.7, the .git directory can be a file when, for example,
      the kernel is a submodule of another git super project.  So, the check
      "-d .git" is not working anymore in this case.  Using a more generic
      check like "-e .git" corrects this behaviour.
      Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3645e328
    • Richard Genoud's avatar
      get_maintainer: fix detection of git repository · ec83b616
      Richard Genoud authored
      Since git v1.7.7, the .git directory can be a file when, for example,
      the kernel is a submodule of another git super project.  So, the check
      "-d .git" is not working anymore in this case.  Using a more generic
      check like "-e .git" corrects this behaviour.
      Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ec83b616
    • Dan Carpenter's avatar
      drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context() · 49d3d6c3
      Dan Carpenter authored
      I was reviewing this and noticed that unlocking should be conditional on
      the error path.  I've changed it to unlock and return directly since we
      only do it once and it seems unlikely to change in the near future.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Dimitri Sivanich <sivanich@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49d3d6c3
  2. 10 Feb, 2014 13 commits
    • John Ogness's avatar
      tcp: tsq: fix nonagle handling · bf06200e
      John Ogness authored
      Commit 46d3ceab ("tcp: TCP Small Queues") introduced a possible
      regression for applications using TCP_NODELAY.
      
      If a TCP session is throttled because of tsq, we should consult
      tp->nonagle when TX completion is done and allow ourselves to send an
      additional segment, especially if this segment is not a full MSS.
      Otherwise this segment is sent after an RTO.
      
      [edumazet] : Cooked the changelog, added another fix about testing
      sk_wmem_alloc twice because TX completion can happen right before
      setting TSQ_THROTTLED bit.
      
      This problem is particularly visible with recent auto corking,
      but might also be triggered with low tcp_limit_output_bytes
      values or NIC drivers delaying TX completion by hundreds of usec,
      and very low rtt.
      
      Thomas Glanzmann for example reported an iscsi regression, caused
      by tcp auto corking making this bug quite visible.
      
      Fixes: 46d3ceab ("tcp: TCP Small Queues")
      Signed-off-by: John Ogness <john.ogness@linutronix.de>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Thomas Glanzmann <thomas@glanzmann.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf06200e
    • David S. Miller's avatar
      Merge branch 'bridge' · 684bd2e1
      David S. Miller authored
      Toshiaki Makita says:
      
      ====================
      bridge: Fix corner case problems around local fdb entries
      
      There are so many corner cases that are not handled properly around local
      fdb entries.
      - We might fail to delete the old entry and might delete an arbitrary local
        entry when changing mac address of a bridge port.
      - We always fail to delete the old entry when changing mac address of the
        bridge device.
      - We might incorrectly delete a necessary entry when detaching a bridge port.
      - We might incorrectly delete a necessary entry when deleting a vlan.
      and so on.
      
      This is a patch series to fix these issues.
      
      v3:
      - Handle NTF_USE case in patch 1/9, commented by Vlad Yasevich.
      
      - Tested port detach/attach and didn't find any problem with patch 5/9,
        suggested by Stephen Hemminger.
      
      - Add comments about possible inconsistent state in current implementation
        into commit log of patch 5/9, found by the above test.
      
      - Reword unintensive changelog of patch 7/9, commented by Vlad Yasevich.
      
      v2:
      - Change the way to find the old entry in br_fdb_changeaddr() from memorizing
        previous port address to introducing a new flag indicating whether a fdb
        entry is added by user or not, commented by Stephen Hemminger.
      
      - Add a fix for the way to insert a new address in br_fdb_changeaddr().
      
      - Prevent creating an entry such that its dst is NULL in br_add_if() to
        preserve old behavior, commented by Vlad Yasevich.
      
      - Add more comments about slight behavior change, where the bridge device
        becomes able to receive traffic to an address it has during a short
        window, to changelogs, commented by Vlad Yasevich.
      
      - Add a fix for possible race in br_fdb_change_mac_address().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      684bd2e1
    • Toshiaki Makita's avatar
      bridge: Prevent possible race condition in br_fdb_change_mac_address · ac4c8868
      Toshiaki Makita authored
      br_fdb_change_mac_address() calls fdb_insert()/fdb_delete() without
      br->hash_lock.
      
      These hash list updates are racy with br_fdb_update()/br_fdb_cleanup().
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ac4c8868
    • Toshiaki Makita's avatar
      bridge: Properly check if local fdb entry can be deleted when deleting vlan · 424bb9c9
      Toshiaki Makita authored
      The vlan code unconditionally deletes local fdb entries.
      We should consider the possibility that other ports have the same
      address and vlan.
      
      Example of problematic case:
        ip link set eth0 address 12:34:56:78:90:ab
        ip link set eth1 address aa:bb:cc:dd:ee:ff
        brctl addif br0 eth0
        brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
        bridge vlan add dev eth0 vid 10
        bridge vlan add dev eth1 vid 10
        bridge vlan add dev br0 vid 10 self
      We will have fdb entry such that f->dst == eth0, f->vlan_id == 10 and
      f->addr == 12:34:56:78:90:ab at this time.
      Next, delete eth0 vlan 10.
        bridge vlan del dev eth0 vid 10
      In this case, we still need the entry for br0, but it will be deleted.
      
      Note that br0 needs the entry even though its mac address is not set
      manually. To delete the entry with proper condition checking,
      fdb_delete_local() is suitable to use.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      424bb9c9
    • Toshiaki Makita's avatar
      bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port · a778e6d1
      Toshiaki Makita authored
      br_fdb_delete_by_port() doesn't care about vlan and mac address of the
      bridge device.
      
      As the check is almost the same as mac address changing, slightly modify
      fdb_delete_local() and use it.
      
      Note that we can always set added_by_user to 0 in fdb_delete_local() because
      - br_fdb_delete_by_port() calls fdb_delete_local() for local entries
        regardless of its added_by_user. In this case, we have to check if another
        port has the same address and vlan, and if found, we have to create the
        entry (by changing dst). This is kernel-added entry, not user-added.
      - br_fdb_changeaddr() doesn't call fdb_delete_local() for user-added entry.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a778e6d1
    • Toshiaki Makita's avatar
      bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address · 960b589f
      Toshiaki Makita authored
      br_fdb_change_mac_address() doesn't check if the local entry has the
      same address as any of bridge ports.
      Although I'm not sure when it is beneficial, the current implementation
      allows the bridge device to receive any mac address of its ports.
      To preserve this behavior, we have to check if the mac address of the
      entry being deleted is identical to that of any port.
      
      As this check is almost the same as that in br_fdb_changeaddr(), create
      a common function fdb_delete_local() and call it from
      br_fdb_changeaddr() and br_fdb_change_mac_address().
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      960b589f
    • Toshiaki Makita's avatar
      bridge: Fix the way to check if a local fdb entry can be deleted · 2b292fb4
      Toshiaki Makita authored
      We should take into account the following when deleting a local fdb
      entry.
      
      - nbp_vlan_find() can be used only when vid != 0 to check if an entry is
        deletable, because a fdb entry with vid 0 can exist at any time while
        nbp_vlan_find() always returns false with vid 0.
      
        Example of problematic case:
          ip link set eth0 address 12:34:56:78:90:ab
          ip link set eth1 address 12:34:56:78:90:ab
          brctl addif br0 eth0
          brctl addif br0 eth1
          ip link set eth0 address aa:bb:cc:dd:ee:ff
        Then, the fdb entry 12:34:56:78:90:ab will be deleted even though the
        bridge port eth1 still has that address.
      
      - The port to which the bridge device is attached might need a local entry
        if its mac address is set manually.
      
        Example of problematic case:
          ip link set eth0 address 12:34:56:78:90:ab
          brctl addif br0 eth0
          ip link set br0 address 12:34:56:78:90:ab
          ip link set eth0 address aa:bb:cc:dd:ee:ff
        Then, the fdb still must have the entry 12:34:56:78:90:ab, but it will be
        deleted.
      
      We can use br->dev->addr_assign_type to check if the address is manually
      set or not, but I propose another approach.
      
      Since we delete and insert local entries whenever changing mac address
      of the bridge device, we can change dst of the entry to NULL regardless of
      addr_assign_type when deleting an entry associated with a certain port,
      and if it is found to be unnecessary later, then delete it.
      That is, if changing mac address of a port, the entry might be changed
      to its dst being NULL first, but is eventually deleted when recalculating
      and changing bridge id.
      
      This approach is especially useful when we want to share the code with
      deleting vlan in which the bridge device might want such an entry regardless
      of addr_assign_type, and makes things easy because we don't have to consider
      if mac address of the bridge device will be changed or not at the time we
      delete a local entry of a port, which means fdb code will not be bothered
      even if the bridge id calculating logic is changed in the future.
      
      Also, this change reduces inconsistent state, where frames whose dst is the
      mac address of the bridge, can't reach the bridge because of premature fdb
      entry deletion. This change reduces the possibility that the bridge device
      replies unreachable mac address to arp requests, which could occur during
      the short window between calling del_nbp() and br_stp_recalculate_bridge_id()
      in br_del_if(). This will take effect once br_fdb_delete_by_port() starts to
      use the same code in a following patch.
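
The deletion rule described in this changelog can be modelled with a small userspace sketch. Everything here is hypothetical (the `toy_` structures and `toy_fdb_delete_local` are stand-ins, not the bridge code). The sketch assumes the order implied by the text: first look for another port with the same address and vlan, then fall back to the bridge device by setting dst to NULL, and only delete when neither applies. A vid of 0 skips the vlan membership test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6
#define TOY_MAX_VLANS 4

struct toy_port {
	uint8_t addr[TOY_ETH_ALEN];
	uint16_t vlans[TOY_MAX_VLANS];
	size_t nvlans;
};

struct toy_bridge {
	uint8_t addr[TOY_ETH_ALEN];	/* MAC of the bridge device itself */
	uint16_t vlans[TOY_MAX_VLANS];
	size_t nvlans;
	struct toy_port *ports;
	size_t nports;
};

struct toy_fdb_entry {
	uint8_t addr[TOY_ETH_ALEN];
	uint16_t vid;
	const struct toy_port *dst;	/* NULL means the bridge device */
	bool deleted;
};

static bool toy_has_vlan(const uint16_t *vlans, size_t n, uint16_t vid)
{
	size_t i;

	if (vid == 0)			/* vid 0 entries can always exist */
		return true;
	for (i = 0; i < n; i++)
		if (vlans[i] == vid)
			return true;
	return false;
}

/* Drop port p's claim on entry f; keep f if anything else still needs it. */
void toy_fdb_delete_local(const struct toy_bridge *br,
			  const struct toy_port *p, struct toy_fdb_entry *f)
{
	size_t i;

	assert(br != NULL && f != NULL);
	/* Another port with the same address (and vlan) takes over. */
	for (i = 0; i < br->nports; i++) {
		const struct toy_port *op = &br->ports[i];

		if (op != p &&
		    memcmp(op->addr, f->addr, TOY_ETH_ALEN) == 0 &&
		    toy_has_vlan(op->vlans, op->nvlans, f->vid)) {
			f->dst = op;
			return;
		}
	}
	/* The bridge device itself may still use the address. */
	if (memcmp(br->addr, f->addr, TOY_ETH_ALEN) == 0 &&
	    toy_has_vlan(br->vlans, br->nvlans, f->vid)) {
		f->dst = NULL;
		return;
	}
	f->deleted = true;
}
```

In this model, changing the address of eth0 while eth1 still has the old address retargets the entry to eth1 instead of deleting it, and an address shared only with the bridge device leaves an entry whose dst is NULL.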
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2b292fb4
    • Toshiaki Makita's avatar
      bridge: Change local fdb entries whenever mac address of bridge device changes · a4b816d8
      Toshiaki Makita authored
      Vlan code may need fdb change when changing mac address of bridge device
      even if it is caused by the mac address changing of a bridge port.
      
      Example configuration:
        ip link set eth0 address 12:34:56:78:90:ab
        ip link set eth1 address aa:bb:cc:dd:ee:ff
        brctl addif br0 eth0
        brctl addif br0 eth1 # br0 will have mac address 12:34:56:78:90:ab
        bridge vlan add dev br0 vid 10 self
        bridge vlan add dev eth0 vid 10
      We will have fdb entry such that f->dst == NULL, f->vlan_id == 10 and
      f->addr == 12:34:56:78:90:ab at this time.
      Next, change the mac address of eth0 to a greater value.
        ip link set eth0 address ee:ff:12:34:56:78
      Then, mac address of br0 will be recalculated and set to aa:bb:cc:dd:ee:ff.
      However, an entry aa:bb:cc:dd:ee:ff will not be created and we will not be
      able to communicate using br0 on vlan 10.
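
The recalculation in this example can be illustrated with a short sketch. It assumes, as the example implies, that a bridge whose address was not set manually takes the numerically lowest MAC among its ports. `struct toy_mac` and `toy_lowest_mac` are hypothetical names for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6

struct toy_mac {
	uint8_t octet[TOY_ETH_ALEN];
};

/*
 * Pick the numerically lowest MAC among the ports, comparing byte by
 * byte from the first octet. Returns NULL when there are no ports.
 */
const struct toy_mac *toy_lowest_mac(const struct toy_mac *macs, size_t n)
{
	const struct toy_mac *best = NULL;
	size_t i;

	assert(macs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (best == NULL ||
		    memcmp(macs[i].octet, best->octet, TOY_ETH_ALEN) < 0)
			best = &macs[i];
	return best;
}
```

With eth0 at 12:34:56:78:90:ab and eth1 at aa:bb:cc:dd:ee:ff the result is eth0's address. After eth0 moves to ee:ff:12:34:56:78 the result becomes eth1's address, which is the point at which a matching local fdb entry for the bridge is needed.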
      
      Address this issue by deleting and adding local entries whenever
      changing the mac address of the bridge device.
      
      If there already exists an entry that has the same address, for example,
      in case that br_fdb_changeaddr() has already inserted it,
      br_fdb_change_mac_address() will simply fail to insert it and no
      duplicated entry will be made, as it was.
      
      This approach also needs br_add_if() to call br_fdb_insert() before
      br_stp_recalculate_bridge_id() so that we don't create an entry whose
      dst == NULL in this function to preserve previous behavior.
      
      Note that this is a slight change in behavior where the bridge device can
      receive the traffic to the new address before calling
      br_stp_recalculate_bridge_id() in br_add_if().
      However, it is not a problem because we already have the address on the
      new port, and the same approach of inserting the new entry before
      recalculating the bridge id is taken in br_device_event() as well.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Toshiaki Makita's avatar
      bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address · a3ebb7ef
      Toshiaki Makita authored
      We have always failed to delete the old entry in
      br_fdb_change_mac_address(), because br_set_mac_address() updates
      dev->dev_addr before calling br_fdb_change_mac_address(), and
      br_fdb_change_mac_address() uses dev->dev_addr to find the old entry.
      
      That update of dev_addr is completely unnecessary, because the same work
      is done in br_stp_change_bridge_id(), which is called right after
      br_fdb_change_mac_address().
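The ordering problem can be illustrated with a small model. This is only a sketch under simplifying assumptions: `toy_bridge` keeps a single local entry slot, and `toy_change_mac_address()`, `toy_set_mac_buggy()` and `toy_set_mac_fixed()` are hypothetical stand-ins, not the kernel functions.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/* Hypothetical toy bridge with one local fdb slot. */
struct toy_bridge {
    unsigned char dev_addr[TOY_ETH_ALEN];    /* current device address */
    unsigned char local_entry[TOY_ETH_ALEN]; /* the single local entry */
    bool has_local_entry;
};

/* Finds the old entry via br->dev_addr, deletes it, then stores the new
 * address if the slot is free. Returns true if the old entry was found. */
static bool toy_change_mac_address(struct toy_bridge *br,
                                   const unsigned char *new_addr)
{
    bool deleted = false;

    if (br->has_local_entry &&
        memcmp(br->local_entry, br->dev_addr, TOY_ETH_ALEN) == 0) {
        br->has_local_entry = false;
        deleted = true;
    }
    if (!br->has_local_entry) {
        memcpy(br->local_entry, new_addr, TOY_ETH_ALEN);
        br->has_local_entry = true;
    }
    return deleted;
}

/* Old order: dev_addr is overwritten first, so the lookup key is already
 * the new address and the stale entry is never matched. */
static bool toy_set_mac_buggy(struct toy_bridge *br,
                              const unsigned char *new_addr)
{
    memcpy(br->dev_addr, new_addr, TOY_ETH_ALEN);
    return toy_change_mac_address(br, new_addr);
}

/* Fixed order: look up with the old dev_addr, then update it (the update
 * here stands in for the later bridge id change). */
static bool toy_set_mac_fixed(struct toy_bridge *br,
                              const unsigned char *new_addr)
{
    bool deleted = toy_change_mac_address(br, new_addr);

    memcpy(br->dev_addr, new_addr, TOY_ETH_ALEN);
    return deleted;
}
```

Starting from a bridge whose `dev_addr` and `local_entry` both hold the old address, the buggy order reports no deletion and leaves the stale entry in place, while the fixed order replaces it with the new address.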
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Toshiaki Makita's avatar
      bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr · 2836882f
      Toshiaki Makita authored
      Since commit bc9a25d2 ("bridge: Add vlan support for local fdb entries"),
      br_fdb_changeaddr() has inserted a new local fdb entry only if it can
      find the old one. But if two ports have the same address, or the user
      has deleted a local entry, there will be no entry for one of the
      ports.
      
      Example of problematic case:
        ip link set eth0 address aa:bb:cc:dd:ee:ff
        ip link set eth1 address aa:bb:cc:dd:ee:ff
        brctl addif br0 eth0
        brctl addif br0 eth1 # eth1 will not have a local entry due to dup.
        ip link set eth1 address 12:34:56:78:90:ab
      Then, the new entry for the address 12:34:56:78:90:ab will not be
      created, and the bridge device will not be able to communicate.
      
      Insert new entries regardless of whether we can find old entries or not.
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Toshiaki Makita's avatar
      bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr · a5642ab4
      Toshiaki Makita authored
      br_fdb_changeaddr() assumes that there is at most one local entry per port
      per vlan. That used to be true, but since commit 36fd2b63 ("bridge: allow
      creating/deleting fdb entries via netlink") it no longer holds.
      Therefore, if the user has added local entries manually, the function
      might fail to find the correct previous address and delete an arbitrary
      local entry instead.
      
      Example of problematic case:
        ip link set eth0 address ee:ff:12:34:56:78
        brctl addif br0 eth0
        bridge fdb add 12:34:56:78:90:ab dev eth0 master
        ip link set eth0 address aa:bb:cc:dd:ee:ff
      Then, the address 12:34:56:78:90:ab might be deleted instead of
      ee:ff:12:34:56:78, the original mac address of eth0.
      
      Address this issue by introducing a new flag, added_by_user, to struct
      net_bridge_fdb_entry.
      
      Note that br_fdb_delete_by_port() has to set added_by_user to 0 in cases
      like:
        ip link set eth0 address 12:34:56:78:90:ab
        ip link set eth1 address aa:bb:cc:dd:ee:ff
        brctl addif br0 eth0
        bridge fdb add aa:bb:cc:dd:ee:ff dev eth0 master
        brctl addif br0 eth1
        brctl delif br0 eth0
      In this case, the kernel should delete the user-added entry
      aa:bb:cc:dd:ee:ff, but that entry would also have been added by
      "brctl addif br0 eth1" originally, so we don't delete it and instead
      treat it as a new kernel-created entry.
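How an added_by_user flag lets the lookup skip manually added entries can be sketched as follows. This is a toy model under stated assumptions: the struct layout and the helper `toy_find_old_local()` are hypothetical, and only the flag name comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ETH_ALEN 6

/* Hypothetical toy entry carrying the flag described in the commit. */
struct toy_fdb_entry {
    unsigned char addr[TOY_ETH_ALEN];
    int port;
    bool is_local;
    bool added_by_user;
};

/* Return the kernel-created local entry for a port, ignoring any local
 * entries the user added by hand; NULL if there is none. */
static struct toy_fdb_entry *toy_find_old_local(struct toy_fdb_entry *tbl,
                                                size_t n, int port)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].port == port && tbl[i].is_local &&
            !tbl[i].added_by_user)
            return &tbl[i];
    }
    return NULL;
}
```

With a table that lists the user-added 12:34:56:78:90:ab before the original ee:ff:12:34:56:78 on the same port, the lookup returns the latter, so a subsequent address change would remove the port's real old address rather than whichever local entry happens to come first.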
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · cbf2822a
      Linus Torvalds authored
      Pull CIFS fixes from Steve French:
       "Small fix from Jeff for writepages leak, and some fixes for ACLs and
        xattrs when SMB2 enabled.
      
        Am expecting another fix from Jeff and at least one more fix (for
        mounting SMB2 with cifsacl) in the next week"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        [CIFS] clean up page array when uncached write send fails
        cifs: use a flexarray in cifs_writedata
        retrieving CIFS ACLs when mounted with SMB2 fails dropping session
        Add protocol specific operation for CIFS xattrs
    • Jesper Juhl's avatar
      tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min · b10bd54c
      Jesper Juhl authored
      As far as I can tell we have used a default of 60 seconds for
      FIN_WAIT2 timeout for ages (since 2.x times??).
      
      In any case, the timeout these days is 60 seconds, so the 3 min
      comment is wrong (and cost me a few minutes of my life when I was
      debugging a FIN_WAIT2 related problem in a userspace application and
      checked the kernel source for details).
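For anyone who wants to check the value on a running system rather than in the source, here is a minimal sketch. It assumes a Linux host that exposes the sysctl as /proc/sys/net/ipv4/tcp_fin_timeout; the helper name `read_tcp_fin_timeout()` is hypothetical, and the value an administrator has configured may differ from the 60 second default discussed above.

```c
#include <stdio.h>

/* Read a single integer (seconds) from the given sysctl file.
 * Returns -1 if the file cannot be opened or does not parse. */
static int read_tcp_fin_timeout(const char *path)
{
    FILE *fp = fopen(path, "r");
    int secs = -1;

    if (!fp)
        return -1;
    if (fscanf(fp, "%d", &secs) != 1)
        secs = -1;
    fclose(fp);
    return secs;
}
```

Calling `read_tcp_fin_timeout("/proc/sys/net/ipv4/tcp_fin_timeout")` yields the currently configured timeout in seconds, or -1 where the file is unavailable.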
      Signed-off-by: Jesper Juhl <jj@chaosbits.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>