Commits · a5a6579db33af91f4f5134e14be758dc71c1b694 · Kirill Smelkov / linux

13 Mar, 2015 12 commits

mm: reorder can_do_mlock to fix audit denial · a5a6579d

Jeff Vander Stoep authored Mar 12, 2015

A userspace call to mmap(MAP_LOCKED) may result in the successful locking
of memory while also producing a confusing audit log denial.  can_do_mlock
checks capable and rlimit.  If either of these return positive
can_do_mlock returns true.  The capable check leads to an LSM hook used by
apparmour and selinux which produce the audit denial.  Reordering so
rlimit is checked first eliminates the denial on success, only recording a
denial when the lock is unsuccessful as a result of the denial.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Paul Cassella <cassella@cray.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a5a6579d

kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h> · d3733e5c

Andrey Ryabinin authored Mar 12, 2015

include/linux/moduleloader.h is more suitable place for this macro.
Also change alignment to PAGE_SIZE for CONFIG_KASAN=n as such
alignment already assumed in several places.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d3733e5c

kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8

Andrey Ryabinin authored Mar 12, 2015

Current approach in handling shadow memory for modules is broken.

Shadow memory could be freed only after memory shadow corresponds it is no
longer used.  vfree() called from interrupt context could use memory its
freeing to store 'struct llist_node' in it:

    void vfree(const void *addr)
    {
    ...
        if (unlikely(in_interrupt())) {
            struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
            if (llist_add((struct llist_node *)addr, &p->list))
                    schedule_work(&p->wq);

Later this list node used in free_work() which actually frees memory.
Currently module_memfree() called in interrupt context will free shadow
before freeing module's memory which could provoke kernel crash.

So shadow memory should be freed after module's memory.  However, such
deallocation order could race with kasan_module_alloc() in module_alloc().

Free shadow right before releasing vm area.  At this point vfree()'d
memory is not used anymore and yet not available for other allocations.
New VM_KASAN flag used to indicate that vm area has dynamically allocated
shadow memory so kasan frees shadow only if it was previously allocated.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a5af5aa8

fanotify: fix event filtering with FAN_ONDIR set · b3c1030d

Suzuki K. Poulose authored Mar 12, 2015

With FAN_ONDIR set, the user can end up getting events, which it hasn't
marked.  This was revealed with fanotify04 testcase failure on
Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0
("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").

   # /opt/ltp/testcases/bin/fanotify04
   [ ... ]
  fanotify04    7  TPASS  :  event generated properly for type 100000
  fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
  fanotify04    9  TPASS  :  No event as expected

The testcase sets the adds the following marks : FAN_OPEN | FAN_ONDIR for
a fanotify on a dir.  Then does an open(), followed by close() of the
directory and expects to see an event FAN_OPEN(0x20).  However, the
fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)).  This happens due to
the flaw in the check for event_mask in fanotify_should_send_event() which
does:

	if (event_mask & marks_mask & ~marks_ignored_mask)
		return true;

where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
       marks_mask == (FAN_ONDIR | FAN_OPEN),
       marks_ignored_mask == 0

Fix this by masking the outgoing events to the user, as we already take
care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b3c1030d

mm/nommu.c: export symbol max_mapnr · 5b8bf307

gchen gchen authored Mar 12, 2015

Several modules may need max_mapnr, so export, the related error with
allmodconfig under c6x:

  MODPOST 3327 modules
  ERROR: "max_mapnr" [fs/pstore/ramoops.ko] undefined!
  ERROR: "max_mapnr" [drivers/media/v4l2-core/videobuf2-dma-contig.ko] undefined!
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5b8bf307

arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU · 65b9ab88

Chen Gang authored Mar 12, 2015

When !MMU, asm-generic will not define default pgprot_writecombine, so c6x
needs to define it by itself.  The related error:

    CC [M]  fs/pstore/ram_core.o
  fs/pstore/ram_core.c: In function 'persistent_ram_vmap':
  fs/pstore/ram_core.c:399:10: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
     prot = pgprot_writecombine(PAGE_KERNEL);
            ^
  fs/pstore/ram_core.c:399:8: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
     prot = pgprot_writecombine(PAGE_KERNEL);
          ^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

65b9ab88

nilfs2: fix deadlock of segment constructor during recovery · 283ee148

Ryusuke Konishi authored Mar 12, 2015

According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time.  The code path that caused the deadlock was
as follows:

  nilfs_fill_super()
    load_nilfs()
      nilfs_salvage_orphan_logs()
        * Do roll-forwarding, attach segment constructor for recovery,
          and kick it.

        nilfs_segctor_thread()
          nilfs_segctor_thread_construct()
           * A lock is held with nilfs_transaction_lock()
             nilfs_segctor_do_construct()
               nilfs_segctor_drop_written_files()
                 iput()
                   iput_final()
                     write_inode_now()
                       writeback_single_inode()
                         __writeback_single_inode()
                           do_writepages()
                             nilfs_writepage()
                               nilfs_construct_dsync_segment()
                                 nilfs_transaction_lock() --> deadlock

This can happen if commit 7ef3ff2f ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time.  The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that.  For
instance, we can reproduce the issue with the following steps:

 < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
 # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
 # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
 count=1 && reboot -nfh
 < the system will immediately reboot >
 # mount -t nilfs2 /dev/sdb1 /nilfs

The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags.  The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.

This fixes the another deadlock by deferring iput() in segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

283ee148

mm: cma: fix CMA aligned offset calculation · 850fc430

Danesh Petigara authored Mar 12, 2015

The CMA aligned offset calculation is incorrect for non-zero order_per_bit
values.

For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
align_order=12, the function returns a value of 0x17c00 instead of 0x400.

This patch fixes the CMA aligned offset calculation.

The previous calculation was wrong and would return too-large values for
the offset, so that when cma_alloc looks for free pages in the bitmap with
the requested alignment > order_per_bit, it starts too far into the bitmap
and so CMA allocations will fail despite there actually being plenty of
free pages remaining.  It will also probably have the wrong alignment.
With this change, we will get the correct offset into the bitmap.

One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.

[gregory.0xf0@gmail.com: changelog additions]
Signed-off-by: Danesh Petigara <dpetigara@broadcom.com>
Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

850fc430

mm, hugetlb: close race when setting PageTail for gigantic pages · 44fc8057

David Rientjes authored Mar 12, 2015

Now that gigantic pages are dynamically allocatable, care must be taken to
ensure that p->first_page is valid before setting PageTail.

If this isn't done, then it is possible to race and have compound_head()
return NULL.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

44fc8057

mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled · e009d5dc

Michal Hocko authored Mar 12, 2015

Tetsuo Handa has pointed out that __GFP_NOFAIL allocations might fail
after OOM killer is disabled if the allocation is performed by a kernel
thread.  This behavior was introduced from the very beginning by
7f33d49a ("mm, PM/Freezer: Disable OOM killer when tasks are frozen").
 This means that the basic contract for the allocation request is broken
and the context requesting such an allocation might blow up unexpectedly.

There are basically two ways forward.

1) move oom_killer_disable after kernel threads are frozen.  This has a
   risk that the OOM victim wouldn't be able to finish because it would
   depend on an already frozen kernel thread.  This would be really tricky
   to debug.

2) do not fail GFP_NOFAIL allocation no matter what and risk a
   potential Freezable kernel threads will loop and fail the suspend.
   Incidental allocations after kernel threads are frozen will at least
   dump a warning - if we are lucky and the serial console is still active
   of course...

This patch implements the later option because it is safer.  We would see
warning rather than allocation failures for the kernel threads which would
blow up otherwise and have a higher chances to identify __GFP_NOFAIL users
from deeper pm code.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@gooogle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e009d5dc

drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data · 8792f777

Javier Martinez Canillas authored Mar 12, 2015

Commit df9e26d0 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
added an "rtc_src" DT property to specify the clock used as a source to
the S3C real-time clock.

Not all SoCs needs this so commit eaf3a659 ("drivers/rtc/rtc-s3c.c:
fix initialization failure without rtc source clock") changed to check
the struct s3c_rtc_data .needs_src_clk to conditionally grab the clock.

But that commit didn't update the data for each IP version so the RTC
broke on the boards that needs a source clock. This is the case of at
least Exynos5250 and Exynos5440 which uses the s3c6410 RTC IP block.

This commit fixes the S3C rtc on the Exynos5250 Snow and Exynos5420
Peach Pit and Pi Chromebooks.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Tyler Baker <tyler.baker@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8792f777

ocfs2: make append_dio an incompat feature · 18d585f0

Mark Fasheh authored Mar 12, 2015

It turns out that making this feature ro_compat isn't quite enough to
prevent accidental corruption on mount from older kernels.  Ocfs2 (like
other file systems) will process orphaned inodes even when the user mounts
in 'ro' mode.  So for the case of a filesystem not knowing the append_dio
feature, mounting the filesystem could result in orphaned-for-dio files
being deleted, which we clearly don't want.

So instead, turn this into an incompat flag.

Btw, this is kind of my fault - initially I asked that we add a flag to
cover the feature and even suggested that we use an ro flag.  It wasn't
until I was looking through our commits for v4.0-rc1 that I realized we
actually want this to be incompat.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18d585f0

12 Mar, 2015 7 commits

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 09d35919

Linus Torvalds authored Mar 12, 2015

Pull i2c fix from Wolfram Sang:
 "An important bugfix for the I2C subsystem core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: core: Dispose OF IRQ mapping at client removal time"

09d35919

Merge tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 91e9134e

Linus Torvalds authored Mar 12, 2015

Pull PCI fixes from Bjorn Helgaas:
 "Here are a couple updates for v4.0.

  One fixes a config accessor problem on APM X-Gene that we introduced
  when switching to generic config accessors, and the other fixes an
  older read-past-end-of-buffer problem in sysfs.

  APM X-Gene host bridge driver
    - Add register offset to config space base address (Feng Kan)

  Miscellaneous
    - Don't read past the end of sysfs "driver_override" buffer (Sasha Levin)"

* tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: xgene: Add register offset to config space base address
  PCI: Don't read past the end of sysfs "driver_override" buffer

91e9134e

Merge tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze · d3dd73fc

Linus Torvalds authored Mar 12, 2015

Pull arch/microblaze fixes from Michal Simek:
 "Fix syscall error recovery.

  Two patches - one is just preparation patch for the second which is
  fixing the problem with syscalls"

* tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Fix syscall error recovery for invalid syscall IDs
  microblaze: Coding style cleanup

d3dd73fc

Merge tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next · 56275112

Linus Torvalds authored Mar 12, 2015

Pull arch/nios2 fix from Ley Foon Tan:
 "Remove pt_regs from user header and use generic ucontext.h"

* tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next:
  nios2: update pt_regs

56275112

mm: fix up numa read-only thread grouping logic · 53da3bc2

Linus Torvalds authored Mar 12, 2015

Dave Chinner reported that commit 4d942466 ("mm: convert
p[te|md]_mknonnuma and remaining page table manipulations") slowed down
his xfsrepair test enormously.  In particular, it was using more system
time due to extra TLB flushing.

The ultimate reason turns out to be how the change to use the regular
page table accessor functions broke the NUMA grouping logic.  The old
special mknuma/mknonnuma code accessed the page table present bit and
the magic NUMA bit directly, while the new code just changes the page
protections using PROT_NONE and the regular vma protections.

That sounds equivalent, and from a fault standpoint it really is, but a
subtle side effect is that the *other* protection bits of the page table
entries also change.  And the code to decide how to group the NUMA
entries together used the writable bit to decide whether a particular
page was likely to be shared read-only or not.

And with the change to make the NUMA handling use the regular permission
setting functions, that writable bit was basically always cleared for
private mappings due to COW.  So even if the page actually ends up being
written to in the end, the NUMA balancing would act as if it was always
shared RO.

This code is a heuristic anyway, so the fix - at least for now - is to
instead check whether the page is dirty rather than writable.  The bit
doesn't change with protection changes.

NOTE! This also adds a FIXME comment to revisit this issue,

Not only should we probably re-visit the whole "is this a shared
read-only page" heuristic (we might want to take the vma permissions
into account and base this more on those than the per-page ones, and
also look at whether the particular access that triggers it is a write
or not), but the whole COW issue shows that we should think about the
NUMA fault handling some more.

For example, maybe we should do the early-COW thing that a regular fault
does.  Or maybe we should accept that while using the same bits as
PROTNONE was a good thing (and got rid of the specual NUMA bit), we
might still want to just preseve the other protection bits across NUMA
faulting.

Those are bigger questions, left for later.  This just fixes up the
heuristic so that it at least approximates working again.  More analysis
and work needed.
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

53da3bc2

Revert "i2c: core: Dispose OF IRQ mapping at client removal time" · a4944572

Jakub Kicinski authored Mar 11, 2015

This reverts commit e4df3a0b
("i2c: core: Dispose OF IRQ mapping at client removal time")

Calling irq_dispose_mapping() will destroy the mapping and disassociate
the IRQ from the IRQ chip to which it belongs. Keeping it is OK, because
existent mappings are reused properly.

Also, this commit breaks drivers using devm* for IRQ management on
OF-based systems because devm* cleanup happens in device code, after
bus's remove() method returns.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Reported-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[wsa: updated the commit message with findings fromt the other bug report]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Fixes: e4df3a0b

a4944572

nios2: update pt_regs · 92d5dd8c

Chung-Ling Tang authored Mar 12, 2015

Remove struct pt_regs from user header and use generic ucontext.h.
Signed-off-by: Chung-Ling Tang <cltang@codesourcery.com>
Acked-by: Ley Foon Tan <lftan@altera.com>

92d5dd8c

11 Mar, 2015 2 commits

Merge tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd · cca28a5f

Linus Torvalds authored Mar 10, 2015

Pull MTD fixes from Brian Norris:

 * pxa3xx_nand
   - fix timeout issues when draining the FIFO (BCH only)
   - don't crash when no chip-selects are used

 * hisi504_nand
   - depend on HAS_DMA, to fix compile errors

* tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd:
  mtd: nand: MTD_NAND_HISI504 should depend on HAS_DMA
  mtd: pxa3xx_nand: fix driver when num_cs is 0
  mtd: nand: pxa3xx: Fix PIO FIFO draining

cca28a5f

Merge tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 9c3e1323

Linus Torvalds authored Mar 10, 2015

Pull iommu fixes from Joerg Roedel:
 "The patches contain:

   - fix multiple ARM IOMMU drivers to behave well when the hardware is
     not present

   - mark MSM driver as broken

   - fix build errors with the new ARM generic io-page-table code"

* tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/io-pgtable-arm: Add built time dependency
  iommu/msm: Mark driver BROKEN
  iommu/rockchip: Play nice in multi-platform builds
  iommu/omap: Play nice in multi-platform builds
  iommu/exynos: Play nice in multi-platform builds
  iommu/io-pgtable-arm: Fix self-test WARNs on i386

9c3e1323

10 Mar, 2015 12 commits

Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · affb8172

Linus Torvalds authored Mar 09, 2015

Pull kvm/s390 bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: non-LPAR case obsolete during facilities mask init
  KVM: s390: include guest facilities in kvm facility test
  KVM: s390: fix in memory copy of facility lists
  KVM: s390/cpacf: Fix kernel bug under z/VM
  KVM: s390/cpacf: Enable key wrapping by default

affb8172

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ec0e6bd3

Linus Torvalds authored Mar 09, 2015

Pull s390 fixes from Martin Schwidefsky:
 "One performance optimization for page_clear and a couple of bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix incorrect ASCE after crst_table_downgrade
  s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
  s390/pci: unify pci_iomap symbol exports
  s390/pci: fix [un]map_resources sequence
  s390: let the compiler do page clearing
  s390/pci: fix possible information leak in mmio syscall
  s390/dcss: array index 'i' is used before limits check.
  s390/scm_block: fix off by one during cluster reservation
  s390/jump label: improve and fix sanity check
  s390/jump label: add missing jump_label_apply_nops() call

ec0e6bd3

Merge tag 'trace-fixes-v4.0-rc2-2' of... · e7901af1

Linus Torvalds authored Mar 09, 2015

Merge tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull seq-buf/ftrace fixes from Steven Rostedt:
 "This includes fixes for seq_buf_bprintf() truncation issue.  It also
  contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and
  function tracing are started.  Doing the following causes some issues:

    # echo 0 > /proc/sys/kernel/ftrace_enabled
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer
    # echo 1 > /proc/sys/kernel/ftrace_enabled
    # echo nop > /sys/kernel/debug/tracing/current_tracer
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer

  As well as with function tracing too.  Pratyush Anand first reported
  this issue to me and supplied a patch.  When I tested this on my x86
  test box, it caused thousands of backtraces and warnings to appear in
  dmesg, which also caused a denial of service (a warning for every
  function that was listed).  I applied Pratyush's patch but it did not
  fix the issue for me.  I looked into it and found a slight problem
  with trampoline accounting.  I fixed it and sent Pratyush a patch, but
  he said that it did not fix the issue for him.

  I later learned tha Pratyush was using an ARM64 server, and when I
  tested on my ARM board, I was able to reproduce the same issue as
  Pratyush.  After applying his patch, it fixed the problem.  The above
  test uncovered two different bugs, one in x86 and one in ARM and
  ARM64.  As this looked like it would affect PowerPC, I tested it on my
  PPC64 box.  It too broke, but neither the patch that fixed ARM or x86
  fixed this box (the changes were all in generic code!).  The above
  test, uncovered two more bugs that affected PowerPC.  Again, the
  changes were only done to generic code.  It's the way the arch code
  expected things to be done that was different between the archs.  Some
  where more sensitive than others.

  The rest of this series fixes the PPC bugs as well"

* tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled
  ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl
  ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl
  seq_buf: Fix seq_buf_bprintf() truncation
  seq_buf: Fix seq_buf_vprintf() truncation

e7901af1

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 36bef883

Linus Torvalds authored Mar 09, 2015

Pull networking fixes from David Miller:

 1) nft_compat accidently truncates ethernet protocol to 8-bits, from
    Arturo Borrero.

 2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov.

 3) Don't allow the space required for nftables rules to exceed the
    maximum value representable in the dlen field.  From Patrick
    McHardy.

 4) bcm63xx_enet can accidently leave interrupts permanently disabled
    due to errors in the NAPI polling exit logic.  Fix from Nicolas
    Schichan.

 5) Fix OOPSes triggerable by the ping protocol module, due to missing
    address family validations etc.  From Lorenzo Colitti.

 6) Don't use RCU locking in sleepable context in team driver, from Jiri
    Pirko.

 7) xen-netback miscalculates statistic offset pointers when reporting
    the stats to userspace.  From David Vrabel.

 8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also
    from David Vrabel.

 9) ip_check_defrag() cannot assume that skb_network_offset(),
    particularly when it is used by the AF_PACKET fanout defrag code.
    From Alexander Drozdov.

10) gianfar driver doesn't query OF node names properly when trying to
    determine the number of hw queues available.  Fix it to explicitly
    check for OF nodes named queue-group.  From Tobias Waldekranz.

11) MID field in macb driver should be 12 bits, not 16.  From Punnaiah
    Choudary Kalluri.

12) Fix unintentional regression in traceroute due to timestamp socket
    option changes.  Empty ICMP payloads should be allowed in
    non-timestamp cases.  From Willem de Bruijn.

13) When devices are unregistered, we have to get rid of AF_PACKET
    multicast list entries that point to it via ifindex.  Fix from
    Francesco Ruggeri.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  tipc: fix bug in link failover handling
  net: delete stale packet_mclist entries
  net: macb: constify macb configuration data
  MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer
  MAINTAINERS: linux-can moved to github
  can: kvaser_usb: Read all messages in a bulk-in URB buffer
  can: kvaser_usb: Avoid double free on URB submission failures
  can: peak_usb: fix missing ctrlmode_ init for every dev
  can: add missing initialisations in CAN related skbuffs
  ip: fix error queue empty skb handling
  bgmac: Clean warning messages
  tcp: align tcp_xmit_size_goal() on tcp_tso_autosize()
  net: fec: fix unbalanced clk disable on driver unbind
  net: macb: Correct the MID field length value
  net: gianfar: correctly determine the number of queue groups
  ipv4: ip_check_defrag should not assume that skb_network_offset is zero
  net: bcmgenet: properly disable password matching
  net: eth: xgene: fix booting with devicetree
  bnx2x: Force fundamental reset for EEH recovery
  xen-netback: refactor xenvif_handle_frag_list()
  ...

36bef883

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · e93df634

Linus Torvalds authored Mar 09, 2015

Pull input subsystem fixes from Dmitry Torokhov:
 "Miscellaneous driver fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: psmouse - disable "palm detection" in the focaltech driver
  Input: psmouse - disable changing resolution/rate/scale for FocalTech
  Input: psmouse - ensure that focaltech reports consistent coordinates
  Input: psmouse - remove hardcoded touchpad size from the focaltech driver
  Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request
  Input: ALPS - fix memory leak when detection fails
  Input: sun4i-ts - add thermal driver dependency
  Input: cyapa - remove superfluous type check in cyapa_gen5_read_idac_data()
  Input: cyapa - fix unaligned functions redefinition error
  Input: mma8450 - add parent device

e93df634

Merge tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 068c65c5

Linus Torvalds authored Mar 09, 2015

Pull regulator fixes from Mark Brown:
 "A couple of driver specific fixes plus a fix for a regression in the
  core where the updates to use sysfs group registration were overly
  enthusiastic in eliding properties and removed some that had been
  previously present"

* tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix regression due to NULL constraints check
  regulator: rk808: Set the enable time for LDOs
  regulator: da9210: Mask all interrupt sources to deassert interrupt line

068c65c5

Merge tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · d08edd8f

Linus Torvalds authored Mar 09, 2015

Pull spi fixes from Mark Brown:
 "A collection of driver specific fixes to which the usual comments
  about them being important if you see them mostly apply (except for
  the comment fix).  The pl022 one is particularly nasty for anyone
  affected by it"

* tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: pl022: Fix race in giveback() leading to driver lock-up
  spi: dw-mid: avoid potential NULL dereference
  spi: img-spfi: Verify max spfi transfer length
  spi: fix a typo in comment.
  spi: atmel: Fix interrupt setup for PDC transfers
  spi: dw: revisit FIFO size detection again
  spi: dw-pci: correct number of chip selects
  drivers: spi: ti-qspi: wait for busy bit clear before data write/read

d08edd8f

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · eca8dac4

Linus Torvalds authored Mar 09, 2015

Pull tpm fixes from James Morris:
 "fixes for the TPM driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  tpm: fix call order in tpm-chip.c
  tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send

eca8dac4

Merge tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux · ecddad64

Linus Torvalds authored Mar 09, 2015

Pull fbdev fixes from Tomi Valkeinen:
 - Fix regression in with omapdss when using i2c displays
 - Fix possible null deref in fbmon
 - Check kalloc return value in AMBA CLCD

* tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  OMAPDSS: fix regression with display sysfs files
  video: fbdev: fix possible null dereference
  video: ARM CLCD: Add missing error check for devm_kzalloc

ecddad64

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · c0e99a71

Linus Torvalds authored Mar 09, 2015

Pull cgroup fixes from Tejun Heo:
 "The cgroup iteration update two years ago and the recent cpuset
  restructuring introduced regressions in subset of cpuset
  configurations.  Three patches to fix them.

  All are marked for -stable"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: Fix cpuset sched_relax_domain_level
  cpuset: fix a warning when clearing configured masks in old hierarchy
  cpuset: initialize effective masks when clone_children is enabled

c0e99a71

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · f930713b

Linus Torvalds authored Mar 09, 2015

Pull libata fixlet from Tejun Heo:
 "Speed limiting fix for sata_fsl"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  sata-fsl: Apply link speed limits

f930713b

Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · b695f31f

Linus Torvalds authored Mar 09, 2015

Pull workqueue fix from Tejun Heo:
 "One fix patch for a subtle livelock condition which can happen on
  PREEMPT_NONE kernels involving two racing cancel_work calls.  Whoever
  comes in the second has to wait for the previous one to finish.  This
  was implemented by making the later one block for the same condition
  that the former would be (work item completion) and then loop and
  retest; unfortunately, depending on the wake up order, the later one
  could lock out the former one to finish by busy looping on the cpu.

  This is fixed by implementing explicit wait mechanism.  Work item
  might not belong anywhere at this point and there's remote possibility
  of thundering herd problem.  I originally tried to use bit_waitqueue
  but it didn't work for static work items on modules.  It's currently
  using single wait queue with filtering wake up function and exclusive
  wakeup.  If this ever becomes a problem, which is not very likely, we
  can try to figure out a way to piggy back on bit_waitqueue"

* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE

b695f31f

09 Mar, 2015 7 commits

tipc: fix bug in link failover handling · e6441bae

Jon Paul Maloy authored Mar 09, 2015

In commit c637c103
("tipc: resolve race problem at unicast message reception") we
introduced a new mechanism for delivering buffers upwards from link
to socket layer.

That code contains a bug in how we handle the new link input queue
during failover. When a link is reset, some of its users may be blocked
because of congestion, and in order to resolve this, we add any pending
wakeup pseudo messages to the link's input queue, and deliver them to
the socket. This misses the case where the other, remaining link also
may have congested users. Currently, the owner node's reference to the
remaining link's input queue is unconditionally overwritten by the
reset link's input queue. This has the effect that wakeup events from
the remaining link may be unduely delayed (but not lost) for a
potentially long period.

We fix this by adding the pending events from the reset link to the
input queue that is currently referenced by the node, whichever one
it is.

This commit should be applied to both net and net-next.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e6441bae

net: delete stale packet_mclist entries · 82f17091

Francesco Ruggeri authored Mar 09, 2015

When an interface is deleted from a net namespace the ifindex in the
corresponding entries in PF_PACKET sockets' mclists becomes stale.
This can create inconsistencies if later an interface with the same ifindex
is moved from a different namespace (not that unlikely since ifindexes are
per-namespace).
In particular we saw problems with dev->promiscuity, resulting
in "promiscuity touches roof, set promiscuity failed. promiscuity
feature of device might be broken" warnings and EOVERFLOW failures of
setsockopt(PACKET_ADD_MEMBERSHIP).
This patch deletes the mclist entries for interfaces that are deleted.
Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with
EADDRNOTAVAIL if called after the interface is deleted, also make
packet_mc_drop not fail.
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

82f17091

net: macb: constify macb configuration data · 0b2eb3e9

Josh Cartwright authored Mar 09, 2015

The configurations are not modified by the driver.  Make them 'const' so
that they may be placed in a read-only section.
Signed-off-by: Josh Cartwright <joshc@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b2eb3e9

Merge tag 'linux-can-fixes-for-4.0-20150309' of... · d0372504

David S. Miller authored Mar 09, 2015

Merge tag 'linux-can-fixes-for-4.0-20150309' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-03-09

this is a pull request for net/master for the 4.0 release cycle, it consists of
6 patches:

A patch by Oliver Hartkopp fixes a long outstanding bug in the infrastructure,
which leads to skb_under_panics when CAN interfaces are used by AF_PACKET
sockets e.g. by dhclient. Stephane Grosjean contributes a patch for the
peak_usb driver which adds a missing initialization. Two patches by Ahmed S.
Darwish fix problems in the kvaser_usb driver. Followed by two patches by
myself, updating the MAINTAINERS file
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d0372504

ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled · 524a3868

Steven Rostedt (Red Hat) authored Mar 06, 2015

Some archs (specifically PowerPC), are sensitive with the ordering of
the enabling of the calls to function tracing and setting of the
function to use to be traced.

That is, update_ftrace_function() sets what function the ftrace_caller
trampoline should call. Some archs require this to be set before
calling ftrace_run_update_code().

Another bug was discovered, that ftrace_startup_sysctl() called
ftrace_run_update_code() directly. If the function the ftrace_caller
trampoline changes, then it will not be updated. Instead a call
to ftrace_startup_enable() should be called because it tests to see
if the callback changed since the code was disabled, and will
tell the arch to update appropriately. Most archs do not need this
notification, but PowerPC does.

The problem could be seen by the following commands:

 # echo 0 > /proc/sys/kernel/ftrace_enabled
 # echo function > /sys/kernel/debug/tracing/current_tracer
 # echo 1 > /proc/sys/kernel/ftrace_enabled
 # cat /sys/kernel/debug/tracing/trace

The trace will show that function tracing was not active.

Cc: stable@vger.kernel.org # 2.6.27+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

524a3868

ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl · 1619dc3f

Pratyush Anand authored Mar 06, 2015

When ftrace is enabled globally through the proc interface, we must check if
ftrace_graph_active is set. If it is set, then we should also pass the
FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
ftrace is disabled globally through the proc interface, we must check if
ftrace_graph_active is set. If it is set, then we should also pass the
FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().

Consider the following situation.

 # echo 0 > /proc/sys/kernel/ftrace_enabled

After this ftrace_enabled = 0.

 # echo function_graph > /sys/kernel/debug/tracing/current_tracer

Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
called.

 # echo 1 > /proc/sys/kernel/ftrace_enabled

Now ftrace_enabled will be set to true, but still
ftrace_enable_ftrace_graph_caller() will not be called, which is not
desired.

Further if we execute the following after this:
  # echo nop > /sys/kernel/debug/tracing/current_tracer

Now since ftrace_enabled is set it will call
ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
the ARM platform.

On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
it checks whether the old instruction is a nop or not. If it's not a nop,
then it returns an error. If it is a nop then it replaces instruction at
that address with a branch to ftrace_graph_caller.
ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
or ftrace_disable_ftrace_graph_caller() consecutively two times in a row,
then it will return an error, which will cause the generic ftrace code to
raise a warning.

Note, x86 does not have an issue with this because the architecture
specific code for ftrace_enable_ftrace_graph_caller() and
ftrace_disable_ftrace_graph_caller() does not check the previous state,
and calling either of these functions twice in a row has no ill effect.

Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.com

Cc: stable@vger.kernel.org # 2.6.31+
Signed-off-by: Pratyush Anand <panand@redhat.com>
[
  removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
  if CONFIG_FUNCTION_GRAPH_TRACER is not set.
]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

1619dc3f

ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl · b24d443b

Steven Rostedt (Red Hat) authored Mar 04, 2015

When /proc/sys/kernel/ftrace_enabled is set to zero, all function
tracing is disabled. But the records that represent the functions
still hold information about the ftrace_ops that are hooked to them.

ftrace_ops may request "REGS" (have a full set of pt_regs passed to
the callback), or "TRAMP" (the ops has its own trampoline to use).
When the record is updated to represent the state of the ops hooked
to it, it sets "REGS_EN" and/or "TRAMP_EN" to state that the callback
points to the correct trampoline (REGS has its own trampoline).

When ftrace_enabled is set to zero, all ftrace locations are a nop,
so they do not point to any trampoline. But the _EN flags are still
set. This can cause the accounting to go wrong when ftrace_enabled
is cleared and an ops that has a trampoline is registered or unregistered.

For example, the following will cause ftrace to crash:

 # echo function_graph > /sys/kernel/debug/tracing/current_tracer
 # echo 0 > /proc/sys/kernel/ftrace_enabled
 # echo nop > /sys/kernel/debug/tracing/current_tracer
 # echo 1 > /proc/sys/kernel/ftrace_enabled
 # echo function_graph > /sys/kernel/debug/tracing/current_tracer

As function_graph uses a trampoline, when ftrace_enabled is set to zero
the updates to the record are not done. When enabling function_graph
again, the record will still have the TRAMP_EN flag set, and it will
look for an op that has a trampoline other than the function_graph
ops, and fail to find one.

Cc: stable@vger.kernel.org # 3.17+
Reported-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

b24d443b