1. 19 Feb, 2012 3 commits
    • Delete the __FD_*() funcs for operating on fd_set from linux/time.h · cf420048
      David Howells authored
      Delete the __FD_*() functions for operating on fd_set structs from
      linux/time.h as they're no longer used within the kernel with the preceding
      patch and are not exported to userspace.
      
      Whilst linux/time.h *does* export the FD_*() equivalents as wrappers around
      __FD_*(), userspace provides its own definition of __FD_*().
      
      Note that the definition of FD_ZERO() in linux/time.h may not be used with the
      fd_sets associated with struct fdtable as the fd_set may have been allocated in
      a truncated fashion.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216175006.23314.18984.stgit@warthog.procyon.org.uk
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
    • Replace the fd_sets in struct fdtable with an array of unsigned longs · 1fd36adc
      David Howells authored
      Replace the fd_sets in struct fdtable with an array of unsigned longs and then
      use the standard non-atomic bit operations rather than the FD_* macros.
      
      This:
      
       (1) Removes the abuses of struct fd_set:
      
           (a) Since we don't want to allocate a full fd_set the vast majority of the
           	 time, we actually, in effect, just allocate a just-big-enough array of
           	 unsigned longs and cast it to an fd_set type - so why bother with the
           	 fd_set at all?
      
           (b) Some places outside of the core fdtable handling code (such as
           	 SELinux) want to look inside the array of unsigned longs hidden inside
           	 the fd_set struct for more efficient iteration over the entire set.
      
       (2) Eliminates the use of FD_*() macros in the kernel completely.
      
       (3) Permits the __FD_*() macros to be deleted entirely where not exposed to
           userspace.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216174954.23314.48147.stgit@warthog.procyon.org.uk
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
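The sizing argument in point (1)(a) of this commit can be sketched in userspace C. This is a minimal sketch, not the kernel code: BITS_PER_LONG and BITS_TO_LONGS mirror kernel macro names but are redefined locally, and the fd_bitmap_*() helpers are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Local stand-ins for the kernel macros of the same name (assumption). */
#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Bytes needed for a bitmap that tracks nr_fds descriptors. */
static size_t fd_bitmap_bytes(size_t nr_fds)
{
	return BITS_TO_LONGS(nr_fds) * sizeof(unsigned long);
}

/* Allocate a zeroed, just-big-enough bitmap; the caller frees it. */
static unsigned long *fd_bitmap_alloc(size_t nr_fds)
{
	assert(nr_fds > 0);
	return calloc(BITS_TO_LONGS(nr_fds), sizeof(unsigned long));
}

/* Plain (non-atomic) bit operations on the array. */
static void fd_bitmap_set(unsigned long *map, size_t nr)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static void fd_bitmap_clear(unsigned long *map, size_t nr)
{
	map[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
}

static bool fd_bitmap_test(const unsigned long *map, size_t nr)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}
```

Because the array length is known only to the allocator, any clearing has to use fd_bitmap_bytes() rather than a fixed structure size, which is the reason an FD_ZERO()-style macro is unsafe on a truncated set.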
    • Wrap accesses to the fd_sets in struct fdtable · 1dce27c5
      David Howells authored
      Wrap accesses to the fd_sets in struct fdtable (for recording open files and
      close-on-exec flags) so that we can move away from using fd_sets since we
      abuse the fd_set structs by not allocating the full-sized structure under
      normal circumstances and by non-core code looking at the internals of the
      fd_sets.
      
      The first abuse means that use of FD_ZERO() on these fd_sets is not permitted,
      since that cannot be told about their abnormal lengths.
      
      This introduces six wrapper functions for setting, clearing and testing
      close-on-exec flags and fd-is-open flags:
      
      	void __set_close_on_exec(int fd, struct fdtable *fdt);
      	void __clear_close_on_exec(int fd, struct fdtable *fdt);
      	bool close_on_exec(int fd, const struct fdtable *fdt);
      	void __set_open_fd(int fd, struct fdtable *fdt);
      	void __clear_open_fd(int fd, struct fdtable *fdt);
      	bool fd_is_open(int fd, const struct fdtable *fdt);
      
      Note that I've prepended '__' to the names of the set/clear functions because
      they require the caller to hold a lock to use them.
      
      Note also that I haven't added wrappers for looking behind the scenes at
      the array.  Possibly those should exist too.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.uk
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
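One plausible shape for the six wrappers listed in this commit is sketched below as userspace C. It assumes the two flag sets are plain arrays of unsigned long; the struct fdtable here is a simplified stand-in with only the fields the sketch needs, not the kernel definition, and the locking the commit requires of callers is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Simplified stand-in for the kernel structure (assumption). */
struct fdtable {
	unsigned int max_fds;
	unsigned long *close_on_exec;
	unsigned long *open_fds;
};

/* Hypothetical helpers: plain, non-atomic bit operations. */
static void bit_set(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static void bit_clear(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
}

static bool bit_test(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/* Reject descriptors outside the table and return the bit index. */
static unsigned int checked_fd(int fd, const struct fdtable *fdt)
{
	assert(fd >= 0 && (unsigned int)fd < fdt->max_fds);
	return (unsigned int)fd;
}

/*
 * The names below follow the prototypes in the commit message. A leading
 * double underscore is reserved in ordinary userspace C; it is kept here
 * only to match the commit.
 */
void __set_close_on_exec(int fd, struct fdtable *fdt)
{
	bit_set(checked_fd(fd, fdt), fdt->close_on_exec);
}

void __clear_close_on_exec(int fd, struct fdtable *fdt)
{
	bit_clear(checked_fd(fd, fdt), fdt->close_on_exec);
}

bool close_on_exec(int fd, const struct fdtable *fdt)
{
	return bit_test(checked_fd(fd, fdt), fdt->close_on_exec);
}

void __set_open_fd(int fd, struct fdtable *fdt)
{
	bit_set(checked_fd(fd, fdt), fdt->open_fds);
}

void __clear_open_fd(int fd, struct fdtable *fdt)
{
	bit_clear(checked_fd(fd, fdt), fdt->open_fds);
}

bool fd_is_open(int fd, const struct fdtable *fdt)
{
	return bit_test(checked_fd(fd, fdt), fdt->open_fds);
}
```

Routing every access through these functions means callers never see how the flags are stored, which is what lets the storage change later without touching them.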
  2. 14 Feb, 2012 21 commits
  3. 09 Feb, 2012 11 commits
    • Linux 3.3-rc3 · d65b4e98
      Linus Torvalds authored
    • Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 63082402
      Linus Torvalds authored
      One patch fixes a bug in the ARM/MSM IOMMU code which returned success
      in the unmap function even when an error occurred, and the other patch
      adds a workaround into the AMD IOMMU driver to better handle broken IVRS
      ACPI tables (this patch fixes the case when a device is not listed in
      the table but actually translated by the iommu).
      
      * 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/msm: Fix error handling in msm_iommu_unmap()
        iommu/amd: Work around broken IVRS tables
    • Merge branch '3.3-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 19e75ed4
      Linus Torvalds authored
      This series contains pending target bug-fixes and cleanups for v3.3-rc3
      that have been addressed over the past weeks in lio-core.git.
      
      Some of the highlights include:
      
       - Fix handling for control CDBs with data greater than PAGE_SIZE (andy)
       - Use IP_FREEBIND for iscsi-target to address network portal creation
         issues with systemd (dax)
       - Allow PERSISTENT RESERVE IN for non-reservation holder (marco)
       - Fix iblock se_dev_attrib.unmap_granularity (marco)
       - Fix unsupported WRITE_SAME sense payload handling (martin)
       - Add workaround for zero-length control CDB handling (nab)
       - Fix discovery with INADDR_ANY and IN6ADDR_ANY_INIT (nab)
       - Fix target_submit_cmd() exception handling (nab)
       - Return correct ASC for unimplemented VPD pages (roland)
       - Don't zero pages used for data buffers (roland)
       - Fix return code of core_tpg_.*_lun (sebastian)
      
      * '3.3-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (26 commits)
        target: Fix unsupported WRITE_SAME sense payload
        iscsi: use IP_FREEBIND socket option
        iblock: fix handling of large requests
        target: handle empty string writes in sysfs
        iscsi_target: in_aton needs linux/inet.h
        target: Fix iblock se_dev_attrib.unmap_granularity
        target: Fix target_submit_cmd() exception handling
        target: Change target_submit_cmd() to return void
        target: accept REQUEST_SENSE with 18bytes
        target: Fail INQUIRY commands with EVPD==0 but PAGE CODE!=0
        target: Return correct ASC for unimplemented VPD pages
        iscsi-target: Fix discovery with INADDR_ANY and IN6ADDR_ANY_INIT
        target: Allow control CDBs with data > 1 page
        iscsi-target: Fix up a few assignments
        iscsi-target: make one-bit bitfields unsigned
        iscsi-target: Fix double list_add with iscsit_alloc_buffs reject
        iscsi-target: Fix reject release handling in iscsit_free_cmd()
        target: fix return code of core_tpg_.*_lun
        target: use save/restore lock primitive in core_dec_lacl_count()
        target: avoid multiple outputs in scsi_dump_inquiry()
        ...
    • Merge tag 'md-3.3-fixes' of git://neil.brown.name/md · 4d39aa1b
      Linus Torvalds authored
      Some simple md-related fixes.
      
      1/ two small fixes to ensure we handle an interrupted resync properly.
      2/ avoid loading the bitmap multiple times in dm-raid
      
      * tag 'md-3.3-fixes' of git://neil.brown.name/md:
        md: two small fixes to handling interrupt resync.
        Prevent DM RAID from loading bitmap twice.
    • Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6 · 4a68d54c
      Linus Torvalds authored
      SPI bug fixes for v3.3-rc2
      
      Minor SPI device driver changes.  A rename of the pch_spi_pcidev symbol
      that merely eliminates a modpost warning, and a Kconfig change to allow
      the Samsung spi driver to build on EXYNOS.
      
      * tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
        spi-topcliff-pch: rename pch_spi_pcidev to pch_spi_pcidev_driver
        spi: Add spi-s3c64xx driver dependency on ARCH_EXYNOS4
    • Merge branch 'akpm' (Andrew's tree) · 15a46353
      Linus Torvalds authored
      Five fixes
      
      * branch 'akpm':
        pcmcia: fix socket refcount decrementing on each resume
        mm: fix UP THP spin_is_locked BUGs
        drivers/leds/leds-lm3530.c: fix setting pltfm->als_vmax
        mm: compaction: check for overlapping nodes during isolation for migration
        nilfs2: avoid overflowing segment numbers in nilfs_ioctl_clean_segments()
    • pcmcia: fix socket refcount decrementing on each resume · 025e4ab3
      Russell King authored
      This fixes a memory-corrupting bug: not only does it cause the warning,
      but as a result of dropping the refcount to zero, it causes the
      pcmcia_socket0 device structure to be freed while it still has
      references, causing slab cache corruption.  A fatal oops quickly
      follows this warning - often even just a 'dmesg' following the warning
      causes the kernel to oops.
      
      While testing suspend/resume on an ARM device with PCMCIA support, and a
      CF card inserted, I found that after five suspend/resume cycles, the
      kernel would complain and die shortly afterwards with slab corruption.
      
        WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()
      
      As the message doesn't give a clue about which kobject, and the built-in
      debugging in drivers/base/power/main.c happens too late, this was added
      right before each get_device():
      
        printk("%s: %p [%s] %u\n", __func__, dev, kobject_name(&dev->kobj), atomic_read(&dev->kobj.kref.refcount));
      
      and the following behaviour was observed on the 3rd suspend/resume
      (s2ram) cycle:
      
        dpm_prepare: c1a0d998 [pcmcia_socket0] 3
        dpm_suspend: c1a0d998 [pcmcia_socket0] 3
        dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 3
        dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 3
        dpm_resume: c1a0d998 [pcmcia_socket0] 3
        dpm_complete: c1a0d998 [pcmcia_socket0] 2
      
      4th:
      
        dpm_prepare: c1a0d998 [pcmcia_socket0] 2
        dpm_suspend: c1a0d998 [pcmcia_socket0] 2
        dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 2
        dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 2
        dpm_resume: c1a0d998 [pcmcia_socket0] 2
        dpm_complete: c1a0d998 [pcmcia_socket0] 1
      
      5th:
      
        dpm_prepare: c1a0d998 [pcmcia_socket0] 1
        dpm_suspend: c1a0d998 [pcmcia_socket0] 1
        dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 1
        dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 1
        dpm_resume: c1a0d998 [pcmcia_socket0] 1
        dpm_complete: c1a0d998 [pcmcia_socket0] 0
        ------------[ cut here ]------------
        WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()
        Modules linked in: ucb1x00_core
        Backtrace:
        [<c0212090>] (dump_backtrace+0x0/0x110) from [<c04799dc>] (dump_stack+0x18/0x1c)
        [<c04799c4>] (dump_stack+0x0/0x1c) from [<c021cba0>] (warn_slowpath_common+0x50/0x68)
        [<c021cb50>] (warn_slowpath_common+0x0/0x68) from [<c021cbdc>] (warn_slowpath_null+0x24/0x28)
        [<c021cbb8>] (warn_slowpath_null+0x0/0x28) from [<c0335374>] (kobject_get+0x28/0x50)
        [<c033534c>] (kobject_get+0x0/0x50) from [<c03804f4>] (get_device+0x1c/0x24)
        [<c0388c90>] (dpm_complete+0x0/0x1a0) from [<c0389cc0>] (dpm_resume_end+0x1c/0x20)
        ...
      
      Looking at commit 7b24e798 ("pcmcia: split up central event handler"),
      the following change was made to cs.c:
      
                      return 0;
              }
       #endif
      -
      -       send_event(skt, CS_EVENT_PM_RESUME, CS_EVENT_PRI_LOW);
      +       if (!(skt->state & SOCKET_CARDBUS) && (skt->callback))
      +               skt->callback->early_resume(skt);
              return 0;
       }
      
      And the corresponding change in ds.c is from:
      
      -static int ds_event(struct pcmcia_socket *skt, event_t event, int priority)
      -{
      -       struct pcmcia_socket *s = pcmcia_get_socket(skt);
      ...
      -       switch (event) {
      ...
      -       case CS_EVENT_PM_RESUME:
      -               if (verify_cis_cache(skt) != 0) {
      -                       dev_dbg(&skt->dev, "cis mismatch - different card\n");
      -                       /* first, remove the card */
      -                       ds_event(skt, CS_EVENT_CARD_REMOVAL, CS_EVENT_PRI_HIGH);
      -                       mutex_lock(&s->ops_mutex);
      -                       destroy_cis_cache(skt);
      -                       kfree(skt->fake_cis);
      -                       skt->fake_cis = NULL;
      -                       s->functions = 0;
      -                       mutex_unlock(&s->ops_mutex);
      -                       /* now, add the new card */
      -                       ds_event(skt, CS_EVENT_CARD_INSERTION,
      -                                CS_EVENT_PRI_LOW);
      -               }
      -               break;
      ...
      -    }
      
      -    pcmcia_put_socket(s);
      
      -    return 0;
      -} /* ds_event */
      
      to:
      
      +static int pcmcia_bus_early_resume(struct pcmcia_socket *skt)
      +{
      +       if (!verify_cis_cache(skt)) {
      +               pcmcia_put_socket(skt);
      +               return 0;
      +       }
      
      +       dev_dbg(&skt->dev, "cis mismatch - different card\n");
      
      +       /* first, remove the card */
      +       pcmcia_bus_remove(skt);
      +       mutex_lock(&skt->ops_mutex);
      +       destroy_cis_cache(skt);
      +       kfree(skt->fake_cis);
      +       skt->fake_cis = NULL;
      +       skt->functions = 0;
      +       mutex_unlock(&skt->ops_mutex);
      
      +       /* now, add the new card */
      +       pcmcia_bus_add(skt);
      +       return 0;
      +}
      
      As can be seen, the original function called pcmcia_get_socket() and
      pcmcia_put_socket() around the guts, whereas the replacement code
      calls pcmcia_put_socket() only in one path.  This creates an imbalance
      in the refcounting.
      
      Testing with the pcmcia_put_socket() call removed shows that the bug is gone:
      
        dpm_suspend: c1a10998 [pcmcia_socket0] 5
        dpm_suspend_noirq: c1a10998 [pcmcia_socket0] 5
        dpm_resume_noirq: c1a10998 [pcmcia_socket0] 5
        dpm_resume: c1a10998 [pcmcia_socket0] 5
        dpm_complete: c1a10998 [pcmcia_socket0] 5
      Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
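The refcount imbalance described in this commit can be reproduced with a toy counter. This is a minimal userspace sketch, not the PCMCIA code: struct toy_socket, toy_put() and the resume_*() functions are hypothetical names, and only the "CIS unchanged" path that the log output exercises is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy object with a bare reference count (stand-in for the socket device). */
struct toy_socket {
	int refcount;
};

static void toy_put(struct toy_socket *skt)
{
	assert(skt->refcount > 0);
	skt->refcount--;
}

/* Buggy shape: a put with no matching get when the card is unchanged. */
static void resume_buggy(struct toy_socket *skt, bool cis_changed)
{
	if (!cis_changed) {
		toy_put(skt);
		return;
	}
	/* card replacement path omitted */
}

/* Fixed shape: the stray put is gone, so the count is left alone. */
static void resume_fixed(struct toy_socket *skt, bool cis_changed)
{
	(void)skt;
	if (!cis_changed)
		return;
	/* card replacement path omitted */
}

typedef void (*resume_fn)(struct toy_socket *skt, bool cis_changed);

/* Run a number of resume cycles with an unchanged card. */
static int refcount_after(int start, int cycles, resume_fn resume)
{
	struct toy_socket skt = { .refcount = start };

	for (int i = 0; i < cycles; i++)
		resume(&skt, false);
	return skt.refcount;
}
```

Starting from a count of 3, three calls through resume_buggy() reach 0, matching the 3, 2, 1, 0 progression in the dpm_complete lines above, while resume_fixed() leaves a count of 5 at 5.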
    • mm: fix UP THP spin_is_locked BUGs · b9980cdc
      Hugh Dickins authored
      Fix CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_SMP=n CONFIG_DEBUG_VM=y
      CONFIG_DEBUG_SPINLOCK=n kernel: spin_is_locked() is then always false,
      and so triggers some BUGs in Transparent HugePage codepaths.
      
      asm-generic/bug.h mentions this problem, and provides a WARN_ON_SMP(x);
      but being too lazy to add VM_BUG_ON_SMP, BUG_ON_SMP, WARN_ON_SMP_ONCE,
      VM_WARN_ON_SMP_ONCE, just test NR_CPUS != 1 in the existing VM_BUG_ONs.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
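The effect of adding an NR_CPUS != 1 test to a lock assertion can be illustrated in userspace C. This is a sketch under stated assumptions: NR_CPUS is defined locally as 1 to imitate a uniprocessor build, struct toy_spinlock and the *_check_fires() functions are hypothetical, and the exact form of the kernel's checks is not reproduced here.

```c
#include <stdbool.h>

/* Imitate a uniprocessor (CONFIG_SMP=n) build unless told otherwise. */
#ifndef NR_CPUS
#define NR_CPUS 1
#endif

struct toy_spinlock {
	bool locked;
};

/*
 * On a UP build without spinlock debugging the lock state is not
 * tracked, so the query reports "not locked" even while the lock is held.
 */
static bool toy_spin_is_locked(const struct toy_spinlock *lock)
{
#if NR_CPUS == 1
	(void)lock;
	return false;
#else
	return lock->locked;
#endif
}

/* Naive assertion: fires whenever the lock does not look held. */
static bool naive_check_fires(const struct toy_spinlock *lock)
{
	return !toy_spin_is_locked(lock);
}

/* Guarded assertion: only meaningful when more than one CPU is possible. */
static bool guarded_check_fires(const struct toy_spinlock *lock)
{
	return NR_CPUS != 1 && !toy_spin_is_locked(lock);
}
```

With the lock genuinely held on the simulated UP build, the naive check still fires, which corresponds to the spurious BUGs, while the guarded check stays quiet.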
    • drivers/leds/leds-lm3530.c: fix setting pltfm->als_vmax · ec44fd42
      Axel Lin authored
      In current code, pltfm->als_vmin is set to LM3530_ALS_WINDOW_mV and
      pltfm->als_vmax is 0.  This does not make sense.  I think what we want
      here is setting pltfm->als_vmax to LM3530_ALS_WINDOW_mV.
      
      Both the als_vmin and als_vmax local variables are set to
      pltfm->als_vmin and pltfm->als_vmax a few lines later, so this patch
      also removes the redundant assignments to als_vmin and als_vmax.
      Signed-off-by: Axel Lin <axel.lin@gmail.com>
      Cc: Shreshtha Kumar Sahu <shreshthakumar.sahu@stericsson.com>
      Acked-by: Milo(Woogyom) Kim <milo.kim@ti.com>
      Tested-by: Milo(Woogyom) Kim <milo.kim@ti.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: compaction: check for overlapping nodes during isolation for migration · dc908600
      Mel Gorman authored
      When isolating pages for migration, migration starts at the start of a
      zone while the free scanner starts at the end of the zone.  Migration
      avoids entering a new zone by never going beyond the free scanner.
      
      Unfortunately, in very rare cases nodes can overlap.  When this happens,
      migration isolates pages without the LRU lock held, corrupting lists
      which will trigger errors in reclaim or during page free such as in the
      following oops:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        IP: [<ffffffff810f795c>] free_pcppages_bulk+0xcc/0x450
        PGD 1dda554067 PUD 1e1cb58067 PMD 0
        Oops: 0000 [#1] SMP
        CPU 37
        Pid: 17088, comm: memcg_process_s Tainted: G            X
        RIP: free_pcppages_bulk+0xcc/0x450
        Process memcg_process_s (pid: 17088, threadinfo ffff881c2926e000, task ffff881c2926c0c0)
        Call Trace:
          free_hot_cold_page+0x17e/0x1f0
          __pagevec_free+0x90/0xb0
          release_pages+0x22a/0x260
          pagevec_lru_move_fn+0xf3/0x110
          putback_lru_page+0x66/0xe0
          unmap_and_move+0x156/0x180
          migrate_pages+0x9e/0x1b0
          compact_zone+0x1f3/0x2f0
          compact_zone_order+0xa2/0xe0
          try_to_compact_pages+0xdf/0x110
          __alloc_pages_direct_compact+0xee/0x1c0
          __alloc_pages_slowpath+0x370/0x830
          __alloc_pages_nodemask+0x1b1/0x1c0
          alloc_pages_vma+0x9b/0x160
          do_huge_pmd_anonymous_page+0x160/0x270
          do_page_fault+0x207/0x4c0
          page_fault+0x25/0x30
      
      The "X" in the taint flag means that external modules were loaded, but
      this is unrelated to the bug triggering.  The real problem is that the
      PFN layout looks like this:
      
        Zone PFN ranges:
          DMA      0x00000010 -> 0x00001000
          DMA32    0x00001000 -> 0x00100000
          Normal   0x00100000 -> 0x01e80000
        Movable zone start PFN for each node
        early_node_map[14] active PFN ranges
            0: 0x00000010 -> 0x0000009b
            0: 0x00000100 -> 0x0007a1ec
            0: 0x0007a354 -> 0x0007a379
            0: 0x0007f7ff -> 0x0007f800
            0: 0x00100000 -> 0x00680000
            1: 0x00680000 -> 0x00e80000
            0: 0x00e80000 -> 0x01080000
            1: 0x01080000 -> 0x01280000
            0: 0x01280000 -> 0x01480000
            1: 0x01480000 -> 0x01680000
            0: 0x01680000 -> 0x01880000
            1: 0x01880000 -> 0x01a80000
            0: 0x01a80000 -> 0x01c80000
            1: 0x01c80000 -> 0x01e80000
      
      The fix is straight-forward.  isolate_migratepages() has to make a
      similar check to isolate_freepage to ensure that it never isolates pages
      from a zone it does not hold the LRU lock for.
      
      This was discovered in a 3.0-based kernel but it affects 3.1.x, 3.2.x
      and current mainline.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
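The interleaving in the PFN layout quoted in this commit can be made concrete with a small lookup. This is a userspace sketch, not the compaction code: the table copies the early_node_map ranges from the message, range ends are assumed to be exclusive, and pfn_to_nid() and may_isolate() are hypothetical helpers that compare node ids where the real check, per the message, concerns the zone whose LRU lock is held.

```c
#include <stddef.h>

struct pfn_range {
	int nid;              /* owning node */
	unsigned long start;  /* first PFN in the range */
	unsigned long end;    /* one past the last PFN (assumed exclusive) */
};

/* Ranges copied from the early_node_map output in the commit message. */
static const struct pfn_range node_map[] = {
	{ 0, 0x00000010UL, 0x0000009bUL },
	{ 0, 0x00000100UL, 0x0007a1ecUL },
	{ 0, 0x0007a354UL, 0x0007a379UL },
	{ 0, 0x0007f7ffUL, 0x0007f800UL },
	{ 0, 0x00100000UL, 0x00680000UL },
	{ 1, 0x00680000UL, 0x00e80000UL },
	{ 0, 0x00e80000UL, 0x01080000UL },
	{ 1, 0x01080000UL, 0x01280000UL },
	{ 0, 0x01280000UL, 0x01480000UL },
	{ 1, 0x01480000UL, 0x01680000UL },
	{ 0, 0x01680000UL, 0x01880000UL },
	{ 1, 0x01880000UL, 0x01a80000UL },
	{ 0, 0x01a80000UL, 0x01c80000UL },
	{ 1, 0x01c80000UL, 0x01e80000UL },
};

/* Return the node that owns pfn, or -1 if it falls in a hole. */
static int pfn_to_nid(unsigned long pfn)
{
	for (size_t i = 0; i < sizeof(node_map) / sizeof(node_map[0]); i++) {
		if (pfn >= node_map[i].start && pfn < node_map[i].end)
			return node_map[i].nid;
	}
	return -1;
}

/* A scanner working on behalf of scan_nid must skip foreign pages. */
static int may_isolate(unsigned long pfn, int scan_nid)
{
	return pfn_to_nid(pfn) == scan_nid;
}
```

A linear scan upward from 0x00100000 on behalf of node 0 reaches node 1's pages at 0x00680000 long before it reaches the end of node 0's span, which is why a bound on the scan position alone is not sufficient.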
    • nilfs2: avoid overflowing segment numbers in nilfs_ioctl_clean_segments() · 1ecd3c7e
      Xi Wang authored
      nsegs is read from userspace.  Limit its value and avoid overflowing nsegs
      * sizeof(__u64) in the subsequent call to memdup_user().
      
      This patch complements 481fe17e ("nilfs2: potential integer overflow
      in nilfs_ioctl_clean_segments()").
      Signed-off-by: Xi Wang <xi.wang@gmail.com>
      Cc: Haogang Chen <haogangchen@gmail.com>
      Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
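The overflow this commit guards against can be shown with plain integer arithmetic. This is a minimal sketch, not the nilfs2 code: uint64_t stands in for the kernel's __u64, UINT_MAX is an assumed limit chosen for illustration, and nsegs_is_safe() and wrapped_len32() are hypothetical helpers.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if nsegs * sizeof(uint64_t) cannot exceed UINT_MAX. Dividing the
 * limit avoids performing the multiplication that might overflow.
 */
static bool nsegs_is_safe(size_t nsegs)
{
	return nsegs <= UINT_MAX / sizeof(uint64_t);
}

/*
 * What an unchecked 32-bit length computation yields: unsigned
 * arithmetic wraps modulo 2^32 instead of reporting an error.
 */
static uint32_t wrapped_len32(uint32_t nsegs)
{
	return nsegs * (uint32_t)sizeof(uint64_t);
}
```

For example, a user-supplied count of 0x20000000 fails nsegs_is_safe(), and wrapped_len32() shows why it must: the byte length wraps to 0, so a copy sized from it would be far smaller than the count implies.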
  4. 08 Feb, 2012 5 commits