1. 16 Dec, 2013 6 commits
    • arm: kvm: implement CPU PM notifier · 1fcf7ce0
      Lorenzo Pieralisi authored
      Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state
      must be re-initialized. This patch implements a CPU PM notifier that,
      upon warm boot, calls a KVM hook to properly reinitialize the hypervisor
      state so that the CPU can be safely resumed.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • arm64: kernel: implement fpsimd CPU PM notifier · fb1ab1ab
      Lorenzo Pieralisi authored
      When a CPU enters a low power state, its FP register content is lost.
      This patch adds a notifier to save the FP context on CPU shutdown
      and restore it on CPU resume. The context is saved and restored only
      if the suspending thread is not a kernel thread, mirroring the current
      context switch behaviour.
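
      The save/restore rule described above can be modelled in plain C. This
      is only a user-space sketch, not the patch itself: the event names, the
      struct layouts and the fp_pm_notify() helper are all hypothetical
      stand-ins for the kernel's CPU PM events and FP/SIMD state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CPU PM enter/exit events. */
enum pm_event { PM_ENTER, PM_EXIT };

/* Toy model of the FP register file and of a task's saved FP context. */
struct fp_regs { unsigned long long v[4]; };
struct task {
	bool is_kernel_thread;
	struct fp_regs saved;
};

/* Models the hardware FP registers, whose content is lost in low power. */
static struct fp_regs hw_fp;

/* Save on entry to low power, restore on exit; skip kernel threads. */
static void fp_pm_notify(enum pm_event ev, struct task *cur)
{
	if (cur->is_kernel_thread)
		return;

	switch (ev) {
	case PM_ENTER:
		cur->saved = hw_fp;   /* save the FP context before shutdown */
		break;
	case PM_EXIT:
		hw_fp = cur->saved;   /* restore the FP context on resume */
		break;
	}
}
```

      In this model a user task gets its register content back after a
      simulated power loss, while a kernel thread leaves both the saved
      context and the registers untouched.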
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • arm64: kernel: cpu_{suspend/resume} implementation · 95322526
      Lorenzo Pieralisi authored
      Kernel subsystems like CPU idle and suspend to RAM require a generic
      mechanism to suspend a processor, save its context and put it into
      a quiescent state. The cpu_{suspend}/{resume} implementation provides
      such a framework through a kernel interface that allows callers to
      save/restore registers, flush the context to DRAM and suspend/resume
      to/from low-power states where processor context may be lost.
      
      The CPU suspend implementation relies on the suspend protocol registered
      in CPU operations to carry out a suspend request after context is
      saved and flushed to DRAM. The cpu_suspend interface:
      
      int cpu_suspend(unsigned long arg);
      
      allows passing an opaque parameter that is handed over to the suspend CPU
      operations back-end so that it can take action according to the
      semantics attached to it. The arg parameter allows suspend to RAM and CPU
      idle drivers to communicate with suspend protocol back-ends; it requires
      standardization so that the interface can be reused seamlessly across
      systems, paving the way for generic drivers.
      
      Context memory is allocated on the stack, whose address is stashed in a
      per-cpu variable to keep track of it and passed to core functions that
      save/restore the registers required by the architecture.
      
      Even though, upon successful execution, the cpu_suspend function shuts
      down the suspending processor, the warm boot resume mechanism, based
      on the cpu_resume function, makes the resume path operate as a
      cpu_suspend function return, so that cpu_suspend can be treated as a C
      function by the caller, which simplifies coding the PM drivers that rely
      on the cpu_suspend API.
      
      Upon context save, the minimal amount of memory is flushed to DRAM so
      that it can be retrieved when the MMU is off and caches are not searched.
      
      The suspend CPU operation, depending on the required operations (e.g. CPU vs
      Cluster shutdown), is in charge of flushing the cache hierarchy either
      implicitly (by calling firmware implementations like PSCI) or explicitly
      by executing the required cache maintenance functions.
      
      Debug exceptions are disabled during cpu_{suspend}/{resume} operations
      so that debug registers can be saved and restored properly, preventing
      preemption from debug agents enabled in the kernel.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • arm64: kernel: suspend/resume registers save/restore · 6732bc65
      Lorenzo Pieralisi authored
      Power management software requires the kernel to save and restore
      CPU registers while going through suspend and resume operations
      triggered by kernel subsystems like CPU idle and suspend to RAM.
      
      This patch implements code that provides a save and restore mechanism
      for the ARMv8 implementation. Memory for the context is passed as a
      parameter to both the cpu_do_suspend and cpu_do_resume functions, which
      allows the callers to implement context allocation as they deem fit.

      The registers that are saved and restored correspond to the register set
      actually required by the kernel to be up and running, which is a
      subset of the v8 ISA.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • arm64: kernel: build MPIDR_EL1 hash function data structure · 976d7d3f
      Lorenzo Pieralisi authored
      On ARM64 SMP systems, cores are identified by their MPIDR_EL1 register.
      The MPIDR_EL1 guidelines in the ARM ARM do not strictly enforce the
      MPIDR_EL1 layout; they only provide recommendations that, if followed, split
      the MPIDR_EL1 on ARM 64-bit platforms into four affinity levels. In
      multi-cluster systems like big.LITTLE, if the affinity guidelines are
      followed, the MPIDR_EL1 cannot be considered a linear index. This means
      that the association between the logical CPU in the kernel and the HW CPU
      identifier becomes somewhat more complicated, requiring methods like hashing
      to associate a given MPIDR_EL1 with a CPU logical index, so that the look-up
      can be carried out in an efficient and scalable way.
      
      This patch provides a function in the kernel that, starting from the
      cpu_logical_map, implements collision-free hashing of MPIDR_EL1 values by
      checking all significant bits of the MPIDR_EL1 affinity level bitfields.
      The hashing can then be carried out through bit shifting and ORing; the
      resulting hash algorithm is a collision-free, though not minimal, hash that
      can be executed with a few assembly instructions. The mpidr_el1 is filtered
      through an mpidr mask that is built by checking all bits that toggle in the
      set of MPIDR_EL1s corresponding to possible CPUs. Bits that do not toggle do
      not carry information, so they do not contribute to the resulting hash.
      
      Pseudo code:
      
      /* check all bits that toggle, so they are required */
      for (i = 1, mpidr_el1_mask = 0; i < num_possible_cpus(); i++)
      	mpidr_el1_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0));
      
      /*
       * Build shifts to be applied to aff0, aff1, aff2, aff3 values to hash the
       * mpidr_el1.
       * fls() returns the 1-based position of the last (most significant) bit
       * set in a word, 0 if none.
       * ffs() returns the 1-based position of the first (least significant) bit
       * set in a word, 0 if none.
       */
      fs0 = mpidr_el1_mask[7:0] ? ffs(mpidr_el1_mask[7:0]) - 1 : 0;
      fs1 = mpidr_el1_mask[15:8] ? ffs(mpidr_el1_mask[15:8]) - 1 : 0;
      fs2 = mpidr_el1_mask[23:16] ? ffs(mpidr_el1_mask[23:16]) - 1 : 0;
      fs3 = mpidr_el1_mask[39:32] ? ffs(mpidr_el1_mask[39:32]) - 1 : 0;
      ls0 = fls(mpidr_el1_mask[7:0]);
      ls1 = fls(mpidr_el1_mask[15:8]);
      ls2 = fls(mpidr_el1_mask[23:16]);
      ls3 = fls(mpidr_el1_mask[39:32]);
      bits0 = ls0 - fs0;
      bits1 = ls1 - fs1;
      bits2 = ls2 - fs2;
      bits3 = ls3 - fs3;
      aff0_shift = fs0;
      aff1_shift = 8 + fs1 - bits0;
      aff2_shift = 16 + fs2 - (bits0 + bits1);
      aff3_shift = 32 + fs3 - (bits0 + bits1 + bits2);
      u32 hash(u64 mpidr_el1) {
      	/* u64 so that the aff3 field (bits [39:32]) is not truncated */
      	u64 l[4];
      	u64 mpidr_el1_masked = mpidr_el1 & mpidr_el1_mask;
      	l[0] = mpidr_el1_masked & 0xff;
      	l[1] = mpidr_el1_masked & 0xff00;
      	l[2] = mpidr_el1_masked & 0xff0000;
      	l[3] = mpidr_el1_masked & 0xff00000000;
      	return (l[0] >> aff0_shift | l[1] >> aff1_shift | l[2] >> aff2_shift |
      		l[3] >> aff3_shift);
      }
      
      The hashing algorithm relies on the inherent properties set in the ARM ARM
      recommendations for the MPIDR_EL1. Exotic configurations, where for instance
      the MPIDR_EL1 values at a given affinity level have large holes, can end up
      requiring big hash tables since the compression of values that can be achieved
      through shifting is somewhat crippled when holes are present. The kernel
      warns if the number of buckets of the resulting hash table exceeds the
      number of possible CPUs by a factor of 4, which is a symptom of a very
      sparse HW MPIDR_EL1 configuration.
      
      The hash algorithm is quite simple and can easily be implemented in assembly
      code, to be used in code paths where the kernel virtual address space is
      not set up (i.e. cpu_resume) and instruction and data fetches are strongly
      ordered, so code must be compact and must carry out few data accesses.
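
      The pseudo code above can be turned into a small user-space C sketch.
      This is an illustration under assumptions, not the kernel code: the
      struct mpidr_hash layout, the helper names (ffs8, fls8, mpidr_hash_init,
      mpidr_hash_value) and the plain array standing in for cpu_logical_map
      are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the hash parameters; names are illustrative. */
struct mpidr_hash {
	uint64_t mask;          /* bits that toggle across all CPUs */
	unsigned int shift[4];  /* right shift applied to Aff0..Aff3 */
	unsigned int bits;      /* hash width; the table needs 1 << bits buckets */
};

/* Least significant bit of each affinity field: [7:0], [15:8], [23:16], [39:32]. */
static const unsigned int aff_lsb[4] = { 0, 8, 16, 32 };

/* 1-based position of the lowest set bit of an 8-bit field, 0 if none. */
static unsigned int ffs8(unsigned int v)
{
	for (unsigned int i = 0; i < 8; i++)
		if (v & (1u << i))
			return i + 1;
	return 0;
}

/* 1-based position of the highest set bit of an 8-bit field, 0 if none. */
static unsigned int fls8(unsigned int v)
{
	for (unsigned int i = 8; i > 0; i--)
		if (v & (1u << (i - 1)))
			return i;
	return 0;
}

/* Build the mask and per-level shifts from an array of MPIDR values. */
static void mpidr_hash_init(struct mpidr_hash *h, const uint64_t *map, int ncpus)
{
	unsigned int bits_below = 0;

	assert(ncpus > 0);
	h->mask = 0;
	for (int i = 1; i < ncpus; i++)
		h->mask |= map[i] ^ map[0];

	for (int lvl = 0; lvl < 4; lvl++) {
		unsigned int field = (unsigned int)((h->mask >> aff_lsb[lvl]) & 0xff);
		unsigned int fs = field ? ffs8(field) - 1 : 0;
		unsigned int ls = fls8(field);

		/* Pack this level right above the bits used by lower levels. */
		h->shift[lvl] = aff_lsb[lvl] + fs - bits_below;
		bits_below += ls - fs;
	}
	h->bits = bits_below;
}

/* Hash one MPIDR value with shifts and ORs only. */
static uint32_t mpidr_hash_value(const struct mpidr_hash *h, uint64_t mpidr)
{
	uint64_t masked = mpidr & h->mask;
	uint64_t hash = 0;

	for (int lvl = 0; lvl < 4; lvl++)
		hash |= (masked & (0xffULL << aff_lsb[lvl])) >> h->shift[lvl];
	return (uint32_t)hash;
}
```

      For example, with the MPIDR values 0x000, 0x001, 0x100 and 0x101 (two
      clusters of two cores each), the mask is 0x101, the hash is two bits
      wide, and the four values hash to 0, 1, 2 and 3 respectively.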
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • arm64: kernel: add MPIDR_EL1 accessors macros · b058450f
      Lorenzo Pieralisi authored
      In order to simplify access to different affinity levels within the
      MPIDR_EL1 register values, this patch implements some preprocessor
      macros that retrieve the MPIDR_EL1 affinity level value according
      to the level passed as an input parameter.
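
      A minimal sketch of what such accessor macros could look like follows.
      The macro names are hypothetical and need not match the ones added by
      the patch; the sketch only assumes 8-bit affinity fields at bits [7:0],
      [15:8], [23:16] and [39:32].

```c
#include <stdint.h>

/* Hypothetical names; each affinity level is an 8-bit field. */
#define AFF_LEVEL_BITS		8
#define AFF_LEVEL_MASK		((1ULL << AFF_LEVEL_BITS) - 1)

/*
 * Shift for a given level: 0, 8, 16 and 32 for levels 0..3.
 * ((1 << level) >> 1) yields 0, 1, 2, 4; multiplying by the field
 * width skips the non-affinity bits [31:24] for level 3.
 */
#define AFF_LEVEL_SHIFT(level)	((((1U << (level)) >> 1)) * AFF_LEVEL_BITS)

/* Extract the affinity value at the given level from an MPIDR value. */
#define AFF_LEVEL(mpidr, level) \
	(((uint64_t)(mpidr) >> AFF_LEVEL_SHIFT(level)) & AFF_LEVEL_MASK)
```

      With this layout, AFF_LEVEL(0x0000000100020304, 3) evaluates to 0x01
      and AFF_LEVEL(0x0000000100020304, 0) to 0x04.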
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
  2. 15 Dec, 2013 10 commits
    • Linux 3.13-rc4 · 319e2e3f
      Linus Torvalds authored
    • null_blk: mem garbage on NUMA systems during init · 57053d8c
      Matias Bjorling authored
      On NUMA systems, when the blk-mq layer is initialized with per-node
      hctx, we initialize submit queues to 1, while blk-mq nr_hw_queues is
      initialized to the number of NUMA nodes.

      This makes the null_init_hctx function overwrite memory outside of what
      it allocated.  In my case it led to writing garbage into struct
      request_queue's mq_map.
      Signed-off-by: Matias Bjorling <m@bjorling.me>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • radeon_pm: fix oops in hwmon_attributes_visible() and radeon_hwmon_show_temp_thresh() · e4158f1b
      Sergey Senozhatsky authored
      Since commit ec39f64b ("drm/radeon/dpm: Convert to use
      devm_hwmon_register_with_groups") radeon_hwmon_init() is using
      hwmon_device_register_with_groups(), which sets `rdev' as a device
      private driver_data, while hwmon_attributes_visible() and
      radeon_hwmon_show_temp_thresh() still expect `drm_device'.
      
      Fix them by using dev_get_drvdata(), in order to avoid this oops:
      
        BUG: unable to handle kernel paging request at 0000000000001e28
        IP: [<ffffffffa02ae8b4>] hwmon_attributes_visible+0x18/0x3d [radeon]
        PGD 15057e067 PUD 151a8e067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP
        Call Trace:
          internal_create_group+0x114/0x1d9
          sysfs_create_group+0xe/0x10
          sysfs_create_groups+0x22/0x5f
          device_add+0x34f/0x501
          device_register+0x15/0x18
          hwmon_device_register_with_groups+0xb5/0xed
          radeon_hwmon_init+0x56/0x7c [radeon]
          radeon_pm_init+0x134/0x7e5 [radeon]
          radeon_modeset_init+0x75f/0x8ed [radeon]
          radeon_driver_load_kms+0xc6/0x187 [radeon]
          drm_dev_register+0xf9/0x1b4 [drm]
          drm_get_pci_dev+0x98/0x129 [drm]
          radeon_pci_probe+0xa3/0xac [radeon]
          pci_device_probe+0x6e/0xcf
          driver_probe_device+0x98/0x1c4
          __driver_attach+0x5c/0x7e
          bus_for_each_dev+0x7b/0x85
          driver_attach+0x19/0x1b
          bus_add_driver+0x104/0x1ce
          driver_register+0x89/0xc5
          __pci_register_driver+0x58/0x5b
          drm_pci_init+0x86/0xea [drm]
          radeon_init+0x97/0x1000 [radeon]
          do_one_initcall+0x7f/0x117
          load_module+0x1583/0x1bb4
          SyS_init_module+0xa0/0xaf
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alexander Deucher <Alexander.Deucher@amd.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 4a251dd2
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Revert CHECKSUM_COMPLETE optimization in pskb_trim_rcsum(), I can't
          figure out why it breaks things.
      
       2) Fix comparison in netfilter ipset's hash_netnet4_data_equal(), it
          was basically doing "x == x", from Dave Jones.
      
       3) Freescale FEC driver was DMA mapping the wrong number of bytes, from
          Sebastian Siewior.
      
       4) Blackhole and prohibit routes in ipv6 were not doing the right thing
          because their ->input and ->output methods were not being assigned
          correctly.  Now they behave properly like their ipv4 counterparts.
          From Kamala R.
      
       5) Several drivers advertise the NETIF_F_FRAGLIST capability, but
          really do not support this feature and will send garbage packets if
          fed fraglist SKBs.  From Eric Dumazet.
      
       6) Fix long standing user triggerable BUG_ON over loopback in RDS
          protocol stack, from Venkat Venkatsubra.
      
       7) Several not so common code paths can potentially try to invoke
          packet scheduler actions that might be NULL without checking.  Shore
          things up by either 1) defining a method as mandatory and erroring
          on registration if that method is NULL or 2) defining a method as
          optional and having the registration function hook up a default
          implementation when NULL is seen.  From Jamal Hadi Salim.
      
       8) Fix fragment detection in xen-netback driver, from Paul Durrant.
      
       9) Kill dangling enter_memory_pressure method in cg_proto ops, from
          Eric W Biederman.
      
      10) SKBs that traverse namespaces should have their local_df cleared,
          from Hannes Frederic Sowa.
      
      11) IOCB file position is not being updated by macvtap_aio_read() and
          tun_chr_aio_read().  From Zhi Yong Wu.
      
      12) Don't free virtio_net netdev before releasing all of the NAPI
          instances.  From Andrey Vagin.
      
      13) Procfs entry leak in xt_hashlimit, from Sergey Popovich.
      
      14) IPv6 routes that are not cached routes should not count against the
          garbage collection limits.  We had this almost right, but were
          missing handling addrconf generated routes properly.  From Hannes
          Frederic Sowa.
      
      15) fib{4,6}_rule_suppress() have to consider potentially seeing NULL
          route info when they are called, from Stefan Tomanek.
      
      16) TUN and MACVTAP have had broken truncated packet signalling for some
          time, fix from Jason Wang.
      
      17) Fix use after free in __udp4_lib_rcv(), from Eric Dumazet.
      
      18) xen-netback does not interpret the NAPI budget properly for TX work,
          fix from Paul Durrant.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits)
        igb: Fix for issue where values could be too high for udelay function.
        i40e: fix null dereference
        xen-netback: fix gso_prefix check
        net: make neigh_priv_len in struct net_device 16bit instead of 8bit
        drivers: net: cpsw: fix for cpsw crash when build as modules
        xen-netback: napi: don't prematurely request a tx event
        xen-netback: napi: fix abuse of budget
        sch_tbf: use do_div() for 64-bit divide
        udp: ipv4: must add synchronization in udp_sk_rx_dst_set()
        net:fec: remove duplicate lines in comment about errata ERR006358
        Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature"
        8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature
        xen-netback: make sure skb linear area covers checksum field
        net: smc91x: Fix device tree based configuration so it's usable
        udp: ipv4: fix potential use after free in udp_v4_early_demux()
        macvtap: signal truncated packets
        tun: unbreak truncated packet signalling
        net: sched: htb: fix the calculation of quantum
        net: sched: tbf: fix the calculation of max_size
        micrel: add support for KSZ8041RNLI
        ...
    • Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 908bfda7
      Linus Torvalds authored
      Pull x86 fixes from Peter Anvin:
       "This is a pretty small batch:
      
        The biggest single change is to stop using EFI time services on 32-bit
        platforms.  This matches our current behavior on 64-bit platforms as
        we already had ruled them out there as being too unreliable.  Turns
        out that affects 32-bit platforms, too.
      
        One NULL pointer fix for SGI UV.
      
        Two minor build fixes, one of which only affects icc and the other
        which affects icc and future versions or nonstandard default settings
        of gcc"
      
      * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, efi: Don't use (U)EFI time services on 32 bit
        x86, build, icc: Remove uninitialized_var() from compiler-intel.h
        x86/UV: Fix NULL pointer dereference in uv_flush_tlb_others() if the 'nobau' boot option is used
        x86, build: Pass in additional -mno-mmx, -mno-sse options
    • Merge tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 9199c4ca
      Linus Torvalds authored
      Pull PCI updates from Bjorn Helgaas:
       "PCI device hotplug
          - Move device_del() from pci_stop_dev() to pci_destroy_dev() (Rafael
            Wysocki)
      
        Host bridge drivers
          - Update maintainers for DesignWare, i.MX6, Armada, R-Car (Bjorn
            Helgaas)
          - mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin
            (Jason Gunthorpe)
      
        Miscellaneous
          - Avoid unnecessary CPU switch when calling .probe() (Alexander
            Duyck)
          - Revert "workqueue: allow work_on_cpu() to be called recursively"
            (Bjorn Helgaas)
          - Disable Bus Master only on kexec reboot (Khalid Aziz)
          - Omit PCI ID macro strings to shorten quirk names for LTO (Michal
            Marek)"
      
      * tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
        PCI: Disable Bus Master only on kexec reboot
        PCI: mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin
        PCI: Omit PCI ID macro strings to shorten quirk names
        PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()
        Revert "workqueue: allow work_on_cpu() to be called recursively"
        PCI: Avoid unnecessary CPU switch when calling driver .probe() method
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · b5745c59
      Linus Torvalds authored
      Pull SELinux fixes from James Morris.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute()
        selinux: look for IPsec labels on both inbound and outbound packets
        selinux: handle TCP SYN-ACK packets correctly in selinux_ip_postroute()
        selinux: handle TCP SYN-ACK packets correctly in selinux_ip_output()
        selinux: fix possible memory leak
    • Revert "selinux: consider filesystem subtype in policies" · 29b1deb2
      Linus Torvalds authored
      This reverts commit 102aefdd.
      
      Tom London reports that it causes sync() to hang on Fedora rawhide:
      
        https://bugzilla.redhat.com/show_bug.cgi?id=1033965
      
      and Josh Boyer bisected it down to this commit.  Reverting the commit in
      the rawhide kernel fixes the problem.
      
      Eric Paris root-caused it to incorrect subtype matching in that commit
      breaking fuse, and has a tentative patch, but by now we're better off
      retrying this in 3.14 rather than playing with it any more.
      Reported-by: Tom London <selinux@gmail.com>
      Bisected-by: Josh Boyer <jwboyer@fedoraproject.org>
      Acked-by: Eric Paris <eparis@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Anand Avati <avati@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • igb: Fix for issue where values could be too high for udelay function. · df29df92
      Carolyn Wyborny authored
      This patch changes the igb_phy_has_link function to check the value of the
      parameter before deciding whether to use udelay or mdelay, to be sure that
      the value is not too high for the udelay function.
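
      The selection between the two delay primitives can be sketched as
      follows. This is a user-space illustration, not the driver code: the
      1000 microsecond threshold, the delay_usecs() helper and the fake_*
      stand-ins for udelay()/mdelay() are assumptions made for the example.

```c
/* Stand-ins for the kernel's udelay()/mdelay(); they only record calls. */
static unsigned long udelay_calls, mdelay_calls;
static unsigned long last_udelay_arg, last_mdelay_arg;

static void fake_udelay(unsigned long usecs)
{
	udelay_calls++;
	last_udelay_arg = usecs;
}

static void fake_mdelay(unsigned long msecs)
{
	mdelay_calls++;
	last_mdelay_arg = msecs;
}

/* Assumed threshold: intervals of 1 ms or more go through mdelay. */
#define USEC_PER_MSEC_ASSUMED	1000UL

/* Pick the delay primitive according to the requested interval. */
static void delay_usecs(unsigned long usec_interval)
{
	if (usec_interval >= USEC_PER_MSEC_ASSUMED)
		/* Integer division drops any sub-millisecond remainder. */
		fake_mdelay(usec_interval / USEC_PER_MSEC_ASSUMED);
	else
		fake_udelay(usec_interval);
}
```

      In this sketch, delay_usecs(500) ends up in the microsecond path, while
      delay_usecs(100000) is converted to a 100 ms call on the millisecond
      path.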
      
      CC: stable kernel <stable@vger.kernel.org> # 3.9+
      Signed-off-by: Sunil K Pandey <sunil.k.pandey@intel.com>
      Signed-off-by: Kevin B Smith <kevin.b.smith@intel.com>
      Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • i40e: fix null dereference · 3c325ced
      Jesse Brandeburg authored
      If the vsi->tx_rings structure is NULL we don't want to panic.
      
      Change-Id: Ic694f043701738c434e8ebe0caf0673f4410dc10
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 14 Dec, 2013 3 commits
  4. 13 Dec, 2013 21 commits