1. 01 Jul, 2013 15 commits
    • powerpc: Set cpu sibling mask before online cpu · cce606fe
      Li Zhong authored
      It seems the following race is possible:
      
      	cpu0					cpux
      smp_init->cpu_up->_cpu_up
      	__cpu_up
      		kick_cpu(1)
      -------------------------------------------------------------------------
      		waiting online			...
      		...				notify CPU_STARTING
      							set cpux active
      						set cpux online
      -------------------------------------------------------------------------
      		finish waiting online
      		...
      sched_init_smp
      	init_sched_domains(cpu_active_mask)
      		build_sched_domains
      						set cpux sibling info
      -------------------------------------------------------------------------
      
      Execution on cpu0 and cpux can be concurrent between any two separator
      lines.
      
      So if the cpux sibling information is set too late (normally
      impossible, but it can be triggered by adding some delay in
      start_secondary, after setting the cpu online), build_sched_domains()
      running on cpu0 might see cpux active with an empty sibling mask, and
      then access a bad address, as in the following oops:
      
      [    0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f
      [    0.099868] Faulting instruction address: 0xc0000000000b7a64
      [    0.099883] Oops: Kernel access of bad area, sig: 11 [#1]
      [    0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
      [    0.099922] Modules linked in:
      [    0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425c-dirty #16
      [    0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000
      [    0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934
      [    0.099985] REGS: c0000001fed7f760 TRAP: 0300   Not tainted  (3.10.0-rc1-00120-gb973425c-dirty)
      [    0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24272828  XER: 20000003
      [    0.100045] SOFTE: 1
      [    0.100053] CFAR: c000000000445ee8
      [    0.100064] DAR: c00000038518078f, DSISR: 40000000
      [    0.100073]
      GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010
      GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000
      GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f
      GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000
      GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000
      GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000
      GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0
      [    0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4
      [    0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4
      [    0.100318] Call Trace:
      [    0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable)
      [    0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14
      [    0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654
      [    0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c
      [    0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c
      [    0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68
      [    0.100445] Instruction dump:
      [    0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001
      [    0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010
      [    0.100559] ---[ end trace 31fd0ba7d8756001 ]---
      
      This patch moves the sibling map update before
      notify_cpu_starting() and setting the cpu online, and adds a write
      barrier there to make sure the sibling maps are updated before the
      active and online masks.
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
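The ordering this commit describes (write the sibling data first, then publish the online bit) can be illustrated with a minimal userspace sketch. It is not the kernel code: `sibling_map`, `online_mask`, `secondary_start()` and `read_siblings()` are hypothetical stand-ins, and C11 release/acquire atomics are used here in place of the kernel's write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 16

/* Hypothetical stand-ins for the per-cpu sibling map and the online mask;
 * one bit per CPU. */
static uint32_t sibling_map[NR_CPUS];
static atomic_uint online_mask;

/* Secondary CPU: fill in sibling info first, then publish "online". */
static void secondary_start(unsigned int cpu, uint32_t siblings)
{
    sibling_map[cpu] = siblings;
    /* The release ordering plays the role of the write barrier: the
     * sibling map store cannot be reordered after the online-bit store. */
    atomic_fetch_or_explicit(&online_mask, 1u << cpu, memory_order_release);
}

/* Boot CPU: read sibling info only for CPUs already seen online. */
static bool read_siblings(unsigned int cpu, uint32_t *out)
{
    unsigned int mask =
        atomic_load_explicit(&online_mask, memory_order_acquire);

    if (!(mask & (1u << cpu)))
        return false;          /* not published yet */
    *out = sibling_map[cpu];   /* guaranteed to see the stored siblings */
    return true;
}
```

With this ordering, a reader that observes the online bit can never observe an empty sibling map for that CPU, which is the property the oops above shows being violated.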
    • mac: Make cuda_init_via() __init · 330dae19
      Geert Uytterhoeven authored
      cuda_init_via() is called from find_via_cuda() only, which is __init.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Delete __cpuinit usage from all users · 061d19f2
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the powerpc uses of the __cpuinit macros.  There
      are no __CPUINIT users in assembly files in powerpc.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • macintosh: Convert use of typedef ctl_table to struct ctl_table · 5eb969d0
      Joe Perches authored
      This typedef is unnecessary and should just be removed.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/idle: Convert use of typedef ctl_table to struct ctl_table · cc293bf7
      Joe Perches authored
      This typedef is unnecessary and should just be removed.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/iommu: Remove unused pci_iommu_init() and pci_direct_iommu_init() · 5524f3fc
      Bjorn Helgaas authored
      pci_iommu_init() and pci_direct_iommu_init() are not referenced anywhere,
      so remove them.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc: Don't flush/invalidate the d/icache for an unknown relocation type · 348c2298
      Kevin Hao authored
      For an unknown relocation type, the value of r4 is just the 8-bit
      relocation type, so the sum of r4 and r7 may yield an invalid memory
      address. For example:
          In normal case:
                   r4 = c00xxxxx
                   r7 = 40000000
                   r4 + r7 = 000xxxxx
      
          For an unknown relocation type:
                   r4 = 000000xx
                   r7 = 40000000
                   r4 + r7 = 400000xx
         400000xx is an invalid memory address for a board which has just
         512M memory.
      
      Operations such as dcbst or icbi may cause a bus error for an
      invalid memory address on some platforms and then reset the board.
      So we should skip the d/icache flush/invalidate for
      an unknown relocation type.
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Acked-by: Suzuki K. Poulose <suzuki@in.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
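The address arithmetic in this commit message can be checked with a small sketch. The concrete register values below are illustrative stand-ins for the `xxxxx` and `xx` placeholders, `reloc_target()` and `addr_in_ram()` are hypothetical helpers, and RAM is assumed to start at physical address 0 on the 512M board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RAM_SIZE 0x20000000u   /* 512M, as in the commit's example board */

/* 32-bit register addition wraps modulo 2^32, as on a 32-bit CPU. */
static uint32_t reloc_target(uint32_t r4, uint32_t r7)
{
    return r4 + r7;
}

/* Assumes RAM occupies [0, RAM_SIZE); anything above is not backed. */
static bool addr_in_ram(uint32_t addr)
{
    return addr < RAM_SIZE;
}
```

With r7 = 0x40000000, a kernel virtual address such as 0xc0012345 wraps around to 0x00012345, which is inside RAM, while a bare relocation type such as 0xab yields 0x400000ab, which is not.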
    • powerpc/windfarm: Fix overtemperature clearing · 4bb29711
      Aaro Koskinen authored
      With pm81/pm91/pm121, when the overtemperature state is entered, and
      when it remains on after skipped ticks, the driver will try to leave
      it too soon (immediately on the next tick). This is because the active
      FAILURE_OVERTEMP state is not visible in "new_failure" variable of the
      current tick. Furthermore, the driver will keep trying to clear the condition
      in subsequent ticks as FAILURE_OVERTEMP remains set in the "last_failure"
      variable. These will start to trigger WARNINGs from the windfarm core:
      
      [  100.082735] windfarm: Clamping CPU frequency to minimum !
      [  100.108132] windfarm: Overtemp condition detected !
      [  101.952908] windfarm: Overtemp condition cleared !
      [...]
      [  102.980388] WARNING: at drivers/macintosh/windfarm_core.c:463
      [...]
      [  103.982227] WARNING: at drivers/macintosh/windfarm_core.c:463
      [...]
      [  105.030494] WARNING: at drivers/macintosh/windfarm_core.c:463
      [...]
      [  105.973666] WARNING: at drivers/macintosh/windfarm_core.c:463
      [...]
      [  106.977913] WARNING: at drivers/macintosh/windfarm_core.c:463
      
      Fix by adding a helper global variable. We leave the overtemp state only
      after all failure bits have been cleared.
      
      I saw this error on iMac G5 iSight (pm121). Also pm81/pm91 are fixed
      based on the observation that these are almost identical/copy-pasted code.
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
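The rule stated in this commit (leave the overtemp state only after all failure bits have been cleared) can be sketched as a tiny state machine. This is not the driver code: the bit values, the `overtemp_active` variable and the `tick()` function are hypothetical, chosen only to illustrate a helper variable that remembers the state across ticks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration. */
#define FAILURE_SENSOR   0x1u
#define FAILURE_OVERTEMP 0x4u

/* Helper state that survives across ticks. */
static bool overtemp_active;

/* Called once per tick with the failure bits observed on that tick.
 * Returns true while the overtemp state is held. */
static bool tick(unsigned int new_failure)
{
    if (new_failure & FAILURE_OVERTEMP)
        overtemp_active = true;      /* enter (or stay in) overtemp */
    else if (overtemp_active && new_failure == 0)
        overtemp_active = false;     /* leave only when nothing is failing */
    return overtemp_active;
}
```

In this sketch a tick that no longer reports FAILURE_OVERTEMP but still reports another failure keeps the state held, so the clear happens exactly once, on the first fully clean tick.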
    • powerpc/powernv: Use dev-node in PCI config accessors · 9bf41be6
      Gavin Shan authored
      Currently, we use the combination (PCI bus + devfn) in the PCI
      config accessors, and the PCI config accessors in EEH depend on them.
      However, it's not safe to refer to a PCI bus which might have been
      removed during hotplug. So we use the device node in the PCI
      config accessors instead, and the corresponding backends just reuse them.
      
      The patch also fixes one potential risk: we might have a frozen
      PE during early PCI probe time, before the PE
      mapping has been set up. So the errors should be counted against PE#0.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Avoid build warnings · eeb6361f
      Gavin Shan authored
      This patch avoids the following build warnings:
      
         The function .pnv_pci_ioda_fixup() references
         the function __init .eeh_init().
         This is often because .pnv_pci_ioda_fixup lacks a __init
      
         The function .pnv_pci_ioda_fixup() references
         the function __init .eeh_addr_cache_build().
         This is often because .pnv_pci_ioda_fixup lacks a __init
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Refactor the output message · 56ca4fde
      Gavin Shan authored
      We don't need the whole backtrace, only a one-line message, in
      the error reporting interrupt handler. For errors triggered by
      accessing PCI config space or MMIO, we replace "WARN(1, ...)" with
      pr_err() and dump_stack(). The patch also adds more output messages
      to indicate what the EEH core is doing. Besides, some printk() calls are
      replaced with pr_warning().
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Fix address catch for PowerNV · 88b6d14b
      Gavin Shan authored
      On the PowerNV platform, the EEH address cache isn't built correctly
      because we skipped EEH devices that have no PE bound. The patch
      fixes that.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/powernv: Replace variables with flags · 0b9e267d
      Gavin Shan authored
      We have 2 fields in "struct pnv_phb" to track the states. The patch
      replaces the two fields with one and introduces flags for it. The patch
      doesn't change the logic.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
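The general pattern of folding two state fields into one flags word can be sketched as follows. The commit message does not name the fields or flags, so `struct phb_state`, `PHB_FLAG_EEH_SUPPORTED`, `PHB_FLAG_EEH_ENABLED` and the helpers are all hypothetical, not the actual "struct pnv_phb" layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names; each former state field becomes one bit. */
#define PHB_FLAG_EEH_SUPPORTED (1u << 0)
#define PHB_FLAG_EEH_ENABLED   (1u << 1)

struct phb_state {
    unsigned int flags;   /* single field replacing two separate ones */
};

static void phb_set(struct phb_state *p, unsigned int flag)
{
    p->flags |= flag;
}

static void phb_clear(struct phb_state *p, unsigned int flag)
{
    p->flags &= ~flag;
}

static bool phb_test(const struct phb_state *p, unsigned int flag)
{
    return (p->flags & flag) != 0;
}
```

Each bit is set, cleared and tested independently, so code that previously read one of the two fields can test the corresponding flag without any change in logic.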
    • powerpc/eeh: Check PCIe link after reset · 652defed
      Gavin Shan authored
      After a reset (e.g. complete reset) to bring the fenced PHB
      back, the PCIe link might not be ready yet. The patch
      makes sure the PCIe link is ready before accessing its subordinate
      PCI devices. The patch also fixes the wrong values being restored to
      the PCI_COMMAND register for PCI bridges.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Don't collect PCI-CFG data on PHB · c35ae179
      Gavin Shan authored
      When the PHB is fenced or dead, it's pointless to collect the data
      from PCI config space of subordinate PCI devices since it should
      return 0xFF's. The patch also fixes a buffer overwrite while getting
      PCI config data.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  2. 30 Jun, 2013 3 commits
  3. 25 Jun, 2013 8 commits
  4. 21 Jun, 2013 14 commits