1. 29 Jan, 2010 1 commit
    • x86, irq: Update the vector domain for legacy irqs handled by io-apic · 69c89efb
      Suresh Siddha authored
      Since the recent change that stopped reserving IRQ0_VECTOR..IRQ15_VECTOR on
      all cpu's, irq 0..15 start out directed to (and handled on) cpu 0.
      
      In the logical flat mode, once the AP's are online (and before irqbalance
      comes into the picture), the kernel intends to handle these IRQ's on any cpu
      (the logical flat mode allows multiple cpu's to be specified as the irq
      destination, and the chipset-based routing can deliver the interrupt to any
      one of the specified cpu's). Our recent change broke this: only cpu 0 ended
      up as the destination, even when the kernel asked for all online cpu's in
      the logical flat mode case.
      
      Fix this by updating the vector allocation domain (cfg->domain) for legacy
      irqs when the IO-APIC handles them.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100129194330.207790269@sbs-t61.sc.intel.com>
      Tested-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      69c89efb
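The effect described in this commit message can be illustrated with a minimal sketch. It assumes the usual logical flat mode layout, where each CPU owns one bit of an 8-bit logical ID; the helper names `flat_logical_id` and `flat_dest_from_mask` are hypothetical and are not the kernel's own functions.

```c
#include <assert.h>
#include <stdint.h>

/* Logical flat mode: one bit per CPU in an 8-bit logical ID,
 * so at most 8 CPUs can be addressed. */
#define FLAT_MODE_MAX_CPUS 8

/* Hypothetical helper: logical ID of a single CPU (bit position == cpu). */
static uint8_t flat_logical_id(unsigned int cpu)
{
    assert(cpu < FLAT_MODE_MAX_CPUS);
    return (uint8_t)(1u << cpu);
}

/* Hypothetical helper: destination field for an irq whose vector
 * allocation domain is 'domain', restricted to the online CPUs. */
static uint8_t flat_dest_from_mask(uint32_t domain, uint32_t online)
{
    return (uint8_t)(domain & online & 0xffu);
}
```

With four CPUs online (mask 0x0f), a domain that contains only cpu 0 yields destination 0x01, which is the symptom described above; a domain covering all CPUs yields 0x0f, so the chipset may deliver to any of them.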
  2. 19 Jan, 2010 1 commit
    • x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's · 97943390
      Suresh Siddha authored
      Currently IRQ0..IRQ15 are assigned to IRQ0_VECTOR..IRQ15_VECTOR's on
      all the cpu's.
      
      If these IRQ's are handled by legacy pic controller, then the kernel
      handles them only on cpu 0. So there is no need to block this vector
      space on all cpu's.
      
      Similarly, if these IRQ's are handled by the IO-APIC, then the IRQ affinity
      will determine on which cpu's we need to allocate the vector resource for
      that particular IRQ. This can be done dynamically, and here too there
      is no need to block 16 vectors for IRQ0..IRQ15 on all cpu's.
      
      Fix this by initially assigning IRQ0..IRQ15 to IRQ0_VECTOR..IRQ15_VECTOR only
      on cpu 0. If legacy controllers like the PIC handle these irq's, then
      this configuration will be fixed. If more modern controllers like the IO-APIC
      handle these IRQ's, then we start with this configuration and, as IRQ's
      migrate, the vectors (and cpu's) associated with these IRQ's change dynamically.
      
      This will free up the block of 16 vectors on other cpu's which don't handle
      IRQ0..IRQ15; those vectors can now be used for other IRQ's that the
      particular cpu handles.
      
      [ hpa: this is also an architectural cleanup for future legacy-PIC-free
        configurations. ]
      [ hpa: fixed typo NR_LEGACY_IRQS -> NR_IRQS_LEGACY ]
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1263932453.2814.52.camel@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      97943390
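A rough sketch of the initial assignment this commit message describes: the legacy block is populated only in cpu 0's vector table, leaving the same 16 slots free elsewhere. The table layout, the value 0x30 for `IRQ0_VECTOR`, the CPU count `DEMO_NR_CPUS`, and both helper names are assumptions made for illustration, not the kernel's actual data structures.

```c
#include <assert.h>

#define NR_VECTORS      256
#define NR_IRQS_LEGACY  16
#define IRQ0_VECTOR     0x30  /* assumed base of the legacy vector block */
#define DEMO_NR_CPUS    4     /* hypothetical CPU count for the sketch */

/* vector_irq[cpu][vector] == irq handled there, or -1 if the slot is free. */
static int vector_irq[DEMO_NR_CPUS][NR_VECTORS];

/* Mark every slot free, then assign IRQ0..IRQ15 on cpu 0 only. */
static void init_vector_tables(void)
{
    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        for (int vec = 0; vec < NR_VECTORS; vec++)
            vector_irq[cpu][vec] = -1;

    for (int irq = 0; irq < NR_IRQS_LEGACY; irq++)
        vector_irq[0][IRQ0_VECTOR + irq] = irq;
}

/* Count how many of the 16 legacy-block vectors are still free on 'cpu'. */
static int free_legacy_vectors(int cpu)
{
    int free_slots = 0;

    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    for (int i = 0; i < NR_IRQS_LEGACY; i++)
        if (vector_irq[cpu][IRQ0_VECTOR + i] == -1)
            free_slots++;
    return free_slots;
}
```

After `init_vector_tables()`, cpu 0 has no free slots in the legacy block while every other cpu has all 16 available for other IRQ's.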
  3. 18 Jan, 2010 2 commits
    • x86, irq: Use 0x20 for the IRQ_MOVE_CLEANUP_VECTOR instead of 0x1f · 6579b474
      Suresh Siddha authored
      After talking to some more folks inside Intel (Peter Anvin, Asit Mallick),
      the safest option (for future compatibility etc.) appeared to be using vector
      0x20 for IRQ_MOVE_CLEANUP_VECTOR instead of vector 0x1f (which is documented
      as a reserved vector in the Intel IA32 manuals).
      
      Also, we don't need to reserve the entire priority level (all 16 vectors in
      the priority bucket that IRQ_MOVE_CLEANUP_VECTOR falls into): the x86
      architecture (section 10.9.3 in SDM Vol3a) specifies that within a
      priority level, the higher the vector number, the higher the priority.
      Hence we don't need to reserve the complete priority level 0x20-0x2f for
      the IRQ migration cleanup logic.
      
      So change IRQ_MOVE_CLEANUP_VECTOR to 0x20 and allow 0x21-0x2f to be used
      for device interrupts. 0x30-0x3f will be used for ISA interrupts (these
      can also be migrated in the context of the IOAPIC and hence need to be at a
      higher priority level than IRQ_MOVE_CLEANUP_VECTOR).
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100114002118.521826763@sbs-t61.sc.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      6579b474
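The priority argument in this commit message can be checked with a small sketch. It relies on the APIC rule that the priority class is the upper four bits of the vector, and that within a class the higher vector number wins; the helper names `priority_class` and `outranks`, and the value 0x30 for `IRQ0_VECTOR`, are assumptions for illustration.

```c
#include <assert.h>

#define IRQ_MOVE_CLEANUP_VECTOR 0x20
#define IRQ0_VECTOR             0x30  /* assumed start of the ISA block */

/* APIC priority class: bits 7:4 of the vector number. */
static unsigned int priority_class(unsigned int vector)
{
    assert(vector < 256);
    return vector >> 4;
}

/* Nonzero if vector 'a' has higher priority than vector 'b':
 * a higher class wins; within one class the higher vector wins. */
static int outranks(unsigned int a, unsigned int b)
{
    if (priority_class(a) != priority_class(b))
        return priority_class(a) > priority_class(b);
    return (a & 0xfu) > (b & 0xfu);
}
```

Vectors 0x21-0x2f share class 2 with the cleanup vector yet still outrank it, and the ISA block at 0x30-0x3f sits a full class higher, which is why only the single vector 0x20 needs to be set aside.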
    • x86, vmi: Fix vmi_get_timer_vector() to use IRQ0_VECTOR · 722b3654
      Suresh Siddha authored
      FIRST_DEVICE_VECTOR is going away, and borrowing
      FIRST_DEVICE_VECTOR / FIRST_EXTERNAL_VECTOR here looks like a bad hack when
      what the code apparently needs is IRQ0_VECTOR.
      
      Fix vmi_get_timer_vector() to use IRQ0_VECTOR.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100114002118.436172066@sbs-t61.sc.intel.com>
      Cc: Alok N Kataria <akataria@vmware.com>
      Cc: Zach Amsden <zach@vmware.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      722b3654
  4. 05 Jan, 2010 2 commits
    • x86, apic: Don't waste a vector to improve vector spread · ea943966
      H. Peter Anvin authored
      We want to use a vector-assignment sequence that avoids stumbling onto
      0x80 earlier in the sequence, in order to improve the spread of
      vectors across priority levels on machines with a small number of
      interrupt sources.  Right now, this is done by simply making the first
      vector (0x31 or 0x41) completely unusable.  This is unnecessary: all
      we need is to start assignment at a +1 offset; we don't actually need
      to prohibit the usage of this vector once we have wrapped around.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4B426550.6000209@kernel.org>
      ea943966
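To see why a +1 starting offset delays reaching 0x80, here is a simplified walk over the vector space. It assumes a stride of 8 between successive allocations and an upper limit passed in by the caller; the stride, the limit, and the function name `steps_until_syscall_vector` are assumptions for the sketch rather than details taken from the commit. The sketch only reports when the walk first lands on 0x80; it does not model what the allocator does with that vector.

```c
#include <assert.h>

#define SYSCALL_VECTOR 0x80
#define STRIDE         8     /* assumed step between successive vectors */

/* Walk base+start_offset, +STRIDE, ... and, on reaching 'limit', restart
 * at base + (next offset). Returns how many vectors are visited before
 * the walk first lands on SYSCALL_VECTOR, or -1 if it never does. */
static int steps_until_syscall_vector(int base, int start_offset, int limit)
{
    int offset = start_offset % STRIDE;
    int vector = base + start_offset;

    assert(base < limit && limit <= 256);
    for (int steps = 0; steps < STRIDE * 256; steps++) {
        if (vector == SYSCALL_VECTOR)
            return steps;
        vector += STRIDE;
        if (vector >= limit) {
            /* Wrapped around: move on to the next offset. */
            offset = (offset + 1) % STRIDE;
            vector = base + offset;
        }
    }
    return -1;
}
```

Starting at 0x30 with offset 0, the walk reaches 0x80 after only ten allocations; starting at offset 1, it goes through all the other offsets first, and the first vector becomes usable again once the walk has wrapped around.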
    • x86, apic: Reclaim IDT vectors 0x20-0x2f · 99d113b1
      H. Peter Anvin authored
      Reclaim 16 IDT vectors and make them available for general allocation.
      
      Reclaim vectors 0x20-0x2f by reallocating the IRQ_MOVE_CLEANUP_VECTOR
      to vector 0x1f.  This is in the range of vector numbers that is
      officially reserved for the CPU (for exceptions), however, the use of
      the APIC to generate any vector 0x10 or above is documented, and the
      CPU internally can receive any vector number (the legacy BIOS uses INT
      0x08-0x0f for interrupts, as messed up as that is.)
      
      Since IRQ_MOVE_CLEANUP_VECTOR has to be alone in the lowest-numbered
      priority level (block of 16), this effectively enables us to reclaim
      an otherwise-unusable APIC priority level and put it to use.
      
      Since this is a transient kernel-only allocation we can change it at
      any time, and if/when there is an exception at vector 0x1f this
      assignment needs to be changed as part of OS enabling that new feature.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4B4284C6.9030107@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      99d113b1
  5. 30 Dec, 2009 1 commit
    • x86: Increase NR_IRQS and nr_irqs · 9959c888
      Yinghai Lu authored
      I have a system with lots of igb and ixgbe devices; when iov/vf are
      enabled for them, we hit the limit of 3064.
      
      When a system has 20 PCIe cards installed, each card has 2
      functions, and each function needs 64 MSI-X vectors, it
      may need 20 * 2 * 64 = 2560 for MSI-X,
      
      but if iov and vf are enabled it
      may need 20 * 2 * 64 * 3 = 7680 for MSI-X.
      Assume a system with 5 ioapics; nr_irqs_gsi will be 120.
      
      With NR_CPUS = 512 and nr_cpu_ids = 128, we
      will have NR_IRQS = 256 + 512 * 64 = 33024 and
      will have nr_irqs = 120 + 8 * 128 + 120 * 64 = 8824.
      
      When SPARSE_IRQ is not set, there is no increase with kernel data
      size.
      
      When NR_CPUS=128 and SPARSE_IRQ is set:
          text     data      bss      dec     hex filename
      21837444  4216564 12480736 38534744 24bfe58 vmlinux.before
      21837442  4216580 12480736 38534758 24bfe66 vmlinux.after
      When NR_CPUS=4096 and SPARSE_IRQ is set:
          text     data      bss      dec     hex filename
      21878619  5610244 13415392 40904255 270263f vmlinux.before
      21878617  5610244 13415392 40904253 270263d vmlinux.after
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4B398ECD.1080506@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9959c888
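The arithmetic in this commit message can be reproduced with a few small functions. The formulas are read directly off the numbers quoted above (256 + NR_CPUS * 64, and nr_irqs_gsi + 8 * nr_cpu_ids + nr_irqs_gsi * 64); the function names and the `multiplier` parameter for the iov/vf case are hypothetical labels for this sketch.

```c
#include <assert.h>

#define NR_VECTORS 256

/* Compile-time ceiling quoted above: 256 + NR_CPUS * 64. */
static long nr_irqs_compile_time(long nr_cpus)
{
    assert(nr_cpus > 0);
    return NR_VECTORS + nr_cpus * 64;
}

/* Run-time estimate quoted above:
 * nr_irqs_gsi + 8 * nr_cpu_ids + nr_irqs_gsi * 64. */
static long nr_irqs_runtime(long nr_irqs_gsi, long nr_cpu_ids)
{
    return nr_irqs_gsi + 8 * nr_cpu_ids + nr_irqs_gsi * 64;
}

/* MSI-X demand: cards * functions * vectors per function, scaled by
 * 'multiplier' (1 without iov/vf, 3 in the commit's iov/vf example). */
static long msix_demand(long cards, long functions,
                        long vectors_per_function, long multiplier)
{
    return cards * functions * vectors_per_function * multiplier;
}
```

With the commit's example values, the run-time estimate of 8824 covers the 7680 MSI-X vectors needed with iov/vf enabled, whereas the previous limit of 3064 did not.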
  6. 24 Dec, 2009 33 commits