1. 14 Feb, 2012 14 commits
    • Yinghai Lu's avatar
      PCI: Use add_list in pcie hotplug path. · 8424d759
      Yinghai Lu authored
      We need add size for hot plug path when pluging in hotplug chassis
      without cards.
      
      -v2: change descriptions. make it applicable after "pci: Check bridge
           resources after resource allocation."
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      8424d759
    • Yinghai Lu's avatar
      PCI: try to assign required+option size first · 3e6e0d80
      Yinghai Lu authored
      We found reassignment can not find a range for one resource, even if the
      total available range is large enough.
      
      bridge b1:02.0 will need 2M+3M
      bridge b1:03.0 will need 2M+3M
      
      so bridge b0:00.0 will get assigned: 4M : [f8000000-f83fffff]
         later is reassigned to 10M : [f8000000-f9ffffff]
      
      b1:02.0 is assigned to 2M : [f8000000-f81fffff]
      b1:03.0 is assigned to 2M : [f8200000-f83fffff]
      
      After that b1:03.0 get chance to be reassigned to [f8200000-f86fffff],
      but b1:02.0 will not have chance to expand, because b1:03.0 is using in
      middle one.
      
      [  187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000
      [  187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000
      [  187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
      [  187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
      [  187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000
      [  187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff]
      [  187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref]
      [  187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff]
      [  187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff]
      [  187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref]
      [  187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff]
      [  187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref]
      [  188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff]
      [  188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000)
      [  188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff]
      [  188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit]
      [  188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff])
      [  188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref]
      [  188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit]
      [  188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff])
      [  188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000)
      [  188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit]
      [  188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff])
      [  188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2]
      [  188.126512] pci 0000:b1:02.0:   bridge window [mem 0xf8000000-0xf81fffff]
      [  188.133328] pci 0000:b1:02.0:   bridge window [mem 0xf5000000-0xf50fffff pref]
      [  188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit]
      [  188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff])
      [  188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref]
      [  188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit]
      [  188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff])
      [  188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
      [  188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit]
      [  188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff])
      [  188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
      [  188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit]
      [  188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff])
      [  188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3]
      [  188.251107] pci 0000:b1:03.0:   bridge window [mem 0xf8200000-0xf86fffff]
      [  188.257922] pci 0000:b1:03.0:   bridge window [mem 0xf5100000-0xf51fffff pref]
      [  188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
      [  188.270443] pci 0000:b0:00.0:   bridge window [mem 0xf8000000-0xf89fffff]
      [  188.277250] pci 0000:b0:00.0:   bridge window [mem 0xf5000000-0xf51fffff pref]
      [  188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf]
      [  188.290184] pcieport 0000:80:02.2:   bridge window [io  0xa000-0xbfff]
      [  188.296735] pcieport 0000:80:02.2:   bridge window [mem 0xf8000000-0xf8ffffff]
      [  188.303963] pcieport 0000:80:02.2:   bridge window [mem 0xf5000000-0xf5ffffff 64bit pref]
      
      Thus b2:00.0 BAR 9 does not get assigned...
      
      root cause:
      b1:02.0 can not be added more range, because b1:03.0 is just after it;
      no space between the required ranges.
      
      Solution:
      Try to assign required + optional all together at first, and if that
      fails, try again with just the required resources.
      
      -v2: seperate add_to_list change() to another patch according to Jesse.
           seperate get_res_add_size() moving to another patch according to Jesse.
           add !realloc_head->next check if the list is empty to bail early
           according to Jesse.
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      3e6e0d80
    • Yinghai Lu's avatar
      PCI: Move get_res_add_size() function · 1c372353
      Yinghai Lu authored
      Need to call it from __assign_resources_sorted() later and we'd like to
      avoid a forward declaraion.
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      1c372353
    • Yinghai Lu's avatar
      PCI: Make add_to_list() return status · ef62dfef
      Yinghai Lu authored
      Will be used for resource_list_x duplication when trying
      requested+optional at first.
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      ef62dfef
    • Yinghai Lu's avatar
      PCI : Calculate right add_size · a4ac9fea
      Yinghai Lu authored
      During debug of one SRIOV enabled hotplug device, we found found that
      add_size is not passed properly.
      
      The device has devices under two level bridges:
      
       +-[0000:80]-+-00.0-[81-8f]--
       |           +-01.0-[90-9f]--
       |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0  Oracle Corporation Device
       |           |                               \-03.0-[a3]--+-00.0  Oracle Corporation Device
      
      Which means later the parent bridge will not try to add a big enough range:
      
      [  557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
      [  557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
      [  557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
      [  557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
      [  557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
      [  557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
      [  557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
      [  557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
      [  557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
      [  557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]
      
      It turns out we did not calculate size1 properly.
      
      static resource_size_t calculate_memsize(resource_size_t size,
                      resource_size_t min_size,
                      resource_size_t size1,
                      resource_size_t old_size,
                      resource_size_t align)
      {
              if (size < min_size)
                      size = min_size;
              if (old_size == 1 )
                      old_size = 0;
              if (size < old_size)
                      size = old_size;
              size = ALIGN(size + size1, align);
              return size;
      }
      
      We should not pass add_size with min_size in calculate_memsize since
      that will make add_size not contribute final add_size.
      
      So just pass add_size with size1 to calculate_memsize().
      
      With this change, we should have chance to remove extra addon in
      pci_reassign_resource.
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      a4ac9fea
    • Masanari Iida's avatar
      PCI: Fix typo in setup-res.c · 0dea210b
      Masanari Iida authored
      Correct spelling "resouce" to "resource" in
      dricers/pci/setup-res.c
      Signed-off-by: default avatarMasanari Iida <standby24x7@gmail.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      0dea210b
    • Bjorn Helgaas's avatar
      x86/PCI: don't fall back to defaults if _CRS has no apertures · 316d86fe
      Bjorn Helgaas authored
      Host bridges that lead to things like the Uncore need not have any
      I/O port or MMIO apertures.  For example, in this case:
      
          ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
          PCI: root bus ff: using default resources
          PCI host bridge to bus 0000:ff
          pci_bus 0000:ff: root bus resource [io  0x0000-0xffff]
          pci_bus 0000:ff: root bus resource [mem 0x00000000-0x3fffffffffff]
      
      we should not pretend those default resources are available on bus ff.
      
      CC: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      316d86fe
    • Konrad Rzeszutek Wilk's avatar
      xen/pciback: Support pci_reset_function, aka FLR or D3 support. · 11608318
      Konrad Rzeszutek Wilk authored
      We use the __pci_reset_function_locked to perform the action.
      Also on attaching ("bind") and detaching ("unbind") we save and
      restore the configuration states. When the device is disconnected
      from a guest we use the "pci_reset_function" to also reset the
      device before being passed to another guest.
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      11608318
    • Konrad Rzeszutek Wilk's avatar
      PCI: Introduce __pci_reset_function_locked to be used when holding device_lock. · 6fbf9e7a
      Konrad Rzeszutek Wilk authored
      The use case of this is when a driver wants to call FLR when a device
      is attached to it using the SysFS "bind" or "unbind" functionality.
      
      The call chain when a user does "bind" looks as so:
      
       echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
      
      and ends up calling:
        driver_bind:
          device_lock(dev);  <=== TAKES LOCK
          XXXX_probe:
               .. pci_enable_device()
               ...__pci_reset_function(), which calls
                       pci_dev_reset(dev, 0):
                              if (!0) {
                                      device_lock(dev) <==== DEADLOCK
      
      The __pci_reset_function_locked function allows the the drivers
      'probe' function to call the "pci_reset_function" while still holding
      the driver mutex lock.
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      6fbf9e7a
    • Julia Lawall's avatar
      PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap · 8f0cdddc
      Julia Lawall authored
      Add missing iounmap in error handling code, in a case where the function
      already preforms iounmap on some other execution path.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression e;
      statement S,S1;
      int ret;
      @@
      e = \(ioremap\|ioremap_nocache\)(...)
      ... when != iounmap(e)
      if (<+...e...+>) S
      ... when any
          when != iounmap(e)
      *if (...)
         { ... when != iounmap(e)
           return ...; }
      ... when any
      iounmap(e);
      // </smpl>
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      8f0cdddc
    • Amos Kong's avatar
      PCI: Can continually add funcs after adding func0 · f382a086
      Amos Kong authored
      Boot up a KVM guest, and hotplug multifunction
      devices(func1,func2,func0,func3) to guest.
      
      for i in 1 2 0 3;do
      qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
      (qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
      (qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
      done
      
      In linux kernel, when func0 of the slot is hot-added, the whole
      slot will be marked as 'enabled', then driver will ignore other new
      hotadded funcs.
      But in Win7 & WinXP, we can continaully add other funcs after adding
      func0, all funcs will be added in guest.
      
      drivers/pci/hotplug/acpiphp_glue.c:
      static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
      {
      ....
              for (slot = bridge->slots; slot; slot = slot->next) {
                      if (slot->flags & SLOT_ENABLED) {
                              acpiphp_disable_slot()
                      else
                              acpiphp_enable_slot()
      ....                              |
      }                                 v
                                  enable_device()
                                        |
                                        v
              //only don't enable slot if func0 is not added
      	list_for_each_entry(func, &slot->funcs, sibling) {
                     ...
              }
             slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'
      
      This patch just make pci driver can continaully add funcs after adding
      func 0. Only mark slot to 'enabled' when all funcs are added.
      
      For pci multifunction hotplug, we can add functions one by one(func 0 is
      necessary), and all functions will be removed in one time.
      Signed-off-by: default avatarAmos Kong <akong@redhat.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      f382a086
    • Myron Stowe's avatar
      x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list · 6535943f
      Myron Stowe authored
      This patch converts the underlying maintenance aspects of FW-assigned
      BIOS BAR values from a statically allocated array within struct pci_dev
      to a list of temporary, stand alone, entries.
      Signed-off-by: default avatarMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      6535943f
    • Myron Stowe's avatar
      x86/PCI: Infrastructure to maintain a list of FW-assigned BIOS BAR values · 925845bd
      Myron Stowe authored
      Commit 58c84eda introduced functionality to try and reinstate the
      original BIOS BAR addresses of a PCI device when normal resource
      assignment attempts fail.  To keep track of the BIOS BAR addresses,
      struct pci_dev was augmented with an array to hold the BAR addresses
      of the PCI device: 'resource_size_t fw_addr[DEVICE_COUNT_RESOURCE]'.
      
      The reinstatement of BAR addresses is an uncommon event leaving the
      'fw_addr' array unused under normal circumstances.  This functionality
      is also currently architecture specific with an implementation limited
      to x86.  As the use of struct pci_dev is so prevalent, having the
      'fw_addr' array residing within such seems somewhat wasteful.
      
      This patch introduces a stand alone data structure and interfacing
      routines for maintaining a list of FW-assigned BIOS BAR value entries.
      Signed-off-by: default avatarMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      925845bd
    • Myron Stowe's avatar
      PCI: Fix starting basis for resource requests · 351fc6d1
      Myron Stowe authored
      pci_revert_fw_address() is used to reinstate a PCI device's original
      FW-assigned BIOS BAR value(s) if normal resource assignment fails.
      
      When attempting to reinstate an address, the point within the resource
      tree from which to attempt the new resource request should be the parent
      resource corresponding to the device, not the base of the resource tree
      (ioport_resource or iomem_resource).  For PCI devices this would
      typically be the resource corresponding to the upstream PCI host bridge
      or P2P bridge aperture.
      
      This patch sets the point within the resource tree to attempt a new
      resource assignment request to the PCI device's parent resource and only
      if that fails does it fall back to the base ioport_resource or
      iomem_resource.
      Signed-off-by: default avatarMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      351fc6d1
  2. 10 Feb, 2012 3 commits
    • Yinghai Lu's avatar
      PCI: Fix pci cardbus removal · 3682a394
      Yinghai Lu authored
      During test busn_res allocation with cardbus, found pci card removal is not
      working anymore, and it turns out it is broken by:
      
      |commit 79cc9601
      |Date:   Tue Nov 22 21:06:53 2011 -0800
      |
      |    PCI: Only call pci_stop_bus_device() one time for child devices at remove
      
      The above changed the behavior of pci_remove_behind_bridge that
      yenta_cardbus depended on.  So restore the old behavoir of
      pci_remove_behind_bridge (which requires stopping and removing of all
      devices) by:
      
      1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
         __pci_remove_bus_device() call it instead.
      2. add pci_stop_behind_bridge that will stop devices behind a bridge
      3. add back pci_remove_behind_bridge that will stop and remove devices
         under bridge.
      
      -v2: update commit description a little bit.
      Tested-by: default avatarDominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      3682a394
    • Vaidyanathan Srinivasan's avatar
      PCI: set pci sriov page size before reading SRIOV BAR · 8161fe91
      Vaidyanathan Srinivasan authored
      For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
      the PCI_SRIOV_BAR are queried.  The sys pagesize defaults to 4k,
      so this change is required on powerpc box with 64k base page size.
          
      This is a regression caused due to moving SRIOV init to sriov_enable().
          
      | commit afd24ece
      | Author: Ram Pai <linuxram@us.ibm.com>
          
      | PCI: delay configuration of SRIOV capability
      | The SRIOV capability, namely page size and total_vfs of a device are
      | configured during enumeration phase of the device.  This can potentially
      | interfere with the PCI operations of the platform, if the IOV capability
      | of the device is not enabled.
      Signed-off-by: default avatarVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: default avatarRam Pai <linuxram@us.ibm.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      8161fe91
    • Yinghai Lu's avatar
      PCI: workaround hard-wired bus number V2 · 71f6bd4a
      Yinghai Lu authored
      Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
      when using ACPI resources (_CRS).
      This is default, a manual workaround (without this patch)
      would be pci=nocrs boot param.
      
      V2: Add dev_warn if the workaround is hit. This should reveal
      how common such setups are (via google) and point to possible
      problems if things are still not working as expected.
      -> Suggested by Jan Beulich.
      
      Cc: stable@vger.kernel.org
      Tested-by: garyhade@us.ibm.com
      Signed-off-by: default avatarYinghai Lu <yinghai.lu@oracle.com>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      71f6bd4a
  3. 27 Jan, 2012 11 commits
  4. 26 Jan, 2012 12 commits