1. 01 Feb, 2019 8 commits
    • driver core: Make driver core own stateful device links · 72175d4e
      Rafael J. Wysocki authored
      Even though stateful device links are managed by the driver core in
      principle, their creators are allowed and sometimes even expected
      to drop references to them via device_link_del() or
      device_link_remove(), but that doesn't really play well with the
      "persistent" link concept.
      
      If "persistent" managed device links are created from driver
      probe callbacks, device_link_add() called to do that will take a
      new reference on the link each time the callback runs and those
      references will never be dropped, which kind of isn't nice.
      
      This issue arises because of the link reference counting carried
      out by device_link_add() for existing links, but that is only done to
      avoid deleting device links that may still be necessary, which
      shouldn't be a concern for managed (stateful) links.  These device
      links are managed by the driver core and whoever creates one of them
      will need it at least as long as until the consumer driver is detached
      from its device and deleting it may be left to the driver core just
      fine.
      
      For this reason, rework device_link_add() to apply the reference
      counting to stateless links only and make device_link_del() and
      device_link_remove() drop references to stateless links only too.
      After this change, if called to add a stateful device link for
      a consumer-supplier pair for which a stateful device link is
      present already, device_link_add() will return the existing link
      without incrementing its reference counter.  Accordingly,
      device_link_del() and device_link_remove() will WARN() and do
      nothing when called to drop a reference to a stateful link.  Thus,
      effectively, all stateful device links will be owned by the driver
      core.
      
      In addition, clean up the handling of the link management flags,
      DL_FLAG_AUTOREMOVE_CONSUMER and DL_FLAG_AUTOREMOVE_SUPPLIER, so that
      (a) they are never set at the same time and (b) if device_link_add()
      is called for a consumer-supplier pair with an existing stateful link
      between them, the flags of that link will be combined with the flags
      passed to device_link_add() to ensure that the lifetime of the link
      is sufficient for all of the callers of device_link_add() for the
      same consumer-supplier pair.
      
      Update the device_link_add() kerneldoc comment to reflect the
      above changes.
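
      [Illustrative sketch only, not code from the patch: with this change a
       consumer driver can create a managed link from its probe path on every
       probe without leaking references or ever calling device_link_del();
       foo_probe() and foo_get_supplier() below are made-up names.]

      #include <linux/device.h>

      static int foo_probe(struct device *dev)
      {
              struct device *supplier = foo_get_supplier(dev); /* hypothetical helper */
              struct device_link *link;

              /* If a stateful link already exists, it is returned without
               * taking an extra reference on it. */
              link = device_link_add(dev, supplier, DL_FLAG_AUTOREMOVE_CONSUMER);
              if (!link)
                      return -EINVAL;

              /* No device_link_del() here: the driver core owns the link. */
              return 0;
      }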
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IOMMU: Make two drivers use stateless device links · ea4f6400
      Rafael J. Wysocki authored
      The device links used by rockchip-iommu and exynos-iommu are
      completely managed by these drivers within the IOMMU framework,
      so there is no reason to involve the driver core in the management
      of these links.
      
      For this reason, make rockchip-iommu and exynos-iommu pass
      DL_FLAG_STATELESS in flags to device_link_add(), so that the device
      links used by them are stateless.
      
      [Note that this change is requisite for a subsequent one that will
       rework the management of stateful device links in the driver core
       and it will not be compatible with the two drivers in question any
       more.]
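
      [Sketch of the kind of call the two drivers end up making; simplified,
       and foo_iommu_attach() is a made-up name.]

      static int foo_iommu_attach(struct device *consumer, struct device *iommu_dev)
      {
              struct device_link *link;

              /* Stateless: the IOMMU driver, not the driver core, will delete
               * the link later with device_link_del(). */
              link = device_link_add(consumer, iommu_dev, DL_FLAG_STATELESS);
              return link ? 0 : -ENODEV;
      }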
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Do not call rpm_put_suppliers() in pm_runtime_drop_link() · a1fdbfbb
      Rafael J. Wysocki authored
      Calling rpm_put_suppliers() from pm_runtime_drop_link() is excessive
      as it affects all suppliers of the consumer device and not just the
      one pointed to by the device link being dropped.  In the worst case, it
      may cause the consumer device to stop working unexpectedly.  Moreover, in
      principle it is racy with respect to runtime PM of the consumer
      device.
      
      To avoid these problems drop runtime PM references on the particular
      supplier pointed to by the link in question only and do that after
      the link has been dropped from the consumer device's list of links to
      suppliers, which is in device_link_free().
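
      [A minimal sketch of the idea, not the literal patch: the runtime PM
       reference is dropped for the one supplier this link points to, at the
       point where the link itself is freed.]

      static void device_link_free(struct device_link *link)
      {
              /* Only the supplier of this particular link is affected. */
              if ((link->flags & DL_FLAG_PM_RUNTIME) && link->rpm_active)
                      pm_runtime_put(link->supplier);

              put_device(link->consumer);
              put_device(link->supplier);
              kfree(link);
      }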
      
      Fixes: a0504aec ("PM / runtime: Drop usage count for suppliers at device link removal")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Fix adding device links to probing suppliers · 15cfb094
      Rafael J. Wysocki authored
      Currently, it is not valid to add a device link from a consumer
      driver ->probe callback to a supplier that is still probing too, but
      generally this is a valid use case.  For example, if the consumer has
      just acquired a resource that can only be available if the supplier
      is functional, adding a device link to that supplier right away
      should be safe (and arguably even desirable), but device_link_add()
      doesn't handle that case correctly and the initial state of the link
      created by it is wrong then.
      
      To address this problem, change the initial state of device links
      added between a probing supplier and a probing consumer to
      DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
      skip such links on the supplier side.
      
      With this change, if the supplier probe completes first,
      device_links_driver_bound() called for it will skip the link state
      update and when it is called for the consumer, the link state will
      be updated to "active".  In turn, if the consumer probe completes
      first, device_links_driver_bound() called for it will change the
      state of the link to "active" and when it is called for the
      supplier, the link status update will be skipped.
      
      However, in principle the supplier or consumer probe may still fail
      after the link has been added, so modify device_links_no_driver() to
      change device links in the "active" or "consumer probe" state to
      "dormant" on the supplier side and update __device_links_no_driver()
      to change the link state to "available" only if it is "consumer
      probe" or "active".
      
      Then, if the supplier probe fails first, the leftover link to the
      probing consumer will become "dormant" and device_links_no_driver()
      called for the consumer (when its probe fails) will clean it up.
      In turn, if the consumer probe fails first, it will either drop the
      link, or change its state to "available" and, in the latter case,
      when device_links_no_driver() is called for the supplier, it will
      update the link state to "dormant".  [If the supplier probe fails,
      but the consumer probe succeeds, which should not happen as long as
      the consumer driver is correct, the link still will be around, but
      it will be "dormant" until the supplier is probed again.]
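
      [Sketch of the resulting supplier-side walk in device_links_driver_bound()
       (simplified, not the verbatim patch):]

      list_for_each_entry(link, &dev->links.consumers, s_node) {
              /* A link added while both sides were probing: leave the state
               * update to the consumer side when its probe completes. */
              if (link->status == DL_STATE_CONSUMER_PROBE)
                      continue;

              WARN_ON(link->status != DL_STATE_DORMANT);
              WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
      }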
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Fix handling of runtime PM flags in device_link_add() · e2f3cd83
      Rafael J. Wysocki authored
      After commit ead18c23 ("driver core: Introduce device links
      reference counting"), if there is a link between the given supplier
      and the given consumer already, device_link_add() will refcount it
      and return it unconditionally without updating its flags.  It is
      possible, however, that the second (or any subsequent) caller of
      device_link_add() for the same consumer-supplier pair will pass
      DL_FLAG_PM_RUNTIME, possibly along with DL_FLAG_RPM_ACTIVE, in flags
      to it and the existing link may not behave as expected then.
      
      First, if DL_FLAG_PM_RUNTIME is not set in the existing link's flags
      at all, it needs to be set like during the original initialization of
      the link.
      
      Second, if DL_FLAG_RPM_ACTIVE is passed to device_link_add() in flags
      (in addition to DL_FLAG_PM_RUNTIME), the existing link should be
      updated to reflect the "active" runtime PM configuration of the
      consumer-supplier pair and extra care must be taken here to avoid
      possible destructive races with runtime PM of the consumer.
      
      To that end, redefine the rpm_active field in struct device_link
      as a refcount, initialize it to 1 and make rpm_resume() (for the
      consumer) and device_link_add() increment it whenever they acquire
      a runtime PM reference on the supplier device.  Accordingly, make
      rpm_suspend() (for the consumer) and pm_runtime_clean_up_links()
      decrement it and drop runtime PM references to the supplier
      device in a loop until rpm_active becomes 1 again.
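
      [The resulting pattern, roughly (simplified sketch):]

      /* Drop the extra runtime PM references this link holds on the supplier. */
      while (refcount_dec_not_one(&link->rpm_active))
              pm_runtime_put(link->supplier);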
      
      Fixes: ead18c23 ("driver core: Introduce device links reference counting")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Do not resume suppliers under device_links_write_lock() · 5db25c9e
      Rafael J. Wysocki authored
      It is incorrect to call pm_runtime_get_sync() under
      device_links_write_lock(), because it may end up trying to take
      device_links_read_lock() while resuming the target device and that
      will deadlock in the non-SRCU case, so avoid that by resuming the
      supplier device in device_link_add() before calling
      device_links_write_lock().
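
      [Simplified sketch of the resulting ordering in device_link_add():]

      if (flags & DL_FLAG_RPM_ACTIVE) {
              /* Resume the supplier before taking the lock, not under it. */
              if (pm_runtime_get_sync(supplier) < 0) {
                      pm_runtime_put_noidle(supplier);
                      return NULL;
              }
      }

      device_links_write_lock();  /* no runtime resume happens under the lock now */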
      
      Fixes: 21d5c57b ("PM / runtime: Use device links")
      Fixes: baa8809f ("PM / runtime: Optimize the use of device links")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Avoid careless re-use of existing device links · f265df55
      Rafael J. Wysocki authored
      After commit ead18c23 ("driver core: Introduce device links
      reference counting"), if there is a link between the given supplier
      and the given consumer already, device_link_add() will refcount it
      and return it unconditionally.  However, if the flags passed to
      it on the second (or any subsequent) attempt to create a device
      link between the same consumer-supplier pair are not compatible with
      the existing link's flags, that is incorrect.
      
      First off, if the existing link is stateless and the next caller of
      device_link_add() for the same consumer-supplier pair wants a
      stateful one, or the other way around, the existing link cannot be
      returned, because it will not match the expected behavior, so make
      device_link_add() dump the stack and return NULL in that case.
      
      Moreover, if the DL_FLAG_AUTOREMOVE_CONSUMER flag is passed to
      device_link_add(), its caller will expect its reference to the link
      to be dropped automatically on consumer driver removal, which will
      not happen if that flag is not set in the link's flags (and
      analogously for DL_FLAG_AUTOREMOVE_SUPPLIER).  For this reason, make
      device_link_add() update the existing link's flags accordingly
      before returning it to the caller.
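
      [Sketch of the checks device_link_add() now applies to an existing link
       (simplified):]

      /* A stateless request cannot reuse a stateful link, or vice versa. */
      if (!!(flags & DL_FLAG_STATELESS) != !!(link->flags & DL_FLAG_STATELESS)) {
              dump_stack();
              link = NULL;
              goto out;
      }

      /* Otherwise fold the caller's autoremove expectations into the link. */
      link->flags |= flags & (DL_FLAG_AUTOREMOVE_CONSUMER |
                              DL_FLAG_AUTOREMOVE_SUPPLIER);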
      
      Fixes: ead18c23 ("driver core: Introduce device links reference counting")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Fix DL_FLAG_AUTOREMOVE_SUPPLIER device link flag handling · c8d50986
      Rafael J. Wysocki authored
      Change the list walk in device_links_driver_cleanup() to a safe one
      to avoid use-after-free when dropping a link from the list during the
      walk.
      
      Also, while at it, fix device_link_add() to refuse to create
      stateless device links with DL_FLAG_AUTOREMOVE_SUPPLIER set, which is
      an invalid combination (setting that flag means that the driver core
      should manage the link, so it cannot be stateless), and extend the
      kerneldoc comment of device_link_add() to cover the
      DL_FLAG_AUTOREMOVE_SUPPLIER flag properly too.
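
      [Both changes in sketch form (simplified):]

      /* In device_link_add(): reject the invalid combination up front. */
      if ((flags & DL_FLAG_STATELESS) && (flags & DL_FLAG_AUTOREMOVE_SUPPLIER))
              return NULL;

      /* In device_links_driver_cleanup(): a _safe walk, since entries may be
       * dropped from the list while walking it. */
      list_for_each_entry_safe(link, ln, &dev->links.consumers, s_node) {
              if (link->flags & DL_FLAG_AUTOREMOVE_SUPPLIER)
                      kref_put(&link->kref, __device_link_del);
      }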
      
      Fixes: 1689cac5 ("driver core: Add flag to autoremove device link on supplier unbind")
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 31 Jan, 2019 12 commits
    • devres: always use dev_name() in devm_ioremap_resource() · 8d84b18f
      Sergei Shtylyov authored
      devm_ioremap_resource() prefers calling devm_request_mem_region() with a
      resource name instead of a device name -- this looks pretty as long as no
      resource name is specified via a device tree "reg-names" property (in that
      case, the resource name is set to the device node's full name), but if one
      is specified, it doesn't really scale, since these names are only unique
      within a given device node, not globally; so, looking at the output of
      'cat /proc/iomem', you cannot tell which memory region belongs to which
      device (see the "dirmap", "regs", and "wbuf" lines below):
      
      08000000-0bffffff : dirmap
      48000000-bfffffff : System RAM
        48000000-48007fff : reserved
        48080000-48b0ffff : Kernel code
        48b10000-48b8ffff : reserved
        48b90000-48c7afff : Kernel data
        bc6a4000-bcbfffff : reserved
        bcc0f000-bebfffff : reserved
        bec0e000-bec0efff : reserved
        bec11000-bec11fff : reserved
        bec12000-bec14fff : reserved
        bec15000-bfffffff : reserved
      e6050000-e605004f : gpio@e6050000
      e6051000-e605104f : gpio@e6051000
      e6052000-e605204f : gpio@e6052000
      e6053000-e605304f : gpio@e6053000
      e6054000-e605404f : gpio@e6054000
      e6055000-e605504f : gpio@e6055000
      e6060000-e606050b : pin-controller@e6060000
      e6e60000-e6e6003f : e6e60000.serial
      e7400000-e7400fff : ethernet@e7400000
      ee200000-ee2001ff : regs
      ee208000-ee2080ff : wbuf
      
      I think that devm_request_mem_region() should be called with dev_name(),
      even though the region names won't look as pretty as before (however, we
      gain more consistency with e.g. the serial driver):
      
      08000000-0bffffff : ee200000.rpc
      48000000-bfffffff : System RAM
        48000000-48007fff : reserved
        48080000-48b0ffff : Kernel code
        48b10000-48b8ffff : reserved
        48b90000-48c7afff : Kernel data
        bc6a4000-bcbfffff : reserved
        bcc0f000-bebfffff : reserved
        bec0e000-bec0efff : reserved
        bec11000-bec11fff : reserved
        bec12000-bec14fff : reserved
        bec15000-bfffffff : reserved
      e6050000-e605004f : e6050000.gpio
      e6051000-e605104f : e6051000.gpio
      e6052000-e605204f : e6052000.gpio
      e6053000-e605304f : e6053000.gpio
      e6054000-e605404f : e6054000.gpio
      e6055000-e605504f : e6055000.gpio
      e6060000-e606050b : e6060000.pin-controller
      e6e60000-e6e6003f : e6e60000.serial
      e7400000-e7400fff : e7400000.ethernet
      ee200000-ee2001ff : ee200000.rpc
      ee208000-ee2080ff : ee200000.rpc
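
      [Roughly, the change boils down to the request in devm_ioremap_resource()
       using dev_name() (sketch, not the literal diff):]

      if (!devm_request_mem_region(dev, res->start, size, dev_name(dev))) {
              dev_err(dev, "can't request region for resource %pR\n", res);
              return IOMEM_ERR_PTR(-EBUSY);
      }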
      
      Fixes: 72f8c0bf ("lib: devres: add convenience function to remap a resource")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drivers: base: Use __printf markup to silence compiler · fa548d79
      Mathieu Malaterre authored
      Silence warnings (triggered at W=1) by adding relevant __printf
      attributes.
      
        drivers/base/cpu.c:432:2: warning: function '__cpu_device_create' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
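
      [For reference, the markup tells the compiler which argument is the format
       string and where the variadic arguments start; foo_log() below is a
       made-up example, not the function touched by the patch:]

      __printf(2, 3)   /* format string is parameter 2, varargs start at 3 */
      int foo_log(struct device *dev, const char *fmt, ...);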
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • firmware: intel_stratix10_service: add hardware dependency · 095ff29d
      Richard Gong authored
      Add a Kconfig dependency to ensure Intel Stratix10 service layer driver
      can be built only on the platform that supports it.
      Signed-off-by: Richard Gong <richard.gong@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity · 57ea974f
      Alexander Duyck authored
      The current async_probe test code is only testing one device allocated
      prior to driver load and only loading one device afterwards. Instead of
      doing things this way it makes much more sense to load one device per CPU
      in order to actually stress the async infrastructure. By doing this we
      should see delays significantly increase in the event of devices being
      serialized.
      
      In addition I have updated the test to verify that we are trying to place
      the work on the correct NUMA node when we are running in async mode. By
      doing this we can verify the best possible outcome for device and driver
      load times.
      
      I have added a timeout value that is used to disable the sleep and instead
      cause the probe routine to report an error indicating it timed out. By
      doing this we limit the maximum runtime for the test to 20 seconds or less.
      
      The last major change in this set is that I have gone through and tuned it
      for handling the massive number of possible events that will be scheduled.
      Instead of reporting the sleep for each individual device it is moved to
      only being displayed if we enable debugging.
      
      With this patch applied, below is what a failing test and a passing test
      should look like. I have elided a few hundred lines in the failing test
      that were duplicated, since the system I was testing on had a massive
      number of CPU cores:
      
      -- Failing --
      [  243.524697] test_async_driver_probe: registering first set of asynchronous devices...
      [  243.535625] test_async_driver_probe: registering asynchronous driver...
      [  243.543038] test_async_driver_probe: registration took 0 msecs
      [  243.549559] test_async_driver_probe: registering second set of asynchronous devices...
      [  243.568350] platform test_async_driver.447: registration took 9 msecs
      [  243.575544] test_async_driver_probe: registering first synchronous device...
      [  243.583454] test_async_driver_probe: registering synchronous driver...
      [  248.825920] test_async_driver_probe: registration took 5235 msecs
      [  248.825922] test_async_driver_probe: registering second synchronous device...
      [  248.825928] test_async_driver test_async_driver.443: NUMA node mismatch 3 != 1
      [  248.825932] test_async_driver test_async_driver.445: NUMA node mismatch 3 != 1
      [  248.825935] test_async_driver test_async_driver.446: NUMA node mismatch 3 != 1
      [  248.825939] test_async_driver test_async_driver.440: NUMA node mismatch 3 != 1
      [  248.825943] test_async_driver test_async_driver.441: NUMA node mismatch 3 != 1
      ...
      [  248.827150] test_async_driver test_async_driver.229: NUMA node mismatch 0 != 1
      [  248.827158] test_async_driver test_async_driver.228: NUMA node mismatch 0 != 1
      [  248.827220] test_async_driver test_async_driver.281: NUMA node mismatch 2 != 1
      [  248.827229] test_async_driver test_async_driver.282: NUMA node mismatch 2 != 1
      [  248.827240] test_async_driver test_async_driver.280: NUMA node mismatch 2 != 1
      [  253.945834] test_async_driver test_async_driver.1: NUMA node mismatch 0 != 1
      [  253.945878] test_sync_driver test_sync_driver.1: registration took 5119 msecs
      [  253.961693] test_async_driver_probe: async events still pending, forcing timeout and synchronize
      [  259.065839] test_async_driver test_async_driver.2: NUMA node mismatch 0 != 1
      [  259.073786] test_async_driver test_async_driver.3: async probe took too long
      [  259.081669] test_async_driver test_async_driver.3: NUMA node mismatch 0 != 1
      [  259.089569] test_async_driver test_async_driver.4: async probe took too long
      [  259.097451] test_async_driver test_async_driver.4: NUMA node mismatch 0 != 1
      [  259.105338] test_async_driver test_async_driver.5: async probe took too long
      [  259.113204] test_async_driver test_async_driver.5: NUMA node mismatch 0 != 1
      [  259.121089] test_async_driver test_async_driver.6: async probe took too long
      [  259.128961] test_async_driver test_async_driver.6: NUMA node mismatch 0 != 1
      [  259.136850] test_async_driver test_async_driver.7: async probe took too long
      ...
      [  262.124062] test_async_driver test_async_driver.221: async probe took too long
      [  262.132130] test_async_driver test_async_driver.221: NUMA node mismatch 3 != 1
      [  262.140206] test_async_driver test_async_driver.222: async probe took too long
      [  262.148277] test_async_driver test_async_driver.222: NUMA node mismatch 3 != 1
      [  262.156351] test_async_driver test_async_driver.223: async probe took too long
      [  262.164419] test_async_driver test_async_driver.223: NUMA node mismatch 3 != 1
      [  262.172630] test_async_driver_probe: Test failed with 222 errors and 336 warnings
      
      -- Passing --
      [  105.419247] test_async_driver_probe: registering first set of asynchronous devices...
      [  105.432040] test_async_driver_probe: registering asynchronous driver...
      [  105.439718] test_async_driver_probe: registration took 0 msecs
      [  105.446239] test_async_driver_probe: registering second set of asynchronous devices...
      [  105.477986] platform test_async_driver.447: registration took 22 msecs
      [  105.485276] test_async_driver_probe: registering first synchronous device...
      [  105.493169] test_async_driver_probe: registering synchronous driver...
      [  110.597981] test_async_driver_probe: registration took 5097 msecs
      [  110.604806] test_async_driver_probe: registering second synchronous device...
      [  115.707490] test_sync_driver test_sync_driver.1: registration took 5094 msecs
      [  115.715478] test_async_driver_probe: completed successfully
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm: Schedule device registration on node local to the device · af87b9a7
      Alexander Duyck authored
      Force the device registration for nvdimm devices to be closer to the actual
      device. This is achieved by using either the NUMA node ID of the region, or
      of the parent. By doing this we can have everything above the region based
      on the region, and everything below the region based on the nvdimm bus.
      
      By guaranteeing NUMA locality I see an improvement of as high as 25% for
      per-node init of a system with 12TB of persistent memory.
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PM core: Use new async_schedule_dev command · 8b9ec6b7
      Alexander Duyck authored
      Use the device specific version of the async_schedule commands to defer
      various tasks related to power management. By doing this we should see a
      slight improvement in performance as any device that is sensitive to
      latency/locality in the setup will now be initializing on the node closest
      to the device.
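
      [The pattern, in sketch form: a call such as the one below now runs the
       async work on a CPU close to dev_to_node(dev) rather than on an
       arbitrary one.]

      /* before */
      async_schedule(async_resume_noirq, dev);
      /* after */
      async_schedule_dev(async_resume_noirq, dev);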
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Attach devices on CPU local to device node · c37e20ea
      Alexander Duyck authored
      Call the asynchronous probe routines on a CPU local to the device node. By
      doing this we should be able to improve our initialization time
      significantly as we can avoid having to access the device from a remote
      node which may introduce higher latency.
      
      For example, in the case of initializing memory for NVDIMM this can have a
      significant impact, as initializing 3TB on a remote node can take up to 39
      seconds while initializing it on a local node only takes 23 seconds. It is
      situations like this where we will see the biggest improvement.
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • async: Add support for queueing on specific NUMA node · 6be9238e
      Alexander Duyck authored
      Introduce four new variants of the async_schedule_ functions that allow
      scheduling on a specific NUMA node.
      
      The first two functions, async_schedule_near and
      async_schedule_near_domain, map to async_schedule and
      async_schedule_domain respectively, but provide NUMA node specific
      functionality. They replace the original functions, which were turned
      into inline wrappers that call the new functions while passing
      NUMA_NO_NODE.
      
      The second two functions are async_schedule_dev and
      async_schedule_dev_domain, which provide NUMA specific functionality when
      a device is passed as the data member and that device has a NUMA node
      other than NUMA_NO_NODE.
      
      The main motivation behind this is to address the need to be able to
      schedule device specific init work on specific NUMA nodes in order to
      improve performance of memory initialization.
      
      I have seen a significant improvement in initialization time for persistent
      memory as a result of this approach. In the case of 3TB of memory on a
      single node the initialization time in the worst case went from 36s down to
      about 26s for a 10s improvement. As such the data shows a general benefit
      for affinitizing the async work to the node local to the device.
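
      [Sketch of how the new entry points are meant to be used; my_init() is a
       made-up callback.]

      #include <linux/async.h>

      static void my_init(void *data, async_cookie_t cookie)
      {
              struct device *dev = data;
              /* node-local initialization work for dev */
      }

      async_schedule_near(my_init, dev, dev_to_node(dev)); /* explicit node */
      async_schedule_dev(my_init, dev);   /* node taken from the device */
      async_schedule(my_init, dev);       /* now an inline wrapper passing NUMA_NO_NODE */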
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • workqueue: Provide queue_work_node to queue work near a given NUMA node · 8204e0c1
      Alexander Duyck authored
      Provide a new function, queue_work_node, which is meant to schedule work on
      a "random" CPU of the requested NUMA node. The main motivation for this is
      to help asynchronous init improve boot times for devices that are local to
      a specific node.
      
      For now we just default to the first CPU that is in the intersection of the
      cpumask of the node and the online cpumask. The only exception is that if
      the current CPU is local to the node, we will just use it. This should work
      for our purposes as we are currently only using this for unbound work so
      the CPU will be translated to a node anyway instead of being directly used.
      
      As we are only using the first CPU to represent the NUMA node for now I am
      limiting the scope of the function so that it can only be used with unbound
      workqueues.
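
      [Sketch of the intended usage with an unbound workqueue; my_work is a
       made-up work item.]

      static void my_work_fn(struct work_struct *work);
      static DECLARE_WORK(my_work, my_work_fn);

      /* Run my_work on some CPU of the node the device lives on. */
      queue_work_node(dev_to_node(dev), system_unbound_wq, &my_work);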
      Acked-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Probe devices asynchronously instead of the driver · ef0ff683
      Alexander Duyck authored
      Probe devices asynchronously instead of the driver. This results in us
      seeing the same behavior whether the device is registered before or after
      the driver. This way we can avoid serializing the initialization should the
      driver not be loaded until after the devices have already been added.
      
      The motivation behind this is that if we have a set of devices that
      take a significant amount of time to load we can greatly reduce the time to
      load by processing them in parallel instead of one at a time. In addition,
      each device can exist on a different node so placing a single thread on one
      CPU to initialize all of the devices for a given driver can result in poor
      performance on a system with multiple nodes.
      
      This approach can reduce the time needed to scan SCSI LUNs significantly.
      The only way to realize that speedup is by enabling more concurrency which
      is what is achieved with this patch.
      
      To achieve this it was necessary to add a new member "async_driver" to the
      device_private structure to store the driver pointer while we wait on the
      deferred probe call.
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • device core: Consolidate locking and unlocking of parent and device · ed88747c
      Alexander Duyck authored
      Try to consolidate all of the locking and unlocking of both the parent and
      device when attaching or removing a driver from a given device.
      
      To do that I first consolidated the lock pattern into two functions
      __device_driver_lock and __device_driver_unlock. After doing that I then
      created functions specific to attaching and detaching the driver while
      acquiring these locks. By doing this I was able to reduce the number of
      spots where we touch need_parent_lock from 12 down to 4.
      
      This patch should produce no functional changes, it is meant to be a code
      clean-up/consolidation only.
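
      [The consolidated helpers look roughly like this (sketch):]

      static void __device_driver_lock(struct device *dev, struct device *parent)
      {
              if (parent && dev->bus->need_parent_lock)
                      device_lock(parent);
              device_lock(dev);
      }

      static void __device_driver_unlock(struct device *dev, struct device *parent)
      {
              device_unlock(dev);
              if (parent && dev->bus->need_parent_lock)
                      device_unlock(parent);
      }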
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • driver core: Establish order of operations for device_add and device_del via bitflag · 3451a495
      Alexander Duyck authored
      Add an additional bit flag to the device_private struct named "dead".
      
      This additional flag provides a guarantee that when a device_del is
      executed on a given interface an async worker will not attempt to attach
      the driver following the earlier device_del call. Previously this
      guarantee was not present and could result in the device_del call
      attempting to remove a driver from an interface only to have the async
      worker attempt to probe the driver later when it finally completes the
      asynchronous probe call.
      
      One additional change added was that I pulled the check for dev->driver
      out of the __device_attach_driver call and instead placed it in the
      __device_attach_async_helper call. This was motivated by the fact that the
      only other caller of this, __device_attach, had already taken the
      device_lock() and checked for dev->driver. Instead of testing for this
      twice in this path it makes more sense to just consolidate the dev->dead
      and dev->driver checks together into one set of checks.
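
      [Sketch of the consolidated check in __device_attach_async_helper()
       (simplified):]

      device_lock(dev);

      /* Nothing to do if the device was deleted or a driver is already bound. */
      if (dev->p->dead || dev->driver)
              goto out_unlock;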
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 22 Jan, 2019 19 commits
  4. 18 Jan, 2019 1 commit