Commits · 82c3fbff6ed3582f2b14b5548fb976c22d7a7255 · Kirill Smelkov / linux

31 Jul, 2018 8 commits

PCI: sysfs: Resume to D0 on function reset · 82c3fbff

Lukas Wunner authored Jul 19, 2018

When performing a function reset via sysfs, the device's config space is
accessed in places such as pcie_flr() and its MMIO space is accessed e.g.
in reset_ivb_igd(), so ensure accessibility by resuming the device to D0.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


PCI: pciehp: Resume parent to D0 on config space access · 4417aa45

Lukas Wunner authored Jul 19, 2018

Ensure accessibility of a hotplug port's config space when accessed via
sysfs by resuming its parent to D0.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


PCI: pciehp: Resume to D0 on enable/disable · 83503074

Lukas Wunner authored Jul 19, 2018

pciehp's IRQ thread ensures accessibility of the port by runtime resuming
its parent to D0.  However when the slot is enabled/disabled, the port
itself needs to be in D0 because its secondary bus is accessed in:

    pciehp_check_link_status(),
    pciehp_configure_device() (both called from board_added())
and
    pciehp_unconfigure_device() (called from remove_board()).

Thus, acquire a runtime PM ref on enable/disablement of the slot.
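
As a rough illustration of what holding a runtime PM ref across slot
enablement buys, here is a user-space sketch.  The struct and the model_*
helpers are hypothetical stand-ins, not the kernel's runtime PM API, and
the model ignores details such as autosuspend delays.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a hotplug port's runtime PM state. */
struct port_model {
	int usage_count;	/* outstanding runtime PM references */
	bool in_d0;		/* true while the port is resumed */
};

/* Take a reference; the first one resumes the port to D0. */
static void model_pm_get(struct port_model *port)
{
	if (port->usage_count++ == 0)
		port->in_d0 = true;
}

/* Drop a reference; the last one lets the port suspend again. */
static void model_pm_put(struct port_model *port)
{
	assert(port->usage_count > 0);
	if (--port->usage_count == 0)
		port->in_d0 = false;
}

/* Returns whether the secondary bus was reachable during the operation. */
static bool model_enable_slot(struct port_model *port)
{
	bool accessible;

	model_pm_get(port);
	accessible = port->in_d0;	/* secondary bus would be accessed here */
	model_pm_put(port);

	return accessible;
}
```

In this model the port is only guaranteed to be in D0 between the get and
the put, which is the window in which the secondary bus is touched.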

Yinghai Lu additionally discovered that some Skylake servers feature a
Power Controller for their PCIe hotplug ports (PCIe r3.1, sec 6.7.1.8)
which requires the port to be in D0 when invoking

    pciehp_power_on_slot() (likewise called from board_added()).

If slot power is turned on while in D3hot, link training later fails:
https://lkml.kernel.org/r/20170205073454.GA253@wunner.de

The spec is silent about such a requirement, but it seems prudent to
assume that any hotplug port with a Power Controller may need this.

The present commit holds a runtime PM ref whenever slot power is turned
on and off, but it doesn't keep the port in D0 as long as slot power is
on.  If vendors determine that's necessary, they need to amend pciehp to
acquire a runtime PM ref in pciehp_power_on_slot() and release one in
pciehp_power_off_slot().
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


PCI: pciehp: Support interrupts sent from D3hot · 6b08c385

Lukas Wunner authored Jul 28, 2018

If a hotplug port is able to send an interrupt, one would naively assume
that it is accessible at that moment. After all, if it weren't
accessible, i.e. if its parent were in D3hot and the link to the hotplug
port were thus down, how should an interrupt come through?

It turns out that assumption is wrong at least for Thunderbolt: Even
though its parents are in D3hot, a Thunderbolt hotplug port is able to
signal interrupts. Because the port's config space is inaccessible and
resuming the parents may sleep, the hard IRQ handler has to defer
runtime resuming the parents and reading the Slot Status register to the
IRQ thread.

If the hotplug port uses a level-triggered INTx interrupt, it needs to
be masked until the IRQ thread has cleared the signaled events. For
simplicity, this commit also masks edge-triggered MSI/MSI-X interrupts.
Note that if the interrupt is shared (which can only happen for INTx),
other devices are starved from receiving interrupts until the IRQ thread
is scheduled, has runtime resumed the hotplug port's parents and has
read and cleared the Slot Status register.

That delay is dominated by the 10 ms D3hot->D0 transition time of each
parent port. The worst case is a Thunderbolt downstream port at the
end of a daisy chain: There may be up to six Thunderbolt controllers
in-between it and the root port, each comprising an upstream and
downstream port, plus its own upstream port. That's 13 x 10 = 130 ms.
Possible mitigations are polling the interrupt while it's disabled or
reducing the d3_delay of Thunderbolt ports if possible.
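
The worst-case figure above can be reproduced with a small helper.  This
is only a sketch of the arithmetic in this paragraph; the function name
and the constant are made up for illustration, and the 10 ms value is the
per-port transition time assumed above.

```c
#include <assert.h>

/* Per-port D3hot->D0 transition time assumed in the text, in ms. */
#define D3HOT_DELAY_MS 10

/*
 * Worst-case resume delay for a hotplug port at the end of a daisy chain.
 * Each controller in between contributes an upstream and a downstream
 * port; the final controller contributes only its upstream port.
 */
static unsigned int worst_case_resume_ms(unsigned int controllers_between)
{
	unsigned int ports = 2 * controllers_between + 1;

	return ports * D3HOT_DELAY_MS;
}
```

With six controllers in between, worst_case_resume_ms(6) yields
(2 * 6 + 1) * 10 = 130 ms.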

Open code masking of the interrupt instead of requesting it with the
IRQF_ONESHOT flag to minimize the period during which it is masked.
(IRQF_ONESHOT unmasks the IRQ only after the IRQ thread has finished.)

PCIe r4.0 sec 6.7.3.4 states that "If wake generation is required by the
associated form factor specification, a hotplug capable Downstream Port
must support generation of a wakeup event (using the PME mechanism) on
hotplug events that occur when the system is in a sleep state or the
Port is in device state D1, D2, or D3Hot."

This would seem to imply that PME needs to be enabled on the hotplug
port when it is runtime suspended. pci_enable_wake() currently doesn't
enable PME on bridges, so it may be necessary to add an exemption for
hotplug bridges there. On "Light Ridge" Thunderbolt controllers, the
PME_Status bit is not set when an interrupt occurs while the hotplug
port is in D3hot, even if PME is enabled. (I've tested this on a Mac,
and we hardcode the OSC_PCI_EXPRESS_PME_CONTROL bit to 0 on Macs in
negotiate_os_control(); modifying it to 1 didn't change the behavior.)

(Side note: Section 6.7.3.4 also states that "PME and Hot-Plug Event
interrupts (when both are implemented) always share the same MSI or
MSI-X vector". That would only seem to apply to Root Ports; however,
the section never mentions Root Ports, only Downstream Ports. This is
explained in the definition of "Downstream Port" in the "Terms and
Acronyms" section of the PCIe Base Spec: "The Ports on a Switch that
are not the Upstream Port are Downstream Ports. All Ports on a Root
Complex are Downstream Ports.")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>


PCI: pciehp: Obey compulsory command delay after resume · 469e764c

Lukas Wunner authored Jul 19, 2018

Upon resume from system sleep, the Slot Control register is written via:

  pci_pm_resume_noirq()
    pci_pm_default_resume_early()
      pci_restore_state()
        pci_restore_pcie_state()

PCIe r4.0, sec 6.7.3.2 says that after "issuing a write transaction that
targets any portion of the Port's Slot Control register, [...] software
must wait for [the] command to complete before issuing the next command".

pciehp currently fails to enforce that rule after the above-mentioned
write.  Fix it.
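
To illustrate the rule, the following user-space sketch polls a modeled
Slot Status register for the Command Completed bit after a write.  The
hw_model struct and both helpers are hypothetical; the actual driver may
wait differently (e.g. sleeping between reads or waiting for an
interrupt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLTSTA_CC 0x0010u	/* Command Completed bit in Slot Status */

/* Hypothetical hardware model: the command completes after a few reads. */
struct hw_model {
	unsigned int reads_until_done;
};

static uint16_t read_slot_status(struct hw_model *hw)
{
	if (hw->reads_until_done > 0) {
		hw->reads_until_done--;
		return 0;
	}
	return SLTSTA_CC;
}

/* Poll for Command Completed; give up after max_polls reads. */
static bool wait_for_cmd_complete(struct hw_model *hw, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_slot_status(hw) & SLTSTA_CC)
			return true;
	}
	return false;
}
```

Only once wait_for_cmd_complete() has returned true would the next Slot
Control write be issued in this model.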

(Moving restoration of the Slot Control register to pciehp doesn't seem
to make sense because the other PCIe hotplug drivers may need it as
well.)
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Clear spurious events earlier on resume · 79037824

Lukas Wunner authored Jul 19, 2018

Thunderbolt hotplug ports that were occupied before system sleep resume
with their downstream link in "off" state.  Only after the Thunderbolt
controller has reestablished the PCIe tunnels does the link go up.
As a result, a spurious Presence Detect Changed and/or Data Link Layer
State Changed event occurs.

The events are not immediately acted upon because tunnel reestablishment
happens in the ->resume_noirq phase, when interrupts are still disabled.
Also, notification of events may initially be disabled in the Slot
Control register when coming out of system sleep and is reenabled in the
->resume_noirq phase through:

  pci_pm_resume_noirq()
    pci_pm_default_resume_early()
      pci_restore_state()
        pci_restore_pcie_state()

It is not guaranteed that the events are acted upon at all:  PCIe r4.0,
sec 6.7.3.4 says that "a port may optionally send an MSI when there are
hot-plug events that occur while interrupt generation is disabled, and
interrupt generation is subsequently enabled."  Note the "optionally".

If an MSI is sent, pciehp will gratuitously turn the slot off and back
on once the ->resume_early phase has commenced.

If an MSI is not sent, the extant, unacknowledged events in the Slot
Status register will prevent future notification of presence or link
changes.

Commit 13c65840 ("PCI: pciehp: Clear Presence Detect and Data Link
Layer Status Changed on resume") fixed the latter by clearing the events
in the ->resume phase.  Move this to the ->resume_noirq phase to also
fix the gratuitous disable/enablement of the slot.

The commit further restored the Slot Control register in the ->resume
phase, but that's dispensable because as shown above it's already been
done in the ->resume_noirq phase.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>


PCI: portdrv: Deduplicate PM callback iterator · 6ccb127b

Lukas Wunner authored Jul 19, 2018

Replace suspend_iter() and resume_iter() with a single function pm_iter()
to allow addition of port service callbacks for further power management
phases without having to add another iterator each time.
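
One way such a single iterator could be written is to pass the offset of
the desired callback within the driver struct.  The sketch below is a
user-space model under that assumption; struct service_driver and the
count_* callbacks are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a port service driver with per-phase callbacks. */
struct service_driver {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

typedef int (*pm_callback_t)(void *dev);

/*
 * Single iterator body: "offset" selects which callback to invoke, so
 * supporting a further PM phase only needs a new struct member, not a
 * new iterator.
 */
static int pm_iter(const struct service_driver *drv, void *dev, size_t offset)
{
	pm_callback_t cb = *(const pm_callback_t *)((const char *)drv + offset);

	return cb ? cb(dev) : 0;
}

/* Example callbacks which record that they ran. */
static int count_suspend(void *dev) { *(int *)dev += 1; return 0; }
static int count_resume(void *dev)  { *(int *)dev += 10; return 0; }
```

A caller would select the phase with, for example,
pm_iter(&drv, dev, offsetof(struct service_driver, resume)).  A driver
which leaves a callback NULL is simply skipped.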

No functional change intended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Avoid slot access during reset · 5b3f7b7d

Lukas Wunner authored Jul 28, 2018

The ->reset_slot callback introduced by commits:

  2e35afae ("PCI: pciehp: Add reset_slot() method") and
  06a8d89a ("PCI: pciehp: Disable link notification across slot reset")

disables notification of Presence Detect Changed and Data Link Layer
State Changed events for the duration of a secondary bus reset.

However a bus reset not only triggers these events, but may also clear
the Presence Detect State bit in the Slot Status register and the Data
Link Layer Link Active bit in the Link Status register momentarily.
According to Sinan Kaya:

 "I know for a fact that bus reset clears the Data Link Layer Active bit
  as soon as link goes down.  It gets set again following link up.
  Presence detect depends on the HW implementation.  QDT root ports
  don't change presence detect for instance since nobody actually
  removed the card.  If an implementation supports in-band presence
  detect, the answer is yes.  As soon as the link goes down, presence
  detect bit will get cleared until recovery."
  https://lkml.kernel.org/r/42e72f83-3b24-f7ef-e5bc-290fae99259a@codeaurora.org

  In-band presence detect is also covered in Table 4-15 in PCIe r4.0,
  sec 4.2.6.

pciehp should therefore ensure that any parts of the driver that access
those bits do not run concurrently with a bus reset.  The only precaution
the commits took to that effect was to halt interrupt polling.  They
made no effort to drain the slot workqueue, cancel an outstanding
Attention Button work, or block slot enable/disable requests via sysfs
and in the ->probe hook.

Now that pciehp is converted to enable/disable the slot exclusively from
the IRQ thread, the only places accessing the two above-mentioned bits
are the IRQ thread and the ->probe hook.  Add locking to serialize them
with a bus reset.  This obviates the need to halt interrupt polling.
Do not add locking to the ->get_adapter_status sysfs callback to afford
users unfettered access to that bit.  Use an rw_semaphore in lieu of a
regular mutex to allow parallel execution of the non-reset code paths
accessing the critical bits, i.e. the IRQ thread and the ->probe hook.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rajat Jain <rajatja@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Sinan Kaya <okaya@kernel.org>


23 Jul, 2018 21 commits

PCI: pciehp: Always enable occupied slot on probe · cdf6b736

Lukas Wunner authored Jul 19, 2018

Per PCIe r4.0, sec 6.7.3.4, a "port may optionally send an MSI when
there are hot-plug events that occur while interrupt generation is
disabled, and interrupt generation is subsequently enabled."

On probe, we currently clear all event bits in the Slot Status register
with the notable exception of the Presence Detect Changed bit.  Thereby
we seek to receive an interrupt for an already occupied slot once event
notification is enabled.

But because the interrupt is optional, users may have to specify the
pciehp_force parameter on the command line, which is inconvenient.

Moreover, now that pciehp's event handling has become resilient to
missed events, a Presence Detect Changed interrupt for a slot which is
powered on is interpreted as removal of the card.  If the slot has
already been brought up by the BIOS, receiving such an interrupt on
probe causes the slot to be powered off and immediately back on, which
is likewise undesirable.

Avoid both issues by making the behavior of pciehp_force the default and
clearing the Presence Detect Changed bit on probe.

Note that the stated purpose of pciehp_force per the MODULE_PARM_DESC
("Force pciehp, even if OSHP is missing") seems nonsensical because the
OSHP control method is only relevant for SHPC slots according to the
PCI Firmware specification r3.0, sec 4.8.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>


PCI: pciehp: Become resilient to missed events · d331710e

Lukas Wunner authored Jul 19, 2018

A hotplug port's Slot Status register does not count how often each type
of event occurred; it only records the fact *that* an event has occurred.

Previously pciehp queued a work item for each event.  But if it missed
an event, e.g. removal of a card in-between two back-to-back insertions,
it queued up the wrong work item or no work item at all.  Commit
fad214b0 ("PCI: pciehp: Process all hotplug events before looking
for new ones") sought to improve the situation by shrinking the window
during which events may be missed.

But Stefan Roese reports unbalanced Card present and Link Up events,
suggesting that we're still missing events if they occur very rapidly.
Bjorn Helgaas responds that he considers pciehp's event handling
"baroque" and calls for its simplification and rationalization:
https://lkml.kernel.org/r/20180202192045.GA53759@bhelgaas-glaptop.roam.corp.google.com

It gets worse once a hotplug port is runtime suspended:  The port can
signal an interrupt while it and its parents are in D3hot, i.e. while
it is inaccessible.  By the time we've runtime resumed all parents to D0
and read the port's Slot Status register, we may have missed an arbitrary
number of events.  Event handling therefore needs to be reworked to
become resilient to missed events.

Assume that a Presence Detect Changed event has occurred.
Consider the following truth table:
- Slot is in OFF_STATE and is currently empty.    => Do nothing.
  (The event is trailing a Link Down or we've
  missed an insertion and subsequent removal.)
- Slot is in OFF_STATE and is currently occupied. => Turn the slot on.
- Slot is in ON_STATE  and is currently empty.    => Turn the slot off.
- Slot is in ON_STATE  and is currently occupied. => Turn the slot off,
                                                     then back on.
  (Be cautious and assume the card in the slot isn't the same as before.)

This leads to the following simple algorithm:
1. If the slot is in ON_STATE, turn it off unconditionally.
2. If the slot is currently occupied, turn it on.

Because those actions are now carried out synchronously, rather than by
scheduled work items, pciehp reacts to the *current* situation and
missed events no longer matter.
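
The two-step algorithm can be modeled in user space as follows.  This is
a sketch only: struct slot_model and its counters are hypothetical and
exist to make the four rows of the truth table checkable.

```c
#include <assert.h>
#include <stdbool.h>

enum slot_state { OFF_STATE, ON_STATE };

/* Hypothetical slot model; the counters exist only for illustration. */
struct slot_model {
	enum slot_state state;
	int turned_off;
	int turned_on;
};

static void handle_presence_or_link_change(struct slot_model *slot,
					   bool occupied)
{
	/* Step 1: if the slot is on, turn it off unconditionally. */
	if (slot->state == ON_STATE) {
		slot->state = OFF_STATE;
		slot->turned_off++;
	}

	/* Step 2: if the slot is currently occupied, turn it on. */
	if (occupied) {
		slot->state = ON_STATE;
		slot->turned_on++;
	}
}
```

Because the function looks only at the tracked state and the current
occupancy, the result is the same no matter how many events were missed
before it ran.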

Data Link Layer State Changed events can be handled identically to
Presence Detect Changed events.  Note that in the above truth table,
a Link Up trailing a Card present event didn't have to be accounted for:
It is filtered out by pciehp_check_link_status().

As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says:
"Once the Power Indicator begins blinking, a 5-second abort interval
exists during which a second depression of the Attention Button cancels
the operation."  In other words, the user can only expect the system to
react to a button press after it starts blinking.  Missed button presses
that occur in-between are irrelevant.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefan Roese <sr@denx.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>


PCI: pciehp: Tolerate initially unstable link · 6c35a1ac

Lukas Wunner authored Jul 19, 2018

When a device is hotplugged, Presence Detect and Link Up events often do
not occur simultaneously, but with a lag of a few milliseconds. Only
the first event received is relevant, the other one can be disregarded.

Moreover, Stefan Roese reports that on certain platforms, Link State and
Presence Detect may flap for up to 100 ms before stabilizing, suggesting
that such events should be disregarded for at least this long:
https://lkml.kernel.org/r/20180130084121.18653-1-sr@denx.de

On slot enablement, pciehp_check_link_status() waits for 100 ms per
PCIe r4.0, sec 6.7.3.3, then probes the hotplugged device's vendor
register for up to 1 second.

If this succeeds, the link is definitely up, so ignore any Presence
Detect or Link State events that occurred up to this point.

pciehp_check_link_status() then checks the Link Training bit in the
Link Status register. This is the final opportunity to detect
inaccessibility of the device and abort slot enablement. Any link
or presence change that occurs afterwards will cause the slot to be
disabled again immediately after attempting to enable it.

The astute reviewer may appreciate that achieving this behavior would be
more complicated had pciehp not just been converted to enable/disable
the slot exclusively from the IRQ thread: When the slot is enabled via
sysfs, each link or presence flap would otherwise cause the IRQ thread
to run and it would have to sense that those events belong to a
concurrent slot enablement operation and disregard them. It would be
much more difficult than this mere 3-line change.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefan Roese <sr@denx.de>


PCI: pciehp: Declare pciehp_enable/disable_slot() static · 25c83b84

Lukas Wunner authored Jul 19, 2018

No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c
remain, so declare the functions static.  For now this requires forward
declarations.  Those can be eliminated by reshuffling functions once the
ongoing effort to refactor the driver has settled.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Drop enable/disable lock · 1656716d

Lukas Wunner authored Jul 19, 2018

Previously slot enablement and disablement could happen concurrently.
But now it's under the exclusive control of the IRQ thread, rendering
the locking obsolete.  Drop it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Enable/disable exclusively from IRQ thread · 32a8cef2

Lukas Wunner authored Jul 19, 2018

Besides the IRQ thread, there are several other places in the driver
which enable or disable the slot:

- pciehp_probe() enables the slot if it's occupied and the pciehp_force
  module parameter is used.

- pciehp_resume() enables or disables the slot after system sleep.

- pciehp_queue_pushbutton_work() enables or disables the slot after the
  5 second delay following an Attention Button press.

- pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or
  disable the slot on sysfs write.

This requires locking and complicates pciehp's state machine.

A simplification can be achieved by enabling and disabling the slot
exclusively from the IRQ thread.

Amend the functions listed above to request slot enable/disablement from
the IRQ thread by either synthesizing a Presence Detect Changed event or,
in the case of a disable user request (via sysfs or an Attention Button
press), submitting a newly introduced force disable request.  The latter
is needed because the slot shall be forced off despite being occupied.
For this force disable request, avoid colliding with Slot Status register
bits by using a bit number greater than 16.
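
A user-space sketch of how requests might be handed to the IRQ thread
through a shared event word is shown below.  The names ctrl_model,
request_event(), take_events() and REQ_FORCE_DISABLE are hypothetical,
and placing the software-only request at bit 16 (counting from zero, i.e.
just above the 16-bit Slot Status register) is an assumption made for the
example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Presence Detect Changed bit of the 16-bit Slot Status register. */
#define SLTSTA_PDC        (1u << 3)
/* Software-only request above the register bits (assumed position). */
#define REQ_FORCE_DISABLE (1u << 16)

/* Hypothetical controller model holding events for the IRQ thread. */
struct ctrl_model {
	atomic_uint pending_events;
};

/* Called from any context to synthesize an event or submit a request. */
static void request_event(struct ctrl_model *ctrl, unsigned int events)
{
	atomic_fetch_or(&ctrl->pending_events, events);
}

/* Called by the IRQ thread: fetch and clear everything that is pending. */
static unsigned int take_events(struct ctrl_model *ctrl)
{
	return atomic_exchange(&ctrl->pending_events, 0);
}
```

Since the request bit lies outside the low 16 bits, a real Slot Status
value ORed into the same word can never be mistaken for a force disable
request.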

For synchronous execution of requests (on sysfs write), wait for the
request to finish and retrieve the result.  There can only ever be one
sysfs write in flight due to the locking in kernfs_fop_write(), hence
there is no risk of returning the result of a different sysfs request to
user space.

The POWERON_STATE and POWEROFF_STATE are now no longer entered by the
above-listed functions, but solely by the IRQ thread when it begins a
power transition.  Afterwards, it moves to STATIC_STATE.  The same
applies to canceling the Attention Button work: it likewise becomes an
IRQ thread only operation.

An immediate consequence is that the POWERON_STATE and POWEROFF_STATE are
never observed by the IRQ thread itself, only by functions called in a
different context, such as pciehp_sysfs_enable_slot().  So remove
handling of these states from pciehp_handle_button_press() and
pciehp_handle_link_change() which are exclusively called from the IRQ
thread.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Track enable/disable status · 9590192f

Lukas Wunner authored Jul 19, 2018

handle_button_press_event() currently determines whether the slot has
been turned on or off by looking at the Power Controller Control bit in
the Slot Control register.  This assumes that an attention button
implies presence of a power controller even though that's not mandated
by the spec.  Moreover the Power Controller Control bit is unreliable
when a power fault occurs (PCIe r4.0, sec 6.7.1.8).  This issue has
existed since the driver was introduced in 2004.

Fix by replacing STATIC_STATE with ON_STATE and OFF_STATE and tracking
whether the slot has been turned on or off.  This is also a required
ingredient to make pciehp resilient to missed events, which is the
object of an upcoming commit.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Publish to user space last on probe · 774d446b

Lukas Wunner authored Jul 19, 2018

The PCI hotplug core has just been refactored to separate slot
initialization for in-kernel use from publication to user space.

Take advantage of it in pciehp by publishing to user space last on
probe. This will allow enable/disablement of the slot exclusively from
the IRQ thread because the IRQ is requested after initialization for
in-kernel use (thereby getting its unique name needed by the IRQ thread)
but before user space is able to submit enable/disable requests.

On teardown, the order is the same in reverse: The user space interface
is removed prior to freeing the IRQ and destroying the slot.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: hotplug: Demidlayer registration with the core · 51bbf9be

Lukas Wunner authored Jul 19, 2018

When a hotplug driver calls pci_hp_register(), all steps necessary for
registration are carried out in one go, including creation of a kobject
and addition to sysfs.  That's a problem for pciehp once it's converted
to enable/disable the slot exclusively from the IRQ thread:  The thread
needs to be spawned after creation of the kobject (because it uses the
kobject's name), but before addition to sysfs (because it will handle
enable/disable requests submitted via sysfs).

pci_hp_deregister() does offer a ->release callback that's invoked
after deletion from sysfs and before destruction of the kobject.  But
because pci_hp_register() doesn't offer a counterpart, hotplug drivers'
->probe and ->remove code becomes asymmetric, which is error prone
as recently discovered use-after-free bugs in pciehp's ->remove hook
have shown.

In a sense, this appears to be a case of the midlayer antipattern:

   "The core thesis of the "midlayer mistake" is that midlayers are
    bad and should not exist.  That common functionality which it is
    so tempting to put in a midlayer should instead be provided as
    library routines which can [be] used, augmented, or ignored by
    each bottom level driver independently.  Thus every subsystem
    that supports multiple implementations (or drivers) should
    provide a very thin top layer which calls directly into the
    bottom layer drivers, and a rich library of support code that
    eases the implementation of those drivers.  This library is
    available to, but not forced upon, those drivers."
        --  Neil Brown (2009), https://lwn.net/Articles/336262/

The presence of midlayer traits in the PCI hotplug core might be ascribed
to its age:  When it was introduced in February 2002, the blessings of a
library approach might not have been well known:
https://git.kernel.org/tglx/history/c/a8a2069f432c

For comparison, the driver core does offer split functions for creating
a kobject (device_initialize()) and addition to sysfs (device_add()) as
an alternative to carrying out everything at once (device_register()).
This was introduced in October 2002:
https://git.kernel.org/tglx/history/c/8b290eb19962

The odd ->release callback in the PCI hotplug core was added in 2003:
https://git.kernel.org/tglx/history/c/69f8d663b595

Clearly, a library approach would not force every hotplug driver to
implement a ->release callback, but rather allow the driver to remove
the sysfs files, release its data structures and finally destroy the
kobject.  Alternatively, a driver may choose to remove everything with
pci_hp_deregister(), then release its data structures.

To this end, offer drivers pci_hp_initialize() and pci_hp_add() as a
split-up version of pci_hp_register().  Likewise, offer pci_hp_del()
and pci_hp_destroy() as a split-up version of pci_hp_deregister().
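
The ordering this split makes possible can be sketched in user space.
The model_* functions below merely stand in for the four split-up
functions named above and record the order of steps; they do not
reproduce the real signatures, and the IRQ steps are an example of what a
driver might interleave.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical slot model; "log" records the order of steps. */
struct slot_model {
	char log[8];
	size_t n;
};

static void step(struct slot_model *s, char c)
{
	if (s->n < sizeof(s->log) - 1)
		s->log[s->n++] = c;
}

static void model_initialize(struct slot_model *s) { step(s, 'I'); } /* create kobject */
static void model_add(struct slot_model *s)        { step(s, 'A'); } /* add to sysfs */
static void model_del(struct slot_model *s)        { step(s, 'D'); } /* delete from sysfs */
static void model_destroy(struct slot_model *s)    { step(s, 'X'); } /* destroy kobject */

/* A driver can interleave its own steps, e.g. IRQ setup. */
static void driver_probe(struct slot_model *s)
{
	model_initialize(s);
	step(s, 'q');	/* request IRQ: name exists, sysfs not yet visible */
	model_add(s);
}

/* Teardown mirrors probe in reverse order. */
static void driver_remove(struct slot_model *s)
{
	model_del(s);
	step(s, 'f');	/* free IRQ: no more requests from user space */
	model_destroy(s);
}
```

Running driver_probe() followed by driver_remove() on a zeroed model
leaves "IqADfX" in the log, i.e. probe and remove are symmetric.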

Eliminate the ->release callback and move its code into each driver's
teardown routine.

Declare pci_hp_deregister() void, in keeping with the usual kernel
pattern that enablement can fail, but disablement cannot.  It only
returned an error if the caller passed in a NULL pointer or a slot which
has never been registered, is no longer registered, or is sharing its
name with another slot.  Those would be bugs, so WARN about them.  Few
hotplug drivers actually checked the return value and those that did only
printed a useless error message to dmesg.  Remove that.

For most drivers the conversion was straightforward since it doesn't
matter whether the code in the ->release callback is executed before or
after destruction of the kobject.  But in the case of ibmphp, it was
unclear to me whether setting slot_cur->ctrl and slot_cur->bus_on to
NULL needs to happen before the kobject is destroyed, so I erred on
the side of caution and ensured that the order stays the same.  Another
nontrivial case is pnv_php:  I've found the list and kref logic difficult
to understand, however my impression was that it is safe to defer deleting
the list element and dropping the references until after the kobject is
destroyed.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>  # drivers/platform/x86
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Scott Murray <scott@spiteful.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Corentin Chary <corentin.chary@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Shevchenko <andy@infradead.org>


PCI: pciehp: Drop slot workqueue · 55a6b7a6

Lukas Wunner authored Jul 19, 2018

Previously the slot workqueue was used to handle events and enable or
disable the slot. That's no longer the case as those tasks are done
synchronously in the IRQ thread. The slot workqueue is thus merely used
to handle a button press after the 5 second delay and only one such work
item may be in flight at any given time. A separate workqueue isn't
necessary for this simple task, so use the system workqueue instead.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Handle events synchronously · 0e94916e

Lukas Wunner authored Jul 19, 2018

Up until now, pciehp's IRQ handler schedules a work item for each event,
which in turn schedules a work item to enable or disable the slot.  This
double indirection was necessary because sleeping wasn't allowed in the
IRQ handler.

However, sleeping is now possible because pciehp has been converted to
threaded IRQ handling and polling, so handle events synchronously in
pciehp_ist() and remove the work item infrastructure (with the exception
of work items to handle a button press after the 5 second delay).

For link or presence change events, move the register read to determine
the current link or presence state behind acquisition of the slot lock
to prevent it from becoming stale while the lock is contended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Stop blinking on slot enable failure · b0ccd9dd

Lukas Wunner authored Jul 19, 2018

If the attention button is pressed to power on the slot AND the user
powers on the slot via sysfs before 5 seconds have elapsed AND powering
on the slot fails because either the slot is unoccupied OR the latch is
open, we neglect turning off the green LED so it keeps on blinking.

That's because the error path of pciehp_sysfs_enable_slot() doesn't call
pciehp_green_led_off(), unlike pciehp_power_thread() which does.
The bug has been present since 2004 when the driver was introduced.

Fix by deduplicating common code in pciehp_sysfs_enable_slot() and
pciehp_power_thread() into a wrapper function pciehp_enable_slot() and
renaming the existing function to __pciehp_enable_slot(). Same for
pciehp_disable_slot(). This will also simplify the upcoming rework of
pciehp's event handling.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: pciehp: Convert to threaded polling · ec07a447

Lukas Wunner authored Jul 19, 2018

We've just converted pciehp to threaded IRQ handling, but still cannot
sleep in pciehp_ist() because the function is also called in poll mode,
which runs in softirq context (from a timer).

Convert poll mode to a kthread so that pciehp_ist() always runs in task
context.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>


PCI: pciehp: Convert to threaded IRQ · 7b4ce26b

Lukas Wunner authored Jul 19, 2018

pciehp's IRQ handler queues up a work item for each event signaled by
the hardware. A more modern alternative is to let a long running
kthread service the events. The IRQ handler's sole job is then to check
whether the IRQ originated from the device in question, acknowledge its
receipt to the hardware to quiesce the interrupt and wake up the kthread.

One benefit is reduced latency to handle the IRQ, which is a necessity
for realtime environments. Another benefit is that we can make pciehp
simpler and more robust by handling events synchronously in process
context, rather than asynchronously by queueing up work items. pciehp's
usage of work items is a historic artifact: it predates the introduction
of threaded IRQ handlers by two years. (The former was introduced in
2007 with commit 5d386e1a ("pciehp: Event handling rework"), the
latter in 2009 with commit 3aa551c9 ("genirq: add threaded interrupt
handler support").)

Convert pciehp to threaded IRQ handling by retrieving the pending events
in pciehp_isr(), saving them for later consumption by the thread handler
pciehp_ist() and clearing them in the Slot Status register.

By clearing the Slot Status (and thereby acknowledging the events) in
pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which
would have the unpleasant side effect of starving devices sharing the
IRQ until pciehp_ist() has finished.

pciehp_isr() does not count how many times each event occurred, but
merely records the fact *that* an event occurred. If the same event
occurs a second time before pciehp_ist() is woken, that second event
will not be recorded separately, which is problematic according to
commit fad214b0 ("PCI: pciehp: Process all hotplug events before
looking for new ones") because we may miss removal of a card in-between
two back-to-back insertions. We're about to make pciehp_ist() resilient
to missed events. The present commit regresses the driver's behavior
temporarily in order to separate the changes into reviewable chunks.
This doesn't affect regular slow-motion hotplug, only plug-unplug-plug
operations that happen in a timespan shorter than wakeup of the IRQ
thread.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

7b4ce26b
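
The split between pciehp_isr() and pciehp_ist() described in 7b4ce26b can be approximated by a small userspace model using C11 atomics. Everything here is an assumption made for illustration: struct toy_ctrl, the TOY_EVT_* bits, toy_isr() and toy_ist() are hypothetical names, and the Slot Status register is reduced to a plain integer that is cleared directly. The sketch shows why a repeated event is recorded only once: the handler ORs bits into a mask rather than counting occurrences.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical event bits; not the real Slot Status layout. */
#define TOY_EVT_PRESENCE (1u << 0)
#define TOY_EVT_LINK     (1u << 1)
#define TOY_EVT_BUTTON   (1u << 2)

struct toy_ctrl {
	unsigned int slot_status;    /* stand-in for the hardware register */
	atomic_uint pending_events;  /* saved for the thread handler */
};

/*
 * Hard IRQ part: returns 0 if the interrupt is not ours, otherwise
 * acknowledges the events, saves them and returns 1 ("wake the thread").
 */
int toy_isr(struct toy_ctrl *c)
{
	unsigned int events;

	assert(c != NULL);
	events = c->slot_status;
	if (!events)
		return 0;
	c->slot_status &= ~events;                    /* acknowledge */
	atomic_fetch_or(&c->pending_events, events);  /* record, do not count */
	return 1;
}

/* Thread part: atomically takes all pending events and returns them. */
unsigned int toy_ist(struct toy_ctrl *c)
{
	assert(c != NULL);
	return atomic_exchange(&c->pending_events, 0);
}
```

If the presence bit is raised twice before toy_ist() runs, the thread sees a single TOY_EVT_PRESENCE. That is the information loss the commit message describes for plug-unplug-plug sequences.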

PCI: pciehp: Document struct slot and struct controller · 4aed1cd6

Lukas Wunner authored Jul 19, 2018

Document the driver's data structures to lower the barrier to entry for
contributors.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

4aed1cd6

PCI: pciehp: Declare pciehp_unconfigure_device() void · 1d2e2673

Lukas Wunner authored Jul 19, 2018

Since commit 0f4bd801 ("PCI: hotplug: Drop checking of
PCI_BRIDGE_CONTROL in *_unconfigure_device()"), pciehp_unconfigure_device() can no
longer fail, so declare it and its sole caller remove_board() void, in
keeping with the usual kernel pattern that enablement can fail, but
disablement cannot.  No functional change intended.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>

1d2e2673

PCI: pciehp: Drop unnecessary NULL pointer check · 6641311d

Lukas Wunner authored Jul 19, 2018

pciehp_disable_slot() checks if the ctrl attribute of the slot is NULL
and bails out if so. However the function is not called prior to the
attribute being set in pcie_init_slot(), and pcie_init_slot() is not
called if ctrl is NULL. So the check is unnecessary. Drop it.

It has been present ever since the driver was introduced in 2004, but it
was already unnecessary back then:
https://git.kernel.org/tglx/history/c/c16b4b14d980
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

6641311d

PCI: pciehp: Fix unprotected list iteration in IRQ handler · 1204e35b

Lukas Wunner authored Jul 19, 2018

Commit b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.

This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself.  This works because
pci_ignore_hotplug() sets the flag both on the device as well as on its
parent bridge.

We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.

Fixes: b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org

1204e35b
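
The reasoning in 1204e35b (checking the flag on the port is enough because the flag is set on both the device and its parent bridge) can be illustrated with a toy model. The names struct toy_dev, toy_ignore_hotplug() and toy_port_ignores_hotplug() are hypothetical, and the model has no locking or device list at all. It only shows why no list iteration is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device with a link to its upstream port. */
struct toy_dev {
	struct toy_dev *parent_bridge;  /* NULL if there is no upstream port */
	bool ignore_hotplug;
};

/* Marks the device and, if present, its parent bridge. */
void toy_ignore_hotplug(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->ignore_hotplug = true;
	if (dev->parent_bridge)
		dev->parent_bridge->ignore_hotplug = true;
}

/*
 * What the interrupt handler needs to ask: a single flag read on the
 * port itself, with no walk over the devices below it.
 */
bool toy_port_ignores_hotplug(const struct toy_dev *port)
{
	assert(port != NULL);
	return port->ignore_hotplug;
}
```

The trade-off named in the commit message is visible here as well: the port-level flag does not say which child device requested it.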

PCI: pciehp: Fix use-after-free on unplug · 281e878e

Lukas Wunner authored Jul 19, 2018

When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ.  The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).

At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:

  pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present

An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once.  Unplugging can also be simulated by
powering down slots via sysfs.  The bug is particularly easy to trigger
in poll mode.

It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980

Fix by rearranging teardown such that the IRQ is freed first.  Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.4

281e878e

PCI: hotplug: Don't leak pci_slot on registration failure · 4ce64358

Lukas Wunner authored Jul 19, 2018

If addition of sysfs files fails on registration of a hotplug slot, the
struct pci_slot as well as the entry in the slot_list is leaked.  The
issue has been present since the hotplug core was introduced in 2002:
https://git.kernel.org/tglx/history/c/a8a2069f432c

Perhaps the idea was that even though sysfs addition fails, the slot
should still be usable.  But that's not how drivers use the interface:
they abort probe if a non-zero value is returned.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.4.15+
Cc: Greg Kroah-Hartman <greg@kroah.com>

4ce64358

PCI: hotplug: Delete skeleton driver · b4efce5c

Lukas Wunner authored Jun 19, 2018

Ten years ago, commit 58319b80 ("PCI: Hotplug core: remove 'name'")
dropped the name element from struct hotplug_slot but neglected to update
the skeleton driver.

That same year, commit f46753c5 ("PCI: introduce pci_slot") raised the
number of arguments to pci_hp_register() from one to four.

Fourteen years ago, historic commit 7ab60fc1 ("PCI Hotplug skeleton:
final cleanups") removed all usages of the retval variable from
pcihp_skel_init() but not the variable itself, provoking a compiler
warning: https://git.kernel.org/tglx/history/c/7ab60fc1b8e7

It seems fair to assume the driver hasn't been used as a template for a new
driver in a while.  Per Bjorn's and Christoph's preference, delete it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>

b4efce5c

26 Jun, 2018 2 commits

PCI: shpchp: Separate existence of SHPC and permission to use it · b03799b0

Bjorn Helgaas authored Jun 25, 2018

The shpchp driver registers for all PCI bridge devices.  Its probe method
should fail if either (1) the bridge doesn't have an SHPC or (2) the OS
isn't allowed to use it (the platform firmware may be operating the SHPC
itself).

Separate these two tests into:

  - A new shpc_capable() that looks for the SHPC hardware and is applicable
    on all systems (ACPI and non-ACPI), and

  - A simplified acpi_get_hp_hw_control_from_firmware() that we call only
    when we already know an SHPC exists and there may be ACPI methods to
    either request permission to use it (_OSC) or transfer control to the
    OS (OSHP).

acpi_get_hp_hw_control_from_firmware() is implemented when CONFIG_ACPI=y,
but does nothing if the current platform doesn't support ACPI.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

b03799b0
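
The two-step probe decision described in b03799b0 can be sketched as follows. This is a userspace illustration with hypothetical names (struct toy_shpc_bridge, toy_shpc_capable(), toy_shpc_permitted(), toy_shpc_probe()), and the use of -ENODEV as the failure value is an assumption rather than a statement about the driver. It shows the hardware test running first and the permission test being consulted only once an SHPC is known to exist.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a bridge as seen by the probe routine. */
struct toy_shpc_bridge {
	bool has_shpc;            /* the SHPC hardware is present */
	bool firmware_owns_shpc;  /* platform firmware operates the SHPC itself */
	int permission_checks;    /* counts how often permission was queried */
};

/* Test 1: does the hardware exist? Applies to every kind of system. */
bool toy_shpc_capable(const struct toy_shpc_bridge *b)
{
	return b->has_shpc;
}

/* Test 2: may the OS use it? Only meaningful once test 1 has passed. */
bool toy_shpc_permitted(struct toy_shpc_bridge *b)
{
	b->permission_checks++;
	return !b->firmware_owns_shpc;
}

int toy_shpc_probe(struct toy_shpc_bridge *b)
{
	assert(b != NULL);
	if (!toy_shpc_capable(b))
		return -ENODEV;   /* no SHPC: permission is never queried */
	if (!toy_shpc_permitted(b))
		return -ENODEV;   /* SHPC exists but firmware keeps control */
	return 0;
}
```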

PCI: shpchp: Manage SHPC unconditionally on non-ACPI systems · 6f6f4246

Bjorn Helgaas authored Jun 25, 2018

An SHPC can be operated either by platform firmware or by the OS.  The OS
uses a host bridge ACPI _OSC method to negotiate for control of SHPC.  If
firmware wants to prevent an OS from operating an SHPC, it must supply an
_OSC method that declines to grant SHPC ownership to the OS.

If acpi_pci_find_root() returns NULL, it means there's no ACPI host bridge
device (PNP0A03 or PNP0A08) and hence no _OSC method, so the OS is always
allowed to manage the SHPC.

Fix a NULL pointer dereference when CONFIG_ACPI=y but the current
hardware/firmware platform doesn't support ACPI.  In that case,
acpi_get_hp_hw_control_from_firmware() is implemented but
acpi_pci_find_root() returns NULL.

Fixes: 90cc0c3c ("PCI: shpchp: Add shpchp_is_native()")
Link: https://lkml.kernel.org/r/20180621164715.28160-1-marc.zyngier@arm.com
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

6f6f4246
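
The rule stated in 6f6f4246 (no ACPI host bridge means no _OSC method, so the OS may always manage the SHPC) amounts to a NULL check before the root is used. A minimal sketch follows, assuming hypothetical types struct toy_acpi_root and struct toy_bridge, and a boolean standing in for the outcome of the _OSC negotiation. None of these names come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of looking up the ACPI host bridge. */
struct toy_acpi_root {
	bool osc_grants_shpc;  /* what the _OSC negotiation would return */
};

struct toy_bridge {
	/* NULL when the platform has no ACPI host bridge device. */
	const struct toy_acpi_root *root;
};

bool toy_os_may_manage_shpc(const struct toy_bridge *bridge)
{
	assert(bridge != NULL);

	/* No host bridge, hence no _OSC: nothing can decline ownership. */
	if (!bridge->root)
		return true;

	/* Only now is it safe to dereference the root. */
	return bridge->root->osc_grants_shpc;
}
```

Without the early return, the last line would dereference a NULL pointer on a platform without ACPI support, which corresponds to the crash the commit fixes.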

16 Jun, 2018 8 commits

Linux 4.18-rc1 · ce397d21

Linus Torvalds authored Jun 17, 2018

ce397d21

Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block · 265c5596

Linus Torvalds authored Jun 17, 2018

Pull block fixes from Jens Axboe:
 "A collection of fixes that should go into -rc1. This contains:

   - bsg_open vs bsg_unregister race fix (Anatoliy)

   - NVMe pull request from Christoph, with fixes for regressions in
     this window, FC connect/reconnect path code unification, and a
     trace point addition.

   - timeout fix (Christoph)

   - remove a few unused functions (Christoph)

   - blk-mq tag_set reinit fix (Roman)"

* tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
  bsg: fix race of bsg_open and bsg_unregister
  block: remov blk_queue_invalidate_tags
  nvme-fabrics: fix and refine state checks in __nvmf_check_ready
  nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
  nvme-fabrics: refactor queue ready check
  blk-mq: remove blk_mq_tagset_iter
  nvme: remove nvme_reinit_tagset
  nvme-fc: fix nulling of queue data on reconnect
  nvme-fc: remove reinit_request routine
  blk-mq: don't time out requests again that are in the timeout handler
  nvme-fc: change controllers first connect to use reconnect path
  nvme: don't rely on the changed namespace list log
  nvmet: free smart-log buffer after use
  nvme-rdma: fix error flow during mapping request data
  nvme: add bio remapping tracepoint
  nvme: fix NULL pointer dereference in nvme_init_subsystem
  blk-mq: reinit q->tag_set_list entry only after grace period

265c5596

Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental · 5e7b9212

Linus Torvalds authored Jun 17, 2018

Pull documentation fixes from Mauro Carvalho Chehab:
 "This solves a series of broken links for files under Documentation,
  and improves a script meant to detect such broken links (see
  scripts/documentation-file-ref-check).

  The changes on this series are:

   - can.rst: fix a footnote reference;

   - crypto_engine.rst: Fix two parsing warnings;

   - Fix a lot of broken references to Documentation/*;

   - improve the scripts/documentation-file-ref-check script, in order
     to help detecting/fixing broken references, preventing
     false-positives.

  After this patch series, only 33 broken references to doc files are
  detected by scripts/documentation-file-ref-check"

* tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
  fix a series of Documentation/ broken file name references
  Documentation: rstFlatTable.py: fix a broken reference
  ABI: sysfs-devices-system-cpu: remove a broken reference
  devicetree: fix a series of wrong file references
  devicetree: fix name of pinctrl-bindings.txt
  devicetree: fix some bindings file names
  MAINTAINERS: fix location of DT npcm files
  MAINTAINERS: fix location of some display DT bindings
  kernel-parameters.txt: fix pointers to sound parameters
  bindings: nvmem/zii: Fix location of nvmem.txt
  docs: Fix more broken references
  scripts/documentation-file-ref-check: check tools/*/Documentation
  scripts/documentation-file-ref-check: get rid of false-positives
  scripts/documentation-file-ref-check: hint: dash or underline
  scripts/documentation-file-ref-check: add a fix logic for DT
  scripts/documentation-file-ref-check: accept more wildcards at filenames
  scripts/documentation-file-ref-check: fix help message
  media: max2175: fix location of driver's companion documentation
  media: v4l: fix broken video4linux docs locations
  media: dvb: point to the location of the old README.dvb-usb file
  ...

5e7b9212

Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · dbb2816f

Linus Torvalds authored Jun 17, 2018

Pull fsnotify updates from Jan Kara:
 "fsnotify cleanups unifying handling of different watch types.

  This is the shortened fsnotify series from Amir with the last five
  patches pulled out. Amir has modified those patches to not change
  struct inode but obviously it's too late for those to go into this
  merge window"

* tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: add fsnotify_add_inode_mark() wrappers
  fanotify: generalize fanotify_should_send_event()
  fsnotify: generalize send_to_group()
  fsnotify: generalize iteration of marks by object type
  fsnotify: introduce marks iteration helpers
  fsnotify: remove redundant arguments to handle_event()
  fsnotify: use type id to identify connector object type

dbb2816f

Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux · 644f2639

Linus Torvalds authored Jun 17, 2018

Pull fbdev updates from Bartlomiej Zolnierkiewicz:
 "There is nothing really major here, few small fixes, some cleanups and
  dead drivers removal:

   - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)

   - add missing module license tags to omap/omapfb driver (Arnd
     Bergmann)

   - add missing GPIOLIB dependency to omap2/omapfb driver (Arnd
     Bergmann)

   - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
     (Jia-Ju Bai)

   - allow COMPILE_TEST build for viafb driver (media part was reviewed
     by media subsystem Maintainer)

   - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
     drivers (drm parts were acked by shmob-drm driver Maintainer)

   - remove unused auo_k190xfb drivers

   - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
     Shevchenko, Colin Ian King)"

* tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
  fb_omap2: add gpiolib dependency
  video/omap: add module license tags
  MAINTAINERS: make omapfb orphan
  video: fbdev: pxafb: match_string() conversion fixup
  video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
  video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
  video: fbdev: pxafb: Convert to use match_string() helper
  video: fbdev: via: allow COMPILE_TEST build
  video: fbdev: remove unused sh_mobile_meram driver
  drm: shmobile: remove unused MERAM support
  video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
  video: fbdev: remove unused auo_k190xfb drivers
  video: omap: Improve a size determination in omapfb_do_probe()
  video: sm501fb: Improve a size determination in sm501fb_probe()
  video: fbdev-MMP: Improve a size determination in path_init()
  video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
  video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
  video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
  video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
  video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
  ...

644f2639

Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 35773c93

Linus Torvalds authored Jun 16, 2018

Pull AFS updates from Al Viro:
 "Assorted AFS stuff - ended up in vfs.git since most of that consists
  of David's AFS-related followups to Christoph's procfs series"

* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  afs: Optimise callback breaking by not repeating volume lookup
  afs: Display manually added cells in dynamic root mount
  afs: Enable IPv6 DNS lookups
  afs: Show all of a server's addresses in /proc/fs/afs/servers
  afs: Handle CONFIG_PROC_FS=n
  proc: Make inline name size calculation automatic
  afs: Implement network namespacing
  afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
  afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
  proc: Add a way to make network proc files writable
  afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
  afs: Rearrange fs/afs/proc.c to move the show routines up
  afs: Rearrange fs/afs/proc.c by moving fops and open functions down
  afs: Move /proc management functions to the end of the file

35773c93

Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 29d6849d

Linus Torvalds authored Jun 16, 2018

Pull compat updates from Al Viro:
 "Some biarch patches - getting rid of assorted (mis)uses of
  compat_alloc_user_space().

  Not much in that area this cycle..."

* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: simplify compat ioctl handling
  signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
  vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart

29d6849d

Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · a5b729ea

Linus Torvalds authored Jun 16, 2018

Pull aio fixes from Al Viro:
 "Assorted AIO followups and fixes"

* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  eventpoll: switch to ->poll_mask
  aio: only return events requested in poll_mask() for IOCB_CMD_POLL
  eventfd: only return events requested in poll_mask()
  aio: mark __aio_sigset::sigmask const

a5b729ea

15 Jun, 2018 1 commit

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 9215310c

Linus Torvalds authored Jun 16, 2018

Pull networking fixes from David Miller:

 1) Various netfilter fixlets from Pablo and the netfilter team.

 2) Fix regression in IPVS caused by lack of PMTU exceptions on local
    routes in ipv6, from Julian Anastasov.

 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia.

 4) Don't crash on poll in TLS, from Daniel Borkmann.

 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things
    including Avahi mDNS. From Bart Van Assche.

 6) Missing of_node_put in qcom/emac driver, from Yue Haibing.

 7) We lack checking of the TCP checksum in one special case during SYN
    receive, from Frank van der Linden.

 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg.

 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman.

10) Must grab HW caps before doing quirk checks in stmmac driver, from
    Jose Abreu.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
  net: stmmac: Run HWIF Quirks after getting HW caps
  neighbour: skip NTF_EXT_LEARNED entries during forced gc
  net: cxgb3: add error handling for sysfs_create_group
  tls: fix waitall behavior in tls_sw_recvmsg
  tls: fix use-after-free in tls_push_record
  l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()
  l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels
  mlxsw: spectrum_switchdev: Fix port_vlan refcounting
  mlxsw: spectrum_router: Align with new route replace logic
  mlxsw: spectrum_router: Allow appending to dev-only routes
  ipv6: Only emit append events for appended routes
  stmmac: added support for 802.1ad vlan stripping
  cfg80211: fix rcu in cfg80211_unregister_wdev
  mac80211: Move up init of TXQs
  mac80211_hwsim: fix module init error paths
  cfg80211: initialize sinfo in cfg80211_get_station
  nl80211: fix some kernel doc tag mistakes
  hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload
  rds: avoid unenecessary cong_update in loop transport
  l2tp: clean up stale tunnel or session in pppol2tp_connect's error path
  ...

9215310c