1. 24 Mar, 2020 32 commits
  2. 23 Mar, 2020 1 commit
  3. 22 Mar, 2020 7 commits
      Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 67d584e3
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Two fixes.
      
        The first is a regression: when dropping some incompat bits the
        conditions were reversed. The other is a fix for rename whiteout
        potentially leaving stack memory linked to a list"
      
      * tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix removal of raid[56|1c34] incompat flags after removing block group
        btrfs: fix log context list corruption after rename whiteout error
      67d584e3
      Merge branch 'akpm' (patches from Andrew) · b3c03db6
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "10 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        x86/mm: split vmalloc_sync_all()
        mm, slub: prevent kmalloc_node crashes and memory leaks
        mm/mmu_notifier: silence PROVE_RCU_LIST warnings
        epoll: fix possible lost wakeup on epoll_ctl() path
        mm: do not allow MADV_PAGEOUT for CoW pages
        mm, memcg: throttle allocators based on ancestral memory.high
        mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
        page-flags: fix a crash at SetPageError(THP_SWAP)
        mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
        memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
      b3c03db6
      x86/mm: split vmalloc_sync_all() · 763802b5
      Joerg Roedel authored
      Commit 3f8fd02b ("mm/vmalloc: Sync unmappings in
      __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
      the vunmap() code-path.  While this change was necessary to maintain
      correctness on x86-32-PAE kernels, it also costs cycles on
      architectures that don't need it.
      
      Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
      severe performance regressions in micro-benchmarks because it now also
      calls the x86-64 implementation of vmalloc_sync_all() on vunmap().  But
      the vmalloc_sync_all() implementation on x86-64 is only needed for newly
      created mappings.
      
      To avoid the unnecessary work on x86-64 and to gain the performance
      back, split up vmalloc_sync_all() into two functions:
      
      	* vmalloc_sync_mappings(), and
      	* vmalloc_sync_unmappings()
      
      Most call sites of vmalloc_sync_all() only care about new mappings
      being synchronized.  The only exception is the new call site added in
      the above-mentioned commit.
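
      A minimal sketch of the split (illustrative, not the verbatim kernel
      diff; the function names match the patch, the bodies are elided):

        /* sync newly created vmalloc mappings into all page tables */
        void vmalloc_sync_mappings(void);

        /* sync unmappings; needed on x86-32 PAE, a no-op on x86-64 */
        void vmalloc_sync_unmappings(void);

        static void purge_vmap_area_lazy_sketch(void)
        {
        	/* before: vmalloc_sync_all() did both work items on every arch */
        	vmalloc_sync_unmappings();	/* after: only what vunmap() needs */
        }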
      
      Shile Zhang directed us to a report of an 80% regression in reaim
      throughput.
      
      Fixes: 3f8fd02b ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Borislav Petkov <bp@suse.de>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[GHES]
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org
      Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
      Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      763802b5
      mm, slub: prevent kmalloc_node crashes and memory leaks · 0715e6c5
      Vlastimil Babka authored
      Sachin reports [1] a crash in SLUB __slab_alloc():
      
        BUG: Kernel NULL pointer dereference on read at 0x000073b0
        Faulting instruction address: 0xc0000000003d55f4
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in:
        CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
        NIP:  c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
        REGS: c0000008b37836d0 TRAP: 0300   Not tainted  (5.6.0-rc2-next-20200218-autotest)
        MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004844  XER: 00000000
        CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
        GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
        GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
        GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
        GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
        GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
        GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
        GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
        GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
        NIP ___slab_alloc+0x1f4/0x760
        LR __slab_alloc+0x34/0x60
        Call Trace:
          ___slab_alloc+0x334/0x760 (unreliable)
          __slab_alloc+0x34/0x60
          __kmalloc_node+0x110/0x490
          kvmalloc_node+0x58/0x110
          mem_cgroup_css_online+0x108/0x270
          online_css+0x48/0xd0
          cgroup_apply_control_enable+0x2ec/0x4d0
          cgroup_mkdir+0x228/0x5f0
          kernfs_iop_mkdir+0x90/0xf0
          vfs_mkdir+0x110/0x230
          do_mkdirat+0xb0/0x1a0
          system_call+0x5c/0x68
      
      This is a PowerPC platform with following NUMA topology:
      
        available: 2 nodes (0-1)
        node 0 cpus:
        node 0 size: 0 MB
        node 0 free: 0 MB
        node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
        node 1 size: 35247 MB
        node 1 free: 30907 MB
        node distances:
        node   0   1
          0:  10  40
          1:  40  10
      
        possible numa nodes: 0-31
      
      This only happens with a mmotm patch "mm/memcontrol.c: allocate
      shrinker_map on appropriate NUMA node" [2] which effectively calls
      kmalloc_node for each possible node.  SLUB however only allocates
      kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
      node_to_mem_node to return such valid node for other nodes since commit
      a561ce00 ("slub: fall back to node_to_mem_node() node if allocating
      on memoryless node").  This is however not true in this configuration
      where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31,
      thus it contains zeroes and get_partial() ends up accessing
      non-allocated kmem_cache_node.
      
      A related issue was reported by Bharata (originally by Ramachandran) [3]
      where a similar PowerPC configuration, but with a mainline kernel
      without patch [2], ends up allocating large numbers of pages for
      kmalloc-1k and kmalloc-512.  This seems to have the same underlying
      issue with node_to_mem_node() not behaving as expected, and may also
      lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].
      
      This patch should fix both issues by not relying on node_to_mem_node()
      anymore and instead simply falling back to NUMA_NO_NODE, when
      kmalloc_node(node) is attempted for a node that's not online, or has no
      usable memory.  The "usable memory" condition is also changed from
      node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly
      the condition that SLUB uses to allocate kmem_cache_node structures.
      The check in get_partial() is removed completely, as the checks in
      ___slab_alloc() are now sufficient to prevent get_partial() being
      reached with an invalid node.
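
      A minimal sketch of the new fallback (simplified; the surrounding
      ___slab_alloc() retry logic is omitted):

        /* node is not online or has no normal memory: don't trust
         * node_to_mem_node(), just let SLUB pick any usable node */
        if (node != NUMA_NO_NODE && !node_state(node, N_NORMAL_MEMORY))
        	node = NUMA_NO_NODE;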
      
      [1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/
      [2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/
      [3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/
      [4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/
      
      Fixes: a561ce00 ("slub: fall back to node_to_mem_node() node if allocating on memoryless node")
      Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
      Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
      Tested-by: Bharata B Rao <bharata@linux.ibm.com>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Nathan Lynch <nathanl@linux.ibm.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.cz
      Debugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0715e6c5
      mm/mmu_notifier: silence PROVE_RCU_LIST warnings · 63886bad
      Qian Cai authored
      It is safe to traverse mm->notifier_subscriptions->list either under
      SRCU read lock or mm->notifier_subscriptions->lock using
      hlist_for_each_entry_rcu().  Silence the PROVE_RCU_LIST false positives,
      for example,
      
        WARNING: suspicious RCU usage
        -----------------------------
        mm/mmu_notifier.c:484 RCU-list traversed in non-reader section!!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 2, debug_locks = 1
        3 locks held by libvirtd/802:
         #0: ffff9321e3f58148 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xe1/0x3e0
         #1: ffffffff91ae6160 (mmu_notifier_invalidate_range_start){+.+.}, at: change_p4d_range+0x5fa/0x800
         #2: ffffffff91ae6e08 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x178/0x460
      
        stack backtrace:
        CPU: 7 PID: 802 Comm: libvirtd Tainted: G          I       5.6.0-rc6-next-20200317+ #2
        Hardware name: HP ProLiant BL460c Gen8, BIOS I31 11/02/2014
        Call Trace:
          dump_stack+0xa4/0xfe
          lockdep_rcu_suspicious+0xeb/0xf5
          __mmu_notifier_invalidate_range_start+0x3ff/0x460
          change_p4d_range+0x746/0x800
          change_protection+0x1df/0x300
          mprotect_fixup+0x245/0x3e0
          do_mprotect_pkey+0x23b/0x3e0
          __x64_sys_mprotect+0x51/0x70
          do_syscall_64+0x91/0xae8
          entry_SYSCALL_64_after_hwframe+0x49/0xb3
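
      The silencing itself is a one-argument change: hlist_for_each_entry_rcu()
      accepts an optional lockdep expression stating when the traversal is
      legitimate.  A minimal sketch (variable names abridged from
      mm/mmu_notifier.c):

        /* the condition documents that holding the SRCU read lock
         * (or subscriptions->lock) makes this non-reader traversal safe,
         * so PROVE_RCU_LIST stops warning */
        hlist_for_each_entry_rcu(subscription, &subscriptions->list, hlist,
        			 srcu_read_lock_held(&srcu)) {
        	/* ... invoke the notifier callback ... */
        }
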
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Link: http://lkml.kernel.org/r/20200317175640.2047-1-cai@lca.pw
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63886bad
      epoll: fix possible lost wakeup on epoll_ctl() path · 1b53734b
      Roman Penyaev authored
      This fixes a possible lost wakeup introduced by commit a218cc49.
      Originally modifications to ep->wq were serialized by ep->wq.lock, but
      in commit a218cc49 ("epoll: use rwlock in order to reduce
      ep_poll_callback() contention") a new rw lock was introduced in order to
      relax fd event path, i.e. callers of ep_poll_callback() function.
      
      After the change ep_modify and ep_insert (both are called on epoll_ctl()
      path) were switched to ep->lock, but ep_poll (epoll_wait) was using
      ep->wq.lock on wqueue list modification.
      
      The bug doesn't lead to any wqueue list corruption, because the wakeup
      path and list modifications were serialized by ep->wq.lock internally,
      but the waitqueue_active() check prior to the wake_up() call can be
      reordered with modifications of the ep ready list, so a wakeup can be
      lost.
      
      And yes, it can be healed by an explicit smp_mb():

        list_add_tail(&epi->rdllink, &ep->rdllist);
        smp_mb();
        if (waitqueue_active(&ep->wq))
      	wake_up(&ep->wq);
      
      But let's keep it simple: this patch replaces ep->wq.lock with
      ep->lock for wqueue modifications, so the wakeup path always observes
      the activeness of the wqueue correctly.
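
      A minimal sketch of the resulting ep_poll() locking (abridged; the
      wait-entry setup and error handling are omitted):

        /* ep->lock is the rwlock ep_poll_callback() also takes, so its
         * waitqueue_active(&ep->wq) check cannot be reordered past this
         * list modification */
        write_lock_irq(&ep->lock);
        __add_wait_queue_exclusive(&ep->wq, &wait);
        write_unlock_irq(&ep->lock);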
      
      Fixes: a218cc49 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
      Reported-by: Max Neunhoeffer <max@arangodb.com>
      Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Max Neunhoeffer <max@arangodb.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Jes Sorensen <jes.sorensen@gmail.com>
      Cc: <stable@vger.kernel.org>	[5.1+]
      Link: http://lkml.kernel.org/r/20200214170211.561524-1-rpenyaev@suse.de
      References: https://bugzilla.kernel.org/show_bug.cgi?id=205933
      Bisected-by: Max Neunhoeffer <max@arangodb.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b53734b
      mm: do not allow MADV_PAGEOUT for CoW pages · 12e967fd
      Michal Hocko authored
      Jann has brought up a very interesting point [1].  While shared pages
      are excluded from MADV_PAGEOUT normally, CoW pages can be easily
      reclaimed that way.  This can lead to all sorts of hard-to-debug
      problems, e.g. the performance problems outlined by Daniel [2].
      
      There are runtime environments where substantial memory is shared
      among security domains via CoW, and an easy way to reclaim that
      memory, which MADV_{COLD,PAGEOUT} offers, can lead either to
      performance degradation for the parent process, which might be more
      privileged, or even open side-channel attacks.

      The feasibility of the latter is not really clear to me TBH, but there
      is no real reason for exposure at this stage.  It seems there is no
      real use case that depends on reclaiming CoW memory via madvise at
      this stage, so it is much easier to simply disallow it, and that is
      what this patch does.  Put simply, MADV_{PAGEOUT,COLD} can operate
      only on exclusively owned memory, which is a straightforward semantic.
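
      A minimal sketch of the corresponding check in
      madvise_cold_or_pageout_pte_range() (abridged; an analogous test
      guards the huge-PMD path):

        /* Do not interfere with other mappings of this page */
        if (page_mapcount(page) != 1)
        	continue;	/* shared, e.g. CoW after fork: skip it */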
      
      [1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
      [2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com
      
      Fixes: 9c276cc6 ("mm: introduce MADV_COLD")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      12e967fd