1. 26 Jan, 2019 40 commits
    • ocfs2: fix panic due to unrecovered local alloc · f976e59e
      Junxiao Bi authored
      [ Upstream commit 532e1e54 ]
      
      mount.ocfs2 ignores the inconsistency where the journal is clean but the
      local alloc is unrecovered.  After mount, the local alloc is not empty,
      so reserving clusters does not allocate a new local alloc window and the
      reservation map stays empty (ocfs2_reservation_map.m_bitmap_len = 0),
      which triggered the panic shown below.
      
      This issue was reported at
      
        https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html
      
      and the advice was to fix it during mount.  But this is a very unusual
      inconsistent state: usually the journal dirty flag is cleared only at the
      last stage of umount, after everything else has gone right.  We may need
      to do further debugging to check that.  In any case, to avoid possible
      further corruption, the mount should be aborted and fsck should be run.
      
        (mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered!
        found = 6518, set = 6518, taken = 8192, off = 15912372
        ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode.
        o2dlm: Joining domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes
        ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode.
        o2hb: Region 89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device
        o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777
        o2dlm: Node 7 joins domain 64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
        o2dlm: Node 7 joins domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes
        ------------[ cut here ]------------
        kernel BUG at fs/ocfs2/reservations.c:507!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod
        CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2
        Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018
        task: ffff8803fb04e200 ti: ffff8800ea4d8000 task.ti: ffff8800ea4d8000
        RIP: 0010:[<ffffffffa05e96a8>]  [<ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
        Call Trace:
          ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2]
          ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2]
          __ocfs2_claim_clusters+0x178/0x360 [ocfs2]
          ocfs2_claim_clusters+0x1f/0x30 [ocfs2]
          ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2]
          ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2]
          ocfs2_write_begin+0x13e/0x230 [ocfs2]
          generic_perform_write+0xbf/0x1c0
          __generic_file_write_iter+0x19c/0x1d0
          ocfs2_file_write_iter+0x589/0x1360 [ocfs2]
          __vfs_write+0xb8/0x110
          vfs_write+0xa9/0x1b0
          SyS_write+0x46/0xb0
          system_call_fastpath+0x18/0xd7
        Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85
        RIP   __ocfs2_resv_find_window+0x498/0x760 [ocfs2]
         RSP <ffff8800ea4db668>
        ---[ end trace 566f07529f2edf3c ]---
        Kernel panic - not syncing: Fatal exception
        Kernel Offset: disabled
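The check argued for above can be sketched in user-space C. This is a minimal illustration, not the upstream patch: the struct, its fields, and the function name are hypothetical, and the choice of -EINVAL as the error code is an assumption. Only the decision itself (clean journal plus unrecovered local alloc means abort the mount and run fsck) comes from the message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of what mount knows about this slot. */
struct mount_state {
	bool journal_dirty;            /* journal still needs replay */
	unsigned local_alloc_bits_set; /* bits still set in the local alloc */
};

/*
 * Return 0 if mounting may continue, or -EINVAL (assumed error code) if the
 * on-disk state is inconsistent: the journal is clean, yet the local alloc
 * still holds bits. A dirty journal goes through the normal recovery path,
 * which this sketch does not model.
 */
int check_local_alloc_on_mount(const struct mount_state *st)
{
	if (!st->journal_dirty && st->local_alloc_bits_set != 0)
		return -EINVAL; /* abort the mount; fsck should be run */
	return 0;
}
```

With the values from the log above (`set = 6518` and a clean journal), such a check would refuse the mount instead of letting a later write hit the BUG in reservations.c.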
      
      Link: http://lkml.kernel.org/r/20181121020023.3034-2-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
      Acked-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: megaraid: fix out-of-bound array accesses · c010d494
      Qian Cai authored
      [ Upstream commit c7a082e4 ]
      
      UBSAN reported the following with a MegaRAID SAS-3 3108:
      
      [   77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
      [   77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
      [   77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1
      [   77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
      [   77.495791] Workqueue: events work_for_cpu_fn
      [   77.500154] Call trace:
      [   77.502610]  dump_backtrace+0x0/0x2c8
      [   77.506279]  show_stack+0x24/0x30
      [   77.509604]  dump_stack+0x118/0x19c
      [   77.513098]  ubsan_epilogue+0x14/0x60
      [   77.516765]  __ubsan_handle_out_of_bounds+0xfc/0x13c
      [   77.521767]  mr_update_load_balance_params+0x150/0x158 [megaraid_sas]
      [   77.528230]  MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas]
      [   77.533825]  megasas_get_map_info+0x244/0x2f0 [megaraid_sas]
      [   77.539505]  megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas]
      [   77.545794]  megasas_init_fw+0x1ab4/0x3518 [megaraid_sas]
      [   77.551212]  megasas_probe_one+0x2c4/0xbe0 [megaraid_sas]
      [   77.556614]  local_pci_probe+0x7c/0xf0
      [   77.560365]  work_for_cpu_fn+0x34/0x50
      [   77.564118]  process_one_work+0x61c/0xf08
      [   77.568129]  worker_thread+0x534/0xa70
      [   77.571882]  kthread+0x1c8/0x1d0
      [   77.575114]  ret_from_fork+0x10/0x1c
      
      [   89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
      [   89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
      [   89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1
      [   89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
      [   89.268903] Workqueue: events_unbound async_run_entry_fn
      [   89.274222] Call trace:
      [   89.276680]  dump_backtrace+0x0/0x2c8
      [   89.280348]  show_stack+0x24/0x30
      [   89.283671]  dump_stack+0x118/0x19c
      [   89.287167]  ubsan_epilogue+0x14/0x60
      [   89.290835]  __ubsan_handle_out_of_bounds+0xfc/0x13c
      [   89.295828]  MR_LdRaidGet+0x50/0x58 [megaraid_sas]
      [   89.300638]  megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas]
      [   89.306576]  megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas]
      [   89.313468]  megasas_queue_command+0x398/0x3d0 [megaraid_sas]
      [   89.319222]  scsi_dispatch_cmd+0x1dc/0x8a8
      [   89.323321]  scsi_request_fn+0x8e8/0xdd0
      [   89.327249]  __blk_run_queue+0xc4/0x158
      [   89.331090]  blk_execute_rq_nowait+0xf4/0x158
      [   89.335449]  blk_execute_rq+0xdc/0x158
      [   89.339202]  __scsi_execute+0x130/0x258
      [   89.343041]  scsi_probe_and_add_lun+0x2fc/0x1488
      [   89.347661]  __scsi_scan_target+0x1cc/0x8c8
      [   89.351848]  scsi_scan_channel.part.3+0x8c/0xc0
      [   89.356382]  scsi_scan_host_selected+0x130/0x1f0
      [   89.361002]  do_scsi_scan_host+0xd8/0xf0
      [   89.364927]  do_scan_async+0x9c/0x320
      [   89.368594]  async_run_entry_fn+0x138/0x420
      [   89.372780]  process_one_work+0x61c/0xf08
      [   89.376793]  worker_thread+0x13c/0xa70
      [   89.380546]  kthread+0x1c8/0x1d0
      [   89.383778]  ret_from_fork+0x10/0x1c
      
      This is because, when populating the Driver Map from the firmware RAID map,
      all non-existing VDs have their ldTgtIdToLd set to 0xff so that they can be
      skipped later.
      
      From drivers/scsi/megaraid/megaraid_sas_base.c ,
      memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS);
      
      From drivers/scsi/megaraid/megaraid_sas_fp.c ,
      /* For non existing VDs, iterate to next VD*/
      if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1))
      	continue;
      
      However, there are a few places that failed to skip those non-existing VDs
      due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff,
      map) and triggered the out-of-bound accesses.
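To make the off-by-one concrete, here is a small user-space sketch. The value 256 for MAX_LOGICAL_DRIVES_EXT and the form of the "buggy" comparison are assumptions for illustration; the message only quotes the correct check and says that some call sites were off by one.

```c
#include <stdbool.h>

/* Assumed value for illustration; the message only names the macro. */
#define MAX_LOGICAL_DRIVES_EXT 256

/*
 * One plausible off-by-one variant (an assumption): 0xff (255) is not
 * >= 256, so the "non-existing VD" marker is NOT skipped.
 */
bool skip_ld_buggy(unsigned int ld)
{
	return ld >= MAX_LOGICAL_DRIVES_EXT;
}

/* Check as quoted from megaraid_sas_fp.c: 0xff >= 255, so it is skipped. */
bool skip_ld_fixed(unsigned int ld)
{
	return ld >= (MAX_LOGICAL_DRIVES_EXT - 1);
}
```

Under these assumptions, only the second form keeps 0xff from reaching an array index such as the one in MR_LdRaidGet().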
      
      Fixes: 51087a86 ("megaraid_sas : Extended VD support")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() · 008b4a70
      Yanjiang Jin authored
      [ Upstream commit e57b2945 ]
      
      We must free all irqs during shutdown, else kexec's 2nd kernel would hang
      in pqi_wait_for_completion_io() as below:
      
      Call trace:
      
       pqi_wait_for_completion_io
       pqi_submit_raid_request_synchronous.constprop.78+0x23c/0x310 [smartpqi]
       pqi_configure_events+0xec/0x1f8 [smartpqi]
       pqi_ctrl_init+0x814/0xca0 [smartpqi]
       pqi_pci_probe+0x400/0x46c [smartpqi]
       local_pci_probe+0x48/0xb0
       pci_device_probe+0x14c/0x1b0
       really_probe+0x218/0x3fc
       driver_probe_device+0x70/0x140
       __driver_attach+0x11c/0x134
       bus_for_each_dev+0x70/0xc8
       driver_attach+0x30/0x38
       bus_add_driver+0x1f0/0x294
       driver_register+0x74/0x12c
       __pci_register_driver+0x64/0x70
       pqi_init+0xd0/0x10000 [smartpqi]
       do_one_initcall+0x60/0x1d8
       do_init_module+0x64/0x1f8
       load_module+0x10ec/0x1350
       __se_sys_finit_module+0xd4/0x100
       __arm64_sys_finit_module+0x28/0x34
       el0_svc_handler+0x104/0x160
       el0_svc+0x8/0xc
      
      This happens only in the following combinations:
      
      1. smartpqi is built as module, not built-in;
      2. We have a disk connected to smartpqi card;
      3. Both kexec's 1st and 2nd kernels use this disk as Rootfs' mount point.
      
      Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com>
      Acked-by: Don Brace <don.brace@microsemi.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: smartpqi: correct lun reset issues · ceecd0c6
      Kevin Barnett authored
      [ Upstream commit 2ba55c98 ]
      
      Problem:
      The Linux kernel takes a logical volume offline after a LUN reset.  This is
      generally accompanied by this message in the dmesg output:
      
      Device offlined - not ready after error recovery
      
      Root Cause:
      The root cause is a "quirk" in the timeout handling in the Linux SCSI
      layer. The Linux kernel places a 30-second timeout on most media access
      commands (reads and writes) that it sends to device drivers.  When a media
      access command times out, the Linux kernel goes into error recovery mode
      for the LUN that was the target of the command that timed out. Every
      command that timed out is kept on a list inside of the Linux kernel to be
      retried later. The kernel attempts to recover the command(s) that timed out
      by issuing a LUN reset followed by a TEST UNIT READY. If the LUN reset and
      TEST UNIT READY commands are successful, the kernel retries the command(s)
      that timed out.
      
      Each SCSI command issued by the kernel has a result field associated with
      it. This field indicates the final result of the command (success or
      error). When a command times out, the kernel places a value in this result
      field indicating that the command timed out.
      
      The "quirk" is that after the LUN reset and TEST UNIT READY commands are
      completed, the kernel checks each command on the timed-out command list
      before retrying it. If the result field is still "timed out", the kernel
      treats that command as not having been successfully recovered for a
      retry. If the number of commands in this state is greater than
      two, the kernel takes the LUN offline.
      
      Fix:
      When our RAIDStack receives a LUN reset, it simply waits until all
      outstanding commands complete. Generally, all of these outstanding commands
      complete successfully. Therefore, the fix in the smartpqi driver is to
      always set the command result field to indicate success when a request
      completes successfully. This normally isn’t necessary because the result
      field is always initialized to success when the command is submitted to the
      driver. So when the command completes successfully, the result field is
      left untouched. But in this case, the kernel changes the result field
      behind the driver’s back and then expects the field to be changed by the
      driver as the commands that timed-out complete.
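The interaction described above can be modelled in a few lines of user-space C. All names here are hypothetical stand-ins, not the smartpqi or SCSI mid-layer API; the sketch only shows why leaving the result field untouched on success is not enough once the kernel has overwritten it.

```c
/* Hypothetical stand-ins for the result values described above. */
enum cmd_result { RESULT_OK = 0, RESULT_TIMED_OUT = 1 };

struct cmd {
	enum cmd_result result; /* initialised to RESULT_OK at submission */
};

/* What the kernel does, per the text: mark a command as timed out. */
void midlayer_mark_timed_out(struct cmd *c)
{
	c->result = RESULT_TIMED_OUT;
}

/* Old behaviour: on success the driver leaves the result field untouched. */
void complete_ok_old(struct cmd *c)
{
	(void)c;
}

/* Fixed behaviour: on success the driver always writes the success value. */
void complete_ok_fixed(struct cmd *c)
{
	c->result = RESULT_OK;
}
```

In this model, a command that timed out and later completes successfully still reads "timed out" with the old completion path, which is the state that leads the kernel to take the LUN offline.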
      
      Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
      Reviewed-by: Scott Teel <scott.teel@microsemi.com>
      Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
      Signed-off-by: Don Brace <don.brace@microsemi.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • IB/usnic: Fix potential deadlock · 9d3e1c09
      Parvi Kaustubhi authored
      [ Upstream commit 8036e90f ]
      
      Acquiring the rtnl lock while holding usdev_lock could result in a
      deadlock.
      
      For example:
      
      usnic_ib_query_port()
      | mutex_lock(&us_ibdev->usdev_lock)
       | ib_get_eth_speed()
        | rtnl_lock()
      
      rtnl_lock()
      | usnic_ib_netdevice_event()
       | mutex_lock(&us_ibdev->usdev_lock)
      
      This commit moves the usdev_lock acquisition after the rtnl lock has been
      released.
      
      This is safe to do because usdev_lock is not protecting anything being
      accessed in ib_get_eth_speed(). Hence, the correct order of holding locks
      (rtnl -> usdev_lock) is not violated.
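The ordering argument can be illustrated with a toy lock-order tracker in user-space C. Everything here is hypothetical (no real mutexes, no kernel API); it only records whether rtnl is ever acquired while usdev_lock is held, which is the inversion described above.

```c
#include <stdbool.h>

/* Toy lock identifiers; hypothetical, for illustration only. */
enum { LOCK_RTNL = 1, LOCK_USDEV = 2 };

struct lock_tracker {
	unsigned held;       /* bitmask of locks currently held */
	bool order_violated; /* set if rtnl is taken while usdev_lock is held */
};

void take(struct lock_tracker *t, unsigned lock)
{
	/* Required order is rtnl -> usdev_lock, so the reverse is a violation. */
	if (lock == LOCK_RTNL && (t->held & LOCK_USDEV))
		t->order_violated = true;
	t->held |= lock;
}

void drop(struct lock_tracker *t, unsigned lock)
{
	t->held &= ~lock;
}

/* Stand-in for ib_get_eth_speed(): takes and releases rtnl internally. */
void get_eth_speed(struct lock_tracker *t)
{
	take(t, LOCK_RTNL);
	drop(t, LOCK_RTNL);
}

/* Before: usdev_lock is held across the call that takes rtnl. */
void query_port_old(struct lock_tracker *t)
{
	take(t, LOCK_USDEV);
	get_eth_speed(t);
	drop(t, LOCK_USDEV);
}

/* After: rtnl has been released before usdev_lock is acquired. */
void query_port_new(struct lock_tracker *t)
{
	get_eth_speed(t);
	take(t, LOCK_USDEV);
	drop(t, LOCK_USDEV);
}
```

Running `query_port_old()` on a fresh tracker flags a violation, while `query_port_new()` does not, mirroring the reordering this commit makes.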
      
      Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • sysfs: Disable lockdep for driver bind/unbind files · d9c25a6f
      Daniel Vetter authored
      [ Upstream commit 4f4b3743 ]
      
      This is the much more correct fix for my earlier attempt at:
      
      https://lkml.org/lkml/2018/12/10/118
      
      Short recap:
      
      - There's not actually a locking issue, it's just lockdep being a bit
        too eager to complain about a possible deadlock.
      
      - Contrary to what I claimed the real problem is recursion on
        kn->count. Greg pointed me at sysfs_break_active_protection(), used
        by the scsi subsystem to allow a sysfs file to unbind itself. That
        would be a real deadlock, which isn't what's happening here. Also,
        breaking the active protection means we'd need to manually handle
        all the lifetime fun.
      
      - With Rafael we discussed the task_work approach, which kinda works,
        but has two downsides: It's a functional change for a lockdep
        annotation issue, and it won't work for the bind file (which needs
        to get the errno from the driver load function back to userspace).
      
      - Greg also asked why this never showed up: To hit this you need to
        unregister a 2nd driver from the unload code of your first driver. I
        guess only gpus do that. The bug has always been there, but only
        with a recent patch series did we add more locks so that lockdep
        built a chain from unbinding the snd-hda driver to the
        acpi_video_unregister call.
      
      Full lockdep splat:
      
      [12301.898799] ============================================
      [12301.898805] WARNING: possible recursive locking detected
      [12301.898811] 4.20.0-rc7+ #84 Not tainted
      [12301.898815] --------------------------------------------
      [12301.898821] bash/5297 is trying to acquire lock:
      [12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80
      [12301.898841] but task is already holding lock:
      [12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
      [12301.898856] other info that might help us debug this:
      [12301.898862]  Possible unsafe locking scenario:
      [12301.898867]        CPU0
      [12301.898870]        ----
      [12301.898874]   lock(kn->count#39);
      [12301.898879]   lock(kn->count#39);
      [12301.898883] *** DEADLOCK ***
      [12301.898891]  May be due to missing lock nesting notation
      [12301.898899] 5 locks held by bash/5297:
      [12301.898903]  #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0
      [12301.898915]  #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190
      [12301.898925]  #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190
      [12301.898936]  #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240
      [12301.898950]  #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40
      [12301.898960] stack backtrace:
      [12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ #84
      [12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
      [12301.898982] Call Trace:
      [12301.898989]  dump_stack+0x67/0x9b
      [12301.898997]  __lock_acquire+0x6ad/0x1410
      [12301.899003]  ? kernfs_remove_by_name_ns+0x3b/0x80
      [12301.899010]  ? find_held_lock+0x2d/0x90
      [12301.899017]  ? mutex_spin_on_owner+0xe4/0x150
      [12301.899023]  ? find_held_lock+0x2d/0x90
      [12301.899030]  ? lock_acquire+0x90/0x180
      [12301.899036]  lock_acquire+0x90/0x180
      [12301.899042]  ? kernfs_remove_by_name_ns+0x3b/0x80
      [12301.899049]  __kernfs_remove+0x296/0x310
      [12301.899055]  ? kernfs_remove_by_name_ns+0x3b/0x80
      [12301.899060]  ? kernfs_name_hash+0xd/0x80
      [12301.899066]  ? kernfs_find_ns+0x6c/0x100
      [12301.899073]  kernfs_remove_by_name_ns+0x3b/0x80
      [12301.899080]  bus_remove_driver+0x92/0xa0
      [12301.899085]  acpi_video_unregister+0x24/0x40
      [12301.899127]  i915_driver_unload+0x42/0x130 [i915]
      [12301.899160]  i915_pci_remove+0x19/0x30 [i915]
      [12301.899169]  pci_device_remove+0x36/0xb0
      [12301.899176]  device_release_driver_internal+0x185/0x240
      [12301.899183]  unbind_store+0xaf/0x180
      [12301.899189]  kernfs_fop_write+0x104/0x190
      [12301.899195]  __vfs_write+0x31/0x180
      [12301.899203]  ? rcu_read_lock_sched_held+0x6f/0x80
      [12301.899209]  ? rcu_sync_lockdep_assert+0x29/0x50
      [12301.899216]  ? __sb_start_write+0x13c/0x1a0
      [12301.899221]  ? vfs_write+0x17f/0x1b0
      [12301.899227]  vfs_write+0xb9/0x1b0
      [12301.899233]  ksys_write+0x50/0xc0
      [12301.899239]  do_syscall_64+0x4b/0x180
      [12301.899247]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [12301.899253] RIP: 0033:0x7f452ac7f7a4
      [12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
      [12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4
      [12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001
      [12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730
      [12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d
      [12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d
      
      Looking around I've noticed that usb and i2c already handle similar
      recursion problems, where a sysfs file can unbind the same type of
      sysfs somewhere else in the hierarchy. Relevant commits are:
      
      commit 356c05d5
      Author: Alan Stern <stern@rowland.harvard.edu>
      Date:   Mon May 14 13:30:03 2012 -0400
      
          sysfs: get rid of some lockdep false positives
      
      commit e9b526fe
      Author: Alexander Sverdlin <alexander.sverdlin@nsn.com>
      Date:   Fri May 17 14:56:35 2013 +0200
      
          i2c: suppress lockdep warning on delete_device
      
      Implement the same trick for driver bind/unbind.
      
      v2: Put the macro into bus.c (Greg).
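As a rough user-space model of the idea (not the kernel macro itself), the "trick" amounts to marking specific attribute files so that the lock checker does not track them. The struct, field, and macro names below are hypothetical stand-ins chosen for this sketch.

```c
#include <stdbool.h>

/* User-space mock; field and macro names are hypothetical stand-ins. */
struct mock_attr {
	const char *name;
	bool ignore_lockdep; /* true: the checker skips this file */
};

/* Ordinary attribute: tracked by the checker. */
#define MOCK_ATTR(_name)                { #_name, false }
/* Attribute that opts out of tracking, as done for bind/unbind. */
#define MOCK_ATTR_IGNORE_LOCKDEP(_name) { #_name, true }

/* Would a lock checker track accesses through this attribute? */
bool checker_tracks(const struct mock_attr *a)
{
	return !a->ignore_lockdep;
}
```

The design point from the message is that this changes only the annotation, not the locking behaviour, which is why it also works for the bind file.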
      
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Ramalingam C <ramalingam.c@intel.com>
      Cc: Arend van Spriel <aspriel@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Bartosz Golaszewski <brgl@bgdev.pl>
      Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Cc: Vivek Gautam <vivek.gautam@codeaurora.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ALSA: bebob: fix model-id of unit for Apogee Ensemble · 7afd8e3b
      Takashi Sakamoto authored
      [ Upstream commit 644b2e97 ]
      
      This commit replaces the hard-coded model-id for the Apogee Ensemble unit
      with the correct value. This unit uses the DM1500 ASIC produced by ArchWave
      AG (formerly known as BridgeCo AG).
      
      Note that this model supports three modes for the number of data channels
      in tx/rx streams: 8 ch pairs, 10 ch pairs, and 18 ch pairs. The mode is
      switched by a vendor-dependent AV/C command, like:
      
      $ cd linux-firewire-utils
      $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0600000000 (8ch pairs)
      $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0601000000 (10ch pairs)
      $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0602000000 (18ch pairs)
      
      When switching between modes, the unit disappears from the IEEE 1394
      bus, then reappears on the bus with a different combination of stream formats.
      In the 18 ch pairs mode, the available sampling rate is up to 96.0 kHz; otherwise
      it is up to 192.0 kHz.
      
      $ ./hinawa-config-rom-printer /dev/fw1
      { 'bus-info': { 'adj': False,
                      'bmc': True,
                      'chip_ID': 21474898341,
                      'cmc': True,
                      'cyc_clk_acc': 100,
                      'generation': 2,
                      'imc': True,
                      'isc': True,
                      'link_spd': 2,
                      'max_ROM': 1,
                      'max_rec': 512,
                      'name': '1394',
                      'node_vendor_ID': 987,
                      'pmc': False},
        'root-directory': [ ['HARDWARE_VERSION', 19],
                            [ 'NODE_CAPABILITIES',
                              { 'addressing': {'64': True, 'fix': True, 'prv': False},
                                'misc': {'int': False, 'ms': False, 'spt': True},
                                'state': { 'atn': False,
                                           'ded': False,
                                           'drq': True,
                                           'elo': False,
                                           'init': False,
                                           'lst': True,
                                           'off': False},
                                'testing': {'bas': False, 'ext': False}}],
                            ['VENDOR', 987],
                            ['DESCRIPTOR', 'Apogee Electronics'],
                            ['MODEL', 126702],
                            ['DESCRIPTOR', 'Ensemble'],
                            ['VERSION', 5297],
                            [ 'UNIT',
                              [ ['SPECIFIER_ID', 41005],
                                ['VERSION', 65537],
                                ['MODEL', 126702],
                                ['DESCRIPTOR', 'Ensemble']]],
                            [ 'DEPENDENT_INFO',
                              [ ['SPECIFIER_ID', 2037],
                                ['VERSION', 1],
                                [(58, 'IMMEDIATE'), 16777159],
                                [(59, 'IMMEDIATE'), 1048576],
                                [(60, 'IMMEDIATE'), 16777159],
                                [(61, 'IMMEDIATE'), 6291456]]]]}
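The dump above prints identifiers in decimal, whereas match tables in C are usually written in hex. The sketch below rewrites the two relevant values (VENDOR 987 and MODEL 126702) in hex and compares against them; the struct and function names are hypothetical and this is not the driver's actual match table.

```c
#include <stdbool.h>

/* Values from the dump above, rewritten in hex: 987 and 126702. */
#define VENDOR_APOGEE  0x0003dbu
#define MODEL_ENSEMBLE 0x01eeeeu

/* Hypothetical holder for the IDs read from a unit's config ROM. */
struct unit_id {
	unsigned vendor;
	unsigned model;
};

/* True if the IDs match the Apogee Ensemble entries shown in the dump. */
bool is_apogee_ensemble(const struct unit_id *id)
{
	return id->vendor == VENDOR_APOGEE && id->model == MODEL_ENSEMBLE;
}
```

The same vendor value is visible as `0003db` inside the FCP command strings quoted earlier in this message.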
      
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clocksource/drivers/integrator-ap: Add missing of_node_put() · 16df3ffc
      Yangtao Li authored
      [ Upstream commit 5eb73c83 ]
      
      The function of_find_node_by_path() acquires a reference to the node
      returned by it and that reference needs to be dropped by its caller.
      
      integrator_ap_timer_init_of() doesn't do that.  The pri_node and the
      sec_node are used as an identifier to compare against the current
      node, so we can directly drop the refcount after getting the node from
      the path as it is not used as a pointer.
      
      By dropping the refcount right after getting it, a single variable is
      needed instead of two.
      
      Fix this by using a single variable and dropping the refcount right after
      of_find_node_by_path().
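The get-then-put pattern can be shown with a user-space mock of a refcounted node. The `mock_*` names and `is_aliased_node()` are hypothetical; they only mimic the reference semantics the message attributes to of_find_node_by_path() and of_node_put().

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space mock of a refcounted node; all names are hypothetical. */
struct mock_node {
	int refcount;
};

/* Like of_find_node_by_path(): returns the node with a reference taken. */
struct mock_node *mock_find_node(struct mock_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Like of_node_put(): drops one reference. */
void mock_node_put(struct mock_node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Is `cur` the node that the lookup resolves to? The looked-up pointer is
 * only compared, never dereferenced, so the reference can be dropped right
 * away and a single local variable is enough.
 */
bool is_aliased_node(struct mock_node *lookup_target, struct mock_node *cur)
{
	struct mock_node *alias = mock_find_node(lookup_target);

	mock_node_put(alias);
	return alias == cur;
}
```

After any number of such comparisons the mock refcount returns to its starting value, which is the leak-free behaviour the fix is after.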
      
      Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls. · 8c3e1bcc
      Javier Barrio authored
      [ Upstream commit 41c4f85c ]
      
      Commit 1fa5efe3 (ext4: Use generic helpers for quotaon
      and quotaoff) made it possible to call quotactl(Q_XQUOTAON/OFF) on ext4 filesystems
      with sysfile quota support. This leads to calling dquot_enable/disable without s_umount
      held in exclusive mode, because quotactl_cmd_onoff checks only for Q_QUOTAON/OFF.
      
      The following WARN_ON_ONCE triggers (in this case for dquot_enable, ext4, latest Linus' tree):
      
      [  117.807056] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: quota,prjquota
      
      [...]
      
      [  155.036847] WARNING: CPU: 0 PID: 2343 at fs/quota/dquot.c:2469 dquot_enable+0x34/0xb9
      [  155.036851] Modules linked in: quota_v2 quota_tree ipv6 af_packet joydev mousedev psmouse serio_raw pcspkr i2c_piix4 intel_agp intel_gtt e1000 ttm drm_kms_helper drm agpgart fb_sys_fops syscopyarea sysfillrect sysimgblt i2c_core input_leds kvm_intel kvm irqbypass qemu_fw_cfg floppy evdev parport_pc parport button crc32c_generic dm_mod ata_generic pata_acpi ata_piix libata loop ext4 crc16 mbcache jbd2 usb_storage usbcore sd_mod scsi_mod
      [  155.036901] CPU: 0 PID: 2343 Comm: qctl Not tainted 4.20.0-rc6-00025-gf5d58277 #9
      [  155.036903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [  155.036911] RIP: 0010:dquot_enable+0x34/0xb9
      [  155.036915] Code: 41 56 41 55 41 54 55 53 4c 8b 6f 28 74 02 0f 0b 4d 8d 7d 70 49 89 fc 89 cb 41 89 d6 89 f5 4c 89 ff e8 23 09 ea ff 85 c0 74 0a <0f> 0b 4c 89 ff e8 8b 09 ea ff 85 db 74 6a 41 8b b5 f8 00 00 00 0f
      [  155.036918] RSP: 0018:ffffb09b00493e08 EFLAGS: 00010202
      [  155.036922] RAX: 0000000000000001 RBX: 0000000000000008 RCX: 0000000000000008
      [  155.036924] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff9781b67cd870
      [  155.036926] RBP: 0000000000000002 R08: 0000000000000000 R09: 61c8864680b583eb
      [  155.036929] R10: ffffb09b00493e48 R11: ffffffffff7ce7d4 R12: ffff9781b7ee8d78
      [  155.036932] R13: ffff9781b67cd800 R14: 0000000000000004 R15: ffff9781b67cd870
      [  155.036936] FS:  00007fd813250b88(0000) GS:ffff9781ba000000(0000) knlGS:0000000000000000
      [  155.036939] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  155.036942] CR2: 00007fd812ff61d6 CR3: 000000007c882000 CR4: 00000000000006b0
      [  155.036951] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  155.036953] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  155.036955] Call Trace:
      [  155.037004]  dquot_quota_enable+0x8b/0xd0
      [  155.037011]  kernel_quotactl+0x628/0x74e
      [  155.037027]  ? do_mprotect_pkey+0x2a6/0x2cd
      [  155.037034]  __x64_sys_quotactl+0x1a/0x1d
      [  155.037041]  do_syscall_64+0x55/0xe4
      [  155.037078]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  155.037105] RIP: 0033:0x7fd812fe1198
      [  155.037109] Code: 02 77 0d 48 89 c1 48 c1 e9 3f 75 04 48 8b 04 24 48 83 c4 50 5b c3 48 83 ec 08 49 89 ca 48 63 d2 48 63 ff b8 b3 00 00 00 0f 05 <48> 89 c7 e8 c1 eb ff ff 5a c3 48 63 ff b8 bb 00 00 00 0f 05 48 89
      [  155.037112] RSP: 002b:00007ffe8cd7b050 EFLAGS: 00000206 ORIG_RAX: 00000000000000b3
      [  155.037116] RAX: ffffffffffffffda RBX: 00007ffe8cd7b148 RCX: 00007fd812fe1198
      [  155.037119] RDX: 0000000000000000 RSI: 00007ffe8cd7cea9 RDI: 0000000000580102
      [  155.037121] RBP: 00007ffe8cd7b0f0 R08: 000055fc8eba8a9d R09: 0000000000000000
      [  155.037124] R10: 00007ffe8cd7b074 R11: 0000000000000206 R12: 00007ffe8cd7b168
      [  155.037126] R13: 000055fc8eba8897 R14: 0000000000000000 R15: 0000000000000000
      [  155.037131] ---[ end trace 210f864257175c51 ]---
      
      and then the syscall proceeds without s_umount locking.
      
      This patch locks the superblock ->s_umount sem. in exclusive mode for all Q_XQUOTAON/OFF
      quotactls too in addition to Q_QUOTAON/OFF.
      
      AFAICT, other than ext4, only xfs and ocfs2 are affected by this change.
      The VFS will now call into xfs_quota_* functions with s_umount held, which wasn't the case
      before. This looks good to me, but I cannot say for sure. Ext4 and ocfs2 were already
      being called with s_umount exclusive via quota_quotaon/off, which is basically the same.
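A user-space sketch of the predicate change follows. The `SK_` constants are written out locally so the example is self-contained; treat their values as assumptions and use the kernel headers in real code. The two functions model the before and after behaviour of a check like quotactl_cmd_onoff and are not the kernel source.

```c
#include <stdbool.h>

/*
 * Command values written out locally so the sketch is self-contained;
 * they are assumptions here, not taken from the message.
 */
#define SK_Q_QUOTAON   0x800002
#define SK_Q_QUOTAOFF  0x800003
#define SK_Q_XQUOTAON  (('X' << 8) + 1)
#define SK_Q_XQUOTAOFF (('X' << 8) + 2)

/* Before: only the generic on/off commands require exclusive s_umount. */
bool cmd_onoff_old(int cmd)
{
	return cmd == SK_Q_QUOTAON || cmd == SK_Q_QUOTAOFF;
}

/* After: the Q_X variants are treated the same way. */
bool cmd_onoff_new(int cmd)
{
	return cmd == SK_Q_QUOTAON || cmd == SK_Q_QUOTAOFF ||
	       cmd == SK_Q_XQUOTAON || cmd == SK_Q_XQUOTAOFF;
}
```

With the old predicate a Q_XQUOTAON request is not classified as an on/off command, so s_umount is not taken exclusively, which matches the WARN_ON_ONCE in dquot_enable shown above.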
      
      Signed-off-by: Javier Barrio <javier.barrio.mart@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dm snapshot: Fix excessive memory usage and workqueue stalls · 8189ee47
      Nikos Tsironis authored
      [ Upstream commit 721b1d98 ]
      
      kcopyd has no upper limit to the number of jobs one can allocate and
      issue. Under certain workloads this can lead to excessive memory usage
      and workqueue stalls. For example, when creating multiple dm-snapshot
      targets with a 4K chunk size and then writing to the origin through the
      page cache. Syncing the page cache causes a large number of BIOs to be
      issued to the dm-snapshot origin target, which itself issues an even
      larger (because of the BIO splitting taking place) number of kcopyd
      jobs.
      
      Running the following test, from the device mapper test suite [1],
      
        dmtest run --suite snapshot -n many_snapshots_of_same_volume_N
      
      , with 8 active snapshots, results in the kcopyd job slab cache growing
      to 10G. Depending on the available system RAM this can lead to the OOM
      killer killing user processes:
      
      [463.492878] kthreadd invoked oom-killer: gfp_mask=0x6040c0(GFP_KERNEL|__GFP_COMP),
                    nodemask=(null), order=1, oom_score_adj=0
      [463.492894] kthreadd cpuset=/ mems_allowed=0
      [463.492948] CPU: 7 PID: 2 Comm: kthreadd Not tainted 4.19.0-rc7 #3
      [463.492950] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      [463.492952] Call Trace:
      [463.492964]  dump_stack+0x7d/0xbb
      [463.492973]  dump_header+0x6b/0x2fc
      [463.492987]  ? lockdep_hardirqs_on+0xee/0x190
      [463.493012]  oom_kill_process+0x302/0x370
      [463.493021]  out_of_memory+0x113/0x560
      [463.493030]  __alloc_pages_slowpath+0xf40/0x1020
      [463.493055]  __alloc_pages_nodemask+0x348/0x3c0
      [463.493067]  cache_grow_begin+0x81/0x8b0
      [463.493072]  ? cache_grow_begin+0x874/0x8b0
      [463.493078]  fallback_alloc+0x1e4/0x280
      [463.493092]  kmem_cache_alloc_node+0xd6/0x370
      [463.493098]  ? copy_process.part.31+0x1c5/0x20d0
      [463.493105]  copy_process.part.31+0x1c5/0x20d0
      [463.493115]  ? __lock_acquire+0x3cc/0x1550
      [463.493121]  ? __switch_to_asm+0x34/0x70
      [463.493129]  ? kthread_create_worker_on_cpu+0x70/0x70
      [463.493135]  ? finish_task_switch+0x90/0x280
      [463.493165]  _do_fork+0xe0/0x6d0
      [463.493191]  ? kthreadd+0x19f/0x220
      [463.493233]  kernel_thread+0x25/0x30
      [463.493235]  kthreadd+0x1bf/0x220
      [463.493242]  ? kthread_create_on_cpu+0x90/0x90
      [463.493248]  ret_from_fork+0x3a/0x50
      [463.493279] Mem-Info:
      [463.493285] active_anon:20631 inactive_anon:4831 isolated_anon:0
      [463.493285]  active_file:80216 inactive_file:80107 isolated_file:435
      [463.493285]  unevictable:0 dirty:51266 writeback:109372 unstable:0
      [463.493285]  slab_reclaimable:31191 slab_unreclaimable:3483521
      [463.493285]  mapped:526 shmem:4903 pagetables:1759 bounce:0
      [463.493285]  free:33623 free_pcp:2392 free_cma:0
      ...
      [463.493489] Unreclaimable slab info:
      [463.493513] Name                      Used          Total
      [463.493522] bio-6                   1028KB       1028KB
      [463.493525] bio-5                   1028KB       1028KB
      [463.493528] dm_snap_pending_exception     236783KB     243789KB
      [463.493531] dm_exception              41KB         42KB
      [463.493534] bio-4                   1216KB       1216KB
      [463.493537] bio-3                 439396KB     439396KB
      [463.493539] kcopyd_job           6973427KB    6973427KB
      ...
      [463.494340] Out of memory: Kill process 1298 (ruby2.3) score 1 or sacrifice child
      [463.494673] Killed process 1298 (ruby2.3) total-vm:435740kB, anon-rss:20180kB, file-rss:4kB, shmem-rss:0kB
      [463.506437] oom_reaper: reaped process 1298 (ruby2.3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      Moreover, issuing a large number of kcopyd jobs results in kcopyd
      hogging the CPU, while processing them. As a result, processing of work
      items, queued for execution on the same CPU as the currently running
      kcopyd thread, is stalled for long periods of time, hurting performance.
      Running the aforementioned test we get, in dmesg, messages like the
      following:
      
      [67501.194592] BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 27s!
      [67501.195586] Showing busy workqueues and worker pools:
      [67501.195591] workqueue events: flags=0x0
      [67501.195597]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195611]     pending: cache_reap
      [67501.195641] workqueue mm_percpu_wq: flags=0x8
      [67501.195645]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195656]     pending: vmstat_update
      [67501.195682] workqueue kblockd: flags=0x18
      [67501.195687]   pwq 5: cpus=2 node=0 flags=0x0 nice=-20 active=1/256
      [67501.195698]     pending: blk_timeout_work
      [67501.195753] workqueue kcopyd: flags=0x8
      [67501.195757]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195768]     pending: do_work [dm_mod]
      [67501.195802] workqueue kcopyd: flags=0x8
      [67501.195806]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195817]     pending: do_work [dm_mod]
      [67501.195834] workqueue kcopyd: flags=0x8
      [67501.195838]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195848]     pending: do_work [dm_mod]
      [67501.195881] workqueue kcopyd: flags=0x8
      [67501.195885]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256
      [67501.195896]     pending: do_work [dm_mod]
      [67501.195920] workqueue kcopyd: flags=0x8
      [67501.195924]   pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=2/256
      [67501.195935]     in-flight: 67:do_work [dm_mod]
      [67501.195945]     pending: do_work [dm_mod]
      [67501.195961] pool 8: cpus=4 node=0 flags=0x0 nice=0 hung=27s workers=3 idle: 129 23765
      
      The root cause for these issues is the way dm-snapshot uses kcopyd. In
      particular, the lack of an explicit or implicit limit to the maximum
      number of in-flight COW jobs. The merging path is not affected because
      it implicitly limits the in-flight kcopyd jobs to one.
      
      Fix these issues by using a semaphore to limit the maximum number of
      in-flight kcopyd jobs. We grab the semaphore before allocating a new
      kcopyd job in start_copy() and start_full_bio() and release it after the
      job finishes in copy_callback().
      
      The initial semaphore value is configurable through a module parameter,
      to allow fine tuning the maximum number of in-flight COW jobs. Setting
      this parameter to zero initializes the semaphore to INT_MAX.
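
      The throttling idea above can be modelled in user space. The following is a
      minimal sketch, not the dm-snapshot code: the names cow_throttle_* are
      hypothetical, C11 atomics stand in for the kernel semaphore, and acquisition
      fails instead of sleeping when the limit is reached (an in-kernel semaphore
      would block the caller instead).

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of the COW throttle; not kernel code. */
struct cow_throttle {
    atomic_int slots;   /* remaining in-flight job slots */
};

/* A limit of zero selects INT_MAX, i.e. effectively no limit. */
void cow_throttle_init(struct cow_throttle *t, unsigned int limit)
{
    int initial = (limit == 0 || limit > (unsigned int)INT_MAX)
                      ? INT_MAX : (int)limit;

    atomic_init(&t->slots, initial);
}

/* Take one slot before allocating a job; false means the limit is reached. */
bool cow_throttle_try_acquire(struct cow_throttle *t)
{
    int cur = atomic_load(&t->slots);

    while (cur > 0) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&t->slots, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Give the slot back once the job's completion callback has run. */
void cow_throttle_release(struct cow_throttle *t)
{
    atomic_fetch_add(&t->slots, 1);
}

/* Number of slots currently free. */
int cow_throttle_available(struct cow_throttle *t)
{
    return atomic_load(&t->slots);
}
```

      In this model a caller would acquire a slot right before creating a copy job
      and release it from the completion path, so the number of live jobs never
      exceeds the configured limit.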
      
      A default value of 2048 maximum in-flight kcopyd jobs was chosen. This
      value was decided experimentally as a trade-off between memory
      consumption, stalling the kernel's workqueues and maintaining a high
      enough throughput.
      
      Re-running the aforementioned test:
      
        * Workqueue stalls are eliminated
        * kcopyd's job slab cache uses a maximum of 130MB
        * The time taken by the test to write to the snapshot-origin target is
          reduced from 05m20.48s to 03m26.38s
      
      [1] https://github.com/jthornber/device-mapper-test-suite

      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8189ee47
    • Arnaldo Carvalho de Melo's avatar
      tools lib subcmd: Don't add the kernel sources to the include path · 55f603dc
      Arnaldo Carvalho de Melo authored
      [ Upstream commit ece98049 ]
      
      At some point we decided not to directly include kernel source files
      when building tools/perf/, but when tools/lib/subcmd/ was forked from
      tools/perf it somehow ended up adding it via these two lines in its
      Makefile:
      
        CFLAGS += -I$(srctree)/include/uapi
        CFLAGS += -I$(srctree)/include
      
      As $(srctree) points to the kernel sources.
      
      Removing those lines and keeping just:
      
        CFLAGS += -I$(srctree)/tools/include/
      
      Is enough to build tools/perf and tools/objtool.
      
      This fixes the build when building from the sources in environments such
      as the Android NDK crossbuilding from a fedora:26 system:
      
        subcmd-util.h:11:15: error: expected ',' or ';' before 'void'
         static inline void report(const char *prefix, const char *err, va_list params)
                       ^
        In file included from /git/perf/include/uapi/linux/stddef.h:2:0,
                         from /git/perf/include/uapi/linux/posix_types.h:5,
                         from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h:36,
                         from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/unistd.h:33,
                         from run-command.c:2:
        subcmd-util.h:18:17: error: '__no_instrument_function__' attribute applies only to functions
      
      The /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h
      file that includes linux/posix_types.h ends up getting the one in the kernel
      sources causing the breakage. Fix it.
      
      Test built tools/objtool/ too.
      Reported-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 4b6ab94e ("perf subcmd: Create subcmd library")
      Link: https://lkml.kernel.org/n/tip-5lhaoecrj12t0bqwvpiu14sm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      55f603dc
    • Nikos Tsironis's avatar
      dm kcopyd: Fix bug causing workqueue stalls · 8561f28c
      Nikos Tsironis authored
      [ Upstream commit d7e6b8df ]
      
      When using kcopyd to run callbacks through dm_kcopyd_do_callback() or
      submitting copy jobs with a source size of 0, the jobs are pushed
      directly to the complete_jobs list, which could be under processing by
      the kcopyd thread. As a result, the kcopyd thread can continue running
      completed jobs indefinitely, without releasing the CPU, as long as
      someone keeps submitting new completed jobs through the aforementioned
      paths. Processing of work items, queued for execution on the same CPU as
      the currently running kcopyd thread, is thus stalled for excessive
      amounts of time, hurting performance.
      
      Running the following test, from the device mapper test suite [1],
      
        dmtest run --suite snapshot -n parallel_io_to_many_snaps_N
      
      , with 8 active snapshots, we get, in dmesg, messages like the
      following:
      
      [68899.948523] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 95s!
      [68899.949282] Showing busy workqueues and worker pools:
      [68899.949288] workqueue events: flags=0x0
      [68899.949295]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
      [68899.949306]     pending: vmstat_shepherd, cache_reap
      [68899.949331] workqueue mm_percpu_wq: flags=0x8
      [68899.949337]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949345]     pending: vmstat_update
      [68899.949387] workqueue dm_bufio_cache: flags=0x8
      [68899.949392]   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
      [68899.949400]     pending: work_fn [dm_bufio]
      [68899.949423] workqueue kcopyd: flags=0x8
      [68899.949429]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949437]     pending: do_work [dm_mod]
      [68899.949452] workqueue kcopyd: flags=0x8
      [68899.949458]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
      [68899.949466]     in-flight: 13:do_work [dm_mod]
      [68899.949474]     pending: do_work [dm_mod]
      [68899.949487] workqueue kcopyd: flags=0x8
      [68899.949493]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949501]     pending: do_work [dm_mod]
      [68899.949515] workqueue kcopyd: flags=0x8
      [68899.949521]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949529]     pending: do_work [dm_mod]
      [68899.949541] workqueue kcopyd: flags=0x8
      [68899.949547]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
      [68899.949555]     pending: do_work [dm_mod]
      [68899.949568] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=95s workers=4 idle: 27130 27223 1084
      
      Fix this by splitting the complete_jobs list into two parts: A user
      facing part, named callback_jobs, and one used internally by kcopyd,
      retaining the name complete_jobs. dm_kcopyd_do_callback() and
      dispatch_job() now push their jobs to the callback_jobs list, which is
      spliced to the complete_jobs list once, every time the kcopyd thread
      wakes up. This prevents kcopyd from hogging the CPU indefinitely and
      causing workqueue stalls.
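
      The two-list scheme can be illustrated with a small user-space model. This is
      a sketch under assumptions, not the kcopyd implementation: the struct and
      function names are hypothetical, the lists are singly linked and unlocked,
      and jobs are pushed at the head for brevity. The example callback resubmits
      its job every time it runs, which is the pattern that kept a single shared
      list from ever draining.

```c
#include <stddef.h>

struct client;

/* Hypothetical job: only the fields this sketch needs. */
struct job {
    struct job *next;
    struct client *owner;
    int runs;                        /* how often the callback has run */
    void (*fn)(struct job *self);    /* completion callback */
};

struct client {
    struct job *callback_jobs;   /* user facing: submitters push here */
    struct job *complete_jobs;   /* internal: drained by the worker */
};

/* Submitters only ever touch callback_jobs. */
void submit_completed_job(struct client *kc, struct job *j)
{
    j->next = kc->callback_jobs;
    kc->callback_jobs = j;
}

/* One worker wakeup: splice once, then drain only what was spliced. */
int worker_wakeup(struct client *kc)
{
    struct job **tail = &kc->complete_jobs;
    int processed = 0;

    /* Append everything submitted so far to the internal list. */
    while (*tail)
        tail = &(*tail)->next;
    *tail = kc->callback_jobs;
    kc->callback_jobs = NULL;

    /* Jobs submitted from here on wait for the next wakeup. */
    while (kc->complete_jobs) {
        struct job *j = kc->complete_jobs;

        kc->complete_jobs = j->next;
        j->next = NULL;
        j->fn(j);
        processed++;
    }
    return processed;
}

/* Example callback: immediately resubmits itself as a new completed job. */
void resubmit_forever(struct job *self)
{
    self->runs++;
    submit_completed_job(self->owner, self);
}
```

      With three such jobs queued, one call to worker_wakeup() runs each callback
      exactly once and returns, leaving the resubmitted jobs on callback_jobs for
      the next wakeup.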
      
      Re-running the aforementioned test:
      
        * Workqueue stalls are eliminated
        * The maximum writing time among all targets is reduced from 09m37.10s
          to 06m04.85s and the total run time of the test is reduced from
          10m43.591s to 7m19.199s
      
      [1] https://github.com/jthornber/device-mapper-test-suite

      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8561f28c
    • AliOS system security's avatar
      dm crypt: use u64 instead of sector_t to store iv_offset · 288a8efc
      AliOS system security authored
      [ Upstream commit 8d683dcd ]
      
      The iv_offset in the mapping table of the crypt target is a 64bit number when
      the IV algorithm is plain64, plain64be, essiv or benbi. It is assigned to
      iv_offset of struct crypt_config, cc_sector of struct convert_context and
      iv_sector of struct dm_crypt_request. These structure members are defined
      as sector_t, but sector_t is 32bit when CONFIG_LBDAF is not set in a 32bit
      kernel. In that situation sector_t is not big enough to store the 64bit
      iv_offset.
      
      Here is a reproducer.
      Prepare test image and device (loop is automatically allocated by cryptsetup):
      
        # dd if=/dev/zero of=tst.img bs=1M count=1
        # echo "tst"|cryptsetup open --type plain -c aes-xts-plain64 \
        --skip 500000000000000000 tst.img test
      
      On a 32bit system (using an IV offset value that needs 64 bits; CONFIG_LBDAF is off)
      the table and device checksum are wrong:
      
        # dmsetup table test --showkeys
        0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 3551657984 7:0 0
      
        # sha256sum /dev/mapper/test
        533e25c09176632b3794f35303488c4a8f3f965dffffa6ec2df347c168cb6c19 /dev/mapper/test
      
      On a 64bit system (and on a 32bit system with the patch), table and checksum are now correct:
      
        # dmsetup table test --showkeys
        0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 500000000000000000 7:0 0
      
        # sha256sum /dev/mapper/test
        5d16160f9d5f8c33d8051e65fdb4f003cc31cd652b5abb08f03aa6fce0df75fc /dev/mapper/test
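
      The wrong offset in the first table is exactly what a 32bit truncation of the
      requested value produces. A minimal user-space sketch, assuming a hypothetical
      narrow_sector_t typedef as a stand-in for sector_t without CONFIG_LBDAF:

```c
#include <stdint.h>

/* Hypothetical stand-in for sector_t on a 32bit kernel without CONFIG_LBDAF. */
typedef uint32_t narrow_sector_t;

/* Old behaviour: the parsed offset is silently truncated to 32 bits. */
uint64_t iv_offset_stored_narrow(uint64_t parsed)
{
    narrow_sector_t stored = (narrow_sector_t)parsed;

    return stored;
}

/* Fixed behaviour: the field is a full 64bit integer, nothing is lost. */
uint64_t iv_offset_stored_u64(uint64_t parsed)
{
    uint64_t stored = parsed;

    return stored;
}
```

      Passing 500000000000000000 to the narrow variant yields 3551657984, the
      value shown in the 32bit dmsetup output above, while the u64 variant keeps
      the full offset.
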
      Signed-off-by: AliOS system security <alios_sys_security@linux.alibaba.com>
      Tested-and-Reviewed-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      288a8efc
    • Taehee Yoo's avatar
      netfilter: ipt_CLUSTERIP: check MAC address when duplicate config is set · e6b1503c
      Taehee Yoo authored
      [ Upstream commit 06aa151a ]
      
      If a config for the same destination IP address already exists, that config
      is simply reused. The MAC address should be the same as well, but there was
      no routine to check it. This patch adds that MAC address check.
      
      test commands:
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	   -j CLUSTERIP --new --hashmode sourceip \
      	   --clustermac 01:00:5e:00:00:20 --total-nodes 2 --local-node 1
         %iptables -A INPUT -p tcp -i lo -d 192.168.0.5 --dport 80 \
      	   -j CLUSTERIP --new --hashmode sourceip \
      	   --clustermac 01:00:5e:00:00:21 --total-nodes 2 --local-node 1
      
      After this patch, above commands are disallowed.
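
      The check can be sketched as follows. This is an illustration under
      assumptions, not the ipt_CLUSTERIP code: struct cluster_config and
      config_reuse_check() are hypothetical, and the specific error codes are
      chosen for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified stand-in for an existing CLUSTERIP config. */
struct cluster_config {
    uint32_t clusterip;            /* destination IPv4 address */
    uint8_t clustermac[MAC_LEN];   /* cluster MAC address */
};

/*
 * Decide whether a new rule may reuse an existing config.
 * Returns 0 when IP and MAC both match, -EINVAL when the IP matches
 * but the MAC differs, and -ENOENT when there is no config for that IP.
 */
int config_reuse_check(const struct cluster_config *existing,
                       uint32_t ip, const uint8_t mac[MAC_LEN])
{
    if (existing == NULL || existing->clusterip != ip)
        return -ENOENT;
    if (memcmp(existing->clustermac, mac, MAC_LEN) != 0)
        return -EINVAL;   /* same IP, different MAC: reject the rule */
    return 0;
}
```

      With a config for 192.168.0.5 and MAC 01:00:5e:00:00:20 in place, a rule for
      the same IP with MAC 01:00:5e:00:00:21 is rejected in this model.
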
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e6b1503c
    • Arnaldo Carvalho de Melo's avatar
      perf parse-events: Fix unchecked usage of strncpy() · 05f94c60
      Arnaldo Carvalho de Melo authored
      [ Upstream commit bd8d57fb ]
      
      The strncpy() function may leave the destination string buffer
      unterminated. Better to use strlcpy(), for which we have a __weak fallback
      implementation on systems without it.
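
      The difference between the two calls can be shown with a short user-space
      sketch. sketch_strlcpy() is a hypothetical helper written here with
      strlcpy() semantics (it is not the fallback from the perf tree), and
      raw_strncpy() and is_terminated() exist only for the demonstration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with strlcpy() semantics: always NUL-terminates
 * when size > 0 and returns the length of src. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;   /* a result >= size means the copy was truncated */
}

/* Plain strncpy() with the bound equal to the destination size. */
void raw_strncpy(char *dst, const char *src, size_t size)
{
    strncpy(dst, src, size);
}

/* Returns 1 if buf holds a NUL byte within its first size bytes. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

      Copying "abcdef" into a 4 byte buffer leaves no NUL byte with raw_strncpy(),
      whereas sketch_strlcpy() stores "abc" and reports the truncation by
      returning 6.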
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        util/parse-events.c: In function 'print_symbol_events':
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2508:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2511:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 947b4ad1 ("perf list: Fix max event string size")
      Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      05f94c60
    • Arnaldo Carvalho de Melo's avatar
      perf svghelper: Fix unchecked usage of strncpy() · 9010bb9e
      Arnaldo Carvalho de Melo authored
      [ Upstream commit 2f530253 ]
      
      The strncpy() function may leave the destination string buffer
      unterminated. Better to use strlcpy(), for which we have a __weak fallback
      implementation on systems without it.
      
      In this specific case this would only happen if fgets() was buggy, as
      its man page states that it should read one less byte than the size of
      the destination buffer, so that it can put the nul byte at the end of
      it, so it would never copy 255 non-nul chars, as fgets reads into the
      orig buffer at most 254 non-nul chars and terminates it. But let's just
      switch to strlcpy to keep the original intent and silence the gcc 8.2
      warning.
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        In function 'cpu_model',
            inlined from 'svg_cpu_box' at util/svghelper.c:378:2:
        util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation]
             strncpy(cpu_m, &buf[13], 255);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Fixes: f48d55ce ("perf: Add a SVG helper library file")
      Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9010bb9e
    • Adrian Hunter's avatar
      perf intel-pt: Fix error with config term "pt=0" · e5ae88fa
      Adrian Hunter authored
      [ Upstream commit 1c6f709b ]
      
      Users should never use 'pt=0', but if they do it may give a meaningless
      error:
      
      	$ perf record -e intel_pt/pt=0/u uname
      	Error:
      	The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
      	event (intel_pt/pt=0/u).
      
      Fix that by forcing 'pt=1'.
      
      Committer testing:
      
        # perf record -e intel_pt/pt=0/u uname
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
        /bin/dmesg | grep -i perf may provide additional information.
      
        # perf record -e intel_pt/pt=0/u uname
        pt=0 doesn't make sense, forcing pt=1
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data ]
        #
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e5ae88fa
    • Sergey Senozhatsky's avatar
      tty/serial: do not free transmit buffer page under port lock · 03543845
      Sergey Senozhatsky authored
      [ Upstream commit d7240214 ]
      
      LKP has hit yet another circular locking dependency between uart
      console drivers and debugobjects [1]:
      
           CPU0                                    CPU1
      
                                                  rhltable_init()
                                                   __init_work()
                                                    debug_object_init
           uart_shutdown()                          /* db->lock */
            /* uart_port->lock */                    debug_print_object()
             free_page()                              printk()
                                                       call_console_drivers()
              debug_check_no_obj_freed()                /* uart_port->lock */
               /* db->lock */
                debug_print_object()
      
      So there are two dependency chains:
      	uart_port->lock -> db->lock
      And
      	db->lock -> uart_port->lock
      
      This particular circular locking dependency can be addressed in several
      ways:
      
      a) One way would be to move debug_print_object() out of db->lock scope
         and, thus, break the db->lock -> uart_port->lock chain.
      b) Another one would be to free() the transmit buffer page outside of
         uart_port->lock in the UART code, which is what this patch does.
      
      It makes sense to apply a) and b) independently: there are too many things
      going on behind free(), none of which depend on uart_port->lock.
      
      The patch fixes transmit buffer page free() in uart_shutdown() and,
      additionally, in uart_port_startup() (as was suggested by Dmitry Safonov).
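
      The pattern of detaching the buffer under the port lock and freeing it only
      after the lock is dropped can be sketched in user space. This is a model
      under assumptions, not the serial core code: struct port_sketch and its
      helpers are hypothetical, and a pthread mutex stands in for the port
      spinlock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a UART port. */
struct port_sketch {
    pthread_mutex_t lock;    /* models uart_port->lock */
    int lock_held;           /* bookkeeping for the demonstration only */
    char *xmit_buf;          /* models the transmit buffer page */
    int freed_under_lock;    /* set if a buffer was freed with the lock held */
};

void port_lock(struct port_sketch *p)
{
    pthread_mutex_lock(&p->lock);
    p->lock_held = 1;
}

void port_unlock(struct port_sketch *p)
{
    p->lock_held = 0;
    pthread_mutex_unlock(&p->lock);
}

/* Models free_page(): records whether the port lock was held at the time. */
void release_buf(struct port_sketch *p, char *buf)
{
    if (p->lock_held)
        p->freed_under_lock = 1;
    free(buf);
}

/* Problematic ordering: the buffer is freed while the lock is held. */
void port_shutdown_old(struct port_sketch *p)
{
    port_lock(p);
    if (p->xmit_buf) {
        release_buf(p, p->xmit_buf);
        p->xmit_buf = NULL;
    }
    port_unlock(p);
}

/* Reordered: detach the pointer under the lock, free after unlocking. */
void port_shutdown_new(struct port_sketch *p)
{
    char *buf;

    port_lock(p);
    buf = p->xmit_buf;
    p->xmit_buf = NULL;
    port_unlock(p);

    if (buf)
        release_buf(p, buf);
}
```

      Because the pointer is cleared while the lock is still held, other users of
      the port never see a dangling buffer, and whatever the free path does
      internally no longer nests inside the port lock.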
      
      [1] https://lore.kernel.org/lkml/20181211091154.GL23332@shao2-debian/T/#u

      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Dmitry Safonov <dima@arista.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      03543845
    • Johannes Thumshirn's avatar
      btrfs: improve error handling of btrfs_add_link · 6b337c59
      Johannes Thumshirn authored
      [ Upstream commit 1690dd41 ]
      
      In the error handling block, err holds the return value of either
      btrfs_del_root_ref() or btrfs_del_inode_ref() but it hasn't been checked
      since its introduction with commit fe66a05a (Btrfs: improve error
      handling for btrfs_insert_dir_item callers) in 2012.
      
      If the error handling in the error handling fails, there's not much left
      to do and the abort either happened earlier in the callees or is
      necessary here.
      
      So if one of btrfs_del_root_ref() or btrfs_del_inode_ref() failed, abort
      the transaction, but still return the original code of the failure
      stored in 'ret' as this will be reported to the user.
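
      The resulting control flow can be sketched like this. It is a simplified
      model with hypothetical names (trans_sketch, trans_abort,
      add_link_fail_path), not the btrfs code: the original error in ret is
      always what the caller sees, and a failed cleanup only triggers the abort.

```c
#include <errno.h>

/* Hypothetical transaction handle for the sketch. */
struct trans_sketch {
    int aborted;       /* set once the transaction has been aborted */
    int abort_errno;   /* error that triggered the abort */
};

/* Record the first abort only. */
void trans_abort(struct trans_sketch *t, int err)
{
    if (!t->aborted) {
        t->aborted = 1;
        t->abort_errno = err;
    }
}

/*
 * ret:      the original failure that sent us into the error path.
 * undo_err: result of undoing the reference (0 on success).
 * Returns the original error; aborts only if the cleanup failed too.
 */
int add_link_fail_path(struct trans_sketch *t, int ret, int undo_err)
{
    if (undo_err)
        trans_abort(t, undo_err);
    return ret;
}
```

      For example, if the original failure is -EEXIST and the cleanup fails with
      -EIO, the function still returns -EEXIST while the transaction is marked
      aborted with -EIO.
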
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6b337c59
    • Jonas Danielsson's avatar
      mmc: atmel-mci: do not assume idle after atmci_request_end · 7b83b9a2
      Jonas Danielsson authored
      [ Upstream commit ae460c11 ]
      
      On our AT91SAM9260 board we use the same sdio bus for wifi and for the
      sd card slot. This caused the atmel-mci to give the following splat on
      the serial console:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44
        Modules linked in:
        CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 #14
        Hardware name: Atmel AT91SAM9
        [<c000fccc>] (unwind_backtrace) from [<c000d3dc>] (show_stack+0x10/0x14)
        [<c000d3dc>] (show_stack) from [<c0017644>] (__warn+0xd8/0xf4)
        [<c0017644>] (__warn) from [<c0017704>] (warn_slowpath_null+0x1c/0x24)
        [<c0017704>] (warn_slowpath_null) from [<c033bb9c>] (atmci_send_command+0x24/0x44)
        [<c033bb9c>] (atmci_send_command) from [<c033e984>] (atmci_start_request+0x1f4/0x2dc)
        [<c033e984>] (atmci_start_request) from [<c033f3b4>] (atmci_request+0xf0/0x164)
        [<c033f3b4>] (atmci_request) from [<c0327108>] (mmc_start_request+0x280/0x2d0)
        [<c0327108>] (mmc_start_request) from [<c032800c>] (mmc_start_areq+0x230/0x330)
        [<c032800c>] (mmc_start_areq) from [<c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310)
        [<c03366f8>] (mmc_blk_issue_rw_rq) from [<c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac)
        [<c03372c4>] (mmc_blk_issue_rq) from [<c033781c>] (mmc_queue_thread+0xc4/0x118)
        [<c033781c>] (mmc_queue_thread) from [<c002daf8>] (kthread+0x100/0x118)
        [<c002daf8>] (kthread) from [<c000a580>] (ret_from_fork+0x14/0x34)
        ---[ end trace 594371ddfa284bd6 ]---
      
      This is:
        WARN_ON(host->cmd);
      
      This was fixed on our board by letting atmci_request_end determine what
      state we are in, instead of unconditionally setting it to STATE_IDLE on
      STATE_END_REQUEST.
      Signed-off-by: Jonas Danielsson <jonas@orbital-systems.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7b83b9a2
    • Masahiro Yamada's avatar
      kconfig: fix memory leak when EOF is encountered in quotation · 3f96ff44
      Masahiro Yamada authored
      [ Upstream commit fbac5977 ]
      
      An unterminated string literal followed by new line is passed to the
      parser (with "multi-line strings not supported" warning shown), then
      handled properly there.
      
      On the other hand, an unterminated string literal at end of file is
      never passed to the parser, which results in a memory leak.
      
      [Test Code]
      
        ----------(Kconfig begin)----------
        source "Kconfig.inc"
      
        config A
                bool "a"
        -----------(Kconfig end)-----------
      
        --------(Kconfig.inc begin)--------
        config B
                bool "b\No new line at end of file
        ---------(Kconfig.inc end)---------
      
      [Summary from Valgrind]
      
        Before the fix:
      
          LEAK SUMMARY:
             definitely lost: 16 bytes in 1 blocks
             ...
      
        After the fix:
      
          LEAK SUMMARY:
             definitely lost: 0 bytes in 0 blocks
             ...
      
      Eliminate the memory leak path by handling this case. Of course, such
      a Kconfig file is wrong already, so I will add an error message later.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3f96ff44
    • Masahiro Yamada's avatar
      kconfig: fix file name and line number of warn_ignored_character() · 407148e8
      Masahiro Yamada authored
      [ Upstream commit 77c1c0fa ]
      
      Currently, warn_ignored_character() displays invalid file name and
      line number.
      
      The lexer should use current_file->name and yylineno, while the parser
      should use zconf_curname() and zconf_lineno().
      
      This difference arises because the lexer always runs ahead
      of the parser. The parser needs to look ahead one token to make a
      shift/reduce decision, so the lexer is requested to scan more text
      from the input file.
      
      This commit fixes the warning message from warn_ignored_character().
      
      [Test Code]
      
        ----(Kconfig begin)----
        /
        -----(Kconfig end)-----
      
      [Output]
      
        Before the fix:
      
        <none>:0:warning: ignoring unsupported character '/'
      
        After the fix:
      
        Kconfig:1:warning: ignoring unsupported character '/'
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      407148e8
    • Will Deacon's avatar
      arm64: Fix minor issues with the dcache_by_line_op macro · ca5664c3
      Will Deacon authored
      [ Upstream commit 33309ecd ]
      
      The dcache_by_line_op macro suffers from a couple of small problems:
      
      First, the GAS directives that are currently being used rely on
      assembler behavior that is not documented, and probably not guaranteed
      to produce the correct behavior going forward. As a result, we end up
      with some undefined symbols in cache.o:
      
      $ nm arch/arm64/mm/cache.o
               ...
               U civac
               ...
               U cvac
               U cvap
               U cvau
      
      This is due to the fact that the comparisons used to select the
      operation type in the dcache_by_line_op macro are comparing symbols
      not strings, and even though it seems that GAS is doing the right
      thing here (undefined symbols by the same name are equal to each
      other), it seems unwise to rely on this.
      
      Second, when patching in a DC CVAP instruction on CPUs that support it,
      the fallback path consists of a DC CVAU instruction which may be
      affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
      
      Solve these issues by unrolling the various maintenance routines and
      using the conditional directives that are documented as operating on
      strings. To avoid the complexity of nested alternatives, we move the
      DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
      to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ca5664c3
    • Lucas Stach's avatar
      clk: imx6q: reset exclusive gates on init · e107641b
      Lucas Stach authored
      [ Upstream commit f7542d81 ]
      
      The exclusive gates may be set up in the wrong way by software running
      before the clock driver comes up. In that case the exclusive setup is
      locked in its initial state, as the complementary function can't be
      activated without disabling the initial setup first.
      
      To avoid this lock situation, reset the exclusive gates to the off
      state and allow the kernel to provide the proper setup.
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Reviewed-by: Dong Aisheng <Aisheng.dong@nxp.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e107641b
    • Dmitry V. Levin's avatar
      selftests: do not macro-expand failed assertion expressions · eef92871
      Dmitry V. Levin authored
      [ Upstream commit b708a3cc ]
      
      I've stumbled over the current macro-expand behaviour of the test
      harness:
      
      $ gcc -Wall -xc - <<'__EOF__'
      TEST(macro) {
      	int status = 0;
      	ASSERT_TRUE(WIFSIGNALED(status));
      }
      TEST_HARNESS_MAIN
      __EOF__
      $ ./a.out
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != (((signed char) (((status) & 0x7f) + 1) >> 1) > 0) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      With this change the output of the same test looks much more
      comprehensible:
      
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != WIFSIGNALED(status) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      The issue is very similar to the bug fixed in glibc assert(3)
      three years ago:
      https://sourceware.org/bugzilla/show_bug.cgi?id=18604
      
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      eef92871
    • David Disseldorp's avatar
      scsi: target: use consistent left-aligned ASCII INQUIRY data · ff00df41
      David Disseldorp authored
      [ Upstream commit 0de26357 ]
      
      spc5r17.pdf specifies:
      
        4.3.1 ASCII data field requirements
        ASCII data fields shall contain only ASCII printable characters (i.e.,
        code values 20h to 7Eh) and may be terminated with one or more ASCII null
        (00h) characters.  ASCII data fields described as being left-aligned
        shall have any unused bytes at the end of the field (i.e., highest
        offset) and the unused bytes shall be filled with ASCII space characters
        (20h).
      
      LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT
      IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT
      REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR
      IDENTIFICATION field in the INQUIRY Device Identification VPD Page are
      zero-terminated/zero-padded.
      
      Fix this inconsistency by using space-padding for all of the above fields.
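As a rough illustration of the left-aligned, space-padded layout quoted from SPC-5 above, the following sketch builds such a field. It is not the LIO code: `pad_ascii_field` is a hypothetical helper, and the sample strings and field widths (8 bytes for vendor, 4 for revision) are illustrative assumptions.

```python
def pad_ascii_field(value: str, length: int) -> bytes:
    """Return `value` left-aligned in a field of `length` bytes.

    Unused bytes at the end are filled with ASCII spaces (20h), and only
    printable ASCII (20h to 7Eh) is accepted, as the quoted text requires.
    """
    data = value.encode("ascii")
    if len(data) > length:
        raise ValueError("value does not fit in the field")
    if any(not 0x20 <= byte <= 0x7E for byte in data):
        raise ValueError("field may only contain printable ASCII")
    return data.ljust(length, b" ")


vendor = pad_ascii_field("LIO-ORG", 8)   # sample vendor string (assumption)
revision = pad_ascii_field("4.0", 4)     # sample revision string (assumption)
```

A zero-padded field would instead end in 00h bytes, which is the inconsistency the commit removes.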
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ff00df41
    • yupeng's avatar
      net: call sk_dst_reset when set SO_DONTROUTE · e1bca2f4
      yupeng authored
      [ Upstream commit 0fbe82e6 ]
      
      After SO_DONTROUTE is set to 1, the IP layer should not route packets if
      the destination IP address is not in link scope. But if the socket has
      cached the dst_entry, such packets are still routed until the
      sk_dst_cache expires. So we should clear the sk_dst_cache when a user
      sets the SO_DONTROUTE option. Below are server/client Python scripts
      which can reproduce this issue:
      
      server side code:
      
      ==========================================================================
      import socket
      import struct
      import time
      
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.bind(('0.0.0.0', 9000))
      s.listen(1)
      sock, addr = s.accept()
      sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1))
      while True:
          sock.send(b'foo')
          time.sleep(1)
      ==========================================================================
      
      client side code:
      ==========================================================================
      import socket
      import time
      
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.connect(('server_address', 9000))
      while True:
          data = s.recv(1024)
          print(data)
      ==========================================================================
      Signed-off-by: yupeng <yupeng0921@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e1bca2f4
    • Vivek Gautam's avatar
      media: venus: core: Set dma maximum segment size · ec957dab
      Vivek Gautam authored
      [ Upstream commit de2563bc ]
      
      Turning on CONFIG_DMA_API_DEBUG_SG results in the following error:
      
      [  460.308650] ------------[ cut here ]------------
      [  460.313490] qcom-venus aa00000.video-codec: DMA-API: mapping sg segment longer than device claims to support [len=4194304] [max=65536]
      [  460.326017] WARNING: CPU: 3 PID: 3555 at src/kernel/dma/debug.c:1301 debug_dma_map_sg+0x174/0x254
      [  460.338888] Modules linked in: venus_dec venus_enc videobuf2_dma_sg videobuf2_memops hci_uart btqca bluetooth venus_core v4l2_mem2mem videobuf2_v4l2 videobuf2_common ath10k_snoc ath10k_core ath lzo lzo_compress zramjoydev
      [  460.375811] CPU: 3 PID: 3555 Comm: V4L2DecoderThre Tainted: G        W         4.19.1 #82
      [  460.384223] Hardware name: Google Cheza (rev1) (DT)
      [  460.389251] pstate: 60400009 (nZCv daif +PAN -UAO)
      [  460.394191] pc : debug_dma_map_sg+0x174/0x254
      [  460.398680] lr : debug_dma_map_sg+0x174/0x254
      [  460.403162] sp : ffffff80200c37d0
      [  460.406583] x29: ffffff80200c3830 x28: 0000000000010000
      [  460.412056] x27: 00000000ffffffff x26: ffffffc0f785ea80
      [  460.417532] x25: 0000000000000000 x24: ffffffc0f4ea1290
      [  460.423001] x23: ffffffc09e700300 x22: ffffffc0f4ea1290
      [  460.428470] x21: ffffff8009037000 x20: 0000000000000001
      [  460.433936] x19: ffffff80091b0000 x18: 0000000000000000
      [  460.439411] x17: 0000000000000000 x16: 000000000000f251
      [  460.444885] x15: 0000000000000006 x14: 0720072007200720
      [  460.450354] x13: ffffff800af536e0 x12: 0000000000000000
      [  460.455822] x11: 0000000000000000 x10: 0000000000000000
      [  460.461288] x9 : 537944d9c6c48d00 x8 : 537944d9c6c48d00
      [  460.466758] x7 : 0000000000000000 x6 : ffffffc0f8d98f80
      [  460.472230] x5 : 0000000000000000 x4 : 0000000000000000
      [  460.477703] x3 : 000000000000008a x2 : ffffffc0fdb13948
      [  460.483170] x1 : ffffffc0fdb0b0b0 x0 : 000000000000007a
      [  460.488640] Call trace:
      [  460.491165]  debug_dma_map_sg+0x174/0x254
      [  460.495307]  vb2_dma_sg_alloc+0x260/0x2dc [videobuf2_dma_sg]
      [  460.501150]  __vb2_queue_alloc+0x164/0x374 [videobuf2_common]
      [  460.507076]  vb2_core_reqbufs+0xfc/0x23c [videobuf2_common]
      [  460.512815]  vb2_reqbufs+0x44/0x5c [videobuf2_v4l2]
      [  460.517853]  v4l2_m2m_reqbufs+0x44/0x78 [v4l2_mem2mem]
      [  460.523144]  v4l2_m2m_ioctl_reqbufs+0x1c/0x28 [v4l2_mem2mem]
      [  460.528976]  v4l_reqbufs+0x30/0x40
      [  460.532480]  __video_do_ioctl+0x36c/0x454
      [  460.536610]  video_usercopy+0x25c/0x51c
      [  460.540572]  video_ioctl2+0x38/0x48
      [  460.544176]  v4l2_ioctl+0x60/0x74
      [  460.547602]  do_video_ioctl+0x948/0x3520
      [  460.551648]  v4l2_compat_ioctl32+0x60/0x98
      [  460.555872]  __arm64_compat_sys_ioctl+0x134/0x20c
      [  460.560718]  el0_svc_common+0x9c/0xe4
      [  460.564498]  el0_svc_compat_handler+0x2c/0x38
      [  460.568982]  el0_svc_compat+0x8/0x18
      [  460.572672] ---[ end trace ce209b87b2f3af88 ]---
      
      From the above warning one would deduce that the sg segment will overflow
      the device's capacity. In reality, the hardware can accommodate larger
      sg segments.
      So, initialize the max segment size properly to weed out this warning.
      
      Based on a similar patch sent by Sean Paul for mdss:
      https://patchwork.kernel.org/patch/10671457/

      Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
      Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
      Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ec957dab
    • Nathan Chancellor's avatar
      media: firewire: Fix app_info parameter type in avc_ca{,_app}_info · b851aa7f
      Nathan Chancellor authored
      [ Upstream commit b2e9a4ed ]
      
      Clang warns:
      
      drivers/media/firewire/firedtv-avc.c:999:45: warning: implicit
      conversion from 'int' to 'char' changes value from 159 to -97
      [-Wconstant-conversion]
              app_info[0] = (EN50221_TAG_APP_INFO >> 16) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1000:45: warning: implicit
      conversion from 'int' to 'char' changes value from 128 to -128
      [-Wconstant-conversion]
              app_info[1] = (EN50221_TAG_APP_INFO >>  8) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1040:44: warning: implicit
      conversion from 'int' to 'char' changes value from 159 to -97
      [-Wconstant-conversion]
              app_info[0] = (EN50221_TAG_CA_INFO >> 16) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      drivers/media/firewire/firedtv-avc.c:1041:44: warning: implicit
      conversion from 'int' to 'char' changes value from 128 to -128
      [-Wconstant-conversion]
              app_info[1] = (EN50221_TAG_CA_INFO >>  8) & 0xff;
                          ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
      4 warnings generated.
      
      Change app_info's type to unsigned char to match the type of the
      member msg in struct ca_msg, which is the only thing passed into the
      app_info parameter in this function.
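The value changes reported by Clang can be reproduced with a small sketch. It reinterprets the low 8 bits of each constant as a signed 8-bit integer, which is what storing 159 or 128 into a signed `char` amounts to; `as_signed_char` is a hypothetical helper, not part of the driver.

```python
import struct


def as_signed_char(value: int) -> int:
    """Reinterpret the low 8 bits of `value` as a signed 8-bit integer."""
    return struct.unpack("b", struct.pack("B", value & 0xFF))[0]


def as_unsigned_char(value: int) -> int:
    """Keep the low 8 bits of `value` as an unsigned 8-bit integer."""
    return value & 0xFF


# The two byte values named in the warnings above.
signed_results = [as_signed_char(159), as_signed_char(128)]
unsigned_results = [as_unsigned_char(159), as_unsigned_char(128)]
```

With an unsigned element type the stored bytes keep their intended values, which is why changing the parameter type silences the warning.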
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/105
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b851aa7f
    • Breno Leitao's avatar
      powerpc/pseries/cpuidle: Fix preempt warning · a692fe3b
      Breno Leitao authored
      [ Upstream commit 2b038cbc ]
      
      When booting a pseries kernel with PREEMPT enabled, it dumps the
      following warning:
      
         BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
         caller is pseries_processor_idle_init+0x5c/0x22c
         CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828
         Call Trace:
         [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable)
         [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160
         [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c
         [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300
         [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500
         [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160
         [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c
      
      This happens because the code calls get_lppaca() which calls
      get_paca() and it checks if preemption is disabled through
      check_preemption_disabled().
      
      Preemption should be disabled because the per CPU variable may make no
      sense if there is a preemption (and a CPU switch) after it reads the
      per CPU data and when it is used.
      
      In this device driver specifically, it is not a problem, because this
      code just needs to have access to one lppaca struct, and it does not
      matter if it is the current per CPU lppaca struct or not (i.e. when
      there is a preemption and a CPU migration).
      
      That said, the most appropriate fix seems to be related to avoiding
      the debug_smp_processor_id() call at get_paca(), instead of calling
      preempt_disable() before get_paca().
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a692fe3b
    • Breno Leitao's avatar
      powerpc/xmon: Fix invocation inside lock region · e37cb09e
      Breno Leitao authored
      [ Upstream commit 8d4a8622 ]
      
      Currently xmon needs to get devtree_lock (through rtas_token()) during its
      invocation (at crash time). If there is a crash while devtree_lock is being
      held, then xmon tries to get the lock but spins forever and never gets into
      the interactive debugger, as in the following case:
      
      	int *ptr = NULL;
      	raw_spin_lock_irqsave(&devtree_lock, flags);
      	*ptr = 0xdeadbeef;
      
      This patch avoids calling rtas_token(), and thus trying to get the same
      lock, at crash time. The new mechanism gets the token at initialization
      time (xmon_init()) and just consumes it at crash time.
      
      This allows xmon to be invoked regardless of whether devtree_lock is
      held.
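The idea of looking a value up once at initialization, so that the crash path never touches the lock, can be sketched in user space. This is only a model under stated assumptions: `devtree_lock` here is an ordinary `threading.Lock`, `lookup_token` stands in for rtas_token(), and the token table and its contents are hypothetical.

```python
import threading

devtree_lock = threading.Lock()       # stand-in for the kernel's devtree_lock
TOKENS = {"set-indicator": 9}         # hypothetical token table


def lookup_token(name, blocking=True):
    """Stand-in for rtas_token(): the lookup needs devtree_lock."""
    if not devtree_lock.acquire(blocking=blocking):
        return None                   # the real code would spin forever here
    try:
        return TOKENS.get(name)
    finally:
        devtree_lock.release()


# New scheme: resolve the token once, at init time, while nothing holds the lock.
cached_token = lookup_token("set-indicator")


def crash_handler_old():
    """Old scheme: look the token up at crash time (non-blocking in this model)."""
    return lookup_token("set-indicator", blocking=False)


def crash_handler_new():
    """New scheme: just consume the value cached at init time."""
    return cached_token


# Simulate a crash while devtree_lock is held.
with devtree_lock:
    old_result = crash_handler_old()
    new_result = crash_handler_new()
```

The non-blocking acquire is only there so that the model terminates; with a blocking acquire the old handler would hang, which corresponds to the spin described above.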
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e37cb09e
    • Joel Fernandes (Google)'s avatar
      pstore/ram: Do not treat empty buffers as valid · e1e3a467
      Joel Fernandes (Google) authored
      [ Upstream commit 30696378 ]
      
      The ramoops backend currently calls persistent_ram_save_old() even
      if a buffer is empty. While this appears to work, it does not seem
      like the right thing to do and could lead to future bugs, so let's
      avoid that. It also prevents misleading prints in the logs which claim
      the buffer is valid.
      
      I got something like:
      
      	found existing buffer, size 0, start 0
      
      When I was expecting:
      
      	no valid data in buffer (sig = ...)
      
      This bails out early (and reports with pr_debug()), since it's an
      acceptable state.
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e1e3a467
    • A.s. Dong's avatar
      clk: imx: make mux parent strings const · bf34ede3
      A.s. Dong authored
      [ Upstream commit 9e5ef7a5 ]
      
      As in commit 2893c379 ("clk: make strings in parent name arrays
      const"), make the parent strings const; otherwise we may hit
      the following warning when compiling:
      
      drivers/clk/imx/clk-imx7ulp.c: In function 'imx7ulp_clocks_init':
      drivers/clk/imx/clk-imx7ulp.c:73:35: warning: passing argument 5 of
      	'imx_clk_mux_flags' discards 'const' qualifier from pointer target type
      
        clks[IMX7ULP_CLK_APLL_PRE_SEL] = imx_clk_mux_flags("apll_pre_sel", base + 0x508, 0,
      	1, pll_pre_sels, ARRAY_SIZE(pll_pre_sels), CLK_SET_PARENT_GATE);
                                         ^
      In file included from drivers/clk/imx/clk-imx7ulp.c:23:0:
      drivers/clk/imx/clk.h:200:27: note: expected 'const char **' but argument is
       of type 'const char * const*'
      ...
      
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bf34ede3
    • Daniel Santos's avatar
      jffs2: Fix use of uninitialized delayed_work, lockdep breakage · f63e5b19
      Daniel Santos authored
      [ Upstream commit a788c527 ]
      
      jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER
      is defined then a write buffer is available and has been initialized.
      However, this is not the case when the mtd device has no
      out-of-band buffer:
      
      int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
      {
              if (!c->mtd->oobsize)
                      return 0;
      ...
      
      The resulting call to cancel_delayed_work_sync passing an uninitialized
      (but zeroed) delayed_work struct forces lockdep to become disabled.
      
      [   90.050639] overlayfs: upper fs does not support tmpfile.
      [   90.652264] INFO: trying to register non-static key.
      [   90.662171] the code is fine but needs lockdep annotation.
      [   90.673090] turning off the locking correctness validator.
      [   90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0
      [   90.696672] Stack : 00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7
      [   90.713349]         80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff
      [   90.730020]         00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420
      [   90.746690]         6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90
      [   90.763362]         80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000
      [   90.780033]         ...
      [   90.784902] Call Trace:
      [   90.789793] [<8000f0d8>] show_stack+0xb8/0x148
      [   90.798659] [<8005a000>] register_lock_class+0x270/0x55c
      [   90.809247] [<8005cb64>] __lock_acquire+0x13c/0xf7c
      [   90.818964] [<8005e314>] lock_acquire+0x194/0x1dc
      [   90.828345] [<8003f27c>] flush_work+0x200/0x24c
      [   90.837374] [<80041dfc>] __cancel_work_timer+0x158/0x210
      [   90.847958] [<801a8770>] jffs2_sync_fs+0x20/0x54
      [   90.857173] [<80125cf4>] iterate_supers+0xf4/0x120
      [   90.866729] [<80158fc4>] sys_sync+0x44/0x9c
      [   90.875067] [<80014424>] syscall_common+0x34/0x58
      Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f63e5b19
    • Chuck Lever's avatar
      rxe: IB_WR_REG_MR does not capture MR's iova field · ce91ad1a
      Chuck Lever authored
      [ Upstream commit b024dd0e ]
      
      FRWR memory registration is done with a series of calls and WRs.
      1. ULP invokes ib_dma_map_sg()
      2. ULP invokes ib_map_mr_sg()
      3. ULP posts an IB_WR_REG_MR on the Send queue
      
      Step 2 generates an iova. It is permissible for ULPs to change this
      iova (with certain restrictions) between steps 2 and 3.
      
      rxe_map_mr_sg captures the MR's iova but later when rxe processes the
      REG_MR WR, it ignores the MR's iova field. If a ULP alters the MR's iova
      after step 2 but before step 3, rxe never captures that change.
      
      When the remote sends an RDMA Read targeting that MR, rxe looks up the
      R_key, but the altered iova does not match the iova stored in the MR,
      causing the RDMA Read request to fail.
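The sequence above can be modeled with a heavily simplified sketch. It is not the rxe driver: `MemoryRegion`, `ib_iova`, `rxe_iova`, and the helper names are hypothetical, and the real lookup checks an address range rather than a single value.

```python
from dataclasses import dataclass


@dataclass
class MemoryRegion:
    ib_iova: int = 0    # value visible to the ULP, which may change it
    rxe_iova: int = 0   # private copy the provider uses for lookups


def map_mr_sg(mr: MemoryRegion, iova: int) -> None:
    """Step 2: mapping generates an iova and the provider records it."""
    mr.ib_iova = iova
    mr.rxe_iova = iova


def post_reg_mr(mr: MemoryRegion, capture_iova: bool) -> None:
    """Step 3: process the REG_MR WR, optionally re-reading the MR's iova."""
    if capture_iova:
        mr.rxe_iova = mr.ib_iova


def rdma_read_matches(mr: MemoryRegion, target_iova: int) -> bool:
    """Remote access succeeds only if the target matches the stored iova."""
    return target_iova == mr.rxe_iova


def run(capture_iova: bool) -> bool:
    mr = MemoryRegion()
    map_mr_sg(mr, 0x1000)
    mr.ib_iova = 0x1800               # ULP alters the iova between steps 2 and 3
    post_reg_mr(mr, capture_iova)
    return rdma_read_matches(mr, 0x1800)


before_fix = run(capture_iova=False)
after_fix = run(capture_iova=True)
```

In this model the read fails when the WR processing ignores the updated iova and succeeds once the iova is captured at REG_MR time.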
      Reported-by: Anna Schumaker <schumaker.anna@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ce91ad1a
    • Ondrej Mosnacek's avatar
      selinux: always allow mounting submounts · fbbfb5c6
      Ondrej Mosnacek authored
      [ Upstream commit 2cbdcb88 ]
      
      If a superblock has the MS_SUBMOUNT flag set, we should always allow
      mounting it. These mounts are done automatically by the kernel either as
      part of mounting some parent mount (e.g. debugfs always mounts tracefs
      under "tracing" for compatibility) or they are mounted automatically as
      needed on subdirectory accesses (e.g. NFS crossmnt mounts). Since such
      automounts are either an implicit consequence of the parent mount (which
      is already checked) or they can happen during regular accesses (where it
      doesn't make sense to check against the current task's context), the
      mount permission check should be skipped for them.
      
      Without this patch, attempts to access contents of an automounted
      directory can cause unexpected SELinux denials.
      
      In the current kernel tree, the MS_SUBMOUNT flag is set only via
      vfs_submount(), which is called only from the following places:
       - AFS, when automounting special "symlinks" referencing other cells
       - CIFS, when automounting "referrals"
       - NFS, when automounting subtrees
       - debugfs, when automounting tracefs
      
      In all cases the submounts are meant to be transparent to the user and
      it makes sense that if mounting the master is allowed, then so should be
      the automounts. Note that CAP_SYS_ADMIN capability checking is already
      skipped for (SB_KERNMOUNT|SB_SUBMOUNT) in:
       - sget_userns() in fs/super.c:
      	if (!(flags & (SB_KERNMOUNT|SB_SUBMOUNT)) &&
      	    !(type->fs_flags & FS_USERNS_MOUNT) &&
      	    !capable(CAP_SYS_ADMIN))
      		return ERR_PTR(-EPERM);
       - sget() in fs/super.c:
              /* Ensure the requestor has permissions over the target filesystem */
              if (!(flags & (SB_KERNMOUNT|SB_SUBMOUNT)) && !ns_capable(user_ns, CAP_SYS_ADMIN))
                      return ERR_PTR(-EPERM);
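The two quoted capability checks, and the analogous skip this commit describes for the mount permission check, follow the same flag test. A minimal sketch of that logic follows; the flag values are given as I recall them from include/linux/fs.h and should be treated as assumptions, and `mount_permission_check` is a hypothetical stand-in rather than the actual SELinux hook.

```python
SB_KERNMOUNT = 1 << 22   # assumed value
SB_SUBMOUNT = 1 << 26    # assumed value


def capability_check_passes(flags: int, has_cap_sys_admin: bool) -> bool:
    """Mirror of the quoted sget() test: False where -EPERM would be returned."""
    if not (flags & (SB_KERNMOUNT | SB_SUBMOUNT)) and not has_cap_sys_admin:
        return False
    return True


def mount_permission_check(flags: int, policy_allows: bool) -> bool:
    """Hypothetical policy hook: kernel-internal mounts and submounts skip the check."""
    if flags & (SB_KERNMOUNT | SB_SUBMOUNT):
        return True
    return policy_allows


submount_allowed = mount_permission_check(SB_SUBMOUNT, policy_allows=False)
regular_allowed = mount_permission_check(0, policy_allows=False)
```

A regular mount is still subject to the policy decision; only mounts carrying one of the two flags bypass it.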
      
      Verified internally on patched RHEL 7.6 with a reproducer using
      NFS+httpd and selinux-testsuite.
      
      Fixes: 93faccbb ("fs: Better permission checking for submounts")
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fbbfb5c6
    • Yoshihiro Shimoda's avatar
      usb: gadget: udc: renesas_usb3: add a safety connection way for forced_b_device · 8f472434
      Yoshihiro Shimoda authored
      [ Upstream commit ceb94bc5 ]
      
      This patch adds a safe way to connect for "forced_b_device" with
      "workaround_for_vbus", as shown below:
      
      < Example for R-Car E3 Ebisu >
       # modprobe <any usb gadget driver>
       # echo 1 > /sys/kernel/debug/ee020000.usb/b_device
       (connect a usb cable to host side.)
       # echo 2 > /sys/kernel/debug/ee020000.usb/b_device
      
      The previous code required a USB cable to be connected before "b_device"
      was set to 1 on the Ebisu board. However, if the xHCI driver on the
      board is probed, this causes some trouble:
       - USB VBUS/signals conflict between the board and another host.
       - "Cannot enable. Maybe the USB cable is bad?" might happen on
         both the board and another host with a usb hub.
       - A usb gadget cannot be enumerated correctly because a VBUS change
         interrupt happens unexpectedly.
      Reported-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8f472434
    • Anders Roxell's avatar
      arm64: perf: set suppress_bind_attrs flag to true · 85a3e682
      Anders Roxell authored
      [ Upstream commit 81e9fa8b ]
      
      The armv8_pmuv3 driver doesn't have a remove function, and when the test
      'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled, the following Call trace
      can be seen.
      
      [    1.424287] Failed to register pmu: armv8_pmuv3, reason -17
      [    1.424870] WARNING: CPU: 0 PID: 1 at ../kernel/events/core.c:11771 perf_event_sysfs_init+0x98/0xdc
      [    1.425220] Modules linked in:
      [    1.425531] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         4.19.0-rc7-next-20181012-00003-ge7a97b1ad77b-dirty #35
      [    1.425951] Hardware name: linux,dummy-virt (DT)
      [    1.426212] pstate: 80000005 (Nzcv daif -PAN -UAO)
      [    1.426458] pc : perf_event_sysfs_init+0x98/0xdc
      [    1.426720] lr : perf_event_sysfs_init+0x98/0xdc
      [    1.426908] sp : ffff00000804bd50
      [    1.427077] x29: ffff00000804bd50 x28: ffff00000934e078
      [    1.427429] x27: ffff000009546000 x26: 0000000000000007
      [    1.427757] x25: ffff000009280710 x24: 00000000ffffffef
      [    1.428086] x23: ffff000009408000 x22: 0000000000000000
      [    1.428415] x21: ffff000009136008 x20: ffff000009408730
      [    1.428744] x19: ffff80007b20b400 x18: 000000000000000a
      [    1.429075] x17: 0000000000000000 x16: 0000000000000000
      [    1.429418] x15: 0000000000000400 x14: 2e79726f74636572
      [    1.429748] x13: 696420656d617320 x12: 656874206e692065
      [    1.430060] x11: 6d616e20656d6173 x10: 2065687420687469
      [    1.430335] x9 : ffff00000804bd50 x8 : 206e6f7361657220
      [    1.430610] x7 : 2c3376756d705f38 x6 : ffff00000954d7ce
      [    1.430880] x5 : 0000000000000000 x4 : 0000000000000000
      [    1.431226] x3 : 0000000000000000 x2 : ffffffffffffffff
      [    1.431554] x1 : 4d151327adc50b00 x0 : 0000000000000000
      [    1.431868] Call trace:
      [    1.432102]  perf_event_sysfs_init+0x98/0xdc
      [    1.432382]  do_one_initcall+0x6c/0x1a8
      [    1.432637]  kernel_init_freeable+0x1bc/0x280
      [    1.432905]  kernel_init+0x18/0x160
      [    1.433115]  ret_from_fork+0x10/0x18
      [    1.433297] ---[ end trace 27fd415390eb9883 ]---
      
      Rework to set suppress_bind_attrs flag to avoid removing the device when
      CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, since there's no real reason to
      remove the armv8_pmuv3 driver.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Co-developed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      85a3e682
    • Maciej W. Rozycki's avatar
      MIPS: SiByte: Enable swiotlb for SWARM, LittleSur and BigSur · c85acbf7
      Maciej W. Rozycki authored
      [ Upstream commit e4849aff ]
      
      The Broadcom SiByte BCM1250, BCM1125, and BCM1125H SOCs have an onchip
      DRAM controller that supports memory amounts of up to 16GiB, and due to
      how the address decoder has been wired in the SOC any memory beyond 1GiB
      is actually mapped starting from 4GiB physical up, that is beyond the
      32-bit addressable limit[1].  Consequently if the maximum amount of
      memory has been installed, then it will span up to 19GiB.
      
      Many of the evaluation boards we support that are based on one of these
      SOCs have their memory soldered and the amount present fits in the
      32-bit address range.  The BCM91250A SWARM board however has actual DIMM
      slots and accepts, depending on the peripherals revision of the SOC, up
      to 4GiB or 8GiB of memory in commercially available JEDEC modules[2].
      I believe this is also the case with the BCM91250C2 LittleSur board.
      This means that up to either 3GiB or 7GiB of memory requires 64-bit
      addressing to access.
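The figures quoted above follow from the described layout, in which the first 1GiB sits below the 4GiB boundary and everything beyond it is mapped from 4GiB physical upward. A small sketch of that arithmetic, with hypothetical helper names and assuming more than 1GiB is installed:

```python
GIB = 1 << 30


def bytes_above_4gib(installed: int) -> int:
    """Amount of RAM that ends up above the 32-bit addressable limit."""
    if installed <= GIB:
        raise ValueError("model assumes more than 1 GiB installed")
    return installed - GIB


def top_of_memory(installed: int) -> int:
    """Highest physical address (exclusive) reached by installed RAM."""
    return 4 * GIB + bytes_above_4gib(installed)


max_span = top_of_memory(16 * GIB)        # controller maximum of 16 GiB
swarm_4g = bytes_above_4gib(4 * GIB)      # SWARM with 4 GiB of DIMMs
swarm_8g = bytes_above_4gib(8 * GIB)      # SWARM with 8 GiB of DIMMs
```

This reproduces the 19GiB span for a fully populated controller and the 3GiB or 7GiB that need 64-bit addressing on the SWARM board.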
      
      I believe the BCM91480B BigSur board, which has the BCM1480 SOC instead,
      accepts at least as much memory, although I have no documentation or
      actual hardware available to verify that.
      
      Both systems have PCI slots installed for use by any PCI option boards,
      including ones that only support 32-bit addressing (additionally the
      32-bit PCI host bridge of the BCM1250, BCM1125, and BCM1125H SOCs limits
      addressing to 32-bits), and there is no IOMMU available.  Therefore for
      PCI DMA to work in the presence of memory beyond 4GiB, enable swiotlb
      for the affected systems.
      
      All the other SOC onchip DMA devices use 40-bit addressing and therefore
      can address the whole memory, so only enable swiotlb if PCI support and
      support for DMA beyond 4GiB have been both enabled in the configuration
      of the kernel.
      
      This shows up as follows:
      
      Broadcom SiByte BCM1250 B2 @ 800 MHz (SB1 rev 2)
      Board type: SiByte BCM91250A (SWARM)
      Determined physical RAM map:
       memory: 000000000fe7fe00 @ 0000000000000000 (usable)
       memory: 000000001ffffe00 @ 0000000080000000 (usable)
       memory: 000000000ffffe00 @ 00000000c0000000 (usable)
       memory: 0000000087fffe00 @ 0000000100000000 (usable)
      software IO TLB: mapped [mem 0xcbffc000-0xcfffc000] (64MB)
      
      in the bootstrap log and removes failures like these:
      
      defxx 0000:02:00.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask ffffffff bus mask 0
      fddi0: Receive buffer allocation failed
      fddi0: Adapter open failed!
      IP-Config: Failed to open fddi0
      defxx 0000:09:08.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask ffffffff bus mask 0
      fddi1: Receive buffer allocation failed
      fddi1: Adapter open failed!
      IP-Config: Failed to open fddi1
      
      when memory beyond 4GiB is handed out to devices that can only do 32-bit
      addressing.
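
      A minimal sketch of the range check behind the "overflow" messages
      above, assuming a buffer is usable only if its last byte falls
      within the device's DMA mask.  `fits_dma_mask()` and
      `needs_bounce()` are hypothetical helpers written for this example,
      not the kernel's own functions; the address, length and mask are
      taken from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the buffer [addr, addr + size) can be
 * reached by a device whose addressing is limited to 'mask'.
 * Assumes addr + size does not wrap around 64 bits.
 */
static bool fits_dma_mask(uint64_t addr, size_t size, uint64_t mask)
{
	assert(size > 0);
	/* Compare the address of the last byte against the mask. */
	return addr + (size - 1) <= mask;
}

/*
 * Hypothetical helper: a mapping that does not fit has to go through a
 * bounce buffer placed in memory the device can address.
 */
static bool needs_bounce(uint64_t addr, size_t size, uint64_t mask)
{
	return !fits_dma_mask(addr, size, mask);
}
```

      With the values from the log, needs_bounce(0x0000000185bc6080, 4608,
      0xffffffff) is true, whereas the same buffer fits a 40-bit mask of
      0xffffffffff, which matches the observation that the on-chip DMA
      devices are unaffected.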
      
      This updates commit cce335ae ("[MIPS] 64-bit Sibyte kernels need
      DMA32.").
      
      References:
      
      [1] "BCM1250/BCM1125/BCM1125H User Manual", Revision 1250_1125-UM100-R,
          Broadcom Corporation, 21 Oct 2002, Section 3: "System Overview",
          "Memory Map", pp. 34-38
      
      [2] "BCM91250A User Manual", Revision 91250A-UM100-R, Broadcom
          Corporation, 18 May 2004, Section 3: "Physical Description",
          "Supported DRAM", p. 23
      
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      [paul.burton@mips.com: Remove GPL text from dma.c; SPDX tag covers it]
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Patchwork: https://patchwork.linux-mips.org/patch/21108/
      References: cce335ae ("[MIPS] 64-bit Sibyte kernels need DMA32.")
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/mce: Fix -Wmissing-prototypes warnings · ff3c3ca3
      Borislav Petkov authored
      [ Upstream commit 68b5e432 ]
      
      Add the proper includes and make smca_get_name() static.
      
      Also fix an actual bug which the warning exposed:
      
        arch/x86/kernel/cpu/mcheck/therm_throt.c:395:39: error: conflicting \
        types for ‘smp_thermal_interrupt’
         asmlinkage __visible void __irq_entry smp_thermal_interrupt(struct pt_regs *r)
                                               ^~~~~~~~~~~~~~~~~~~~~
        In file included from arch/x86/kernel/cpu/mcheck/therm_throt.c:29:
        ./arch/x86/include/asm/traps.h:107:17: note: previous declaration of \
        ‘smp_thermal_interrupt’ was here
         asmlinkage void smp_thermal_interrupt(void);
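
      The pattern can be sketched in isolation as follows.  This is an
      illustration only, with hypothetical stand-in names
      (`struct fake_regs`, `fake_thermal_interrupt()`,
      `fake_bank_name()`) rather than the real x86 code: a function used
      by other files gets a prototype that its definition can see, and a
      helper used in one file only is made static.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a shared header; all names here are hypothetical. */
struct fake_regs {
	unsigned long ip;
};
void fake_thermal_interrupt(struct fake_regs *regs);

/* Used in this file only, so it is static and needs no prototype. */
static const char *fake_bank_name(unsigned int bank)
{
	static const char *const names[] = { "load_store", "insn_fetch" };

	return bank < sizeof(names) / sizeof(names[0]) ? names[bank] : "unknown";
}

static unsigned long last_ip;

/*
 * The definition sees the prototype above, so a parameter list that
 * disagrees with it is a compile error instead of going unnoticed.
 */
void fake_thermal_interrupt(struct fake_regs *regs)
{
	assert(regs != NULL);
	last_ip = regs->ip;
}
```

      Had the prototype been declared with a (void) parameter list, as in
      the error quoted above, the compiler would reject the definition
      with "conflicting types" once both are visible in the same file.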
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Yi Wang <wang.yi59@zte.com.cn>
      Cc: Michael Matz <matz@suse.de>
      Cc: x86@kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811081633160.1549@nanos.tec.linutronix.de
      Signed-off-by: Sasha Levin <sashal@kernel.org>