1. 30 Aug, 2017 7 commits
    • Nikolay Aleksandrov's avatar
      sch_fq_codel: avoid double free on init failure · 30c31d74
      Nikolay Aleksandrov authored
      It is very unlikely to happen but the backlogs memory allocation
      could fail and will free q->flows, but then ->destroy() will free
      q->flows too. For correctness remove the first free and let ->destroy
      clean up.
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      30c31d74
    • Nikolay Aleksandrov's avatar
      sch_cbq: fix null pointer dereferences on init failure · 3501d059
      Nikolay Aleksandrov authored
      CBQ can fail on ->init by wrong nl attributes or simply for missing any,
      f.e. if it's set as a default qdisc then TCA_OPTIONS (opt) will be NULL
      when it is activated. The first thing init does is parse opt but it will
      dereference a null pointer if used as a default qdisc, also since init
      failure at default qdisc invokes ->reset() which cancels all timers then
      we'll also dereference two more null pointers (timer->base) as they were
      never initialized.
      
      To reproduce:
      $ sysctl net.core.default_qdisc=cbq
      $ ip l set ethX up
      
      Crash log of the first null ptr deref:
      [44727.907454] BUG: unable to handle kernel NULL pointer dereference at (null)
      [44727.907600] IP: cbq_init+0x27/0x205
      [44727.907676] PGD 59ff4067
      [44727.907677] P4D 59ff4067
      [44727.907742] PUD 59c70067
      [44727.907807] PMD 0
      [44727.907873]
      [44727.907982] Oops: 0000 [#1] SMP
      [44727.908054] Modules linked in:
      [44727.908126] CPU: 1 PID: 21312 Comm: ip Not tainted 4.13.0-rc6+ #60
      [44727.908235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [44727.908477] task: ffff88005ad42700 task.stack: ffff880037214000
      [44727.908672] RIP: 0010:cbq_init+0x27/0x205
      [44727.908838] RSP: 0018:ffff8800372175f0 EFLAGS: 00010286
      [44727.909018] RAX: ffffffff816c3852 RBX: ffff880058c53800 RCX: 0000000000000000
      [44727.909222] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8800372175f8
      [44727.909427] RBP: ffff880037217650 R08: ffffffff81b0f380 R09: 0000000000000000
      [44727.909631] R10: ffff880037217660 R11: 0000000000000020 R12: ffffffff822a44c0
      [44727.909835] R13: ffff880058b92000 R14: 00000000ffffffff R15: 0000000000000001
      [44727.910040] FS:  00007ff8bc583740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
      [44727.910339] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [44727.910525] CR2: 0000000000000000 CR3: 00000000371e5000 CR4: 00000000000406e0
      [44727.910731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [44727.910936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [44727.911141] Call Trace:
      [44727.911291]  ? lockdep_init_map+0xb6/0x1ba
      [44727.911461]  ? qdisc_alloc+0x14e/0x187
      [44727.911626]  qdisc_create_dflt+0x7a/0x94
      [44727.911794]  ? dev_activate+0x129/0x129
      [44727.911959]  attach_one_default_qdisc+0x36/0x63
      [44727.912132]  netdev_for_each_tx_queue+0x3d/0x48
      [44727.912305]  dev_activate+0x4b/0x129
      [44727.912468]  __dev_open+0xe7/0x104
      [44727.912631]  __dev_change_flags+0xc6/0x15c
      [44727.912799]  dev_change_flags+0x25/0x59
      [44727.912966]  do_setlink+0x30c/0xb3f
      [44727.913129]  ? check_chain_key+0xb0/0xfd
      [44727.913294]  ? check_chain_key+0xb0/0xfd
      [44727.913463]  rtnl_newlink+0x3a4/0x729
      [44727.913626]  ? rtnl_newlink+0x117/0x729
      [44727.913801]  ? ns_capable_common+0xd/0xb1
      [44727.913968]  ? ns_capable+0x13/0x15
      [44727.914131]  rtnetlink_rcv_msg+0x188/0x197
      [44727.914300]  ? rcu_read_unlock+0x3e/0x5f
      [44727.914465]  ? rtnl_newlink+0x729/0x729
      [44727.914630]  netlink_rcv_skb+0x6c/0xce
      [44727.914796]  rtnetlink_rcv+0x23/0x2a
      [44727.914956]  netlink_unicast+0x103/0x181
      [44727.915122]  netlink_sendmsg+0x326/0x337
      [44727.915291]  sock_sendmsg_nosec+0x14/0x3f
      [44727.915459]  sock_sendmsg+0x29/0x2e
      [44727.915619]  ___sys_sendmsg+0x209/0x28b
      [44727.915784]  ? do_raw_spin_unlock+0xcd/0xf8
      [44727.915954]  ? _raw_spin_unlock+0x27/0x31
      [44727.916121]  ? __handle_mm_fault+0x651/0xdb1
      [44727.916290]  ? check_chain_key+0xb0/0xfd
      [44727.916461]  __sys_sendmsg+0x45/0x63
      [44727.916626]  ? __sys_sendmsg+0x45/0x63
      [44727.916792]  SyS_sendmsg+0x19/0x1b
      [44727.916950]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [44727.917125] RIP: 0033:0x7ff8bbc96690
      [44727.917286] RSP: 002b:00007ffc360991e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [44727.917579] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007ff8bbc96690
      [44727.917783] RDX: 0000000000000000 RSI: 00007ffc36099230 RDI: 0000000000000003
      [44727.917987] RBP: ffff880037217f98 R08: 0000000000000001 R09: 0000000000000003
      [44727.918190] R10: 00007ffc36098fb0 R11: 0000000000000246 R12: 0000000000000006
      [44727.918393] R13: 000000000066f1a0 R14: 00007ffc360a12e0 R15: 0000000000000000
      [44727.918597]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [44727.918774] Code: 41 5f 5d c3 66 66 66 66 90 55 48 8d 56 04 45 31 c9
      49 c7 c0 80 f3 b0 81 48 89 e5 41 55 41 54 53 48 89 fb 48 8d 7d a8 48 83
      ec 48 <0f> b7 0e be 07 00 00 00 83 e9 04 e8 e6 f7 d8 ff 85 c0 0f 88 bb
      [44727.919332] RIP: cbq_init+0x27/0x205 RSP: ffff8800372175f0
      [44727.919516] CR2: 0000000000000000
      
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3501d059
    • Nikolay Aleksandrov's avatar
      sch_hfsc: fix null pointer deref and double free on init failure · 3bdac362
      Nikolay Aleksandrov authored
      Depending on where ->init fails we can get a null pointer deref due to
      uninitialized hires timer (watchdog) or a double free of the qdisc hash
      because it is already freed by ->destroy().
      
      Fixes: 8d553738 ("net/sched/hfsc: allocate tcf block for hfsc root class")
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3bdac362
    • Nikolay Aleksandrov's avatar
      sch_hhf: fix null pointer dereference on init failure · 32db864d
      Nikolay Aleksandrov authored
      If sch_hhf fails in its ->init() function (either due to wrong
      user-space arguments as below or memory alloc failure of hh_flows) it
      will do a null pointer deref of q->hh_flows in its ->destroy() function.
      
      To reproduce the crash:
      $ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000
      
      Crash log:
      [  690.654882] BUG: unable to handle kernel NULL pointer dereference at (null)
      [  690.655565] IP: hhf_destroy+0x48/0xbc
      [  690.655944] PGD 37345067
      [  690.655948] P4D 37345067
      [  690.656252] PUD 58402067
      [  690.656554] PMD 0
      [  690.656857]
      [  690.657362] Oops: 0000 [#1] SMP
      [  690.657696] Modules linked in:
      [  690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57
      [  690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [  690.659255] task: ffff880058578000 task.stack: ffff88005acbc000
      [  690.659747] RIP: 0010:hhf_destroy+0x48/0xbc
      [  690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246
      [  690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
      [  690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0
      [  690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000
      [  690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea
      [  690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000
      [  690.663769] FS:  00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
      [  690.667069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0
      [  690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  690.671003] Call Trace:
      [  690.671743]  qdisc_create+0x377/0x3fd
      [  690.672534]  tc_modify_qdisc+0x4d2/0x4fd
      [  690.673324]  rtnetlink_rcv_msg+0x188/0x197
      [  690.674204]  ? rcu_read_unlock+0x3e/0x5f
      [  690.675091]  ? rtnl_newlink+0x729/0x729
      [  690.675877]  netlink_rcv_skb+0x6c/0xce
      [  690.676648]  rtnetlink_rcv+0x23/0x2a
      [  690.677405]  netlink_unicast+0x103/0x181
      [  690.678179]  netlink_sendmsg+0x326/0x337
      [  690.678958]  sock_sendmsg_nosec+0x14/0x3f
      [  690.679743]  sock_sendmsg+0x29/0x2e
      [  690.680506]  ___sys_sendmsg+0x209/0x28b
      [  690.681283]  ? __handle_mm_fault+0xc7d/0xdb1
      [  690.681915]  ? check_chain_key+0xb0/0xfd
      [  690.682449]  __sys_sendmsg+0x45/0x63
      [  690.682954]  ? __sys_sendmsg+0x45/0x63
      [  690.683471]  SyS_sendmsg+0x19/0x1b
      [  690.683974]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [  690.684516] RIP: 0033:0x7f8ae529d690
      [  690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690
      [  690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003
      [  690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000
      [  690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002
      [  690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000
      [  690.688475]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [  690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83
      c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02
      00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1
      [  690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0
      [  690.690636] CR2: 0000000000000000
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 10239edf ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32db864d
    • Nikolay Aleksandrov's avatar
      sch_multiq: fix double free on init failure · e89d469e
      Nikolay Aleksandrov authored
      The below commit added a call to ->destroy() on init failure, but multiq
      still frees ->queues on error in init, but ->queues is also freed by
      ->destroy() thus we get double free and corrupted memory.
      
      Very easy to reproduce (eth0 not multiqueue):
      $ tc qdisc add dev eth0 root multiq
      RTNETLINK answers: Operation not supported
      $ ip l add dumdum type dummy
      (crash)
      
      Trace log:
      [ 3929.467747] general protection fault: 0000 [#1] SMP
      [ 3929.468083] Modules linked in:
      [ 3929.468302] CPU: 3 PID: 967 Comm: ip Not tainted 4.13.0-rc6+ #56
      [ 3929.468625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 3929.469124] task: ffff88003716a700 task.stack: ffff88005872c000
      [ 3929.469449] RIP: 0010:__kmalloc_track_caller+0x117/0x1be
      [ 3929.469746] RSP: 0018:ffff88005872f6a0 EFLAGS: 00010246
      [ 3929.470042] RAX: 00000000000002de RBX: 0000000058a59000 RCX: 00000000000002df
      [ 3929.470406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff821f7020
      [ 3929.470770] RBP: ffff88005872f6e8 R08: 000000000001f010 R09: 0000000000000000
      [ 3929.471133] R10: ffff88005872f730 R11: 0000000000008cdd R12: ff006d75646d7564
      [ 3929.471496] R13: 00000000014000c0 R14: ffff88005b403c00 R15: ffff88005b403c00
      [ 3929.471869] FS:  00007f0b70480740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
      [ 3929.472286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 3929.472677] CR2: 00007ffcee4f3000 CR3: 0000000059d45000 CR4: 00000000000406e0
      [ 3929.473209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 3929.474109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 3929.474873] Call Trace:
      [ 3929.475337]  ? kstrdup_const+0x23/0x25
      [ 3929.475863]  kstrdup+0x2e/0x4b
      [ 3929.476338]  kstrdup_const+0x23/0x25
      [ 3929.478084]  __kernfs_new_node+0x28/0xbc
      [ 3929.478478]  kernfs_new_node+0x35/0x55
      [ 3929.478929]  kernfs_create_link+0x23/0x76
      [ 3929.479478]  sysfs_do_create_link_sd.isra.2+0x85/0xd7
      [ 3929.480096]  sysfs_create_link+0x33/0x35
      [ 3929.480649]  device_add+0x200/0x589
      [ 3929.481184]  netdev_register_kobject+0x7c/0x12f
      [ 3929.481711]  register_netdevice+0x373/0x471
      [ 3929.482174]  rtnl_newlink+0x614/0x729
      [ 3929.482610]  ? rtnl_newlink+0x17f/0x729
      [ 3929.483080]  rtnetlink_rcv_msg+0x188/0x197
      [ 3929.483533]  ? rcu_read_unlock+0x3e/0x5f
      [ 3929.483984]  ? rtnl_newlink+0x729/0x729
      [ 3929.484420]  netlink_rcv_skb+0x6c/0xce
      [ 3929.484858]  rtnetlink_rcv+0x23/0x2a
      [ 3929.485291]  netlink_unicast+0x103/0x181
      [ 3929.485735]  netlink_sendmsg+0x326/0x337
      [ 3929.486181]  sock_sendmsg_nosec+0x14/0x3f
      [ 3929.486614]  sock_sendmsg+0x29/0x2e
      [ 3929.486973]  ___sys_sendmsg+0x209/0x28b
      [ 3929.487340]  ? do_raw_spin_unlock+0xcd/0xf8
      [ 3929.487719]  ? _raw_spin_unlock+0x27/0x31
      [ 3929.488092]  ? __handle_mm_fault+0x651/0xdb1
      [ 3929.488471]  ? check_chain_key+0xb0/0xfd
      [ 3929.488847]  __sys_sendmsg+0x45/0x63
      [ 3929.489206]  ? __sys_sendmsg+0x45/0x63
      [ 3929.489576]  SyS_sendmsg+0x19/0x1b
      [ 3929.489901]  entry_SYSCALL_64_fastpath+0x23/0xc2
      [ 3929.490172] RIP: 0033:0x7f0b6fb93690
      [ 3929.490423] RSP: 002b:00007ffcee4ed588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 3929.490881] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f0b6fb93690
      [ 3929.491198] RDX: 0000000000000000 RSI: 00007ffcee4ed5d0 RDI: 0000000000000003
      [ 3929.491521] RBP: ffff88005872ff98 R08: 0000000000000001 R09: 0000000000000000
      [ 3929.491801] R10: 00007ffcee4ed350 R11: 0000000000000246 R12: 0000000000000002
      [ 3929.492075] R13: 000000000066f1a0 R14: 00007ffcee4f5680 R15: 0000000000000000
      [ 3929.492352]  ? trace_hardirqs_off_caller+0xa7/0xcf
      [ 3929.492590] Code: 8b 45 c0 48 8b 45 b8 74 17 48 8b 4d c8 83 ca ff 44
      89 ee 4c 89 f7 e8 83 ca ff ff 49 89 c4 eb 49 49 63 56 20 48 8d 48 01 4d
      8b 06 <49> 8b 1c 14 48 89 c2 4c 89 e0 65 49 0f c7 08 0f 94 c0 83 f0 01
      [ 3929.493335] RIP: __kmalloc_track_caller+0x117/0x1be RSP: ffff88005872f6a0
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: f07d1501 ("multiq: Further multiqueue cleanup")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e89d469e
    • Nikolay Aleksandrov's avatar
      sch_htb: fix crash on init failure · 88c2ace6
      Nikolay Aleksandrov authored
      The commit below added a call to the ->destroy() callback for all qdiscs
      which failed in their ->init(), but some were not prepared for such
      change and can't handle partially initialized qdisc. HTB is one of them
      and if any error occurs before the qdisc watchdog timer and qdisc work are
      initialized then we can hit either a null ptr deref (timer->base) when
      canceling in ->destroy or lockdep error info about trying to register
      a non-static key and a stack dump. So to fix these two move the watchdog
      timer and workqueue init before anything that can err out.
      To reproduce userspace needs to send broken htb qdisc create request,
      tested with a modified tc (q_htb.c).
      
      Trace log:
      [ 2710.897602] BUG: unable to handle kernel NULL pointer dereference at (null)
      [ 2710.897977] IP: hrtimer_active+0x17/0x8a
      [ 2710.898174] PGD 58fab067
      [ 2710.898175] P4D 58fab067
      [ 2710.898353] PUD 586c0067
      [ 2710.898531] PMD 0
      [ 2710.898710]
      [ 2710.899045] Oops: 0000 [#1] SMP
      [ 2710.899232] Modules linked in:
      [ 2710.899419] CPU: 1 PID: 950 Comm: tc Not tainted 4.13.0-rc6+ #54
      [ 2710.899646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [ 2710.900035] task: ffff880059ed2700 task.stack: ffff88005ad4c000
      [ 2710.900262] RIP: 0010:hrtimer_active+0x17/0x8a
      [ 2710.900467] RSP: 0018:ffff88005ad4f960 EFLAGS: 00010246
      [ 2710.900684] RAX: 0000000000000000 RBX: ffff88003701e298 RCX: 0000000000000000
      [ 2710.900933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003701e298
      [ 2710.901177] RBP: ffff88005ad4f980 R08: 0000000000000001 R09: 0000000000000001
      [ 2710.901419] R10: ffff88005ad4f800 R11: 0000000000000400 R12: 0000000000000000
      [ 2710.901663] R13: ffff88003701e298 R14: ffffffff822a4540 R15: ffff88005ad4fac0
      [ 2710.901907] FS:  00007f2f5e90f740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
      [ 2710.902277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2710.902500] CR2: 0000000000000000 CR3: 0000000058ca3000 CR4: 00000000000406e0
      [ 2710.902744] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2710.902977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 2710.903180] Call Trace:
      [ 2710.903332]  hrtimer_try_to_cancel+0x1a/0x93
      [ 2710.903504]  hrtimer_cancel+0x15/0x20
      [ 2710.903667]  qdisc_watchdog_cancel+0x12/0x14
      [ 2710.903866]  htb_destroy+0x2e/0xf7
      [ 2710.904097]  qdisc_create+0x377/0x3fd
      [ 2710.904330]  tc_modify_qdisc+0x4d2/0x4fd
      [ 2710.904511]  rtnetlink_rcv_msg+0x188/0x197
      [ 2710.904682]  ? rcu_read_unlock+0x3e/0x5f
      [ 2710.904849]  ? rtnl_newlink+0x729/0x729
      [ 2710.905017]  netlink_rcv_skb+0x6c/0xce
      [ 2710.905183]  rtnetlink_rcv+0x23/0x2a
      [ 2710.905345]  netlink_unicast+0x103/0x181
      [ 2710.905511]  netlink_sendmsg+0x326/0x337
      [ 2710.905679]  sock_sendmsg_nosec+0x14/0x3f
      [ 2710.905847]  sock_sendmsg+0x29/0x2e
      [ 2710.906010]  ___sys_sendmsg+0x209/0x28b
      [ 2710.906176]  ? do_raw_spin_unlock+0xcd/0xf8
      [ 2710.906346]  ? _raw_spin_unlock+0x27/0x31
      [ 2710.906514]  ? __handle_mm_fault+0x651/0xdb1
      [ 2710.906685]  ? check_chain_key+0xb0/0xfd
      [ 2710.906855]  __sys_sendmsg+0x45/0x63
      [ 2710.907018]  ? __sys_sendmsg+0x45/0x63
      [ 2710.907185]  SyS_sendmsg+0x19/0x1b
      [ 2710.907344]  entry_SYSCALL_64_fastpath+0x23/0xc2
      
      Note that probably this bug goes further back because the default qdisc
      handling always calls ->destroy on init failure too.
      
      Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation")
      Fixes: 0fbbeb1b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      88c2ace6
    • Sekhar Nori's avatar
      net: ti: cpsw-common: dont print error if ti_cm_get_macid() fails · f0e82d73
      Sekhar Nori authored
      It is quite common for ti_cm_get_macid() to fail on some of the
      platforms it is invoked on. They include any platform where
      mac address is not part of SoC register space.
      
      On these platforms, mac address is read and populated in
      device-tree by bootloader. An example is TI DA850.
      
      Downgrade the severity of message to "information", so it does
      not spam logs when 'quiet' boot is desired.
      Signed-off-by: default avatarSekhar Nori <nsekhar@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f0e82d73
  2. 29 Aug, 2017 14 commits
  3. 28 Aug, 2017 19 commits
    • Roopa Prabhu's avatar
      bridge: check for null fdb->dst before notifying switchdev drivers · ef9a5a62
      Roopa Prabhu authored
      current switchdev drivers dont seem to support offloading fdb
      entries pointing to the bridge device which have fdb->dst
      not set to any port. This patch adds a NULL fdb->dst check in
      the switchdev notifier code.
      
      This patch fixes the below NULL ptr dereference:
      $bridge fdb add 00:02:00:00:00:33 dev br0 self
      
      [   69.953374] BUG: unable to handle kernel NULL pointer dereference at
      0000000000000008
      [   69.954044] IP: br_switchdev_fdb_notify+0x29/0x80
      [   69.954044] PGD 66527067
      [   69.954044] P4D 66527067
      [   69.954044] PUD 7899c067
      [   69.954044] PMD 0
      [   69.954044]
      [   69.954044] Oops: 0000 [#1] SMP
      [   69.954044] Modules linked in:
      [   69.954044] CPU: 1 PID: 3074 Comm: bridge Not tainted 4.13.0-rc6+ #1
      [   69.954044] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org
      04/01/2014
      [   69.954044] task: ffff88007b827140 task.stack: ffffc90001564000
      [   69.954044] RIP: 0010:br_switchdev_fdb_notify+0x29/0x80
      [   69.954044] RSP: 0018:ffffc90001567918 EFLAGS: 00010246
      [   69.954044] RAX: 0000000000000000 RBX: ffff8800795e0880 RCX:
      00000000000000c0
      [   69.954044] RDX: ffffc90001567920 RSI: 000000000000001c RDI:
      ffff8800795d0600
      [   69.954044] RBP: ffffc90001567938 R08: ffff8800795d0600 R09:
      0000000000000000
      [   69.954044] R10: ffffc90001567a88 R11: ffff88007b849400 R12:
      ffff8800795e0880
      [   69.954044] R13: ffff8800795d0600 R14: ffffffff81ef8880 R15:
      000000000000001c
      [   69.954044] FS:  00007f93d3085700(0000) GS:ffff88007fd00000(0000)
      knlGS:0000000000000000
      [   69.954044] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   69.954044] CR2: 0000000000000008 CR3: 0000000066551000 CR4:
      00000000000006e0
      [   69.954044] Call Trace:
      [   69.954044]  fdb_notify+0x3f/0xf0
      [   69.954044]  __br_fdb_add.isra.12+0x1a7/0x370
      [   69.954044]  br_fdb_add+0x178/0x280
      [   69.954044]  rtnl_fdb_add+0x10a/0x200
      [   69.954044]  rtnetlink_rcv_msg+0x1b4/0x240
      [   69.954044]  ? skb_free_head+0x21/0x40
      [   69.954044]  ? rtnl_calcit.isra.18+0xf0/0xf0
      [   69.954044]  netlink_rcv_skb+0xed/0x120
      [   69.954044]  rtnetlink_rcv+0x15/0x20
      [   69.954044]  netlink_unicast+0x180/0x200
      [   69.954044]  netlink_sendmsg+0x291/0x370
      [   69.954044]  ___sys_sendmsg+0x180/0x2e0
      [   69.954044]  ? filemap_map_pages+0x2db/0x370
      [   69.954044]  ? do_wp_page+0x11d/0x420
      [   69.954044]  ? __handle_mm_fault+0x794/0xd80
      [   69.954044]  ? vma_link+0xcb/0xd0
      [   69.954044]  __sys_sendmsg+0x4c/0x90
      [   69.954044]  SyS_sendmsg+0x12/0x20
      [   69.954044]  do_syscall_64+0x63/0xe0
      [   69.954044]  entry_SYSCALL64_slow_path+0x25/0x25
      [   69.954044] RIP: 0033:0x7f93d2bad690
      [   69.954044] RSP: 002b:00007ffc7217a638 EFLAGS: 00000246 ORIG_RAX:
      000000000000002e
      [   69.954044] RAX: ffffffffffffffda RBX: 00007ffc72182eac RCX:
      00007f93d2bad690
      [   69.954044] RDX: 0000000000000000 RSI: 00007ffc7217a670 RDI:
      0000000000000003
      [   69.954044] RBP: 0000000059a1f7f8 R08: 0000000000000006 R09:
      000000000000000a
      [   69.954044] R10: 00007ffc7217a400 R11: 0000000000000246 R12:
      00007ffc7217a670
      [   69.954044] R13: 00007ffc72182a98 R14: 00000000006114c0 R15:
      00007ffc72182aa0
      [   69.954044] Code: 1f 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 47
      20 04 74 0a 83 fe 1c 74 09 83 fe 1d 74 2c c9 66 90 c3 48 8b 47 10 48 8d
      55 e8 <48> 8b 70 08 0f b7 47 1e 48 83 c7 18 48 89 7d f0 bf 03 00 00 00
      [   69.954044] RIP: br_switchdev_fdb_notify+0x29/0x80 RSP:
      ffffc90001567918
      [   69.954044] CR2: 0000000000000008
      [   69.954044] ---[ end trace 03e9eec4a82c238b ]---
      
      Fixes: 6b26b51b ("net: bridge: Add support for notifying devices about FDB add/del")
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef9a5a62
    • Xin Long's avatar
      ipv6: set dst.obsolete when a cached route has expired · 1e2ea8ad
      Xin Long authored
      Now it doesn't check for the cached route expiration in ipv6's
      dst_ops->check(), because it trusts dst_gc that would clean the
      cached route up when it's expired.
      
      The problem is in dst_gc, it would clean the cached route only
      when it's refcount is 1. If some other module (like xfrm) keeps
      holding it and the module only release it when dst_ops->check()
      fails.
      
      But without checking for the cached route expiration, .check()
      may always return true. Meanwhile, without releasing the cached
      route, dst_gc couldn't del it. It will cause this cached route
      never to expire.
      
      This patch is to set dst.obsolete with DST_OBSOLETE_KILL in .gc
      when it's expired, and check obsolete != DST_OBSOLETE_FORCE_CHK
      in .check.
      
      Note that this is even needed when ipv6 dst_gc timer is removed
      one day. It would set dst.obsolete in .redirect and .update_pmtu
      instead, and check for cached route expiration when getting it,
      just like what ipv4 route does.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e2ea8ad
    • Wei Wang's avatar
      ipv6: fix sparse warning on rt6i_node · 4e587ea7
      Wei Wang authored
      Commit c5cff856 adds rcu grace period before freeing fib6_node. This
      generates a new sparse warning on rt->rt6i_node related code:
        net/ipv6/route.c:1394:30: error: incompatible types in comparison
        expression (different address spaces)
        ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison
        expression (different address spaces)
      
      This commit adds "__rcu" tag for rt6i_node and makes sure corresponding
      rcu API is used for it.
      After this fix, sparse no longer generates the above warning.
      
      Fixes: c5cff856 ("ipv6: add rcu grace period before freeing fib6_node")
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e587ea7
    • Stefano Brivio's avatar
      cxgb4: Fix stack out-of-bounds read due to wrong size to t4_record_mbox() · 0f308686
      Stefano Brivio authored
      Passing commands for logging to t4_record_mbox() with size
      MBOX_LEN, when the actual command size is actually smaller,
      causes out-of-bounds stack accesses in t4_record_mbox() while
      copying command words here:
      
      	for (i = 0; i < size / 8; i++)
      		entry->cmd[i] = be64_to_cpu(cmd[i]);
      
      Up to 48 bytes from the stack are then leaked to debugfs.
      
      This happens whenever we send (and log) commands described by
      structs fw_sched_cmd (32 bytes leaked), fw_vi_rxmode_cmd (48),
      fw_hello_cmd (48), fw_bye_cmd (48), fw_initialize_cmd (48),
      fw_reset_cmd (48), fw_pfvf_cmd (32), fw_eq_eth_cmd (16),
      fw_eq_ctrl_cmd (32), fw_eq_ofld_cmd (32), fw_acl_mac_cmd(16),
      fw_rss_glb_config_cmd(32), fw_rss_vi_config_cmd(32),
      fw_devlog_cmd(32), fw_vi_enable_cmd(48), fw_port_cmd(32),
      fw_sched_cmd(32), fw_devlog_cmd(32).
      
      The cxgb4vf driver got this right instead.
      
      When we call t4_record_mbox() to log a command reply, a MBOX_LEN
      size can be used though, as get_mbox_rpl() will fill cmd_rpl up
      completely.
      
      Fixes: 7f080c3f ("cxgb4: Add support to enable logging of firmware mailbox commands")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0f308686
    • Maxime Ripard's avatar
      net: stmmac: sun8i: Remove the compatibles · ad4540cc
      Maxime Ripard authored
      Since the bindings have been controversial, and we follow the DT stable ABI
      rule, we shouldn't let a driver with a DT binding that might change slip
      through in a stable release.
      
      Remove the compatibles to make sure the driver will not probe and no-one
      will start using the binding currently implemented. This commit will
      obviously need to be reverted in due time.
      Signed-off-by: default avatarMaxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ad4540cc
    • David S. Miller's avatar
      Merge branch 'nfp-flow-dissector-layer' · c73c8a8e
      David S. Miller authored
      Pieter Jansen van Vuuren says:
      
      ====================
      nfp: fix layer calculation and flow dissector use
      
      Previously when calculating the supported key layers MPLS, IPv4/6
      TTL and TOS were not considered. Formerly flow dissectors were referenced
      without first checking that they are in use and correctly populated by TC.
      Additionally this patch set fixes the incorrect use of mask field for vlan
      matching.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c73c8a8e
    • Pieter Jansen van Vuuren's avatar
      nfp: remove incorrect mask check for vlan matching · 6afd33e4
      Pieter Jansen van Vuuren authored
      Previously the vlan tci field was incorrectly exact matched. This patch
      fixes this by using the flow dissector to populate the vlan tci field.
      
      Fixes: 5571e8c9 ("nfp: extend flower matching capabilities")
      Signed-off-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6afd33e4
    • Pieter Jansen van Vuuren's avatar
      nfp: fix supported key layers calculation · 74af5975
      Pieter Jansen van Vuuren authored
      Previously when calculating the supported key layers MPLS, IPv4/6
      TTL and TOS were not considered. This patch checks that the TTL and
      TOS fields are masked out before offloading. Additionally this patch
      checks that MPLS packets are correctly handled, by not offloading them.
      
      Fixes: af9d842c ("nfp: extend flower add flow offload")
      Signed-off-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      74af5975
    • Pieter Jansen van Vuuren's avatar
      nfp: fix unchecked flow dissector use · a7cd39e0
      Pieter Jansen van Vuuren authored
      Previously flow dissectors were referenced without first checking that
      they are in use and correctly populated by TC. This patch fixes this by
      checking each flow dissector key before referencing them.
      
      Fixes: 5571e8c9 ("nfp: extend flower matching capabilities")
      Signed-off-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a7cd39e0
    • David S. Miller's avatar
      Merge branch 'l2tp-tunnel-refs' · 77146b5d
      David S. Miller authored
      Guillaume Nault says:
      
      ====================
      l2tp: fix some l2tp_tunnel_find() issues in l2tp_netlink
      
      Since l2tp_tunnel_find() doesn't take a reference on the tunnel it
      returns, its users are almost guaranteed to be racy.
      
      This series defines l2tp_tunnel_get() which can be used as a safe
      replacement, and converts some of l2tp_tunnel_find() users in the
      l2tp_netlink module.
      
      Other users often combine this issue with other more or less subtle
      races. They will be fixed incrementally in followup series.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77146b5d
    • Guillaume Nault's avatar
      l2tp: hold tunnel used while creating sessions with netlink · e702c120
      Guillaume Nault authored
      Use l2tp_tunnel_get() to retrieve tunnel, so that it can't go away on
      us. Otherwise l2tp_tunnel_destruct() might release the last reference
      count concurrently, thus freeing the tunnel while we're using it.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e702c120
    • Guillaume Nault's avatar
      l2tp: hold tunnel while handling genl TUNNEL_GET commands · 4e4b21da
      Guillaume Nault authored
      Use l2tp_tunnel_get() instead of l2tp_tunnel_find() so that we get
      a reference on the tunnel, preventing l2tp_tunnel_destruct() from
      freeing it from under us.
      
      Also move l2tp_tunnel_get() below nlmsg_new() so that we only take
      the reference when needed.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e4b21da
    • Guillaume Nault's avatar
      l2tp: hold tunnel while handling genl tunnel updates · 8c0e4215
      Guillaume Nault authored
      We need to make sure the tunnel is not going to be destroyed by
      l2tp_tunnel_destruct() concurrently.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c0e4215
    • Guillaume Nault's avatar
      l2tp: hold tunnel while processing genl delete command · bb0a32ce
      Guillaume Nault authored
      l2tp_nl_cmd_tunnel_delete() needs to take a reference on the tunnel, to
      prevent it from being concurrently freed by l2tp_tunnel_destruct().
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bb0a32ce
    • Guillaume Nault's avatar
      l2tp: hold tunnel while looking up sessions in l2tp_netlink · 54652eb1
      Guillaume Nault authored
      l2tp_tunnel_find() doesn't take a reference on the returned tunnel.
      Therefore, it's unsafe to use it because the returned tunnel can go
      away on us anytime.
      
      Fix this by defining l2tp_tunnel_get(), which works like
      l2tp_tunnel_find(), but takes a reference on the returned tunnel.
      Caller then has to drop this reference using l2tp_tunnel_dec_refcount().
      
      As l2tp_tunnel_dec_refcount() needs to be moved to l2tp_core.h, let's
      simplify the patch and not move the L2TP_REFCNT_DEBUG part. This code
      has been broken (not even compiling) in May 2012 by
      commit a4ca44fa ("net: l2tp: Standardize logging styles")
      and fixed more than two years later by
      commit 29abe2fd ("l2tp: fix missing line continuation"). So it
      doesn't appear to be used by anyone.
      
      Same thing for l2tp_tunnel_free(); instead of moving it to l2tp_core.h,
      let's just simplify things and call kfree_rcu() directly in
      l2tp_tunnel_dec_refcount(). Extra assertions and debugging code
      provided by l2tp_tunnel_free() didn't help catching any of the
      reference counting and socket handling issues found while working on
      this series.
      
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      54652eb1
    • Guillaume Nault's avatar
      l2tp: initialise session's refcount before making it reachable · 9ee369a4
      Guillaume Nault authored
      Sessions must be fully initialised before calling
      l2tp_session_add_to_tunnel(). Otherwise, there's a short time frame
      where partially initialised sessions can be accessed by external users.
      
      Fixes: dbdbc73b ("l2tp: fix duplicate session creation")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ee369a4
    • Antoine Tenart's avatar
      net: mvpp2: fix the mac address used when using PPv2.2 · 4c228682
      Antoine Tenart authored
      The mac address is only retrieved from h/w when using PPv2.1. Otherwise
      the variable holding it is still checked and used if it contains a valid
      value. As the variable isn't initialized to an invalid mac address
      value, we end up with random mac addresses which can be the same for all
      the ports handled by this PPv2 driver.
      
      Fixes this by initializing the h/w mac address variable to {0}, which is
      an invalid mac address value. This way the random assignation fallback
      is called and all ports end up with their own addresses.
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@free-electrons.com>
      Fixes: 26975821 ("net: mvpp2: handle misc PPv2.1/PPv2.2 differences")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4c228682
    • Aleksander Morgado's avatar
      cdc_ncm: flag the u-blox TOBY-L4 as wwan · 3b638f0f
      Aleksander Morgado authored
      The u-blox TOBY-L4 is a LTE Advanced (Cat 6) module with HSPA+ and 2G
      fallback.
      
      Unlike the TOBY-L2, this module has one single USB layout and exposes
      several TTYs for control and a NCM interface for data. Connecting this
      module may be done just by activating the desired PDP context with
      'AT+CGACT=1,<cid>' and then running DHCP on the NCM interface.
      Signed-off-by: default avatarAleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b638f0f
    • Jesper Dangaard Brouer's avatar
      net: missing call of trace_napi_poll in busy_poll_stop · 1e22391e
      Jesper Dangaard Brouer authored
      Noticed that busy_poll_stop() also invoke the drivers napi->poll()
      function pointer, but didn't have an associated call to trace_napi_poll()
      like all other call sites.
      Signed-off-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e22391e