1. 07 Apr, 2015 10 commits
    • Pratyush Anand's avatar
      ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl · f7b96c35
      Pratyush Anand authored
      commit 1619dc3f upstream.
      
      When ftrace is enabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
      ftrace is disabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().
      
      Consider the following situation.
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
      
      After this ftrace_enabled = 0.
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
      called.
      
       # echo 1 > /proc/sys/kernel/ftrace_enabled
      
      Now ftrace_enabled will be set to true, but still
      ftrace_enable_ftrace_graph_caller() will not be called, which is not
      desired.
      
      Further if we execute the following after this:
        # echo nop > /sys/kernel/debug/tracing/current_tracer
      
      Now since ftrace_enabled is set it will call
      ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
      the ARM platform.
      
      On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
      it checks whether the old instruction is a nop or not. If it's not a nop,
      then it returns an error. If it is a nop then it replaces instruction at
      that address with a branch to ftrace_graph_caller.
      ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
      if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
      or ftrace_disable_ftrace_graph_caller() consecutively two times in a row,
      then it will return an error, which will cause the generic ftrace code to
      raise a warning.
      
      Note, x86 does not have an issue with this because the architecture
      specific code for ftrace_enable_ftrace_graph_caller() and
      ftrace_disable_ftrace_graph_caller() does not check the previous state,
      and calling either of these functions twice in a row has no ill effect.
      
      Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.comSigned-off-by: default avatarPratyush Anand <panand@redhat.com>
      [
        removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
        if CONFIG_FUNCTION_GRAPH_TRACER is not set.
      ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f7b96c35
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Read all messages in a bulk-in URB buffer · fe421e97
      Ahmed S. Darwish authored
      commit 2fec5104 upstream.
      
      The Kvaser firmware can only read and write messages that are
      not crossing the USB endpoint's wMaxPacketSize boundary. While
      receiving commands from the CAN device, if the next command in
      the same URB buffer crossed that max packet size boundary, the
      firmware puts a zero-length placeholder command in its place
      then moves the real command to the next boundary mark.
      
      The driver did not recognize such behavior, leading to missing
      a good number of rx events during a heavy rx load session.
      
      Moreover, a tx URB context only gets freed upon receiving its
      respective tx ACK event. Over time, the free tx URB contexts
      pool gets depleted due to the missing ACK events. Consequently,
      the netif transmission queue gets __permanently__ stopped; no
      frames could be sent again except after restarting the CAN
      newtwork interface.
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fe421e97
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Avoid double free on URB submission failures · 0f75a0df
      Ahmed S. Darwish authored
      commit deb2701c upstream.
      
      Upon a URB submission failure, the driver calls usb_free_urb()
      but then manually frees the URB buffer by itself.  Meanwhile
      usb_free_urb() has alredy freed out that transfer buffer since
      we're the only code path holding a reference to this URB.
      
      Remove two of such invalid manual free().
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0f75a0df
    • Oliver Hartkopp's avatar
      can: add missing initialisations in CAN related skbuffs · 40705519
      Oliver Hartkopp authored
      commit 96943901 upstream.
      
      When accessing CAN network interfaces with AF_PACKET sockets e.g. by dhclient
      this can lead to a skb_under_panic due to missing skb initialisations.
      
      Add the missing initialisations at the CAN skbuff creation times on driver
      level (rx path) and in the network layer (tx path).
      Reported-by: default avatarAustin Schuh <austin@peloton-tech.com>
      Reported-by: default avatarDaniel Steer <daniel.steer@mclaren.com>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      [ kamal: backport to 3.13-stable: no alloc_canfd_skb() in 3.13 ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      40705519
    • James Bottomley's avatar
      libsas: Fix Kernel Crash in smp_execute_task · 5d8ebfb6
      James Bottomley authored
      commit 6302ce4d upstream.
      
      This crash was reported:
      
      [  366.947370] sd 3:0:1:0: [sdb] Spinning up disk....
      [  368.804046] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [  368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.804098] PGD 0
      [  368.804114] Oops: 0002 [#1] SMP
      [  368.804143] CPU 1
      [  368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common
      [  368.804749]
      [  368.804764] Pid: 392, comm: kworker/u:3 Tainted: P        W  O 3.4.87-logicube-ng.22 #1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920
      [  368.804802] RIP: 0010:[<ffffffff81358457>]  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.804827] RSP: 0018:ffff880117001cc0  EFLAGS: 00010246
      [  368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420
      [  368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4
      [  368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe
      [  368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4
      [  368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8
      [  368.804916] FS:  0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
      [  368.804931] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0
      [  368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0)
      [  368.805009] Stack:
      [  368.805017]  ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c
      [  368.805062]  000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000
      [  368.805100]  ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac
      [  368.805135] Call Trace:
      [  368.805153]  [<ffffffff81056f7c>] ? up+0xb/0x33
      [  368.805168]  [<ffffffff813583ac>] ? mutex_lock+0x16/0x25
      [  368.805194]  [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas]
      [  368.805217]  [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas]
      [  368.805240]  [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas]
      [  368.805264]  [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas]
      [  368.805280]  [<ffffffff81355a2a>] ? printk+0x43/0x48
      [  368.805296]  [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd
      [  368.805318]  [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas]
      [  368.805336]  [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c
      [  368.805351]  [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152
      [  368.805366]  [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163
      [  368.805382]  [<ffffffff81052c4e>] ? kthread+0x79/0x81
      [  368.805399]  [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10
      [  368.805416]  [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9
      [  368.805431]  [<ffffffff8135fea0>] ? gs_change+0x13/0x13
      [  368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41
      [  368.805851] RIP  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.805877]  RSP <ffff880117001cc0>
      [  368.805886] CR2: 0000000000000000
      [  368.805899] ---[ end trace b720682065d8f4cc ]---
      
      It's directly caused by 89d3cf6a [SCSI] libsas: add mutex for SMP task
      execution, but shows a deeper cause: expander functions expect to be able to
      cast to and treat domain devices as expanders.  The correct fix is to only do
      expander discover when we know we've got an expander device to avoid wrongly
      casting a non-expander device.
      Reported-by: default avatarPraveen Murali <pmurali@logicube.com>
      Tested-by: default avatarPraveen Murali <pmurali@logicube.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      5d8ebfb6
    • jmlatten@linux.vnet.ibm.com's avatar
      tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send · c5ca6f5a
      jmlatten@linux.vnet.ibm.com authored
      commit 62dfd912 upstream.
      
      Problem: When IMA and VTPM are both enabled in kernel config,
      kernel hangs during bootup on LE OS.
      
      Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send
      and tpm_ibmtpm_recv getting called. A trace showed that
      tpm_ibmtpm_recv was hanging.
      
      Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send
      was sending CRQ message that probably did not make much sense
      to phype because of Endianness. The fix below sends correctly
      converted CRQ for LE. This was not caught before because it
      seems IMA is not enabled by default in kernel config and
      IMA exercises this particular code path in vtpm.
      
      Tested with IMA and VTPM enabled in kernel config and VTPM
      enabled on both a BE OS and a LE OS ppc64 lpar. This exercised
      CRQ and TPM command code paths in vtpm.
      Patch is against Peter's tpmdd tree on github which included
      Vicky's previous vtpm le patches.
      Signed-off-by: default avatarJoy Latten <jmlatten@linux.vnet.ibm.com>
      Reviewed-by: default avatarAshley Lai <ashley@ahsleylai.com>
      Signed-off-by: default avatarPeter Huewe <peterhuewe@gmx.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c5ca6f5a
    • Alexander Sverdlin's avatar
      spi: pl022: Fix race in giveback() leading to driver lock-up · 9bed79a9
      Alexander Sverdlin authored
      commit cd6fa8d2 upstream.
      
      Commit fd316941 ("spi/pl022: disable port when unused") introduced a race,
      which leads to possible driver lock up (easily reproducible on SMP).
      
      The problem happens in giveback() function where the completion of the transfer
      is signalled to SPI subsystem and then the HW SPI controller is disabled. Another
      transfer might be setup in between, which brings driver in locked-up state.
      
      Exact event sequence on SMP:
      
      core0                                   core1
      
                                              => pump_transfers()
                                              /* message->state == STATE_DONE */
                                                => giveback()
                                                  => spi_finalize_current_message()
      
      => pl022_unprepare_transfer_hardware()
      => pl022_transfer_one_message
        => flush()
        => do_interrupt_dma_transfer()
          => set_up_next_transfer()
          /* Enable SSP, turn on interrupts */
          writew((readw(SSP_CR1(pl022->virtbase)) |
                 SSP_CR1_MASK_SSE), SSP_CR1(pl022->virtbase));
      
      ...
      
      => pl022_interrupt_handler()
        => readwriter()
      
                                              /* disable the SPI/SSP operation */
                                              => writew((readw(SSP_CR1(pl022->virtbase)) &
                                                        (~SSP_CR1_MASK_SSE)), SSP_CR1(pl022->virtbase));
      
      Lockup! SPI controller is disabled and the data will never be received. Whole
      SPI subsystem is waiting for transfer ACK and blocked.
      
      So, only signal transfer completion after disabling the controller.
      
      Fixes: fd316941 (spi/pl022: disable port when unused)
      Signed-off-by: default avatarAlexander Sverdlin <alexander.sverdlin@nokia.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      9bed79a9
    • Brian King's avatar
      bnx2x: Force fundamental reset for EEH recovery · 1b0c66ad
      Brian King authored
      commit da293700 upstream.
      
      EEH recovery for bnx2x based adapters is not reliable on all Power
      systems using the default hot reset, which can result in an
      unrecoverable EEH error. Forcing the use of fundamental reset
      during EEH recovery fixes this.
      Signed-off-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1b0c66ad
    • Tejun Heo's avatar
      workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE · 0d4bfc4d
      Tejun Heo authored
      commit 8603e1b3 upstream.
      
      cancel[_delayed]_work_sync() are implemented using
      __cancel_work_timer() which grabs the PENDING bit using
      try_to_grab_pending() and then flushes the work item with PENDING set
      to prevent the on-going execution of the work item from requeueing
      itself.
      
      try_to_grab_pending() can always grab PENDING bit without blocking
      except when someone else is doing the above flushing during
      cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
      this case, __cancel_work_timer() currently invokes flush_work().  The
      assumption is that the completion of the work item is what the other
      canceling task would be waiting for too and thus waiting for the same
      condition and retrying should allow forward progress without excessive
      busy looping
      
      Unfortunately, this doesn't work if preemption is disabled or the
      latter task has real time priority.  Let's say task A just got woken
      up from flush_work() by the completion of the target work item.  If,
      before task A starts executing, task B gets scheduled and invokes
      __cancel_work_timer() on the same work item, its try_to_grab_pending()
      will return -ENOENT as the work item is still being canceled by task A
      and flush_work() will also immediately return false as the work item
      is no longer executing.  This puts task B in a busy loop possibly
      preventing task A from executing and clearing the canceling state on
      the work item leading to a hang.
      
      task A			task B			worker
      
      						executing work
      __cancel_work_timer()
        try_to_grab_pending()
        set work CANCELING
        flush_work()
          block for work completion
      						completion, wakes up A
      			__cancel_work_timer()
      			while (forever) {
      			  try_to_grab_pending()
      			    -ENOENT as work is being canceled
      			  flush_work()
      			    false as work is no longer executing
      			}
      
      This patch removes the possible hang by updating __cancel_work_timer()
      to explicitly wait for clearing of CANCELING rather than invoking
      flush_work() after try_to_grab_pending() fails with -ENOENT.
      
      Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
      
      v3: bit_waitqueue() can't be used for work items defined in vmalloc
          area.  Switched to custom wake function which matches the target
          work item and exclusive wait and wakeup.
      
      v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
          the target bit waitqueue has wait_bit_queue's on it.  Use
          DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
          Vizoso.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
      Tested-by: default avatarJesper Nilsson <jesper.nilsson@axis.com>
      Tested-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0d4bfc4d
    • Jason Low's avatar
      cpuset: Fix cpuset sched_relax_domain_level · 0ca8b4fa
      Jason Low authored
      commit 283cb41f upstream.
      
      The cpuset.sched_relax_domain_level can control how far we do
      immediate load balancing on a system. However, it was found on recent
      kernels that echo'ing a value into cpuset.sched_relax_domain_level
      did not reduce any immediate load balancing.
      
      The reason this occurred was because the update_domain_attr_tree() traversal
      did not update for the "top_cpuset". This resulted in nothing being changed
      when modifying the sched_relax_domain_level parameter.
      
      This patch is able to address that problem by having update_domain_attr_tree()
      allow updates for the root in the cpuset traversal.
      
      Fixes: fc560a26 ("cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()")
      Signed-off-by: default avatarJason Low <jason.low2@hp.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Tested-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0ca8b4fa
  2. 06 Apr, 2015 30 commits
    • Jiri Pirko's avatar
      team: don't traverse port list using rcu in team_set_mac_address · 3b78d504
      Jiri Pirko authored
      [ Upstream commit 9215f437 ]
      
      Currently the list is traversed using rcu variant. That is not correct
      since dev_set_mac_address can be called which eventually calls
      rtmsg_ifinfo_build_skb and there, skb allocation can sleep. So fix this
      by remove the rcu usage here.
      
      Fixes: 3d249d4c "net: introduce ethernet teaming device"
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3b78d504
    • Lorenzo Colitti's avatar
      net: ping: Return EAFNOSUPPORT when appropriate. · 80437c5a
      Lorenzo Colitti authored
      [ Upstream commit 9145736d ]
      
      1. For an IPv4 ping socket, ping_check_bind_addr does not check
         the family of the socket address that's passed in. Instead,
         make it behave like inet_bind, which enforces either that the
         address family is AF_INET, or that the family is AF_UNSPEC and
         the address is 0.0.0.0.
      2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL
         if the socket family is not AF_INET6. Return EAFNOSUPPORT
         instead, for consistency with inet6_bind.
      3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT
         instead of EINVAL if an incorrect socket address structure is
         passed in.
      4. Make IPv6 ping sockets be IPv6-only. The code does not support
         IPv4, and it cannot easily be made to support IPv4 because
         the protocol numbers for ICMP and ICMPv6 are different. This
         makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead
         of making the socket unusable.
      
      Among other things, this fixes an oops that can be triggered by:
      
          int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_ICMP);
          struct sockaddr_in6 sin6 = {
              .sin6_family = AF_INET6,
              .sin6_addr = in6addr_any,
          };
          bind(s, (struct sockaddr *) &sin6, sizeof(sin6));
      
      Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241
      Signed-off-by: default avatarLorenzo Colitti <lorenzo@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      80437c5a
    • Michal Kubeček's avatar
      udp: only allow UFO for packets from SOCK_DGRAM sockets · c2625358
      Michal Kubeček authored
      [ Upstream commit acf8dd0a ]
      
      If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a
      UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to
      CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer
      checksum is to be computed on segmentation. However, in this case,
      skb->csum_start and skb->csum_offset are never set as raw socket
      transmit path bypasses udp_send_skb() where they are usually set. As a
      result, driver may access invalid memory when trying to calculate the
      checksum and store the result (as observed in virtio_net driver).
      
      Moreover, the very idea of modifying the userspace provided UDP header
      is IMHO against raw socket semantics (I wasn't able to find a document
      clearly stating this or the opposite, though). And while allowing
      CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit
      too intrusive change just to handle a corner case like this. Therefore
      disallowing UFO for packets from SOCK_DGRAM seems to be the best option.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c2625358
    • Ben Shelton's avatar
      usb: plusb: Add support for National Instruments host-to-host cable · 5575a389
      Ben Shelton authored
      [ Upstream commit 42c972a1 ]
      
      The National Instruments USB Host-to-Host Cable is based on the Prolific
      PL-25A1 chipset.  Add its VID/PID so the plusb driver will recognize it.
      Signed-off-by: default avatarBen Shelton <ben.shelton@ni.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      5575a389
    • Eric Dumazet's avatar
      macvtap: make sure neighbour code can push ethernet header · 295817aa
      Eric Dumazet authored
      [ Upstream commit 2f1d8b9e ]
      
      Brian reported crashes using IPv6 traffic with macvtap/veth combo.
      
      I tracked the crashes in neigh_hh_output()
      
      -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
      
      Neighbour code assumes headroom to push Ethernet header is
      at least 16 bytes.
      
      It appears macvtap has only 14 bytes available on arches
      where NET_IP_ALIGN is 0 (like x86)
      
      Effect is a corruption of 2 bytes right before skb->head,
      and possible crashes if accessing non existing memory.
      
      This fix should also increase IPv4 performance, as paranoid code
      in ip_finish_output2() wont have to call skb_realloc_headroom()
      Reported-by: default avatarBrian Rak <brak@vultr.com>
      Tested-by: default avatarBrian Rak <brak@vultr.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      295817aa
    • Matthew Thode's avatar
      net: reject creation of netdev names with colons · 8b876126
      Matthew Thode authored
      [ Upstream commit a4176a93 ]
      
      colons are used as a separator in netdev device lookup in dev_ioctl.c
      
      Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME
      Signed-off-by: default avatarMatthew Thode <mthode@mthode.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8b876126
    • Ignacy Gawędzki's avatar
      ematch: Fix auto-loading of ematch modules. · 82664ff9
      Ignacy Gawędzki authored
      [ Upstream commit 34eea79e ]
      
      In tcf_em_validate(), after calling request_module() to load the
      kind-specific module, set em->ops to NULL before returning -EAGAIN, so
      that module_put() is not called again by tcf_em_tree_destroy().
      Signed-off-by: default avatarIgnacy Gawędzki <ignacy.gawedzki@green-communications.fr>
      Acked-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      82664ff9
    • Alexander Drozdov's avatar
      ipv4: ip_check_defrag should not assume that skb_network_offset is zero · 65135a40
      Alexander Drozdov authored
      [ Upstream commit 3e32e733 ]
      
      ip_check_defrag() may be used by af_packet to defragment outgoing packets.
      skb_network_offset() of af_packet's outgoing packets is not zero.
      Signed-off-by: default avatarAlexander Drozdov <al.drozdov@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      65135a40
    • Ignacy Gawędzki's avatar
      gen_stats.c: Duplicate xstats buffer for later use · 2ef0e288
      Ignacy Gawędzki authored
      [ Upstream commit 1c4cff0c ]
      
      The gnet_stats_copy_app() function gets called, more often than not, with its
      second argument a pointer to an automatic variable in the caller's stack.
      Therefore, to avoid copying garbage afterwards when calling
      gnet_stats_finish_copy(), this data is better copied to a dynamically allocated
      memory that gets freed after use.
      
      [xiyou.wangcong@gmail.com: remove a useless kfree()]
      Signed-off-by: default avatarIgnacy Gawędzki <ignacy.gawedzki@green-communications.fr>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      2ef0e288
    • WANG Cong's avatar
      rtnetlink: call ->dellink on failure when ->newlink exists · ffd99bd6
      WANG Cong authored
      [ Upstream commit 7afb8886 ]
      
      Ignacy reported that when eth0 is down and add a vlan device
      on top of it like:
      
        ip link add link eth0 name eth0.1 up type vlan id 1
      
      We will get a refcount leak:
      
        unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2
      
      The problem is when rtnl_configure_link() fails in rtnl_newlink(),
      we simply call unregister_device(), but for stacked device like vlan,
      we almost do nothing when we unregister the upper device, more work
      is done when we unregister the lower device, so call its ->dellink().
      Reported-by: default avatarIgnacy Gawedzki <ignacy.gawedzki@green-communications.fr>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ffd99bd6
    • Daniel Borkmann's avatar
      rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY · 0a9ab344
      Daniel Borkmann authored
      [ Upstream commit 364d5716 ]
      
      ifla_vf_policy[] is wrong in advertising its individual member types as
      NLA_BINARY since .type = NLA_BINARY in combination with .len declares the
      len member as *max* attribute length [0, len].
      
      The issue is that when do_setvfinfo() is being called to set up a VF
      through ndo handler, we could set corrupted data if the attribute length
      is less than the size of the related structure itself.
      
      The intent is exactly the opposite, namely to make sure to pass at least
      data of minimum size of len.
      
      Fixes: ebc08a6f ("rtnetlink: Add VF config code to rtnetlink")
      Cc: Mitch Williams <mitch.a.williams@intel.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarThomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0a9ab344
    • Catalin Marinas's avatar
      net: compat: Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg · 1b38ff79
      Catalin Marinas authored
      commit d720d8ce upstream.
      
      With commit a7526eb5 (net: Unbreak compat_sys_{send,recv}msg), the
      MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points,
      changing the kernel compat behaviour from the one before the commit it
      was trying to fix (1be374a0, net: Block MSG_CMSG_COMPAT in
      send(m)msg and recv(m)msg).
      
      On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native
      32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by
      the kernel). However, on a 64-bit kernel, the compat ABI is different
      with commit a7526eb5.
      
      This patch changes the compat_sys_{send,recv}msg behaviour to the one
      prior to commit 1be374a0.
      
      The problem was found running 32-bit LTP (sendmsg01) binary on an arm64
      kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg()
      but the general rule is not to break user ABI (even when the user
      behaviour is not entirely sane).
      
      Fixes: a7526eb5 (net: Unbreak compat_sys_{send,recv}msg)
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1b38ff79
    • Jiri Pirko's avatar
      team: fix possible null pointer dereference in team_handle_frame · c8fc8473
      Jiri Pirko authored
      commit 57e59563 upstream.
      
      Currently following race is possible in team:
      
      CPU0                                        CPU1
                                                  team_port_del
                                                    team_upper_dev_unlink
                                                      priv_flags &= ~IFF_TEAM_PORT
      team_handle_frame
        team_port_get_rcu
          team_port_exists
            priv_flags & IFF_TEAM_PORT == 0
          return NULL (instead of port got
                       from rx_handler_data)
                                                    netdev_rx_handler_unregister
      
      The thing is that the flag is removed before rx_handler is unregistered.
      If team_handle_frame is called in between, team_port_exists returns 0
      and team_port_get_rcu will return NULL.
      So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
      that team_handle_frame will always see valid rx_handler_data pointer.
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Fixes: 3d249d4c ("net: introduce ethernet teaming device")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c8fc8473
    • Pravin B Shelar's avatar
      openvswitch: Fix net exit. · 429a7f3d
      Pravin B Shelar authored
      commit 7b4577a9 upstream.
      
      Open vSwitch allows moving internal vport to different namespace
      while still connected to the bridge. But when namespace deleted
      OVS does not detach these vports, that results in dangling
      pointer to netdevice which causes kernel panic as follows.
      This issue is fixed by detaching all ovs ports from the deleted
      namespace at net-exit.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
      IP: [<ffffffffa0aadaa5>] ovs_vport_locate+0x35/0x80 [openvswitch]
      Oops: 0000 [#1] SMP
      Call Trace:
       [<ffffffffa0aa6391>] lookup_vport+0x21/0xd0 [openvswitch]
       [<ffffffffa0aa65f9>] ovs_vport_cmd_get+0x59/0xf0 [openvswitch]
       [<ffffffff8167e07c>] genl_family_rcv_msg+0x1bc/0x3e0
       [<ffffffff8167e319>] genl_rcv_msg+0x79/0xc0
       [<ffffffff8167d919>] netlink_rcv_skb+0xb9/0xe0
       [<ffffffff8167deac>] genl_rcv+0x2c/0x40
       [<ffffffff8167cffd>] netlink_unicast+0x12d/0x1c0
       [<ffffffff8167d3da>] netlink_sendmsg+0x34a/0x6b0
       [<ffffffff8162e140>] sock_sendmsg+0xa0/0xe0
       [<ffffffff8162e5e8>] ___sys_sendmsg+0x408/0x420
       [<ffffffff8162f541>] __sys_sendmsg+0x51/0x90
       [<ffffffff8162f592>] SyS_sendmsg+0x12/0x20
       [<ffffffff81764ee9>] system_call_fastpath+0x12/0x17
      Reported-by: default avatarAssaf Muller <amuller@redhat.com>
      Fixes: 46df7b81("openvswitch: Add support for network namespaces.")
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      Reviewed-by: default avatarThomas Graf <tgraf@noironetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      429a7f3d
    • Guenter Roeck's avatar
      net: phy: Fix verification of EEE support in phy_init_eee · 2be68df0
      Guenter Roeck authored
      commit 54da5a8b upstream.
      
      phy_init_eee uses phy_find_setting(phydev->speed, phydev->duplex)
      to find a valid entry in the settings array for the given speed
      and duplex value. For full duplex 1000baseT, this will return
      the first matching entry, which is the entry for 1000baseKX_Full.
      
      If the phy eee does not support 1000baseKX_Full, this entry will not
      match, causing phy_init_eee to fail for no good reason.
      
      Fixes: 9a9c56cb ("net: phy: fix a bug when verify the EEE support")
      Fixes: 3e707706 ("phy: Expand phy speed/duplex settings array")
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ kamal: backport to 3.13-stable: context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      2be68df0
    • Alexander Drozdov's avatar
      ipv4: ip_check_defrag should correctly check return value of skb_copy_bits · c43c36f7
      Alexander Drozdov authored
      commit fba04a9e upstream.
      
      skb_copy_bits() returns zero on success and negative value on error,
      so it is needed to invert the condition in ip_check_defrag().
      
      Fixes: 1bf3751e ("ipv4: ip_check_defrag must not modify skb before unsharing")
      Signed-off-by: default avatarAlexander Drozdov <al.drozdov@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c43c36f7
    • David Ramos's avatar
      svcrpc: fix memory leak in gssp_accept_sec_context_upcall · 2c5200bf
      David Ramos authored
      commit a1d1e9be upstream.
      
      Our UC-KLEE tool found a kernel memory leak of 512 bytes (on x86_64) for
      each call to gssp_accept_sec_context_upcall()
      (net/sunrpc/auth_gss/gss_rpc_upcall.c). Since it appears that this call
      can be triggered by remote connections (at least, from a cursory a
      glance at the call chain), it may be exploitable to cause kernel memory
      exhaustion. We found the bug in kernel 3.16.3, but it appears to date
      back to commit 9dfd87da (2013-08-20).
      
      The gssp_accept_sec_context_upcall() function performs a pair of calls
      to gssp_alloc_receive_pages() and gssp_free_receive_pages().  The first
      allocates memory for arg->pages.  The second then frees the pages
      pointed to by the arg->pages array, but not the array itself.
      Reported-by: default avatarDavid A. Ramos <daramos@stanford.edu>
      Fixes: 9dfd87da ("rpc: fix huge kmalloc's in gss-proxy”)
      Signed-off-by: default avatarDavid A. Ramos <daramos@stanford.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      2c5200bf
    • Eric Dumazet's avatar
      netfilter: xt_socket: fix a stack corruption bug · 667d697a
      Eric Dumazet authored
      commit 78296c97 upstream.
      
      As soon as extract_icmp6_fields() returns, its local storage (automatic
      variables) is deallocated and can be overwritten.
      
      Lets add an additional parameter to make sure storage is valid long
      enough.
      
      While we are at it, adds some const qualifiers.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Fixes: b64c9256 ("tproxy: added IPv6 support to the socket match")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      667d697a
    • Al Viro's avatar
      sunrpc: fix braino in ->poll() · ed10228f
      Al Viro authored
      commit 1711fd9a upstream.
      
      POLL_OUT isn't what callers of ->poll() are expecting to see; it's
      actually __SI_POLL | 2 and it's a siginfo code, not a poll bitmap
      bit...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Bruce Fields <bfields@fieldses.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ed10228f
    • Johan Hovold's avatar
      TTY: fix tty_wait_until_sent on 64-bit machines · 16f581e0
      Johan Hovold authored
      commit 79fbf4a5 upstream.
      
      Fix overflow bug in tty_wait_until_sent on 64-bit machines, where an
      infinite timeout (0) would be passed to the underlying tty-driver's
      wait_until_sent-operation as a negative timeout (-1), causing it to
      return immediately.
      
      This manifests itself for example as tcdrain() returning immediately,
      drivers not honouring the drain flags when setting terminal attributes,
      or even dropped data on close as a requested infinite closing-wait
      timeout would be ignored.
      
      The first symptom  was reported by Asier LLANO who noted that tcdrain()
      returned prematurely when using the ftdi_sio usb-serial driver.
      
      Fix this by passing 0 rather than MAX_SCHEDULE_TIMEOUT (LONG_MAX) to the
      underlying tty driver.
      
      Note that the serial-core wait_until_sent-implementation is not affected
      by this bug due to a lucky chance (comparison to an unsigned maximum
      timeout), and neither is the cyclades one that had an explicit check for
      negative timeouts, but all other tty drivers appear to be affected.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarZIV-Asier Llano Palacios <asier.llano@cgglobal.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Reviewed-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      16f581e0
    • Johan Hovold's avatar
      USB: serial: fix infinite wait_until_sent timeout · 4d140681
      Johan Hovold authored
      commit f528bf4f upstream.
      
      Make sure to handle an infinite timeout (0).
      
      Note that wait_until_sent is currently never called with a 0-timeout
      argument due to a bug in tty_wait_until_sent.
      
      Fixes: dcf01050 ("USB: serial: add generic wait_until_sent
      implementation")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4d140681
    • Johan Hovold's avatar
      net: irda: fix wait_until_sent poll timeout · 51431c26
      Johan Hovold authored
      commit 2c3fbe3c upstream.
      
      In case an infinite timeout (0) is requested, the irda wait_until_sent
      implementation would use a zero poll timeout rather than the default
      200ms.
      
      Note that wait_until_sent is currently never called with a 0-timeout
      argument due to a bug in tty_wait_until_sent.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      51431c26
    • Peter Hurley's avatar
      console: Fix console name size mismatch · 51efbdc1
      Peter Hurley authored
      commit 30a22c21 upstream.
      
      commit 6ae9200f ("enlarge console.name") increased the storage
      for the console name to 16 bytes, but not the corresponding
      struct console_cmdline::name storage. Console names longer than
      8 bytes cause read beyond end-of-string and failure to match
      console; I'm not sure if there are other unexpected consequences.
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      51efbdc1
    • Jiri Slaby's avatar
      tty: fix up atime/mtime mess, take four · 90c9ba22
      Jiri Slaby authored
      commit f0bf0bd0 upstream.
      
      This problem was taken care of three times already in
      * b0de59b5 (TTY: do not update
        atime/mtime on read/write),
      * 37b7f3c7 (TTY: fix atime/mtime
        regression), and
      * b0b88565 (tty: fix up atime/mtime
        mess, take three)
      
      But it still misses one point. As John Paul correctly points out, we
      do not care about setting date. If somebody ever changes wall
      time backwards (by mistake for example), tty timestamps are never
      updated until the original wall time passes.
      
      So check the absolute difference of times and if it large than "8
      seconds or so", always update the time. That means we will update
      immediatelly when changing time. Ergo, CAP_SYS_TIME can foul the
      check, but it was always that way.
      
      Thanks John for serving me this so nicely debugged.
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Reported-by: default avatarJohn Paul Perry <john_paul.perry@alcatel-lucent.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      90c9ba22
    • Russell King's avatar
      Change email address for 8250_pci · 46c175b1
      Russell King authored
      commit f2e0ea86 upstream.
      
      I'm still receiving reports to my email address, so let's point this
      at the linux-serial mailing list instead.
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      46c175b1
    • Mathias Nyman's avatar
      xhci: Workaround for PME stuck issues in Intel xhci · 57b66723
      Mathias Nyman authored
      commit b8cb91e0 upstream.
      
      The xhci in Intel Sunrisepoint and Cherryview platforms need a driver
      workaround for a Stuck PME that might either block PME events in suspend,
      or create spurious PME events preventing runtime suspend.
      
      Workaround is to clear a internal PME flag, BIT(28) in a vendor specific
      PMCTRL register at offset 0x80a4, in both suspend resume callbacks
      
      Without this, xhci connected usb devices might never be able to wake up the
      system from suspend, or prevent device from going to suspend (xhci d3)
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      57b66723
    • Aleksander Morgado's avatar
      xhci: fix reporting of 0-sized URBs in control endpoint · d6784342
      Aleksander Morgado authored
      commit 45ba2154 upstream.
      
      When a control transfer has a short data stage, the xHCI controller generates
      two transfer events: a COMP_SHORT_TX event that specifies the untransferred
      amount, and a COMP_SUCCESS event. But when the data stage is not short, only the
      COMP_SUCCESS event occurs. Therefore, xhci-hcd must set urb->actual_length to
      urb->transfer_buffer_length while processing the COMP_SUCCESS event, unless
      urb->actual_length was set already by a previous COMP_SHORT_TX event.
      
      The driver checks this by seeing whether urb->actual_length == 0, but this alone
      is the wrong test, as it is entirely possible for a short transfer to have an
      urb->actual_length = 0.
      
      This patch changes the xhci driver to rely on a new td->urb_length_set flag,
      which is set to true when a COMP_SHORT_TX event is received and the URB length
      updated at that stage.
      
      This fixes a bug which affected the HSO plugin, which relies on URBs with
      urb->actual_length == 0 to halt re-submitting the RX URB in the control
      endpoint.
      Signed-off-by: default avatarAleksander Morgado <aleksander@aleksander.es>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d6784342
    • Quentin Casasnovas's avatar
      Btrfs:__add_inode_ref: out of bounds memory read when looking for extended ref. · e64d0b15
      Quentin Casasnovas authored
      commit dd9ef135 upstream.
      
      Improper arithmetics when calculting the address of the extended ref could
      lead to an out of bounds memory read and kernel panic.
      Signed-off-by: default avatarQuentin Casasnovas <quentin.casasnovas@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.cz>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e64d0b15
    • Filipe Manana's avatar
      Btrfs: fix data loss in the fast fsync path · 774a162f
      Filipe Manana authored
      commit 3a8b36f3 upstream.
      
      When using the fast file fsync code path we can miss the fact that new
      writes happened since the last file fsync and therefore return without
      waiting for the IO to finish and write the new extents to the fsync log.
      
      Here's an example scenario where the fsync will miss the fact that new
      file data exists that wasn't yet durably persisted:
      
      1. fs_info->last_trans_committed == N - 1 and current transaction is
         transaction N (fs_info->generation == N);
      
      2. do a buffered write;
      
      3. fsync our inode, this clears our inode's full sync flag, starts
         an ordered extent and waits for it to complete - when it completes
         at btrfs_finish_ordered_io(), the inode's last_trans is set to the
         value N (via btrfs_update_inode_fallback -> btrfs_update_inode ->
         btrfs_set_inode_last_trans);
      
      4. transaction N is committed, so fs_info->last_trans_committed is now
         set to the value N and fs_info->generation remains with the value N;
      
      5. do another buffered write, when this happens btrfs_file_write_iter
         sets our inode's last_trans to the value N + 1 (that is
         fs_info->generation + 1 == N + 1);
      
      6. transaction N + 1 is started and fs_info->generation now has the
         value N + 1;
      
      7. transaction N + 1 is committed, so fs_info->last_trans_committed
         is set to the value N + 1;
      
      8. fsync our inode - because it doesn't have the full sync flag set,
         we only start the ordered extent, we don't wait for it to complete
         (only in a later phase) therefore its last_trans field has the
         value N + 1 set previously by btrfs_file_write_iter(), and so we
         have:
      
             inode->last_trans <= fs_info->last_trans_committed
                 (N + 1)              (N + 1)
      
         Which made us not log the last buffered write and exit the fsync
         handler immediately, returning success (0) to user space and resulting
         in data loss after a crash.
      
      This can actually be triggered deterministically and the following excerpt
      from a testcase I made for xfstests triggers the issue. It moves a dummy
      file across directories and then fsyncs the old parent directory - this
      is just to trigger a transaction commit, so moving files around isn't
      directly related to the issue but it was chosen because running 'sync' for
      example does more than just committing the current transaction, as it
      flushes/waits for all file data to be persisted. The issue can also happen
      at random periods, since the transaction kthread periodicaly commits the
      current transaction (about every 30 seconds by default).
      The body of the test is:
      
        _scratch_mkfs >> $seqres.full 2>&1
        _init_flakey
        _mount_flakey
      
        # Create our main test file 'foo', the one we check for data loss.
        # By doing an fsync against our file, it makes btrfs clear the 'needs_full_sync'
        # bit from its flags (btrfs inode specific flags).
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 8K" \
                        -c "fsync" $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Now create one other file and 2 directories. We will move this second file
        # from one directory to the other later because it forces btrfs to commit its
        # currently open transaction if we fsync the old parent directory. This is
        # necessary to trigger the data loss bug that affected btrfs.
        mkdir $SCRATCH_MNT/testdir_1
        touch $SCRATCH_MNT/testdir_1/bar
        mkdir $SCRATCH_MNT/testdir_2
      
        # Make sure everything is durably persisted.
        sync
      
        # Write more 8Kb of data to our file.
        $XFS_IO_PROG -c "pwrite -S 0xbb 8K 8K" $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Move our 'bar' file into a new directory.
        mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar
      
        # Fsync our first directory. Because it had a file moved into some other
        # directory, this made btrfs commit the currently open transaction. This is
        # a condition necessary to trigger the data loss bug.
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1
      
        # Now fsync our main test file. If the fsync succeeds, we expect the 8Kb of
        # data we wrote previously to be persisted and available if a crash happens.
        # This did not happen with btrfs, because of the transaction commit that
        # happened when we fsynced the parent directory.
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
      
        # Simulate a crash/power loss.
        _load_flakey_table $FLAKEY_DROP_WRITES
        _unmount_flakey
      
        _load_flakey_table $FLAKEY_ALLOW_WRITES
        _mount_flakey
      
        # Now check that all data we wrote before are available.
        echo "File content after log replay:"
        od -t x1 $SCRATCH_MNT/foo
      
        status=0
        exit
      
      The expected golden output for the test, which is what we get with this
      fix applied (or when running against ext3/4 and xfs), is:
      
        wrote 8192/8192 bytes at offset 0
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        wrote 8192/8192 bytes at offset 8192
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        File content after log replay:
        0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
        *
        0020000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
        *
        0040000
      
      Without this fix applied, the output shows the test file does not have
      the second 8Kb extent that we successfully fsynced:
      
        wrote 8192/8192 bytes at offset 0
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        wrote 8192/8192 bytes at offset 8192
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        File content after log replay:
        0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
        *
        0020000
      
      So fix this by skipping the fsync only if we're doing a full sync and
      if the inode's last_trans is <= fs_info->last_trans_committed, or if
      the inode is already in the log. Also remove setting the inode's
      last_trans in btrfs_file_write_iter since it's useless/unreliable.
      
      Also because btrfs_file_write_iter no longer sets inode->last_trans to
      fs_info->generation + 1, don't set last_trans to 0 if we bail out and don't
      bail out if last_trans is 0, otherwise something as simple as the following
      example wouldn't log the second write on the last fsync:
      
        1. write to file
      
        2. fsync file
      
        3. fsync file
             |--> btrfs_inode_in_log() returns true and it set last_trans to 0
      
        4. write to file
             |--> btrfs_file_write_iter() no longers sets last_trans, so it
                  remained with a value of 0
        5. fsync
             |--> inode->last_trans == 0, so it bails out without logging the
                  second write
      
      A test case for xfstests will be sent soon.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      774a162f
    • Andy Lutomirski's avatar
      x86/asm/entry/64: Remove a bogus 'ret_from_fork' optimization · 0a0cdf2c
      Andy Lutomirski authored
      commit 956421fb upstream.
      
      'ret_from_fork' checks TIF_IA32 to determine whether 'pt_regs' and
      the related state make sense for 'ret_from_sys_call'.  This is
      entirely the wrong check.  TS_COMPAT would make a little more
      sense, but there's really no point in keeping this optimization
      at all.
      
      This fixes a return to the wrong user CS if we came from int
      0x80 in a 64-bit task.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4710be56d76ef994ddf59087aad98c000fbab9a4.1424989793.git.luto@amacapital.net
      [ Backported from tip:x86/asm. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0a0cdf2c