1. 12 Dec, 2013 10 commits
  2. 08 Dec, 2013 30 commits
    • Linux 3.10.23 · 184c20bb
      Greg Kroah-Hartman authored
    • drm/radeon/audio: correct ACR table · 84a0264e
      Pierre Ossman authored
      commit 3e71985f upstream.
      
      The values were taken from the HDMI spec, but they assumed
      exact x/1.001 clocks. Since we round the clocks, we also need
      to calculate different N and CTS values.
      
      Note that the N for 25.2/1.001 MHz at 44.1 kHz audio is out of
      spec. Hopefully this mode is rarely used and/or HDMI sinks
      tolerate overly large values of N.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=69675
      Signed-off-by: Pierre Ossman <pierre@ossman.eu>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/radeon/audio: improve ACR calculation · f789de21
      Pierre Ossman authored
      commit a2098250 upstream.
      
      In order to have any realistic chance of calculating proper
      ACR values, we need to be able to calculate both N and CTS,
      not just CTS. We still aim for the ideal N as specified in
      the HDMI spec though.
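
As a rough illustration of why N and CTS have to be chosen together, the sketch below applies the HDMI audio clock regeneration relation 128 * fs = f_TMDS * N / CTS. The helper name `acr_cts` is hypothetical and is not the radeon driver's code; it only shows the arithmetic, with integer division dropping any fractional part.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the driver's implementation): derive CTS from
 * the relation 128 * fs = f_tmds * N / CTS, i.e. CTS = f_tmds * N / (128 * fs).
 * Integer division truncates, which is where rounded clocks lose exactness.
 */
static inline uint32_t acr_cts(uint32_t tmds_hz, uint32_t n, uint32_t fs_hz)
{
    /* widen to 64 bits so tmds_hz * n cannot overflow */
    return (uint32_t)(((uint64_t)tmds_hz * n) / (128ULL * fs_hz));
}
```

For a 25.2 MHz clock, `acr_cts(25200000, 6144, 48000)` gives 25200 and `acr_cts(25200000, 6272, 44100)` gives 28000. When the clock is not an exact multiple, the truncated CTS no longer satisfies the relation for the ideal N, so a different N may be needed.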
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=69675
      Signed-off-by: Pierre Ossman <pierre@ossman.eu>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ntp: Make periodic RTC update more reliable · 9baca2ff
      Miroslav Lichvar authored
      commit a97ad0c4 upstream.
      
      The current code requires that the scheduled update of the RTC happens
      in the closest tick to the half of the second. This seems to be
      difficult to achieve reliably. The scheduled work may be missing the
      target time by a tick or two and be constantly rescheduled every second.
      
      Relax the limit to 10 ticks. As a typical RTC drifts in the 11-minute
      update interval by several milliseconds, this shouldn't affect the
      overall accuracy of the RTC much.
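
A minimal sketch of the kind of check described above, assuming the limit is a symmetric window around the half-second mark. The function name and the `slack_ns` parameter are hypothetical; the kernel's actual code is not reproduced here.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical check: is the sub-second part of the current time within
 * slack_ns of the half-second mark? A larger slack_ns makes it more likely
 * that a slightly late work item still qualifies.
 */
static inline bool near_half_second(long tv_nsec, long slack_ns)
{
    return labs(tv_nsec - NSEC_PER_SEC / 2) <= slack_ns;
}
```

Assuming a 1 ms tick, a work item that runs 3 ms late (`tv_nsec = 503000000`) fails a half-tick window (`slack_ns = 500000`) but passes a window of five ticks on either side (`slack_ns = 5000000`).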
      Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • elevator: acquire q->sysfs_lock in elevator_change() · 72b9401c
      Tomoki Sekiyama authored
      commit 7c8a3679 upstream.
      
      Add locking of q->sysfs_lock into elevator_change() (an exported function)
      to ensure it is held to protect q->elevator from elevator_init(), even if
      elevator_change() is called from non-sysfs paths.
      sysfs path (elv_iosched_store) uses __elevator_change(), non-locking
      version, as the lock is already taken by elv_iosched_store().
      Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • elevator: Fix a race in elevator switching and md device initialization · 6d53d392
      Tomoki Sekiyama authored
      commit eb1c160b upstream.
      
      The soft lockup below happens at the boot time of the system using dm
      multipath and the udev rules to switch scheduler.
      
      [  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
      [  356.127001] RIP: 0010:[<ffffffff81072a7d>]  [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50
      ...
      [  356.127001] Call Trace:
      [  356.127001]  [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70
      [  356.127001]  [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230
      [  356.127001]  [<ffffffff810738b2>] del_timer_sync+0x52/0x60
      [  356.127001]  [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0
      [  356.127001]  [<ffffffff812c98df>] elevator_exit+0x2f/0x50
      [  356.127001]  [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0
      [  356.127001]  [<ffffffff812caa50>] elv_iosched_store+0x20/0x50
      [  356.127001]  [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0
      [  356.127001]  [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140
      [  356.127001]  [<ffffffff811a326d>] vfs_write+0xbd/0x1e0
      [  356.127001]  [<ffffffff811a3ca9>] SyS_write+0x49/0xa0
      [  356.127001]  [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b
      
      This is caused by a race between md device initialization by multipathd and
      shell script to switch the scheduler using sysfs.
      
       - multipathd:
         SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load
         -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init
          q->elevator = elevator_alloc(q, e); // not yet initialized
      
       - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler':
         elevator_switch (in the call trace above)
          struct elevator_queue *old = q->elevator;
          q->elevator = elevator_alloc(q, new_e);
          elevator_exit(old);                 // lockup! (*)
      
       - multipathd: (cont.)
          err = e->ops.elevator_init_fn(q);   // init fails; q->elevator is modified
      
      (*) When del_timer_sync() is called, lock_timer_base() will loop infinitely
      while timer->base == NULL. In this case, as the timer will never be
      initialized, it results in a lockup.
      
      This patch introduces acquisition of q->sysfs_lock around elevator_init()
      into blk_init_allocated_queue(), to provide mutual exclusion between
      initialization of the q->scheduler and switching of the scheduler.
      
      This should fix this bugzilla:
      https://bugzilla.redhat.com/show_bug.cgi?id=902012
      Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu: Remove stack trace from broken irq remapping warning · 64545123
      Neil Horman authored
      commit 05104a4e upstream.
      
      The warning for the irq remapping broken check in intel_irq_remapping.c is
      pretty pointless.  We need the warning, but we know where it's coming from, the
      stack trace will always be the same, and it needlessly triggers things like
      Abrt.  This changes the warning to just print a text warning about BIOS being
      broken, without the stack trace, then sets the appropriate taint bit.  Since we
      automatically disable irq remapping, there's no need to continue making Abrt jump
      at this problem.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Joerg Roedel <joro@8bytes.org>
      CC: Bjorn Helgaas <bhelgaas@google.com>
      CC: Andy Lutomirski <luto@amacapital.net>
      CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iommu/vt-d: Fixed interaction of VFIO_IOMMU_MAP_DMA with IOMMU address limits · 3de762b3
      Julian Stecklina authored
      commit f9423606 upstream.
      
      The BUG_ON in drivers/iommu/intel-iommu.c:785 can be triggered from userspace via
      VFIO by calling the VFIO_IOMMU_MAP_DMA ioctl on a vfio device with any address
      beyond the addressing capabilities of the IOMMU. The problem is that the ioctl code
      calls iommu_iova_to_phys before it calls iommu_map. iommu_map handles the case that
      it gets addresses beyond the addressing capabilities of its IOMMU.
      intel_iommu_iova_to_phys does not.
      
      This patch fixes iommu_iova_to_phys to return NULL for addresses beyond what the
      IOMMU can handle. This in turn causes the ioctl call to fail in iommu_map and
      (correctly) return EFAULT to the user with a helpful warning message in the kernel
      log.
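
The idea of rejecting out-of-range addresses in the lookup path, rather than hitting an assertion, can be sketched as follows. All names here are hypothetical and the "lookup" is a placeholder; this is not the intel-iommu code, only the shape of the bounds check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range check: does iova fit within addr_bits bits? */
static inline bool iova_in_range(uint64_t iova, unsigned int addr_bits)
{
    if (addr_bits >= 64)
        return true;                 /* every 64-bit value fits */
    return (iova >> addr_bits) == 0; /* no bits set above the limit */
}

/*
 * Toy lookup: returns 0 ("no mapping") for addresses the hardware cannot
 * express instead of asserting. The identity return stands in for a real
 * page-table walk.
 */
static inline uint64_t toy_iova_to_phys(uint64_t iova, unsigned int addr_bits)
{
    if (!iova_in_range(iova, addr_bits))
        return 0;
    return iova;
}
```

With a 48-bit limit, `toy_iova_to_phys(1ULL << 48, 48)` returns 0, so the caller proceeds to the mapping call, which is the place that reports the error.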
      Signed-off-by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • HID: lg: fix Report Descriptor for Logitech MOMO Force (Black) · 7dc39b55
      Simon Wood authored
      commit 348cbaa8 upstream.
      
      By default the Logitech MOMO Force (Black) presents a combined accel/brake
      axis ('Y'). This patch modifies the HID descriptor to present separate
      accel/brake axes ('Y' and 'Z').
      Signed-off-by: Simon Wood <simon@mungewell.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • video: kyro: fix incorrect sizes when copying to userspace · ca77581e
      Sasha Levin authored
      commit 2ab68ec9 upstream.
      
      kyro would copy u32s and specify sizeof(unsigned long) as the size to copy.
      
      This would copy more data than intended and cause memory corruption and might
      leak kernel memory.
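
The size mismatch can be illustrated in userspace. In this sketch `memcpy()` stands in for the kernel's copy-to-user call, and `copy_u32_out` is a hypothetical helper; the point is only that the copy length is derived from the object being copied.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy one 32-bit value into a caller-supplied buffer.
 * Using sizeof(*src) ties the length to the source object (4 bytes),
 * whereas sizeof(unsigned long) would be 8 on typical 64-bit systems
 * and would read and write 4 bytes too many.
 */
static inline void copy_u32_out(void *dst, const uint32_t *src)
{
    memcpy(dst, src, sizeof(*src));
}
```

Filling an 8-byte destination with a marker pattern before the call and checking that the last four bytes are unchanged afterwards is a simple way to confirm that nothing beyond the 32-bit value was written.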
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: numa: return the number of base pages altered by protection changes · f99510dc
      Mel Gorman authored
      commit 72403b4a upstream.
      
      Commit 0255d491 ("mm: Account for a THP NUMA hinting update as one
      PTE update") was added to account for the number of PTE updates when
      marking pages prot_numa.  task_numa_work was using the old return value
      to track how much address space had been updated.  Altering the return
      value causes the scanner to do more work than it is configured or
      documented to in a single unit of work.
      
      This patch reverts that commit and accounts for the number of THP
      updates separately in vmstat.  It is up to the administrator to
      interpret the pair of values correctly.  This is a straight-forward
      operation and likely to only be of interest when actively debugging NUMA
      balancing problems.
      
      The impact of this patch is that the NUMA PTE scanner will scan slower
      when THP is enabled and workloads may converge slower as a result.  On
      the flip side, system CPU usage should be lower than recent tests
      reported.  This is an illustrative example of a short single JVM specjbb
      test
      
      specjbb
                             3.12.0                3.12.0
                            vanilla           acctupdates
      TPut 1      26143.00 (  0.00%)     25747.00 ( -1.51%)
      TPut 7     185257.00 (  0.00%)    183202.00 ( -1.11%)
      TPut 13    329760.00 (  0.00%)    346577.00 (  5.10%)
      TPut 19    442502.00 (  0.00%)    460146.00 (  3.99%)
      TPut 25    540634.00 (  0.00%)    549053.00 (  1.56%)
      TPut 31    512098.00 (  0.00%)    519611.00 (  1.47%)
      TPut 37    461276.00 (  0.00%)    474973.00 (  2.97%)
      TPut 43    403089.00 (  0.00%)    414172.00 (  2.75%)
      
                    3.12.0      3.12.0
                   vanilla acctupdates
      User         5169.64     5184.14
      System        100.45       80.02
      Elapsed       252.75      251.85
      
      Performance is similar but note the reduction in system CPU time.  While
      this showed a performance gain, it will not be universal but at least
      it'll be behaving as documented.  The vmstats are obviously different but
      here is an obvious interpretation of them from mmtests.
      
                                      3.12.0      3.12.0
                                     vanilla acctupdates
      NUMA page range updates        1408326    11043064
      NUMA huge PMD updates                0       21040
      NUMA PTE updates               1408326      291624
      
      "NUMA page range updates" == nr_pte_updates and is the value returned to
      the NUMA pte scanner.  NUMA huge PMD updates were the number of THP
      updates which in combination can be used to calculate how many ptes were
      updated from userspace.
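
The acctupdates column in the table above is consistent with the relation sketched below. This is inferred from the figures only, under two labeled assumptions: 512 base pages per huge page (2 MiB / 4 KiB, as on x86-64), and each huge PMD update already being counted once in "NUMA PTE updates". The function name is hypothetical and this is not taken from mmtests.

```c
#include <stdint.h>

/* Assumption: 512 base pages per huge page (2 MiB / 4 KiB). */
#define BASE_PAGES_PER_HUGE_PAGE 512ULL

/*
 * Combine the two vmstat counters into a "page range updates" figure.
 * Each huge PMD update is assumed to appear once in pte_updates already,
 * so it contributes the remaining 511 base pages.
 */
static inline uint64_t numa_page_range_updates(uint64_t pte_updates,
                                               uint64_t huge_pmd_updates)
{
    return pte_updates + huge_pmd_updates * (BASE_PAGES_PER_HUGE_PAGE - 1);
}
```

With the table's values, 291624 + 21040 * 511 = 11043064, which matches the reported "NUMA page range updates"; with no huge PMD updates the result equals the PTE count, as in the vanilla column.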
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Reported-by: Alex Thorlton <athorlton@sgi.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • clockevents: Prefer CPU local devices over global devices · 7281bb56
      Stephen Boyd authored
      commit 70e5975d upstream.
      
      On an SMP system with only one global clockevent and a dummy
      clockevent per CPU we run into problems. We want the dummy
      clockevents to be registered as the per CPU tick devices, but
      we can only achieve that if we register the dummy clockevents
      before the global clockevent or if we artificially inflate the
      rating of the dummy clockevents to be higher than the rating
      of the global clockevent. Failure to do so leads to boot
      hangs when the dummy timers are registered on all other CPUs
      besides the CPU that accepted the global clockevent as its tick
      device and there is no broadcast timer to poke the dummy
      devices.
      
      If we're registering multiple clockevents and one clockevent is
      global and the other is local to a particular CPU we should
      choose to use the local clockevent regardless of the rating of
      the device. This way, if the clockevent is a dummy it will take
      the tick device duty as long as there isn't a higher rated tick
      device and any global clockevent will be bumped out into
      broadcast mode, fixing the problem described above.
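
The selection rule described above can be sketched as a small comparison function. The struct and function names are hypothetical and the logic is deliberately simplified (it ignores oneshot capability and CPU masks); it is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a clockevent device: only the fields the rule needs. */
struct toy_clockevent {
    int  rating;     /* higher is better */
    bool cpu_local;  /* true if bound to one CPU, false if global */
};

/*
 * Should new_dev replace cur as the per-CPU tick device?
 * Locality wins over rating; rating only breaks ties between
 * devices of the same kind.
 */
static inline bool prefer_new_device(const struct toy_clockevent *cur,
                                     const struct toy_clockevent *new_dev)
{
    if (cur == NULL)
        return true;                       /* nothing installed yet */
    if (new_dev->cpu_local != cur->cpu_local)
        return new_dev->cpu_local;         /* local beats global */
    return new_dev->rating > cur->rating;  /* same kind: higher rating */
}
```

Under this rule a low-rated dummy local device displaces a high-rated global one, regardless of registration order, and the global device is then free to act as the broadcast source.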
      Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Tested-by: soren.brinkmann@xilinx.com
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/20130613183950.GA32061@codeaurora.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kim Phillips <kim.phillips@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • clockevents: Split out selection logic · 9bae8ea0
      Thomas Gleixner authored
      commit 45cb8e01 upstream.
      
      Split out the clockevent device selection logic. Preparatory patch to
      allow unbinding active clockevent devices.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Link: http://lkml.kernel.org/r/20130425143436.431796247@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kim Phillips <kim.phillips@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • clockevents: Add module refcount · 409d4ffa
      Thomas Gleixner authored
      commit ccf33d68 upstream.
      
      We want to be able to remove clockevent modules as well. Add a
      refcount so we don't remove a module with an active clock event
      device.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Link: http://lkml.kernel.org/r/20130425143436.307435149@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kim Phillips <kim.phillips@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • clockevents: Get rid of the notifier chain · e8d63033
      Thomas Gleixner authored
      commit 7172a286 upstream.
      
      7+ years and still a single user. Kill it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Magnus Damm <magnus.damm@gmail.com>
      Link: http://lkml.kernel.org/r/20130425143436.098520211@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kim Phillips <kim.phillips@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • aio: restore locking of ioctx list on removal · f8715e7d
      Mateusz Guzik authored
      Commit 36f55889
      "aio: refcounting cleanup" resulted in ioctx_lock not being held
      during ctx removal, leaving the list susceptible to corruptions.
      
      In mainline kernel the issue went away as a side effect of
      db446a08 "aio: convert the ioctx list to
      table lookup v3".
      
      Fix the problem by restoring appropriate locking.
      Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
      Reported-by: Eryu Guan <eguan@redhat.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: block: fix a bug of error handling in MMC driver · 604ae797
      KOBAYASHI Yoshitake authored
      commit c8760069 upstream.
      
      Current MMC driver doesn't handle generic error (bit19 of device
      status) in write sequence. As a result, write data gets lost when
      generic error occurs. For example, a generic error when updating a
      filesystem management information causes a loss of write data and
      corrupts the filesystem. In the worst case, the system will never
      boot.
      
      This patch includes the following functionality:
        1. To enable error checking for the response of CMD12 and CMD13
           in write command sequence
        2. To retry write sequence when a generic error occurs
      
      Messages are added for v2 to show what occurs.
      Signed-off-by: KOBAYASHI Yoshitake <yoshitake.kobayashi@toshiba.co.jp>
      Signed-off-by: Chris Ball <cjb@laptop.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfs: add capability check to free eofblocks ioctl · bdd0a8e5
      Dwight Engen authored
      commit 8c567a7f upstream.
      
      Check for CAP_SYS_ADMIN since the caller can truncate preallocated
      blocks from files they do not own nor have write access to. A more
      fine grained access check was considered: require the caller to
      specify their own uid/gid and to use inode_permission to check for
      write, but this would not catch the case of an inode not reachable
      via path traversal from the callers mount namespace.
      
      Add check for read-only filesystem to free eofblocks ioctl.
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      Cc: Kees Cook <keescook@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: gso: fix truesize tracking · e8ef7eff
      Eric Dumazet authored
      [ Upstream commit 0d08c42c ]
      
      commit 6ff50cd5 ("tcp: gso: do not generate out of order packets")
      had a heuristic that can trigger a warning in skb_try_coalesce(),
      because skb->truesize of the gso segments was set to exactly mss.
      
      This breaks the requirement that
      
      skb->truesize >= skb->len + truesizeof(struct sk_buff);
      
      It can trivially be reproduced by :
      
      ifconfig lo mtu 1500
      ethtool -K lo tso off
      netperf
      
      As the skbs are looped into the TCP networking stack, skb_try_coalesce()
      warns us of these skb under-estimating their truesize.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation · dd5e5276
      fan.du authored
      [ Upstream commit 3868204d ]
      
      commit a553e4a6 ("[PKTGEN]: IPSEC support")
      tried to support IPsec ESP transport transformation for pktgen, but actually
      this doesn't work at all for two reasons (the original transformed packet has
      a bad IPv4 checksum value, as well as a wrong auth value, as reported by Wireshark):
      
      - After transformation, the IPv4 header total length needs an update,
        because the encrypted payload's length is NOT the same as that of the plain text.
      
      - After transformation, the IPv4 checksum needs to be recalculated because the
        payload has changed.
      
      With this patch, and pktgen armed with the configuration below, Wireshark is able to
      decrypt the ESP packet generated by pktgen without any IPv4 checksum error or
      auth value error.
      
      pgset "flag IPSEC"
      pgset "flows 1"
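
The two header fixes described above (rewrite the total length, then recompute the header checksum) can be sketched in plain C. The function names are hypothetical and this is not the pktgen code; the checksum itself is the standard Internet checksum from RFC 1071 computed over the header with the checksum field zeroed.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 checksum over hdr_len bytes (an IPv4 header length is always even). */
static inline uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < hdr_len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];  /* big-endian 16-bit words */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);          /* fold carries back in */
    return (uint16_t)~sum;
}

/* Store a new total length, then refresh the checksum field to match. */
static inline void ipv4_set_total_len(uint8_t *hdr, uint16_t total_len)
{
    size_t hdr_len = (size_t)(hdr[0] & 0x0f) * 4;     /* IHL is in 32-bit words */
    uint16_t csum;

    hdr[2] = (uint8_t)(total_len >> 8);
    hdr[3] = (uint8_t)(total_len & 0xff);
    hdr[10] = 0;                                      /* checksum field must be */
    hdr[11] = 0;                                      /* zero while summing     */
    csum = ipv4_header_checksum(hdr, hdr_len);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
}
```

Skipping either step leaves a header whose length or checksum no longer describes the packet on the wire, which is what a capture tool flags as an error.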
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix possible seqlock deadlock in ip6_finish_output2 · e761bada
      Hannes Frederic Sowa authored
      [ Upstream commit 7f88c6b2 ]
      
      IPv6 stats are 64 bits and thus are protected with a seqlock. If we
      don't disable bottom halves (bh) here, we could deadlock when
      a softirq reentrantly updates the same mib.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • inet: fix possible seqlock deadlocks · 7be560d6
      Eric Dumazet authored
      [ Upstream commit f1d8cba6 ]
      
      In commit c9e90429 ("ipv4: fix possible seqlock deadlock") I left
      other places where IP_INC_STATS_BH() was improperly used.
      
      udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from
      process context, not from softirq context.
      
      This was detected by lockdep seqlock support.
      Reported-by: jongman heo <jongman.heo@samsung.com>
      Fixes: 584bdf8c ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • team: fix master carrier set when user linkup is enabled · a52a9149
      Jiri Pirko authored
      [ Upstream commit f5e0d343 ]
      
      When user linkup is enabled and user sets linkup of individual port,
      we need to recompute linkup (carrier) of master interface so the change
      is reflected. Fix this by calling __team_carrier_check() which does the
      needed work.
      
      Please apply to all stable kernels as well. Thanks.
      Reported-by: Jan Tluka <jtluka@redhat.com>
      Signed-off-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST · 86a24344
      Shawn Landden authored
      [ Upstream commit d3f7d56a ]
      
      Commit 35f9c09f (tcp: tcp_sendpages() should call tcp_push() once)
      added an internal flag MSG_SENDPAGE_NOTLAST, similar to
      MSG_MORE.
      
      algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages()
      and need to see the new flag as identical to MSG_MORE.
      
      This fixes sendfile() on AF_ALG.
      
      v3: also fix udp
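
Treating the two flags as equivalent amounts to testing for either bit. The sketch below uses local stand-in constants (prefixed `TOY_`) rather than kernel headers; their numeric values are assumptions for illustration, and the helper name is hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel flags; the values are assumptions here. */
#define TOY_MSG_MORE              0x8000
#define TOY_MSG_SENDPAGE_NOTLAST  0x20000

/* A consumer should hold back its final processing if either flag is set. */
static inline bool more_data_follows(int flags)
{
    return (flags & (TOY_MSG_MORE | TOY_MSG_SENDPAGE_NOTLAST)) != 0;
}
```

A consumer that tests only the first flag would finalize too early when a multi-page send marks intermediate pages with the second flag alone.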
      Reported-and-tested-by: Shawn Landden <shawnlandden@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Original-patch: Richard Weinberger <richard@nod.at>
      Signed-off-by: Shawn Landden <shawn@churchofgit.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: 8139cp: fix a BUG_ON triggered by wrong bytes_compl · ac3e7d9b
      Yang Yingliang authored
      [ Upstream commit 7fe0ee09 ]
      
      Using iperf to send packets (with GSO mode on), a bug is triggered:
      
      [  212.672781] kernel BUG at lib/dynamic_queue_limits.c:26!
      [  212.673396] invalid opcode: 0000 [#1] SMP
      [  212.673882] Modules linked in: 8139cp(O) nls_utf8 edd fuse loop dm_mod ipv6 i2c_piix4 8139too i2c_core intel_agp joydev pcspkr hid_generic intel_gtt floppy sr_mod mii button sg cdrom ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata scsi_mod [last unloaded: 8139cp]
      [  212.676084] CPU: 0 PID: 4124 Comm: iperf Tainted: G           O 3.12.0-0.7-default+ #16
      [  212.676084] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  212.676084] task: ffff8800d83966c0 ti: ffff8800db4c8000 task.ti: ffff8800db4c8000
      [  212.676084] RIP: 0010:[<ffffffff8122e23f>]  [<ffffffff8122e23f>] dql_completed+0x17f/0x190
      [  212.676084] RSP: 0018:ffff880116e03e30  EFLAGS: 00010083
      [  212.676084] RAX: 00000000000005ea RBX: 0000000000000f7c RCX: 0000000000000002
      [  212.676084] RDX: ffff880111dd0dc0 RSI: 0000000000000bd4 RDI: ffff8800db6ffcc0
      [  212.676084] RBP: ffff880116e03e48 R08: 0000000000000992 R09: 0000000000000000
      [  212.676084] R10: ffffffff8181e400 R11: 0000000000000004 R12: 000000000000000f
      [  212.676084] R13: ffff8800d94ec840 R14: ffff8800db440c80 R15: 000000000000000e
      [  212.676084] FS:  00007f6685a3c700(0000) GS:ffff880116e00000(0000) knlGS:0000000000000000
      [  212.676084] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  212.676084] CR2: 00007f6685ad6460 CR3: 00000000db714000 CR4: 00000000000006f0
      [  212.676084] Stack:
      [  212.676084]  ffff8800db6ffc00 000000000000000f ffff8800d94ec840 ffff880116e03eb8
      [  212.676084]  ffffffffa041509f ffff880116e03e88 0000000f16e03e88 ffff8800d94ec000
      [  212.676084]  00000bd400059858 000000050000000f ffffffff81094c36 ffff880116e03eb8
      [  212.676084] Call Trace:
      [  212.676084]  <IRQ>
      [  212.676084]  [<ffffffffa041509f>] cp_interrupt+0x4ef/0x590 [8139cp]
      [  212.676084]  [<ffffffff81094c36>] ? ktime_get+0x56/0xd0
      [  212.676084]  [<ffffffff8108cf73>] handle_irq_event_percpu+0x53/0x170
      [  212.676084]  [<ffffffff8108d0cc>] handle_irq_event+0x3c/0x60
      [  212.676084]  [<ffffffff8108fdb5>] handle_fasteoi_irq+0x55/0xf0
      [  212.676084]  [<ffffffff810045df>] handle_irq+0x1f/0x30
      [  212.676084]  [<ffffffff81003c8b>] do_IRQ+0x5b/0xe0
      [  212.676084]  [<ffffffff8142beaa>] common_interrupt+0x6a/0x6a
      [  212.676084]  <EOI>
      [  212.676084]  [<ffffffffa0416a21>] ? cp_start_xmit+0x621/0x97c [8139cp]
      [  212.676084]  [<ffffffffa0416a09>] ? cp_start_xmit+0x609/0x97c [8139cp]
      [  212.676084]  [<ffffffff81378ed9>] dev_hard_start_xmit+0x2c9/0x550
      [  212.676084]  [<ffffffff813960a9>] sch_direct_xmit+0x179/0x1d0
      [  212.676084]  [<ffffffff813793f3>] dev_queue_xmit+0x293/0x440
      [  212.676084]  [<ffffffff813b0e46>] ip_finish_output+0x236/0x450
      [  212.676084]  [<ffffffff810e59e7>] ? __alloc_pages_nodemask+0x187/0xb10
      [  212.676084]  [<ffffffff813b10e8>] ip_output+0x88/0x90
      [  212.676084]  [<ffffffff813afa64>] ip_local_out+0x24/0x30
      [  212.676084]  [<ffffffff813aff0d>] ip_queue_xmit+0x14d/0x3e0
      [  212.676084]  [<ffffffff813c6fd1>] tcp_transmit_skb+0x501/0x840
      [  212.676084]  [<ffffffff813c8323>] tcp_write_xmit+0x1e3/0xb20
      [  212.676084]  [<ffffffff81363237>] ? skb_page_frag_refill+0x87/0xd0
      [  212.676084]  [<ffffffff813c8c8b>] tcp_push_one+0x2b/0x40
      [  212.676084]  [<ffffffff813bb7e6>] tcp_sendmsg+0x926/0xc90
      [  212.676084]  [<ffffffff813e1d21>] inet_sendmsg+0x61/0xc0
      [  212.676084]  [<ffffffff8135e861>] sock_aio_write+0x101/0x120
      [  212.676084]  [<ffffffff81107cf1>] ? vma_adjust+0x2e1/0x5d0
      [  212.676084]  [<ffffffff812163e0>] ? timerqueue_add+0x60/0xb0
      [  212.676084]  [<ffffffff81130b60>] do_sync_write+0x60/0x90
      [  212.676084]  [<ffffffff81130d44>] ? rw_verify_area+0x54/0xf0
      [  212.676084]  [<ffffffff81130f66>] vfs_write+0x186/0x190
      [  212.676084]  [<ffffffff811317fd>] SyS_write+0x5d/0xa0
      [  212.676084]  [<ffffffff814321e2>] system_call_fastpath+0x16/0x1b
      [  212.676084] Code: ca 41 89 dc 41 29 cc 45 31 db 29 c2 41 89 c5 89 d0 45 29 c5 f7 d0 c1 e8 1f e9 43 ff ff ff 66 0f 1f 44 00 00 31 c0 e9 7b ff ff ff <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 c7 47 40 00
      [  212.676084] RIP  [<ffffffff8122e23f>] dql_completed+0x17f/0x190
      ------------[ cut here ]------------
      
      When an skb has frags, cp_tx() adds skb->len to bytes_compl nr_frags times.
      That is not the correct value (skb->len should be added only once), and it
      triggers the BUG_ON(bytes_compl > num_queued - dql->num_completed).
      So only increase bytes_compl once all frags have been sent. pkts_compl also
      has a wrong value; fix it too.
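
      The over-counting can be reproduced with a toy model. The sketch below is
      plain Python, not driver code: `ToyDql` is a hypothetical stand-in that keeps
      only the two counters named in the BUG_ON, and `complete_buggy()` and
      `complete_fixed()` are illustrative names for the two ways of accumulating
      bytes_compl.

```python
class BugOn(Exception):
    """Raised where the kernel would hit the BUG_ON."""


class ToyDql:
    """Hypothetical stand-in: only the counters used by the check."""

    def __init__(self):
        self.num_queued = 0
        self.num_completed = 0

    def queued(self, nbytes):
        self.num_queued += nbytes

    def completed(self, bytes_compl):
        # Mirrors BUG_ON(bytes_compl > num_queued - dql->num_completed).
        if bytes_compl > self.num_queued - self.num_completed:
            raise BugOn("completed more bytes than were queued")
        self.num_completed += bytes_compl


def complete_buggy(skb_len, nr_frags):
    """Add skb_len once per fragment (the behaviour described as wrong)."""
    bytes_compl = 0
    for _ in range(nr_frags):
        bytes_compl += skb_len
    return bytes_compl


def complete_fixed(skb_len, nr_frags):
    """Add skb_len only when the last fragment has been sent."""
    bytes_compl = 0
    for i in range(nr_frags):
        if i == nr_frags - 1:
            bytes_compl += skb_len
    return bytes_compl


def trips_bug(complete, skb_len=1500, nr_frags=3):
    """Queue one skb, complete it, and report whether the check fires."""
    dql = ToyDql()
    dql.queued(skb_len)
    try:
        dql.completed(complete(skb_len, nr_frags))
    except BugOn:
        return True
    return False
```

      With one 1500-byte skb spread over three fragments, the buggy loop reports
      4500 completed bytes against 1500 queued and trips the check; the fixed
      loop reports 1500 and does not.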
      
      It's introduced by commit 871f0d4c ("8139cp: enable bql").
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac3e7d9b
    • r8169: check ALDPS bit and disable it if enabled for the 8168g · 15624326
      David Chang authored
      [ Upstream commit 1bac1072 ]
      
      The Windows driver enables the ALDPS function, but the Linux driver and
      firmware have no ALDPS-related configuration for the 8168g. So after
      rebooting from Windows into Linux and unplugging the NIC cable, the LAN
      enters ALDPS and LAN RX is then disabled.
      
      This issue is easily reproduced on a dual-boot Windows and Linux
      system with an RTL_GIGA_MAC_VER_40 chip.
      
      According to Realtek, the ALDPS function can be disabled through the PHY:
      switch to page 0x0A43 and clear bit 2 of register 0x10.
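
      The register operation amounts to a read-modify-write that clears one bit.
      A minimal sketch follows, assuming a hypothetical in-memory `FakePhy` in
      place of real PHY access; only the page, register, and bit number come from
      the commit message, and the helper names are made up for illustration.

```python
ALDPS_PAGE = 0x0A43  # PHY page named in the commit message
ALDPS_REG = 0x10     # register named in the commit message
ALDPS_BIT = 1 << 2   # bit 2


class FakePhy:
    """Hypothetical PHY: maps (page, reg) to a 16-bit register value."""

    def __init__(self):
        self.regs = {}

    def read(self, page, reg):
        return self.regs.get((page, reg), 0)

    def write(self, page, reg, value):
        self.regs[(page, reg)] = value & 0xFFFF


def disable_aldps_if_enabled(phy):
    """Clear bit 2 only when it is set; return True if a write was needed."""
    value = phy.read(ALDPS_PAGE, ALDPS_REG)
    if not value & ALDPS_BIT:
        return False
    phy.write(ALDPS_PAGE, ALDPS_REG, value & ~ALDPS_BIT)
    return True
```

      Checking the bit first mirrors the commit title ("check ALDPS bit and
      disable it if enabled") and leaves the register untouched when ALDPS is
      already off.
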
      Signed-off-by: David Chang <dchang@suse.com>
      Acked-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      15624326
    • af_packet: block BH in prb_shutdown_retire_blk_timer() · 5b9e9be7
      Veaceslav Falico authored
      [ Upstream commit ec6f809f ]
      
      Currently we're using plain spin_lock() in prb_shutdown_retire_blk_timer(),
      however the timer might fire right in the middle and thus try to re-acquire
      the same spinlock, leaving us in an endless loop.
      
      To fix that, use spin_lock_bh() to block it.
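
      The self-deadlock can be illustrated with a single-CPU toy model. This is
      a sketch under simplifying assumptions, not kernel code: `ToyCpu` and its
      methods are hypothetical, a timer callback is treated as bottom-half work
      that runs immediately unless bottom halves are disabled, and an endless
      spin is represented by raising `Deadlock`.

```python
class Deadlock(Exception):
    """Raised where a real CPU would spin forever."""


class ToyCpu:
    def __init__(self):
        self.lock_held = False
        self.bh_disabled = False
        self.pending = []

    def spin_lock(self):
        if self.lock_held:
            raise Deadlock("re-acquiring a spinlock already held on this CPU")
        self.lock_held = True

    def spin_unlock(self):
        self.lock_held = False

    def spin_lock_bh(self):
        self.bh_disabled = True
        self.spin_lock()

    def spin_unlock_bh(self):
        self.spin_unlock()
        self.bh_disabled = False
        self._run_pending()  # deferred bottom halves run once re-enabled

    def timer_expires(self, callback):
        self.pending.append(callback)
        if not self.bh_disabled:
            self._run_pending()

    def _run_pending(self):
        while self.pending:
            self.pending.pop(0)(self)


def timer_callback(cpu):
    # The timer handler takes the same lock as the shutdown path.
    cpu.spin_lock()
    cpu.spin_unlock()


def shutdown(cpu, use_bh):
    lock = cpu.spin_lock_bh if use_bh else cpu.spin_lock
    unlock = cpu.spin_unlock_bh if use_bh else cpu.spin_unlock
    lock()
    cpu.timer_expires(timer_callback)  # timer fires inside the critical section
    unlock()


def deadlocks(use_bh):
    try:
        shutdown(ToyCpu(), use_bh)
    except Deadlock:
        return True
    return False
```

      In this model the plain-lock variant deadlocks, while the _bh variant
      defers the callback until the lock has been released.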
      
      Fixes: f6fb8f10 ("af-packet: TPACKET_V3 flexible buffer implementation.")
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Daniel Borkmann <dborkman@redhat.com>
      CC: Willem de Bruijn <willemb@google.com>
      CC: Phil Sutter <phil@nwl.cc>
      CC: Eric Dumazet <edumazet@google.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
      Acked-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b9e9be7
    • packet: fix use after free race in send path when dev is released · 026bb405
      Daniel Borkmann authored
      [ Upstream commit e40526cb ]
      
      Salam reported a use after free bug in PF_PACKET that occurs when
      we're sending out frames on a socket bound device and suddenly the
      net device is being unregistered. It appears that commit 827d9780
      introduced a possible race condition between {t,}packet_snd() and
      packet_notifier(). In the case of a bound socket, packet_notifier()
      can drop the last reference to the net_device and {t,}packet_snd()
      might end up suddenly sending a packet over a freed net_device.
      
      To avoid reverting 827d9780 and thus introducing a performance
      regression compared to the current state of things, we decided to
      hold a cached RCU protected pointer to the net device and maintain
      it on write side via bind spin_lock protected register_prot_hook()
      and __unregister_prot_hook() calls.
      
      In {t,}packet_snd() path, we access this pointer under rcu_read_lock
      through packet_cached_dev_get() that holds reference to the device
      to prevent it from being freed through packet_notifier() while
      we're in send path. This is okay to do as dev_put()/dev_hold() are
      per-cpu counters, so this should not be a performance issue. Also,
      the code simplifies a bit as we don't need need_rls_dev anymore.
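
      The reference-holding idea can be sketched with a toy model. This is
      illustrative Python, not the kernel implementation: RCU and locking are not
      modelled, `NetDevice` and `PacketSock` are hypothetical classes, and a
      simple counter stands in for the per-cpu dev_hold()/dev_put() counters.

```python
class NetDevice:
    def __init__(self, name):
        self.name = name
        self.refcnt = 1     # reference owned by the registration
        self.freed = False

    def hold(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True


class PacketSock:
    def __init__(self):
        self.cached_dev = None

    def register_prot_hook(self, dev):
        self.cached_dev = dev       # write side: publish the cached pointer

    def unregister_prot_hook(self):
        self.cached_dev = None      # write side: clear the cached pointer

    def cached_dev_get(self):
        dev = self.cached_dev
        if dev is not None:
            dev.hold()              # pin the device for the send path
        return dev


def send(sock, notifier=None):
    """Return whether the device was still alive while 'transmitting'."""
    dev = sock.cached_dev_get()
    if dev is None:
        return None
    if notifier is not None:
        notifier()                  # device goes away in the middle of the send
    alive_during_xmit = not dev.freed
    dev.put()
    return alive_during_xmit


def unregister_during_send():
    dev = NetDevice("eth0")
    sock = PacketSock()
    sock.register_prot_hook(dev)

    def notifier():
        sock.unregister_prot_hook()
        dev.put()                   # drops the registration's reference

    alive = send(sock, notifier)
    return alive, dev.freed, send(sock)
```

      Because the send path holds its own reference, the notifier's put does
      not free the device until the send has finished; a later send finds no
      cached device at all.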
      
      Fixes: 827d9780 ("af-packet: Use existing netdev reference for bound sockets.")
      Reported-by: Salam Noureddine <noureddine@aristanetworks.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
      Cc: Ben Greear <greearb@candelatech.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      026bb405
    • bridge: flush br's address entry in fdb when remove the bridge dev · 1acb97ae
      Ding Tianhong authored
      [ Upstream commit f8730420 ]
      
      When the following commands are executed:
      
      brctl addbr br0
      ifconfig br0 hw ether <addr>
      rmmod bridge
      
      The following call trace occurs:
      
      [  563.312114] device eth1 left promiscuous mode
      [  563.312188] br0: port 1(eth1) entered disabled state
      [  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
      [  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
      [  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
      [  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
      [  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
      [  563.468209] Call Trace:
      [  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
      [  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
      [  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
      [  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
      [  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
      [  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
      [  570.377958] Bridge firewalling registered
      
      --------------------------- cut here -------------------------------
      
      The reason is that when the bridge dev's address is changed,
      br_fdb_change_mac_address() adds the new address to the fdb, but when
      the bridge is removed, that address entry is not freed, so
      bridge_fdb_cache still has objects when the cache is destroyed. Fix
      this by flushing the bridge address entry when removing the bridge.
      
      v2: per Toshiaki Makita's and Vlad's suggestion: I only deleted the
          vlan0 entry, which still leaks if the vlan id is some other
          number, so I need to call fdb_delete_by_port(br, NULL, 1)
          to flush all entries whose dst is NULL for the bridge.
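
      The difference between the two approaches in the v2 note can be shown
      with a toy fdb. The sketch below is illustrative Python only: `FdbEntry`
      and both helper functions are hypothetical, and the fdb is modelled as a
      plain list where `dst is None` marks the bridge's own address entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FdbEntry:
    addr: str
    vlan: int
    dst: Optional[str]  # None for entries that belong to the bridge itself


def delete_vlan0_entry(fdb, addr):
    """First approach: remove only the bridge address entry on vlan 0."""
    return [e for e in fdb
            if not (e.dst is None and e.addr == addr and e.vlan == 0)]


def flush_bridge_entries(fdb):
    """v2 approach: remove every entry whose dst is None."""
    return [e for e in fdb if e.dst is not None]


BR_ADDR = "02:00:00:00:00:01"  # made-up address for the example
fdb = [
    FdbEntry(BR_ADDR, vlan=0, dst=None),
    FdbEntry(BR_ADDR, vlan=10, dst=None),
]

leftover_v1 = delete_vlan0_entry(fdb, BR_ADDR)
leftover_v2 = flush_bridge_entries(fdb)
```

      Here `leftover_v1` still contains the vlan 10 entry, which corresponds to
      the remaining leak, while `leftover_v2` is empty.
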
      Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
      Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1acb97ae
    • net: core: Always propagate flag changes to interfaces · 05cf2143
      Vlad Yasevich authored
      [ Upstream commit d2615bf4 ]
      
      The following commit:
          b6c40d68
          net: only invoke dev->change_rx_flags when device is UP
      
      tried to fix a problem with VLAN devices and promiscuous flag setting.
      The issue was that VLAN device was setting a flag on an interface that
      was down, thus resulting in bad promiscuity count.
      This commit blocked flag propagation to any device that is currently
      down.
      
      A later commit:
          deede2fa
          vlan: Don't propagate flag changes on down interfaces
      
      fixed VLAN code to only propagate flags when the VLAN interface is up,
      thus fixing the same issue as above, only localized to VLAN.
      
      The problem we have now is that if we create a complex stack
      involving multiple software devices like bridges, bonds, and vlans,
      then it is possible that the flags would not propagate properly to
      the physical devices.  A simple example of the scenario is the
      following:
      
        eth0----> bond0 ----> bridge0 ---> vlan50
      
      If bond0 or eth0 happen to be down at the time bond0 is added to
      the bridge, then eth0 will never have promisc mode set which is
      currently required for operation as part of the bridge.  As a
      result, packets with vlan50 will be dropped by the interface.
      
      The only 2 devices that implement the special flag handling are
      VLAN and DSA and they both have required code to prevent incorrect
      flag propagation.  As a result we can remove the generic solution
      introduced in b6c40d68 and leave
      it to the individual devices to decide whether they will block
      flag propagation or not.
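
      The lost propagation can be illustrated with a toy device stack. This is
      a sketch under simplifying assumptions, not kernel code: `Dev` and
      `set_promiscuity()` are hypothetical, each stacked device simply forwards
      the change to its lower device, and the `block_when_down` switch stands in
      for the generic check introduced in b6c40d68.

```python
class Dev:
    def __init__(self, name, lower=None, up=True):
        self.name = name
        self.lower = lower      # device this one is stacked on, if any
        self.up = up
        self.promiscuity = 0


def set_promiscuity(dev, inc, block_when_down):
    dev.promiscuity += inc
    if block_when_down and not dev.up:
        return  # generic check: no flag propagation from a down device
    if dev.lower is not None:
        set_promiscuity(dev.lower, inc, block_when_down)


def eth0_promiscuity(block_when_down):
    """bond0 is down at the moment the bridge asks it to go promiscuous."""
    eth0 = Dev("eth0")
    bond0 = Dev("bond0", lower=eth0, up=False)
    set_promiscuity(bond0, 1, block_when_down)
    return eth0.promiscuity
```

      With the generic check in place eth0 stays at promiscuity 0; without it
      the change reaches eth0.
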
      Reported-by: Stefan Priebe <s.priebe@profihost.ag>
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      05cf2143