1. 13 Jan, 2014 7 commits
    • sched/deadline: Add latency tracing for SCHED_DEADLINE tasks · af6ace76
      Dario Faggioli authored
      It is very likely that systems that want or need to use the new
      SCHED_DEADLINE policy also want to have the scheduling latency of
      the -deadline tasks under control.
      
      For this reason a new version of the scheduling wakeup latency,
      called "wakeup_dl", is introduced.
      
      As a consequence of applying this patch there will be three wakeup
      latency tracers:
      
       * "wakeup", that deals with all tasks in the system;
       * "wakeup_rt", that deals with -rt and -deadline tasks only;
       * "wakeup_dl", that deals with -deadline tasks only.
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-9-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/deadline: Add period support for SCHED_DEADLINE tasks · 755378a4
      Harald Gustafsson authored
      Make it possible to specify a period (different from or equal to the
      deadline) for -deadline tasks. Relative deadlines (D_i) are used on
      task arrivals to generate new scheduling (absolute) deadlines as "d =
      t + D_i", and periods (P_i) to postpone the scheduling deadlines as "d
      = d + P_i" when the budget is zero.
      
      This is in general useful to model (and schedule) tasks that have slow
      activation rates (long periods), but have to be scheduled soon once
      activated (short deadlines).
      Signed-off-by: Harald Gustafsson <harald.gustafsson@ericsson.com>
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-7-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
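Editorial sketch (not part of the commit): the two update rules quoted in the period-support message, d = t + D_i on arrival and d = d + P_i when the budget reaches zero, can be modelled in a few lines of Python. The function names and the example values are hypothetical and do not come from the kernel sources.

```python
def deadline_on_arrival(now, rel_deadline):
    """A new instance arrives at time `now`: d = t + D_i."""
    return now + rel_deadline


def deadline_on_budget_exhausted(abs_deadline, period):
    """The budget reached zero: postpone the deadline, d = d + P_i."""
    return abs_deadline + period


# A task with a short relative deadline (10) and a long period (100)
# that arrives at t = 1000.
d = deadline_on_arrival(1000, 10)
d_postponed = deadline_on_budget_exhausted(d, 100)
print(d, d_postponed)
```

With a short D_i and a long P_i the task gets an urgent deadline when it arrives, but an overrun pushes its next deadline a full period away, which matches the "slow activation rate, short deadline" use case the message describes.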
    • sched/deadline: Add SCHED_DEADLINE avg_update accounting · 239be4a9
      Dario Faggioli authored
      Make the core scheduler and load balancer aware of the load
      produced by -deadline tasks, by updating the moving average
      like for sched_rt.
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-6-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/deadline: Add SCHED_DEADLINE SMP-related data structures & logic · 1baca4ce
      Juri Lelli authored
      Introduces the data structures relevant for implementing dynamic
      migration of -deadline tasks, the logic for checking whether
      runqueues are overloaded with -deadline tasks, and the logic for
      choosing where a task should migrate when that is the case.

      Also adds dynamic migrations to SCHED_DEADLINE, so that tasks can
      be moved among CPUs when necessary. It is also possible to bind a
      task to a (set of) CPU(s), thus restricting its capability of
      migrating, or forbidding migrations altogether.
      
      The very same approach used in sched_rt is utilised:
       - -deadline tasks are kept in CPU-specific runqueues,
       - -deadline tasks are migrated among runqueues to achieve the
         following:
          * on an M-CPU system the M earliest deadline ready tasks
            are always running;
          * affinity/cpusets settings of all the -deadline tasks are
            always respected.
      
      Therefore, this very special form of "load balancing" is done with
      an active method, i.e., the scheduler pushes or pulls tasks between
      runqueues when they are woken up and/or (de)scheduled.
      IOW, every time a preemption occurs, the descheduled task might be sent
      to some other CPU (depending on its deadline) to continue executing
      (push). On the other hand, every time a CPU becomes idle, it might pull
      the second earliest deadline ready task from some other CPU.
      
      To enforce this, a pull operation is always attempted before taking any
      scheduling decision (pre_schedule()), as well as a push one after each
      scheduling decision (post_schedule()). In addition, when a task arrives
      or wakes up, the best CPU where to resume it is selected taking into
      account its affinity mask, the system topology, but also its deadline.
      E.g., from the scheduling point of view, the best CPU where to wake
      up (and also where to push) a task is the one which is running the task
      with the latest deadline among the M executing ones.
      
      In order to facilitate these decisions, per-runqueue "caching" of the
      deadlines of the currently running and of the first ready task is used.
      Queued but not running tasks are also parked in another rb-tree to
      speed up pushes.
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-5-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
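Editorial sketch (not part of the commit): the SMP message says the best CPU to wake up or push a task to is the one running the task with the latest deadline among the executing ones, subject to the task's affinity mask. The Python model below illustrates only that selection rule. The function name, the dictionary representation of per-runqueue cached deadlines, and the choice to treat a CPU with no -deadline task as an immediate target are assumptions made for illustration; system topology is ignored.

```python
def find_push_target(task_deadline, allowed_cpus, running_deadline):
    """Pick a CPU for a -deadline task with absolute deadline `task_deadline`.

    allowed_cpus:     iterable of CPU ids permitted by the affinity mask.
    running_deadline: dict mapping CPU id -> absolute deadline of the
                      -deadline task running there, or None if there is none.
    Returns a CPU id, or None when no allowed CPU runs a later deadline.
    """
    best = None
    for cpu in sorted(allowed_cpus):
        current = running_deadline.get(cpu)
        if current is None:
            # Assumption: a CPU without a -deadline task is always acceptable.
            return cpu
        # Only CPUs running a later deadline than ours can be preempted.
        if current > task_deadline:
            if best is None or current > running_deadline[best]:
                best = cpu
    return best


running = {0: 50, 1: 80, 2: 30}
print(find_push_target(40, {0, 1, 2}, running))
```

Restricting `allowed_cpus` models a bound task: with CPU 1 excluded, the same task would land on CPU 0 instead, and a task whose deadline is later than every running one is not moved at all.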
    • sched/deadline: Add SCHED_DEADLINE structures & implementation · aab03e05
      Dario Faggioli authored
      Introduces the data structures, constants and symbols needed for
      SCHED_DEADLINE implementation.

      Core data structures of SCHED_DEADLINE are defined, along with their
      initializers. Hooks for checking if a task belongs to the new policy
      are also added where they are needed.
      
      Adds a scheduling class, in sched/dl.c and a new policy called
      SCHED_DEADLINE. It is an implementation of the Earliest Deadline
      First (EDF) scheduling algorithm, augmented with a mechanism (called
      Constant Bandwidth Server, CBS) that makes it possible to isolate
      the behaviour of tasks between each other.
      
      The typical -deadline task will be made up of a computation phase
      (instance) which is activated in a periodic or sporadic fashion. The
      expected (maximum) duration of such computation is called the task's
      runtime; the time interval by which each instance needs to be completed
      is called the task's relative deadline. The task's absolute deadline
      is dynamically calculated as the time instant a task (better, an
      instance) activates plus the relative deadline.
      
      The EDF algorithm selects the task with the smallest absolute
      deadline as the one to be executed first, while the CBS ensures that
      each task runs for at most its runtime every (relative) deadline
      length time interval, avoiding any interference between different
      tasks (bandwidth isolation).
      Thanks to this feature, tasks that do not strictly comply with
      the computational model sketched above can also effectively use the
      new policy.
      
      To summarize, this patch:
       - introduces the data structures, constants and symbols needed;
       - implements the core logic of the scheduling algorithm in the new
         scheduling class file;
       - provides all the glue code between the new scheduling class and
         the core scheduler and refines the interactions between sched/dl
         and the other existing scheduling classes.
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
      Signed-off-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
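Editorial sketch (not part of the commit): a toy Python model of the two mechanisms the message names, EDF selection (smallest absolute deadline first) and a CBS-like budget that lets a task run for at most its runtime per relative-deadline interval. The class and function names are hypothetical. The replenishment step is deliberately simplified: it refills the budget and postpones the deadline immediately, and does not model how the kernel throttles a task until its replenishment time.

```python
from dataclasses import dataclass


@dataclass
class DlTask:
    name: str
    runtime: int        # budget granted per instance (must be > 0)
    rel_deadline: int   # relative deadline
    abs_deadline: int = 0
    budget: int = 0

    def activate(self, now):
        """New instance: absolute deadline = activation time + relative deadline."""
        self.abs_deadline = now + self.rel_deadline
        self.budget = self.runtime


def pick_next(ready):
    """EDF: choose the ready task with the smallest absolute deadline."""
    return min(ready, key=lambda t: t.abs_deadline) if ready else None


def account(task, ran_for):
    """Charge execution time; on depletion, refill and postpone the deadline.

    Returns True when the budget was exhausted (simplified CBS rule).
    """
    task.budget -= ran_for
    exhausted = task.budget <= 0
    while task.budget <= 0:
        task.abs_deadline += task.rel_deadline
        task.budget += task.runtime
    return exhausted


a = DlTask("a", runtime=2, rel_deadline=10)
b = DlTask("b", runtime=3, rel_deadline=5)
a.activate(0)
b.activate(0)
print(pick_next([a, b]).name)
```

Because an overrunning task has its deadline pushed back, it loses priority to the others under EDF, which is the bandwidth isolation effect the message refers to.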
    • sched: Add new scheduler syscalls to support an extended scheduling parameters ABI · d50dde5a
      Dario Faggioli authored
      Add the syscalls needed for supporting scheduling algorithms
      with extended scheduling parameters (e.g., SCHED_DEADLINE).
      
      In general, it makes it possible to specify a periodic/sporadic task
      that executes for a given amount of runtime at each instance, and is
      scheduled according to the urgency of its own timing constraints,
      i.e.:
      
       - a (maximum/typical) instance execution time,
       - a minimum interval between consecutive instances,
       - a time constraint by which each instance must be completed.
      
      Thus, both the data structure that holds the scheduling parameters of
      the tasks and the system calls dealing with it must be extended.
      Unfortunately, modifying the existing struct sched_param would break
      the ABI and result in potentially serious compatibility issues with
      legacy binaries.
      
      For these reasons, this patch:
      
       - defines the new struct sched_attr, containing all the fields
         that are necessary for specifying a task in the computational
         model described above;
      
       - defines and implements the new scheduling related syscalls that
         manipulate it, i.e., sched_setattr() and sched_getattr().
      
      Syscalls are introduced for x86 (32 and 64 bits) and ARM only, as a
      proof of concept and for developing and testing purposes. Making them
      available on other architectures is straightforward.
      
      Since no "user" for these new parameters is introduced in this patch,
      the implementation of the new system calls is identical to their
      already existing counterparts. Future patches that implement scheduling
      policies able to exploit the new data structure must also take care of
      modifying the sched_*attr() calls according to their own purposes.
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      [ Rewrote to use sched_attr. ]
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      [ Removed sched_setscheduler2() for now. ]
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-3-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
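Editorial sketch (not part of the commit): the three timing constraints listed in the syscall message (instance execution time, minimum interval between instances, completion deadline) map onto the runtime, period and deadline fields of struct sched_attr. The ctypes mirror below follows the field layout documented in the sched_setattr(2) manual page; whether the structure had exactly this layout in this specific commit is an assumption. The helper name and its validation rule are illustrative, and no system call is issued, since syscall numbers are architecture-specific.

```python
import ctypes

SCHED_DEADLINE = 6  # policy number from the Linux uapi headers


class SchedAttr(ctypes.Structure):
    """Mirror of struct sched_attr (layout assumed from sched_setattr(2))."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),       # SCHED_NORMAL, SCHED_BATCH
        ("sched_priority", ctypes.c_uint32),  # SCHED_FIFO, SCHED_RR
        ("sched_runtime", ctypes.c_uint64),   # SCHED_DEADLINE, nanoseconds
        ("sched_deadline", ctypes.c_uint64),
        ("sched_period", ctypes.c_uint64),
    ]


def make_deadline_attr(runtime_ns, deadline_ns, period_ns):
    """Build a SchedAttr describing a -deadline task (times in nanoseconds)."""
    if not (0 < runtime_ns <= deadline_ns <= period_ns):
        raise ValueError("expected 0 < runtime <= deadline <= period")
    attr = SchedAttr()
    attr.size = ctypes.sizeof(SchedAttr)
    attr.sched_policy = SCHED_DEADLINE
    attr.sched_runtime = runtime_ns
    attr.sched_deadline = deadline_ns
    attr.sched_period = period_ns
    return attr


# 10 ms of runtime, to be completed within 30 ms, every 100 ms.
attr = make_deadline_attr(10_000_000, 30_000_000, 100_000_000)
print(attr.size, attr.sched_policy)
```

Carrying an explicit `size` field is what lets the structure grow later without breaking the ABI, which is the compatibility concern the message raises about extending struct sched_param.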
    • Merge branch 'sched/urgent' into sched/core · 56b48110
      Ingo Molnar authored
      Pick up the latest fixes before applying new changes.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 12 Jan, 2014 1 commit
    • sched: Calculate effective load even if local weight is 0 · 9722c2da
      Rik van Riel authored
      Thomas Hellstrom bisected a regression where erratic 3D performance is
      experienced on virtual machines as measured by glxgears. The bisection
      identified commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a
      preferred NUMA node") as the problem, which had modified the behaviour
      of effective_load.
      
      Effective load calculates the difference to the system-wide load if a
      scheduling entity was moved to another CPU. The task group is not heavier
      as a result of the move but overall system load can increase/decrease as a
      result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs
      on a preferred NUMA node") changed effective_load to make it suitable for
      calculating if a particular NUMA node was compute overloaded. To reduce
      the cost of the function, it assumed that a current sched entity weight
      of 0 was uninteresting but that is not the case.
      
      wake_affine() uses a weight of 0 for sync wakeups on the grounds that it
      is assuming the waking task will sleep and not contribute to load in the
      near future. In this case, we still want to calculate the effective load
      of the sched entity hierarchy. As effective_load is no longer used by
      task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide
      search to find swap/migration candidates), this patch simply restores the
      historical behaviour.
      Reported-and-tested-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Rik van Riel <riel@redhat.com>
      [ Wrote changelog ]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 10 Jan, 2014 20 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 228fdc08
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Famous last words: "final pull request" :-)
      
        I'm sending this because Jason Wang's fixes are pretty important
      
         1) Add missing per-cpu stats initialization to ip6_vti.  Otherwise
            lockdep spits out a call trace.  From Li RongQing.
      
         2) Fix NULL oops in wireless hwsim, from Javier Lopez
      
         3) TIPC deferred packet queue unlink must NULL out skb->next to avoid
            crashes.  From Erik Hugne
      
         4) Fix access to uninitialized buffer in nf_nat netfilter code, from
            Daniel Borkmann
      
         5) Fix lifetime of ipv6 loopback and SIT tunnel addresses, otherwise
            they basically timeout immediately.  From Hannes Frederic Sowa
      
         6) Fix DMA unmapping of TSO packets in bnx2x driver, from Michal
            Schmidt
      
         7) Do not allow L2 forwarding offload via macvtap device, the way
            things are now it will not end up being forwarded at all.  From
            Jason Wang
      
         8) Fix transmit queue selection via ndo_dfwd_start_xmit(), fixing
            things like applying NETIF_F_LLTX to the wrong device (!!) and
            eliding the proper transmit watchdog handling
      
         9) qlcnic driver was not updating tx statistics at all, from Manish
            Chopra"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        qlcnic: Fix ethtool statistics length calculation
        qlcnic: Fix bug in TX statistics
        net: core: explicitly select a txq before doing l2 forwarding
        macvlan: forbid L2 fowarding offload for macvtap
        bnx2x: fix DMA unmapping of TSO split BDs
        ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME
        bnx2x: prevent WARN during driver unload
        tipc: correctly unlink packets from deferred packet queue
        ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c
        netfilter: only warn once on wrong seqadj usage
        netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
        NFC: Fix target mode p2p link establishment
        iwlwifi: add new devices for 7265 series
        mac80211: move "bufferable MMPDU" check to fix AP mode scan
        mac80211_hwsim: Fix NULL pointer dereference
    • Merge tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs · e2bc4470
      Linus Torvalds authored
      Pull xfs bugfixes from Ben Myers:
       "Here we have a bugfix for an off-by-one in the remote attribute
        verifier that results in a forced shutdown which you can hit with v5
        superblock by creating a 64k xattr, and a fix for a missing
        destroy_work_on_stack() in the allocation worker.
      
        It's a bit late, but they are both fairly straightforward"
      
      * tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs:
        xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
        xfs: fix off-by-one error in xfs_attr3_rmt_verify
    • Merge branch 'leds-fixes-for-3.13' of... · 324c66ff
      Linus Torvalds authored
      Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
      
      Pull LED fix from Bryan Wu:
       "Pali Rohár and Pavel Machek reported the LED of Nokia N900 doesn't
        work with our latest 3.13-rc6 kernel.  Milo fixed the regression here"
      
      * 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
        leds: lp5521/5523: Remove duplicate mutex
    • Merge tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · cff539b1
      Linus Torvalds authored
      Pull ACPI and power management fixes from Rafael Wysocki:
      
       - Recent commits modifying the lists of C-states in the intel_idle
         driver introduced bugs leading to crashes on some systems.  Two fixes
         from Jiang Liu.
      
       - The ACPI AC driver should receive all types of notifications, but
         a recent change made it ignore some of them.  Fix from Alexander Mezin.
      
       - intel_pstate's validity checks for MSRs it depends on are not
         sufficient to catch the lack of support in nested KVM setups, so they
         are extended to cover that case.  From Dirk Brandewie.
      
       - NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our
         ACPI battery driver needs a quirk for it.  From Lan Tianyu.
      
       - The tpm_ppi driver sometimes leaks memory allocated by
         acpi_get_name().  Fix from Jiang Liu.
      
      * tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        intel_idle: close avn_cstates array with correct marker
        Revert "intel_idle: mark states tables with __initdata tag"
        ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
        intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
        ACPI / TPM: fix memory leak when walking ACPI namespace
        ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY
    • Merge tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes · c43a5eb2
      Linus Torvalds authored
      Pull MFD fix from Samuel Ortiz:
       "This is the 2nd MFD pull request for 3.13
      
        It only contains one fix for the rtsx_pcr driver.  Without it we see a
        kernel panic on some machines, when resuming from suspend to RAM"
      
      * tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
        mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
    • leds: lp5521/5523: Remove duplicate mutex · e70988d1
      Milo Kim authored
      The duplicate mutex can be a problem when a pattern is loaded via the
      firmware interface: the LP55xx common driver has already locked the
      mutex in 'lp55xx_firmware_loaded()', so the second lock should be deleted.

      On the other hand, locks are required in store_engine_load()
      on updating program memory.
      Reported-by: Pali Rohár <pali.rohar@gmail.com>
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Milo Kim <milo.kim@ti.com>
      Signed-off-by: Bryan Wu <cooloney@gmail.com>
      Cc: <stable@vger.kernel.org>
    • xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK() · 1f4a63bf
      Chuansheng Liu authored
      When CONFIG_DEBUG_OBJECTS_WORK is defined, destroy_work_on_stack(),
      which frees the debug object, must be called to pair with
      INIT_WORK_ONSTACK().
      Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>

      (cherry picked from commit 6f96b306)
    • xfs: fix off-by-one error in xfs_attr3_rmt_verify · bba719b5
      Jie Liu authored
      With CRC checking enabled, setting an attribute value whose size is
      exactly XATTR_SIZE_MAX makes the v3 remote attr write verification
      fail, which yields a back trace like the one below:
      
      <snip>
      XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c
      <snip>
      Call Trace:
      [<ffffffff816f0042>] dump_stack+0x45/0x56
      [<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs]
      [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs]
      [<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs]
      [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460
      [<ffffffff81097230>] ? wake_up_state+0x20/0x20
      [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs]
      [<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs]
      [<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs]
      [<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs]
      [<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs]
      [<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs]
      [<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs]
      [<ffffffff811df9b2>] generic_setxattr+0x62/0x80
      [<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0
      [<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10
      [<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0
      [<ffffffff811e054e>] setxattr+0x12e/0x1c0
      [<ffffffff811c6e82>] ? final_putname+0x22/0x50
      [<ffffffff811c708b>] ? putname+0x2b/0x40
      [<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90
      [<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0
      [<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0
      [<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0
      [<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f
      
      Tests:
          setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile
      
      This patch fixes it by rejecting the remote EA size only when it is
      greater than XATTR_SIZE_MAX rather than greater than or equal to it,
      because an EA value size equal to the limit is valid as per the VFS
      setxattr interface.
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>

      (cherry picked from commit 85dd0707)
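Editorial sketch (not part of the commit): the off-by-one described in the xfs message comes down to using "greater than or equal" where "greater than" was intended. The Python model below shows only that comparison. The function names are hypothetical, and the real verifier checks more than the total size; XATTR_SIZE_MAX is 65536, matching the 65536-byte value used in the test command.

```python
XATTR_SIZE_MAX = 65536  # VFS limit on an extended attribute value, in bytes


def size_ok_before_fix(value_len):
    """Pre-fix behaviour: a value of exactly XATTR_SIZE_MAX bytes is rejected."""
    return not (value_len >= XATTR_SIZE_MAX)


def size_ok_after_fix(value_len):
    """Post-fix behaviour: only values larger than XATTR_SIZE_MAX are rejected."""
    return not (value_len > XATTR_SIZE_MAX)


for n in (65535, 65536, 65537):
    print(n, size_ok_before_fix(n), size_ok_after_fix(n))
```

Only the boundary value changes verdict between the two versions, which is why the bug needed an attribute of exactly the maximum size to trigger.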
    • qlcnic: Fix ethtool statistics length calculation · d6e9c89a
      Shahed Shaikh authored
      o Consider number of Tx queues while calculating the length of
        Tx statistics as part of ethtool stats.
      o Calculate statistics length properly for 82xx and 83xx adapter
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qlcnic: Fix bug in TX statistics · 1ac6762a
      Manish Chopra authored
      o Driver was not updating TX stats so it was not populating
        statistics in `ifconfig` command output.
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang authored
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit(). This
      causes several issues:

      - NETIF_F_LLTX was removed for macvlan, so the txq lock is taken for macvlan
        instead of the lower device, which misses the necessary txq synchronization
        for the lower device, such as txq stopping or freezing required by the dev
        watchdog or control path.
      - dev_hard_start_xmit() was called with a NULL txq, which bypasses the net
        device watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which will lead to a
        crash when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new param for .ndo_select_queue() just for
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() is also
      extended to accept this parameter and dev_queue_xmit_accel() is used to do l2
      forwarding transmission.

      With these fixes, NETIF_F_LLTX can be preserved for macvlan and there's no need
      to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
      a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
      dev_queue_xmit() to do the transmission.

      In the future, this will also be required for macvtap l2 forwarding support
      since it provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: forbid L2 fowarding offload for macvtap · b13ba1b8
      Jason Wang authored
      L2 forwarding offload will bypass the rx handler of the real device, so
      packets cannot be forwarded to the macvtap device. Another problem is that the
      dev_hard_start_xmit() called for macvtap does not have any synchronization.

      Fix this by forbidding L2 forwarding for macvtap.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · c4d70998
      David S. Miller authored
      John W. Linville says:
      
      ====================
      For the mac80211 bits, Johannes says:
      
      "I have a fix from Javier for mac80211_hwsim when used with wmediumd
      userspace, and a fix from Felix for buffering in AP mode."
      
      For the NFC bits, Samuel says:
      
      "This pull request only contains one fix for a regression introduced with
      commit e29a9e2a. Without this fix, we can not establish a p2p link
      in target mode. Only initiator mode works."
      
      For the iwlwifi bits, Emmanuel says:

      "It only includes new device IDs so it's not vital. If you have a pull
      request to net.git anyway, I'd be happy to have this in."
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: fix DMA unmapping of TSO split BDs · 95e92fd4
      Michal Schmidt authored
      bnx2x triggers warnings with CONFIG_DMA_API_DEBUG=y:
      
        WARNING: CPU: 0 PID: 2253 at lib/dma-debug.c:887 check_unmap+0xf8/0x920()
        bnx2x 0000:28:00.0: DMA-API: device driver frees DMA memory with
        different size [device address=0x00000000da2b389e] [map size=1490 bytes]
        [unmap size=66 bytes]
      
      The reason is that bnx2x splits a TSO BD into two BDs (headers + data)
      using one DMA mapping for both, but it uses only the length of the first
      BD when unmapping.
      
      This patch fixes the bug by unmapping the whole length of the two BDs.
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
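Editorial sketch (not part of the commit): the bnx2x warning arises because one DMA mapping covers both BDs of a split TSO buffer, yet only the first BD's length is used at unmap time. The Python model below reproduces that mismatch with a tiny stand-in for a DMA debug checker. All class and function names are hypothetical; the 66-byte header length and 1490-byte mapping come from the quoted warning, while the 1424-byte data length is inferred as their difference.

```python
class DmaDebug:
    """Minimal stand-in for a DMA-API debug checker: tracks mapping sizes."""

    def __init__(self):
        self.mapped = {}

    def map(self, addr, size):
        self.mapped[addr] = size

    def unmap(self, addr, size):
        mapped_size = self.mapped.pop(addr)
        if mapped_size != size:
            raise ValueError(
                f"unmap size {size} bytes differs from map size {mapped_size} bytes"
            )


def unmap_split_tso(dbg, addr, hdr_len, data_len, fixed=True):
    """Unmap a TSO buffer split into a header BD and a data BD sharing one mapping."""
    # Fixed behaviour: unmap the whole length of both BDs.
    # Buggy behaviour: unmap only the length of the first (header) BD.
    length = hdr_len + data_len if fixed else hdr_len
    dbg.unmap(addr, length)


dbg = DmaDebug()
dbg.map(0xDA2B389E, 1490)
unmap_split_tso(dbg, 0xDA2B389E, hdr_len=66, data_len=1424)
print("unmapped cleanly")
```

Calling the helper with `fixed=False` on a fresh mapping raises the size-mismatch error, the analogue of the check_unmap() warning shown in the message.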
    • Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux · 21e20e22
      Linus Torvalds authored
      Pull clock fixes from Mike Turquette:
       "Late fixes for clock drivers.  All of these fixes are for user-visible
        regressions, typically boot failures or other unsafe system
        configuration that causes badness"
      
      * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
        clk: clk-divider: fix divisor > 255 bug
        clk: exynos: File scope reg_save array should depend on PM_SLEEP
        clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock
        ARM: dts: exynos5250: Fix MDMA0 clock number
        clk: samsung: exynos5250: Add MDMA0 clocks
        clk: samsung: exynos5250: Fix ACP gate register offset
        clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks
        clk: samsung: exynos4: Correct SRC_MFC register
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2aa63ce0
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few fixes for Renesas platforms to fixup DMA masks (this started
        causing errors once the DMA API added checks for valid masks in 3.13)"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: shmobile: mackerel: Fix coherent DMA mask
        ARM: shmobile: kzm9g: Fix coherent DMA mask
        ARM: shmobile: armadillo: Fix coherent DMA mask
    • ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME · 07edd741
      Hannes Frederic Sowa authored
      In the past the IFA_PERMANENT flag indicated that the valid and preferred
      lifetimes were ignored. Since change fad8da3e ("ipv6 addrconf: fix
      preferred lifetime state-changing behavior while valid_lft is infinity")
      we honour at least the preferred lifetime on those addresses. As such
      the valid lifetime gets recalculated and updated to 0.

      If the loopback address is added manually this problem does not occur.
      Also if NetworkManager manages IPv6, those addresses will get added via
      inet6_rtm_newaddr and thus will have a correct lifetime, too.
      Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com>
      Reported-by: Damien Wyart <damien.wyart@gmail.com>
      Fixes: fad8da3e ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity")
      Cc: Yasushi Asano <yasushi.asano@jp.fujitsu.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Yuval Mintz's avatar
      bnx2x: prevent WARN during driver unload · 9a2620c8
      Yuval Mintz authored
      Starting with commit 80c33ddd "net: add might_sleep() call to napi_disable"
      bnx2x fails the might_sleep tests causing a stack trace to appear whenever
      the driver is unloaded, as local_bh_disable() is being called before
      napi_disable().
      
      This changes the locking semantics related to CONFIG_NET_RX_BUSY_POLL,
      removing the need to call local_bh_disable() and thus eliminating
      the issue.
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a2620c8
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpuidle' · 13de22c5
      Rafael J. Wysocki authored
      * pm-cpuidle:
        intel_idle: close avn_cstates array with correct marker
        Revert "intel_idle: mark states tables with __initdata tag"
      13de22c5
    • Jiang Liu's avatar
      intel_idle: close avn_cstates array with correct marker · 88390996
      Jiang Liu authored
      Close avn_cstates array with correct marker to avoid overflow
      in function intel_idle_cpu_init().
      
      [rjw: The problem was introduced when commit 22e580d0 was merged
       on top of eba682a5 (intel_idle: shrink states tables).]
      
      Fixes: 22e580d0 (intel_idle: Fixed C6 state on Avoton/Rangeley processors)
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      88390996
  4. 09 Jan, 2014 4 commits
    • John W. Linville's avatar
      Merge branch 'master' of... · 0f74d82d
      John W. Linville authored
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
      0f74d82d
    • Jiang Liu's avatar
      Revert "intel_idle: mark states tables with __initdata tag" · ba0dc81e
      Jiang Liu authored
      This reverts commit 9d046ccb.
      
      Commit 9d046ccb marks all state tables with __initdata, but
      the state tables may be accessed when bringing a CPU online, which then
      causes a system crash as shown below:
      
      [  204.188841] BUG: unable to handle kernel paging request at ffffffff8227cce8
      [  204.196844] IP: [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130
      [  204.203996] PGD 1e11067 PUD 1e12063 PMD 455859063 PTE 800000000227c062
      [  204.211638] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  204.216975] Modules linked in: x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd gpio_ich microcode joydev sb_edac edac_core ipmi_si lpc_ich ipmi_msghandler lp tpm_tis parport wmi mac_hid acpi_pad hid_generic ixgbe isci usbhid dca hid libsas ptp ahci libahci scsi_transport_sas megaraid_sas pps_core mdio
      [  204.262815] CPU: 11 PID: 1489 Comm: bash Not tainted 3.13.0-rc7+ #48
      [  204.269993] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.L09.1312061514 12/06/2013
      [  204.281646] task: ffff8804303a24a0 ti: ffff880440fac000 task.ti: ffff880440fac000
      [  204.290311] RIP: 0010:[<ffffffff814aa1c0>]  [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130
      [  204.300184] RSP: 0018:ffff880440fadd28  EFLAGS: 00010286
      [  204.306192] RAX: ffffffff8227cca0 RBX: ffffe8fff1a03400 RCX: 0000000000000007
      [  204.314244] RDX: ffff88045f400000 RSI: 0000000000000009 RDI: 0000000000001120
      [  204.322296] RBP: ffff880440fadd38 R08: 0000000000000000 R09: 0000000000000001
      [  204.330411] R10: 0000000000000001 R11: 0000000000000000 R12: 000000000000001e
      [  204.338482] R13: 00000000ffffffdb R14: 0000000000000001 R15: 0000000000000000
      [  204.346743] FS:  00007f64f7b0c740(0000) GS:ffff88045ce00000(0000) knlGS:0000000000000000
      [  204.355919] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  204.362449] CR2: ffffffff8227cce8 CR3: 0000000444ab0000 CR4: 00000000001407e0
      [  204.370520] Stack:
      [  204.372853]  000000000000001e ffffffff81f10240 ffff880440fadd50 ffffffff814aa307
      [  204.381519]  ffffffff81ea80e0 ffff880440fadda0 ffffffff8185a230 0000000000000000
      [  204.390196]  000000000000001e 0000000000000002 0000000000000002 0000000000000000
      [  204.398856] Call Trace:
      [  204.401683]  [<ffffffff814aa307>] cpu_hotplug_notify+0x57/0x70
      [  204.408638]  [<ffffffff8185a230>] notifier_call_chain+0x100/0x150
      [  204.415553]  [<ffffffff810a7dae>] __raw_notifier_call_chain+0xe/0x10
      [  204.422772]  [<ffffffff81072163>] cpu_notify+0x23/0x50
      [  204.428616]  [<ffffffff810723b2>] _cpu_up+0x132/0x1a0
      [  204.434361]  [<ffffffff8107249d>] cpu_up+0x7d/0xa0
      [  204.439819]  [<ffffffff81836c9c>] cpu_subsys_online+0x3c/0x90
      [  204.446345]  [<ffffffff81554625>] device_online+0x45/0xa0
      [  204.452471]  [<ffffffff815546ce>] online_store+0x4e/0x80
      [  204.458511]  [<ffffffff815519a8>] dev_attr_store+0x18/0x30
      [  204.464744]  [<ffffffff812a68f1>] sysfs_write_file+0x151/0x1c0
      [  204.471681]  [<ffffffff81217ef1>] vfs_write+0xe1/0x160
      [  204.477524]  [<ffffffff8121889c>] SyS_write+0x4c/0x90
      [  204.483270]  [<ffffffff8185f2ed>] system_call_fastpath+0x1a/0x1f
      [  204.490081] Code: 41 54 41 89 fc 8b 3d 48 25 85 01 53 48 8b 1d 30 25 85 01 48 03 1c c5 40 90 fb 81 48 8b 05 19 25 85 01 c7 43 0c 01 00 00 00 66 90 <48> 83 78 48 00 74 4f 41 83 c0 01 41 39 f0 7e 10 48 c7 c7 38 79
      [  204.515723] RIP  [<ffffffff814aa1c0>] intel_idle_cpu_init+0x40/0x130
      [  204.522996]  RSP <ffff880440fadd28>
      [  204.526976] CR2: ffffffff8227cce8
      [  204.530766] ---[ end trace 336f56cc3d1cfc8c ]---
      
      Fixes: 9d046ccb (intel_idle: mark states tables with __initdata tag)
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      ba0dc81e
    • Linus Torvalds's avatar
      Merge branch 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 7d1c153a
      Linus Torvalds authored
      Pull parisc fix from Helge Deller:
       "This patch fixes the kmap/kunmap implementation on parisc and finally
        makes AIO work on parisc"
      
      * 'parisc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Ensure full cache coherency for kmap/kunmap
      7d1c153a
    • Linus Torvalds's avatar
      Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · f8829150
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Late fixes for libata.  Nothing too interesting.  Adding missing PM
        callbacks to sata_sis and an additional PCI ID for ahci"
      
      * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        sata_sis: missing PM support
        ahci: add PCI ID for Marvell 88SE9170 SATA controller
      f8829150
  5. 08 Jan, 2014 7 commits
  6. 07 Jan, 2014 1 commit