Commits · 61db26c6454b5a0bd74ec23968b568e38ea8321a · nexedi / linux

14 Feb, 2013 22 commits

gianfar: gfar_process_frame returns void · 61db26c6

Claudiu Manoil authored Feb 14, 2013

No return code is expected from gfar_process_frame(), hence
change it to return void.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: GRO_DROP is unlikely · bd9e89f2

Claudiu Manoil authored Feb 14, 2013

The change is significant since it affects the rx hot path.
Paul observed and documented the effects at asm level, see
below:

"It turns out that it does make a difference, since gfar_process_frame
gets inlined, and so the increment code gets moved out of line (I have
marked the if statement with * and the increment code within "-----"):

  ------------------------- as is currently ------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       40 9e 00 1c     bne-    cr7,4d40 <gfar_clean_rx_ring+0x130>
        ----------------------------
     4d28:       81 3c 01 f8     lwz     r9,504(r28)
     4d2c:       81 5c 01 fc     lwz     r10,508(r28)
     4d30:       31 4a 00 01     addic   r10,r10,1
     4d34:       7d 29 01 94     addze   r9,r9
     4d38:       91 3c 01 f8     stw     r9,504(r28)
     4d3c:       91 5c 01 fc     stw     r10,508(r28)
        ----------------------------
     4d40:       a0 1f 00 24     lhz     r0,36(r31)
     4d44:       81 3f 00 00     lwz     r9,0(r31)
     4d48:       7f a4 eb 78     mr      r4,r29
     4d4c:       7f e3 fb 78     mr      r3,r31

  -------------------------- unlikely ------------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       41 9e 03 94     beq-    cr7,50b8 <gfar_clean_rx_ring+0x4a8>
     4d28:       a0 1f 00 24     lhz     r0,36(r31)
     4d2c:       81 3f 00 00     lwz     r9,0(r31)
     4d30:       7f a4 eb 78     mr      r4,r29
     4d34:       7f e3 fb 78     mr      r3,r31
[...]
     50b8:       81 3c 01 f8     lwz     r9,504(r28)
     50bc:       81 5c 01 fc     lwz     r10,508(r28)
     50c0:       31 4a 00 01     addic   r10,r10,1
     50c4:       7d 29 01 94     addze   r9,r9
     50c8:       91 3c 01 f8     stw     r9,504(r28)
     50cc:       91 5c 01 fc     stw     r10,508(r28)
     50d0:       4b ff fc 58     b       4d28 <gfar_clean_rx_ring+0x118>

So, the increment does actually get moved ~1k away."

Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
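
For illustration only, a minimal userspace sketch of the kind of branch hint discussed above. The macro mirrors the kernel's unlikely() helper; the enum, struct and function names are hypothetical and are not taken from the gianfar driver. The value 4 is chosen only to match the "cmpwi cr7,r3,4" in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's unlikely() hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical result codes; 4 mirrors the compare in the asm listing. */
enum demo_rx_result { DEMO_RX_OK = 0, DEMO_RX_DROP = 4 };

struct demo_rx_stats {
	uint64_t rx_dropped;	/* hypothetical drop counter */
};

/*
 * Account for one received frame. Marking the drop test as unlikely()
 * tells the compiler it may move the counter update out of the
 * straight-line path, as seen in the "unlikely" listing.
 */
static void demo_account_rx(struct demo_rx_stats *stats,
			    enum demo_rx_result ret)
{
	assert(stats != NULL);
	if (unlikely(ret == DEMO_RX_DROP))
		stats->rx_dropped++;
}
```

The hint changes code placement only; the counter behaves the same with or without it.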

gianfar: Cleanup and optimize struct gfar_private · b597d20d

Claudiu Manoil authored Feb 14, 2013

Group run-time critical fields within the 1st cacheline (32B)
followed by the tx|rx_queue reference arrays and the interrupt
group instances (gfargrp), all cacheline aligned.

This has several benefits. Firstly comes the performance benefit
by having the members required by the driver's hot path re-grouped
in the structure's first cache lines, whereas the unimportant
members were pushed towards the end of the struct.
Another benefit comes from eliminating a 24 byte memory hole that
was rendering gfar_priv's 2nd cacheline useless. The default gcc
layout of gfar_private leaves an implicit 24 byte hole after the
errata (enum) member. This patch fixes it.

The uchar bitfields were pushed towards the end of the struct
as these are not run-time performance critical (used for init
time operations). Because there is no other 2 byte member
around to couple the uchar bitfields member with, we will
have an additional 2 byte hole after the bitfields. This is
insignificant however, and it doesn't influence gfar_priv's
size, because the whole structure is padded to be a 32B multiple.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
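
For illustration only, a sketch of the layout idea described above, checked at compile time. The struct and all field names are hypothetical and far smaller than the real gfar_private; only the 32B cacheline size comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 32	/* cacheline size stated in the commit message */

/* Hypothetical private struct; field names are illustrative only. */
struct demo_priv {
	/* hot-path fields, grouped at the start of the struct */
	void *dev;
	void *ndev;
	uint32_t rx_buffer_size;
	uint32_t num_rx_queues;
	uint32_t num_tx_queues;
	uint32_t errata;

	/* queue reference arrays, each starting on a cacheline boundary */
	void *tx_queue[8] __attribute__((aligned(DEMO_CACHELINE)));
	void *rx_queue[8] __attribute__((aligned(DEMO_CACHELINE)));

	/* init-time-only flags, pushed to the end */
	unsigned char bd_stash_en:1;
	unsigned char rx_filer_enable:1;
} __attribute__((aligned(DEMO_CACHELINE)));

static_assert(offsetof(struct demo_priv, errata) + sizeof(uint32_t)
	      <= DEMO_CACHELINE,
	      "hot fields must fit in the first cacheline");
static_assert(offsetof(struct demo_priv, tx_queue) % DEMO_CACHELINE == 0,
	      "tx_queue must be cacheline aligned");
static_assert(sizeof(struct demo_priv) % DEMO_CACHELINE == 0,
	      "struct size must be a cacheline multiple");
```

Compile-time checks like these stop a later field addition from silently pushing a hot member out of the first cacheline.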

gianfar: Add device ref (dev) in gfar_private · 369ec162

Claudiu Manoil authored Feb 14, 2013

Use device pointer (dev) to simplify the code and to
avoid double indirections, especially on the hot path.

Basically, instead of accessing priv to get the ofdev
reference and then accessing the ofdev structure to
dereference the needed dev pointer, we will get the
dev pointer directly from priv.

The dev pointer is required on the hot path, see gfar_new_rxbdp
or gfar_clean_rx_ring (or xmit), and this patch makes
it available directly from priv's 1st cacheline.

This change is reflected at asm level too, taking (the hot)
gfar_new_rxbdp():
initial version -
    18c0:	7c 7e 1b 78 	mr      r30,r3

    18d0:	81 69 04 3c 	lwz     r11,1084(r9)

    18d8:	34 6b 00 10 	addic.  r3,r11,16
    18dc:	41 82 00 08 	beq-    18e4

patched version -
    18d0:	80 69 04 38 	lwz     r3,1080(r9)

    18d8:	2f 83 00 00 	cmpwi   cr7,r3,0
    18dc:	41 9e 00 08 	beq-    cr7,18e4
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
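
For illustration only, a userspace sketch of the two access patterns compared above. The struct definitions are simplified stand-ins, not the kernel's, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct device { int id; };
struct platform_device { int pad[4]; struct device dev; };

/* Before: priv only holds the platform device. */
struct demo_priv_before { struct platform_device *ofdev; };

/* After: priv caches the device pointer directly. */
struct demo_priv_after { struct device *dev; };

/* Load ofdev, then add the offset of the embedded dev member. */
static struct device *demo_dev_before(const struct demo_priv_before *priv)
{
	assert(priv != NULL);
	return &priv->ofdev->dev;
}

/* A single load from priv. */
static struct device *demo_dev_after(const struct demo_priv_after *priv)
{
	assert(priv != NULL);
	return priv->dev;
}
```

Both functions return the same pointer; the second gets it without the extra address computation, which corresponds to the missing "addic." in the patched listing.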

gianfar: Remove unused device_node ref in gfar_private · 41a20609

Claudiu Manoil authored Feb 14, 2013

Remove unused device node pointer.
Remove duplicated SET_NETDEV_DEV().
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · e0376d00

David S. Miller authored Feb 14, 2013

Steffen Klassert says:

====================
1) Remove a duplicated call to skb_orphan() in pf_key, from Cong Wang.

2) Prepare xfrm and pf_key for algorithms without pf_key support,
   from Jussi Kivilinna.

3) Fix an unbalanced lock in xfrm_output_one(), from Li RongQing.

4) Add an IPsec state resolution packet queue to handle
   packets that are sent before the states are resolved.

5) xfrm4_policy_fini() is unused since 2.6.11, time to remove it.
   From Michal Kubecek.

6) The xfrm gc threshold was configurable just in the initial
   namespace, make it configurable in all namespaces. From
   Michal Kubecek.

7) We currently can not insert policies with mark and mask
   such that some flows would be matched from both policies.
   Allow this if the priorities of these policies are different,
   the one with the higher priority is used in this case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: make ifla_br_policy and br_af_ops static · 15004cab

Cong Wang authored Feb 13, 2013

They are only used within this file.

Cc: Vlad Yasevich <vyasevic@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: add read of interrupt mask after disabling interrupts · 4160815f

Nathan Hintz authored Feb 13, 2013

The specs prescribe an immediate read of the interrupt mask after
disabling interrupts.  This patch updates the driver to match the
specs.
Signed-off-by: Nathan Hintz <nlhintz@hotmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: use __u16 in if_bridge.h · 9f89ec82

Cong Wang authored Feb 14, 2013

We should use "__u16" instead of "u16" in the user-space visible
header.

Cc: Vlad Yasevich <vyasevic@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bridge_vlan' · 93197b13

David S. Miller authored Feb 13, 2013

Vlad Yasevich says:

====================
VLAN filtering/VLAN aware bridge

Changes since v10:
* Updated implementation of ndo_fdb_del in emulex and qlogic drivers.

Changes since v9:
* series re-ordering to make functionality more distinct.  Basic vlan
  filtering is patches 1-4.  Support for PVID/untagged vlans is patches
  5 and 6.  VLAN support for FDB/MDB is patches 7-11.  Patch 12 is
  still additional egress policy.
* Slight simplification to code that extracts the VID from skb.  Since we
  now depend on the vlan module, at the time of input skb_tci is guaranteed
  to be set if the packet had 8021q header.  We can simply refer to it.
* Changed the opaque 'parent' pointer from prior patches to a union so we
  can be much more explicit in our assignments.
* Lots of additional testing with STP turned on.  No issues were observed.

Changes since v8:
* Unified vlans_to_* calls into a single interface
* Fixed the rest of the issues reported by Michal Miroslaw
* Fixed a bug where fdb entries were not created for all added vlans.

Changes since v7:
* Rebased on the latest net-next and removed the vlan wrapper patch from
the series.
* Fixed a crash in br_fdb_add/br_fdb_delete.

Changes since v6:
* VLANs are now stored in a VLAN bitmap per port.  This allows for O(1)
lookup at ingress and egress.  We simply check to see if the bit associated
with the vlan id is set in the map.  The drawback to this approach is that
it wastes some space when there is only a small number of VLANs.
* In addition to the build time configuration option, VLAN filtering also has
a configuration parameter in sysfs.  By default the filtering is turned off
and all traffic is permitted.  When the filtering is turned on, we do strict
matching to the filter configured.  Thus, if there is no configuration, all
packets are rejected.  This was done to make the behavior more
straightforward.  Without this (and if egress policy patch is rejected), the
decision for how to forward untagged traffic that was not filtered at ingress
is almost impossible to make.  It would not be right to deliver to every
port that has PVID set, as each port may have a different PVID.
* Separate egress policy bitmap patch has been isolated and is provided last
in the series.  This has been a more contentious piece of functionality and I
wanted to isolate it so that it could easily be dropped and not block the whole
series.

Changes since v5:
 - Pulled VLAN filtering into its own file and made it a configuration option.
 - Made new vlan filtering option dependent on VLAN_8021Q.
 - Got rid of HW filter inlines and moved them to vlan_core.c.
   (All of the above suggested by Stephen Hemminger)

Changes since v4:
 - Pull per-port vlan data into its own structures and give it to the bridge
   device thus making bridge device behave like a regular port for vlan
   configuration.
 - Add a per-vlan 'untagged' bitmap that determines egress policy.  If a port
   is part of this bitmap, traffic egresses untagged.
 - PVID is now used for ingress policy only.  Incoming frames without VLAN tag
   are assigned to the PVID vlan.  Egress is determined via bitmap memberships.
 - Allow for incremental config of a vlan.  Now, PVID and untagged memberships
   may be set on existing vlans.  They however can NOT be cleared separately.
 - VLAN deletion is now done via RTM_DELLINK command for PF_BRIDGE family.
   This cleans up the netlink interface.

Changes since v3:
 - Re-integrated compiler problems that got left out last time.  Apologies.
 - checkpatch.pl errors fixed

Changes since v2:
 - Added inline functions to manipulate vlan hw filters and re-use in 8021q
   and bridge code.
 - Use rtnl_dereference (Michael Tsirkin)
 - Remove synchronize_net() call (Eric Dumazet)
 - Fix NULL ptr deref bug I introduced in br_ifinfo_notify.

Changes since v1:
 - Fixed some forwarding bugs.
 - Add vlan to local fdb entries.  New local entries are created per vlan
   to facilitate correct forwarding to bridge interface.
 - Allow configuration of vlans directly on the bridge master device
   in addition to ports.

Changes since rfc v2:
 - Per-port vlan bitmap is gone and is replaced with a vlan list.
 - Added bridge vlan list, which is referenced by each port.  Entries in
   the bridge vlan list have port bitmap that shows which ports are part
   of which vlan.
 - Netlink API changes.
 - Dropped sysfs support for now.  If people think this is really useful,
   can add it back.
 - Support for native/untagged vlans.

Changes since rfc v1:
 - Comments addressed regarding formatting and RCU usage
 - ioctls have been removed and changed over to the netlink interface.
 - Added support of user added ndb entries.
 - changed sysfs interface to export a bitmap.  Also added a write interface.
   I am not sure how much I like it, but it made my testing easier/faster.  I
   might change the write interface to take text instead of binary.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Separate egress policy bitmap · 35e03f3a

Vlad Yasevich authored Feb 13, 2013

Add an ability to configure a separate "untagged" egress
policy to the VLAN information of the bridge.  This supersedes PVID
policy and makes PVID ingress-only.  The policy is configured with a
new flag and is represented as a port bitmap per vlan.  Egress frames
with a VLAN id in "untagged" policy bitmap would egress
the port without VLAN header.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add vlan support for local fdb entries · bc9a25d2

Vlad Yasevich authored Feb 13, 2013

When VLAN is added to the port, a local fdb entry for that port
(the entry with the mac address of the port) is added for that
VLAN.  This way we can correctly determine if the traffic
is for the bridge itself.  If the address of the port changes,
we try to change all the local fdb entries we have for that port.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add vlan support to static neighbors · 1690be63

Vlad Yasevich authored Feb 13, 2013

When a user adds bridge neighbors, allow him to specify VLAN id.
If the VLAN id is not specified, the neighbor will be added
for VLANs currently in the port's filter list.  If no VLANs are
configured on the port, we use vlan 0 and only add 1 entry.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add vlan id to multicast groups · b0e9a30d

Vlad Yasevich authored Feb 13, 2013

Add vlan_id to multicast groups so that we know which vlan
each group belongs to and can correctly forward to appropriate vlan.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add vlan to unicast fdb entries · 2ba071ec

Vlad Yasevich authored Feb 13, 2013

This patch adds vlan to unicast fdb entries that are created for
learned addresses (not the manually configured ones).  It adds
vlan id into the hash mix and uses vlan as an additional parameter
for an entry match.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add the ability to configure pvid · 552406c4

Vlad Yasevich authored Feb 13, 2013

A user may designate a certain vlan as PVID.  This means that
any ingress frame that does not contain a vlan tag is assigned to
this vlan and any forwarding decisions are made with this vlan in mind.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Implement vlan ingress/egress policy with PVID. · 78851988

Vlad Yasevich authored Feb 13, 2013

At ingress, any untagged traffic is assigned to the PVID.
Any tagged traffic is filtered according to membership bitmap.

At egress, if the vlan matches the PVID, the frame is sent
untagged.  Otherwise the frame is sent tagged.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
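
For illustration only, a sketch of the ingress/egress rules stated in this commit message. All names are hypothetical. Two points are assumptions not stated above: a VLAN id of 0 stands for "no tag" or "no PVID configured", and an untagged frame is dropped when no PVID is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_N_VID 4096

struct demo_port {
	uint16_t pvid;			/* 0: no PVID configured (assumption) */
	uint8_t member[DEMO_N_VID / 8];	/* membership bitmap, one bit per VLAN id */
};

static bool demo_is_member(const struct demo_port *p, uint16_t vid)
{
	return vid < DEMO_N_VID && ((p->member[vid / 8] >> (vid % 8)) & 1);
}

/*
 * Ingress: return the VLAN the frame is classified into, or -1 to drop.
 * tagged_vid == 0 means the frame carried no VLAN tag (assumption).
 */
static int demo_ingress_vid(const struct demo_port *p, uint16_t tagged_vid)
{
	assert(p != NULL);
	if (tagged_vid == 0)
		return p->pvid ? p->pvid : -1;
	return demo_is_member(p, tagged_vid) ? tagged_vid : -1;
}

/* Egress: true if the frame leaves the port with a VLAN header. */
static bool demo_egress_tagged(const struct demo_port *p, uint16_t vid)
{
	assert(p != NULL);
	return vid != p->pvid;
}
```

This covers only the PVID-based policy of this commit, not the separate "untagged" egress bitmap added later in the series.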

bridge: Dump vlan information from a bridge port · 6cbdceeb

Vlad Yasevich authored Feb 13, 2013

Using RTM_GETLINK, dump the vlan filter list of a given
bridge port.  The information depends on setting the filter
flag, similar to how nic VF info is dumped.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add netlink interface to configure vlans on bridge ports · 407af329

Vlad Yasevich authored Feb 13, 2013

Add a netlink interface to add and remove vlan configuration on bridge port.
The interface uses the RTM_SETLINK message and encodes the vlan
configuration inside the IFLA_AF_SPEC.  It is possible to include multiple
vlans to either add or remove in a single message.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Verify that a vlan is allowed to egress on given port · 85f46c6b

Vlad Yasevich authored Feb 13, 2013

When bridge forwards a frame, make sure that a frame is allowed
to egress on that port.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Validate that vlan is permitted on ingress · a37b85c9

Vlad Yasevich authored Feb 13, 2013

When a frame arrives on a port or is transmitted by the bridge,
if we have VLANs configured, validate that a given VLAN is allowed
to enter the bridge.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add vlan filtering infrastructure · 243a2e63

Vlad Yasevich authored Feb 13, 2013

Adds an optional infrastructure component to bridge that would allow
native vlan filtering in the bridge.  Each bridge port (as well
as the bridge device) now gets a VLAN bitmap.  Each bit in the bitmap
is associated with a vlan id.  This way if the bit corresponding to
the vid is set in the bitmap then the packet with vid is allowed to
enter and exit the port.

Write access to the bitmap is protected by RTNL and read access
is protected by RCU.

Vlan functionality is disabled by default.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
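
For illustration only, a userspace sketch of a per-port VLAN bitmap with constant-time lookup, as described above. All names are hypothetical, and the RTNL/RCU protection mentioned in the commit message is deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_N_VID	4096	/* 12-bit VLAN id space */
#define DEMO_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS	((DEMO_N_VID + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

struct demo_port_vlans {
	unsigned long bitmap[DEMO_WORDS];	/* one bit per VLAN id */
};

/* Permit a VLAN id on the port. */
static void demo_vlan_allow(struct demo_port_vlans *pv, uint16_t vid)
{
	assert(pv != NULL && vid < DEMO_N_VID);
	pv->bitmap[vid / DEMO_WORD_BITS] |= 1UL << (vid % DEMO_WORD_BITS);
}

/* Remove a VLAN id from the port. */
static void demo_vlan_deny(struct demo_port_vlans *pv, uint16_t vid)
{
	assert(pv != NULL && vid < DEMO_N_VID);
	pv->bitmap[vid / DEMO_WORD_BITS] &= ~(1UL << (vid % DEMO_WORD_BITS));
}

/* O(1) check used at ingress and egress: is the bit for vid set? */
static bool demo_vlan_allowed(const struct demo_port_vlans *pv, uint16_t vid)
{
	assert(pv != NULL);
	if (vid >= DEMO_N_VID)
		return false;
	return (pv->bitmap[vid / DEMO_WORD_BITS] >> (vid % DEMO_WORD_BITS)) & 1UL;
}
```

The bitmap costs 512 bytes per port regardless of how many VLANs are configured, which is the space trade-off the cover letter mentions.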

13 Feb, 2013 15 commits

net: sctp: add build check for sctp_sf_eat_sack_6_2/jsctp_sf_eat_sack · 22222997

Daniel Borkmann authored Feb 13, 2013

In order to avoid any future surprises of kernel panics due to jprobes
function mismatches (as e.g. fixed in 4cb9d6ea: sctp: jsctp_sf_eat_sack:
fix jprobes function signature mismatch), we should check both function
types during build and scream loudly if they do not match. __same_type
resolves to __builtin_types_compatible_p, which is 1 in case both types
are the same and 0 otherwise; qualifiers are ignored. Tested by myself.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
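
For illustration only, a userspace sketch of a build-time signature check built on __builtin_types_compatible_p. The macro is a stand-in for the kernel's __same_type(), static_assert replaces the kernel's build-check macro, and the function names are hypothetical.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __same_type() helper. */
#define demo_same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Hypothetical handler and its probe twin; names are illustrative only. */
static int demo_handler(int event, const void *arg)
{
	return arg ? event : -1;
}

static int demo_probe(int event, const void *arg)
{
	(void)event;
	(void)arg;
	return 0;
}

/* A deliberately different signature, used only to show the mismatch case. */
long demo_other(int event);

/* The build fails here if the two signatures ever drift apart. */
static_assert(demo_same_type(demo_handler, demo_probe),
	      "probe signature must match the probed function");
```

Swapping demo_probe for demo_other in the static_assert makes compilation fail, which is the "scream loudly" behavior the commit message asks for.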

net: sctp: minor: make jsctp_sf_eat_sack static · 1e558174

Daniel Borkmann authored Feb 13, 2013

The function jsctp_sf_eat_sack can be made static, no need to extend
its visibility.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: return error on failed PHY write · 217a55a3

Rafał Miłecki authored Feb 12, 2013

Some callers may want to know if PHY write succeeded. Also make PHY
functions static, they are not exported anywhere.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: remove BUG_ON() in be_mcc_compl_is_new() · 9e9ff4b7

Sathya Perla authored Feb 12, 2013

The current code expects that the last word (with valid bit)
of an MCC compl is DMAed in one shot. This may not be the case.
Remove this assertion.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ti: remove redundant NULL check. · 79876e03

Cyril Roelandt authored Feb 12, 2013

cpdma_chan_destroy() on a NULL pointer is a no-op, so the NULL check in
cpdma_ctlr_destroy() can safely be removed.
Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix possible wrong checksum generation. · c9af6db4

Pravin B Shelar authored Feb 11, 2013

Patch cef401de (net: fix possible wrong checksum
generation) fixed wrong checksum calculation but it broke TSO by
defining new GSO type but not a netdev feature for that type.
net_gso_ok() would not allow hardware checksum/segmentation
offload of such packets without the feature.

The following patch fixes TSO and the wrong checksum. This patch uses
the same logic that Eric Dumazet used. The patch introduces a new flag,
SKBTX_SHARED_FRAG, set if at least one frag can be modified by
the user, but the SKBTX_SHARED_FRAG flag is kept in the skb shared
info tx_flags rather than in gso_type.

tx_flags is better compared to gso_type since we can have an skb with
a shared frag without a gso packet. It does not link SHARED_FRAG to
GSO, so there is no need to define a netdev feature for this.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tcp_tsoffset' · b8fa4100

David S. Miller authored Feb 13, 2013

Andrey Vagin says:

====================
If a TCP socket gets live-migrated from one box to another, the
timestamps (which are typically ON) will get screwed up -- the new
kernel will generate TS values that have nothing to do with what they
were on dump. The solution is to yet again fix the kernel and put a
"timestamp offset" on a socket.

A socket offset is added in places where externally visible tcp
timestamp option is parsed/initialized.

Connections in the SYN_RECV state are not supported, global
tcp_time_stamp is used for them, because repair mode doesn't support
this state. In the future it can be implemented in a similar way as for
TIME_WAIT sockets.

For time-wait sockets the offset is inherited from a proper tcp_sock.

A per-socket offset can be set only for sockets in repair mode.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: send packets with a socket timestamp · ee684b6f

Andrey Vagin authored Feb 11, 2013

A socket timestamp is a sum of the global tcp_time_stamp and
a per-socket offset.

A socket offset is added in places where externally visible
tcp timestamp option is parsed/initialized.

Connections in the SYN_RECV state are not supported, global
tcp_time_stamp is used for them, because repair mode doesn't support
this state. In the future it can be implemented in a similar way
as for TIME_WAIT sockets.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: set and get per-socket timestamp · 93be6ce0

Andrey Vagin authored Feb 11, 2013

A timestamp can be set only if a socket is in repair mode.

This patch adds a new socket option, TCP_TIMESTAMP, which allows getting
and setting the current tcp timestamp.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
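
For illustration only, a userspace sketch of the timestamp arithmetic described in this series: a socket timestamp is the global clock plus a per-socket offset, and the offset may only be changed in repair mode. The struct, the function names and the -1 error value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_sock {
	uint32_t tsoffset;	/* per-socket timestamp offset */
	bool repair;		/* true while the socket is in repair mode */
};

/* Socket timestamp = global timestamp + per-socket offset (mod 2^32). */
static uint32_t demo_sock_timestamp(uint32_t global_ts,
				    const struct demo_sock *sk)
{
	assert(sk != NULL);
	return global_ts + sk->tsoffset;
}

/*
 * Make the socket timestamp read as 'wanted' right now. Allowed only in
 * repair mode; returns 0 on success and -1 (hypothetical) otherwise.
 */
static int demo_sock_set_timestamp(struct demo_sock *sk, uint32_t global_ts,
				   uint32_t wanted)
{
	assert(sk != NULL);
	if (!sk->repair)
		return -1;
	sk->tsoffset = wanted - global_ts;
	return 0;
}
```

Because the arithmetic is unsigned and wraps, the same code works when the restored value is smaller than the new host's global timestamp.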

tcp: adding a per-socket timestamp offset · ceaa1fef

Andrey Vagin authored Feb 11, 2013

This functionality is used for restoring tcp sockets. A tcp timestamp
depends on how long a system has been running, so it differs for each
host. The solution is to set a per-socket offset.

A per-socket offset for a TIME_WAIT socket is inherited from a proper
tcp socket.

tcp_request_sock doesn't have a timestamp offset, because the repair
mode for it is not implemented.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'gfar-ethtool-atomic' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · d0023f82

David S. Miller authored Feb 13, 2013

Paul Gortmaker says:

====================
Eric noticed that the handling of local u64 ethtool counters for
this driver commonly found on Freescale ppc-32 boards was racy.

However, before converting them over to atomic64_t, I noticed
that an internal struct was being used to determine the offsets
for exporting this data into the ethtool buffer, and in doing
so, it assumed that the counters would always be u64.  Rather
than keep this implicit assumption, a simple code cleanup gets
rid of the struct completely, and leaves fewer conversion sites.

The alternative solution would have been to take advantage of
the fact that the counters are all relating to error conditions,
and hence make them internally u32.  In doing so, we'd be assuming
that U32_MAX of any particular error condition is highly unlikely.
This might have made sense if any increments were in a hot path.

Tested with "ethtool -S eth0" on sbc8548 board.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: fix smatch warnings in netpoll core code · 959d5fde

Neil Horman authored Feb 13, 2013

Dan Carpenter contacted me with some notes regarding some smatch warnings in the
netpoll code, some of which I introduced with my recent netpoll locking fixes,
some of which were there prior. Specifically they were:

net-next/net/core/netpoll.c:243 netpoll_poll_dev() warn: inconsistent
returns mutex:&ni->dev_lock: locked (213,217) unlocked (210,243)
net-next/net/core/netpoll.c:706 netpoll_neigh_reply() warn: potential
pointer math issue ('skb_transport_header(send_skb)' is a 128 bit pointer)

This patch corrects the locking imbalance (the first error), and adds some
parentheses to correct the second error. Tested by myself. Applies to net-next.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skbuff: fix compile error in skb_panic() · 99d5851e

James Hogan authored Feb 13, 2013

I get the following build error on next-20130213 due to the following
commit:

commit f05de73b ("skbuff: create
skb_panic() function and its wrappers").

It adds an argument called panic to a function that uses the BUG() macro
which tries to call panic, but the argument masks the panic() function
declaration, resulting in the following error (gcc 4.2.4):

net/core/skbuff.c In function 'skb_panic':
net/core/skbuff.c +126 : error: called object 'panic' is not a function

This is fixed by renaming the argument to msg.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Jean Sacren <sakiwit@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: convert u64 status counters to atomic64_t · 212079df

Paul Gortmaker authored Feb 12, 2013

While looking at some asm dump for an unrelated change, Eric
noticed in the following stats count increment code:

    50b8:       81 3c 01 f8     lwz     r9,504(r28)
    50bc:       81 5c 01 fc     lwz     r10,508(r28)
    50c0:       31 4a 00 01     addic   r10,r10,1
    50c4:       7d 29 01 94     addze   r9,r9
    50c8:       91 3c 01 f8     stw     r9,504(r28)
    50cc:       91 5c 01 fc     stw     r10,508(r28)

that a 64 bit counter was used on ppc-32 without sync
and hence the "ethtool -S" output was racy.

Here we convert all the values to use atomic64_t so that
the output will always be consistent.
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
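
For illustration only, a userspace sketch of an atomic 64 bit counter. It uses C11 <stdatomic.h> as a stand-in for the kernel's atomic64_t, which is a different API; the struct and function names are hypothetical. On some 32 bit targets this may need linking against libatomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats block with a single 64 bit counter. */
struct demo_extra_stats {
	_Atomic uint64_t rx_dropped;
};

/* Increment as one indivisible 64 bit update. */
static void demo_stat_inc(_Atomic uint64_t *counter)
{
	assert(counter != NULL);
	atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}

/* Read both halves consistently, even on a 32 bit CPU. */
static uint64_t demo_stat_read(_Atomic uint64_t *counter)
{
	assert(counter != NULL);
	return atomic_load_explicit(counter, memory_order_relaxed);
}
```

With a plain u64 on a 32 bit CPU, a reader can see the low word after a carry but the high word from before it; the atomic type rules that out.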

gianfar: remove largely unused gfar_stats struct · 68719786

Paul Gortmaker authored Feb 12, 2013

The gfar_stats struct is only used in copying out data
via ethtool.  It is declared as the extra stats, followed
by the rmon stats.  However, the rmon stats are never
actually used in the driver; instead the rmon data
is a u32 register read that is cast directly into the
ethtool buf.

It seems the only reason rmon is in the struct at all is
to give the offset(s) at which it should be exported into
the ethtool buffer.  But note gfar_stats doesn't contain
a gfar_extra_stats as a substruct -- instead it contains
a u64 array of equal element count.  This implicitly means
we have two independent declarations of what gfar_extra_stats
really is.  Rather than have this duality, we already have
defines which give us the offset directly, and hence do not
need the struct at all.

Further, since we know the extra_stats is unconditionally
always present, we can write it out to the ethtool buf
1st, and then optionally write out the rmon data.  There
is no need for two independent loops, both of which are
simply copying out the extra_stats to buf offset zero.

This also helps pave the way towards allowing the extra
stats fields to be converted to atomic64_t values, without
having their types directly influencing the ethtool stats
export code (gfar_fill_stats) that expects to deal with u64.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

12 Feb, 2013 3 commits

act_police: improved accuracy at high rates · c6d14ff1

Jiri Pirko authored Feb 12, 2013

Current act_police uses rate table computed by the "tc" userspace
program, which has the following issue:

The rate table has 256 entries to map packet lengths to token (time
units).  With TSO sized packets, the 256 entry granularity leads to
loss/gain of rate, making the token bucket inaccurate.

Thus, instead of relying on rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.

This is a followup to 56b765b7
("htb: improved accuracy at high rates").
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
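
For illustration only, a userspace sketch of computing transmission time directly instead of looking it up in a 256 entry table. The plain division, the bucket structure and all names are assumptions made for clarity; the kernel implementation may compute this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Time in nanoseconds needed to send 'len' bytes at 'rate' bytes/second. */
static uint64_t demo_len_to_ns(uint32_t len, uint64_t rate)
{
	assert(rate > 0);
	return ((uint64_t)len * DEMO_NSEC_PER_SEC) / rate;
}

/* Hypothetical token bucket whose tokens are nanoseconds of link time. */
struct demo_bucket {
	uint64_t rate;		/* bytes per second */
	int64_t burst_ns;	/* bucket depth */
	int64_t tokens_ns;	/* tokens currently available */
	uint64_t last_ns;	/* time of the last conforming packet */
};

/* Return true and charge the bucket if the packet conforms. */
static bool demo_bucket_conform(struct demo_bucket *b, uint64_t now_ns,
				uint32_t len)
{
	assert(b != NULL);
	int64_t toks = b->tokens_ns + (int64_t)(now_ns - b->last_ns);

	if (toks > b->burst_ns)
		toks = b->burst_ns;
	toks -= (int64_t)demo_len_to_ns(len, b->rate);
	if (toks < 0)
		return false;
	b->tokens_ns = toks;
	b->last_ns = now_ns;
	return true;
}
```

Because the cost is computed per packet, a 64 KB TSO packet is charged its own transmission time rather than the value of the nearest table slot.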

act_police: move struct tcf_police to act_police.c · 0e243218

Jiri Pirko authored Feb 12, 2013

It's not used anywhere else, so move it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tbf: improved accuracy at high rates · b757c933

Jiri Pirko authored Feb 12, 2013

Current TBF uses rate table computed by the "tc" userspace program,
which has the following issue:

The rate table has 256 entries to map packet lengths to
token (time units).  With TSO sized packets, the 256 entry granularity
leads to loss/gain of rate, making the token bucket inaccurate.

Thus, instead of relying on rate table, this patch explicitly computes
the time and accounts for packet transmission times with nanosecond
granularity.

This is a followup to 56b765b7
("htb: improved accuracy at high rates").
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
