1. 25 Aug, 2014 8 commits
    • random32: improvements to prandom_bytes · a98406e2
      Daniel Borkmann authored
      This patch addresses a couple of minor items, mostly concerning
      prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t
      for length arguments, 2) We can use put_unaligned() when filling
      the array instead of open coding it [ perhaps some archs will
      further benefit from their own arch specific implementation when
      GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned
      int as type for getting the arch seed, 5) Make use of
      prandom_u32_max() for timer slack.
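
Item 5 depends on mapping a 32-bit random value into a bounded range. A minimal userspace sketch of one common way to do that, a multiply-and-shift reduction, is shown here. The name `bounded_u32` is hypothetical and this is not copied from the kernel source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a uniformly distributed 32-bit value r
 * into the half-open range [0, bound) without a modulo operation.
 * The 64-bit product r * bound is at most (2^32 - 1) * bound, so its
 * upper 32 bits are always strictly less than bound.
 */
static uint32_t bounded_u32(uint32_t r, uint32_t bound)
{
    return (uint32_t)(((uint64_t)r * bound) >> 32);
}
```

For a timer-slack style use, `bound` would be the maximum slack and `r` the next PRNG output. A bound of 0 always yields 0.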
      
      Regarding the change to put_unaligned(): callers of prandom_bytes(),
      which internally invokes prandom_bytes_state(), are not affected.
      They expect the array to be filled randomly and have no control
      over the internal state whatsoever (that's also why we have
      periodic reseeding there, etc.), so they really don't care.
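
The fill pattern described above can be sketched in userspace as follows. This is an illustration under stated assumptions only: `next_u32` is a stand-in xorshift32 generator (not the kernel's taus113), `fill_bytes` is a hypothetical name, and `memcpy()` stands in for put_unaligned().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in generator (xorshift32); the state must be non-zero. */
static uint32_t next_u32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill buf with len pseudo-random bytes, one 32-bit word at a time. */
static void fill_bytes(uint32_t *state, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len >= sizeof(uint32_t)) {
        uint32_t r = next_u32(state);

        /* Store a whole word regardless of the alignment of p. */
        memcpy(p, &r, sizeof(r));
        p += sizeof(r);
        len -= sizeof(r);
    }

    if (len > 0) {
        /* Tail: hand out the remaining 1..3 bytes from one more word. */
        uint32_t r = next_u32(state);

        while (len-- > 0) {
            *p++ = (uint8_t)r;
            r >>= 8;
        }
    }
}
```

Storing whole words may change which byte lands where compared with an open-coded byte loop, but the same seed still produces the same buffer on the same machine.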
      
      Now for the direct callers of prandom_bytes_state(), which
      are solely located in test cases for MTD devices, that is,
      drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}:
      
      These tests basically fill a test write-vector through
      prandom_bytes_state() with an a-priori defined seed each time
      and write that to an MTD device. Later on, they set up a read-vector
      and read back those blocks from the device. So in the verification
      phase, the write-vector is being re-setup [ so same seed and
      prandom_bytes_state() called ], and then memcmp()'ed against the
      read-vector to check if the data is the same.
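
The write-then-verify scheme described above can be sketched as follows. The names `seeded_fill` and `verify_readback` are hypothetical, and the simple LCG is only a stand-in for prandom_bytes_state(). The sketch shows why these tests depend on the fill being reproducible from the seed, not on any particular byte values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in seeded fill (simple LCG); the same seed gives the same bytes. */
static void seeded_fill(uint32_t seed, uint8_t *buf, size_t len)
{
    uint32_t s = seed;
    size_t i;

    for (i = 0; i < len; i++) {
        s = s * 1664525u + 1013904223u;
        buf[i] = (uint8_t)(s >> 24);
    }
}

/*
 * Re-create the write-vector from the seed in scratch and compare it
 * against the data read back. Returns 0 on match, -1 on mismatch.
 */
static int verify_readback(uint32_t seed, const uint8_t *readbuf,
                           uint8_t *scratch, size_t len)
{
    seeded_fill(seed, scratch, len);
    return memcmp(scratch, readbuf, len) == 0 ? 0 : -1;
}
```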
      
      Akinobu, Lothar and I also tested this patch and it runs through
      the 3 relevant MTD test cases w/o any errors on the nandsim device
      (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53
      and i.MX6):
      
        # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
                           third_id_byte=0x00 fourth_id_byte=0x15
        # modprobe mtd_oobtest dev=0
        # modprobe mtd_pagetest dev=0
        # modprobe mtd_subpagetest dev=0
      
      We also don't have any users depending directly on a particular
      result of the PRNG (except the PRNG self-test itself), and that's
      just fine as it e.g. allowed us easily to do things like upgrading
      from taus88 to taus113.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
      Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'csums-next' · c1e60bd4
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      net: Checksum offload changes - Part V
      
      I am working on overhauling RX checksum offload. Goals of this effort
      are:
      
      - Specify what exactly it means when a driver returns CHECKSUM_UNNECESSARY
      - Preserve CHECKSUM_COMPLETE through encapsulation layers
      - Don't do skb_checksum more than once per packet
      - Unify GRO and non-GRO csum verification as much as possible
      - Unify the checksum functions (checksum_init)
      - Simplify code
      
      What is in this fifth patch set:
      
      - Added GRO checksum validation functions
      - Call the GRO validation functions from TCP and GRE gro_receive
      - Perform checksum verification in the UDP gro_receive path using
        GRO functions and add support for gro_receive in UDP6
      
      Changes in V2:
      
      - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it
        to CHECKSUM_COMPLETE from GRO checksum validation. This avoids
        the performance penalty of checksumming bytes that come before
        the header GRO is currently at.
      
      Please review carefully and test if possible, mucking with basic
      checksum functions is always a little precarious :-)
      
      ----
      
      Test results with this patch set are below. I did not notice any
      performance regression.
      
      Tests run:
         TCP_STREAM: super_netperf with 200 streams
         TCP_RR: super_netperf with 200 streams and -r 1,1
      
      Device bnx2x (10Gbps):
         No GRE RSS hash (RX interrupts occur on one core)
         UDP RSS port hashing enabled.
      
      * GRE with checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              9.91% CPU utilization
              5163.78 Mbps
          TCP_RR
              50.64% CPU utilization
              219/347/502 90/95/99% latencies
              834103 tps
        Without fix:
          TCP_STREAM
              10.05% CPU utilization
              5186.22 Mbps
          TCP_RR
              49.70% CPU utilization
              227/338/486 90/95/99% latencies
              813450 tps
      
      * GRE without checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              10.18% CPU utilization
              5159 Mbps
          TCP_RR
              51.86% CPU utilization
              214/325/471 90/95/99% latencies
              865943 tps
        Without fix:
          TCP_STREAM
              10.26% CPU utilization
              5307.87 Mbps
          TCP_RR
              50.59% CPU utilization
              224/325/476 90/95/99% latencies
              846429 tps
      
      *** Simulate device returns CHECKSUM_COMPLETE
      
      * VXLAN with checksum
        With fix:
          TCP_STREAM
              13.03% CPU utilization
              9093.9 Mbps
          TCP_RR
              95.96% CPU utilization
              161/259/474 90/95/99% latencies
              1.14806e+06 tps
        Without fix:
          TCP_STREAM
              13.59% CPU utilization
              9093.97 Mbps
          TCP_RR
              93.95% CPU utilization
              160/259/484 90/95/99% latencies
              1.10262e+06 tps
      
      * VXLAN without checksum
        With fix:
          TCP_STREAM
              13.28% CPU utilization
              9093.87 Mbps
          TCP_RR
              95.04% CPU utilization
              155/246/439 90/95/99% latencies
              1.15e+06 tps
        Without fix:
          TCP_STREAM
              13.37% CPU utilization
              9178.45 Mbps
          TCP_RR
              93.74% CPU utilization
              161/257/469 90/95/99% latencies
              1.1068e+06 tps
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gre: When GRE csum is present count as encap layer wrt csum · 48a5fc77
      Tom Herbert authored
      In GRE demux, if the GRE checksum is present, pop the rcv encapsulation
      so that any encapsulated checksums are treated as tunnel checksums.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: additional GRO support · 57c67ff4
      Tom Herbert authored
      Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
      for both UDP4 and UDP6 by calling skb_gro_checksum_validate_zero_check.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Call skb_gro_checksum_validate · 149d0774
      Tom Herbert authored
      In tcp4_gro_receive and tcp6_gro_receive, call skb_gro_checksum_validate
      to validate the TCP checksum in the GRO context.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Tom Herbert's avatar
      758f75d1
    • net: add gro_compute_pseudo functions · 1933a785
      Tom Herbert authored
      Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are
      the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo
      for the GRO path. The IP header is taken from skb_gro_network_header.
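
For orientation, the quantity such a helper produces for IPv4 is the ones' complement sum over the pseudo-header (source address, destination address, protocol, and transport length). A minimal userspace sketch follows. `pseudo_sum_v4` is a hypothetical name, the arguments are taken in host byte order for simplicity, and this is not the kernel implementation.

```c
#include <stdint.h>

/*
 * Unfolded ones' complement sum of the IPv4 pseudo-header, treating
 * every field as a sequence of 16-bit words. The result can be used
 * as the starting value when summing the transport header and payload.
 */
static uint32_t pseudo_sum_v4(uint32_t saddr, uint32_t daddr,
                              uint8_t proto, uint16_t len)
{
    uint32_t sum = 0;

    sum += saddr >> 16;      /* source address, high word */
    sum += saddr & 0xffff;   /* source address, low word */
    sum += daddr >> 16;      /* destination address, high word */
    sum += daddr & 0xffff;   /* destination address, low word */
    sum += proto;            /* zero byte followed by the protocol number */
    sum += len;              /* transport header plus payload length */
    return sum;
}
```

In the GRO path the addresses would come from the IP header located through the GRO network header offset. Here they are passed in directly to keep the sketch self-contained.
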
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: skb_gro_checksum_* functions · 573e8fca
      Tom Herbert authored
      Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check,
      skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete.
      These are the cognates of the normal checksum functions but are used
      in the gro_receive path and operate on GRO related fields in sk_buffs.
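
The check these helpers perform can be illustrated in userspace as follows. This is a sketch under assumptions: all names are hypothetical, the data is a plain byte buffer instead of an sk_buff, and `zero_okay` models the assumption that the zero-check variant accepts a zero checksum field as "no checksum present", which UDP over IPv4 allows.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a byte buffer, read as big-endian 16-bit words, to a running sum. */
static uint32_t csum_add_buf(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;   /* pad an odd trailing byte */
    return sum;
}

/* Fold the carries of a 32-bit sum back into 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * pseudo is the unfolded pseudo-header sum, seg the transport header
 * plus payload (checksum field included), check the checksum field.
 * Returns 0 if the segment is acceptable, -1 on checksum failure.
 */
static int validate_segment(uint32_t pseudo, const uint8_t *seg, size_t len,
                            uint16_t check, int zero_okay)
{
    if (zero_okay && check == 0)
        return 0;   /* sender did not compute a checksum */

    /* A correct checksum makes the total fold to all ones. */
    return fold(csum_add_buf(pseudo, seg, len)) == 0xffff ? 0 : -1;
}
```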
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 23 Aug, 2014 32 commits