1. 01 Sep, 2005 14 commits
    • [PATCH] iseries_veth: Add sysfs support for connection structs · 76812d81
      Michael Ellerman authored
      To aid in field debugging, add sysfs support for iseries_veth's connection
      structures. At the moment this is all read-only; however, we could add write
      support for some attributes in future.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
    • [PATCH] iseries_veth: Fix bogus counting of TX errors · db5e8718
      Michael Ellerman authored
      There are a number of problems with the way iseries_veth counts TX errors.
      
      Firstly it counts conditions which aren't really errors as TX errors. This
      includes if we don't have a connection struct for the other LPAR, or if the
      other LPAR is currently down (or just doesn't want to talk to us). Neither
      of these should count as TX errors.
      
      Secondly, it counts one TX error for each LPAR that fails to accept the packet.
      This can lead to TX error counts higher than the total number of packets sent
      through the interface. This is confusing for users.
      
      This patch fixes that behaviour. The non-error conditions are no longer
      counted, and we introduce a new and I think saner meaning to the TX counts.
      
      If a packet is successfully transmitted to any LPAR then it counts as
      transmitted, and tx_packets is incremented by 1.
      
      If there is an error transmitting a packet to any LPAR then that is counted
      as one error, ie. tx_errors is incremented by 1.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
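The counting rule in the commit above can be sketched in userspace C. This is only an illustration under stated assumptions, not the driver's code: `enum tx_result`, `struct tx_stats` and `account_tx()` are hypothetical names, and the sketch assumes the two rules are applied independently, so one packet can bump both counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-LPAR outcome of offering one packet. */
enum tx_result {
    TX_SKIPPED, /* no connection struct, or the other LPAR is down: not an error */
    TX_OK,      /* the LPAR accepted the packet */
    TX_FAILED   /* a genuine transmit error */
};

/* Hypothetical stand-in for the interface counters. */
struct tx_stats {
    unsigned long tx_packets;
    unsigned long tx_errors;
};

/*
 * Account for ONE packet that was offered to n LPARs.
 * Each counter moves by at most 1 per packet, however many
 * LPARs accepted or rejected it.
 */
void account_tx(struct tx_stats *stats, const enum tx_result *results, size_t n)
{
    int any_ok = 0, any_failed = 0;

    assert(stats != NULL && (results != NULL || n == 0));

    for (size_t i = 0; i < n; i++) {
        if (results[i] == TX_OK)
            any_ok = 1;
        else if (results[i] == TX_FAILED)
            any_failed = 1;
        /* TX_SKIPPED is deliberately ignored */
    }

    if (any_ok)
        stats->tx_packets++;
    if (any_failed)
        stats->tx_errors++;
}
```

With this rule tx_errors can never exceed the number of packets passed to the interface, which is the property the commit message is after.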
    • [PATCH] iseries_veth: Simplify full-queue handling · e0808494
      Michael Ellerman authored
      The iseries_veth driver often has multiple netdevices sending packets over
      a single connection to another LPAR. If the bandwidth to the other LPAR is
      exceeded, all the netdevices must have their queues stopped.
      
      The current code achieves this by queueing one incoming skb on the
      per-netdevice port structure. When the connection is able to send more packets
      we iterate through the port structs and flush any packet that is queued,
      as well as restarting the associated netdevice's queue.
      
      This arrangement makes less sense now that we have per-connection TX timers,
      rather than the per-netdevice generic TX timer.
      
      The new code simply detects when one of the connections is full, and stops
      the queue of all associated netdevices. Then when a packet is acked on that
      connection (ie. there is space again) all the queues are woken up.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
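The stop-all, wake-all scheme described in the commit above might look roughly like the following userspace model. All names here (`struct port`, `struct connection`, `conn_send()`, `conn_ack()`, `MAX_PORTS`) are hypothetical, and the boolean `queue_stopped` stands in for the kernel's per-netdevice queue state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 4 /* arbitrary limit for this sketch */

/* Hypothetical stand-in for one netdevice. */
struct port {
    bool queue_stopped;
};

/* Hypothetical connection shared by several netdevices. */
struct connection {
    int in_flight; /* packets sent but not yet acked */
    int capacity;  /* how many un-acked packets the connection can hold */
    struct port *ports[MAX_PORTS];
    size_t nports;
};

static void set_all_queues(struct connection *c, bool stopped)
{
    for (size_t i = 0; i < c->nports; i++)
        c->ports[i]->queue_stopped = stopped;
}

/* Try to send one packet; returns false if the connection is full. */
bool conn_send(struct connection *c)
{
    assert(c != NULL && c->nports <= MAX_PORTS);

    if (c->in_flight >= c->capacity) {
        set_all_queues(c, true);
        return false;
    }
    c->in_flight++;
    if (c->in_flight == c->capacity)
        set_all_queues(c, true); /* full: stop every associated netdevice */
    return true;
}

/* One packet was acked, so there is space again: wake every queue. */
void conn_ack(struct connection *c)
{
    assert(c != NULL);

    if (c->in_flight > 0) {
        c->in_flight--;
        set_all_queues(c, false);
    }
}
```

Note that no skb is parked on the port in this model; the queues are simply stopped until an ack arrives.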
    • [PATCH] iseries_veth: Add a per-connection ack timer · 24562ffa
      Michael Ellerman authored
      Currently the iseries_veth driver contravenes the specification in
      Documentation/networking/driver.txt, in that if packets are not acked by
      the other LPAR they will sit around forever.
      
      This patch adds a per-connection timer which fires if we've had no acks for
      five seconds. This is superior to the generic TX timer because it catches
      the case of a small number of packets being sent and never acked.
      
      This fixes a bug we were seeing on real systems, where some IPv6 neighbour
      discovery packets would not be acked and then prevent the module from being
      removed, due to skbs lying around.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
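A per-connection "no acks for five seconds" check can be modelled without real timers by passing the current time in explicitly. The sketch below is an assumption-laden illustration: the names are hypothetical, time is in whole seconds, and what the driver actually does when its timer fires is not stated in the commit message, so the sketch simply gives up on the outstanding packets.

```c
#include <assert.h>
#include <stddef.h>

#define ACK_TIMEOUT_SECS 5 /* the five seconds mentioned in the commit message */

/* Hypothetical per-connection state. */
struct connection {
    long last_ack; /* time of the most recent ack (or first un-acked send) */
    int pending;   /* packets sent but not yet acked */
};

/* Record that a packet was sent at time 'now'. */
void conn_sent(struct connection *c, long now)
{
    assert(c != NULL);
    if (c->pending == 0)
        c->last_ack = now; /* the clock starts with the first outstanding packet */
    c->pending++;
}

/* Record that one packet was acked at time 'now'. */
void conn_acked(struct connection *c, long now)
{
    assert(c != NULL);
    if (c->pending > 0)
        c->pending--;
    c->last_ack = now;
}

/*
 * Timer body: if packets are outstanding and nothing has been acked for
 * ACK_TIMEOUT_SECS, give up on them. Returns how many were dropped.
 */
int conn_check_timeout(struct connection *c, long now)
{
    int dropped = 0;

    assert(c != NULL);
    if (c->pending > 0 && now - c->last_ack >= ACK_TIMEOUT_SECS) {
        dropped = c->pending;
        c->pending = 0;
    }
    return dropped;
}
```

Because the check depends only on the time since the last ack, it also catches the case of a single packet that is sent and never acknowledged.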
    • [PATCH] iseries_veth: Remove TX timeout code · 48683d72
      Michael Ellerman authored
      The iseries_veth driver uses the generic TX timeout watchdog; however, a better
      solution is in the works, so remove this code.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
    • [PATCH] iseries_veth: Use kobjects to track lifecycle of connection structs · f0c129ca
      Michael Ellerman authored
      The iseries_veth driver can attach to multiple vlans, which correspond to
      multiple net devices. However, there is only one connection between each pair
      of LPARs, so the connection structure may be shared by multiple net devices.
      
      This makes module removal messy, because we can't deallocate the connections
      until we know there are no net devices still using them. The solution is to
      use ref counts on the connections, so we can delete them (actually stop) as
      soon as the ref count hits zero.
      
      This patch fixes (part of) a bug we were seeing with IPv6 sending probes to
      a dead LPAR, which would then hang us forever due to leftover skbs.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
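The reference-counting idea in the commit above can be illustrated with a plain integer counter. This is a single-threaded sketch with hypothetical names (`conn_get()`, `conn_put()`); the kernel's kobject machinery provides an atomic count and a release callback, neither of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection shared by several net devices. */
struct connection {
    int refcount; /* number of net devices currently using the connection */
    bool running;
};

/* A net device starts using the connection. */
void conn_get(struct connection *c)
{
    assert(c != NULL);
    c->refcount++;
}

/*
 * A net device is done with the connection.
 * Returns true if this was the last user and the connection was stopped.
 */
bool conn_put(struct connection *c)
{
    assert(c != NULL && c->refcount > 0);

    if (--c->refcount == 0) {
        c->running = false; /* last user gone: safe to stop the connection */
        return true;
    }
    return false;
}
```

The connection is stopped by whichever user happens to drop the last reference, so module removal no longer needs to know the order in which net devices go away.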
    • [PATCH] iseries_veth: Make init_connection() & destroy_connection() symmetrical · ec60beeb
      Michael Ellerman authored
      This patch makes veth_init_connection() and veth_destroy_connection()
      symmetrical in that they allocate/deallocate the same data.
      
      Currently if there's an error while initialising connections (e.g. ENOMEM)
      we call veth_module_cleanup(), however this will oops because we call
      driver_unregister() before we've called driver_register(). I've never seen
      this actually happen though.
      
      So instead we explicitly call veth_destroy_connection() for each connection;
      any that have been set up will be deallocated.
      
      We also fix a potential leak if vio_register_driver() fails.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
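A symmetrical init/destroy pair, where destroy is safe to call on a partially initialised object, might be sketched as below. The struct fields, sizes and function names are hypothetical and use malloc()/free() in place of kernel allocators; the point is only that the error path can call the destroy function unconditionally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_MSGS 8     /* arbitrary sizes for this sketch */
#define EVENT_BUF_LEN 64

/* Hypothetical connection with two separately allocated members. */
struct connection {
    int *msgs;
    char *event_buf;
};

/*
 * Free exactly what conn_init() allocates. Safe on a zeroed or partially
 * initialised connection, because free(NULL) is a no-op.
 */
void conn_destroy(struct connection *c)
{
    assert(c != NULL);
    free(c->msgs);
    free(c->event_buf);
    c->msgs = NULL;
    c->event_buf = NULL;
}

/* Returns 0 on success, -1 on allocation failure (caller runs conn_destroy()). */
int conn_init(struct connection *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof(*c));

    c->msgs = calloc(NUM_MSGS, sizeof(*c->msgs));
    if (c->msgs == NULL)
        return -1;

    c->event_buf = malloc(EVENT_BUF_LEN);
    if (c->event_buf == NULL)
        return -1;

    return 0;
}
```

A caller that initialises several connections can, on any failure, loop over all of them calling `conn_destroy()`, which matches the "any that have been set up will be deallocated" behaviour described above.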
    • [PATCH] iseries_veth: Only call dma_unmap_single() if dma_map_single() succeeded · cbf9074c
      Michael Ellerman authored
      The iseries_veth driver unconditionally calls dma_unmap_single() even
      when the corresponding dma_map_single() may have failed.
      
      Rework the code a bit to keep the return value from dma_map_single()
      around, and then check if it's a dma_mapping_error() before we do
      the dma_unmap_single().
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
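The "remember the mapping result, then unmap only if it was valid" pattern can be shown with a mock mapping layer. Everything prefixed `fake_` below is invented for this sketch and is not the kernel DMA API; a counter of live mappings makes an unbalanced unmap visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_MAPPING_ERROR ((uint64_t)-1) /* hypothetical failure sentinel */

static int live_mappings; /* number of fake mappings currently outstanding */

/* Mock of a map call; 'fail' forces the error path. */
static uint64_t fake_map(const void *buf, bool fail)
{
    if (fail)
        return FAKE_MAPPING_ERROR;
    live_mappings++;
    return (uint64_t)(uintptr_t)buf;
}

static bool fake_mapping_error(uint64_t handle)
{
    return handle == FAKE_MAPPING_ERROR;
}

static void fake_unmap(uint64_t handle)
{
    (void)handle;
    live_mappings--;
}

/* Hypothetical message that keeps the mapping result around. */
struct msg {
    uint64_t handle;
};

void msg_prepare(struct msg *m, const void *buf, bool simulate_failure)
{
    assert(m != NULL);
    m->handle = fake_map(buf, simulate_failure);
}

/* Unmap only if the earlier map actually succeeded. */
void msg_recycle(struct msg *m)
{
    assert(m != NULL);
    if (!fake_mapping_error(m->handle))
        fake_unmap(m->handle);
    m->handle = FAKE_MAPPING_ERROR;
}

int outstanding_mappings(void)
{
    return live_mappings;
}
```

Without the check in `msg_recycle()`, the failure path would drive the mapping count negative, which is the userspace analogue of unmapping something that was never mapped.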
    • [PATCH] iseries_veth: Replace lock-protected atomic with an ordinary variable · b08bd5c0
      Michael Ellerman authored
      The iseries_veth driver uses atomic ops to manipulate the in_use field of
      one of its per-connection structures. However all references to the
      flag occur while the connection's lock is held, so the atomic ops aren't
      necessary.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
    • [PATCH] iseries_veth: Remove redundant message stack lock · d7893ddd
      Michael Ellerman authored
      The iseries_veth driver keeps a stack of messages for each connection
      and a lock to protect the stack. However there is also a per-connection lock
      which makes the message stack lock redundant.
      
      Remove the message stack lock and document the fact that callers of the
      stack-manipulation functions must hold the connection's lock.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
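A lock-free-by-contract stack of the kind described above could look like this. The struct layout and function names are hypothetical; the connection lock itself is not modelled, and the locking requirement is carried only by the comments, which is the documentation step the commit mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message with an intrusive stack link. */
struct msg {
    struct msg *next;
    int token;
};

/* Hypothetical connection; the stack head is protected by the connection's lock. */
struct connection {
    struct msg *msg_stack_head;
};

/* Caller must hold the connection's lock. */
void stack_push(struct connection *c, struct msg *m)
{
    assert(c != NULL && m != NULL);
    m->next = c->msg_stack_head;
    c->msg_stack_head = m;
}

/* Caller must hold the connection's lock. Returns NULL if the stack is empty. */
struct msg *stack_pop(struct connection *c)
{
    struct msg *m;

    assert(c != NULL);
    m = c->msg_stack_head;
    if (m != NULL) {
        c->msg_stack_head = m->next;
        m->next = NULL;
    }
    return m;
}
```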
    • [PATCH] iseries_veth: Fix broken promiscuous handling · 2a5391a1
      Michael Ellerman authored
      Due to a logic bug, once promiscuous mode is enabled in the iseries_veth
      driver it is never disabled.
      
      The driver keeps two flags, promiscuous and all_mcast, which have exactly the
      same effect. This is because we only ever receive packets destined for us,
      or multicast packets. So consolidate them into one promiscuous flag for
      simplicity.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
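The commit message does not show the faulty code, but a flag that is set in one branch and never cleared in the other would produce exactly the "once enabled, never disabled" symptom. The sketch below shows a single consolidated flag that is recomputed on every call; `FLAG_PROMISC`, `FLAG_ALLMULTI` and `set_rx_mode()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interface flags for this sketch. */
#define FLAG_PROMISC  0x1u
#define FLAG_ALLMULTI 0x2u

/* Hypothetical port with a single consolidated flag. */
struct port {
    bool promiscuous;
};

/*
 * Recompute the flag from the requested mode on every call, so that
 * clearing the request also clears the flag.
 */
void set_rx_mode(struct port *p, unsigned int flags)
{
    assert(p != NULL);

    if (flags & (FLAG_PROMISC | FLAG_ALLMULTI))
        p->promiscuous = true;
    else
        p->promiscuous = false; /* the step a set-only implementation misses */
}
```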
    • [PATCH] iseries_veth: Try to avoid pathological reset behaviour · 58c5900b
      Michael Ellerman authored
      The iseries_veth driver contains a state machine which is used to manage
      how connections are set up and negotiated between LPARs.
      
      If one side of a connection resets for some reason, the two LPARs can get
      stuck in a race to re-setup the connection. This can lead to the connection
      being declared dead by one or both ends. In practice the connection is
      declared dead by one or both ends approximately 8/10 times a connection is
      reset, although it is rare for connections to be reset.
      
      (an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html)
      
      The core of the problem is that the end that resets the connection doesn't
      wait for the other end to become aware of the reset. So the resetting end
      starts setting the connection back up, and then receives a reset from the
      other end (which is the response to the initial reset). And so on.
      
      We're severely limited in what we can do to fix this. The protocol between
      LPARs is essentially fixed, as we have to interoperate with both OS/400
      and old Linux drivers. Which also means we need a fix that only changes the
      code on one end.
      
      The only fix I've found, given that, is to just blindly sleep for a bit when
      resetting the connection, in the hope that the other end will get itself
      sorted.  Needless to say I'd love it if someone has a better idea.
      
      This does work, I've so far been unable to get it to break, whereas without
      the fix a reset of one end will lead to a dead connection ~8/10 times.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
    • [PATCH] iseries_veth: Remove a FIXME WRT deletion of the ack_timer · abfda471
      Michael Ellerman authored
      The iseries_veth driver has a timer which we use to send acks. When the
      connection is reset or stopped we need to delete the timer.
      
      Currently we only call del_timer() when resetting a connection, which means
      the timer might run again while the connection is being re-setup. As it turns
      out that's ok, because the flags the timer consults have been reset.
      
      It's cleaner though to call del_timer_sync() once we've dropped the lock,
      although the timer may still run between us dropping the lock and calling
      del_timer_sync(), but as above that's ok.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
    • [PATCH] iseries_veth: Cleanup error and debug messages · 61a3c696
      Michael Ellerman authored
      Currently the iseries_veth driver prints the file name and line number in its
      error messages. This isn't very useful for most users, so just print
      "iseries_veth: message" instead.
      
       - convert uses of veth_printk() to veth_debug()/veth_error()/veth_info()
       - make terminology consistent, ie. always refer to LPAR not lpar
       - be consistent about printing return codes as %d not %x
       - make format strings fit in 80 columns
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
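Prefixing every message with the driver name, rather than a file name and line number, is commonly done with small variadic macros. The commit names veth_error() and veth_info() but does not show their definitions, so the versions below are guesses written for userspace with fprintf() where a kernel driver would use printk(); `veth_format()` and `report_failure()` are additional hypothetical helpers that exist only to make the output easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define VETH_PREFIX "iseries_veth: "

/*
 * The first macro argument must be a string literal, so that the compiler
 * can concatenate it with the prefix.
 */
#define veth_error(...) fprintf(stderr, VETH_PREFIX __VA_ARGS__)
#define veth_info(...)  fprintf(stdout, VETH_PREFIX __VA_ARGS__)

/* Same formatting, but into a caller-supplied buffer. */
#define veth_format(buf, size, ...) snprintf((buf), (size), VETH_PREFIX __VA_ARGS__)

/* Example call site: return codes printed as %d, "LPAR" always in capitals. */
int report_failure(char *buf, size_t size, int lpar, int rc)
{
    assert(buf != NULL && size > 0);
    return veth_format(buf, size, "cannot reach LPAR %d, rc = %d\n", lpar, rc);
}
```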
  2. 31 Aug, 2005 2 commits
  3. 30 Aug, 2005 24 commits