1. 17 Jan, 2005 19 commits
    • [NETFILTER]: Don't cacheline align slab allocs · 0477d38c
      Rusty Russell authored
      Anton points out that cacheline aligning conntrack entries is a wank.
      He's right: there's lots of them, and they're currently ~200 bytes.
      Same with the cargo-cult programming in ipt_hashlimit.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
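To put a rough number on the padding cost mentioned above, here is a minimal sketch. It assumes a 64-byte cache line (not stated in the commit) and uses the ~200-byte object size from the message; both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes of padding added per object if every object is cacheline aligned. */
static size_t align_waste(size_t obj_size, size_t cacheline)
{
    return round_up_pow2(obj_size, cacheline) - obj_size;
}
```

Under those assumptions, `align_waste(200, 64)` gives the per-entry overhead, which is then multiplied by however many conntrack entries are live.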
    • [NETFILTER]: Get rid of 'initialized' in nat structure: use conntrack status bits · 4c88c4e3
      Rusty Russell authored
      Fairly simple patch to move the 'initialized' NAT bitfield to bits in
      the 'status' word.  This saves the size of a pointer from the
      connection tracking structure.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
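The idea of folding a separate bitfield into an existing status word can be sketched as follows. The struct, bit numbers, and function names are hypothetical stand-ins, not the kernel's definitions, and the kernel would use atomic bit operations rather than the plain ones shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers sharing one status word. */
enum {
    EX_CONFIRMED_BIT    = 0,
    EX_SRC_NAT_DONE_BIT = 1,
    EX_DST_NAT_DONE_BIT = 2,
    EX_STATUS_BITS      = 3
};

struct ex_conntrack {
    unsigned long status;   /* no separate 'initialized' member needed */
};

static void ex_set_status(struct ex_conntrack *ct, int bit)
{
    assert(bit >= 0 && bit < EX_STATUS_BITS);
    ct->status |= 1UL << bit;
}

static bool ex_test_status(const struct ex_conntrack *ct, int bit)
{
    assert(bit >= 0 && bit < EX_STATUS_BITS);
    return ((ct->status >> bit) & 1UL) != 0;
}

/* NAT counts as initialized once both directions have been set up. */
static bool ex_nat_initialized(const struct ex_conntrack *ct)
{
    return ex_test_status(ct, EX_SRC_NAT_DONE_BIT) &&
           ex_test_status(ct, EX_DST_NAT_DONE_BIT);
}
```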
    • [NETFILTER]: Use a bit in conntrack status to indicate sequence number adjustment · 7cfb7fc4
      Rusty Russell authored
      Rather than calling the sequence adjustment code on every connection
      which has a helper, we can set a status bit on the conntrack when we
      change the length of a TCP packet, and use that to indicate that we
      should call the routine.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
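A minimal sketch of the gating described above: the status bit is set only when mangling changes a packet's length, and the sequence adjustment path checks that bit instead of checking for a helper. All names and the bit value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_SEQ_ADJUST (1UL << 3)    /* hypothetical status bit */

struct ex_conn {
    unsigned long status;
    long seq_offset;                /* accumulated length delta */
};

/* Record a payload rewrite; flag the connection only if the length changed. */
static void ex_mangle_tcp(struct ex_conn *ct, size_t old_len, size_t new_len)
{
    assert(ct != NULL);
    if (new_len != old_len) {
        ct->seq_offset += (long)new_len - (long)old_len;
        ct->status |= EX_SEQ_ADJUST;
    }
}

/* Fast-path check: only flagged connections need sequence adjustment. */
static bool ex_needs_seq_adjust(const struct ex_conn *ct)
{
    assert(ct != NULL);
    return (ct->status & EX_SEQ_ADJUST) != 0;
}
```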
    • [NETFILTER]: Remove ip_conntrack_tuple_hash 'ctrack' pointer · f97bf1b1
      Rusty Russell authored
      We keep a pointer from the hash table entry into the connection
      tracking entry it's a part of.  However, there's a spare byte in the
      hash entry anyway, which we can use to indicate which of the two
      tuples it is, and then simply use container_of() to access the
      conntrack.
      
      This saves two pointers per connection tracking entry.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
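The trick above can be sketched in portable C as follows. The structs are simplified, hypothetical stand-ins for the real ones, and `ex_container_of` is a local reimplementation of the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of the kernel's container_of(). */
#define ex_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

enum { EX_DIR_ORIGINAL = 0, EX_DIR_REPLY = 1, EX_DIR_MAX = 2 };

struct ex_tuple_hash {
    unsigned int src, dst;      /* stand-in for the real tuple */
    unsigned char dir;          /* which of the two tuples this entry is */
};

struct ex_conntrack {
    unsigned long status;
    struct ex_tuple_hash tuplehash[EX_DIR_MAX];
};

/* Recover the owning conntrack without storing a back pointer. */
static struct ex_conntrack *ex_tuplehash_to_ctrack(struct ex_tuple_hash *h)
{
    assert(h != NULL && h->dir < EX_DIR_MAX);
    /* Step back to element 0 of the array, then to the enclosing struct. */
    return ex_container_of(h - h->dir, struct ex_conntrack, tuplehash);
}
```

Because each entry records its own direction, either array element is enough to find the enclosing structure, which is what makes the two stored pointers redundant.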
    • [NETFILTER]: Remove manip array from conntrack entry · 8d5f3377
      Rusty Russell authored
      Original patch and multo bugfixes by Krisztian Kovacs.
      
      Now NAT has been simplified, there is only one place to NAT each
      packet.  That means we can intuit what to do by looking at the
      difference between this packet and the reply we expect, getting rid of
      the manips[] array in the connection tracking structure, which is 72
      bytes.  Rework NAT to be based on 'change this packet to make src/dst
      look like this tuple'.
      
      1) Each protocol's manip_pkt takes a 'struct ip_conntrack_manip',
         which is half (the source half) of a tuple.  Hand the whole desired
         tuple to the NAT code and have it use the 'maniptype' arg to decide
         what part to copy.
      
      2) Krisztian points out that we don't need the NAT lock to read the
         NAT information (or the tuples) as they never change once set, and
         while being set we have exclusive access.  A lock is only needed to
         deal with the only remaining NAT list: the bysource hash.
      
      3) We don't need to rehash for the bysource hash: it depends on the
         incoming packet, which we can't change.
      
      4) Many NAT functions only need the maniptype they are to perform, not
         the actual hook, which makes the code clearer.
      
      5) New status bits to indicate what NAT needs to be done.  We can
         always figure it out by inverting the tuple we expect in the other
         direction and comparing it, but this is faster.
      
      6) Rename 'do_bindings' to 'nat_packet'.
      
      7) ICMP handling is vastly simplified: we unconditionally change to
         look the way we want.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
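Points 1 and 5 above can be illustrated with a small sketch: the NAT code receives the whole desired tuple and copies only the half selected by the manip type, and the required manips can be derived by inverting the expected reply tuple and comparing it with the original. The tuple layout and all names here are simplified, hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum ex_manip_type { EX_MANIP_SRC, EX_MANIP_DST };

struct ex_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Swap source and destination halves. */
static struct ex_tuple ex_invert_tuple(const struct ex_tuple *t)
{
    struct ex_tuple inv = {
        .src_ip = t->dst_ip, .dst_ip = t->src_ip,
        .src_port = t->dst_port, .dst_port = t->src_port,
    };
    return inv;
}

/* Make one half of pkt look like the same half of target. */
static void ex_manip_pkt(struct ex_tuple *pkt, const struct ex_tuple *target,
                         enum ex_manip_type type)
{
    assert(type == EX_MANIP_SRC || type == EX_MANIP_DST);
    if (type == EX_MANIP_SRC) {
        pkt->src_ip = target->src_ip;
        pkt->src_port = target->src_port;
    } else {
        pkt->dst_ip = target->dst_ip;
        pkt->dst_port = target->dst_port;
    }
}

/* Slow-path derivation: does the original direction need this manip? */
static bool ex_manip_needed(const struct ex_tuple *orig,
                            const struct ex_tuple *reply,
                            enum ex_manip_type type)
{
    struct ex_tuple want = ex_invert_tuple(reply);

    if (type == EX_MANIP_SRC)
        return orig->src_ip != want.src_ip || orig->src_port != want.src_port;
    return orig->dst_ip != want.dst_ip || orig->dst_port != want.dst_port;
}
```

The status bits mentioned in point 5 would cache the result of a comparison like `ex_manip_needed()` so it does not have to be repeated per packet.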
    • [NETFILTER]: Adrian Bunk's cleanup patches · cd795640
      Rusty Russell authored
      Adrian Bunk's cleanup patch, updated for after all the Rusty patches.
      The ip_nat_protocol_register/unregister EXPORT_SYMBOLs() stay, as they
      are used by future patches.
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified)
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: Remove remaining multirange related code · 1ae14212
      Rusty Russell authored
      From: KOVACS Krisztian <hidden@sch.bme.hu>
      
        Hi Rusty,
      
      Your recent patch which removed the byipsproto hash left some unused
      code around. The following patch cleans up that. I'm not sure it's
      correct, but please take a look at it.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: Make expectations timeouts compulsory · 0cac7232
      Rusty Russell authored
      This patch simplifies the code by always having expectation timeouts.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: Simplify expect handling · 2a526ac9
      Rusty Russell authored
      Now we've changed expect handling, we can simplify it significantly.
      
      1) struct ip_conntrack_expect only exists until the connection
         matching it is created.  Now NAT is done directly at the time the
         expectation is matched, we don't need to keep this information
         around.
      
      2) The term 'master' is used everywhere to mean the connection that
         expected this connection.  The 'master' field in the new connection
         points straight to the master connection, and holds a reference.
      
      3) There is no direct link from the connection to the expectations it
         has created: we walk the global list to find them if we need to
         clean them up.  Each expectation holds a reference.
      
      4) The ip_conntrack_expect_tuple_lock is now a proper subset of
         ip_conntrack_lock, so we can eliminate it.
      
      5) Remove flags from helper: the policy of evicting the oldest
         expectation seems to be appropriate for everyone.
      
      6) ip_conntrack_expect_find_get() and ip_conntrack_expect_put() are no
         longer required.
      
      7) Remove reference count from expectations, and don't free when we
         fail ip_conntrack_expect_related(): have user call
         ip_conntrack_expect_free().
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: Fix up IRC, AMANDA, TFTP and SNMP · 55d349b2
      Rusty Russell authored
      Fixes up the other helpers for direct conntrack->NAT helper calling.
      SNMP doesn't really need a conntrack helper, but under this new model,
      the NAT helper will register at that point anyway: NAT helpers
      themselves are removed.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETFILTER]: Call NAT helper modules directly from conntrack modules, fixup FTP · 92bb4f8e
      Rusty Russell authored
      Currently connection tracking and NAT helper modules for a protocol
      interact only indirectly (the conntrack module places information in
      the conntrack structure, which the NAT module pulls out).
      
      This leads to several issues:
      1) Both modules must know what port to watch, and must match.
      2) Identifying the particular packet which created the connection
         is cumbersome (TCP) or impossible (UDP).
      3) The connection tracking code sets up an expectation which the
         NAT code then has to change.
      4) The lack of direct symbol dependencies means we have to contrive
         one, since they are functionally dependent.
      
      Here is the current code flow:
      FTP CONTROL PACKET:
      NF_IP_PRE_ROUTING:
         ip_conntrack_in
            resolve_normal_ct
               init_conntrack: sets ct->helper to ip_conntrack_ftp.c:help()
         ct->help(): if PORT/PASV command:
            Sets exp->help.exp_ftp_info to tcp seq number of data.
            ip_conntrack_expect(): expects the connection
      
         ip_nat_setup_info: sets ct->nat.info->helper to ip_nat_ftp.c:help()
         ip_nat_fn:
            proto->exp_matches_pkt: if packet matches expectation
            ct->nat.info->helper(): If packet going client->server,
                  and packet data is one in ct_ftp_info:
               ftp_data_fixup():
                  ip_conntrack_change_expect(): change the expectation
                  Modify packet contents with new address.
      
      NF_IP_POST_ROUTING:
         ip_nat_fn
            ct->nat.info->helper(): If packet going server->client,
                  and packet data is one in ct_ftp_info:
               ftp_data_fixup():
                  ip_conntrack_change_expect(): change the expectation
                  Modify packet contents with new address.
      
      FTP DATA (EXPECTED) CONNECTION FIRST PACKET:
      NF_IP_PRE_ROUTING:
         ip_conntrack_in
            resolve_normal_ct
               init_conntrack: set ct->master.
         ip_nat_fn:
            master->nat.info.helper->expect()
               Set up source NAT mapping to match FTP control connection.
      
      NF_IP_PRE_ROUTING:
         ip_nat_fn:
            master->nat.info.helper->expect()
               Set up dest NAT mapping to match FTP control connection.
      
      
      The new flow looks like this:
      FTP CONTROL PACKET:
      NF_IP_PRE_ROUTING:
         ip_conntrack_in
            resolve_normal_ct
               init_conntrack: sets ct->helper to ip_conntrack_ftp.c:help()
      
      NF_IP_POST_ROUTING:
         ip_confirm:
            ct->helper->help:
               If !ip_nat_ftp_hook: ip_conntrack_expect().
               ip_nat_ftp: 
                  set exp->oldproto to old port.
                  ip_conntrack_change_expect(): change the expectation
                  set exp->expectfn to ftp_nat_expected.
                  Modify packet contents with new address.
      
      FTP DATA (EXPECTED) CONNECTION FIRST PACKET:
      NF_IP_PRE_ROUTING:
         ip_conntrack_in
            resolve_normal_ct
               init_conntrack: set ct->master.
               call exp->expectfn (ftp_nat_expected):
                   call ip_nat_follow_master().
      
      The big changes are that the ip_nat_ftp module sets ip_conntrack_ftp's
      ip_nat_ftp_hook when it initializes, so it calls the NAT code directly
      when a packet containing the expect information is found by the
      conntrack helper: and this interface can carry all the information
      these two want to share.  Also, that conntrack helper is called as the
      packet leaves the box, so there are no issues with expectations being
      set up before the packet has been filtered.  The NAT helper doesn't
      need to register and duplicate the conntrack ports.
      
      The other trick is ip_nat_follow_master(), which does the NAT setup
      all at once (source and destination NAT as required) such that the
      expected connection is NATed the same way the master connection
      was.
      
      We also call ip_conntrack_tcp_update() (which I incidentally neatened)
      after mangling a TCP packet; ip_nat_seq_adjust() does this, but now
      mangling is done at the last possible moment, after
      ip_nat_seq_adjust() was already called.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
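The direct-call arrangement described above, where the NAT module fills in ip_nat_ftp_hook and the conntrack helper calls through it when set, can be sketched as a plain function-pointer hook. Everything below is a simplified, hypothetical model: the struct, the port arithmetic, and the `ex_` functions are illustrative only, and the real code has to handle module load/unload safely.

```c
#include <assert.h>
#include <stddef.h>

struct ex_expect {
    unsigned short port;    /* port the expected connection will use */
    int nat_applied;        /* set when the NAT side rewrote the expectation */
    int registered;         /* set when the expectation was registered */
};

/* Filled in by the (hypothetical) NAT module at init; NULL when not loaded. */
static int (*ex_nat_ftp_hook)(struct ex_expect *exp) = NULL;

/* Stand-in for registering an expectation with conntrack. */
static int ex_conntrack_expect(struct ex_expect *exp)
{
    assert(exp != NULL);
    exp->registered = 1;
    return 0;
}

/* Conntrack helper: hand off to NAT if present, else register as-is. */
static int ex_ftp_help(struct ex_expect *exp)
{
    if (ex_nat_ftp_hook)
        return ex_nat_ftp_hook(exp);
    return ex_conntrack_expect(exp);
}

/* NAT side: adjust the expectation, then register it. */
static int ex_nat_ftp(struct ex_expect *exp)
{
    exp->port += 1000;      /* stand-in for choosing a NATed port */
    exp->nat_applied = 1;
    return ex_conntrack_expect(exp);
}
```

With this shape only the conntrack helper needs to know which ports to watch, since the NAT code is reached solely through the hook.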
    • [NETFILTER]: Fix overlapping expectations in existing expectation code · 13b9f4df
      Rusty Russell authored
      Change kmem_cache_free() calls in ip_conntrack_expect_related() to
      ip_conntrack_expect_put(): they should be equivalent, but this allows
      a hack in the next patch (caller can keep expect).
      
      More importantly, a previous expectation should only be refreshed and return
      EEXIST if it's owned by the same connection (nfsim found this bug).
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      85ef7720
    • Arthur Kepner's avatar
    • [IPV6]: Fix EUI64 generation on S/390. · b74ac55d
      Christoph Hellwig authored
       - put a dev_id field in struct net_device, so that it uses space that
         would be wasted by padding otherwise.
       - if this field is non-null let ipv6_generate_eui64 use the algorithm
         from the QETH code to generate an EUI that's different for each
         OS instance.  See code comments for details.
      Signed-off-by: David S. Miller <davem@davemloft.net>
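For orientation, here is a hedged sketch of interface identifier generation with a per-instance dev_id. The dev_id == 0 branch is the standard modified EUI-64 construction from a 48-bit MAC address. The non-zero branch is an assumption about the QETH-derived scheme (dev_id replacing the FF FE filler, universal/local bit left untouched); the code comments the commit refers to are the authoritative description. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static void ex_generate_eui64(uint8_t eui[8], const uint8_t mac[6],
                              uint16_t dev_id)
{
    assert(eui != NULL && mac != NULL);

    memcpy(eui, mac, 3);            /* OUI part */
    memcpy(eui + 5, mac + 3, 3);    /* device-specific part */

    if (dev_id) {
        /* Assumed layout: dev_id distinguishes instances sharing a MAC. */
        eui[3] = (uint8_t)(dev_id >> 8);
        eui[4] = (uint8_t)(dev_id & 0xff);
    } else {
        /* Standard modified EUI-64: insert FF FE, flip universal/local bit. */
        eui[3] = 0xff;
        eui[4] = 0xfe;
        eui[0] ^= 0x02;
    }
}
```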
    • [PKT_SCHED]: Fix c99ism in cls_api.c · 86679f6f
      Thomas Graf authored
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NETLINK]: Orphan SKBs in netlink_trim(). · f76f745c
      Herbert Xu authored
      This makes the skb->truesize modifications always OK.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge bk://kernel.bkbits.net/acme/connection_sock-2.6 · 1255a1e9
      David S. Miller authored
      into nuts.davemloft.net:/disk1/BK/net-2.6
    • Merge bk://bk.skbuff.net:20611/linux-2.6-inet6 · 20408758
      David S. Miller authored
      into nuts.davemloft.net:/disk1/BK/net-2.6
  2. 16 Jan, 2005 21 commits