1. 15 Aug, 2016 5 commits
    • cxgb4: Add control net_device for configuring PCIe VF · 7829451c
      Hariprasad Shenai authored
      Issue:
      The current Linux APIs assume a 1-to-1 mapping of Network Ports,
      Physical Functions and the SR-IOV Virtual Functions of those Physical
      Functions. This is not the case with our cards, where any Virtual
      Function can be hooked up to any Port, or to any number of Ports. The
      current Linux APIs also assume that only 1 Network Interface/Port can
      be accessed per Virtual Function.
      
      Another issue is that these APIs assume that the Administrative Driver
      is attached to the Physical Function associated with a Virtual Function.
      This is not the case with our card where all administration is performed
      by a Driver which is not attached to any of the Physical Functions which
      have SR-IOV PCI Capabilities.
      
      Another consequence of these assumptions is the inability to utilize all
      of the card's SR-IOV resources. For instance, our cards have SR-IOV
      Capabilities on Physical Functions 0..3 and the administrative Driver
      attaches to Physical Function 4. Each of the Physical Functions 0..3 can
      support up to 16 Virtual Functions. With the current Linux APIs, a
      2-Port card would only be able to use the Virtual Functions on Physical
      Functions 0..1 and not allow the Virtual Functions on Physical Functions
      2..3 to be used since there are no Ports 2..3 on a 2-Port card.
      
      Fix:
      Since the control node for all VF ACL commands is always a netdevice,
      create a dummy netdevice for each Physical Function from 0 to 3, through
      which its VFs can be controlled. The device is not associated with any
      port, since it does not need to transmit or receive; it is used purely
      for VF management. The device is registered only when VFs for a
      particular PF are configured through the PCI sysfs interface, and
      unregistered when pci_disable_sriov() is called for the PF.
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
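The resource arithmetic in the commit message above can be checked with a small sketch. This is a toy model, not driver code: the function name `manageable_vfs` and its parameters are hypothetical, and the only figures taken from the message are the four SR-IOV capable Physical Functions (0..3) and the limit of 16 Virtual Functions per Physical Function.

```python
# Toy model of the VF accounting described in the commit message.
# Names are illustrative assumptions, not cxgb4 identifiers.

VFS_PER_PF = 16        # "up to 16 Virtual Functions" per PF (from the message)
SRIOV_PFS = range(4)   # PFs 0..3 carry the SR-IOV capability (from the message)


def manageable_vfs(num_ports, control_netdev_per_pf):
    """Return how many VFs can be administered on a card with num_ports ports.

    Without a control netdevice per PF, a PF's VFs are only reachable
    through a port netdevice with the same index, so PFs whose index is
    >= num_ports are unusable. With one control netdevice per PF, every
    SR-IOV capable PF is reachable.
    """
    if control_netdev_per_pf:
        managed_pfs = list(SRIOV_PFS)
    else:
        managed_pfs = [pf for pf in SRIOV_PFS if pf < num_ports]
    return len(managed_pfs) * VFS_PER_PF


if __name__ == "__main__":
    print(manageable_vfs(2, control_netdev_per_pf=False))
    print(manageable_vfs(2, control_netdev_per_pf=True))
```

On a 2-port card the model gives half of the VFs under the 1-to-1 assumption and all of them once each Physical Function has its own control netdevice, which is the gap the commit is closing.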
    • Merge branch 'proc-per-ns' · a878c020
      David S. Miller authored
      Dmitry Torokhov says:
      
      ====================
      Make /proc per net namespace objects belong to container
      
      Currently [almost] all /proc objects belong to the global root, even if
      data belongs to a given namespace within a container and (at least for
      sysctls) we work around permissions checks to allow container's root to
      access the data.
      
      This series changes ownership of net namespace /proc objects
      (/proc/self/net/* and /proc/sys/net/*) to be container's root and not
      global root when a mapping for container's root exists in the user
      namespace.
      
      This helps when running Android CTS in a container, but I think it makes
      sense regardless.
      
      Changes from V1:
      
      - added fix for crash when !CONFIG_NET_NS (new patch #1)
      - addressed Eric's comments for error handling style in patch #3 and
        added his Ack
      - adjusted patch #2 to use the same style of error handling
      - sent out as series instead of separate patches
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: make net namespace sysctls belong to container's owner · e79c6a4f
      Dmitry Torokhov authored
      If a net namespace is attached to a user namespace, let's make the
      container's root the owner of the sysctls affecting said network
      namespace instead of global root.
      
      This also allows us to clean up net_ctl_permissions() because we do not
      need to fudge permissions anymore for the container's owner since it now
      owns the objects in question.
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
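The ownership rule in this commit (container's root owns the sysctls when it has a mapping, global root otherwise) can be sketched as a lookup in a user-namespace UID map. This is a simplified model under stated assumptions: the map is a list of `(first_inside, first_outside, count)` tuples in the same column order as `/proc/<pid>/uid_map`, and the helper names `map_to_global` and `sysctl_owner` are hypothetical, not kernel functions.

```python
# Simplified model of "container's root owns the sysctls if it is mapped".
# Helper names are illustrative assumptions, not kernel identifiers.

GLOBAL_ROOT_UID = 0


def map_to_global(uid_map, inner_uid):
    """Translate a UID seen inside the user namespace to a global UID.

    uid_map is a list of (first_inside, first_outside, count) extents.
    Returns None when inner_uid is not covered by any extent.
    """
    for first_inside, first_outside, count in uid_map:
        if first_inside <= inner_uid < first_inside + count:
            return first_outside + (inner_uid - first_inside)
    return None


def sysctl_owner(uid_map):
    """Owner of the per-netns sysctls: the container's root (inner UID 0)
    if it has a mapping, otherwise the global root."""
    mapped_root = map_to_global(uid_map, 0)
    return GLOBAL_ROOT_UID if mapped_root is None else mapped_root


if __name__ == "__main__":
    # A container whose inner UIDs 0..65535 map to global 100000..165535.
    print(sysctl_owner([(0, 100000, 65536)]))
    # A namespace that maps only UID 1000, so inner root is unmapped.
    print(sysctl_owner([(1000, 1000, 1)]))
```

Because the objects are then owned by the mapped UID, ordinary permission checks are enough, which is why the commit can drop the special-casing in net_ctl_permissions().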
    • proc: make proc entries inherit ownership from parent · c110486f
      Dmitry Torokhov authored
      There are certain parameters that belong to net namespace and that are
      exported in /proc. They should be controllable by the container's owner,
      but are currently owned by global root and thus not available.
      
      Let's change proc code to inherit ownership of the parent entry, and when
      creating the per-ns "net" proc entry, set it up as owned by container's owner.
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
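The inheritance rule described in this commit can be illustrated with a small tree model. This is only a sketch of the idea: the class `ProcEntry` and its fields are hypothetical and do not mirror the kernel's proc data structures, and the UID/GID value 100000 is an arbitrary example of a container owner.

```python
# Toy model of proc entries inheriting ownership from their parent.
# Class and field names are illustrative assumptions.

GLOBAL_ROOT = 0


class ProcEntry:
    def __init__(self, name, parent=None, uid=None, gid=None):
        self.name = name
        self.parent = parent
        # An explicit owner wins; otherwise inherit from the parent;
        # a parentless entry defaults to global root.
        inherited_uid = parent.uid if parent is not None else GLOBAL_ROOT
        inherited_gid = parent.gid if parent is not None else GLOBAL_ROOT
        self.uid = uid if uid is not None else inherited_uid
        self.gid = gid if gid is not None else inherited_gid


if __name__ == "__main__":
    root = ProcEntry("proc")
    # The per-namespace "net" entry is set up as owned by the container's owner.
    net = ProcEntry("net", parent=root, uid=100000, gid=100000)
    # Entries created below it need no explicit owner.
    dev = ProcEntry("dev", parent=net)
    print(dev.uid, dev.gid)
```

Setting the owner once on the per-namespace "net" entry is enough in this model; everything created underneath picks it up without further changes at each creation site.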
    • netns: do not call pernet ops for not yet set up init_net namespace · f8c46cb3
      Dmitry Torokhov authored
      When CONFIG_NET_NS is disabled, registering pernet operations causes
      init() to be called immediately with init_net as an argument. Unfortunately
      this leads to some pernet ops, such as proc_net_ns_init(), being called too
      early, when the init_net namespace has not been fully initialized. This causes
      issues when we want to change pernet ops to use more data from the net
      namespace in question, for example to reference the user namespace that owns
      our network namespace.
      
      To fix this we could either play a game of musical chairs and rearrange init
      order, or we could do the same as when CONFIG_NET_NS is enabled, and
      postpone calling pernet ops->init() until namespace is set up properly.
      
      Note that we cannot simply undo commit ed160e83 ("[NET]: Cleanup
      pernet operation without CONFIG_NET_NS") and use the same implementations
      for __register_pernet_operations() and __unregister_pernet_operations(),
      because many pernet ops are marked as __net_initdata and will be discarded,
      which wreaks havoc on our ops lists. Here we rely on the fact that we only
      use lists until init_net is fully initialized, which happens much earlier
      than discarding __net_initdata sections.
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
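The deferral described in this commit can be modelled as a registry that queues init callbacks until the initial namespace is ready. This is a minimal sketch of the idea only: the class `PernetRegistry` and its method names are hypothetical and do not correspond to the kernel's __register_pernet_operations() implementation.

```python
# Toy model of postponing pernet ops->init() until init_net is set up.
# Class and method names are illustrative assumptions.


class PernetRegistry:
    def __init__(self):
        self.init_net_ready = False
        self.pending = []   # only used until init_net is fully initialized

    def register(self, init):
        """Register a pernet init callback.

        Before init_net is ready the callback is queued; afterwards it
        runs immediately and the list is not touched again.
        """
        if self.init_net_ready:
            init()
        else:
            self.pending.append(init)

    def init_net_done(self):
        """Mark init_net as fully set up and run the queued callbacks
        in registration order."""
        self.init_net_ready = True
        queued, self.pending = self.pending, []
        for init in queued:
            init()


if __name__ == "__main__":
    log = []
    reg = PernetRegistry()
    reg.register(lambda: log.append("early op"))   # queued, not run yet
    reg.init_net_done()                            # queued op runs here
    reg.register(lambda: log.append("late op"))    # runs immediately
    print(log)
```

The list is consulted only before `init_net_done()`, which mirrors the commit's argument that the ops list is no longer needed by the time __net_initdata sections are discarded.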
  2. 13 Aug, 2016 35 commits