1. 19 Oct, 2021 10 commits
    • platform/x86: mlx-platform: Add support for multiply cooling devices · 249606d3
      Vadim Pasternak authored
      Add new registers to support systems with multiple cooling devices.
      Modular systems support up to four cooling devices. This capability
      is detected according to the registers' initial settings.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093609.3771576-1-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • Documentation/ABI: Add new line card attributes for mlxreg-io sysfs interfaces · 5b0a315c
      Vadim Pasternak authored
      Add documentation for the new attributes for line cards:
      - CPLDs versioning.
      - Write protection control for 'nvram' devices.
      - Line card reset reasons.
      - Enabling burning of FPGA and CPLDs.
      - Enabling burning of FPGA and gearbox SPI flashes.
      - Enabling power of whole line card.
      - Enabling power of QSFP ports equipped on line card.
      - The maximum power required for line card feeding.
      - Line card configuration ID.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-10-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • Documentation/ABI: Add new attributes for mlxreg-io sysfs interfaces · 527cd54d
      Vadim Pasternak authored
      Add documentation for the new attributes:
      - "bios_active_image"; "bios_auth_fail"; "bios_upgrade_fail";
        "bios_safe_mode" to represent various BIOS statuses.
      - "lc{n}_enable" - for putting the line card into, or releasing it
        from, the enabled state.
      - "lc{n}_pwr" - for powering the line card on or off.
      - "lc{n}_rst_mask" - for the line card reset state enforced by the
        ASIC, which sets it in response to abnormal ASIC behavior.
      - "psu3_on"; "psu4_on" - for connecting a power supply unit to, or
        disconnecting it from, the power source.
      - "pm_mgmt_en" - for setting power management control ownership. When
        power management control is provided by hardware, the hardware
        automatically powers off one or more line cards if the system power
        budget falls below the power required to feed all powered-on line
        cards. This can happen when some of the power units lose their
        power good state.
      - "shutdown_unlock" - for unlocking the system after a hardware or
        firmware thermal shutdown, which locks all interfaces to the ASIC.
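
The "lc{n}_..." attributes are named per slot. As a rough illustration of that naming scheme only, the following user-space C sketch formats such a name from a slot number. The helper `lc_attr_name`, the `LC_MAX_SLOTS` limit of eight (taken from the MSN4800 description elsewhere in this series) and the assumption that slots are numbered from 1 are all hypothetical and are not part of the documented ABI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: eight line card slots, numbered starting from 1. */
#define LC_MAX_SLOTS 8

/*
 * Hypothetical helper: build an attribute name of the form
 * "lc<slot>_<suffix>", for example "lc3_pwr".
 * Returns 0 on success, -1 on an invalid slot, an empty suffix
 * or a buffer that is too small for the full name.
 */
static int lc_attr_name(char *buf, size_t len, unsigned int slot,
			const char *suffix)
{
	int n;

	assert(buf && suffix);
	if (slot < 1 || slot > LC_MAX_SLOTS || strlen(suffix) == 0)
		return -1;

	n = snprintf(buf, len, "lc%u_%s", slot, suffix);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated name */
	return 0;
}
```

A caller would pass a buffer and one of the suffixes listed above, such as "enable", "pwr" or "rst_mask", and then append the result to whatever sysfs directory the platform exposes.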
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-9-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices · 62f9529b
      Vadim Pasternak authored
      Provide support for the Nvidia MSN4800-XX line cards for MSN4800
      Ethernet modular switch system, providing a high performance switching
      solution for Enterprise Data Centers (EDC) for building Ethernet based
      clusters, High-Performance Computing (HPC) and embedded environments.
      Initial version provides support for line card type MSN4800-C16. This
      type of line card is equipped with:
      - Lattice CPLD device, used for system and ports control.
      - four Nvidia gearbox devices, used for port splitting.
      - FPGA device, used for gearboxes management.
      - 16x100G QSFP28 ports.
      - hotswap controllers, voltage regulators, analog-to-digital
        converters, nvram devices.
      - status LED.
      
      During initialization driver creates:
      - line card's I2C tree through "i2c-mux-mlxcpld" driver.
      - line card's LED objects through "leds-mlxreg" driver.
      - line card's CPLD register space input / output "hwmon" attributes for
        line control and monitoring through "mlxreg-io" driver. These
        attributes provide CPLD and FPGA versioning, control for burning
        upgradable components, NVRAM devices write protection, line card
        revision, line card power consumption, line card reset cause
        indication, etcetera.
      
      Lattice CPLD device and nvram devices are fed from the auxiliary power
      domain and remain accessible when the line card is powered off. These
      devices are connected by the line card driver probing routine, invoked
      after line card security verification is done by hardware and event
      lc#n_verified is received for the line card located in slot #n.
      
      Gearboxes, FPGA, hotswap controllers, voltage regulators and
      analog-to-digital converters are fed from the main power domain. These
      devices are connected after power good event "lc#n_powered" is received
      for the line card located in slot #n.
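
The two-phase bring-up described above can be modelled with a small user-space C sketch. It is illustrative only and is not the driver's code: the type names, the device labels and the rule that main-domain devices also require a prior "verified" event (because the driver is only probed after verification) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Power domain a line card device is fed from. */
enum lc_domain { LC_DOMAIN_AUX, LC_DOMAIN_MAIN };

struct lc_dev {
	const char *name;	/* illustrative label only */
	enum lc_domain domain;
};

/* Device classes named in the text, grouped by power domain. */
static const struct lc_dev lc_devs[] = {
	{ "cpld",    LC_DOMAIN_AUX  },
	{ "nvram",   LC_DOMAIN_AUX  },
	{ "gearbox", LC_DOMAIN_MAIN },
	{ "fpga",    LC_DOMAIN_MAIN },
	{ "hotswap", LC_DOMAIN_MAIN },
	{ "vreg",    LC_DOMAIN_MAIN },
	{ "adc",     LC_DOMAIN_MAIN },
};

/*
 * A device may be connected once the card is verified; main-domain
 * devices additionally need the "powered" (power good) event.
 */
static bool lc_dev_attachable(const struct lc_dev *dev, bool verified,
			      bool powered)
{
	assert(dev);
	if (!verified)
		return false;
	return dev->domain == LC_DOMAIN_AUX || powered;
}

/* Count how many device classes may be connected in a given state. */
static size_t lc_count_attachable(bool verified, bool powered)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(lc_devs) / sizeof(lc_devs[0]); i++)
		if (lc_dev_attachable(&lc_devs[i], verified, powered))
			n++;
	return n;
}
```

In this model only the auxiliary-domain entries are attachable after verification, and the remaining entries become attachable once the power good event arrives.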
      
      The driver 'mlxreg-lc' is driven by 'mlxreg-hotplug' driver following
      relevant "hotplug" events.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-8-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform_data/mlxreg: Add new field for secured access · 9d93d787
      Vadim Pasternak authored
      Extend structure 'mlxreg_core_data' with the field "secured". The
      purpose of this field is to restrict access to some attributes if
      the kernel is configured with security options such as
      LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY.
      Access to attributes that, for example, allow burning of hardware
      components such as FPGA, CPLD or SPI can break the system. A user
      who does not want to allow such access can disable it by setting
      the security options.
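
One way to picture the effect of a "secured" flag is the user-space C sketch below. It is not the driver's implementation: the reduced `attr_desc` structure, the `attr_effective_mode` helper and the choice to restrict access by stripping write permission bits are assumptions made for illustration, since the commit message only states that access is restricted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for an attribute description. */
struct attr_desc {
	const char *label;
	unsigned int mode;	/* permission bits, e.g. 0644 */
	bool secured;		/* restrict when the kernel is locked down */
};

/*
 * Return the permission bits to expose for an attribute.
 * Assumption: a secured attribute loses all write bits when the
 * kernel runs in a locked-down configuration; other attributes
 * keep their configured mode.
 */
static unsigned int attr_effective_mode(const struct attr_desc *a,
					bool locked_down)
{
	assert(a);
	if (a->secured && locked_down)
		return a->mode & ~0222u;
	return a->mode;
}
```

With this model an attribute that enables component burning would be marked secured and become read-only under lockdown, while ordinary status attributes are unaffected.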
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-7-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform/mellanox: mlxreg-io: Extend number of hwmon attributes · bbfd79c6
      Vadim Pasternak authored
      Extend the maximum number of attributes exposed to 'sysfs'.
      This is required in order to support modular systems, which
      provide more attributes for system control, statuses and info.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-6-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform/x86: mlx-platform: Configure notifier callbacks for modular system · 67eb006c
      Vadim Pasternak authored
      Add event notifier callbacks for modular system line cards. These
      callbacks are to be passed to "mlxreg-hotplug" driver by line card
      driver during probing. Then, when any line card related hotplug event
      is received (insertion, power, sync, ready), hotplug driver will
      invoke callback for the relevant line card.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-5-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform/mellanox: mlxreg-hotplug: Extend logic for hotplug devices operations · bb1023b6
      Vadim Pasternak authored
      Extend the structure 'mlxreg_hotplug_device' with a platform device
      field to allow passing the register map and system interrupt line
      number to underlying hotplug devices that share the same register map
      and the same interrupt line with the 'mlxreg-hotplug' driver.
      
      Extend the logic for creating and removing hotplug devices according
      to the action associated with the hotplug device description.
      Previously, the hotplug driver could attach / detach upon hotplug
      events only I2C devices handled by simple I2C drivers. Now it can also
      attach devices handled by platform drivers.
      
      The motivation is to allow passing platform data such as:
      - system interrupt line number, shared with 'mlxreg-hotplug', to
        underlying hotplug devices.
      - shared register map of programmable devices on main board to
        underlying hotplug devices.
      
      Additionally, the number of 'sysfs' attributes is increased, since
      the modular system defines more 'sysfs' attributes.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-4-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform/x86: mlx-platform: Add initial support for new modular system · a5d8f57e
      Vadim Pasternak authored
      Add initial chassis management support for Nvidia modular Ethernet
      switch systems MSN4800, providing a high performance switching solution
      for Enterprise Data Centers (EDC) for building Ethernet based clusters,
      High-Performance Computing (HPC) and embedded environments.
      
      This system could be equipped with the different types of replaceable
      line cards and management board. The first system flavor will support
      the line card type MSN4800-C16 equipped with Lattice CPLD devices aimed
      for system and ASIC control, one Nvidia FPGA for gearboxes (PHYs)
      management, and four Nvidia gearboxes for the port control and with
      16x100GbE QSFP28 ports and also with various devices for electrical
      control.
      
      The system is equipped with eight slots for line cards, four slots for
      power supplies and six slots for fans. It could be configured as fully
      populated or with even only one line card. The line cards are
      hot-pluggable.
      In the future, when more line card flavors become available (for
      example, line cards with 8x200GbE ports, with 4x400GbE ports, or with
      some kind of smart cards for offloading purposes), any type of line
      card could be inserted in any slot.
      
      The system is based on Nvidia Spectrum-3 ASIC. The switch height is
      4U and it fits standard rack size.
      
      Line cards are connected to the chassis through I2C interface for the
      chassis management operations and through PCIe for the networking
      operations. Future line cards could be connected to the chassis through
      InfiniBand fabric, instead of PCIe.
      
      The first type of line card supports 16x100GbE QSFP28 Ethernet ports.
      These line cards are equipped with programmable devices for system
      control and Nvidia Ethernet switch ASIC control, an Nvidia FPGA, and
      Nvidia gearboxes (PHYs).
      Upcoming card generations are expected to include:
      - Line cards with 8x200GbE QSFP28 Ethernet ports.
      - Line cards with 4x400GbE QSFP-DD Ethernet ports.
      - Smart cards equipped with Nvidia ARM CPU for offloading and for fast
        access to the storage (EBoF).
      - Fabric cards for inter-connection.
      
      The basic system initialization flow with input signals from the
      programmable device to kernel hotplug driver and with OS response
      to some of these signals is depicted below.
      
      lc#n_prsnt	*-> Input: line card presence in/out events.
      		    Informational event. Required action - 'udev' event
      		    generation for logging.
      lc#n_verified	*-> Input: line card verification status events coming
      		    after line card security signature validation by
      		    hardware. Required action - connect line card
      		    driver and initialize line card devices feeding
      		    from system auxiliary power domain.
      lc#n_pwr	<-* Output: line card power on / off from OS. Action
      		    should be performed by platform power management
      		    driver.
      lc#n_powered	*-> Input: line card power on/off events coming after
      		    line card "power good" on/off events, meaning that
      		    line card power up sequence has been successfully
      		    completed or line card "power good" status has been
      		    dropped. Required action - connect line card
      		    devices feeding from system main power domain.
      lc#n_synced	*-> Input: line card synchronization events, coming
      		    after hardware-firmware synchronization handshake.
      		    Required action - to enable line card, in case
      		    lc#n_ready has been received before.
      lc#n_ready	*-> Input: line card ready events, indicating line card
      		    PHYs ready / unready states. Required action -
      		    enable line card, in case lc#n_synced has been
      		    received before.
      lc#n_enable	<-* Output: line card enable from OS - release FPGA and
      		    PHYs line card devices from reset state. Action
      		    should be performed by platform power management
      		    driver.
      lc#n_active	*-> Input: when line card "active event" is received
      		    for particular line card, its network, hardware
      		    monitoring and thermal interfaces should be
      		    configured according to the configuration obtained
      		    from the firmware. When opposite "inactive event"
      		    is received all the above interfaces should be
      		    torn down. Required action - connect / disconnect
      		    the above line card interfaces through ASIC I2C
      		    chassis management driver.
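
The enable rule in the flow above (enable the line card once both lc#n_synced and lc#n_ready have been received, in either order) can be sketched as a tiny user-space C state tracker. All identifiers here are hypothetical, and clearing the enable output again when one of the two inputs drops is an assumption of this sketch and is not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event identifiers mirroring the input signals above. */
enum lc_event {
	LC_EV_PRSNT,
	LC_EV_VERIFIED,
	LC_EV_POWERED,
	LC_EV_SYNCED,
	LC_EV_READY,
	LC_EV_ACTIVE,
	LC_EV_COUNT
};

/* Per-slot state: one flag per input signal plus the enable output. */
struct lc_state {
	bool seen[LC_EV_COUNT];
	bool enabled;	/* models the lc#n_enable output */
};

/*
 * Record an input event (on = asserted, !on = dropped) and recompute
 * the enable output. Returns the new value of 'enabled'.
 */
static bool lc_handle_event(struct lc_state *lc, enum lc_event ev, bool on)
{
	assert(lc && ev < LC_EV_COUNT);
	lc->seen[ev] = on;
	/* Enable only when both synced and ready are currently set. */
	lc->enabled = lc->seen[LC_EV_SYNCED] && lc->seen[LC_EV_READY];
	return lc->enabled;
}
```

Because the output is recomputed from both flags on every event, the order in which "synced" and "ready" arrive does not matter, which matches the symmetric wording of the two table entries.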
      
      For initial support:
      - Define new system type 'VMOD0011' to support new modular system.
      - Provide initial platform configuration for new system type.
      - Extend the registers definitions.
      - Add support for modular system registers related to line card
        specific events - insertion/removal, power on/off, verification
        and activation.
      - Add hotplug configuration for the above events.
      - Add configurations for hotplug actions for the modular system.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-3-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    • platform_data/mlxreg: Add new type to support modular systems · aafa1caf
      Vadim Pasternak authored
      Add new types for the Nvidia modular systems MSN4800 which could
      be equipped with the different types of replaceable line cards.
      
      Add new type to specify the kind of hotplug events for the line cards.
      The line card events are generated by the programmable device located
      on the main board. This device implements interrupt controller logic.
      Line card interrupts are associated with the different line card
      states during initialization: insertion, security signature
      validation, power good state, hardware-firmware synchronization
      state, line card PHYs readiness state, firmware availability for
      line card ports. Also, under some circumstances hardware can
      generate a thermal shutdown for a particular line card.
      
      Add new type specifying the action which should be performed when a
      particular hotplug event is received. This action defines how the
      hotplug event should be handled by the hotplug driver. The following
      action types are defined:
      - Connect I2C device with empty 'platform_data' field according to the
        platform topology, if device is configured (for example, the power
        unit micro-controller driver, when the power unit is connected to a
        power source). This is what is currently supported.
      - Connect device with 'platform_data' field set according to the
        platform topology. The purpose is to pass 'platform_data' through
        hotplug driver to underlying device (for example line card driver).
      - No device is associated with hotplug event - just send "udev" event.
        This is what is currently supported.
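
The three action types listed above can be pictured as an enumeration plus a dispatch step. The following user-space C sketch is a model under stated assumptions: the enum constants, both structures and the `hp_decide` function are hypothetical names chosen for illustration and are not claimed to match the definitions added by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three action kinds listed above. */
enum hp_action {
	HP_ACTION_I2C_DEFAULT,	/* connect I2C device, empty platform_data */
	HP_ACTION_PLATFORM,	/* connect device with platform_data set */
	HP_ACTION_NONE		/* no device, only a "udev" event is sent */
};

struct hp_device {
	enum hp_action action;
	const void *platform_data;	/* used only by HP_ACTION_PLATFORM */
};

/* What this sketch decides to do for one hotplug event. */
struct hp_decision {
	int attach;		/* 1 if a device gets connected */
	const void *pdata;	/* data handed to that device, or NULL */
};

static struct hp_decision hp_decide(const struct hp_device *dev)
{
	struct hp_decision d = { 0, NULL };

	assert(dev);
	switch (dev->action) {
	case HP_ACTION_I2C_DEFAULT:
		d.attach = 1;	/* platform_data intentionally left empty */
		break;
	case HP_ACTION_PLATFORM:
		d.attach = 1;
		d.pdata = dev->platform_data;
		break;
	case HP_ACTION_NONE:
		break;		/* notification only, nothing to attach */
	}
	return d;
}
```

Keeping the decision separate from the attach step makes the difference between the first two cases explicit: both connect a device, but only the platform action forwards data to it.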
      
      Extend structure 'mlxreg_hotplug_device' with hotplug action field.
      
      Extend structure 'mlxreg_core_data' with:
      - Registers for line card power and enabling control.
      - Slot number field, to indicate at which physical slot replaceable
        line card device is located.
      Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
      Reviewed-by: Michael Shych <michaelsh@nvidia.com>
      Link: https://lore.kernel.org/r/20211002093238.3771419-2-vadimp@nvidia.com
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
  2. 11 Oct, 2021 14 commits
  3. 28 Sep, 2021 6 commits
  4. 21 Sep, 2021 5 commits
  5. 16 Sep, 2021 3 commits
  6. 14 Sep, 2021 2 commits