- 15 Mar, 2023 3 commits
-
-
Dafna Hirschfeld authored
Since hw_fini returns an error code to indicate failure, we should check its return value. Currently it can fail only upon soft-reset from hl_device_reset. A later patch will add an hw_fini failure in case of a polling timeout in hard-reset.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
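The pattern described in this commit can be sketched in plain userspace C. This is only an illustration: the struct layout and the names `demo_asic_funcs`, `demo_hw_fini`, and `demo_device_reset` are hypothetical stand-ins, not the driver's real definitions. Only the idea of checking and propagating the callback's return code comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-ASIC callback table. */
struct demo_asic_funcs {
	/* Returns 0 on success or a negative errno value on failure. */
	int (*hw_fini)(bool hard_reset);
};

/* Hypothetical callback: in this sketch only the soft-reset path fails. */
static int demo_hw_fini(bool hard_reset)
{
	return hard_reset ? 0 : -EIO;
}

/* The caller checks the return value and propagates it instead of ignoring it. */
static int demo_device_reset(const struct demo_asic_funcs *funcs, bool hard_reset)
{
	int rc;

	assert(funcs && funcs->hw_fini);

	rc = funcs->hw_fini(hard_reset);
	if (rc) {
		fprintf(stderr, "hw_fini failed during %s reset (%d)\n",
			hard_reset ? "hard" : "soft", rc);
		return rc;
	}

	return 0;
}
```

Returning the error to the caller lets the reset flow decide how to react, rather than continuing as if the hardware had been finalized.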
-
Dafna Hirschfeld authored
We later use a cpucp packet for soft reset, which might fail, so we should be able to propagate the failure case.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
-
Ofir Bitton authored
In order to allow TPC engines to raise an assert, we must expose the relevant MSIX interrupt to the user so that the user can configure the engine correctly. In addition, we implement the corresponding interrupt handler, which notifies the user upon such an event.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
-
- 26 Jan, 2023 3 commits
-
-
Ohad Sharabi authored
This function shall be used whenever the components' enable/binning masks should be updated. It is used in one of the following cases:
  - update user (or default) component masks
  - update when getting the masks from FW (either CPUCP or COMMS)
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Now that we have a subsystem for compute accelerators, move the habanalabs driver to it. This patch only moves the files and fixes the Makefiles. Future patches will change the existing code to register to the accel subsystem and expose the accel device char files instead of the habanalabs device char files. Update the MAINTAINERS file to reflect this change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
As ASICs are evolving, we will need to update the DRAM properties at various points because we may get different information from the f/w at different points of the initialization. This ASIC function is a foundation for this capability.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
- 23 Nov, 2022 1 commit
-
-
Tomer Tayar authored
Add missing le32_to_cpu() conversions, and use %d for the value returned from atomic_read().
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
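To illustrate why such conversions matter, here is a userspace analogue of what le32_to_cpu() achieves. This is a sketch only: `demo_le32_to_host` is a hypothetical helper, not the kernel macro, and it decodes from a byte array rather than from a typed little-endian field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a little-endian 32-bit value byte by byte. Because the bytes are
 * combined arithmetically, the result is the same on little-endian and
 * big-endian hosts.
 */
static uint32_t demo_le32_to_host(const uint8_t bytes[4])
{
	assert(bytes != NULL);

	return (uint32_t)bytes[0] |
	       ((uint32_t)bytes[1] << 8) |
	       ((uint32_t)bytes[2] << 16) |
	       ((uint32_t)bytes[3] << 24);
}
```

Reading such a field directly as a native integer would give the right value on little-endian hosts only, which is why a missing conversion can go unnoticed on common hardware.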
-
- 19 Sep, 2022 1 commit
-
-
farah kassabri authored
As part of the RAS that is done by the f/w, we should send a message to the f/w when a user either acquires or releases the device.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
- 18 Sep, 2022 5 commits
-
-
Ofir Bitton authored
Except for Goya, none of our ASICs requires the context switch flow, hence we enable this flow only where it is needed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dani Liberman authored
Since the hwmon fini code is common to all ASICs, unify it into a common function.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
We don't use KDMA concurrently in the driver. The only use is through debugfs and we don't protect concurrent access through it.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
In order to be more explicit, we should use the term compute_reset to describe the reset in which only the compute engines get reset.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dani Liberman authored
Change the is_idle functions so that they are more usable outside debugfs. Do this by replacing the seq_file parameter with a regular string.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
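The interface change described here can be sketched in userspace C as a function that reports busy engines into a caller-supplied string buffer. Everything in this sketch is an assumption for illustration: `demo_is_idle`, its parameters, and the message text are hypothetical and do not reflect the driver's actual signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Return true if no engine is busy. If buf is non-NULL, append one line per
 * busy engine to it; any caller can pass a plain buffer, not only debugfs.
 */
static bool demo_is_idle(const bool *engine_busy, size_t num_engines,
			 char *buf, size_t buf_len)
{
	bool idle = true;
	size_t used = 0;

	assert(engine_busy != NULL);

	if (buf && buf_len)
		buf[0] = '\0';

	for (size_t i = 0; i < num_engines; i++) {
		if (!engine_busy[i])
			continue;

		idle = false;

		/* Append a line only while there is room left in the buffer. */
		if (buf && used < buf_len) {
			int n = snprintf(buf + used, buf_len - used,
					 "engine %zu is busy\n", i);
			if (n > 0)
				used += (size_t)n;
		}
	}

	return idle;
}
```

A debugfs handler could still print the resulting string, while other callers can log it or ignore it by passing NULL.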
-
- 12 Jul, 2022 15 commits
-
-
Oded Gabbay authored
H/W being dirty during initialization is completely expected in case f/w tools are used before loading the driver. As it is not an error, and as it doesn't give any meaningful information to the user, there is no point in printing it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Doing compute reset can be the traditional inference soft reset that is supported only in Goya. Or it can be the new reset upon device release, which is supported in Gaudi2 and above. Therefore, wherever suitable, use the terminology of compute reset instead of soft reset.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Tomer Tayar authored
For Gaudi2 we need to send a value to the F/W as part of the PCI_ACCESS packet. As a preparation, modify hl_fw_send_pci_access_msg() to have a 'value' field.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
Currently we are not waiting for preboot ready after hard reset. This leads to a race in which the COMMS protocol begins but gets no response from the f/w.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
New ASIC properties were added for Gaudi2. We want to initialize and use them, when relevant, also for Goya and Gaudi.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
There are a number of new ASIC-specific functions that were added for Gaudi2. To make the common code work, we need to define empty implementations of those functions for Goya and Gaudi. Some functions will return an error if called with Goya/Gaudi.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add the ASIC-specific code for Gaudi2. Supply (almost) all of the function callbacks that the driver's common code needs to initialize, finalize and submit workloads to the Gaudi2 ASIC. It also contains the code to initialize the F/W of the Gaudi2 ASIC and to receive events from the F/W. It contains a new debugfs entry to dump razwi events. A razwi is a case where the device's engines create a transaction that reaches an invalid destination.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
The PCI BAR size is of type resource_size_t, so we should use %pa to make it print correctly on all architectures.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
Because in future ASICs the driver will allow the user to set the page size, we need to make sure this data is propagated in all APIs. In addition, since this is already an ASIC property, we no longer need an ASIC function for it.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
We dropped support for page sizes that are not a power of 2.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
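As a rough illustration of what a power-of-2 restriction allows, the sketch below checks the property and uses it for mask-based alignment. The helper names are hypothetical, and the alignment benefit is a general observation about power-of-2 sizes, not a reason stated in the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A power of 2 has exactly one bit set, so clearing the lowest set bit gives 0. */
static bool demo_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* With a power-of-2 page size, rounding up needs only an add and a mask. */
static uint64_t demo_align_up(uint64_t addr, uint64_t page_size)
{
	assert(demo_is_power_of_2(page_size));

	return (addr + page_size - 1) & ~(page_size - 1);
}
```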
-
Ohad Sharabi authored
This is a prerequisite patch for adding tracepoints to the DMA memory operations (allocation/free) in the driver. The main purpose is to be able to cross-reference the data with the map operations and determine whether a memory violation occurred, for example freeing a DMA allocation before unmapping it from device memory. To achieve this, the DMA alloc/free code flows were refactored so that a single DMA tracepoint will catch many flows.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
The values in this enum are not used by h/w but are a contract between userspace and the kernel driver so they must be defined in the uapi file.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
We use scrub_device_mem only to scrub the entire SRAM and entire DRAM. Therefore there is no need to send addr and size args to the callback.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
There is a rare race condition in the CB completion mechanism that can occur under very high pressure of command submissions.
The preconditions for this to happen are:
  1. There should be enough command submissions for the pre-allocated patched CB pool to run out of commands. At this stage we start allocating new patched CBs as they arrive.
  2. The CB size has to be exactly (128*n + 104)B for some n, i.e. 24B below a cache line end.
The flow:
  1. Two command buffers are being completed on different streams at the same time. Denote those CB1 and CB2.
  2. Each command buffer is injected with two messages, 16B each: one for a HBW update of the completion queue, another to raise the interrupt.
  3. Assume CB1 updated the completion queue and raised the interrupt.
  4. Assume CB2 updated the completion queue but did not raise the interrupt yet.
  5. The host receives the interrupt. It goes over the completion queue and sees two completions, CB1 and CB2. It releases them both.
  6. CB2 performs the last command.
The problem is that the last command is split between 2 cache lines. So to read the last 8B of the last command, it has to access the host again. However, CB2 is already released, which causes a DMAR error.
The solution to this problem is simply to make sure the last two commands in the CB are always in the same cache line, using NOP padding.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
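The padding arithmetic behind this fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the driver's actual code: the 128B cache line and 16B message sizes are taken from the commit message, while `demo_tail_padding` and the `DEMO_*` constants are hypothetical names, and the real driver may compute the padding differently.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE_SIZE 128u /* bytes, per the (128*n + 104) figure above */
#define DEMO_MSG_SIZE        16u  /* bytes per injected completion message */
#define DEMO_TAIL_SIZE       (2u * DEMO_MSG_SIZE)

/*
 * Return how many bytes of NOP padding to append to a CB of cb_size bytes so
 * that the two completion messages that follow share a single cache line.
 */
static size_t demo_tail_padding(size_t cb_size)
{
	size_t offset = cb_size % DEMO_CACHE_LINE_SIZE;

	/* Both messages already fit in the current cache line. */
	if (offset + DEMO_TAIL_SIZE <= DEMO_CACHE_LINE_SIZE)
		return 0;

	/* Otherwise pad up to the start of the next cache line. */
	return DEMO_CACHE_LINE_SIZE - offset;
}
```

For the problematic size from the commit message, an offset of 104B within the cache line, this yields 24B of padding, so both messages start at the beginning of the next cache line and are fetched together.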
-
Oded Gabbay authored
This ASIC callback function is no longer called from the common code. The ASIC-specific function itself is still called, but only from within the ASIC-specific code.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
- 22 May, 2022 12 commits
-
-
Ohad Sharabi authored
When the user requests to prefetch the MMU translations, the driver will not block the user until the prefetch is done. Instead, the prefetch work will be delegated to a WQ, which will do it in the background. This way, the prefetch may progress without blocking the user at all.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
Add the ability to scrub the device memory with a given value. Add a file 'dram_mem_scrub_val' to set the value and a file 'dram_mem_scrub' to scrub the DRAM. This is very important to help during automated tests, when you want the CI system to randomize the memory before training certain DL topologies.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
With the new code required for the flow added, we can now switch to using the new memory manager infrastructure, removing the old code.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
Instead of using for_each_sg when iterating an sgt that contains DMA entries, use the more appropriate for_each_sgtable_dma_sg macro. In addition, both Goya and Gaudi have the exact same implementation of the ASIC function that encapsulates the usage of this macro, so it is better to move that implementation to the common code.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
Currently we have two reset prints per reset. One is in the common code and one in each ASIC-specific file. We can change the ASIC-specific message to be debug only as we can know the type of reset being done according to the print in the common code, which is also easier to maintain.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
"Halting compute engines" is a print that doesn't give us any information because it is always done in the reset process and not used elsewhere. Even if it were, we don't use prints to mark functions we passed through.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
The debugfs memory access now uses the callback 'access_dev_mem', so there is no use for the callbacks 'debugfs_{read32,read64,write32,write64}'. Remove them.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
This is a preparation for unifying the code of accessing device memory through debugfs. Add struct fields and callbacks that will later be used in debugfs code and will reduce code duplication among the different read{32,64}/write{32,64} callbacks of every ASIC.
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
When a Gaudi device is secured, the monitors' data in the configuration space is blocked from PCI access. As we need to enable the user to get the sync-manager monitors' registers when debugging, this patch adds a debugfs entry that dumps the information to a binary file (blob). When a root user triggers the dump, the driver sends a request to the f/w to fill a data structure containing a dump of all the monitors' registers.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
This is a necessary prerequisite for future ASIC support, where MMU TLB prefetch is supported.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tomer Tayar authored
The required DMA mask is no longer based on input from the F/W, but it is fixed per ASIC according to its address space. As such, the per-ASIC function to get this value can be replaced with a property variable.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
Future devices will support multiple device memory page sizes. In addition, an API was added that lets the user control the device memory allocation page size. This is a complementary patch that informs the user of the page sizes supported by the device.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-