- 06 Apr, 2019 3 commits
-
-
Oded Gabbay authored
During hard-reset, contexts are closed as part of the tear-down process. After a context is closed, the driver cleans up the page tables of that context in the device's DRAM. This action is both dangerous and unnecessary. It is unnecessary, because the device is going through a hard-reset, which means the device's DRAM contents are no longer valid and the device's MMU is being reset. It is dangerous, because if the hard-reset came as a result of a PCI freeze, this action may cause the entire host machine to hang. Therefore, prevent all device PTE updates when a hard-reset operation is pending. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
This patch makes some improvement in how IOCTLs behave when the device is disabled or under reset. The new code checks, at the start of every IOCTL, if the device is disabled or in reset. If so, it prints an appropriate kernel message and returns -EBUSY to user-space. In addition, the code modifies the location of where the hard_reset_pending flag is being set or cleared: 1. It is now cleared immediately after the reset *tear-down* flow is finished but before the re-initialization flow begins. 2. It is being set in the remove function of the device, to make the behavior the same with the hard-reset flow There are two exceptions to the disable or in reset check: 1. The HL_INFO_DEVICE_STATUS opcode in the INFO IOCTL. This opcode allows the user to inquire about the status of the device, whether it is operational, in reset or malfunction (disabled). If the driver will block this IOCTL, the user won't be able to retrieve the status in case of malfunction or in reset. 2. The WAIT_FOR_CS IOCTL. This IOCTL allows the user to inquire about the status of a CS. We want to allow the user to continue to do so, even if we started a soft-reset process because it will allow the user to get the correct error code for each CS he submitted. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
This patch fixes a bug in the implementation of the function that removes the device. The bug can happen when the device is removed but not the driver itself (e.g. remove by the OS due to PCI freeze in Power architecture). In that case, there maybe open users that are calling IOCTLs while the device is removed. This is a possible race condition that the driver must handle. Otherwise, a kernel panic may occur. This race is prevented in the hard-reset flow, because the driver makes sure the users are closed before continuing with the hard-reset. This race can not occur when the driver itself is removed because the OS makes sure all the file descriptors are closed. The fix is to make sure the open users close their file descriptors and if they don't (after a certain amount of time), the driver sends them a SIGKILL, because the remove of the device can't be stopped. The patch re-uses the same code that is called from the hard-reset flow. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 04 Apr, 2019 2 commits
-
-
Oded Gabbay authored
To make the memory ioctl code more readable, this patch moves the legacy/debug code path of mmu-disabled to a separate function, which is called (if necessary) from the main memory ioctl function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
This patch removes the enum value of ASIC_AUTO_DETECT because we can use the validity of the pdev variable to know whether we have a real device or a simulator. For a real device, we detect the asic type from the device ID while for a simulator, the simulator code calls create_hdev() with the specified ASIC type. Set ASIC_INVALID as the first option in the enum to make sure that no other enum value will receive the value 0 (which indicates a non-existing entry in the simulator array). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 02 Apr, 2019 1 commit
-
-
Oded Gabbay authored
This patch does some refactoring in goya.c to make code more reusable between goya code and the goya simulator code (which is not upstreamed). In addition, the patch removes some dead functions from goya.c which are not used by the current upstream code Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 03 Apr, 2019 1 commit
-
-
Oded Gabbay authored
This patch adds a better explanation about the sequence number that is returned per CS. It also fixes the comment about queue numbering rules. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 01 Apr, 2019 2 commits
-
-
Omer Shpigelman authored
This patch adds the ASIC-specific function for GOYA to configure the coresight components. Most of the components have an enabled/disabled flag, depending on whether the user wants to enable the component or disable it. For some of the components, such as ETR and SPMU, the user can also request to read values from them. Those values are needed for the user to parse the trace data. The ETR configuration is also checked for security purposes, to make sure the trace data is written to the device's DRAM. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Omer Shpigelman authored
Habanalabs ASICs use the ARM coresight infrastructure to support debug, tracing and profiling of neural networks topologies. Because the coresight is configured using register writes and reads, and some of the registers hold sensitive information (e.g. the address in the device's DRAM where the trace data is written to), the user must go through the kernel driver to configure this mechanism. This patch implements the common code of the IOCTL and calls the ASIC-specific function for the actual H/W configuration. The IOCTL supports configuration of seven coresight components: ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP The user specifies which component he wishes to configure and provides a pointer to a structure (located in its process space) that contains the relevant configuration. The common code copies the relevant data from the user-space to kernel space and then calls the ASIC-specific function to do the H/W configuration. After the configuration is done, which is usually composed of several IOCTL calls depending on what the user wanted to trace, the user can start executing the topology. The trace data will be written to the user's area in the device's DRAM. After the tracing operation is complete, and user will call the IOCTL again to disable the tracing operation. The user also need to read values from registers for some of the components (e.g. the size of the trace data in the device's DRAM). In that case, the user will provide a pointer to an "output" structure in user-space, which the IOCTL code will fill according the to selected component. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 02 Apr, 2019 1 commit
-
-
Oded Gabbay authored
This patch removes an extra ; after the closing brackets of a while loop. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
-
- 31 Mar, 2019 2 commits
-
-
Oded Gabbay authored
Unmapping ptes in the device MMU on Palladium can take a long time, which can cause a kernel BUG of CPU soft lockup. This patch minimize the chances for this bug by sleeping a little between unmapping ptes. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
GIT does not like extra blank lines at the end of the file, so this patch removes those lines. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
-
- 27 Mar, 2019 1 commit
-
-
Oded Gabbay authored
This patch improves two error messages to help the user to better understand what error occurred. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 24 Mar, 2019 1 commit
-
-
Dalit Ben Zoor authored
This patch adds a new opcode to INFO IOCTL that returns the device status. This will allow users to query the device status in order to avoid sending command submissions while device is in reset. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 21 Mar, 2019 1 commit
-
-
Dalit Ben Zoor authored
This patch allows the user to modify the TPC PLL clock relaxation value on-the-fly in order to reduce power consumption. To enable this, the patch removes the protection from the specific register that controls this behavior. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 20 Mar, 2019 1 commit
-
-
Dalit Ben Zoor authored
On init or context switch, set TPC clock relaxation counter register to a golden value. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 17 Mar, 2019 1 commit
-
-
Oded Gabbay authored
Hard-reset of our device should never fail, due to dangers of permanent damage to the H/W. This patch removes the last place in the reset path where the driver might exit before doing the actual reset. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 07 Mar, 2019 1 commit
-
-
Oded Gabbay authored
This patch refactors the code that is responsible to set the DMA mask for the device. Upon each change of the dma mask, the driver will save the new value that was set. This is needed in order to make sure we don't try to increase the mask a second time, in case we failed in the first time. This is especially relevant for Power machines, as that may cause a change in configuration of the TVT which will break the device. Goya will first try to set the device's dma mask to 39 bits, so that the memory that is allocated on the host machine for communication with the device's cpu will be in a bus address which is lower then 39 bits. Later, Goya will try to increase that mask to 48 bits, but only if setting the mask to 39 bits was successful. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 24 Feb, 2019 1 commit
-
-
Omer Shpigelman authored
This patch adds shadow mapping to the MMU module. The shadow mapping allows traversing the page table in host memory rather reading each PTE from the device memory. It brings better performance and avoids reading from invalid device address upon PCI errors. Only at the end of map/unmap flow, writings to the device are performed in order to sync the H/W page tables with the shadow ones. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 12 Mar, 2019 1 commit
-
-
Tomer Tayar authored
The addr/data32 debugfs nodes currently permit the access to only physical addresses of a device. This patch extends it and allows accessing also device's DRAM virtual addresses. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 07 Mar, 2019 2 commits
-
-
Tomer Tayar authored
Print the name of a busy engine when checking if a device is idle. The change is done mainly to help a user to pinpoint problems in his topology's recipe. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
This patch adds two comments in uapi/habanalabs.h: - From which queue id the internal queues begin - Invalid values that can be returned in the seq field from the CS IOCTL Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 06 Mar, 2019 1 commit
-
-
Tomer Tayar authored
Remove pointers to ASIC-specific functions and instead call the functions explicitly as they are not accessed from outside the ASIC-specific files. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 05 Mar, 2019 2 commits
-
-
Tomer Tayar authored
Move duplicated PCI-related code from ASIC-specific files into the common pci.c file. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Oded Gabbay authored
At the start of some IOCTLs we check if the device is disabled or in reset. If it is, we return -EBUSY and print a message to kernel log. Because these IOCTLs can be called at very high frequency, use ratelimit to avoid spamming the kernel log. Also use the same type of message - dev_warn - in all the relevant IOCTLs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 28 Feb, 2019 1 commit
-
-
Oded Gabbay authored
This patch removes some old defines which are not in use anymore. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 04 Mar, 2019 2 commits
-
-
Oded Gabbay authored
The Event Queue MSI/X ID is different per ASIC. This patch renames the current define to have the GOYA_ prefix to mark it only for Goya. It also moves it from the common armcp_if.h file to the ASIC specific goya_fw_if.h file. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Tomer Tayar authored
This patch moves the code that is responsible of the communication vs. the F/W to a dedicated file. This will allow us to share the code between different ASICs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 27 Feb, 2019 1 commit
-
-
Dotan Barak authored
This will prevent unneeded include of header files, which may increase compilation time. Signed-off-by: Dotan Barak <dbarak@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 24 Feb, 2019 2 commits
-
-
Oded Gabbay authored
The goya_non_fatal_events array actually contains all the possible events the driver can receive from the F/W. Therefore, use a proper name for the array. The patch also adds missing event Ids to the goya_async_event_id enum. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
Igor Grinberg authored
This patch adds a definition of a new status in the device CPU boot stages and add the handling of the new status. Signed-off-by: Igor Grinberg <igrinberg@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
-
- 02 Apr, 2019 5 commits
-
-
Chengguang Xu authored
The function comment of __register_chrdev_region() is out of date, so update it based on the code. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chengguang Xu authored
It's just code cleanup, not functional change. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chengguang Xu authored
register_chrdev_region() carefully checks minor range before calling __register_chrdev_region() but there is another path from alloc_chrdev_region() which does not check the range properly. So add a check for given minor range in __register_chrdev_region(). Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chengguang Xu authored
Current overlap checking cannot correctly handle a case which is baseminor < existing baseminor && baseminor + minorct > existing baseminor + minorct. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vishnu DASA authored
No functional changes, cleanup only. Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 01 Apr, 2019 1 commit
-
-
Greg Kroah-Hartman authored
We want the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 31 Mar, 2019 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Paolo Bonzini: "A collection of x86 and ARM bugfixes, and some improvements to documentation. On top of this, a cleanup of kvm_para.h headers, which were exported by some architectures even though they not support KVM at all. This is responsible for all the Kbuild changes in the diffstat" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION KVM: doc: Document the life cycle of a VM and its resources KVM: selftests: complete IO before migrating guest state KVM: selftests: disable stack protector for all KVM tests KVM: selftests: explicitly disable PIE for tests KVM: selftests: assert on exit reason in CR4/cpuid sync test KVM: x86: update %rip after emulating IO x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts kvm: don't redefine flags as something else kvm: mmu: Used range based flushing in slot_handle_level_range KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region() kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation) KVM: Reject device ioctls from processes other than the VM's creator KVM: doc: Fix incorrect word ordering regarding supported use of APIs KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size' KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "A pile of x86 updates: - Prevent exceeding he valid physical address space in the /dev/mem limit checks. - Move all header content inside the header guard to prevent compile failures. - Fix the bogus __percpu annotation in this_cpu_has() which makes sparse very noisy. - Disable switch jump tables completely when retpolines are enabled. - Prevent leaking the trampoline address" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/realmode: Make set_real_mode_mem() static inline x86/cpufeature: Fix __percpu annotation in this_cpu_has() x86/mm: Don't exceed the valid physical address space x86/retpolines: Disable switch jump tables when retpolines are enabled x86/realmode: Don't leak the trampoline kernel address x86/boot: Fix incorrect ifdeffery scope x86/resctrl: Remove unused variable
-