Ben Hutchings authored
Whenever we add DMA descriptors to a TX ring and update the ring
pointer, the TX DMA engine must first read the new DMA descriptors and
then start reading packet data. However, all released Solarflare 10G
controllers have a 'TX push' feature that allows us to reduce latency
by writing the first new DMA descriptor along with the pointer update.
This is only useful when the queue is empty. The hardware should
ignore the pushed descriptor if the queue is not empty, but this check
is buggy, so we must do it in software.

In order to tell whether a TX queue is empty, we need to compare the
previous transmission count (write_count) and completion count
(read_count). However, if we do that every time we update the ring
pointer then read_count may ping-pong between the caches of two CPUs
running the transmission and completion paths for the queue.

Therefore, we split the check for an empty queue between the
completion path and the transmission path:

- Add an empty_read_count field representing a point at which the
  completion path saw the TX queue as empty.
- Add an old_write_count field for use on the completion path.
- On the completion path, whenever read_count reaches or passes
  old_write_count the TX queue may be empty. We then read
  write_count, set empty_read_count if read_count == write_count,
  and update old_write_count.
- On the transmission path, we read empty_read_count. If it's set,
  we compare it with the value of write_count before the current
  set of descriptors was added. If they match, the queue really is
  empty and we can use TX push.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
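
The split check described above can be sketched as a small,
single-threaded model. This is an illustration only, not the driver
code: the struct and function names (txq_sketch, txq_init,
txq_complete, txq_add) are hypothetical, the separate empty_valid flag
is an assumed way of representing "empty_read_count is set", and
clearing that flag once the transmission path has consumed it is also
an assumption. A real two-CPU implementation would additionally need
appropriate memory barriers, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified queue state; field names follow the text. */
struct txq_sketch {
	/* Updated by the transmission path */
	unsigned int write_count;      /* descriptors added so far */
	/* Updated by the completion path */
	unsigned int read_count;       /* descriptors completed so far */
	unsigned int old_write_count;  /* write_count as last sampled here */
	unsigned int empty_read_count; /* read_count when queue was seen empty */
	bool empty_valid;              /* assumption: "empty_read_count is set" */
};

/* Assumption: a freshly initialised queue counts as "seen empty". */
static void txq_init(struct txq_sketch *q)
{
	q->write_count = 0;
	q->read_count = 0;
	q->old_write_count = 0;
	q->empty_read_count = 0;
	q->empty_valid = true;
}

/* Completion path: account for n completed descriptors. */
static void txq_complete(struct txq_sketch *q, unsigned int n)
{
	/* Cannot complete more descriptors than are outstanding. */
	assert(n <= q->write_count - q->read_count);
	q->read_count += n;

	/*
	 * Only touch write_count once read_count has reached or passed
	 * the last sampled value. The signed difference keeps the
	 * comparison valid across counter wrap-around.
	 */
	if ((int)(q->read_count - q->old_write_count) >= 0) {
		q->old_write_count = q->write_count;
		if (q->read_count == q->old_write_count) {
			q->empty_read_count = q->read_count;
			q->empty_valid = true;
		}
	}
}

/* Transmission path: add n descriptors; true means TX push may be used. */
static bool txq_add(struct txq_sketch *q, unsigned int n)
{
	unsigned int old_write = q->write_count;
	bool push = false;

	q->write_count += n;

	if (q->empty_valid) {
		/* Assumption: consume the marker so a stale value is not reused. */
		q->empty_valid = false;
		push = (q->empty_read_count == old_write);
	}
	return push;
}
```

In this model the transmission path never reads read_count and the
completion path reads write_count only when the queue may have
drained, which is the cache-line behaviour the commit is aiming for.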
cd38557d