Commit afb5e9e4 authored by Leo Yan, committed by Arnaldo Carvalho de Melo

perf arm-spe: Bail out if the trace is later than perf event

A record in the Arm SPE trace can be later than a perf event, and vice
versa.  The perf events and the samples synthesized from Arm SPE
therefore need to be processed in the correct time order.

To achieve this ordering, this patch reverses the flow: it first calls
arm_spe_sample() and then calls arm_spe_decode().  After decoding, it
compares timestamps; if the perf event is earlier than the Arm SPE trace
data, it bails out of the decoding loop, the last record is pushed into
the auxtrace heap, and sample generation for that record is deferred.
To track the timestamp, the queue timestamp is updated every time from
the latest record.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent 85498f75
......@@ -434,12 +434,36 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 {
 	struct arm_spe *spe = speq->spe;
+	struct arm_spe_record *record;
 	int ret;

 	if (!spe->kernel_start)
 		spe->kernel_start = machine__kernel_start(spe->machine);

 	while (1) {
+		/*
+		 * The usual logic is firstly to decode the packets, and then
+		 * based the record to synthesize sample; but here the flow is
+		 * reversed: it calls arm_spe_sample() for synthesizing samples
+		 * prior to arm_spe_decode().
+		 *
+		 * Two reasons for this code logic:
+		 * 1. Firstly, when setup queue in arm_spe__setup_queue(), it
+		 * has decoded trace data and generated a record, but the record
+		 * is left to generate sample until run to here, so it's correct
+		 * to synthesize sample for the left record.
+		 * 2. After decoding trace data, it needs to compare the record
+		 * timestamp with the coming perf event, if the record timestamp
+		 * is later than the perf event, it needs bail out and pushs the
+		 * record into auxtrace heap, thus the record can be deferred to
+		 * synthesize sample until run to here at the next time; so this
+		 * can correlate samples between Arm SPE trace data and other
+		 * perf events with correct time ordering.
+		 */
+		ret = arm_spe_sample(speq);
+		if (ret)
+			return ret;
+
 		ret = arm_spe_decode(speq->decoder);
 		if (!ret) {
 			pr_debug("No data or all data has been processed.\n");
......@@ -453,10 +477,17 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 		if (ret < 0)
 			continue;

-		ret = arm_spe_sample(speq);
-		if (ret)
-			return ret;
+		record = &speq->decoder->record;

+		/* Update timestamp for the last record */
+		if (record->timestamp > speq->timestamp)
+			speq->timestamp = record->timestamp;
+
+		/*
+		 * If the timestamp of the queue is later than timestamp of the
+		 * coming perf event, bail out so can allow the perf event to
+		 * be processed ahead.
+		 */
 		if (!spe->timeless_decoding && speq->timestamp >= *timestamp) {
 			*timestamp = speq->timestamp;
 			return 0;
......
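
The reversed flow can be tried outside of perf with a small stand-alone model.
The sketch below is an illustration only, not the perf implementation: every
toy_* name, the fixed-size sample buffer, and the return convention (1 when the
trace is drained, 0 on bail-out) are assumptions made for this example. It
models a queue whose first record is decoded at setup time, a loop that
synthesizes the pending sample before decoding the next record, and the
bail-out once the queue timestamp reaches the timestamp of the coming perf
event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SAMPLES 16

/* Hypothetical stand-in for one SPE queue; not the perf data structures. */
struct toy_queue {
	const uint64_t *records;	/* record timestamps, in trace order */
	size_t nr_records;
	size_t next;			/* index of the next record to decode */
	bool have_pending;		/* a decoded record still awaits its sample */
	uint64_t pending_ts;		/* timestamp of that pending record */
	uint64_t timestamp;		/* latest timestamp seen on this queue */
	uint64_t emitted[TOY_MAX_SAMPLES];	/* timestamps of synthesized samples */
	size_t nr_emitted;
};

/* Decode one record: returns 1 on success, 0 when the trace is exhausted. */
static int toy_decode(struct toy_queue *q)
{
	if (q->next >= q->nr_records)
		return 0;
	q->pending_ts = q->records[q->next++];
	q->have_pending = true;
	return 1;
}

/* Synthesize a sample for the pending record, if there is one. */
static void toy_sample(struct toy_queue *q)
{
	if (!q->have_pending)
		return;
	assert(q->nr_emitted < TOY_MAX_SAMPLES);
	q->emitted[q->nr_emitted++] = q->pending_ts;
	q->have_pending = false;
}

/* Setup decodes the first record but leaves its sample for the run loop. */
static void toy_queue_init(struct toy_queue *q, const uint64_t *records,
			   size_t nr_records)
{
	*q = (struct toy_queue){ .records = records, .nr_records = nr_records };
	if (toy_decode(q))
		q->timestamp = q->pending_ts;
}

/*
 * Sample first, then decode.  Returns 1 when the trace is drained and 0 when
 * bailing out because the queue has reached the perf event's timestamp; in
 * the latter case the last decoded record stays pending and *timestamp is
 * set to the queue timestamp.
 */
static int toy_run_decoder(struct toy_queue *q, uint64_t *timestamp)
{
	for (;;) {
		toy_sample(q);		/* flush the record deferred earlier */

		if (!toy_decode(q))
			return 1;

		/* Track the latest record timestamp on the queue. */
		if (q->pending_ts > q->timestamp)
			q->timestamp = q->pending_ts;

		if (q->timestamp >= *timestamp) {
			*timestamp = q->timestamp;
			return 0;
		}
	}
}
```

With record timestamps 10, 20, 30, 40 and a perf event at 25, the first call
synthesizes the samples for 10 and 20, decodes 30, and bails out with the
record at 30 still pending. That record is only turned into a sample on the
next call, after the perf event at 25 has had its turn.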