feat(trace-utils)!: search all spans to populate tracer payload fields#1954
Conversation
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1954 +/- ##
==========================================
+ Coverage 72.54% 72.63% +0.09%
==========================================
Files 451 451
Lines 74481 74687 +206
==========================================
+ Hits 54030 54249 +219
+ Misses 20451 20438 -13
🚀 New features to boost your workflow:
|
🎉 All green!❄️ No new flaky tests detected 🎯 Code Coverage (details) 🔗 Commit SHA: fd0ce76 | Docs | Datadog PR Page | Give us feedback! |
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fd5c1b9464
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let Some(v) = root.meta.get(field) { | ||
| return Some(v.clone()); |
There was a problem hiding this comment.
Skip empty trace tag values when searching spans
When a root span carries the requested key with an empty string (for example version: "") and a child span in the same trace has the non-empty value, this returns immediately with the empty value and never checks the remaining spans. The caller then leaves app_version/env/etc. empty for single-trace payloads, which misses the stated goal of finding a non-empty value when the root span has an empty tag.
Useful? React with 👍 / 👎.
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
| assert!(!span.meta.contains_key("aas.site.type")); | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
So I was look at pb:Span, and it looks like there's a bunch of fuzzing used to test that code. Do you think that a bolero check!() type test might be useful here? How well formed are trace payloads, generally?
There was a problem hiding this comment.
collect_pb_trace_chunks, where search_trace_for_field is called, is certainly complex enough to warrant a bolero test in general for everything that it's doing. But for what we want to test as part of this PR in search_trace_for_field - "best effort, first non-empty value wins from any span regardless of structural validity". It just means the bolero test can only assert "if a value is returned, it came from some span in the trace", which is already guaranteed from the other unit tests.
| // from an earlier trace. | ||
| let root = &trace[root_span_index]; | ||
| if tracer_payload_tags.env.is_empty() { | ||
| if let Some(v) = search_trace_for_field(root, trace, "env") { |
There was a problem hiding this comment.
So this trace gets normalized up here right: https://github.com/DataDog/libdatadog/pull/1954/changes#diff-e35e80d6b64d2cc411fc38dad84cf1cf35bd47501f9ef20761c1d20c1b5abd26R631
But it looks like that normalization can have errors, which are logged, but don't break the loop. This is sort of why I'm wondering about fuzzing/trace payloads. Can we get incorrect children passing on incorrect tags?
There was a problem hiding this comment.
Between these normalize functions, only the env field is normalized for the fields relevant to the tracer payload change (env, version, _dd.hostname, runtime-id).
There is a scenario where normalization fails on a span and later spans are not normalized, resulting in potentially non-normalized env values being set on the tracer payload. I think the safest way to handle this scenario would be to apply normalization in collect_pb_trace_chunks rather than modifying how normalize_trace short circuits on an error when iterating over spans.
if let Some(mut v) = search_trace_for_field(root, trace, "env") {
normalize_utils::normalize_tag(&mut v);
tracer_payload_tags.env = v;
}
| "version" => root_span_tags.app_version, | ||
| "_dd.hostname" => root_span_tags.hostname, | ||
| "runtime-id" => root_span_tags.runtime_id, | ||
| if is_agentless { |
There was a problem hiding this comment.
When is_agentless set and what does it mean in this context? If this code is running in the agent, how can it be "agentless"? 😅
There was a problem hiding this comment.
I think this comes from the fact that these trace utils are used in the tracers in addition to the Serverless Compatibility Layer and Bottlecap. From the tracer perspective, if traces are sent directly to the backend, is_agentless: true means "do the normalization and other tasks the Go agent would normally do." Maybe a more appropriate name for this parameter would be something like perform_agent_processing?
There was a problem hiding this comment.
Thanks for clarifying!
Maybe a more appropriate name for this parameter would be something like
perform_agent_processing
Makes sense, but probably out scope of this PR.
|
/merge |
|
View all feedbacks in Devflow UI.
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
The expected merge time in
|
What does this PR do?
When extracting span tags to set in the payload, iterate through each trace to find a non-empty value for the tag. Search root spans on each trace first then additional spans if the root span has an empty value for the specified tag.
Motivation
If a trace payload contains multiple trace chunks, and one of those chunks does not contain a trace that has a value for a given tag (like
version), then the top level field set in the tracer payload could be empty even when other traces could have non-empty values for the same tag.This situation arose with the .NET tracer sending a
command_executionspan with no version in the same payload as anazure_functions.invokespan with a version. As a result, the trace metric generated from this payload,trace.azure_functions.invoke.hits, did not have a version tag set.https://datadoghq.atlassian.net/browse/SVLS-9006
Additional Notes
RootSpanTagstoTracerPayloadTagsto better reflect the source of the populated fields.How to test the change?
Unit tests:
envvalue is normalized bycollect_pb_trace_chunksenvvalues after normalization are skipped. incollect_pb_trace_chunks