[Content Understanding] Update toLlmInput page markers and filter LLMStats telemetry by chienyuanchang · Pull Request #49396 · Azure/azure-sdk-for-java

chienyuanchang · 2026-06-05T23:43:26Z

Description

Updates the azure-ai-contentunderstanding LlmInputHelper.toLlmInput() helper to align its rendered output with the upcoming service page-marker format and to remove non-user-facing telemetry from RAI warning output.

Changes made:

Updated SDK-injected document page markers to the new `<!-- InputPageNumber: N -->` format.
Added duplicate-marker defense: if service markdown already contains <!-- InputPageNumber:, toLlmInput() does not inject additional page markers.
Filtered service-emitted internal telemetry warnings whose message starts with LLMStats: from the rendered rai_warnings front matter.
Preserved LLMStats: text when it appears in the document markdown body; only structured warnings are filtered.
Updated unit tests and sample tests for the new marker format and warning-filter behavior.
Updated CHANGELOG.md.
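
The duplicate-marker defense described above can be sketched roughly as follows. This is an illustration only, not the code in LlmInputHelper.java: the class name `PageMarkerSketch`, the method `renderPages`, the per-page list input, and the choice to place one marker before each page are all assumptions made for the example. Only the marker text and the `<!-- InputPageNumber:` substring check come from this PR.

```java
import java.util.List;

// Hypothetical sketch; names and page-splitting strategy are assumptions, not the SDK's API.
final class PageMarkerSketch {
    // Substring the PR description says is checked before injecting markers.
    static final String MARKER_PREFIX = "<!-- InputPageNumber:";

    /**
     * Joins per-page markdown into one document. A marker is injected before
     * each page unless the markdown already contains a service-emitted marker.
     */
    static String renderPages(List<String> pages) {
        String joined = String.join("\n\n", pages);
        if (joined.contains(MARKER_PREFIX)) {
            // Markers are already present, so return the markdown unchanged.
            return joined;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < pages.size(); i++) {
            if (i > 0) {
                out.append("\n\n");
            }
            // Page numbers are 1-based in this sketch.
            out.append("<!-- InputPageNumber: ").append(i + 1).append(" -->\n");
            out.append(pages.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderPages(List.of("First page.", "Second page.")));
    }
}
```

Checking for the prefix rather than a full marker keeps the guard independent of the page number, so a single service-emitted marker anywhere in the markdown is enough to disable injection.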

Relevant issues / context:

Design proposal: https://github.com/cognitive-services/ContentUnderstanding-Docs/issues/249
Agent Framework feedback that prompted the LLMStats: filtering: "Python: Adopt azure-ai-contentunderstanding to_llm_input in CU context provider" (microsoft/agent-framework#5796)

Companion PRs (sibling SDKs):

This PR is not based on regenerated SDK code from a new swagger / TypeSpec spec.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
- No public API signatures are changed. This only changes rendered text produced by the preview LlmInputHelper.toLlmInput() helper.
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message.

Testing Guidelines

Pull request includes test coverage for the included changes (31 unit tests pass).

Copilot

Pull request overview

This PR updates azure-ai-contentunderstanding’s preview LlmInputHelper.toLlmInput() output to match an upcoming service page-marker format and to suppress service-emitted internal telemetry warnings (LLMStats:) from the LLM-facing rai_warnings front matter.

Changes:

Switched SDK-injected page markers to the new `<!-- InputPageNumber: N -->` format.
Added a duplicate-marker guard: if the markdown already contains <!-- InputPageNumber:, the helper skips injecting markers.
Filtered warnings whose message begins with LLMStats: (after leading whitespace) so they are not rendered into rai_warnings.
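
A minimal sketch of the warning filter, assuming warnings are available as plain message strings. The class `WarningFilterSketch` and its methods are hypothetical names for illustration; the real helper works on the SDK's own warning model, which is not shown in this PR page. The sketch uses `String.stripLeading()`, which requires Java 11 or later.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; the real implementation in LlmInputHelper.java may differ.
final class WarningFilterSketch {
    // Prefix of the internal telemetry warnings named in the PR.
    static final String TELEMETRY_PREFIX = "LLMStats:";

    /** Returns true when the message starts with the prefix, ignoring leading whitespace. */
    static boolean isTelemetry(String message) {
        return message != null && message.stripLeading().startsWith(TELEMETRY_PREFIX);
    }

    /** Keeps only the warnings that should be rendered into rai_warnings. */
    static List<String> userFacingWarnings(List<String> messages) {
        return messages.stream()
            .filter(m -> !isTelemetry(m))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> warnings = List.of(
            "  LLMStats: tokens=12",     // dropped: telemetry after leading whitespace
            "Content was filtered",      // kept: user-facing warning
            "See LLMStats: later");      // kept: prefix is not at the start
        System.out.println(userFacingWarnings(warnings));
    }
}
```

Because the check is a prefix match on structured warning messages only, `LLMStats:` text elsewhere, such as in the document markdown body, is left untouched.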

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file

| File | Description |
| --- | --- |
| sdk/contentunderstanding/azure-ai-contentunderstanding/src/main/java/com/azure/ai/contentunderstanding/LlmInputHelper.java | Implements the new `InputPageNumber` marker format, skips marker injection when markers already exist, and filters `LLMStats:` warnings from `rai_warnings`. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/src/test/java/com/azure/ai/contentunderstanding/tests/LlmInputHelperTest.java | Updates existing assertions for the new marker format and adds unit tests covering `LLMStats:` filtering and duplicate-marker avoidance. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/src/test/java/com/azure/ai/contentunderstanding/tests/samples/Sample_Advanced_ToLlmInputTest.java | Updates sample test assertions and messages to match `<!-- InputPageNumber: N -->`. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/src/test/java/com/azure/ai/contentunderstanding/tests/samples/Sample_Advanced_ToLlmInputAsyncTest.java | Updates async sample test assertions and messages to match `<!-- InputPageNumber: N -->`. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/src/samples/java/com/azure/ai/contentunderstanding/samples/Sample_Advanced_ToLlmInput.java | Updates sample comments/documentation to reference the new marker format. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/README.md | Updates the rendered example marker and aligns the dependency snippet version with the module’s current version. |
| sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md | Documents the marker format update and the `LLMStats:` warning filtering under `1.1.0-beta.2 (Unreleased)`. |

Update LlmInputHelper page markers and filter LLMStats telemetry

288798b

github-actions Bot added the Cognitive - Content Understanding label Jun 5, 2026

chienyuanchang mentioned this pull request Jun 5, 2026

[Content Understanding] Update toLlmInput page markers and filter LLMStats telemetry Azure/azure-sdk-for-js#38851

Open

3 tasks

Bump README dependency snippet to 1.1.0-beta.2

d4f5126

chienyuanchang marked this pull request as ready for review June 8, 2026 22:22

Copilot AI review requested due to automatic review settings June 8, 2026 22:22

chienyuanchang requested review from a team, bojunehsu, changjian-wang and yungshinlintw as code owners June 8, 2026 22:22

Merge branch 'main' into cu-sdk/llm-input-helper-update

8bcca2a

Copilot started reviewing on behalf of chienyuanchang June 8, 2026 22:22

Copilot AI reviewed Jun 8, 2026

Merge branch 'main' into cu-sdk/llm-input-helper-update

73c78d9

chienyuanchang wants to merge 4 commits into main from cu-sdk/llm-input-helper-update


2 participants
