
feat(agents): add Task Challenger adversarial questioning agent#1315

Open
rezatnoMsirhC wants to merge 7 commits into main from feat/1212-adversarial-task-challenger-agent

Conversation

@rezatnoMsirhC
Contributor

@rezatnoMsirhC rezatnoMsirhC commented Apr 7, 2026

feat(agents): add Task Challenger adversarial questioning agent

Description

Added Task Challenger — an adversarial questioning agent that reads .copilot-tracking/ artifacts cold and interrogates every decision, boundary, and assumption through structured What/Why/How questions. The agent does not validate, suggest, coach, or guide; it asks.

The agent operates in four phases. In Phase 1 (Scope), it discovers what to challenge through a five-level cascade: existing .copilot-tracking/ artifacts, pr-reference.xml, git branch history, domain-based workspace search, and finally direct user input. Terminal access is limited to this phase only. Phase 2 (Read Artifacts) silently reads plans, changes, research, and reviews from .copilot-tracking/. Phase 3 (Identify Challenge Areas) silently selects the 5–7 areas with the highest density of unexamined assumptions — this list is never disclosed to the user. Phase 4 (Challenge) issues exactly one question per response using the structure [What/Why/How] + [noun subject] + [verb] + [open object]?, probes each answer up to twice before marking a point unresolved, and handles skip signals ("Go next", "Skip", etc.) without acknowledgment.
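The one-question contract in Phase 4 is mechanical enough to sketch as a check. The following is a hypothetical Python sketch only; the regex and the skip-signal list are illustrative, not the agent's actual wording:

```python
import re

# Hypothetical shape check for a Phase 4 response:
# [What/Why/How] + [noun subject] + [verb] + [open object]?
QUESTION_SHAPE = re.compile(r"^(What|Why|How)\b.+\?$")

# Illustrative skip signals; the agent spec names "Go next" and "Skip".
SKIP_SIGNALS = {"go next", "skip", "next"}

def is_valid_challenge(response: str) -> bool:
    """Valid only if the response is exactly one question in the
    What/Why/How form, with no preamble or suggestion around it."""
    text = response.strip()
    return bool(QUESTION_SHAPE.match(text)) and text.count("?") == 1

def is_skip(user_reply: str) -> bool:
    """Skip signals advance the session without acknowledgment."""
    return user_reply.strip().lower().rstrip(".") in SKIP_SIGNALS
```

A response such as "Great point! Why did you choose this?" would fail the check, since any preamble before the question word breaks the required shape.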

A Challenge Tracking Document is created at Phase 4 entry under .copilot-tracking/challenges/{{YYYY-MM-DD}}/{{topic}}-challenge.md. It captures metadata, confirmed scope, identified challenge areas, a Q&A log with verbatim answers, probe exchanges, and an Unresolved Items table.
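The path convention can be illustrated with a small helper. This is a hypothetical sketch: the function name and slug rule are assumptions, only the directory layout comes from the agent spec.

```python
import re
from datetime import date
from pathlib import Path

def challenge_doc_path(topic: str, root: str = ".copilot-tracking") -> Path:
    """Sketch of the tracking-document convention:
    .copilot-tracking/challenges/{YYYY-MM-DD}/{topic}-challenge.md"""
    # Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return Path(root) / "challenges" / date.today().isoformat() / f"{slug}-challenge.md"
```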

Task Researcher was updated to check .copilot-tracking/challenges/ at the start of Phase 1 and, when a challenge document is present, treat its Q&A log and unresolved items as the primary research scope — establishing a formal data contract from challenger outputs to researcher inputs. Task Reviewer received a new 🥊 Challenge handoff that routes to Task Challenger via /task-challenge. The companion task-challenge.prompt.md provides four optional inputs (plan, changes, research, focus) that pre-position scope at invocation; the agent falls back to artifact discovery when none are supplied.

Both hve-core and hve-core-all collections and plugins were updated in lockstep, and .github/copilot-instructions.md was updated to register .copilot-tracking/challenges/ as a known tracking directory.

Related Issue(s)

Related to #1212

Type of Change

Select all that apply:

Code & Documentation:

  • Bug fix (non-breaking change fixing an issue)
  • New feature (non-breaking change adding functionality)
  • Breaking change (fix or feature causing existing functionality to change)
  • Documentation update

Infrastructure & Configuration:

  • GitHub Actions workflow
  • Linting configuration (markdown, PowerShell, etc.)
  • Security configuration
  • DevContainer configuration
  • Dependency update

AI Artifacts:

  • Reviewed contribution with prompt-builder agent and addressed all feedback
  • Copilot instructions (.github/instructions/*.instructions.md)
  • Copilot prompt (.github/prompts/*.prompt.md)
  • Copilot agent (.github/agents/*.agent.md)
  • Copilot skill (.github/skills/*/SKILL.md)

Note for AI Artifact Contributors:

  • Agents: Research, indexing/referencing other project (using standard VS Code GitHub Copilot/MCP tools), planning, and general implementation agents likely already exist. Review .github/agents/ before creating new ones.
  • Skills: Must include both bash and PowerShell scripts. See Skills.
  • Model Versions: Only contributions targeting the latest Anthropic and OpenAI models will be accepted. Older model versions (e.g., GPT-3.5, Claude 3) will be rejected.
  • See Agents Not Accepted and Model Version Requirements.

Other:

  • Script/automation (.ps1, .sh, .py)
  • Other (please describe):

Sample Prompts (for AI Artifact Contributions)

User Request:

"Challenge this implementation" with the Task Challenger agent selected, or via /task-challenge from the Task Reviewer's 🥊 Challenge handoff. Optional arguments: /task-challenge plan=.copilot-tracking/plans/my-plan.md focus=authentication.

Execution Flow:

  1. Phase 1 (Scope): Discovers what to challenge through a five-level cascade — .copilot-tracking/ tracking artifacts, pr-reference.xml, git log/diff/status (via git branch --show-current, git log <parent>..HEAD --oneline, git diff --stat), domain-based repo search, or direct user prompt. Presents a factual scope summary and waits for explicit user confirmation before proceeding. Terminal commands are run only during this phase.
  2. Phase 2 (Read Artifacts): Silently reads .copilot-tracking/plans/, changes/, research/, and reviews/.
  3. Phase 3 (Identify Challenge Areas): Silently identifies 5–7 assumption-dense areas. List is never disclosed to the user.
  4. Phase 4 (Challenge): Creates .copilot-tracking/challenges/{{YYYY-MM-DD}}/{{topic}}-challenge.md. Issues one [What/Why/How] + […]? question per response. Probes each answer up to twice; marks unresolved after two probes with no new depth. Advances silently on skip signals. Updates the tracking document throughout the session.
  5. On completion, the Compact handoff summarizes state (including complete Q&A and unresolved items) and defaults the next step to Task Researcher.
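The scope cascade in step 1 can be sketched roughly as follows. This is a hypothetical Python outline, not the agent's implementation; `main` stands in for the `<parent>` branch, and the interactive levels are stubbed out:

```python
import subprocess
from pathlib import Path

def _run(cmd: list[str], cwd: Path) -> str:
    # Terminal access is confined to Phase 1 (Scope) in the agent.
    try:
        return subprocess.run(cmd, cwd=cwd, capture_output=True,
                              text=True).stdout.strip()
    except OSError:
        return ""

def discover_scope(workspace: Path) -> tuple[int, str]:
    """Illustrative walk of the five-level cascade."""
    tracking = workspace / ".copilot-tracking"
    if tracking.is_dir() and any(tracking.iterdir()):      # Level 1
        return 1, "existing .copilot-tracking/ artifacts"
    if (workspace / "pr-reference.xml").is_file():         # Level 2
        return 2, "pr-reference.xml"
    branch = _run(["git", "branch", "--show-current"], workspace)
    if branch and branch != "main":                        # Level 3
        if _run(["git", "log", "main..HEAD", "--oneline"], workspace):
            return 3, f"git history on {branch}"
    # Level 4 (domain-based workspace search) and Level 5 (direct
    # user input) require interaction and are omitted from this sketch.
    return 5, "ask the user directly"
```

Each level is tried only when every earlier one comes up empty, which is why the agent can present a single factual scope summary before asking for confirmation.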

Output Artifacts:

<!-- .copilot-tracking/challenges/2026-04-07/task-challenger-agent-challenge.md -->
<!-- markdownlint-disable-file -->
# Challenge Session: task-challenger-agent

**Date**: 2026-04-07
**Scope source**: Level 1 — .copilot-tracking/ artifacts
**Related artifacts**: .copilot-tracking/plans/..., .copilot-tracking/changes/...

## Confirmed Scope
...

## Q&A Log
### Area: scope boundaries
**Q**: What does the five-level cascade exclude?
**A**: [verbatim answer]
  **Probe**: How is an empty artifacts folder distinguished from a missing one?
  **A**: [verbatim answer]

## Unresolved Items

| # | Area             | Last Question Asked                                        | Reason                        |
|---|------------------|------------------------------------------------------------|-------------------------------|
| 1 | scope boundaries | How is an empty folder distinguished from a missing one?   | No new depth after two probes |

Success Indicators:

  • Agent presents a factual scope summary and waits for explicit confirmation before entering Phase 4.
  • Each Phase 4 response contains exactly one question with no preamble, no affirmation, and no suggestion.
  • The challenge tracking document is created at Phase 4 entry and updated after each Q&A exchange.
  • Switching to Task Researcher after the session shows the challenge document Q&A as the primary research scope.

For detailed contribution requirements, see:

Testing

All automated checks were run during PR generation.

| Check | Command | Result |
|---|---|---|
| Markdown linting | npm run lint:md | ✅ Passed — 0 errors across 196 files |
| Spell checking | npm run spell-check | ✅ Passed — 0 issues across 298 files |
| Frontmatter validation | npm run lint:frontmatter | ✅ Passed — 0 errors, 0 warnings across 490 files |
| Skill structure validation | npm run validate:skills | ✅ Passed — 14 skills, 0 errors |
| Link validation | npm run lint:md-links | ❌ Pre-existing SECURITY.md failure on main — not introduced by this PR |
| PowerShell analysis | npm run lint:ps | ✅ Passed — all files clean |
| Plugin freshness | npm run plugin:generate | ✅ Passed — two table-formatting fixups applied (see Additional Notes) |
| Docusaurus tests | npm run docs:test | ⏭️ Skipped — jest not installed; pre-existing environment limitation |

Security analysis:

  • Terminal access is explicitly contained to Phase 1 only; the agent declares execute/runInTerminal and execute/getTerminalOutput in tools but the agent body restricts their use to the Scope phase.
  • No write or edit tools declared — the agent cannot modify files.
  • Challenge tracking documents write to .copilot-tracking/challenges/, which is gitignored per existing repo convention.
  • No new npm, pip, or external package dependencies.
  • No credentials, tokens, or sensitive data introduced.

Sample Run

Scope Confirmation Phase

(screenshot omitted)

Conversational Q&A Phase

(screenshots omitted)

Checklist

Required Checks

  • Documentation is updated (if applicable)
  • Files follow existing naming conventions
  • Changes are backwards compatible (if applicable)
  • Tests added for new functionality (if applicable) (N/A — no test infrastructure for agent/prompt files)

AI Artifact Contributions

  • Used /prompt-analyze to review contribution
  • Addressed all feedback from prompt-builder review
  • Verified contribution follows common standards and type-specific requirements

Required Automated Checks

The following validation commands must pass before merging:

  • Markdown linting: npm run lint:md
  • Spell checking: npm run spell-check
  • Frontmatter validation: npm run lint:frontmatter
  • Skill structure validation: npm run validate:skills
  • Link validation: npm run lint:md-links (pre-existing SECURITY.md failure on main)
  • PowerShell analysis: npm run lint:ps
  • Plugin freshness: npm run plugin:generate
  • Docusaurus tests: npm run docs:test (jest not installed; pre-existing environment limitation)

Security Considerations

  • This PR does not contain any sensitive or NDA information
  • Any new dependencies have been reviewed for security issues (N/A — no new dependencies)
  • Security-related scripts follow the principle of least privilege (N/A — no security scripts modified)

Additional Notes

  • plugin:generate applied two minor fixups during validation: markdown table column padding in task-challenger.agent.md and a missing task-challenge row in plugins/hve-core-all/README.md. Both are present as unstaged local modifications and should be committed before merging.
  • docs/agents/README.md names five RPI agents (task-researcher, task-planner, task-implementor, task-reviewer, and the RPI orchestrator) but does not yet mention Task Challenger. Consider a follow-up to update that reference.
  • The lint:md-links failure on SECURITY.md is pre-existing on main and unrelated to this PR.

- add task-challenger.agent.md with What/Why/How interrogation protocol
- add task-challenge.prompt.md with optional artifact inputs
- add 🥊 Challenge handoff to task-reviewer.agent.md
- register agent and prompt in hve-core and hve-core-all collections
- regenerate plugin outputs

🥊 - Generated by Copilot
…t/1212-adversarial-task-challenger-agent
… task-challenger

- add Phase 1: Scope with artifact discovery, git fallback, and user confirmation
- scope Prohibited Behaviors and Response Format to Challenge Phase only
- add execute/runInTerminal and execute/getTerminalOutput to tools frontmatter
- renumber Read → Phase 2, Identify → Phase 3, Challenge → Phase 4
- add 'Go next' skip signal handling to Phase 4 Protocol

✨ - Generated by Copilot
- add five-level ordered scope fallback with verified git commands
- auto-create challenge tracking document at Phase 4 entry
- add Challenge Tracking Document Schema section to Phase 4
- weight Compact handoff toward Task Researcher as default
- add challenges/ to .copilot-tracking listing in copilot-instructions.md

⚡ - Generated by Copilot
…nger handoffs

- add challenges/ artifact check to Task Researcher Phase 1 Step 1
- update Task Challenger handoff prompts to reference challenge document path

🔗 - Generated by Copilot
…t/1212-adversarial-task-challenger-agent
@codecov-commenter

codecov-commenter commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.62%. Comparing base (a1928f3) to head (2e5b2bb).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1315      +/-   ##
==========================================
- Coverage   87.63%   87.62%   -0.02%     
==========================================
  Files          61       61              
  Lines        9328     9328              
==========================================
- Hits         8175     8174       -1     
- Misses       1153     1154       +1     
| Flag | Coverage Δ |
|---|---|
| pester | 85.18% <ø> (-0.02%) ⬇️ |

Flags with carried forward coverage won't be shown.
see 1 file with indirect coverage changes


@rezatnoMsirhC rezatnoMsirhC marked this pull request as ready for review April 7, 2026 22:33
@rezatnoMsirhC rezatnoMsirhC requested a review from a team as a code owner April 7, 2026 22:33

@github-actions github-actions bot left a comment


PR Review: feat(agents): add Task Challenger adversarial questioning agent

This is a well-conceived and thoroughly described feature. The adversarial questioning model, four-phase protocol, and data contract from challenger outputs to researcher inputs are all clearly specified. The automated checks pass. Two issues need to be resolved before merging.


Issue Alignment

The PR description uses Related to #1212 rather than Fixes #1212 or Closes #1212. The PR template instructs contributors to use Fixes # or Closes # syntax. If this PR fully delivers the feature described in the linked issue, consider updating to Closes #1212 so the issue is automatically closed on merge. If the issue is intentionally left open (e.g., it tracks ongoing work), a brief note explaining why would clarify intent.


PR Template Compliance

⚠️ AI Artifact Contributions checklist — all three items unchecked

The PR adds AI artifact files (a new agent and a new prompt), which requires completing the AI Artifact Contributions checklist:

* [ ] Used `/prompt-analyze` to review contribution
* [ ] Addressed all feedback from `prompt-builder` review
* [ ] Verified contribution follows common standards and type-specific requirements

All three remain unchecked. These checkboxes represent a required quality gate for AI artifact contributions. Please complete the /prompt-analyze review, address any feedback, and check these items before requesting re-review.

ℹ️ Documentation checkbox

The Documentation is updated (if applicable) checkbox is unchecked. The Additional Notes section acknowledges that docs/agents/README.md currently omits the Task Challenger. While deferring the docs update to a follow-up is a reasonable call, the checkbox should carry an inline (N/A — deferred to follow-up issue) annotation to make that intent explicit, per the checklist conventions used elsewhere in the template.


Coding Standards

disable-model-invocation: true missing from task-challenger.agent.md frontmatter (inline comment on line 4)

The agent declares execute/runInTerminal and execute/getTerminalOutput in tools:, writes files to .copilot-tracking/challenges/, and runs git commands during Phase 1. This makes it a side-effecting agent. task-reviewer.agent.md — a direct peer — sets disable-model-invocation: true for the same reason. The prompt-builder instructions require this field for side-effecting and explicitly-invoked agents. Add disable-model-invocation: true to the frontmatter.


Code Quality

💡 Duplicate "Response Format" section (inline comment on line ~214)

The file defines the Phase 4 response format in two places: a nested #### Response Format inside ### Phase 4: Challenge, and a top-level ## Response Format at the end of the file. Both sections state the same one-question rule with the same examples. Duplicate instructions create a maintenance hazard — a future edit to one copy may silently diverge from the other. Consolidate into the nested section, which is already the more contextually natural location.


Action Items

  1. Required — Check all three AI Artifact Contributions checklist items after completing the /prompt-analyze review.
  2. Required — Add disable-model-invocation: true to the task-challenger.agent.md frontmatter.
  3. Suggested — Remove the duplicate top-level ## Response Format section; retain only the nested #### Response Format under Phase 4.
  4. Minor — Annotate the Documentation checkbox with (N/A — deferred to follow-up) to make the deferral explicit.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the required GitHub integrity level.

  • #1212 issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review for issue #1315

---
name: Task Challenger
description: 'Adversarial questioning agent that interrogates implementations with What/Why/How questions — no suggestions, no hints, no leading - Brought to you by microsoft/hve-core'
tools: [read, search, execute/runInTerminal, execute/getTerminalOutput]

Missing disable-model-invocation: true in frontmatter

This agent declares execute/runInTerminal and execute/getTerminalOutput in its tools: list, making it a side-effecting agent. Per the prompt-builder instructions:

Use disable-model-invocation: true for agents that run subagents, agents that cause side effects (git operations, backlog management, deployments), or agents that should only run when explicitly requested.

task-reviewer.agent.md — a peer agent in the same collection — already sets disable-model-invocation: true and has a comparable profile (side effects, subagent orchestration). Task Challenger writes to .copilot-tracking/challenges/ and runs git commands; it should follow the same convention.

Suggested fix:

---
name: Task Challenger
description: 'Adversarial questioning agent...'
disable-model-invocation: true
tools: [read, search, execute/runInTerminal, execute/getTerminalOutput]


### {{Area Label}}

**Question**: {{question text}}

Duplicate "Response Format" section

The agent defines the Challenge Phase response format in two places:

  1. #### Response Format nested under ### Phase 4: Challenge (earlier in the file)
  2. This top-level ## Response Format section — which adds a clarifying note (> This section applies during the Challenge Phase (Phase 4) only.) but otherwise restates the same requirement verbatim

Having two sections describing identical behavior creates maintenance risk: a future edit to one may miss the other, causing the agent to receive contradictory instructions. The nested #### Response Format already lives where it is most contextually relevant — inside Phase 4.

Suggested resolution: Remove this top-level ## Response Format section and, if the "applies to Phase 4 only" clarification is important, add that note to the nested #### Response Format heading inside Phase 4 instead.
