Feat/api version default routing#30

Open
pelenz wants to merge 4 commits into Azure-Samples:main from pelenz:feat/api-version-default-routing

Conversation

pelenz commented May 30, 2026

No description provided.

Petr Lenz and others added 3 commits May 17, 2026 15:45
- Model: gpt-35-turbo/0125 (retired 2025-11-14) → gpt-4.1-mini/2025-04-14
- Region: AOAI eastus2 + APIM koreacentral → both westeurope
- AOAI API version: 2024-02-01 → 2024-10-21
- AOAI deployment capacity: 2 → 50 TPM (was tripping 429s with default)
- README: prepend "Fork notes" section with recon deployment runbook,
  azd env get-value extraction, recon invocation, LAW KQL cross-check,
  teardown. Upstream sample content preserved below the divider.

This fork now serves as the second recon test target — CS integration
pattern (a), AOAI built-in RAI filter only, no APIM llm-content-safety
policy, no middleware. Complements ailz-dev (pattern d) for cross-pattern
diff testing in recon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The upstream OpenAPI declares every operation with `?api-version={api-version}`,
which makes the query param part of APIM operation matching — requests without
it 404 before any policy runs. To fix:

- Add an all-APIs (service-level) policy that sets `api-version=2024-10-21`
  when the caller doesn't pass one. It runs after route matching, so it only
  defaults the value forwarded to the AOAI backend.
- Override URL templates for the seven `/deployments/{deployment-id}/...`
  passthrough operations (chat, completions, embeddings, image gen, speech,
  transcriptions, translations) to drop the `?api-version` query template,
  so route matching no longer requires it from callers.
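
The defaulting rule in the first bullet can be modeled outside APIM. The sketch below is not the policy itself (the policy XML is not shown here); it only illustrates the intended behavior, assuming that a caller-supplied `api-version` is always left untouched and that an empty value still counts as supplied. The function name and the example host are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_API_VERSION = "2024-10-21"  # value named in the commit message


def apply_default_api_version(url: str, default: str = DEFAULT_API_VERSION) -> str:
    """Return the URL with api-version appended only when the caller omitted it."""
    parts = urlsplit(url)
    # keep_blank_values=True: "?api-version=" is treated as present (assumption).
    query = parse_qsl(parts.query, keep_blank_values=True)
    if not any(name == "api-version" for name, _ in query):
        query.append(("api-version", default))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical gateway host and deployment name, for illustration only.
    base = "https://example.invalid/deployments/my-deployment/chat/completions"
    print(apply_default_api_version(base))
    print(apply_default_api_version(base + "?api-version=2024-02-01"))
```

A request without the parameter gains the default, while one that pins an older version passes through unchanged.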

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5000 TPM per subscription was tripping during recon test bursts. Bump
to 20000 so the rate-limit policy doesn't fire during normal probing.
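
As a rough sanity check on the new limit, the arithmetic below estimates how many requests fit in one minute's token budget. The per-request token count is a hypothetical figure, not a measurement from recon, and how the rate-limit policy actually counts tokens is not shown here.

```python
def requests_per_minute(limit_tpm: int, tokens_per_request: int) -> int:
    """Whole requests that fit inside a one-minute token budget."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return limit_tpm // tokens_per_request


if __name__ == "__main__":
    TOKENS_PER_PROBE = 800  # hypothetical average prompt + completion size
    for limit in (5000, 20000):
        print(limit, requests_per_minute(limit, TOKENS_PER_PROBE))
```

Under that assumption the old limit allows 6 probes per minute and the new one allows 25.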

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pelenz commented May 30, 2026

Author

Update for dev purpose


pelenz commented May 30, 2026

Author

@microsoft-github-policy-service agree

…get docs

Infra:
- New `deployTwoBackends` param in `infra/main.bicep` (default true) — gates
  the second AOAI account, its role assignment, and the second backend in
  the APIM load-balancer pool. Lets the target run as a single-backend
  recon test target without editing Bicep.
- New `apim-to-log-analytics` diagnostic setting in
  `infra/core/gateway/apim.bicep`, scoped to the APIM service and routed to
  the existing Log Analytics workspace in resource-specific (`Dedicated`)
  mode. Emits `GatewayLogs`, `WebSocketConnectionLogs`, and `AllMetrics`.
  Complements the existing App Insights logger pipeline; the two land in
  different tables and don't overlap.

Docs:
- `CLAUDE.md` — fork-authoritative guidance: deployment shape (pattern a,
  westeurope, gpt-4.1-mini, StandardV2), commands, soft-delete recovery,
  request flow, key files, gotchas.
- `costs_analysis.md` — Azure Monitor / KQL recipes for traffic & cost
  attribution, gpt-4.1-mini pricing reference.
- `apim_diagnostic.md` — the two APIM telemetry pipelines, what each writes
  to (`AppRequests` vs. `AzureDiagnostics`/`ApiManagementGatewayLogs`),
  Bicep patch description, when pipeline 2 is worth enabling.
- `arm_what_if_limitations.md` — what `azd provision --preview` misses
  (extension resources, APIM sub-resources, policy XML), the persistent
  noise it reports, and a repo-specific "trust this / don't trust that"
  guide with `az` verification commands.
- README — adds matching fork notes pointer.

Hygiene:
- `.gitignore` — add `.DS_Store` and `.claude/`.
- Add sample-app `package.json` + `package-lock.json` (already referenced
  by `src/` scripts; previously untracked).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>