
feat: add Cosmos DB NoSQL create-index samples (Python, Go) and shared infra #88

Open
diberry wants to merge 4 commits into Azure-Samples:main from diberry:feature/nosql-create-index-wave1

Conversation

diberry commented Jun 4, 2026 (Collaborator)

Overview

This PR adds the shared infrastructure and first two language ports for the Cosmos DB for NoSQL create-index scenario.

Highlights:

  • reusable Bicep module for the scenario containers (DiskANN + QuantizedFlat)
  • conditional HotelsCreateIndex database wiring in infra
  • new Python sample for the data-plane create-index flow
  • new Go sample for the data-plane create-index flow
  • .gitignore update for generated *.exe artifacts

See Design concept.

New directories/files

New directories

  • nosql-create-index-python/
  • nosql-create-index-go/

New shared infra file

  • infra/cosmos-db/nosql/vector-containers.bicep

Updated infra/support files

  • infra/database.bicep
  • infra/main.bicep
  • .gitignore

Review status

  • Passed 8-agent review: 13/16 PASS

Rollout context

This is Wave 1 of a 4-language create-index effort.

  • Included now: Python, Go
  • Coming next: .NET, Java

Design concept

This implementation follows the two key design constraints for the create-index rollout:

  • Two-database isolation: keep the existing vector-search samples stable while provisioning a separate HotelsCreateIndex database for the new immutable container shape.
  • Infra-first binding: provision the required containers in shared infra first, then bind each language sample to those pre-created resources.
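
The infra-first binding constraint above could look roughly like the following on the Python side. This is a minimal sketch, assuming the azure-cosmos and azure-identity packages; `make_client` and `bind_container` are hypothetical helper names, not taken from the PR, and the container name is whatever the Bicep module provisions.

```python
"""Bind to pre-provisioned Cosmos DB resources without creating anything."""


def make_client(endpoint: str):
    """Build a CosmosClient that authenticates with Microsoft Entra ID.

    Imports are lazy so bind_container below can be exercised without
    the Azure SDK packages installed.
    """
    from azure.cosmos import CosmosClient
    from azure.identity import DefaultAzureCredential

    return CosmosClient(endpoint, credential=DefaultAzureCredential())


def bind_container(client, database_name: str, container_name: str):
    """Return a container proxy for resources that shared infra already created.

    Only get_*_client lookups are used. There is deliberately no
    create_database_if_not_exists or create_container_if_not_exists call,
    so a missing resource surfaces as an error on first use instead of
    being created with the wrong shape.
    """
    database = client.get_database_client(database_name)
    return database.get_container_client(container_name)
```

Keeping creation out of the sample means the immutable vector policy is defined in exactly one place (the Bicep module), and each language port only needs data-plane permissions.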

- Add reusable vector-containers.bicep module (DiskANN + QuantizedFlat)
- Add conditional HotelsCreateIndex database in infra
- Add Python sample (nosql-create-index-python/)
- Add Go sample (nosql-create-index-go/)
- Add *.exe to .gitignore

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
diberry commented Jun 8, 2026 (Collaborator, Author)

Review Score: 15/16 [PASS]

| Specialist | Dimension | Score | Notes |
| --- | --- | --- | --- |
| Sam | Correctness | 2 | SDK usage correct: DefaultAzureCredential, VectorDistance(c.${embeddingField}, @Embedding) with proper field validation regex. Python uses execute_item_batch (transactional), Go uses bounded concurrent CreateItem. Env var names match SDK expectations. |
| Scooter | Completeness | 2 | All required files present: Python (config.py, data_plane.py, index.py, tests, .env.example, README, requirements.txt). Go (main.go, config.go, dataplane.go, tests, .env.example, README, go.mod). Both complete. |
| Beauregard | Conventions | 2 | Commit message well-formatted. File layout matches TypeScript sample pattern (src/, tests/, output/). Branch naming and structure consistent with repo conventions. |
| Statler | Coverage | 2 | Excellent error handling: Missing env vars caught early with clear messages. Invalid algorithm names rejected. Data file path validation. Embedding dimension mismatch detected. Container errors wrapped with helpful guidance. Python tests cover all critical paths. |
| Architect | Coherence | 1 | Bicep module design is solid (reusable vector-containers.bicep), but database.bicep has inconsistency: HotelId vs PartitionKey between vectorSearchContainers (line 49) and createIndexContainers (lines 73-74). Minor non-breaking issue. |
| Razor | Conciseness | 2 | No dead code. Python/Go imports are minimal and appropriate. No unused variables. Code is lean. Helper functions well-purposed. No unnecessary duplication. |
| Mirror | Consistency | 2 | Env vars match: AZURE_COSMOSDB_ENDPOINT, AZURE_COSMOSDB_DATABASENAME, AZURE_OPENAI_EMBEDDING_ENDPOINT, VECTOR_ALGORITHM, DATA_FILE_WITH_VECTORS. Container names identical. Output format consistent. |
| Lens | Clarity | 2 | READMEs excellent with clear prerequisites, setup, expected output. Code well-structured with clear helpers. Error messages actionable. Field validation explained. 'Azure OpenAI client' terminology correct. |
| Total | | 15/16 | PASS (>=13). Minor Bicep partition key inconsistency noted. |
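
The early environment checks noted under Coverage and Consistency could be sketched as follows. The variable names come from the table above; the `Config` dataclass, the `load_config` helper, and the accepted lowercase spellings of the algorithm names are assumptions for illustration, not the PR's actual config.py.

```python
import os
from dataclasses import dataclass

# Environment variable names as listed in the review table.
REQUIRED_VARS = (
    "AZURE_COSMOSDB_ENDPOINT",
    "AZURE_COSMOSDB_DATABASENAME",
    "AZURE_OPENAI_EMBEDDING_ENDPOINT",
    "VECTOR_ALGORITHM",
    "DATA_FILE_WITH_VECTORS",
)

# Hypothetical mapping from case-insensitive input to canonical index type.
SUPPORTED_ALGORITHMS = {"diskann": "DiskANN", "quantizedflat": "QuantizedFlat"}


@dataclass(frozen=True)
class Config:
    endpoint: str
    database_name: str
    embedding_endpoint: str
    algorithm: str
    data_file: str


def load_config(env=None) -> Config:
    """Validate all settings up front so failures happen before any network call."""
    env = os.environ if env is None else env

    # Report every missing variable at once rather than one per run.
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))

    key = env["VECTOR_ALGORITHM"].strip().lower()
    if key not in SUPPORTED_ALGORITHMS:
        allowed = ", ".join(sorted(SUPPORTED_ALGORITHMS.values()))
        raise ValueError(f"Unsupported VECTOR_ALGORITHM {key!r}; expected one of: {allowed}")

    return Config(
        endpoint=env["AZURE_COSMOSDB_ENDPOINT"].strip(),
        database_name=env["AZURE_COSMOSDB_DATABASENAME"].strip(),
        embedding_endpoint=env["AZURE_OPENAI_EMBEDDING_ENDPOINT"].strip(),
        algorithm=SUPPORTED_ALGORITHMS[key],
        data_file=env["DATA_FILE_WITH_VECTORS"].strip(),
    )
```

Accepting an explicit `env` mapping keeps the loader easy to unit test without mutating `os.environ`.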

Strengths

  • Data-plane only design (no createIfNotExists, assumes resources exist)
  • Field name injection + regex validation (prevents SQL injection)
  • Bulk patterns: Python execute_item_batch, Go bounded concurrency
  • Helpful error messages guide users to provisioning/RBAC/Entra setup
  • Shared data file correctly referenced
  • DefaultAzureCredential integration clean
  • Terminology: 'Azure OpenAI client', 'vector quantization techniques' correct
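
The field-name validation strength listed above matters because a property path cannot be passed as a query parameter, so it has to be interpolated into the query text. A minimal sketch of the idea follows; the regex, the `vector_distance_expr` helper, and the `@embedding` parameter name are assumptions, since the PR's actual pattern is not shown in this conversation.

```python
import re

# Hypothetical allow-list: a plain identifier, no dots, brackets, or quotes.
_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def vector_distance_expr(embedding_field: str) -> str:
    """Return a VectorDistance expression for a validated field name.

    The field name is the only value interpolated into the query text,
    so it must match the allow-list in full. The embedding itself stays
    a query parameter.
    """
    if not _FIELD_NAME.fullmatch(embedding_field):
        raise ValueError(f"Invalid embedding field name: {embedding_field!r}")
    return f"VectorDistance(c.{embedding_field}, @embedding)"
```

Using `fullmatch` rather than `match` or `search` is the important detail: a prefix match would accept a valid name followed by injected text.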

Minor Issue

Bicep partition key inconsistency: Line 49 uses /HotelId but line 73 uses /PartitionKey. Should standardize.

Recommendation

APPROVE — Ready to merge. Operationally sound with excellent code quality, error handling, and documentation.

diberry and others added 2 commits June 8, 2026 16:09
Use @TopK query parameter instead of string interpolation for the TOP
clause, aligning with cosmosdb-agent-kit sdk-best-practices rule
query-top-literal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
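
A sketch of what the @TopK change in the commit above amounts to, assuming the azure-cosmos parameter format (a list of name/value dictionaries). The `build_vector_query` helper and the `HotelName` and `vector` property names are hypothetical.

```python
def build_vector_query(embedding, top_k: int):
    """Return (query_text, parameters) with TOP supplied as a parameter.

    Before: f"SELECT TOP {top_k} ..." baked the value into the text.
    After:  "SELECT TOP @topK ..." keeps the text constant across calls.
    """
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError(f"top_k must be a positive integer, got {top_k!r}")

    query = (
        "SELECT TOP @topK c.HotelName, "
        "VectorDistance(c.vector, @embedding) AS score "
        "FROM c ORDER BY VectorDistance(c.vector, @embedding)"
    )
    parameters = [
        {"name": "@topK", "value": top_k},
        {"name": "@embedding", "value": list(embedding)},
    ]
    return query, parameters
```

With a container proxy in hand, the pair would be passed along the lines of `container.query_items(query=query, parameters=parameters, enable_cross_partition_query=True)`.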
Report failed document count and raise RuntimeError when batch
operations return non-success status codes, instead of silently
counting only successes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
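
The partial-failure handling described in the commit above could be sketched like this. It assumes each batch result is a dictionary carrying a `statusCode` entry; that response shape and the `check_batch_results` name are assumptions, so verify them against the SDK version in use.

```python
def check_batch_results(results) -> int:
    """Return the number of successful operations, or raise on any failure.

    A result counts as failed when its status code is missing or outside
    the 2xx range, so a partial failure can no longer pass silently.
    """
    results = list(results)
    failed = [r for r in results if not 200 <= int(r.get("statusCode", 0)) < 300]
    if failed:
        codes = sorted({int(r.get("statusCode", 0)) for r in failed})
        raise RuntimeError(
            f"{len(failed)} of {len(results)} batch operations failed "
            f"(status codes: {codes})"
        )
    return len(results)
```

Raising rather than returning a failure count forces the caller to decide explicitly whether to retry or abort the ingestion run.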
diberry commented Jun 9, 2026 (Collaborator, Author)

Review: UNANIMOUS APPROVAL ✅

All 8 specialist reviewers approved.

| Reviewer | Round | Verdict |
| --- | --- | --- |
| Statler (Adversarial) | 2 | ✅ APPROVE |
| Scooter (QA) | 2 | ✅ APPROVE |
| Architect | 1 | ✅ APPROVE |
| Security | 1 | ✅ APPROVE |
| Algorithm | 1 | ✅ APPROVE |
| Integration | 2 | ✅ APPROVE |
| Beauregard (PR Hygiene) | 1 | ✅ APPROVE |
| Sam (Fact Checker) | 2 | ✅ APPROVE |

Rounds needed: 2. Fix: Python ingestion now raises RuntimeError on partial batch failures.

- Python article: transactional batch inserts, azure-cosmos SDK, DefaultAzureCredential
- Go article: concurrent goroutine inserts, azcosmos SDK, semaphore pattern
- Both articles passed 8-agent unanimous approval review (3 rounds)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
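
The Go sample's bounded-concurrency insert pattern is not shown in this conversation. As a rough Python analogue of the same idea (not the Go code), a thread pool with a fixed worker count can stand in for the semaphore: at most `max_concurrency` inserts are in flight at once. The `insert_all` helper and its `create_item` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def insert_all(create_item, documents, max_concurrency: int = 8):
    """Insert documents with a cap on in-flight calls.

    create_item is any callable that inserts one document and raises on
    failure. Returns (succeeded_count, errors) so the caller can report
    failures instead of losing them inside worker threads.
    """

    def attempt(doc):
        try:
            create_item(doc)
            return None
        except Exception as exc:  # collect every failure, keep going
            return exc

    # The pool size is the concurrency bound, like a semaphore's capacity.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        outcomes = list(pool.map(attempt, documents))

    errors = [outcome for outcome in outcomes if outcome is not None]
    return len(outcomes) - len(errors), errors
```

Unlike the transactional batch used in the Python sample, each insert here succeeds or fails independently, which is why errors are collected per document.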