Skip to content

Feature/audit code base#452

Closed
nielsdrost7 wants to merge 494 commits into
masterfrom
feature/audit-code-base
Closed

Feature/audit code base#452
nielsdrost7 wants to merge 494 commits into
masterfrom
feature/audit-code-base

Conversation

@nielsdrost7
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI and others added 25 commits January 11, 2026 14:04
)

* Initial plan

* Fix logging, enum injection, and enum logic issues

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update BaseFormatHandler to support both constructor and setFormat patterns

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix test base classes to use AbstractTestCase

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Address code review feedback - improve format property handling

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Address PR review feedback: import RuntimeException, fix test comments, remove empty test, clarify header merge order

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* style: apply Laravel Pint fixes

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…tion, and update AI guidelines (#354)

* Initial plan

* Fix HTTP client to not throw automatically, add config defaults, protect FormatHandlerFactory

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix PeppolService to validate invoices and check response status before returning

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix FormatHandlerFactory to only support subset of formats and gracefully fallback

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix Invoice and Quote observers to throw RuntimeException for duplicate numbers

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Restore ->throw() in ApiClient and let HttpClientExceptionHandler catch exceptions

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add selective exception handling in DocumentsClient for test compatibility

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add documentation and logging for validation and format handler fallbacks

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Restore all Peppol format handlers with comprehensive PHPUnit tests

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Consolidate documentation: Enhanced README with comprehensive user guide

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update guidelines and copilot instructions with complete Peppol format handler documentation

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…n permissions (#360)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…n permissions (#359)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…n permissions (#358)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…n permissions (#357)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…n permissions (#356)

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Comment out the schedule section for the workflow.
…ing expectations (#420)

* Initial plan

* ✨ Enhance Copilot instructions with workflow guidance and CI/CD details

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* ✨ Enhance Junie guidelines with workflow guidance and timing expectations

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>
* Feature: Peppol refs #126

* More cleanup for phpunit tests
* Branch Report Builder initial

* Removed Invoices temporarily

* 126: Peppo;l

* temporary-total

* temp

* Delete Modules/ReportBuilder directory

* Composer correct

* Composer correct

* Update pint.yml

* Fix Filament action imports, grid calculations, and refactor test mocks to fakes in Report Builder (#351)

* Initial plan

* Fix ReportBlocksTable imports, actions, GridSnapperService, test mock, and CSS syntax

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Refactor tests to use Storage::fake() instead of mocking repository

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Change actions to recordActions in ReportBlocksTable

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* merged in develop branch

* Add comprehensive Mason package integration plan

- Document installation steps and theme setup
- Detail architecture changes from custom to Mason
- Provide code examples for all Brick classes
- Include Blade view templates for preview/render
- Add BricksCollection structure
- Describe storage adapter pattern for JSON persistence
- List all translation keys needed
- Outline testing strategy with Mason Faker
- Include phased migration approach

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update Mason refactor plan with status and next steps

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Implement Mason package integration for ReportBuilder

- Create 6 Mason Brick classes (HeaderCompany, HeaderClient, HeaderInvoiceMeta, DetailItems, FooterTotals, FooterNotes)
- Create ReportBricksCollection to organize bricks by section
- Add preview and render Blade templates for all bricks
- Implement MasonStorageAdapter to convert between Mason JSON and BlockDTO
- Create mason-preview layout for WYSIWYG editing experience
- Add 60+ translation keys for Mason UI elements
- Maintain filesystem JSON storage (no database changes)

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Wire Mason into ReportBuilder and add comprehensive tests

- Update ReportBuilder.php to support Mason field with MasonStorageAdapter
- Add masonContent property and loadMasonContent() method
- Implement saveMasonContent() for Mason-based saves with legacy fallback
- Add getMasonEditorSchema() for Filament Mason field configuration
- Fix syntax error in FooterTotalsBrick (wrong operator)
- Create MasonStorageAdapterTest with 10 test cases for conversion logic
- Create MasonBricksTest with 16 tests for all 6 brick classes
- Create ReportBricksCollectionTest with 7 tests for collection methods
- Create ReportBuilderMasonIntegrationTest with 8 integration tests
- All tests validate roundtrip conversion, config preservation, and multi-brick handling

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add 5 new Mason bricks and simplify storage (remove BlockDTO)

- Create HeaderProjectBrick with project_number, project_name, dates, status fields
- Create HeaderQuoteMetaBrick with quote_number, quoted_at, expires_at fields
- Create DetailTasksBrick with task list table (number, name, description, price, status)
- Create FooterTermsBrick for terms field from entities
- Create FooterSummaryBrick for summary field from entities
- Update FooterNotesBrick to use 'footer' field instead of 'footer_notes'
- Create 10 Blade templates (preview + render) for new bricks
- Update ReportBricksCollection to include all 11 bricks
- Add 60+ translation keys for new bricks
- Create MasonTemplateStorage service - stores Mason JSON directly (no BlockDTO)
- Simplify ReportBuilder - removed BlockDTO/MasonStorageAdapter dependencies
- Storage now: Mason JSON → filesystem (no transformation)

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Refactor brick actions to use Mason's standard fillForm pattern

- Update all 11 brick classes to use simplified fillForm pattern
- Change from manual config extraction to: fillForm(fn ($args) => $args['config'] ?? null)
- Makes full use of Mason's BrickAction pattern as shown in documentation
- Cleaner, more maintainable code that follows Mason best practices
- Reduces code duplication across all brick configuration actions

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add 6 new Mason bricks for enhanced reporting capabilities

- DetailInvoiceProductBrick: Product-focused invoice line items with SKU support
- DetailInvoiceProjectBrick: Project-based invoice items with hour tracking
- DetailQuoteProductBrick: Product-focused quote line items
- DetailQuoteProjectBrick: Project-based quote items with task grouping
- DetailCustomerAgingBrick: Customer aging report with 30/60/90 day buckets
- DetailExpenseBrick: Expense tracking with category and vendor support
- Created 12 Blade templates (preview + render for each brick)
- Updated ReportBricksCollection to include all 6 new bricks
- Added 60+ translation keys for all new UI elements
- All bricks follow Mason's standard fillForm pattern

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update Modules/Core/Filament/Admin/Resources/ReportTemplates/Pages/ReportBuilder.php

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Add authorization and fix XSS vulnerabilities in Mason bricks

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update resources/lang/en/ip.php

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Modules/Core/Services/MasonStorageAdapter.php

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix test assertions, Blade syntax, and unused imports per code review

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add test for MasonTemplateStorage save/load flow

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update Modules/Core/Tests/Feature/ReportBuilderMasonIntegrationTest.php

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Branch Report Builder initial

* Removed Invoices temporarily

* 126: Peppo;l

* temporary-total

* temp

* Delete Modules/ReportBuilder directory

* Composer correct

* Composer correct

* Update pint.yml

* Fix Filament action imports, grid calculations, and refactor test mocks to fakes in Report Builder (#351)

* Initial plan

* Fix ReportBlocksTable imports, actions, GridSnapperService, test mock, and CSS syntax

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Refactor tests to use Storage::fake() instead of mocking repository

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Change actions to recordActions in ReportBlocksTable

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* merged in develop branch

* Initial plan

* Add import:db command with service and tests

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add table existence checks to import service

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add documentation and additional tests for import functionality

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix security issues and improve code quality based on review

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add implementation summary document

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Implement modular import architecture with separate services

- Add 'import_v1' database connection in config
- Create ImportServiceInterface and AbstractImportService base
- Implement modular import services:
  * TaxRatesImportService
  * ProductsImportService (categories, units, products)
  * ClientsImportService (relations, contacts, addresses)
  * NumberingImportService (invoice groups)
  * InvoicesImportService (with InvoiceStatus enum)
  * QuotesImportService (with QuoteStatus enum)
  * PaymentsImportService (with PaymentMethod enum)
- Create ImportOrchestrator to coordinate all services
- Update command to read from storage/app/private/imports
- Use Laravel DB connections instead of shell restoration

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add remaining import services for complete coverage

- UsersImportService (ip_users)
- ProjectsImportService (ip_projects, ip_tasks)
- EmailTemplatesImportService (ip_email_templates)
- CustomFieldsImportService (ip_custom_fields, ip_custom_values)
- SettingsImportService (ip_settings)
- NotesImportService (ip_notes)
- Update ImportOrchestrator to include all services
- Add required ID mappings (users, projects, custom_fields)

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Add comprehensive PHPUnit tests for import services

- TaxRatesImportServiceTest with 7 test cases
- ProductsImportServiceTest with 7 test cases
- ClientsImportServiceTest with 8 test cases
- Each test includes edge case coverage
- Tests use RefreshDatabase and proper setup/teardown
- All tests follow AAA (Arrange, Act, Assert) pattern

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Refactor to use ModelType enum and add numbering logic

- Create ModelType enum for model class mappings
- Refactor CustomFieldsImportService to use ModelType enum
- Refactor NotesImportService to use ModelType enum
- Add applyNumberingLogic method to NumberingImportService
- Call numbering logic after all imports in ImportOrchestrator
- Ensures numbering won't fail after import

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update Modules/Core/Tests/Feature/ImportInvoicePlaneV1CommandTest.php

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update Modules/Core/Tests/Feature/ImportInvoicePlaneV1CommandTest.php

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update Modules/Core/Services/Import/EmailTemplatesImportService.php

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix Contact import to use first_name/last_name and add communications

- Split contact_name into first_name and last_name fields
- Import email and phone as Communication records
- Replace ->run() with ->assertSuccessful() in all tests
- Update ClientsImportServiceTest to verify name splitting and communications
- EmailTemplatesImportService already uses correct column names (verified)

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix critical issues in import services and documentation

- ImportOrchestrator: Scope users to company via whereHas, attach new users
- UsersImportService: Validate emails before import
- PaymentsImportService: Use PaymentStatus::COMPLETED enum instead of 'paid'
- InvoicesImportService: Optimize item loading with groupBy (prevent N+1 queries)
- QuotesImportService: Optimize item loading with groupBy (prevent N+1 queries)
- ImportInvoicePlaneV1CommandTest: Use enum comparisons and filter by amount
- IMPORT_README.md: Fix documentation to match actual command signature (filename, not path)
- IMPORT_README.md: Update cleanup notes (database kept for debugging)

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix field names and add security improvements

- ClientsImportService: Use postal_code/state_or_province for Address
- ClientsImportService: Use contactable_type/contactable_value for Communication
- NotesImportService: Use $modelType->value instead of non-existent mapModelType()
- NumberingImportService: Use NumberingType::INVOICE enum instead of string
- ImportOrchestrator: Add database name validation (alphanumeric + $_)
- ImportOrchestrator: Catch \Throwable instead of \Exception for better error handling

Agent-Logs-Url: https://github.com/InvoicePlane/InvoicePlane-v2/sessions/51c0a25b-ecc7-4eaf-b336-17e8eb8d4430

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Fix multi-tenancy, numbering logic, and model field names

- UsersImportService: Attach users to company (both new and existing)
- NumberingImportService: Use NumberingType enum values in where clauses
- NumberingImportService: Fix max() logic - pluck all numbers, extract numeric parts, find max
- SettingsImportService: Remove company_id (Settings table is global)
- TaxRatesImportService: Use correct field names (name, rate, code, tax_rate_type)
- TaxRatesImportService: Add TaxRateType::SALES as default type

Agent-Logs-Url: https://github.com/InvoicePlane/InvoicePlane-v2/sessions/981e7574-6fa1-4881-9063-eaee7c105870

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* fix: apply CodeRabbit auto-fixes

Fixed 5 file(s) based on 17 unresolved review comments.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

* Fix remaining issues from code review

- CustomFieldsImportService: Add resolveModelId() method to fix fatal error
- CustomFieldsImportService: Use ModelType::fromString()->value directly
- ImportOrchestrator: Remove unused Storage import
- ImportOrchestrator: Create database on import_v1 server using PDO
- ClientsImportService: Remove unused ip_client_notes from getTables()
- ImportInvoicePlaneV1CommandTest: Fix argument name (filename not dumpfile)
- ImportInvoicePlaneV1CommandTest: Copy fixture to storage/app/private/imports
- Unit tests: Configure import_v1 to use SQLite in-memory for CI compatibility
- IMPLEMENTATION_SUMMARY.md: Fix usage examples (filename only, not full path)

Agent-Logs-Url: https://github.com/InvoicePlane/InvoicePlane-v2/sessions/a0fb0fe9-f728-4766-a34e-ad002b4363a1

Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>

* Update Modules/Core/Commands/IMPORT_README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Niels Drost <nielsdrost7@gmail.com>
Co-authored-by: Niels Drost <47660417+nielsdrost7@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
* Master Initial

* Initial plan

---------

Co-authored-by: Niels Drost <nielsdrost7@gmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: nielsdrost7 <47660417+nielsdrost7@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 25, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7b695e0f-62e5-4547-9396-eac084a9dab6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/audit-code-base

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Collaborator Author

@nielsdrost7 nielsdrost7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests can be found in Modules/{module_name}/Tests

PR Behavioral Test Audit Agent

You are a senior Laravel + Playwright behavioral testing auditor.

Your task is to audit a Pull Request for weak, structural, superficial, or non-behavioral tests.

You MUST review:

  • PHPUnit tests
  • Playwright tests
  • generated tests
  • helper abstractions
  • test architecture
  • assertions
  • behavioral coverage quality

Your objective is NOT to count tests.
Your objective is to determine whether the PR increases real confidence in business behavior.


Audit Scope

Review:

  • newly added tests
  • modified tests
  • deleted tests
  • helper utilities
  • generators
  • factories/fixtures affecting tests
  • Playwright recording/replay systems
  • PHPUnit feature/unit coverage quality

Analyze:

  • test intent
  • assertion strength
  • procedural flow quality
  • behavioral guarantees
  • implementation coupling
  • structural coupling
  • business-domain coverage

Core Behavioral Principle

A test is only valuable if it proves a user-observable or business-observable capability.

A test is weak if it only proves:

  • rendering
  • existence
  • HTTP 200
  • response.ok()
  • selector presence
  • route accessibility
  • implementation details
  • internal structure
  • framework behavior
  • replay execution

PHPUnit Audit Rules

Weak PHPUnit tests include:

  • assertStatus(200) without semantic assertions
  • assertOk()
  • asserting view existence only
  • asserting route availability only
  • mocking everything
  • testing framework internals
  • testing implementation instead of behavior
  • repository/service tests without business meaning

Strong PHPUnit tests:

  • verify validation behavior
  • verify authorization boundaries
  • verify persistence changes
  • verify workflow outcomes
  • verify domain mutations
  • verify policy enforcement
  • verify business rules
  • verify observable application state

Weak:

$response->assertOk();

Strong:

$response->assertRedirect(route('dashboard'));`

$this->assertDatabaseHas('clients', [
    'name' => 'Acme',
]);

Weak:

$this->assertTrue(true);

Forbidden:

  • meaningless assertions
  • coverage padding
  • asserting implementation details
  • mock-driven fake confidence
  • Playwright Audit Rules

Weak Playwright tests include:

  • toBeVisible() alone
  • response.ok()
  • status 200 checks
  • selector assertions
  • route smoke tests
  • click replay
  • network replay
  • page object over-abstraction
  • DOM-coupled assertions
  • CSS-coupled assertions

Strong Playwright tests:

  • complete workflows
  • authentication flows
  • CRUD workflows
  • validation feedback
  • authorization boundaries
  • redirects
  • state transitions
  • user-observable outcomes
  • business entity visibility after mutation

Weak:

await expect(button).toBeVisible();

Strong:

await page.click('button[type=submit]');
await expect(page).toHaveURL(/dashboard/);

Weak:

expect(response.status()).toBe(200);

Strong:

expect(payload.errors).toHaveProperty('email');
Generator Audit Rules

If the PR contains generators:

Audit whether the generators produce:

  • procedural tests
  • behavioral assertions
  • business workflows

Flag generators that produce:

  • replay-only tests
  • selector assertions
  • route replay
  • response.ok assertions
  • structural assertions
  • DOM coupling

Generators MUST synthesize:

  • workflows
  • state transitions
  • business outcome

NOT:

  • browser recordings
  • clicks
  • selectors
  • visual existence
  • Required Audit Procedure

For EVERY modified test file:

  • Determine intended business capability
  • Determine whether the test truly proves it
  • Identify weak assertions
  • Identify structural coupling
  • Identify implementation coupling
  • Identify fake/superficial coverage
  • Identify missing behavioral assertions
  • Identify missing workflow completion verification
  • Determine confidence impact
  • Mandatory Output Structure
  • Behavioral Confidence Score

Rate:

  • Very Strong
  • Strong
  • Moderate
  • Weak
  • Superficial

Explain WHY.

For Each File
File:
Purpose

Describe intended business behavior.

Strong Assertions

List meaningful behavioral assertions.

Weak Assertions

  • List weak/superficial assertions.
  • Structural Coupling
  • List DOM/selector/framework coupling.
  • Missing Behavioral Coverage
  • List missing workflows or outcomes.
  • Confidence Impact
  • Explain whether this file truly improves confidence.
  • Refactor Recommendations
  • Provide concrete behavioral improvements.
  • PR-Wide Findings

Strong Improvements

  • List meaningful testing improvements.

Weak Patterns

  • List repeated anti-patterns.
  • Structural Coupling
  • List architecture-level brittleness.
  • Superficial Coverage
  • List tests that inflate coverage without confidence.
  • Missing Workflows
  • List missing end-to-end business flows.
  • Generator Problems
  • List weak generation strategies.

Mandatory Self-Check

Before finalizing:

Ask:

  • Does this test prove a business capability?
  • Would deleting this test reduce confidence?
  • Is the assertion procedural?
  • Is the workflow complete?
  • Is the assertion implementation-coupled?
  • Is the assertion resilient to refactors?
  • Does the test verify outcomes instead of existence?

If "no":

  • explain failure
  • classify weakness
  • recommend remediation
  • Important Constraints

NEVER:

  • praise coverage quantity
  • praise test count
  • praise snapshots
  • praise route smoke tests
  • praise selector assertions
  • praise HTTP 200 assertions
  • praise replay tests

ALWAYS:

  • prioritize behavioral confidence
  • prioritize business guarantees
  • prioritize workflow verification
  • prioritize state transitions
  • prioritize resilience to refactors

Your responsibility is to repair the entire code base based on above rules.
Improve .github/copilot-instructions.md, AGENTS.md, .junie/*.md based on above rules

Copy link
Copy Markdown
Collaborator Author

@nielsdrost7 nielsdrost7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot
The tests can be found in Modules/{module_name}/Tests

PR Behavioral Test Audit Agent

You are a senior Laravel + Playwright behavioral testing auditor.

Your task is to audit a Pull Request for weak, structural, superficial, or non-behavioral tests.

You MUST review:

  • PHPUnit tests
  • Playwright tests
  • generated tests
  • helper abstractions
  • test architecture
  • assertions
  • behavioral coverage quality

Your objective is NOT to count tests.
Your objective is to determine whether the PR increases real confidence in business behavior.


Audit Scope

Review:

  • newly added tests
  • modified tests
  • deleted tests
  • helper utilities
  • generators
  • factories/fixtures affecting tests
  • Playwright recording/replay systems
  • PHPUnit feature/unit coverage quality

Analyze:

  • test intent
  • assertion strength
  • procedural flow quality
  • behavioral guarantees
  • implementation coupling
  • structural coupling
  • business-domain coverage

Core Behavioral Principle

A test is only valuable if it proves a user-observable or business-observable capability.

A test is weak if it only proves:

  • rendering
  • existence
  • HTTP 200
  • response.ok()
  • selector presence
  • route accessibility
  • implementation details
  • internal structure
  • framework behavior
  • replay execution

PHPUnit Audit Rules

Weak PHPUnit tests include:

  • assertStatus(200) without semantic assertions
  • assertOk()
  • asserting view existence only
  • asserting route availability only
  • mocking everything
  • testing framework internals
  • testing implementation instead of behavior
  • repository/service tests without business meaning

Strong PHPUnit tests:

  • verify validation behavior
  • verify authorization boundaries
  • verify persistence changes
  • verify workflow outcomes
  • verify domain mutations
  • verify policy enforcement
  • verify business rules
  • verify observable application state

Weak:

$response->assertOk();

Strong:

$response->assertRedirect(route('dashboard'));`

$this->assertDatabaseHas('clients', [
    'name' => 'Acme',
]);

Weak:

$this->assertTrue(true);

Forbidden:

  • meaningless assertions
  • coverage padding
  • asserting implementation details
  • mock-driven fake confidence
  • Playwright Audit Rules

Weak Playwright tests include:

  • toBeVisible() alone
  • response.ok()
  • status 200 checks
  • selector assertions
  • route smoke tests
  • click replay
  • network replay
  • page object over-abstraction
  • DOM-coupled assertions
  • CSS-coupled assertions

Strong Playwright tests:

  • complete workflows
  • authentication flows
  • CRUD workflows
  • validation feedback
  • authorization boundaries
  • redirects
  • state transitions
  • user-observable outcomes
  • business entity visibility after mutation

Weak:

await expect(button).toBeVisible();

Strong:

await page.click('button[type=submit]');
await expect(page).toHaveURL(/dashboard/);

Weak:

expect(response.status()).toBe(200);

Strong:

expect(payload.errors).toHaveProperty('email');
Generator Audit Rules

If the PR contains generators:

Audit whether the generators produce:

  • procedural tests
  • behavioral assertions
  • business workflows

Flag generators that produce:

  • replay-only tests
  • selector assertions
  • route replay
  • response.ok assertions
  • structural assertions
  • DOM coupling

Generators MUST synthesize:

  • workflows
  • state transitions
  • business outcome

NOT:

  • browser recordings
  • clicks
  • selectors
  • visual existence
  • Required Audit Procedure

For EVERY modified test file:

  • Determine intended business capability
  • Determine whether the test truly proves it
  • Identify weak assertions
  • Identify structural coupling
  • Identify implementation coupling
  • Identify fake/superficial coverage
  • Identify missing behavioral assertions
  • Identify missing workflow completion verification
  • Determine confidence impact
  • Mandatory Output Structure
  • Behavioral Confidence Score

Rate:

  • Very Strong
  • Strong
  • Moderate
  • Weak
  • Superficial

Explain WHY.

For Each File
File:
Purpose

Describe intended business behavior.

Strong Assertions

List meaningful behavioral assertions.

Weak Assertions

  • List weak/superficial assertions.
  • Structural Coupling
  • List DOM/selector/framework coupling.
  • Missing Behavioral Coverage
  • List missing workflows or outcomes.
  • Confidence Impact
  • Explain whether this file truly improves confidence.
  • Refactor Recommendations
  • Provide concrete behavioral improvements.
  • PR-Wide Findings

Strong Improvements

  • List meaningful testing improvements.

Weak Patterns

  • List repeated anti-patterns.
  • Structural Coupling
  • List architecture-level brittleness.
  • Superficial Coverage
  • List tests that inflate coverage without confidence.
  • Missing Workflows
  • List missing end-to-end business flows.
  • Generator Problems
  • List weak generation strategies.

Mandatory Self-Check

Before finalizing:

Ask:

  • Does this test prove a business capability?
  • Would deleting this test reduce confidence?
  • Is the assertion procedural?
  • Is the workflow complete?
  • Is the assertion implementation-coupled?
  • Is the assertion resilient to refactors?
  • Does the test verify outcomes instead of existence?

If "no":

  • explain failure
  • classify weakness
  • recommend remediation
  • Important Constraints

NEVER:

  • praise coverage quantity
  • praise test count
  • praise snapshots
  • praise route smoke tests
  • praise selector assertions
  • praise HTTP 200 assertions
  • praise replay tests

ALWAYS:

  • prioritize behavioral confidence
  • prioritize business guarantees
  • prioritize workflow verification
  • prioritize state transitions
  • prioritize resilience to refactors

Your responsibility is to repair the entire code base based on above rules.
Improve .github/copilot-instructions.md, AGENTS.md, .junie/*.md based on above rules

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants