Set up workflow benchmarking to stress-test local workflows #17

@Enniwhere

Description:
We've set up local workflows that use the DGX Spark as the primary inference server and fall back to Scaleway as a backup. We want to stress-test this setup before we roll it out to users.

Acceptance criteria:

  • Automated stress tests that apply different types of load to the endpoints
  • Measure how often requests spill over onto Scaleway
  • Telemetry results (tokens per second, end-to-end latency, time to first token, etc.) logged to a safe place
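
As a starting point for discussion, here is a minimal sketch of how the three criteria could fit together: a concurrency-limited load generator, per-request timing, and a summary that includes the spillover rate. Everything named here is hypothetical and not taken from our codebase: `fake_stream` is a stand-in for the real streaming client, the `"primary"`/`"fallback"` labels assume the client can report which backend served a request, and the rule that sends every fourth prompt to the fallback exists only to make the demo deterministic.

```python
import asyncio
import json
import statistics
import time
from dataclasses import dataclass
from typing import AsyncIterator, Callable

# Hypothetical interface: a streaming client yields (backend, token) pairs.
StreamFn = Callable[[str], AsyncIterator[tuple[str, str]]]


@dataclass
class RequestResult:
    backend: str   # "primary" (DGX Spark) or "fallback" (Scaleway), assumed labels
    ttft_s: float  # time to first token, in seconds
    e2e_s: float   # end-to-end latency, in seconds
    tokens: int    # number of tokens received

    @property
    def tokens_per_second(self) -> float:
        return self.tokens / self.e2e_s if self.e2e_s > 0 else 0.0


async def fake_stream(prompt: str) -> AsyncIterator[tuple[str, str]]:
    """Stand-in for the real client: 5 tokens, every 4th prompt spills over."""
    index = int(prompt.rsplit("-", 1)[1])
    backend = "fallback" if index % 4 == 3 else "primary"
    for i in range(5):
        await asyncio.sleep(0.001)  # simulated generation delay
        yield backend, f"tok{i}"


async def timed_request(stream_fn: StreamFn, prompt: str) -> RequestResult:
    """Consume one streamed response and record its timings."""
    start = time.perf_counter()
    first_token_at = None
    tokens = 0
    backend = "unknown"
    async for backend, _token in stream_fn(prompt):
        if first_token_at is None:
            first_token_at = time.perf_counter()
        tokens += 1
    end = time.perf_counter()
    ttft = (first_token_at if first_token_at is not None else end) - start
    return RequestResult(backend, ttft, end - start, tokens)


async def run_load(stream_fn: StreamFn, prompts: list[str], concurrency: int) -> list[RequestResult]:
    """Send all prompts, with at most `concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def one(prompt: str) -> RequestResult:
        async with semaphore:
            return await timed_request(stream_fn, prompt)

    return list(await asyncio.gather(*(one(p) for p in prompts)))


def summarize(results: list[RequestResult]) -> dict:
    """Aggregate per-request results into the metrics named in the criteria."""
    spilled = sum(r.backend == "fallback" for r in results)
    return {
        "requests": len(results),
        "spillover_rate": spilled / len(results),
        "ttft_s_median": statistics.median([r.ttft_s for r in results]),
        "e2e_s_median": statistics.median([r.e2e_s for r in results]),
        "tokens_per_second_median": statistics.median(
            [r.tokens_per_second for r in results]
        ),
    }


if __name__ == "__main__":
    demo_prompts = [f"prompt-{i}" for i in range(8)]
    demo_results = asyncio.run(run_load(fake_stream, demo_prompts, concurrency=4))
    # One JSON line per run; where this is stored is left open.
    print(json.dumps(summarize(demo_results)))
```

Swapping `fake_stream` for a real client and varying `concurrency` and the prompt mix would give the "different types of load"; the JSON summary line could then be appended to whatever log store we settle on.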

Technical details:
Optional technical details for context.

Design:
Optional details on design for context.
