HaaS Example Service

A minimal working example of a service that can be deployed on the HaaS (Hosting as a Service) platform. It implements a simple arithmetic calculator to demonstrate the required structure, interface contract, and profiling setup.


How HaaS Deploys a Service

When you push a commit to a registered GitHub repository, the HaaS pipeline runs automatically:

GitHub Push
    ↓
Listener   — validates the webhook and records the commit
    ↓
Builder    — clones the repo, builds a Docker image, and pushes it to the registry
    ↓
GPU Orchestrator — creates a serverless endpoint from the image
    ↓
Profiling  — calls the endpoint with profile.json to measure baseline execution time
    ↓
Service Ready — endpoint is cached and available for inference requests

Your repository only needs to satisfy two requirements for this to work:

  1. A Dockerfile that produces an image running runpod_handler.py (a minimal sketch follows this list)
  2. A profile.json with a representative input the profiling step can use
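
For illustration, a Dockerfile along these lines would satisfy the first requirement. This is a sketch only; the Dockerfile shipped in this repository may differ.

FROM python:3.11-slim

WORKDIR /app

# Install pinned dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# -u keeps stdout unbuffered so the wrapper's log capture sees output promptly
CMD ["python", "-u", "runpod_handler.py"]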

Project Structure

example_service/
├── customer_main.py      # Your model logic — the only file you need to change
├── runpod_handler.py     # HaaS runtime wrapper — do not modify
├── profile.json          # Sample input used by the profiling step
├── Dockerfile
└── requirements.txt

customer_main.py — your code

The single entry point HaaS expects is a run function:

from typing import Any

def run(input_data: dict) -> Any:
    ...
Argument       Type                          Description
input_data     dict                          The "input" field from the RunPod request body
return value   any JSON-serialisable value   Wrapped into {"result": <value>} by the handler

This example implements four arithmetic operations:

# Request
{"input": {"a": 10, "b": 4, "op": "add"}}

# Response
{"result": 14}

Supported operations: add, sub, mul, div.
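
A customer_main.py implementing this contract could look roughly like the following. This is a sketch based on the request/response example above; the actual file may differ in details.

def run(input_data: dict):
    # input_data is the "input" field of the request body,
    # e.g. {"a": 10, "b": 4, "op": "add"}
    a = input_data["a"]
    b = input_data["b"]
    op = input_data["op"]

    operations = {
        "add": lambda: a + b,
        "sub": lambda: a - b,
        "mul": lambda: a * b,
        "div": lambda: a / b,
    }
    if op not in operations:
        raise ValueError(f"unsupported op: {op!r}")

    # The handler wraps this return value into {"result": <value>}
    return operations[op]()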

runpod_handler.py — HaaS runtime wrapper

You do not need to touch this file. It:

  • Bootstraps the RunPod serverless listener
  • Imports and calls customer_main.run(input_data) on every request
  • Captures stdout from your code and forwards it to Sentry as structured log entries
  • Handles exceptions: returns {"error": ..., "error_type": ..., "logs": ...} instead of crashing the worker
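
Condensed to its essentials, the wrapper behaves roughly like the sketch below. The real file additionally initialises Sentry and forwards the captured output there; runpod.serverless.start is the standard RunPod SDK entry point.

import contextlib
import io

import runpod
import customer_main

def handler(event):
    # Capture everything the customer code prints so it can be
    # forwarded as log entries (and returned alongside errors)
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            result = customer_main.run(event["input"])
        return {"result": result}
    except Exception as exc:
        # Structured error instead of crashing the worker
        return {
            "error": str(exc),
            "error_type": type(exc).__name__,
            "logs": buffer.getvalue(),
        }

runpod.serverless.start({"handler": handler})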

profile.json — profiling input

Immediately after deployment, the GPU Orchestrator's profiling service calls the endpoint once with the contents of profile.json. The measured execution time is stored and used for capacity planning.

The file must contain a valid request body in RunPod format:

{
  "input": { ... }
}

Choose an input that is representative of a typical request — not a trivial no-op, but not the slowest possible case either.
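
For this calculator, a reasonable profile.json might be the following (an illustrative choice, not necessarily the one shipped in this repository):

{
  "input": {"a": 10, "b": 4, "op": "mul"}
}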


Adapting for Your Own Service

  1. Replace customer_main.py with your model logic. Keep the run(input_data) signature.
  2. Update requirements.txt with your dependencies. Keep runpod==1.7.12 and sentry-sdk==2.46.0 (required for metrics tracking); a sample requirements.txt follows this list.
  3. Update profile.json with a representative input for your model.
  4. Keep runpod_handler.py and Dockerfile unchanged.
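
Putting that together, the requirements.txt of an adapted service might look like this (the torch pin is a hypothetical placeholder for your own dependencies):

# Required by the HaaS runtime wrapper; do not remove
runpod==1.7.12
sentry-sdk==2.46.0

# Your own dependencies go below, e.g.:
# torch==2.3.0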
