GitHub - HiteshMondal/devops

            ██████╗ ███████╗██╗   ██╗ ██████╗ ██████╗ ███████╗
            ██╔══██╗██╔════╝██║   ██║██╔═══██╗██╔══██╗██╔════╝
            ██║  ██║█████╗  ██║   ██║██║   ██║██████╔╝███████╗
            ██║  ██║██╔══╝  ╚██╗ ██╔╝██║   ██║██╔═══╝ ╚════██║
            ██████╔╝███████╗ ╚████╔╝ ╚██████╔╝██║     ███████║
            ╚═════╝ ╚══════╝  ╚═══╝   ╚═════╝ ╚═╝     ╚══════╝

End-to-End DevOps + MLOps Platform

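A production-grade DevOps and MLOps project demonstrating the full lifecycle of an AI/ML application, from image build and infrastructure provisioning to deployment, observability, security scanning, and ML pipelines.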

Overview

This repository provides a single-command interactive deployment runner:

./run.sh

The runner interactively guides you through environment, component, and cloud provider selection — then handles the rest automatically, including runtime detection, dependency resolution, and cluster-aware configuration.
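As an illustration of what runtime detection can look like, here is a minimal sketch in Python. It is not taken from run.sh: the function name, the preference order (Docker before Podman), and the error message are all assumptions made for the example.

```python
import shutil
from typing import Callable, Optional

# Assumed preference order; run.sh may choose differently.
CANDIDATES = ("docker", "podman")


def detect_runtime(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the name of the first container runtime found on PATH."""
    for name in CANDIDATES:
        if which(name):
            return name
    raise RuntimeError("No container runtime found: install Docker or Podman")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use `detect_runtime()` is called with no arguments.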

Architecture

┌────────────────────────────────────────────────────────────────────┐
│                           ./run.sh                                 │
│          (Interactive orchestrator — detects everything)           │
└──────────────┬────────────────────────────────┬────────────────────┘
               │                                │
       ┌───────▼────────┐              ┌────────▼───────┐
       │  LOCAL TARGET  │              │  PROD TARGET   │
       │  ──────────────│              │  ────────────  │
       │  Minikube      │              │  AWS EKS       │
       │  Kind          │              │  GKE           │
       │  K3s           │              │  AKS           │
       │  MicroK8s      │              │  OCI OKE       │
       └───────┬────────┘              └────────┬───────┘
               │                                │
       ┌───────▼────────────────────────────────▼────────┐
       │                 DEPLOYMENT PIPELINE             │
       │                                                 │
       │  1. Build & Push Image   (Docker / Podman)      │
       │  2. Provision Infra      (Terraform / OpenTofu) │
       │  3. Deploy App to K8s    (Kustomize overlays)   │
       │  4. Deploy Monitoring    (Prometheus + Grafana) │
       │  5. Deploy Logging       (Loki + Promtail)      │
       │  6. Security Scan        (Trivy)                │
       │  7. MLOps Pipeline       (Train / Drift / Log)  │
       └─────────────────────────────────────────────────┘

Key Features

Category	What's Included
Single Command	Interactive `run.sh` orchestrates the entire pipeline
Container Runtime	Docker and Podman both supported, auto-detected
Application	FastAPI (Python 3.11) served via Uvicorn on port 3000
Kubernetes	Kustomize base + overlays for local and prod environments
Deployment Modes	Direct `kubectl apply` or full GitOps via ArgoCD
CI/CD	GitHub Actions and GitLab CI pipelines, ready to use
Infrastructure	Terraform (AWS EKS), OpenTofu (OCI OKE), Pulumi (AKS)
Observability	Prometheus, Grafana, Loki, Promtail, Node Exporter, kube-state-metrics
Security	Trivy image scanning with Prometheus metrics export
ML Pipelines	Metaflow, Prefect, Kubeflow, DVC — all wired together
Drift Detection	Evidently — HTML + JSON reports auto-generated
Data Profiling	WhyLabs continuous profiling via whylogs
Cluster Detection	Auto-detects Minikube, Kind, K3s, MicroK8s, EKS, GKE, AKS and adapts
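The Security row mentions exporting Trivy results as Prometheus metrics. The sketch below shows one way this could be done, assuming a report produced with `trivy image --format json --output report.json <image>`. The metric name `trivy_image_vulnerabilities` and the function names are illustrative and may not match what the repository's exporter emits.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Read a Trivy JSON report from disk."""
    with open(path) as fh:
        return json.load(fh)


def trivy_to_prometheus(report: dict, image: str) -> str:
    """Count vulnerabilities per severity and render Prometheus text format."""
    counts = Counter()
    for result in report.get("Results", []):
        # "Vulnerabilities" is missing or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    lines = [
        "# HELP trivy_image_vulnerabilities Vulnerabilities found by Trivy.",
        "# TYPE trivy_image_vulnerabilities gauge",
    ]
    for severity in sorted(counts):
        lines.append(
            f'trivy_image_vulnerabilities{{image="{image}",severity="{severity}"}} {counts[severity]}'
        )
    return "\n".join(lines) + "\n"
```

The resulting text could be served over HTTP or written to a file for a textfile-style collector; which of these the project uses is not stated here.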

Core Stack

Layer	Technology	Documentation
Shell Scripts	Shell automation scripts	scripts/linux_documentation.md
Application	FastAPI (Python)	app/app_documentation.md
Containerization	Docker / Podman	app/docker/docker_documentation.md
Orchestration	Kubernetes	app/k8s/documentation.md
CI/CD	GitHub Actions · GitLab CI · ArgoCD	-
Infrastructure	Terraform / OpenTofu / Pulumi	platform/infra/documentation.md
Monitoring	Prometheus + Grafana + Loki	monitoring/documentation.md
ML Pipelines	Metaflow · Prefect · lakeFS · Kubeflow · DVC	-
ML Tracking	Evidently · WhyLabs	-

Prerequisites

Ensure the following tools are installed:

Docker or Podman
kubectl
helm
Terraform / OpenTofu / Pulumi (for cloud deployment)
AWS CLI / Azure CLI / OCI CLI (for respective cloud targets)
A running Kubernetes cluster

Docker without sudo:

sudo usermod -aG docker $USER
newgrp docker

Quick Start

1. Clone and configure

git clone https://github.com/HiteshMondal/devops.git
cd devops

cp .env.example .env
nano .env

See .env.example for all available variables.

2. Launch

chmod +x run.sh
./run.sh

The runner will prompt you through:

Target environment  →  local | prod
Deployment mode     →  Full Platform | Custom Selection | ...

# If prod:
Cloud provider      →  aws | oci | azure
Infra action        →  plan | apply | destroy

It then auto-detects your container runtime and Kubernetes cluster, resolves dependencies between components, and executes everything in the correct order.
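Dependency resolution of this kind can be modelled as a topological sort. The following sketch uses Python's standard `graphlib`; the component names and the dependency map are hypothetical, loosely inferred from the component options (for example, the Kubernetes app needs an image, and the ML pipelines need the app), and do not describe run.sh's internals.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each component lists what must be deployed before it.
DEPENDS_ON = {
    "infra": set(),
    "image": set(),
    "app": {"image"},
    "monitoring": set(),
    "security_scan": {"image"},
    "mlops": {"app"},
}


def execution_order(selected):
    """Add transitive dependencies to the selection, then order everything."""
    needed = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(DEPENDS_ON[name])
    graph = {name: DEPENDS_ON[name] for name in needed}
    # static_order() yields every component after all of its dependencies.
    return list(TopologicalSorter(graph).static_order())
```

Selecting only `mlops` pulls in `app` and `image` and places them first. Components with no relationship to each other may come out in any relative order.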

Component Selection

When launching run.sh, you can deploy the full platform or choose individual components:

Option	Components
Full Platform	Everything
Infrastructure Only	Terraform / OpenTofu / Pulumi
Image Only	Build + push container image
Kubernetes Stack	Image + Kubernetes app
Monitoring Stack	Prometheus + Grafana + Loki + Trivy
App + Monitoring	Kubernetes app + full monitoring
MLOps Stack	Image + Kubernetes + ML pipelines
Custom Selection	Pick each component individually

Target Environments

Local Kubernetes

Distribution	Ingress	Service Type	Notes
Minikube	nginx (addon)	NodePort	Configures Docker env automatically
Kind	nginx (installed)	NodePort	Loads image directly into cluster
K3s	Traefik (built-in)	NodePort	Uses built-in ingress
MicroK8s	nginx (addon)	NodePort	Enables addons automatically
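One common heuristic for telling these distributions apart is the name of the current kubectl context. The sketch below relies on the default context names each tool writes to kubeconfig, which is an assumption: renamed contexts, custom Minikube profiles, K3s (whose default context is simply `default`), and AKS (which uses the cluster name) cannot be identified this way. It is not necessarily how run.sh performs detection.

```python
import re
import subprocess

# Default context-name patterns written by each tool.
PATTERNS = [
    ("minikube", re.compile(r"^minikube$")),
    ("kind", re.compile(r"^kind-")),
    ("microk8s", re.compile(r"^microk8s$")),
    ("eks", re.compile(r"^arn:aws:eks:")),
    ("gke", re.compile(r"^gke_")),
]


def current_context() -> str:
    """Ask kubectl for the active context name."""
    result = subprocess.run(
        ["kubectl", "config", "current-context"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def classify_context(context: str) -> str:
    """Map a context name to a distribution, or 'unknown' if nothing matches."""
    for name, pattern in PATTERNS:
        if pattern.search(context):
            return name
    return "unknown"
```

A more robust detector would probably also inspect node labels or the server version string for the cases that fall through to `unknown`.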

Production Cloud

Provider	IaC Tool	Cluster	Database
AWS	Terraform	EKS	RDS PostgreSQL
Oracle Cloud	OpenTofu	OKE	Autonomous DB (Always-Free)
Azure	Pulumi	AKS	PostgreSQL Flexible Server

Application

The app is a FastAPI service (app/src/main.py) running on port 3000.

Endpoint	Description
`GET /`	App info and environment
`GET /health`	Healthcheck (used by K8s probes)
`GET /predict`	Model inference placeholder
`GET /metrics/summary`	Basic request metrics

The image is built with a multi-stage Dockerfile — a builder stage compiles dependencies, a lean runtime stage runs as a non-root user. Compatible with both Docker and Podman.
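Once the service is reachable, for instance on http://localhost:3000, the four endpoints listed above can be smoke-tested with a few lines of standard-library Python. This is only a sketch: the helper names are invented for the example, and it assumes each endpoint answers a plain GET with status 200.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

ENDPOINTS = ("/", "/health", "/predict", "/metrics/summary")


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status of a GET request, or 0 if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # the server answered with a 4xx/5xx status
    except OSError:
        return 0  # connection refused, DNS failure, timeout, ...


def smoke_test(base_url: str, fetch=http_status) -> dict:
    """Request every known endpoint and collect the status codes."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) for path in ENDPOINTS}


def failures(results: dict) -> list:
    """List the endpoints that did not return 200."""
    return [path for path, status in results.items() if status != 200]
```

Calling `failures(smoke_test("http://localhost:3000"))` returns an empty list when all four endpoints respond with 200.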

Kubernetes Resources

Manifests are managed via Kustomize: a shared base in app/k8s/base/ plus per-environment overlays in app/k8s/overlays/.

Base (app/k8s/base/):

Resource	Purpose
`namespace.yaml`	Dedicated namespace isolation
`deployment.yaml`	App deployment with resource limits
`service.yaml`	ClusterIP / NodePort / LoadBalancer (auto-selected)
`ingress.yaml`	Ingress with configurable host and class
`hpa.yaml`	Horizontal Pod Autoscaler (min 2 → max 10 replicas)
`configmap.yaml`	Runtime configuration injection
`secrets.yaml`	DB credentials, JWT secret, API key
`model-pvc.yaml`	PersistentVolumeClaim for ML model artifacts

Prod overlay (app/k8s/overlays/prod/) adds NetworkPolicy and PodDisruptionBudget.
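To make the autoscaler bounds concrete, the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current replicas × current metric / target metric), and clamps the result to the 2 to 10 range from hpa.yaml. The metric values in the usage note are made-up inputs, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

With a hypothetical 60% CPU target, 2 replicas running at 90% would scale to 3. A large spike is capped at 10, and an idle deployment never drops below 2.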

Observability Stack

Prometheus + Grafana

Deployed to the monitoring namespace via monitoring/deploy_monitoring.sh.

kubectl port-forward svc/prometheus-grafana 3000:80 -n monitoring
# Open: http://localhost:3000

Dashboards are auto-provisioned from monitoring/dashboards/ via ConfigMap.

Loki + Promtail

Deployed via monitoring/loki/deploy_loki.sh. Promtail runs as a DaemonSet and ships pod logs to Loki.

Add Loki as a Grafana datasource:

http://loki.loki.svc.cluster.local:3100

A custom Loki 3.0 dashboard is included at monitoring/dashboards/devops-loki-dashboard.json.
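Outside Grafana, logs can also be fetched from Loki's HTTP API. The sketch below builds a request URL for the `/loki/api/v1/query_range` endpoint against the in-cluster address given above. The `namespace` label and its value in the usage note are assumptions, since the available labels depend on how Promtail is configured.

```python
from urllib.parse import urlencode

LOKI_URL = "http://loki.loki.svc.cluster.local:3100"


def query_range_url(logql: str, limit: int = 100, base_url: str = LOKI_URL) -> str:
    """Build a query_range URL; start/end are omitted so Loki uses its defaults."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"
```

For example, `query_range_url('{namespace="devops"} |= "error"', limit=20)` yields a URL that can be fetched from inside the cluster, or from a workstation after port-forwarding the Loki service and passing a local `base_url`.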

Drift Detection + Profiling

Tool	Trigger	Output
Evidently	`monitoring/deploy_monitoring.sh` or `mlops.sh drift`	HTML report + `drift_summary.json`
WhyLabs	`monitoring/deploy_monitoring.sh` (if `WHYLABS_ENABLED=true`)	Profile uploaded to WhyLabs dashboard

Both are controlled via .env:

EVIDENTLY_ENABLED=true
WHYLABS_ENABLED=true
WHYLABS_API_KEY=...
WHYLABS_ORG_ID=...
WHYLABS_DATASET_ID=...
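A small pre-flight check can catch an incomplete WhyLabs configuration before a deployment starts. The sketch below parses .env-style text and reports which WhyLabs variables are missing when `WHYLABS_ENABLED=true`. The parser is deliberately simple (no variable expansion or multi-line values), the helper names are invented, and the assumption that all three WhyLabs variables are required is inferred from the list above rather than confirmed by the scripts.

```python
WHYLABS_REQUIRED = ("WHYLABS_API_KEY", "WHYLABS_ORG_ID", "WHYLABS_DATASET_ID")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def is_enabled(env: dict, key: str) -> bool:
    """Treat only the literal value 'true' (any case) as enabled."""
    return env.get(key, "false").lower() == "true"


def missing_whylabs_settings(env: dict) -> list:
    """Return required WhyLabs keys that are unset while profiling is enabled."""
    if not is_enabled(env, "WHYLABS_ENABLED"):
        return []
    return [key for key in WHYLABS_REQUIRED if not env.get(key)]
```

An empty list from `missing_whylabs_settings(parse_env(open(".env").read()))` means the configuration is either complete or WhyLabs is disabled.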

CI/CD Pipelines

GitHub Actions

Triggers on every push to main. Configure these secrets under Settings → Secrets and variables → Actions:

DOCKERHUB_USERNAME
DOCKERHUB_TOKEN
KUBECONFIG          ← base64-encoded kubeconfig
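Because the KUBECONFIG secret must be base64-encoded, it can help to produce and verify the value locally before pasting it into the repository settings. This is a minimal sketch with invented function names. It strips whitespace before decoding because some base64 tools wrap their output across lines.

```python
import base64


def encode_kubeconfig(raw: bytes) -> str:
    """Encode kubeconfig bytes as a single-line base64 string."""
    return base64.b64encode(raw).decode("ascii")


def decode_kubeconfig(secret: str) -> bytes:
    """Decode the secret, tolerating line wraps and surrounding whitespace."""
    compact = "".join(secret.split())
    return base64.b64decode(compact, validate=True)
```

A round trip such as `decode_kubeconfig(encode_kubeconfig(data)) == data` confirms the value survived encoding intact.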

GitLab CI

Runs the same stages as the GitHub Actions pipeline. Configure its variables under Settings → CI/CD → Variables.

Cleanup

./scripts/reset.sh

Deletes containers, local Kubernetes cluster state, and networks.

Project Structure

.
├── run.sh                          # Main orchestrator
├── scripts/
│   ├── install.sh                  # Dependency installer
│   └── reset.sh                    # Cleanup script
├── app/
│   ├── src/                        # FastAPI application
│   ├── k8s/                        # Kubernetes manifests (Kustomize)
│   └── docker/                     # Docker build + compose
├── ml/
│   ├── configs/                    # ML configuration YAMLs
│   ├── data/                       # Raw, processed, features
│   ├── models/artifacts/           # Trained model + metrics
│   ├── pipelines/                  # DVC, Metaflow, Prefect, Kubeflow
│   └── experiments/                # Comet, MLflow
├── monitoring/
│   ├── prometheus_grafana/         # kube-prometheus-stack values
│   ├── loki/                       # Loki Kustomize overlays
│   ├── evidently/                  # Drift detection + reports
│   ├── whylabs/                    # Continuous data profiling
│   ├── trivy/                      # Security scanning
│   └── dashboards/                 # Pre-built Grafana dashboard JSONs
└── platform/
    ├── cicd/                       # GitHub, GitLab, ArgoCD configs
    ├── infra/                      # Terraform / OpenTofu / Pulumi
    ├── lib/                        # Shared shell library (logging, colors)
    └── mlops/                      # MLOps runner + validator

Author

Hitesh Mondal — DevOps · Cloud · MLOps · Cybersecurity

License

Open for learning and demonstration purposes.

