Horizontal Pod Autoscaler
k autoscale deployment app --cpu-percent=50 --min=1 --max=5
Custom Load Testing App
main.py creates two types of load:
/: CPU load
/mem: memory load
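One way the two endpoints might generate load, as plain functions (names and parameters here are illustrative, not main.py's actual code):

```python
import math
import time

def cpu_load(seconds: float = 1.0) -> int:
    """Burn CPU for roughly `seconds` by doing useless math."""
    iterations = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        math.sqrt(iterations)
        iterations += 1
    return iterations

def mem_load(megabytes: int = 100) -> bytearray:
    """Allocate roughly `megabytes` MiB and keep a reference so it stays resident."""
    return bytearray(megabytes * 1024 * 1024)
```

In the real app these would sit behind the `/` and `/mem` routes so hitting an endpoint repeatedly drives CPU or memory usage up and triggers the HPA.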
To build and use the image:
# build the image
docker build -t noneofyabusiness/fastapi-hpa:local .
# load image into kind
kind load docker-image noneofyabusiness/fastapi-hpa:local
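For a CPU `Utilization` target to work, the pods must declare a CPU request (utilization is computed as a percentage of the request). A sketch of a matching Deployment (names, labels, and values are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fastapi-hpa
  template:
    metadata:
      labels:
        app: fastapi-hpa
    spec:
      containers:
      - name: app
        image: noneofyabusiness/fastapi-hpa:local
        imagePullPolicy: Never   # use the image loaded into kind, don't pull
        resources:
          requests:
            cpu: 200m            # HPA computes utilization against this request
            memory: 128Mi
```

Without `resources.requests.cpu`, the HPA reports `<unknown>` for the CPU metric and never scales.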
Example HPA manifest
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Other metric examples
Memory resource metric
- type: Resource
  resource:
    name: memory
    target:
      type: AverageValue
      averageValue: 500Mi
Pods metric
- type: Pods
  pods:
    metric:
      name: packets-per-second
    target:
      type: AverageValue
      averageValue: 1k
Note: Pod metrics work much like resource metrics, except they only support a target type of AverageValue.
Object metric
- type: Object
  object:
    metric:
      name: requests-per-second
    describedObject:
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      name: main-route
    target:
      type: Value
      value: 2k
Note: Object metrics describe a different object in the same namespace instead of describing pods. The metric is not necessarily fetched from that object; it only describes it.
TODO: Custom metrics with Prometheus + KEDA (implement later)
Custom metrics (the Pods and Object types above) require extra infrastructure: metrics-server only serves CPU and memory, so the native HPA can't read custom metrics on its own.
How the stack works
App (/metrics endpoint, Prometheus format)
↓ scraped by
Prometheus
↓ queried by
prometheus-adapter OR KEDA
↓ served to
Kubernetes HPA / KEDA ScaledObject
KEDA (Kubernetes Event-Driven Autoscaler) is an easier alternative to setting up prometheus-adapter manually. It installs as a CRD and handles the bridge between Prometheus (or other sources like Kafka, RabbitMQ, etc.) and the HPA for you.
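A sketch of what the KEDA side could look like: a ScaledObject with a Prometheus trigger (the target name, Prometheus address, query, and threshold are all assumptions for this setup):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: app-scaler
spec:
  scaleTargetRef:
    name: app                     # the Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      query: sum(rate(app_requests_total[1m]))
      threshold: "10"             # target value per replica
```

Under the hood KEDA creates and manages an HPA for you, so you don't write the HPA manifest yourself for this metric.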
What the app needs to expose
The app must expose a /metrics endpoint in Prometheus text format. Use prometheus_client (Python):
from fastapi import FastAPI
from prometheus_client import Counter, make_asgi_app

app = FastAPI()
REQUEST_COUNT = Counter("app_requests_total", "Total requests", ["endpoint"])
# mount at /metrics so Prometheus can scrape it
app.mount("/metrics", make_asgi_app())
Each endpoint increments the counter. Prometheus then computes rate(app_requests_total[1m]) → requests/sec.
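To make rate() concrete: it is the per-second increase of the counter over the window. A toy calculation with invented sample values:

```python
# (timestamp_seconds, counter_value) samples scraped over a 1-minute window
samples = [(0, 120), (30, 180), (60, 240)]

# rate(app_requests_total[1m]) is roughly (last - first) / elapsed_seconds
(t0, v0), (tn, vn) = samples[0], samples[-1]
rate = (vn - v0) / (tn - t0)
print(rate)  # 2.0 requests per second
```

(Real Prometheus rate() also handles counter resets, but the idea is the same.)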