Vertical Pod Autoscaler (VPA) in IT Common Platform#

The Vertical Pod Autoscaler (VPA) provides recommendations for CPU and memory requests based on observed usage and can automatically apply them.

How VPA Calculates Recommendations#

For each container, VPA calculates:

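lowerBound: the minimum recommended request; a request below this value is likely too small for the container.
target: the request VPA recommends, and the value it applies when an update mode other than Off is used.
upperBound: the maximum recommended request; resources requested beyond this value are likely wasted.
uncappedTarget: the recommendation based purely on observed usage, ignoring the minAllowed and maxAllowed constraints. It is informational only and is never applied, but it shows what the container would ideally need if no constraints were set.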

These values can be seen in Headlamp under Configuration → VPAs → name of the VPA → Recommendations.
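
The same four values are also stored on the VPA object itself, under `status.recommendation.containerRecommendations`, and can be read with `kubectl get vpa <name> -n <namespace> -o yaml`. The fragment below is a sketch of what that part of the object looks like. The container name and every quantity are made up for illustration and are not measurements from this platform.

```yaml
# Illustrative status fragment of a VPA object (all values are hypothetical).
status:
  recommendation:
    containerRecommendations:
      - containerName: headlamp   # assumed container name
        lowerBound:               # below this, the container is likely under-provisioned
          cpu: 25m
          memory: 64Mi
        target:                   # the request VPA recommends
          cpu: 50m
          memory: 128Mi
        uncappedTarget:           # target before minAllowed/maxAllowed are applied
          cpu: 50m
          memory: 128Mi
        upperBound:               # above this, resources are likely wasted
          cpu: 200m
          memory: 512Mi
```

When target and uncappedTarget differ, a minAllowed or maxAllowed constraint is limiting the recommendation.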

Update Modes (`spec.updatePolicy.updateMode`)#

These modes control how and when VPA applies recommendations. Note that Auto is deprecated.

| Mode | Description |
| --- | --- |
| `Off` | Only computes recommendations and stores them; does not modify any pods. |
| `Initial` | Applies recommendations only when pods are created. Once running, pods are not modified. |
| `Recreate` | Applies recommendations at pod creation and may evict running pods whose requests differ significantly from the recommendation; the workload controller then recreates them with the updated requests. |
| `InPlaceOrRecreate` | Attempts non-disruptive in-place resizing first; if that is not possible, falls back to evicting the pod as in `Recreate`. Requires a Kubernetes version that supports in-place vertical scaling. |
| `Auto` (Deprecated) | Currently behaves the same as `Recreate`. Use is discouraged; choose `Recreate` or `InPlaceOrRecreate` explicitly. |

Examples#

Example 1: Recommendation-Only Mode (`Off`)#

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: it-common-platform-headlamp-recommend
  namespace: it-common-platform-headlamp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: headlamp
  updatePolicy:
    updateMode: "Off"  # quoted so YAML does not parse Off as the boolean false

No changes to running pods; only recommendations are collected.
Use this mode when you want visibility first without risk.

Example 2: InPlaceOrRecreate Mode#

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: it-common-platform-headlamp-inplace
  namespace: it-common-platform-headlamp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: headlamp
  updatePolicy:
    updateMode: InPlaceOrRecreate

Tries to update resource requests in-place without killing pods.
Falls back to recreating pods if in-place update is not possible.
Requires a Kubernetes version that supports in-place vertical scaling.
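
Either example can also bound what VPA recommends and applies by adding the minAllowed and maxAllowed constraints under `spec.resourcePolicy.containerPolicies`. The following is a minimal sketch, not the platform's actual configuration: the VPA name `it-common-platform-headlamp-bounded`, the container name `headlamp`, and all quantities are assumptions chosen for illustration.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: it-common-platform-headlamp-bounded   # hypothetical name
  namespace: it-common-platform-headlamp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: headlamp
  updatePolicy:
    updateMode: InPlaceOrRecreate
  resourcePolicy:
    containerPolicies:
      - containerName: headlamp   # assumed container name
        minAllowed:               # target is never set below these values (illustrative)
          cpu: 50m
          memory: 64Mi
        maxAllowed:               # target is never set above these values (illustrative)
          cpu: "1"
          memory: 1Gi
```

With these bounds in place, target stays inside the allowed range while uncappedTarget continues to report the unconstrained recommendation.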

Viewing Recommendations in Headlamp#

Open Headlamp.
Navigate to Configuration → VPAs.
Select the VPA object for your workload (for example, it-common-platform-headlamp-recommend).
Under Recommendations, review lowerBound, target, upperBound, and uncappedTarget for each container.

Best Practices#

Start with Off mode to collect recommendations before applying changes.
Use Recreate for workloads where pod restarts are acceptable and that run multiple replicas protected by a PodDisruptionBudget.
Use InPlaceOrRecreate for less disruptive updates if your cluster supports in-place vertical scaling.
Avoid Auto since it is deprecated.
Define initial resource requests and limits on every container before enabling VPA.
Treat target as the guide for requests, and keep manually set requests between lowerBound and upperBound.
Review memory recommendations carefully; under-provisioned memory can lead to containers being OOM-killed.
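
The Recreate guidance above can be sketched as a PodDisruptionBudget paired with a VPA. This is an illustration under stated assumptions: the object names `headlamp-pdb` and `it-common-platform-headlamp-recreate` and the pod label are hypothetical, and `minReplicas: 2` is an example value rather than a platform requirement.

```yaml
# Keep at least one pod available while VPA evicts pods to apply new requests.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: headlamp-pdb                        # hypothetical name
  namespace: it-common-platform-headlamp
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: headlamp      # assumed pod label
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: it-common-platform-headlamp-recreate   # hypothetical name
  namespace: it-common-platform-headlamp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: headlamp
  updatePolicy:
    updateMode: Recreate
    minReplicas: 2   # VPA only evicts when at least this many replicas are alive
```

VPA removes pods through the eviction API, so the PodDisruptionBudget limits how many pods can be taken down at the same time.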

VPA Update Mode Comparison#

| Mode | Disruption Risk | Kubernetes Version Requirement | Ideal Use Case |
| --- | --- | --- | --- |
| `Off` | None | Any | Collect recommendations safely without changing pods. |
| `Initial` | Low (applies only at pod creation) | Any | Jobs or workloads where pods are short-lived or can be created fresh. |
| `Recreate` | High (pods may be evicted) | Any | Stable workloads where pod restarts are acceptable; multiple replicas and PDBs recommended. |
| `InPlaceOrRecreate` | Medium (tries in-place first, falls back to recreate) | Kubernetes 1.33 or later with in-place pod resize (`InPlacePodVerticalScaling`) enabled, plus a VPA release that supports this mode | Minimize disruption while updating resources; beta feature. |
| `Auto` (Deprecated) | High | Any (deprecated) | Avoid using; currently behaves the same as `Recreate`. |

API Reference