Kubernetes PVC Cleanup: Find Persistent Volumes Nobody Uses

Kubernetes PVC cleanup should start with the claim, the mounted workload, and the storage class, not with a list of expensive disks. A persistent volume claim can look abandoned because the pod is gone, but the data may still be needed for rollback, audit, migration, or a batch job that has not run yet.

The useful output is a PVC retirement decision: who owns the data, what mounted it, when it was last written, whether a backup or export exists, what reversible action comes first, and what rule prevents the same orphaned storage from returning. PVC cleanup is cost cleanup, but it is also data-risk cleanup.

Key takeaways

  • Treat every PVC as data until an owner proves it is disposable.
  • Check mounts, StatefulSets, Jobs, storage class behavior, reclaim policy, snapshots, and restore expectations before deleting anything.
  • Prefer detach, snapshot, archive, or expiry review before final removal.
  • Record the evidence in the same place the workload is managed, especially if GitOps or Helm can recreate the claim.
  • Prevent future waste by changing how stateful workloads request storage.

Separate Unmounted From Disposable

An unmounted PVC is only a candidate. It is not proof that the data is safe to remove. A PVC can be temporarily unmounted during a migration, a failed rollout, a restore exercise, or a StatefulSet rename. Cleanup starts by separating “not mounted now” from “no longer needed.”

Use a review table that speaks in storage terms, not generic asset terms.

| Field | What to record |
|---|---|
| Claim | Namespace, PVC name, requested size, access mode, storage class, and bound PV |
| Mount path | Pod, StatefulSet, Deployment, Job, or CronJob that mounted the claim |
| Data role | Cache, queue, upload staging, database files, search index, artifact store, or unknown |
| Last write evidence | Application metric, filesystem timestamp, database checkpoint, backup log, or owner confirmation |
| Reclaim behavior | Whether deleting the claim retains or deletes the underlying volume |
| First action | Keep, resize, snapshot, detach, archive, mark expiring, or remove after approval |

This table keeps the review honest. A cache volume with a rebuild path is different from a database volume with the only copy of production-like test data.
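The Claim fields of this table can be filled mechanically from `kubectl get pvc --all-namespaces -o json`. The sketch below is one possible way to do that in Python; the `review_row` and `to_csv` helpers and the column names are illustrative assumptions, not part of any tool. The evidence columns are left blank on purpose, because Kubernetes cannot fill them.

```python
import csv
import io

# Columns Kubernetes can fill, followed by columns a reviewer must fill.
COLUMNS = [
    "namespace", "name", "requested", "access_modes", "storage_class",
    "bound_pv", "mounted_by", "data_role", "last_write_evidence",
    "reclaim_behavior", "first_action",
]

def review_row(pvc: dict) -> dict:
    """Build one review-table row from a PVC object (kubectl JSON shape)."""
    meta = pvc.get("metadata", {})
    spec = pvc.get("spec", {})
    row = dict.fromkeys(COLUMNS, "")  # reviewer columns start empty
    row.update({
        "namespace": meta.get("namespace", ""),
        "name": meta.get("name", ""),
        "requested": spec.get("resources", {}).get("requests", {}).get("storage", ""),
        "access_modes": ",".join(spec.get("accessModes", [])),
        "storage_class": spec.get("storageClassName", ""),
        "bound_pv": spec.get("volumeName", ""),
    })
    return row

def to_csv(pvc_list: dict) -> str:
    """Render every PVC in a kubectl list object as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    for pvc in pvc_list.get("items", []):
        writer.writerow(review_row(pvc))
    return buf.getvalue()

# Hypothetical sample in the shape of `kubectl get pvc -A -o json`.
sample = {"items": [{
    "metadata": {"namespace": "staging", "name": "uploads-cache"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "50Gi"}},
        "storageClassName": "standard",
        "volumeName": "pv-0001",
    },
}]}
print(to_csv(sample))
```

Keeping the reviewer columns in the same file as the exported ones makes it obvious which rows still lack an owner or a data role.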

Evidence That A PVC Has No Current Consumer

The strongest PVC evidence connects Kubernetes objects to application reality. Kubernetes can show a bound claim, pods that reference it, events, storage class, and metadata. It does not know whether the data still has legal, debugging, or migration value.

| Check | What to look for | Cleanup signal |
|---|---|---|
| Pod mounts | spec.volumes[].persistentVolumeClaim.claimName, mounted paths, and recent pod restarts | No active or expected pod references the claim |
| Controller source | StatefulSet volume claim templates, Helm values, GitOps manifests, and release history | The claim is not recreated by current deployment config |
| Bound volume | PV reclaim policy, storage class, capacity, zone, and CSI driver | Final action and recovery path are understood |
| Data protection | Snapshots, backup jobs, restore tests, retention policy, and export location | Data is retained elsewhere or explicitly disposable |
| Workload schedule | CronJobs, paused Jobs, migration plans, and release calendars | Quiet period covers the real use pattern |
| Owner review | Service owner, data owner, platform owner, or product owner signoff | Somebody accountable accepts the risk |

Avoid single-metric decisions. Low I/O can mean “unused”, but it can also mean “cold data”, “ready for restore”, or “only written during incidents.” A missing pod can mean “orphaned”, but it can also mean “StatefulSet temporarily removed during a failed deploy.”

When evidence conflicts, choose an intermediate state: snapshot, mark for expiry, attach an owner, or create a ticket with a short review window.
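The Controller source check can be partly automated, because a StatefulSet names the claims it creates from its volume claim templates as `<template>-<statefulset>-<ordinal>`. The following is a minimal sketch that compares a claim name against one StatefulSet object from `kubectl get statefulsets -o json`. The helper names are hypothetical, and the sketch assumes ordinals start at 0 (it ignores a custom `spec.ordinals.start`).

```python
def expected_claims(sts: dict) -> set[str]:
    """Claim names the StatefulSet's current spec would create or reuse."""
    name = sts["metadata"]["name"]
    spec = sts.get("spec", {})
    replicas = spec.get("replicas", 1)
    templates = spec.get("volumeClaimTemplates", [])
    return {
        f"{t['metadata']['name']}-{name}-{i}"
        for t in templates
        for i in range(replicas)
    }

def classify(claim: str, sts: dict) -> str:
    """Relate one claim name to one StatefulSet."""
    if claim in expected_claims(sts):
        return "expected"  # current config mounts or recreates this claim
    name = sts["metadata"]["name"]
    for t in sts.get("spec", {}).get("volumeClaimTemplates", []):
        prefix = f"{t['metadata']['name']}-{name}-"
        if claim.startswith(prefix) and claim[len(prefix):].isdigit():
            return "beyond-replicas"  # ordinal above the current replica count
    return "unrelated"

# Hypothetical StatefulSet with two replicas and one claim template.
sts = {
    "metadata": {"name": "db"},
    "spec": {"replicas": 2, "volumeClaimTemplates": [{"metadata": {"name": "data"}}]},
}
for claim in ("data-db-0", "data-db-3", "old-export"):
    print(claim, classify(claim, sts))
```

A "beyond-replicas" result is still only a candidate: by default a StatefulSet keeps those claims after a scale-down and remounts them if it scales back up.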

Read-Only PVC Scan

Use read-only kubectl commands to build the candidate list. kubectl get can list objects across all namespaces, print wide or JSON output, and filter with label and field selectors; kubectl describe gives a detailed view of a single object.

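```shell
# Candidate list: every claim and every volume
kubectl get pvc --all-namespaces
kubectl get pv -o wide
# Pod specs, for finding claimName references
kubectl get pods --all-namespaces -o json
# Detail and recent events for one claim
kubectl describe pvc "$PVC_NAME" -n "$NAMESPACE"
kubectl get events -n "$NAMESPACE" --field-selector "involvedObject.name=$PVC_NAME"
```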

The JSON pod output is useful for finding claimName references in workload volumes. It does not prove the underlying filesystem is empty, backed up, or safe to discard. Pair it with application and backup evidence.
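One way to turn that JSON into a candidate list is to collect every claimName that pods reference and subtract it from the full set of claims. The sketch below assumes the two inputs are the parsed output of `kubectl get pods -A -o json` and `kubectl get pvc -A -o json`; the function names are illustrative. It only looks at `persistentVolumeClaim` volume sources, so claims created for generic ephemeral volumes would need separate handling.

```python
def referenced_claims(pods: dict) -> set[tuple[str, str]]:
    """Return (namespace, claimName) pairs referenced by any pod volume."""
    refs = set()
    for pod in pods.get("items", []):
        namespace = pod["metadata"]["namespace"]
        for volume in pod.get("spec", {}).get("volumes") or []:
            source = volume.get("persistentVolumeClaim")
            if source:
                refs.add((namespace, source["claimName"]))
    return refs

def unmounted_claims(pvcs: dict, pods: dict) -> list[tuple[str, str]]:
    """Claims that no pod currently references. Candidates only, not verdicts."""
    refs = referenced_claims(pods)
    found = []
    for pvc in pvcs.get("items", []):
        key = (pvc["metadata"]["namespace"], pvc["metadata"]["name"])
        if key not in refs:
            found.append(key)
    return sorted(found)

# Hypothetical sample data in the kubectl JSON list shape.
pods = {"items": [{
    "metadata": {"namespace": "prod", "name": "db-0"},
    "spec": {"volumes": [
        {"name": "data", "persistentVolumeClaim": {"claimName": "data-db-0"}},
        {"name": "tmp", "emptyDir": {}},
    ]},
}]}
pvcs = {"items": [
    {"metadata": {"namespace": "prod", "name": "data-db-0"}},
    {"metadata": {"namespace": "prod", "name": "old-export"}},
    {"metadata": {"namespace": "staging", "name": "data-db-0"}},
]}
print(unmounted_claims(pvcs, pods))
```

Matching on the (namespace, name) pair matters: a claim with the same name in another namespace is a different object.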

Decide Between Resize, Archive, And Removal

PVC cleanup does not always mean deletion. The correct action depends on data role, confidence, and reversibility.

| Situation | Better first move | Why |
|---|---|---|
| Oversized active PVC | Create a smaller replacement claim and migrate the data; Kubernetes can expand a bound PVC when the storage class allows it, but it does not support shrinking one | The workload still needs state, just not that much |
| Unmounted cache volume | Mark expiring, confirm rebuild path, then remove after owner approval | Cache loss is usually recoverable but still operationally noisy |
| Old migration volume | Snapshot or archive metadata, then delete after the migration owner signs off | Migration data can be useful for rollback or audit |
| Database-like files | Verify backup, restore test, retention requirement, and application owner approval | Data loss risk dominates storage savings |
| Unknown owner | Label, ticket, and quarantine with a review date | Lack of ownership is not deletion evidence |

Track the cleanup candidate with a simple priority score:

| Score | Good sign | Bad sign |
|---|---|---|
| Impact | Meaningful spend, risk, toil, noise, or confusion disappears | The item is cheap and low-risk but politically distracting |
| Confidence | Owner, purpose, and dependency path are understood | The team is guessing from age or name |
| Reversibility | Restore, recreate, re-enable, or rollback path exists | Deletion would be the first real test |
| Prevention | A rule can stop recurrence | The same pattern will return next month |

Start with high-impact, high-confidence, reversible candidates. Defer confusing items only if they get an owner and a date; otherwise “defer” becomes another word for keeping waste permanently.
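The four factors can be turned into a sortable number in many ways. The sketch below assumes each factor is rated from 0 (bad sign) to 3 (good sign) by the reviewer and simply sums them; the equal weighting, the rating scale, and the `Candidate` class are assumptions made for illustration, not a scoring method the guide prescribes.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    impact: int         # 0-3: how much spend, risk, or toil goes away
    confidence: int     # 0-3: how well owner, purpose, dependencies are known
    reversibility: int  # 0-3: how easily the action can be undone
    prevention: int     # 0-3: whether a rule can stop recurrence

    def score(self) -> int:
        """Equal-weight sum of the four ratings (an illustrative choice)."""
        return self.impact + self.confidence + self.reversibility + self.prevention

def review_order(candidates: list[Candidate]) -> list[Candidate]:
    """Highest score first; ties keep their input order."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)

# Hypothetical candidates.
candidates = [
    Candidate("staging/uploads-cache", impact=2, confidence=3, reversibility=3, prevention=2),
    Candidate("prod/legacy-db-data", impact=3, confidence=1, reversibility=0, prevention=1),
    Candidate("dev/scratch", impact=1, confidence=2, reversibility=3, prevention=3),
]
for c in review_order(candidates):
    print(c.score(), c.name)
```

A low score is a prompt to gather evidence, not permission to ignore the item; the owner-and-date rule still applies.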

PVC Cases That Need Patience

Some cleanup candidates are supposed to look quiet. Do not rush these cases:

  • StatefulSet PVCs after a failed rollout, rename, or chart migration.
  • Claims used by CronJobs that run monthly, quarterly, or after delayed upstream delivery.
  • Volumes containing database files, uploaded customer assets, search indexes, or queues.
  • Claims with Retain reclaim behavior where deleting the PVC leaves a PV that still needs a plan.
  • Claims in regulated, audit, incident, or security-analysis namespaces.

For these cases, use a longer observation window, explicit owner approval, and a staged reduction. The point is not to avoid cleanup; it is to avoid making the first proof of dependency an outage.
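The Retain case is worth checking from the volume side as well. When a claim is deleted and its volume uses the Retain policy, the PV moves to the Released phase and keeps a reference to the former claim. The sketch below lists such volumes from the parsed output of `kubectl get pv -o json`; the `retained_leftovers` helper is a hypothetical name.

```python
def retained_leftovers(pvs: dict) -> list[dict]:
    """PVs with reclaim policy Retain that are Released from a deleted claim."""
    leftovers = []
    for pv in pvs.get("items", []):
        spec = pv.get("spec", {})
        phase = pv.get("status", {}).get("phase")
        if spec.get("persistentVolumeReclaimPolicy") == "Retain" and phase == "Released":
            ref = spec.get("claimRef", {})
            leftovers.append({
                "pv": pv["metadata"]["name"],
                "capacity": spec.get("capacity", {}).get("storage", ""),
                "former_claim": f"{ref.get('namespace', '')}/{ref.get('name', '')}",
            })
    return leftovers

# Hypothetical sample in the shape of `kubectl get pv -o json`.
pvs = {"items": [
    {"metadata": {"name": "pv-a"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain",
              "capacity": {"storage": "100Gi"},
              "claimRef": {"namespace": "prod", "name": "old-export"}},
     "status": {"phase": "Released"}},
    {"metadata": {"name": "pv-b"},
     "spec": {"persistentVolumeReclaimPolicy": "Delete"},
     "status": {"phase": "Bound"}},
    {"metadata": {"name": "pv-c"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Bound"}},
]}
print(retained_leftovers(pvs))
```

Each result is a storage object that still costs money and still holds data, so it needs the same owner and retention review as the claim it came from.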

Run The PVC Review

Run Kubernetes PVC cleanup as a data retirement review, not an open-ended cluster hygiene project.

  1. Export PVCs, PVs, storage classes, and pod volume references for one cluster or namespace group.
  2. Add owner, data role, mount history, reclaim behavior, backup evidence, and risk if wrong.
  3. Remove false positives such as active StatefulSet claims and restore targets.
  4. Ask owners to choose keep, resize, snapshot, archive, expire, remove, or investigate.
  5. Apply the least permanent useful action first and record the watch signal.
  6. Complete final removal only after the review window covers the workload’s real schedule.
  7. Save the evidence with the workload manifest, Helm release, GitOps app, or platform ticket.

For broader cleanup planning, pair this guide with the related notes in the cleanup library, and use the main cloud cost optimization checklist to decide whether the cleanup work has enough upside for a focused sprint.

Prevent Orphaned PVCs At Creation Time

Prevention should change how stateful workloads request storage. Owner labels help, but PVC waste usually returns when teams can create durable storage without declaring its data role, backup plan, and retirement path.

  • Require PVC labels for owner, data role, environment, retention class, and backup policy.
  • Make temporary environments use storage classes and quotas designed for short-lived data.
  • Add Helm or GitOps review checks for large requested sizes and missing expiry metadata.
  • Document whether each stateful workload can rebuild, restore, or safely discard its volume.
  • Put PVC age, requested size, storage class, and mount status into the platform dashboard.
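The label requirement in the first bullet could be enforced with a small check in a Helm or GitOps review step. The following sketch assumes the manifest has already been parsed into a dictionary; the label keys in `REQUIRED_LABELS` are hypothetical names for the five properties listed above and should be replaced with whatever convention the platform team adopts.

```python
# Hypothetical label keys for owner, data role, environment,
# retention class, and backup policy.
REQUIRED_LABELS = ("owner", "data-role", "environment", "retention-class", "backup-policy")

def missing_labels(pvc: dict) -> list[str]:
    """Return required label keys that are absent or empty on a PVC manifest."""
    labels = pvc.get("metadata", {}).get("labels") or {}
    return [key for key in REQUIRED_LABELS if not labels.get(key)]

# Hypothetical manifest that declares only two of the five labels.
manifest = {
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "name": "uploads-cache",
        "labels": {"owner": "team-media", "environment": "staging"},
    },
}
problems = missing_labels(manifest)
if problems:
    print("missing labels:", ", ".join(problems))
```

Failing the review when the list is non-empty moves the ownership question to creation time, when the author still knows the answer.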

The recurring review should be short: sort by impact, pick the unclear items, assign owners, and close the loop on anything nobody claims. If the review keeps producing the same class of candidate, fix the creation path instead of celebrating repeated cleanup.

Example Decision Record

Use a compact record so the cleanup can be reviewed later without reconstructing the whole investigation.

| Field | Example entry for this cleanup |
|---|---|
| Candidate | Unused persistent volume claims in Kubernetes clusters |
| Why it looked stale | Bound claim with no current pod mount, old namespace, oversized request, or completed migration |
| Evidence checked | Pod volume references, StatefulSet templates, PV reclaim policy, storage class, backups, snapshots, and owner signoff |
| First reversible move | Snapshot, archive metadata, mark expiring, or detach from a retired workload |
| Watch signal | Restore request, failed batch job, missing uploaded file, application error, or owner complaint |
| Final action | Remove only after backup and retention checks match the data role |
| Prevention rule | Require owner, data role, retention class, backup policy, and expiry metadata for new claims |

This record is intentionally small. If the decision needs a long narrative, the candidate is probably not ready for removal yet. Keep investigating until the owner, evidence, reversible move, and prevention rule are clear.
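If the record is stored as structured data, the "keep investigating" rule can be checked mechanically. The sketch below is one possible shape, assuming a Python dataclass serialized to JSON; the `DecisionRecord` class, its field names, and the `open_questions` helper are illustrative and not a format the guide mandates.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class DecisionRecord:
    candidate: str
    owner: str = ""
    evidence: list[str] = field(default_factory=list)
    first_reversible_move: str = ""
    watch_signal: str = ""
    final_action: str = ""
    prevention_rule: str = ""

# Fields that must be filled before removal is considered.
REQUIRED_BEFORE_REMOVAL = ("owner", "evidence", "first_reversible_move", "prevention_rule")

def open_questions(record: DecisionRecord) -> list[str]:
    """Names of required fields that are still empty."""
    return [name for name in REQUIRED_BEFORE_REMOVAL if not getattr(record, name)]

# Hypothetical record with no owner yet.
record = DecisionRecord(
    candidate="prod/old-export",
    evidence=["no pod references", "snapshot exists"],
    first_reversible_move="mark expiring",
    prevention_rule="require owner label on new claims",
)
print(json.dumps(asdict(record), indent=2))
print("not ready, missing:", open_questions(record))
```

The JSON form can be stored next to the workload manifest or attached to the platform ticket, whichever the team already uses for evidence.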

FAQ

How often should teams review Kubernetes PVCs?

Use a window long enough to include batch schedules, traffic peaks, and deployment cycles for the first decision, then set a recurring cadence based on change rate. Fast-moving non-production systems may need monthly review; slower systems can be quarterly if every unclear item has an owner and a review date.

What is the safest first action for an unused PVC?

The safest first action is usually to assign or confirm an owner and collect mount evidence. After that, snapshot the volume or mark the claim for expiry before final deletion, especially when the data role is unclear.

What should not be removed quickly?

Do not rush claims with database-like files, customer uploads, queue state, audit data, restore targets, rare batch workloads, or Retain reclaim behavior that leaves another storage object to manage.

How do you make the decision useful later?

Write the decision as a small operational record: candidate, owner, evidence, chosen action, watch signals, rollback path, final date, and prevention rule. That format helps future engineers, search engines, and AI assistants understand the cleanup without guessing.