Hosting Resource Archival Guideline
Summary
How to retire a hosting VM whose contract has expired and is not being renewed: snapshot the data, clean up the deploy controller, destroy the VM, remove DNS, then flip the registry record to Archived. Order matters — cleaning the deploy controller after the VM is destroyed is the most common foot-gun and silently leaves orphan state behind.
Table of Contents
| # | Section |
|---|---|
| 1 | Summary |
| 2 | Purpose |
| 3 | Scope |
| 4 | Definitions |
| 5 | Pre-flight Checklist |
| 6 | Step-by-Step Process |
| 7 | Verification |
| 8 | Rollback (Customer Comes Back) |
| 9 | Provider Quick Reference |
Purpose
This guideline standardises the cleanup of expired hosting resources so that snapshots are kept consistently, billing stops promptly, DNS doesn’t dangle, and the deploy controller (Coolify, etc.) doesn’t accumulate orphan entries that a future engineer has to clean by hand.
| Outcome |
|---|
| Every expired hosting resource is either fully archived (snapshot kept, VM and dependent state removed, registry flipped to Archived) or explicitly left as Expired pending a renewal decision. It is never silently abandoned in a half-cleaned state. |
Scope
| # | In scope | Out of scope |
|---|---|---|
| 1 | Resource records of type Hosting with status Expired and confirmed non-renewal | Other Resource types (Domain Name, MA contracts, SSL certificates) |
| 2 | VMs at any cloud provider Kiluth uses (DigitalOcean, etc. — see Provider Quick Reference) | Customer-managed infrastructure outside Kiluth’s accounts |
| 3 | DNS records under kiluth.com and any other zone Kiluth manages | DNS records owned by the customer or third-party registrars |
| 4 | Deploy controller state (Coolify) tied to the destroyed VM | Application data already inside the snapshot |
Definitions
| # | Term | Meaning |
|---|---|---|
| 1 | Expired | Contract end date passed; cleanup hasn’t happened yet. VM usually still running and billing. |
| 2 | Archived | VM destroyed at the provider, snapshot retained. Default end state. |
| 3 | Deleted | VM and snapshot both gone. Rare — only when the customer explicitly asks for full data destruction. |
| 4 | Snapshot | A point-in-time block-level copy of the VM’s disk that can be restored to a fresh VM later. |
| 5 | Deploy controller | The system Kiluth uses to ship apps to the VM (currently Coolify at https://coolify.kiluth.com). May or may not have an entry for any given VM. |
| 6 | Registry | The Resource doctype in portal.kiluth.com, source of truth for which hosting resources exist and their status. |
Pre-flight Checklist
Confirm all of these before touching the provider console.
| # | Check |
|---|---|
| 1 | The PM and customer have explicitly confirmed non-renewal. If renewal is even possible, leave the resource as Expired and revisit later. |
| 2 | The IP / identifier on the Resource record matches the VM you’re about to destroy. The Resource record’s identifier field holds the IP. |
| 3 | You have console access to the provider, the deploy controller (if applicable), the DNS provider, and portal.kiluth.com. |
| 4 | You know which provider quick-reference row to follow (see Provider Quick Reference). |
Step-by-Step Process
Tick each box on the Resource form’s Archival Checklist as you complete the corresponding step.
Step 1. Locate the VM at the Provider
Find the VM by its IP / identifier from the registry. Click into it to confirm:
| # | Cross-check |
|---|---|
| 1 | The VM’s project / tag matches the Resource’s project field. |
| 2 | The VM size (RAM / disk / region) is what you expect, so you can sanity-check the snapshot size later. |
| 3 | The VM is still running. If it’s been off for a while, decide whether the snapshot is still useful or whether you can skip straight to destroy. |
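The cross-checks above can also be run against the provider's API output instead of the console. A minimal sketch, assuming DigitalOcean droplet objects (the `networks.v4[].ip_address`, `tags`, and `status` fields) and a hypothetical Resource record shape with `identifier` and `project`; treating the project as a droplet tag is also an assumption, not something the registry guarantees.

```javascript
// Sketch only. Droplet fields follow the DigitalOcean API droplet object;
// the `resource` shape ({ identifier, project }) is hypothetical.

/** Return every droplet that has `ip` among its IPv4 addresses. */
function findDropletsByIp(droplets, ip) {
  return droplets.filter((droplet) =>
    ((droplet.networks && droplet.networks.v4) || []).some(
      (net) => net.ip_address === ip
    )
  );
}

/** Cross-check the registry record against the droplet list before touching anything. */
function crossCheck(resource, droplets) {
  const matches = findDropletsByIp(droplets, resource.identifier);
  if (matches.length !== 1) {
    return { ok: false, reason: `expected exactly 1 droplet, found ${matches.length}` };
  }
  const droplet = matches[0];
  return {
    ok: true,
    droplet,
    // Assumption: the Resource's project is applied to the droplet as a tag.
    tagMatches: (droplet.tags || []).includes(resource.project),
    // 'active' is the status DigitalOcean reports for a running droplet.
    running: droplet.status === 'active',
  };
}
```

If `ok` is false or `tagMatches` is false, stop and resolve the mismatch by hand before taking a snapshot.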
Step 2. Take a Snapshot
Use the provider’s snapshot/image action. Accept the default snapshot name when the provider auto-generates one — see Provider Quick Reference for the naming pattern Kiluth uses.
| # | Snapshot rule |
|---|---|
| 1 | Live snapshots are fine for expired non-production customer environments. |
| 2 | Power-off-then-snapshot is only required when the VM is actively serving writes you can’t afford to half-capture (e.g. a live database under load). |
| 3 | Wait for completion before moving on — the snapshot row should show a real size and a Created … minute ago timestamp, not a “Taking Snapshot” / “In Progress” indicator. |
✅ Tick Snapshot Taken on the Resource form. Paste the snapshot name/ID into Snapshot ID.
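If you script the snapshot instead of clicking through the console, the request might look like the sketch below. It assumes DigitalOcean's droplet actions endpoint (`POST /v2/droplets/{id}/actions` with `type: "snapshot"`) and mirrors the `<droplet-name>-<unix-ms>` naming pattern; the function names and the token parameter are illustrative, not part of any Kiluth tooling.

```javascript
// Sketch only. Builds the request; it does not send it.

/** Build a snapshot name in the <droplet-name>-<unix-ms> pattern. */
function snapshotName(dropletName, now = Date.now()) {
  return `${dropletName}-${now}`;
}

/** Request descriptor for a DigitalOcean droplet snapshot action. */
function snapshotRequest(dropletId, dropletName, token, now = Date.now()) {
  return {
    url: `https://api.digitalocean.com/v2/droplets/${dropletId}/actions`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ type: 'snapshot', name: snapshotName(dropletName, now) }),
  };
}

/** True only once the action reports "completed"; throws if it errored. */
function snapshotActionDone(action) {
  if (action.status === 'errored') {
    throw new Error(`snapshot action ${action.id} errored`);
  }
  return action.status === 'completed';
}
```

Send it with `fetch(req.url, req)` and poll the returned action until `snapshotActionDone` returns true; an `in-progress` action is the API equivalent of the "Taking Snapshot" indicator.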
Step 3. Verify the Snapshot Independently
Visit the provider’s Snapshots / Images list (not just the VM’s own backups tab) and confirm the row exists with a real, non-zero size. Once the VM is destroyed, a missing snapshot means the data is unrecoverable — verify before going further.
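The same independent check can be expressed as a function over the account-level snapshot list. This is a sketch that assumes the response shape of DigitalOcean's `GET /v2/snapshots?resource_type=droplet` (objects with `id`, `name`, `size_gigabytes`, and `resource_id`); the function name and return shape are hypothetical.

```javascript
// Sketch only. `snapshots` is the array from the account-level snapshot list,
// not the droplet's own backups tab.

/** Confirm the named snapshot exists, has a non-zero size, and belongs to the droplet. */
function verifySnapshot(snapshots, expectedName, dropletId) {
  const snap = snapshots.find((s) => s.name === expectedName);
  if (!snap) {
    return { ok: false, reason: 'snapshot not found in account-level list' };
  }
  if (!(snap.size_gigabytes > 0)) {
    return { ok: false, reason: 'snapshot size is zero or missing' };
  }
  // resource_id is compared as a string because the API may return it as one.
  if (dropletId !== undefined && String(snap.resource_id) !== String(dropletId)) {
    return { ok: false, reason: 'snapshot belongs to a different droplet' };
  }
  return { ok: true, id: snap.id, sizeGb: snap.size_gigabytes };
}
```

Only proceed to the destroy step when the result is `ok: true`; the returned `id` is what goes into Snapshot ID on the Resource form.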
Step 4. Clean the Deploy Controller (only if VM is Coolify-managed)
⚠️ Order matters — do this BEFORE destroying the VM. Coolify’s cascade-delete tries to SSH the host to gracefully shut down containers. With the host alive that takes seconds; with the host already destroyed it silently hangs and nothing gets removed.
Skip this step entirely if the VM was never registered as a Coolify Server. Quick check: search the customer slug at https://coolify.kiluth.com/servers — if nothing matches, skip to Step 5.
If there is a match, you’ll typically have:
| # | Coolify object | Where |
|---|---|---|
| 1 | A Server entry | Servers → your VM’s name |
| 2 | An Environment under the customer’s Project | Projects → PROJ-XXXX - <customer> - <app> → dev / uat / staging / production |
| 3 | The Environment’s child Resources | App + Postgres/MySQL + Redis + any object-store services |
Cleanest path: Server → Danger Zone → Delete with the “Delete all resources (N total)” checkbox checked, then type the server name on the next step → Continue. That cascades through the Application + Databases + Services in one shot.
After the Server is gone, the now-empty Environment can be deleted via Project → Environment → Delete Environment (type the env name to confirm).
🩹 Recovery: if you already destroyed the VM and Coolify entries are stuck, the UI cascade-delete will silently fail. Clicking each child resource’s “Delete” does not help either, because it hits the same SSH-to-dead-host hang. Use one of the escape hatches below.
| # | Escape hatch | How |
|---|---|---|
| 1 | Coolify API | DELETE /api/v1/servers/{uuid}?delete_associated_volumes=true with the COOLIFY_TOKEN from the frappe_docker GitHub repo secrets (same token the auto-deploy workflow uses). Get the server UUID from the URL on its Danger Zone page. |
| 2 | Direct SQL on Coolify’s Postgres | DELETE the orphan rows from servers, applications, standalone_postgresqls, standalone_redis, services referencing the dead server. |
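For the API escape hatch, the request can be assembled as in the sketch below. The path and query parameter are taken from the table above; Bearer-token authentication and the function name are assumptions, so confirm them against your Coolify version before running anything destructive.

```javascript
// Sketch only. Builds the DELETE request; it does not send it.

/** Request descriptor for deleting an orphaned Coolify server by UUID. */
function coolifyDeleteServerRequest(baseUrl, serverUuid, token) {
  if (!serverUuid) {
    throw new Error('server UUID is required');
  }
  const url = new URL(`/api/v1/servers/${encodeURIComponent(serverUuid)}`, baseUrl);
  url.searchParams.set('delete_associated_volumes', 'true');
  return {
    url: url.toString(),
    method: 'DELETE',
    headers: {
      Authorization: `Bearer ${token}`, // assumption: COOLIFY_TOKEN is a Bearer token
      Accept: 'application/json',
    },
  };
}
```

Usage: `const req = coolifyDeleteServerRequest('https://coolify.kiluth.com', uuid, token); fetch(req.url, req)`. Check the response status, then confirm the server no longer appears in the Servers list.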
Step 5. Destroy the VM
Use the provider’s destroy/terminate action. Two universal rules at the destroy modal:
| # | Rule |
|---|---|
| 1 | Snapshot must NOT be deleted in the destroy modal. Most providers let the destroy action sweep snapshots if you opt in — leave that option unchecked. |
| 2 | Type the VM name to confirm. Most providers require this. |
Wait for the success toast. Confirm the VM no longer appears in the provider’s VM list.
✅ Tick VM Destroyed on the Resource form.
Step 6. Clean DNS
Open the DNS console. Search for records pointing to the VM’s IP, typically by the customer-app slug.
| # | What to expect | What to do |
|---|---|---|
| 1 | An apex A record on the convention <customer-app>.<env>.kiluth.com | Delete it. |
| 2 | A wildcard A record (*.<customer-app>.<env>.kiluth.com) | Delete it. |
| 3 | Any related CNAME / TXT records | Delete or repoint, depending on whether they outlive the VM. |
| 4 | Any record whose IP does not match the destroyed VM | Leave alone; investigate before touching anything you didn’t expect to find. |
Most DNS consoles support bulk-select + delete. Always type-to-confirm if the console offers it.
✅ Tick DNS Records Cleaned on the Resource form.
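The decision table in this step can be sketched as a small triage function over an exported record list. It assumes records shaped like Cloudflare DNS records (`type`, `name`, `content`); the bucket names and the rule that an A record pointing at the VM under an unexpected name goes to `investigate` are illustrative choices, not an established Kiluth rule.

```javascript
// Sketch only. Sorts records into buckets; it deletes nothing.

/** Triage DNS records for one customer-app slug and one destroyed VM IP. */
function planDnsCleanup(records, slug, vmIp) {
  const plan = { delete: [], review: [], investigate: [] };
  for (const rec of records) {
    const matchesSlug = rec.name.includes(slug);
    const pointsAtVm = rec.type === 'A' && rec.content === vmIp;
    if (!matchesSlug && !pointsAtVm) continue; // unrelated record
    if (matchesSlug && pointsAtVm) {
      plan.delete.push(rec); // apex or wildcard A record for the dead VM
    } else if (matchesSlug && rec.type !== 'A') {
      plan.review.push(rec); // CNAME / TXT: delete or repoint by hand
    } else {
      plan.investigate.push(rec); // unexpected IP or unexpected name
    }
  }
  return plan;
}
```

Only the `delete` bucket is safe to bulk-delete; `review` and `investigate` still need a human decision.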
Step 7. Flip Registry Status to Archived
On the Resource form in portal.kiluth.com, change Status from Expired → Archived and save.
The doctype’s _auto_set_status early-returns on terminal states (Archived, Deleted), so the value sticks even though before_save runs. The daily scheduler is idempotent on terminal states for the same reason.
If you prefer a short snippet from the browser console on the Resource page:
```javascript
// Run in the browser console while logged in to portal.kiluth.com.
// Replace RESOURCE-XXXXX with the name of the Resource record.
fetch('/api/method/frappe.client.set_value', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
    'X-Frappe-CSRF-Token': frappe.csrf_token,
  },
  body: new URLSearchParams({
    doctype: 'Resource',
    name: 'RESOURCE-XXXXX',
    fieldname: 'status',
    value: 'Archived',
  }),
}).then(r => r.json()).then(console.log);
```
Verification
You should be able to assert all of the following before considering the archival complete.
| # | Check | Where to look |
|---|---|---|
| 1 | VM no longer appears in the provider’s VM list | Provider VM/instance list, all teams/projects |
| 2 | Snapshot row exists with the expected size | Provider Snapshots/Images list |
| 3 | Customer-app slug returns zero rows in DNS | DNS console search |
| 4 | Deploy controller entry is gone (or flagged per Step 4 recovery callout) | Deploy controller’s server/host list |
| 5 | Resource record’s status is Archived | portal.kiluth.com Resource list filtered to Archived |
| 6 | All three Archival Checklist boxes are ticked and Snapshot ID is populated | The Resource form itself |
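The six checks can be collapsed into one pass/fail summary. The sketch below takes a hand-filled `state` object whose field names are all hypothetical (nothing in the registry or provider APIs uses them) and returns the labels of any checks that fail. The snapshot check here only tests for a non-zero size; comparing against the expected size is still a manual step.

```javascript
// Sketch only. Every field on `state` is a hypothetical name for an observation
// you collected by hand or by script.

/** Return the labels of archival checks that do not pass; empty means complete. */
function archivalFailures(state) {
  const checklist = state.checklist || {};
  const checks = [
    ['VM gone from provider list', state.vmListed === false],
    ['Snapshot exists with a real size', state.snapshotSizeGb > 0],
    ['No DNS rows for the slug', state.dnsRowCount === 0],
    ['Deploy controller entry gone or flagged',
      Boolean(state.controllerEntryGone || state.controllerFlagged)],
    ['Registry status is Archived', state.registryStatus === 'Archived'],
    ['Checklist ticked and Snapshot ID set',
      Boolean(checklist.snapshotTaken && checklist.vmDestroyed &&
        checklist.dnsCleaned && state.snapshotId)],
  ];
  return checks.filter(([, passed]) => !passed).map(([label]) => label);
}
```

An empty array means the archival is complete; anything else names the step to revisit.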
Rollback (Customer Comes Back)
Snapshots are retained indefinitely at the provider (cost: roughly used-disk-GB × the provider’s snapshot rate per month — small for typical app servers; check the Provider Quick Reference for current rates). To revive a customer:
| # | Step |
|---|---|
| 1 | At the provider: Snapshots → click the snapshot → Create VM (pick the same size and region). |
| 2 | At the DNS provider: re-create the apex + wildcard A records pointing at the new VM’s IP. |
| 3 | At the deploy controller (if applicable): re-add the new VM as a Server, then re-create the Environment + Application referencing it. |
| 4 | In the registry: create a fresh Resource record (don’t reuse the old one — that one is the historical archive). Link it to the same Project and Customer. |
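The retention cost mentioned at the top of this section is simple arithmetic: used-disk-GB multiplied by the provider's monthly snapshot rate. A minimal sketch follows; the rate is a parameter on purpose, because the current figure must come from the provider's pricing page, and the function name is illustrative.

```javascript
// Sketch only. Pass in the provider's current per-GB monthly rate; none is assumed here.

/** Estimated monthly cost of keeping one snapshot. */
function monthlySnapshotCost(usedDiskGb, ratePerGbMonth) {
  if (!(usedDiskGb >= 0) || !(ratePerGbMonth >= 0)) {
    throw new RangeError('usedDiskGb and ratePerGbMonth must be non-negative numbers');
  }
  return usedDiskGb * ratePerGbMonth;
}

/** Total monthly cost across several retained snapshots. */
function totalMonthlyCost(snapshotSizesGb, ratePerGbMonth) {
  return snapshotSizesGb.reduce(
    (sum, gb) => sum + monthlySnapshotCost(gb, ratePerGbMonth),
    0
  );
}
```

For example, `monthlySnapshotCost(20, rate)` gives the monthly cost of a snapshot with 20 GB of used disk at whatever `rate` the provider currently charges.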
Provider Quick Reference
The flow above is provider-agnostic. The concrete tools per provider:
DigitalOcean
| # | Action | Where |
|---|---|---|
| 1 | Locate VM | https://cloud.digitalocean.com/droplets — switch DigitalOcean teams if not found in the first one |
| 2 | Take snapshot | Droplet → Backups & Snapshots → Take a Snapshot → Take Live Snapshot. Default name format DigitalOcean fills in: <droplet-name>-<unix-ms>. Accept it. |
| 3 | Verify snapshot | https://cloud.digitalocean.com/images/snapshots |
| 4 | Destroy VM | Droplet → Settings → Destroy → leave snapshot unchecked under Associated Resources → type droplet name → Destroy |
| 5 | DNS provider | Cloudflare — open the kiluth.com zone DNS records page from the Cloudflare dashboard |
| 6 | Snapshot pricing | DigitalOcean charges per used-disk-GB per month (check the DigitalOcean snapshot pricing page for the current rate). |
Other providers
Add a new subsection here when Kiluth adopts a new provider. Cover at minimum: where the VM list lives, snapshot action and naming pattern, where to verify snapshots, destroy flow, and snapshot pricing.