You do not get low latency by wishing for it
Teams often talk about start time as if it is a single number produced by clever runtime engineering. In practice, startup latency is a systems question: how much capacity is already warm, how state is resumed, and whether the control plane can route work without rebuilding the world.
The Tennant CLI work makes that concrete. We are already experimenting with host pools, warm allocations, runtime hibernation, cross-host resume paths, and end-to-end latency benchmarks rather than relying on abstract claims.
Hot and warm tiers solve different problems
A hot tier gives you the fastest path to an immediately usable environment. A warm tier controls cost by keeping hosts stopped or hibernated but recoverable. The right platform blends both so operators can trade spend against responsiveness intentionally.
That is the model we want Tennant to expose more clearly: a capacity story, not just a VM story. Users should understand whether they are paying for hot readiness, warm recovery, or fully ephemeral execution.
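The trade the text describes can be sketched as a small policy function. This is a minimal illustration, not Tennant's actual scheduler: the tier names, latency figures, and cost figures below are invented placeholders standing in for numbers a real platform would measure.

```python
from enum import Enum


class Tier(Enum):
    HOT = "hot"    # running host, usable almost immediately
    WARM = "warm"  # stopped or hibernated host, must be resumed
    COLD = "cold"  # fully ephemeral, booted from scratch

# Illustrative per-tier profiles; real values would come from the
# platform's own end-to-end benchmarks, not these placeholders.
TIER_PROFILE = {
    Tier.HOT:  {"latency_s": 0.2,  "cost_per_hour": 0.50},
    Tier.WARM: {"latency_s": 3.0,  "cost_per_hour": 0.05},
    Tier.COLD: {"latency_s": 25.0, "cost_per_hour": 0.00},
}


def choose_tier(max_latency_s: float, budget_per_hour: float) -> Tier:
    """Pick the cheapest tier that still meets the latency target."""
    candidates = [
        tier for tier, p in TIER_PROFILE.items()
        if p["latency_s"] <= max_latency_s
        and p["cost_per_hour"] <= budget_per_hour
    ]
    if not candidates:
        raise ValueError("no tier satisfies both latency and budget")
    return min(candidates, key=lambda t: TIER_PROFILE[t]["cost_per_hour"])
```

The point of making the policy explicit is that the operator, not the runtime, decides where on the spend-versus-responsiveness curve a workload sits: a tight latency target forces the hot tier, a relaxed one lets the platform fall back to warm or cold capacity.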
Benchmarks should map to real operator choices
The important benchmark is not a synthetic boot number in isolation. It is the time from request to useful environment under the exact policy the team chose. That includes whether a sandbox is persistent, hibernated, restored on the same host, or resumed elsewhere.
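That definition of the benchmark, request to *usable* environment under a chosen policy, can be written down directly. The harness below is a hedged sketch: `provision` and `is_ready` are hypothetical callables standing in for whatever the policy under test does (hot allocation, hibernation resume, cross-host restore), and readiness should mean "a command actually runs", not "the VM reports booted".

```python
import statistics
import time
from typing import Any, Callable, Dict


def benchmark_time_to_ready(
    provision: Callable[[], Any],
    is_ready: Callable[[Any], bool],
    runs: int = 10,
    poll_interval_s: float = 0.05,
    timeout_s: float = 60.0,
) -> Dict[str, float]:
    """Measure end-to-end request-to-usable-environment latency.

    `provision` starts an environment under the policy being tested;
    `is_ready` probes whether it is actually usable. Both are stand-ins
    supplied by the caller, not a real platform API.
    """
    samples = []
    for _ in range(runs):
        start = time.monotonic()
        env = provision()
        while not is_ready(env):
            if time.monotonic() - start > timeout_s:
                raise TimeoutError("environment never became usable")
            time.sleep(poll_interval_s)
        samples.append(time.monotonic() - start)
    samples.sort()
    return {
        "p50_s": statistics.median(samples),
        "p95_s": samples[max(0, int(len(samples) * 0.95) - 1)],
        "max_s": samples[-1],
    }
```

Running the same harness against each policy (persistent, hibernated, same-host restore, cross-host resume) yields comparable percentile numbers, which is what lets latency claims be tied to specific infrastructure choices rather than a single synthetic boot figure.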
As Tennant matures, this becomes one of its clearest product differentiators: not just claiming that starts are fast, but showing which infrastructure choices make them fast and how those choices affect cost.