In-house capacity covers 30 concurrent renders. When demand spikes above that, the scheduler spills the excess to Modal, which provisions real H100 GPUs and scales back to zero when idle. Toggle to the real cold-start timing to see the honest cost.
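The spill rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: `place_renders`, `Placement`, and `IN_HOUSE_CAPACITY` are hypothetical names, not the project's scheduler API, and the sketch only splits a concurrent-render count between in-house capacity and the overflow. It does not call Modal.

```python
from dataclasses import dataclass

# From the text: in-house capacity covers 30 concurrent renders.
IN_HOUSE_CAPACITY = 30


@dataclass(frozen=True)
class Placement:
    """How many renders run in-house and how many spill to the cloud."""
    in_house: int
    spilled: int


def place_renders(concurrent: int, capacity: int = IN_HOUSE_CAPACITY) -> Placement:
    """Split a concurrent-render count into in-house and spilled renders.

    Hypothetical helper: the real scheduler's interface is not shown in
    the source, so this only models the arithmetic of the spill rule.
    """
    if concurrent < 0:
        raise ValueError("concurrent must be non-negative")
    in_house = min(concurrent, capacity)
    return Placement(in_house=in_house, spilled=concurrent - in_house)
```

With 45 concurrent renders, this sketch keeps 30 in-house and marks 15 as spilled; at or below 30, nothing spills and no cloud GPUs are needed.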
get_current_stats(); the 131 s and the H100 burn are measured/list-price values (see “Proven”).

Measured against the live production deployment (the repo's e2e CI test).
| What | Measured |
|---|---|
| Cold provision, wan-s2v renderer (scale-zero → serving) | 131 s (under 3 min) |
| Warm provision | ~0.4 s (under 1 min) |
| Memory-snapshot restore (alpha) | ~57 s |
| Production API surface | 74 / 74 endpoints served (curl-verified) |
| Render throughput | ~365 ms / block · ~44 s / clip (offline) |
| Idle cost | $0 (scales to zero) |
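
To make the "honest cost" of a cold start concrete, the measured values in the table can be combined into a rough time-to-first-clip estimate. This is a back-of-the-envelope sketch, not a measured result: it assumes provisioning and rendering happen one after the other, and that the offline ~44 s per-clip figure also applies to a freshly provisioned renderer. The function and constant names are hypothetical.

```python
# Measured values taken from the table above (seconds).
COLD_PROVISION_S = 131.0    # scale-zero -> serving
WARM_PROVISION_S = 0.4      # approximate
SNAPSHOT_RESTORE_S = 57.0   # memory-snapshot restore (alpha), approximate
CLIP_RENDER_S = 44.0        # offline render time per clip, approximate


def time_to_first_clip(provision_s: float, render_s: float = CLIP_RENDER_S) -> float:
    """Estimate seconds until the first clip is done.

    Assumption: provisioning and rendering are strictly sequential,
    so the estimate is simply their sum.
    """
    if provision_s < 0 or render_s < 0:
        raise ValueError("durations must be non-negative")
    return provision_s + render_s


if __name__ == "__main__":
    for label, provision in [
        ("cold", COLD_PROVISION_S),
        ("snapshot restore", SNAPSHOT_RESTORE_S),
        ("warm", WARM_PROVISION_S),
    ]:
        print(f"{label}: ~{time_to_first_clip(provision):.0f} s to first clip")
```

Under these assumptions a cold start takes roughly four times as long to deliver the first clip as a warm one (about 175 s versus about 44 s), with the snapshot restore path in between at about 101 s.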