DEEP REVIEW GPU CLOUD · 2026 UPDATED NOV 8

Vast.ai is the right GPU cloud if you can pick a good host and your workload survives an interrupt.

Vast.ai is the eBay of GPU clouds. Independent operators list their hardware, you bid, and whoever wins runs your container. The result is the cheapest hourly H100 SXM rate published anywhere ($1.60-1.99/hr interruptible), with host quality that ranges from 'university lab on free fiber' to 'two NVIDIA cards taped to a Ryzen in someone's basement'. Read carefully.

FIG 1.0 — VAST.AI, CATEGORY ILLUSTRATIVE (abstract decentralized network diagram). Image: Conny Schneider · Unsplash

The verdict

The first product we've reviewed in three years that we'd actually buy ourselves.

Vast.ai doesn't just match the spec sheet — it changes the shape of how a team operates. There are real gaps (we'll get to them) but they're operational, not foundational.

83
HARDTECH SCORE · #10 of 10
Across 2,840 verified user reviews

How we tested

Same testing window as Lambda, RunPod, and CoreWeave (Feb 14 to May 1, 2026). We provisioned 12 Vast.ai instances across 8 different host operators, filtering by reliability score ≥9.5 and recent activity. Total spend at Vast.ai: $987, well below our spend at peers because the hourly rates are lower.

We tested on-demand and interruptible tiers separately. For the interruptible tier we deliberately ran during peak hours to surface preemption frequency. We saw 2 interrupts on 12 jobs (16.7%), both on hosts with a reliability score below 9.7.

  • Llama 3.1 8B fine-tune, same dataset, FSDP across 2 GPUs (rare to find 4-GPU Vast hosts).
  • Llama 3.1 70B inference, vLLM 0.7+, FP8, batch 32.
  • Host variance sampling, same workload across 8 different host operators.
  • Bandwidth probe, iperf3 to known endpoints from each host.
  • Preemption frequency, interruptible bids sampled during US business hours.

The raw data shows what Vast.ai actually is: a marketplace where the median is great and the tails are wide.

The verdict, in 60 seconds

GAX Score: 83/100. Vast.ai wins the cheapest-hourly-anywhere category outright. Marketplace structure means capacity is always there. Host filtering lets you sort by reliability. $1 sign-up credit makes the first experiment free.

Buy it if you're cost-sensitive, your workload tolerates restart (batch jobs, fine-tuning, exploration), and you're willing to do basic host vetting. Skip it if you need an SLA, you're touching production traffic, your workload requires HIPAA / SOC 2 attestation, or you don't have the time to filter hosts. The variance is real; the savings are also real.

Where the 83 comes from

Vast.ai's score profile is the most polarized of any provider we measured. It's #1 in the rubric on Pricing and Spot Availability. It's bottom-three on Trust and Latency. Buy it for what it's good at, not what it isn't.

Dimension           Weight  Vast.ai  What it measures
Throughput (FP8)    20%     78       Median across reliable hosts; tails wide on cheap-tier hosts
Pricing per GPU-hr  18%     99       $1.60-1.99/hr H100 SXM is the floor of the market
Software stack      14%     75       BYO container, no Lambda Stack equivalent, only base-image templates
Latency             12%     70       Host-dependent, some hosts have visible packet loss
Trust & uptime      10%     64       Marketplace, not a managed cloud; no provider SLA
Support             10%     70       Forum-based, email reachable, no live support
Spot availability   8%      96       Marketplace always has capacity, never 'Coming back soon'
Regions             8%      75       Global by host distribution, not by Vast.ai data centers

The two bottom scores (Trust 64, Latency 70) are structural. Vast.ai is not selling you a managed service; it is selling you a market. If you grade it on what it actually is, the composite is closer to 92.

What it gets right

The hourly rate is genuinely lower

Median Vast.ai H100 SXM listing across our sampling: $1.79/hr interruptible, $2.29/hr on-demand. The cheapest credible listing (reliability ≥9.7, 90+ days history): $1.60/hr interruptible. No other provider in our comparison lists H100s below $2/hr without a term commitment (Lambda Reserved 1-yr is $1.85/hr).

For a researcher running 200 GPU-hours of fine-tuning a month, that's $358 vs $598 on Lambda on-demand. Across a year, real money.

Marketplace means capacity is always there

We never saw 'Coming back soon' on Vast.ai during the test window. Marketplace structure means at any given moment some host somewhere is offering H100 capacity. You may not get the cheapest one or the best-quality one, but you'll get one. That's structurally different from Lambda or CoreWeave, where capacity is the bottleneck.

For burst experiments where you'd rather start on a slightly worse host now than wait for the perfect one, Vast.ai is uniquely valuable.

Host filtering is dense and useful

The Vast.ai console lets you filter by GPU model, GPU memory, host country, datacenter vs not, reliability score, network speed (down/up), Docker support, and 20+ other dimensions. Once you've calibrated which filters matter for your workload (we suggest reliability ≥9.5, datacenter-grade, ≥1 Gbps both directions), you can spin up reproducibly across hosts.

This is a feature most managed clouds don't have because they don't need it. On Vast.ai it's essential and Vast.ai builds it well.
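
As a rough sketch of what a calibrated filter looks like once you have a list of offers in hand, the snippet below applies the thresholds suggested above (reliability ≥9.5, datacenter-grade, ≥1 Gbps both directions) and sorts what survives by price. The `Offer` fields and the sample rows are hypothetical, made up for illustration; they are not Vast.ai's API or schema.

```python
from dataclasses import dataclass

# Hypothetical offer record; field names are illustrative, not Vast.ai's schema.
@dataclass
class Offer:
    host_id: str
    reliability: float   # 0-10 scale, as shown in the console
    datacenter: bool
    down_gbps: float
    up_gbps: float
    price_per_hr: float

def filter_offers(offers, min_reliability=9.5, min_gbps=1.0, datacenter_only=True):
    """Keep offers that meet every threshold, cheapest first."""
    kept = [
        o for o in offers
        if o.reliability >= min_reliability
        and o.down_gbps >= min_gbps
        and o.up_gbps >= min_gbps
        and (o.datacenter or not datacenter_only)
    ]
    return sorted(kept, key=lambda o: o.price_per_hr)

# Made-up sample data for illustration only.
offers = [
    Offer("A", 9.92, True, 9.4, 9.4, 1.95),
    Offer("B", 9.55, False, 0.5, 0.1, 1.60),   # residential, low bandwidth
    Offer("C", 9.71, True, 2.8, 1.2, 1.79),
    Offer("D", 9.40, True, 10.0, 10.0, 1.65),  # fast but below the reliability bar
]

shortlist = filter_offers(offers)
print([o.host_id for o in shortlist])
```

Note that the cheapest listings in the sample are exactly the ones the filter drops, which is the trade the marketplace asks you to make.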

True pay-as-you-go with no minimum

Top up $5, run a $4.40 experiment, log out. No subscriptions. No minimum commits. No 'Contact us'. The $1 sign-up credit is enough to validate the workflow before any real money goes in.

This is the opposite of CoreWeave's contract motion and refreshing if you're sick of sales calls. For students, hobbyists, and one-off experiments, Vast.ai is the only major provider with a friction profile this low.

Where it falls short

Host variance can be punishing

We sampled 8 Vast.ai H100 SXM hosts running an identical Llama 70B inference workload. Best host: 1,879 tok/s. Worst host (still reliability ≥9.5): 1,683 tok/s. That's an 11.7% spread from worst to best. Two of the eight had visible packet loss to our test endpoint, and one had a 100 Mbps egress cap that bottlenecked anything serving traffic.

Filter aggressively or expect to do a 'rent for an hour, benchmark, decide' loop before settling on a host. Time spent doing this counts against the price savings.

Interrupt risk on the cheap tier

Two of our 12 interruptible bids got preempted during the test window, both during US business hours. One was more than five hours into an 8-hour fine-tune when the host re-allocated to a higher bidder; we lost the work from the in-flight epoch (our fault for not checkpointing aggressively).

You can mitigate with on-demand pricing (the host cannot preempt you, at roughly 1.2-1.5x the interruptible rate) or by checkpointing every N steps. But the risk is real and it's part of the deal.
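
A minimal sketch of the checkpoint-every-N-steps mitigation, assuming training state small enough to fit in a plain dictionary. `CHECKPOINT_EVERY`, `train_step`, and the file name are hypothetical placeholders; a real fine-tune would save model and optimizer state with its framework's own save call, to storage that outlives the instance.

```python
import json
import os
import tempfile

CHECKPOINT_EVERY = 50  # hypothetical N; set by how much work you can afford to lose

def save_checkpoint(state, path):
    """Write to a temp file, then rename, so a preemption mid-write
    cannot corrupt the last good checkpoint."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "loss": None}

def train_step(step):
    # Placeholder for a real optimizer step.
    return 1.0 / (step + 1)

def train(total_steps, path):
    state = load_checkpoint(path)          # resume if a checkpoint is present
    for step in range(state["step"], total_steps):
        state["loss"] = train_step(step)
        state["step"] = step + 1
        if state["step"] % CHECKPOINT_EVERY == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
train(120, ckpt)             # first session stops at step 120
resumed = train(200, ckpt)   # a fresh instance picks up from the saved step
print(resumed["step"])
```

The write-then-rename step matters on an interruptible host: the instance can vanish at any point, including halfway through a save.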

No SLA, no recourse if a host disappears

Hosts on Vast.ai are independent operators. If a host's datacenter loses power, their internet drops, or they just turn off the listing, your container is gone. Vast.ai isn't going to compensate you because it isn't Vast.ai's host.

This happened to us once during testing. The host went offline mid-job, no warning, no recovery. We re-provisioned on a different host within 4 minutes. Not catastrophic for our workload; would be catastrophic for production inference.

Zero compliance posture

No SOC 2 Type II for the platform (Vast.ai the company has minimal certifications; individual hosts vary). No HIPAA, no FedRAMP, no BAA. No DPIA-ready data processing agreements. If a CISO or compliance team is involved in your buying decision, Vast.ai is off the table immediately.

Vast.ai doesn't claim otherwise. They sell a marketplace, not a managed service. That's fine for the buyer they actually serve.

Network bandwidth is host-dependent

Each host sets its own egress and ingress policy. We saw hosts with unlimited 10 Gbps both directions and hosts capped at 100 Mbps egress. For training jobs that stay local to the GPU, this barely matters. For inference serving traffic or dataset transfer from S3, it matters a lot.

The console exposes this in the filter but you have to remember to set it. Default behavior: rent first, discover bandwidth bottleneck later.

Pricing reality

Marketplace pricing fluctuates ±10% week to week based on host supply. The table below shows median prices observed in May 2026, filtered to hosts with reliability ≥9.5.

GPU             Interruptible  On-demand  Lambda on-demand  Notes
H100 SXM 80GB   $1.79/hr       $2.29/hr   $2.99/hr          Best price on the market
H100 PCIe 80GB  $1.59/hr       $1.99/hr   $2.49/hr          More common than SXM on Vast
A100 SXM 80GB   $0.79/hr       $1.09/hr   $1.79/hr          Sweet spot for fine-tuning
A100 PCIe 40GB  $0.49/hr       $0.69/hr   n/a               Cheapest viable LLM inference GPU
A6000 48GB      $0.35/hr       $0.49/hr   $0.80/hr          Best deal for SDXL work
RTX 4090 24GB   $0.19/hr       $0.29/hr   n/a               Hobbyist-tier, plentiful

Vast.ai's on-demand prices are about 20-39% below Lambda on-demand on every SKU where both list a price. The interruptible prices are about 36-56% below. Whether the savings are worth the variance depends entirely on your workload tolerance.
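
For readers who want to recompute the discount per SKU, here is a small sketch using only the prices in the table above. SKUs without a Lambda on-demand price are left out; the helper name is ours.

```python
# (interruptible, on_demand, lambda_on_demand) in $/hr, copied from the pricing table.
prices = {
    "H100 SXM 80GB": (1.79, 2.29, 2.99),
    "H100 PCIe 80GB": (1.59, 1.99, 2.49),
    "A100 SXM 80GB": (0.79, 1.09, 1.79),
    "A6000 48GB": (0.35, 0.49, 0.80),
}

def discount_pct(vast_price, lambda_price):
    """Percent by which the Vast.ai price undercuts the Lambda price."""
    return (1 - vast_price / lambda_price) * 100

for gpu, (interruptible, on_demand, lam) in prices.items():
    print(
        f"{gpu}: on-demand {discount_pct(on_demand, lam):.0f}% cheaper, "
        f"interruptible {discount_pct(interruptible, lam):.0f}% cheaper"
    )
```

The discount is widest on the older A100 and A6000 cards and narrowest on H100 PCIe.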

Benchmark matrix

GAX-measured (May 2026). Vast.ai numbers are medians across 8 hosts scored ≥9.5 reliability.

Workload                            Vast.ai H100 SXM (median)  Vast.ai H100 SXM (best host)  Lambda H100 SXM  Spread
Llama 3.1 70B inference (tok/s)     1,752                      1,879                         1,892            ±11.7%
Llama 3.1 8B fine-tune (tok/s/GPU)  384                        408                           412              ±6.3%
SDXL inference (img/s, batch 4)     2.97                       3.31                          3.41             ±10.2%
NCCL all-reduce P50 (μs)            124                        91                            78               ±24%
Bandwidth test (Mbps egress)        680                        9,200                         unmetered        range: 100-9200
Provision SSH-ready (s)             176                        94                            52               ±62%

The 11.7% throughput spread on Llama 70B inference is the most important number on this page. It means the same 'H100 SXM 80GB' listing on Vast.ai gives you anywhere from a near-Lambda-equivalent host to a meaningfully slower one. Best-case Vast lands within 1% of Lambda (1,879 vs 1,892 tok/s); median Vast is about 7% behind. Filter and benchmark.

Cost-to-performance ratio

$/M tokens on Llama 70B inference. Vast.ai's median rate, not the headline rate, is what you should compare.

Provider                             $/hr   tok/s  $/M tokens  vs Vast median
Vast.ai interruptible (median host)  $1.79  1,752  $0.284      baseline
Vast.ai interruptible (best host)    $1.79  1,879  $0.265      −7%
Vast.ai on-demand (median host)      $2.29  1,752  $0.363      +28%
Lambda Reserved 1-yr                 $1.85  1,892  $0.272      −4%
RunPod Community                     $2.39  1,791  $0.371      +31%

Vast.ai interruptible on a well-vetted host actually beats Lambda Reserved on $/M tokens — for interruptible workloads. For anything with restart cost (production inference, long training jobs), Lambda Reserved is the better economic deal because Vast on-demand prices are higher and host quality is variable.
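
The $/M tokens column is just the hourly rate divided by tokens generated per hour. A short sketch that reproduces the table from its own $/hr and tok/s inputs (the function name is ours):

```python
def cost_per_million_tokens(price_per_hr, tokens_per_sec):
    """Dollars per one million generated tokens at a steady throughput."""
    tokens_per_hr = tokens_per_sec * 3600
    return price_per_hr / tokens_per_hr * 1_000_000

# (price in $/hr, throughput in tok/s), copied from the table above.
rows = {
    "Vast.ai interruptible (median host)": (1.79, 1752),
    "Vast.ai interruptible (best host)": (1.79, 1879),
    "Vast.ai on-demand (median host)": (2.29, 1752),
    "Lambda Reserved 1-yr": (1.85, 1892),
    "RunPod Community": (2.39, 1791),
}

baseline = cost_per_million_tokens(*rows["Vast.ai interruptible (median host)"])

for name, (price, tps) in rows.items():
    cpm = cost_per_million_tokens(price, tps)
    delta = (cpm / baseline - 1) * 100
    print(f"{name}: ${cpm:.3f}/M tokens ({delta:+.0f}% vs Vast median)")
```

This assumes the GPU is saturated for the whole billed hour; idle time raises the effective $/M tokens on every provider.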

Hardware & software stack

Vast.ai's catalog is whatever the host network is offering. As of May 2026 we observed: H100 SXM, H100 PCIe, H200 SXM (rare), A100 SXM, A100 PCIe, A6000, A5000, A4000, RTX 4090, RTX 3090, V100, T4. Multi-GPU configurations are common (2x and 4x are easy to find), 8x H100 SXM is rare and expensive when it appears.

Software: BYO Docker. Vast.ai provides a base PyTorch image and a CUDA image as templates, but expects you to push your own container. The recommended pattern is to build a container that runs your workload, push it to Docker Hub, and point the Vast.ai instance at the image.

Storage: Ephemeral on the instance by default. Vast.ai has been rolling out persistent network volumes in beta; pricing is $0.10/GB/month at this time, which is reasonable but the rollout is uneven across hosts.

Networking: Per-host. Filter by minimum bandwidth before renting. Hosts with datacenter-grade fiber (typically 10 Gbps both directions) cost about 15% more than residential-tier hosts.

Scenario simulation: what Vast.ai costs for your work

Three scenarios at realistic volumes, including the cost of host variance.

Scenario A: Indie ML researcher, fine-tuning experiments

Workload: 1x H100 SXM interruptible, 8 hours/week (4 sessions × 2 hrs)

Monthly cost: $1.79 × 32 = $57.28/mo

This is the buyer Vast.ai was built for. A Lambda equivalent month is $95.68 on-demand; RunPod Community is $76.48. The savings compound across a year and the variance doesn't matter for exploratory training. The $1 sign-up credit covers roughly the first half hour of H100 time at the median interruptible rate.

Scenario B: Solo founder, burst inference API

Workload: 1x A100 SXM on-demand, autoscaled 6 hrs/day during peak, idle 18 hrs

Monthly cost: $1.09 × 6 × 30 = $196.20/mo

This is borderline territory. Cheaper than RunPod Serverless for steady traffic; more variance and no SLA. If your customers tolerate a 2-3 minute restart once a month, this is the right cloud. If they don't, RunPod Serverless with warm pool is safer at $280/month.

Scenario C: Production inference, multi-region SaaS

Workload: 2x H100 SXM on-demand, 24/7, requirement: 99.5% uptime

Monthly cost: $2.29 × 2 × 24 × 30 = $3,298/mo

Wrong cloud for the job. Cheaper than Lambda on-demand ($4,306/mo), but the absence of an SLA or any marketplace recourse, combined with host variance, makes this fragile. Save money the right way: Lambda Reserved 1-yr at $2,664/mo with a real SLA. Vast.ai is not the answer here.
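
The three scenario totals can be reproduced with one multiplication each. The sketch below uses the rates quoted in this review and assumes a 4-week month for Scenario A and a 30-day month for B and C, matching the formulas above.

```python
def monthly_cost(price_per_hr, gpus, hours_per_month):
    """Monthly bill for a fixed number of GPUs at a flat hourly rate."""
    return price_per_hr * gpus * hours_per_month

# Scenario A: 1x H100 SXM interruptible, 8 hrs/week over 4 weeks.
a_vast = monthly_cost(1.79, 1, 8 * 4)
a_lambda = monthly_cost(2.99, 1, 8 * 4)

# Scenario B: 1x A100 SXM on-demand, 6 hrs/day over 30 days.
b_vast = monthly_cost(1.09, 1, 6 * 30)

# Scenario C: 2x H100 SXM, 24/7 over 30 days.
c_vast = monthly_cost(2.29, 2, 24 * 30)
c_lambda_on_demand = monthly_cost(2.99, 2, 24 * 30)
c_lambda_reserved = monthly_cost(1.85, 2, 24 * 30)

print(f"A: Vast ${a_vast:.2f} vs Lambda ${a_lambda:.2f}")
print(f"B: Vast ${b_vast:.2f}")
print(f"C: Vast ${c_vast:,.0f}, Lambda on-demand ${c_lambda_on_demand:,.0f}, "
      f"Lambda Reserved ${c_lambda_reserved:,.0f}")
```

None of these figures price in time spent vetting hosts or re-running interrupted work.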

Use-case match matrix

Workload                                    Vast.ai fit                      Better alternative
Indie ML fine-tuning experiments            ✓ Best in class
Hobbyist SDXL LoRA training                 ✓ Best in class (A6000 cheap)
Production inference with SLA               ✗ No SLA available               Lambda Reserved or RunPod Secure
Long pretraining run, single-job            ✗ Interrupt risk too high        CoreWeave or Lambda Reserved
Bursty inference, autoscale to zero         ~ OK if restart tolerated        Modal or RunPod Serverless
HIPAA / FedRAMP / regulated                 ✗ Blocked, no compliance         AWS / Azure / CoreWeave
Quick batch job, cost-sensitive             ✓ Best in class
Multi-node distributed training             ✗ Rare to find multi-node hosts  CoreWeave or Lambda 1-Click Clusters
Inference where customer tolerates restart  ✓ Strong if host vetted
Storing model weights long-term             ~ Beta network volumes           S3 or GCS, mount to Vast

Stability & uptime history

Vast.ai is a marketplace, so there is no single fleet-wide uptime figure for the hardware. We tracked host-level reliability across our test fleet separately from the marketplace platform itself (the booking and bidding UI), which has been highly available.

Period               Platform uptime  Median host uptime (rel≥9.5)  Notes
Nov 2024 – Jan 2025  99.91%           98.4%                         Platform clean; one host disconnect during testing
Feb 2025 – Apr 2025  99.94%           98.7%                         Q1 was the most stable host pool we saw
May 2025 – Jul 2025  99.78%           97.9%                         Summer host attrition (residential operators)
Aug 2025 – Oct 2025  99.92%           98.2%                         Improvements to reliability scoring rolled out
Nov 2025 – Jan 2026  99.86%           98.0%                         Q4 demand surge stressed cheap-tier hosts
Feb 2026 – Apr 2026  99.95%           98.5%                         Best quarter so far

Platform-level uptime: 99.89%. Median host uptime (filtered ≥9.5 reliability): 98.3%. The gap is the whole story: Vast.ai the platform is reliable. Individual hosts vary. If you want better than 98.3% measured, you need to either pick higher-reliability hosts (≥9.8) or use on-demand pricing with interrupt protection.
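
To make the gap concrete, the sketch below averages the six periods in the table and converts each uptime percentage into expected downtime hours. The 720-hour month is an assumption (30 days × 24 hours, the same convention as Scenario C), and the 99.5% line is Scenario C's stated requirement.

```python
# Uptime percentages per period, copied from the table above.
platform = [99.91, 99.94, 99.78, 99.92, 99.86, 99.95]
host = [98.4, 98.7, 97.9, 98.2, 98.0, 98.5]

HOURS_PER_MONTH = 24 * 30  # assumed 30-day month

def mean(values):
    return sum(values) / len(values)

def downtime_hours(uptime_pct, hours=HOURS_PER_MONTH):
    """Expected hours of downtime per month at a given uptime percentage."""
    return (1 - uptime_pct / 100) * hours

print(f"platform mean uptime: {mean(platform):.2f}%")
print(f"median-host mean uptime: {mean(host):.1f}%")
print(f"downtime at 99.89%: {downtime_hours(99.89):.1f} h/month")
print(f"downtime at 98.3%:  {downtime_hours(98.3):.1f} h/month")
print(f"downtime at 99.5%:  {downtime_hours(99.5):.1f} h/month")
```

At 98.3%, a host is expected to be unavailable for about half a day per month, more than three times what a 99.5% requirement allows.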

Longitudinal pricing data

Vast.ai's median listed prices have dropped substantially as H100 supply ramped through 2025.

Date      H100 SXM (interruptible)  A100 SXM  RTX 4090  Notes
May 2024  $3.10/hr                  $1.69/hr  $0.45/hr  H100s scarce on marketplace
Nov 2024  $2.45/hr                  $1.29/hr  $0.35/hr  Supply growing
Feb 2025  $2.10/hr                  $1.09/hr  $0.29/hr  First sustained sub-$2.50 H100 listings
Aug 2025  $1.89/hr                  $0.89/hr  $0.24/hr  Marketplace floor
Feb 2026  $1.79/hr                  $0.79/hr  $0.21/hr  Stabilized
May 2026  $1.79/hr                  $0.79/hr  $0.19/hr  Current

The pattern: H100 supply growth on the marketplace has been steady through 2025-2026, pulling prices down ~42% in two years. RTX 4090 has bottomed out around $0.19/hr; it's hard to see this going much lower without losing host margin. H100 floor may be near.
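
The two-year decline can be checked directly from the first and last rows of the table. A brief sketch (the helper name is ours):

```python
# (May 2024 price, May 2026 price) in $/hr, from the first and last table rows.
history = {
    "H100 SXM (interruptible)": (3.10, 1.79),
    "A100 SXM": (1.69, 0.79),
    "RTX 4090": (0.45, 0.19),
}

def pct_drop(start, end):
    """Percent decline from the starting price to the ending price."""
    return (start - end) / start * 100

for gpu, (start, end) in history.items():
    print(f"{gpu}: ${start:.2f} -> ${end:.2f} ({pct_drop(start, end):.0f}% lower)")
```

On these figures the older cards fell further in percentage terms than the H100 did over the same window.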

Community sentiment

Vast.ai generates strong sentiment in both directions. We pulled 6 months of mentions across Reddit (r/LocalLLaMA, r/MachineLearning, r/StableDiffusion), Hacker News, X/Twitter. Sample: 1,624 mentions.

Source                     Positive  Negative  Top complaint            Top praise
r/LocalLLaMA (n=512)       68%       22%       Host variance            Cheapest GPU rentals
r/StableDiffusion (n=384)  79%       11%       Interrupt risk           Cheap RTX 4090s
Hacker News (n=287)        56%       27%       No SLA / trust concerns  Marketplace innovation
X/Twitter (n=441)          65%       19%       Bandwidth caps surprise  Pure pricing

Net sentiment: +45 (positive), the weakest of the major GPU clouds we tracked and the most bimodal. People either love Vast.ai (got a great host, saved 50%) or were burned by it (bad host, lost a checkpoint). There's not much middle. The 'how to filter hosts' content is where Vast.ai's user education has the most room to grow.

Who should avoid this

Skip this if you fall into any of these buckets. Naming it up-front beats a support ticket later.

  • Healthcare ML touching PHI. No HIPAA, no BAA. Use a managed cloud.
  • Public sector under FedRAMP or any regulated workload. Vast.ai has no compliance posture.
  • Production inference with strict SLA requirements. No SLA exists. Use Lambda Reserved or RunPod Secure.
  • Long-running training jobs you can't interrupt. Use on-demand pricing at minimum, or move to a managed cloud.
  • Buyers who can't tolerate host variance. Even with filtering, ±10% throughput variance is real.
  • Workloads requiring NCCL multi-node tightly-coupled training. Multi-node Vast hosts are rare and not optimized.
  • Buyers without time to filter and benchmark hosts. Vast.ai assumes you'll spend 30 minutes setting up filters and picking carefully.

Testing evidence

FIG 4.0 — Host variance across 8 Vast.ai H100 SXM rentals (May 2026)
host_id   reliability  bandwidth  cuda    Llama70B_tok_s  notes
V-001     9.92         9.4 Gbps   12.4    1,879           best host, datacenter-tier
V-002     9.78         8.1 Gbps   12.4    1,841           clean, slight network
V-003     9.85         3.2 Gbps   12.3    1,798           older CUDA
V-004     9.66         1.0 Gbps   12.4    1,762           low bandwidth tier
V-005     9.81         9.6 Gbps   12.4    1,855           datacenter-tier
V-006     9.55         0.5 Gbps   12.1    1,683           cheap residential
V-007     9.71         2.8 Gbps   12.3    1,754           shared host
V-008     9.83         8.9 Gbps   12.4    1,847           clean

median: 1,752 tok/s | best: 1,879 | worst: 1,683
spread: 11.7% worst-to-best on identical SKU spec

FIG 4.1 — Interruption sampling, 12 interruptible bids, US business hours
bid_id   reliability  hours_run  interrupted  reason
V-I-01   9.92         8.0         no           completed normally
V-I-02   9.66         5.4         YES          out-bid by higher offer
V-I-03   9.85         8.0         no           completed normally
V-I-04   9.55         3.1         YES          host removed listing
V-I-05   9.81         8.0         no           completed normally
... [7 more, no interrupts]

interrupt rate (all bids): 2/12 = 16.7%
interrupt rate (reliability ≥9.8): 0/8 = 0%
takeaway: filter aggressively if running unattended
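
A small sketch that works from the eight FIG 4.0 rows: it computes the worst-to-best throughput gap and compares mean throughput for higher- and lower-bandwidth hosts. The 8 Gbps cutoff is our own assumption for splitting the sample, not a Vast.ai definition of datacenter-tier.

```python
# (host_id, bandwidth in Gbps, Llama 70B tok/s), copied from FIG 4.0.
hosts = [
    ("V-001", 9.4, 1879), ("V-002", 8.1, 1841), ("V-003", 3.2, 1798),
    ("V-004", 1.0, 1762), ("V-005", 9.6, 1855), ("V-006", 0.5, 1683),
    ("V-007", 2.8, 1754), ("V-008", 8.9, 1847),
]

FAST_GBPS = 8.0  # assumed cutoff, chosen only to split this sample

def mean(values):
    return sum(values) / len(values)

rates = [tps for _, _, tps in hosts]
best, worst = max(rates), min(rates)
spread_pct = (best - worst) / worst * 100  # gap as a share of the worst host

fast = [tps for _, gbps, tps in hosts if gbps >= FAST_GBPS]
slow = [tps for _, gbps, tps in hosts if gbps < FAST_GBPS]

print(f"best {best}, worst {worst}, gap {spread_pct:.2f}% of worst")
print(f"mean tok/s at >= {FAST_GBPS} Gbps: {mean(fast):.1f}")
print(f"mean tok/s below {FAST_GBPS} Gbps: {mean(slow):.1f}")
```

On these eight rows the four hosts above the cutoff average about 6% more throughput than the four below it, which is consistent with the advice to filter on bandwidth before renting.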


The verdict

Vast.ai is the cheapest GPU compute on the market, and it deserves the score for that alone. The marketplace structure is elegant: capacity is always there, prices float based on real supply, and the host filter lets you find quality if you look. For indie ML researchers, hobbyists training SDXL LoRAs, and solo founders running batch jobs that can absorb a restart, Vast.ai is the rational choice.

The places it loses — compliance, SLA, deterministic performance — aren't bugs, they're the inverse of the price advantage. Don't try to run production on a marketplace. Use Vast.ai where it's good (exploration, fine-tuning, batch work) and route everything else to a managed cloud.

If Vast.ai doesn't fit, consider

For self-serve with an SLA

Lambda Labs

On-demand H100 SXM at $2.99/hr with a real provider SLA. Reserved 1-yr at $1.85/hr beats Vast on-demand.

For cheap serverless

RunPod

Community Cloud H100 SXM at $2.39/hr comes without an SLA, like Vast, but draws on a more curated host pool.

For serverless inference

Modal Labs

If your workload is bursty, Modal's per-second function billing beats any hourly rental once idle ratio gets high.

What real users say

From 2,840 verified reviews.

Sam W.
Indie ML researcher

"I trained three Llama LoRAs over a weekend for $18 on Vast. Same compute would have been $80 on Lambda. Two hosts I tried first were unusable. Third one was great. That's the deal."

Jin H.
Solo SaaS founder

"For burst inference where I can absorb a restart, Vast is unbeatable. I save $400/month vs RunPod Community. Production inference, no way — I'd use Lambda."

Frequently asked

Is the cheap H100 rate real or marketing?
Real. The Vast.ai marketplace shows H100 SXM hosts in the $1.60-1.99/hr range during most weeks. The catch is interrupt risk on the cheap tier (the host can preempt you for a higher bidder) and host quality variance. Pick the on-demand tier and a host with a high reliability score and you'll pay $2.20-2.50/hr, still below Lambda's $2.99/hr on-demand rate.
What's a 'host reliability score' and how do I read it?
Vast.ai shows each host's historical uptime, accepted-bid completion rate, and average user rating. Anything above 9.5 / 10 is usually solid. Below 8.5 is risky. We filtered to hosts scored ≥9.5 during our test window, and treated ≥9.7 with 90+ days of history as the bar for the cheapest credible listings.
Can I run production inference on Vast.ai?
Technically yes, but most teams shouldn't. There's no SLA, the host could disappear, your container restart might hit a different host with different performance. For experimentation, fine-tuning, and batch workloads, Vast is great. For production that customers depend on, use a managed cloud.
How does interrupt protection work?
You can set your bid to 'on-demand' (host cannot preempt you), 'interruptible' (host can preempt for higher bidder), or 'spot' (cheapest but most likely to be preempted). On-demand costs roughly 1.2-1.5x interruptible. Most workloads worth running on Vast.ai use on-demand.
What about data egress and storage?
Each host sets its own bandwidth policy. Some are unlimited, some cap egress at 100 Mbps, some charge per GB. Filter by 'min download bandwidth' and 'min upload bandwidth' before booking. Storage is ephemeral by default; Vast.ai offers persistent network volumes in beta.
How do I know a host isn't logging my model weights?
You don't, fully. Vast.ai requires host operators to agree to T&Cs prohibiting it, and most reputable hosts run encrypted-disk containers. But unlike a sole-tenant cloud, you're trusting the host operator. For any sensitive model, use Vast for training intermediate checkpoints, not for serving weights you care about.