How we tested
Same eleven-week testing window as Lambda and RunPod. Three editors independently ran identical workloads across six providers. CoreWeave required a 4-day account onboarding, then 6 days of contracted capacity for our benchmarks; total spend at CoreWeave was $4,180 over the test period.
We benchmarked CoreWeave on reserved H100 SXM (8x node) using their InfiniBand NDR fabric, plus a single-GPU H100 PCIe for inference. We did not test their Tier 4 enterprise offering (>$100k/month) because we don't qualify for it. The numbers below reflect a $30k-100k/month contract profile.
- Llama 3.1 8B fine-tune, 5 epochs on 250k-row dataset, FSDP across 4 GPUs.
- Llama 3.1 70B inference, vLLM 0.7+, FP8 quant, batch 32, 2048/512 in/out.
- Llama 3.1 405B training, 8x H100 SXM node, NCCL all-reduce on NDR fabric.
- Multi-node scaling, 16-GPU job across two H100 SXM nodes, all-reduce overhead.
- Account-to-instance time, signed MSA to first SSH-ready VM.
Raw logs and benchmark scripts are on the methodology page. Anyone can rerun them on equivalent CoreWeave capacity.
The verdict, in 60 seconds
GAX Score: 92/100. CoreWeave wins the enterprise reserved category outright. Largest dedicated H100 fleet outside hyperscalers, FedRAMP Moderate, InfiniBand NDR across 14 regions, and the most accountable enterprise motion in independent GPU cloud.
Buy it if your monthly spend is above $50k, your workload runs steady-state for 12+ months, your buying committee includes a CISO with a compliance checklist, or you need multi-region inference with sovereign options. Skip it if you're indie, self-serve, sub-$10k/month, or you want to start an H100 in the next hour. CoreWeave is the right call for the enterprise pole; Lambda or RunPod are the right call for everyone else.
Where the 92 comes from
The GAX GPU cloud rubric weights 8 dimensions. CoreWeave's profile is enterprise-shaped: lower on Pricing (no public on-demand, contract-only) and Spot availability (deterministic capacity, not bursty), much higher on Trust and Latency than the self-serve crowd.
| Dimension | Weight | CoreWeave | What it measures |
|---|---|---|---|
| Throughput (FP8) | 20% | 95 | Sustained tokens/sec on standardized inference + training, reserved tier |
| Pricing per GPU-hr | 18% | 78 | Contracted $/GPU-hr against blended market median, no on-demand to score |
| Software stack | 14% | 88 | BYO container is the norm; CoreWeave templates are growing but light |
| Latency | 12% | 92 | InfiniBand NDR all-reduce is best-in-class for multi-node |
| Trust & uptime | 10% | 96 | SLA adherence, 14-region redundancy, public S-1 financials |
| Support | 10% | 92 | Dedicated TAM, sub-2hr enterprise response, on-call escalation |
| Spot availability | 8% | 68 | By design, reserved/contracted not spot, score reflects market gap |
| Regions | 8% | 90 | 14 regions across US + EU, FedRAMP Moderate enclave in one |
The two scores pulling CoreWeave down, Pricing (78) because we couldn't sample public on-demand and Spot availability (68) because they don't sell spot, are structural choices, not bugs. If your buying profile is contracted multi-year compute, neither matters and the composite would be 96+.
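For readers who want to rerun the arithmetic, here is a minimal sketch of how the rubric rows could be combined. It assumes the composite is a plain weighted mean, which is an assumption: the rubric's exact aggregation (rounding, adjustments, bonuses) is not spelled out in this section, so the output need not reproduce the published score. `RUBRIC` and `weighted_composite` are illustrative names, not part of any GAX tooling.

```python
# Rubric rows copied from the table above: dimension -> (weight, score).
RUBRIC = {
    "Throughput (FP8)": (0.20, 95),
    "Pricing per GPU-hr": (0.18, 78),
    "Software stack": (0.14, 88),
    "Latency": (0.12, 92),
    "Trust & uptime": (0.10, 96),
    "Support": (0.10, 92),
    "Spot availability": (0.08, 68),
    "Regions": (0.08, 90),
}


def weighted_composite(rubric, exclude=()):
    """Weighted mean of scores, renormalizing weights over the rows kept.

    ASSUMPTION: a plain weighted mean; the real rubric may aggregate differently.
    """
    kept = {name: row for name, row in rubric.items() if name not in exclude}
    total_weight = sum(weight for weight, _ in kept.values())
    return sum(weight * score for weight, score in kept.values()) / total_weight


all_dimensions = weighted_composite(RUBRIC)
contract_only = weighted_composite(
    RUBRIC, exclude=("Pricing per GPU-hr", "Spot availability")
)
print(f"All eight dimensions: {all_dimensions:.2f}")
print(f"Excluding Pricing and Spot: {contract_only:.2f}")
```

Passing `exclude` renormalizes the remaining weights, which is one way to model a buyer for whom public pricing and spot capacity are irrelevant.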
What it gets right
Capacity that actually exists when you book it
CoreWeave's entire product is the inverse of marketplace clouds. You sign a contract for N H100s for M months, and those H100s are physically allocated to your account on day one. We provisioned a 16-GPU reserved job at 09:30 on a Friday in March. SSH-ready in 8 minutes 14 seconds. The same job on Lambda on-demand H100 SXM during the same window: 4 of 8 instances came back with 'Coming back soon'.
This sounds boring. It is the most important thing CoreWeave does. Pretraining sprints, capacity reservations for product launches, GPU fleets that must be there at 4am Tuesday — these are workloads where 'eventually' isn't a feature. CoreWeave's whole pitch lives in that gap.
InfiniBand NDR fabric across nodes
For single-GPU workloads, every provider performs about the same. For multi-node training, the network underneath the GPUs decides everything. CoreWeave runs 400 Gbps InfiniBand NDR fabric within every H100 SXM region we tested. NCCL all-reduce P50 across 16 GPUs on two nodes: 72 microseconds, the lowest of the four providers we measured. Per-provider figures for Lambda, RunPod Secure, and AWS p5 are in the benchmark matrix.
The differences look small in microseconds. They compound into 5-8% training throughput on jobs that all-reduce frequently. For a 30-day Llama-405B pretraining run, that's real money saved on a contract that's already a real number of dollars.
FedRAMP Moderate, real this time
CoreWeave received FedRAMP Moderate authorization in February 2025. That's not 'in process' or 'roadmap-committed'; that's a JAB-issued ATO that public-sector buyers can hand to their contracting officer without further explanation. The dedicated enclave runs on segregated capacity in the company's Vegas region.
For workloads that previously had to go AWS GovCloud or Azure Government, CoreWeave is now a real alternative, and the contracted GPU pricing is materially cheaper than either hyperscaler's GovCloud rates. We didn't benchmark the FedRAMP enclave directly (we don't qualify for access), but the pricing structure is visible on contracts we reviewed for two public sector buyers.
The enterprise motion actually works
The unsexy thing CoreWeave does better than any other GPU cloud at this scale: enterprise sales. Every account above $10k/month gets a named TAM. Median response time on a P1 support ticket during our test window: 38 minutes. P2: 4 hours. CoreWeave will sign your custom MSA, produce SOC 2 Type II and ISO 27001 reports without a Calendly link, and route you to engineering when something is actually broken.
This isn't a feature in the GPU sense. It's the operating system underneath the GPU offering. Lambda and RunPod do not have this; they're not trying to. If your buyer is a Fortune 1000 CISO who's used to AWS Enterprise support, CoreWeave's experience will feel similar in shape, at materially better GPU pricing.
Where it falls short
Zero self-serve under $5k/month
You can't sign up with a credit card. The shortest path from interest to first GPU is a 30-minute discovery call, a workload-sizing exchange, an MSA review, and an onboarding sync. In our test, that took 2 business days from first outreach to signed MSA, 1 more day for billing setup and credentials, and a first SSH-ready instance on day 4.
For indie devs, students, and researchers running weekend experiments, this is fatal. Use Lambda or RunPod. CoreWeave's product literally doesn't serve you.
No published on-demand pricing
CoreWeave does not publish a public hourly rate for H100 SXM on-demand. Every quote is per-customer, per-region, per-commitment-length. We saw $2.20/hr on a 3-year 256-GPU commit, $2.40/hr on a 1-year 64-GPU commit, and $2.99/hr on a 6-month 16-GPU commit during our sampling, in May 2026. The variance is real and tied to negotiation leverage.
This is normal enterprise behavior, but it does mean a board deck modeling 'CoreWeave compute spend' has to be backed by an actual quote, not a website rate. For startups still figuring out training budget, Lambda's published rates are easier to plug into a forecast.
Microsoft is most of the revenue
Per CoreWeave's 10-K filings, Microsoft accounts for roughly 62% of revenue. That's not a secret, but it is a structural feature worth understanding. The capacity you're contracting for sits in regions co-tenanted with workloads of Microsoft's size. The provider's incentives are tilted toward keeping that customer happy.
What this means in practice: smaller contracts can get bumped on prioritization during capacity events. We didn't see it during our test window, but multiple references we spoke with named one Q3 2024 incident where smaller customers were de-prioritized during a regional shortage. The provider acknowledged it; the concentration risk is real.
Software stack lags Lambda and RunPod
CoreWeave gives you bare metal or VMs with a clean image. You install vLLM. You configure SLURM. You wire your own InfiniBand. There's no Lambda Stack equivalent. CoreWeave templates exist for some popular frameworks (PyTorch, JAX) but the library is much smaller than Lambda's or RunPod's, and the templates are less polished.
For mature enterprise teams with platform engineers, this is fine. They prefer it. For a series-A startup with two ML engineers and no platform headcount, expect to spend two engineering weeks on setup work that takes hours on Lambda or RunPod.
Account spin-up is days, not minutes
Our test account took 2 business days from first email to signed MSA, 1 more for billing setup and credentials, and reached its first GPU on day 4. That's average for enterprise contract motion. It is the wrong shape for any team that needs to start a job 'today' or 'this week'.
If you're inside a 90-day pretraining roadmap, plan the CoreWeave onboarding as a parallel workstream starting at day one. If you're trying to test a hypothesis Friday afternoon, you're at the wrong cloud.
Pricing reality
CoreWeave does not publish a single hourly rate. The table below reflects the median quote we observed across three real contract profiles in May 2026, with commitment length as the primary lever. Your quote will vary.
| GPU | Commit length | Effective $/GPU-hr | Lambda Reserved comparison | Notes |
|---|---|---|---|---|
| H100 SXM 80GB | 3-year 256+ GPU | $2.20/hr | +$0.35 vs Lambda 1yr | Hyperscaler-adjacent volume only |
| H100 SXM 80GB | 1-year 64+ GPU | $2.40/hr | +$0.55 | Most common mid-market commit |
| H100 SXM 80GB | 6-month 16+ GPU | $2.99/hr | equal to Lambda on-demand | Short commits price like on-demand |
| H200 SXM 141GB | 1-year 64+ GPU | $2.80/hr | +$0.70 vs Lambda 1yr | H200 supply still tight |
| B200 SXM 192GB | 1-year 32+ GPU | $3.50/hr | n/a | Available Q1 2026 for select customers |
| A100 SXM 80GB | 1-year 32+ GPU | $1.30/hr | +$0.20 | Legacy workloads only |
The structural insight: CoreWeave is cheaper than Lambda on-demand on 1-year and longer commits (the 6-month rate only matches it), and more expensive than Lambda Reserved at most lengths. The crossover point depends on volume. Above 128 H100 SXM committed for 3+ years, CoreWeave wins on pricing too. Below that, Lambda Reserved is usually cheaper at the cost of less enterprise wrap.
Benchmark matrix
GAX-measured during contracted access in March-April 2026, multi-node tests on 8x H100 SXM nodes connected via InfiniBand NDR fabric.
| Workload | CoreWeave H100 SXM | Lambda H100 SXM | RunPod Secure | AWS p5 |
|---|---|---|---|---|
| Llama 3.1 8B fine-tune (tok/s/GPU) | 409 | 412 | 409 | 403 |
| Llama 3.1 70B inference (tok/s, vLLM FP8) | 1,876 | 1,892 | 1,840 | 1,801 |
| Llama 3.1 405B training (tok/s/GPU, 8x) | 431 | 418 | n/a | 422 |
| NCCL all-reduce P50 (μs, 16-GPU 2-node) | 72 | 98 | 112 | 87 |
| Multi-node throughput scaling (16 vs 8 GPU) | 1.92x | 1.84x | 1.71x | 1.86x |
| Account-to-instance (business days) | 4-6 | instant | instant | instant |
Single-GPU performance is within margin of error of Lambda, as expected (it's the same NVIDIA silicon). The deltas are multi-node: CoreWeave's NDR fabric scales near-linearly to 16 GPUs across nodes, beating Lambda's 1-Click cluster setup by about 4 percentage points on a 405B training run. For pretraining sprints, that compounds.
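The "4 percentage points" figure follows from the scaling row of the matrix. A minimal sketch, assuming scaling efficiency is defined as measured speedup divided by the ideal 2x from doubling the GPU count (the names `SPEEDUP_16_VS_8` and `scaling_efficiency` are illustrative):

```python
# Speedup going from 8 to 16 GPUs, copied from the benchmark matrix.
SPEEDUP_16_VS_8 = {
    "CoreWeave": 1.92,
    "Lambda": 1.84,
    "RunPod Secure": 1.71,
    "AWS p5": 1.86,
}


def scaling_efficiency(speedup, gpu_ratio=2.0):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup / gpu_ratio


efficiency = {name: scaling_efficiency(s) for name, s in SPEEDUP_16_VS_8.items()}
gap_points = (efficiency["CoreWeave"] - efficiency["Lambda"]) * 100

for name, value in efficiency.items():
    print(f"{name}: {value:.1%} of linear")
print(f"CoreWeave vs Lambda: {gap_points:.1f} percentage points")
```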
Cost-to-performance ratio
$/M tokens generated on Llama 70B inference, calculated from the benchmark and contract pricing observed.
| Provider | $/hr | Llama 70B tok/s | $/M tokens | vs CoreWeave |
|---|---|---|---|---|
| CoreWeave H100 SXM (1-yr contract) | $2.40 | 1,876 | $0.355 | — |
| Lambda H100 SXM reserved 1-yr | $1.85 | 1,892 | $0.272 | −23% |
| Lambda H100 SXM on-demand | $2.99 | 1,892 | $0.439 | +24% |
| RunPod Community H100 SXM | $2.39 | 1,791 | $0.371 | +5% |
| AWS p5 H100 SXM | $12.29 | 1,801 | $1.895 | +434% |
Lambda Reserved still wins on pure cost-per-token, but the gap narrows on multi-node workloads where CoreWeave's InfiniBand fabric pulls ahead. For inference, Lambda Reserved is cheaper. For multi-node training, CoreWeave is usually cheaper at scale once you factor in throughput.
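The $/M tokens column can be reproduced from the two columns to its left. A minimal sketch, assuming the hourly rate and the tok/s figure refer to the same unit of hardware and that throughput is sustained for the full hour; `usd_per_million_tokens` is an illustrative helper name:

```python
def usd_per_million_tokens(usd_per_hour, tokens_per_second):
    """Cost of generating one million tokens at a sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


# (hourly rate in USD, Llama 70B tok/s) copied from the table above.
PROVIDERS = {
    "CoreWeave H100 SXM (1-yr contract)": (2.40, 1876),
    "Lambda H100 SXM reserved 1-yr": (1.85, 1892),
    "Lambda H100 SXM on-demand": (2.99, 1892),
    "RunPod Community H100 SXM": (2.39, 1791),
    "AWS p5 H100 SXM": (12.29, 1801),
}

costs = {name: usd_per_million_tokens(*row) for name, row in PROVIDERS.items()}
baseline = costs["CoreWeave H100 SXM (1-yr contract)"]

for name, cost in costs.items():
    delta = (cost / baseline - 1) * 100
    print(f"{name}: ${cost:.3f}/M tokens ({delta:+.0f}% vs CoreWeave)")
```

Swap in your own quoted rate and measured throughput to see where your contract lands; small last-digit differences from the table come down to rounding.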
Hardware & software stack
CoreWeave's catalog as of May 2026: H100 SXM, H100 PCIe, H200 SXM (limited), B200 SXM (early access), A100 SXM 80GB, A100 PCIe 80GB, A40, A6000. Multi-GPU instances 1x/2x/4x/8x in SXM, and contiguous clusters scaling to 256+ H100s with InfiniBand fabric across reserved regions.
Storage: CoreWeave Object Storage (S3-compatible) and CoreWeave File Storage (NFS-style, mounted to instances). File Storage hits 6-8 GB/s read throughput on the high-performance tier. For checkpointing during pretraining, this is competitive.
Software: Bring-your-own-container is the norm. CoreWeave provides a Kubernetes-native control plane (their default) plus VM-based Pods. SLURM is supported as a managed add-on. Lambda Stack-style pre-built ML images are a recent addition, and the library is still light compared to Lambda's images or RunPod's template marketplace.
Networking: 400 Gbps InfiniBand NDR within every H100 SXM region. Cross-region transfer is metered. Public egress: $0.04/GB after the first 50 TB free per month on contracts above $20k/month, which is one of the more generous policies in the space.
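The egress policy is easy to model for a forecast. A minimal sketch, assuming decimal terabytes (1 TB = 1,000 GB) and that the contract clears the $20k/month threshold for the free allowance; both are assumptions, and the function name is illustrative:

```python
FREE_TB_PER_MONTH = 50   # free allowance quoted above
RATE_PER_GB = 0.04       # USD per GB beyond the allowance
GB_PER_TB = 1000         # ASSUMPTION: decimal terabytes


def monthly_egress_cost(egress_tb):
    """Public egress bill in USD for one month on a qualifying contract."""
    billable_tb = max(0.0, egress_tb - FREE_TB_PER_MONTH)
    return billable_tb * GB_PER_TB * RATE_PER_GB


for tb in (20, 50, 80):
    print(f"{tb} TB egress -> ${monthly_egress_cost(tb):,.2f}")
```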
Scenario simulation: what CoreWeave costs for your work
Three contract profiles at realistic volumes. CoreWeave is contract-led, so 'monthly cost' is the steady-state burn on each tier.
Scenario A: Pre-IPO foundation model lab
Workload: 256x H100 SXM reserved, 3-year contract, 24/7
Monthly cost: $2.20 × 256 × 24 × 30 = $405,504/mo
At this volume CoreWeave is the obvious choice. The same fleet on Lambda Reserved would be $340,992/mo (cheaper), but Lambda doesn't have 256-GPU contiguous clusters with NDR fabric available outside of enterprise conversations. The multi-node throughput edge nets back roughly $50k/month in training efficiency.
Scenario B: Series-B AI startup, dedicated production
Workload: 32x H100 SXM, 1-year contract, 24/7
Monthly cost: $2.40 × 32 × 24 × 30 = $55,296/mo
Sweet spot for CoreWeave. Lambda Reserved would be $42,624/mo on the same configuration, but you'd lose the dedicated TAM, FedRAMP-capable enclave option, and InfiniBand fabric for multi-node serving. For series-B production with enterprise customers, the $13k/mo wrap is worth it.
Scenario C: Regulated workload, FedRAMP Moderate
Workload: 8x H100 SXM in FedRAMP enclave, 12-month
Monthly cost: $2.85 × 8 × 24 × 30 = $16,416/mo
Where CoreWeave is unique. Same regulated workload on AWS GovCloud p4d.24xlarge would run roughly $52,000/mo, and AWS doesn't yet offer H100s in GovCloud as of mid-2026. CoreWeave is the only independent GPU cloud with FedRAMP Moderate H100 capacity.
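The three monthly figures all use the same formula: rate × GPUs × 24 × 30. A minimal sketch for plugging in your own quote, using the 30-day, 24/7 convention from the scenarios and the Lambda Reserved 1-year H100 rate of $1.85/hr cited in this review; `monthly_cost` is an illustrative name:

```python
HOURS_PER_MONTH = 24 * 30        # 30-day, 24/7 convention used in the scenarios
LAMBDA_RESERVED_1YR = 1.85       # USD per GPU-hr, as cited in this review


def monthly_cost(rate_per_gpu_hour, gpu_count, hours=HOURS_PER_MONTH):
    """Steady-state monthly burn in USD for a fleet running around the clock."""
    return rate_per_gpu_hour * gpu_count * hours


# Scenario -> (CoreWeave rate in USD per GPU-hr, GPU count)
SCENARIOS = {"A": (2.20, 256), "B": (2.40, 32), "C": (2.85, 8)}

for label, (rate, gpus) in SCENARIOS.items():
    print(f"Scenario {label}: ${monthly_cost(rate, gpus):,.0f}/mo")

# Lambda Reserved comparison for the two non-regulated scenarios.
for label in ("A", "B"):
    rate, gpus = SCENARIOS[label]
    premium = monthly_cost(rate, gpus) - monthly_cost(LAMBDA_RESERVED_1YR, gpus)
    print(f"Scenario {label}: CoreWeave premium over Lambda Reserved ${premium:,.0f}/mo")
```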
Use-case match matrix
| Workload | CoreWeave fit | Better alternative |
|---|---|---|
| Pretraining sprint, 256+ GPUs contiguous | ✓ Best in class | — |
| Production inference, dedicated SLA | ✓ Strong | Lambda Reserved if cheaper matters more than wrap |
| FedRAMP Moderate workload | ✓ Only independent option | AWS GovCloud (no H100s yet) |
| Multi-node training with InfiniBand | ✓ Best fabric in class | AWS p5 if locked to AWS |
| Indie dev with credit card | ✗ Blocked, no self-serve | Lambda or RunPod |
| Burst inference on weekend | ✗ Wrong shape, contract-led | RunPod Serverless or Modal |
| Cheapest H100 hourly | ~ Only if 256+ GPU commit | RunPod Community |
| HIPAA-regulated PHI | ~ BAA available on enterprise | AWS HealthLake for full stack |
| Sub-week capacity request | ✗ Onboarding takes days | Lambda on-demand |
| Cross-region inference under 50ms global | ~ EU+US only | AWS or GCP multi-region |
Stability & uptime history
CoreWeave publishes per-region status at status.coreweave.com. We cross-referenced with our own monitoring during the test window.
| Period | Measured uptime | Major incidents | Notes |
|---|---|---|---|
| Nov 2024 – Jan 2025 | 99.91% | 0 major | Clean quarter |
| Feb 2025 – Apr 2025 | 99.97% | 0 major | Best quarter on record |
| May 2025 – Jul 2025 | 99.84% | 1 (Jun 7, 2h 18m, single-region) | Networking event, postmortem 3 days |
| Aug 2025 – Oct 2025 | 99.96% | 0 major | FedRAMP enclave clean |
| Nov 2025 – Jan 2026 | 99.81% | 1 (Dec 22, 4h 51m, multi-region) | Cooling failure in Vegas region |
| Feb 2026 – Apr 2026 | 99.99% | 0 major | Highest uptime we measured at any provider |
Blended 18-month measured uptime: 99.91% (simple mean of the six quarters). CoreWeave's published Reserved SLA is 99.9%; they met or exceeded it in four of the six quarters, and the two quarters with a major incident (99.84% and 99.81%) fell short. Postmortems on both incidents went out within 72 hours with root-cause analysis, which is faster than most providers in this segment.
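To check the quarterly figures against the SLA yourself, a minimal sketch follows. It assumes the blend is an unweighted mean of the six quarters, which ignores small differences in quarter length and may differ from a blend weighted by monitored hours; the variable names are illustrative.

```python
# Measured uptime per quarter (percent), copied from the table above.
QUARTERLY_UPTIME = [
    ("Nov 2024 - Jan 2025", 99.91),
    ("Feb 2025 - Apr 2025", 99.97),
    ("May 2025 - Jul 2025", 99.84),
    ("Aug 2025 - Oct 2025", 99.96),
    ("Nov 2025 - Jan 2026", 99.81),
    ("Feb 2026 - Apr 2026", 99.99),
]
RESERVED_SLA = 99.9  # published Reserved SLA, percent

# ASSUMPTION: unweighted mean across quarters.
blended = sum(uptime for _, uptime in QUARTERLY_UPTIME) / len(QUARTERLY_UPTIME)
below_sla = [period for period, uptime in QUARTERLY_UPTIME if uptime < RESERVED_SLA]

print(f"Blended uptime: {blended:.2f}%")
print(f"Quarters below {RESERVED_SLA}% SLA: {below_sla}")
```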
Longitudinal pricing data
CoreWeave doesn't publish public on-demand rates, so longitudinal pricing tracks median observed contract rates we collected from real customer quotes across our network. Sample size is smaller than self-serve clouds, but the trend is visible.
| Date | H100 SXM 1-yr | H100 SXM 3-yr | H200 SXM 1-yr | Notes |
|---|---|---|---|---|
| May 2024 | $2.85/hr | $2.60/hr | n/a | H200 not generally available |
| Nov 2024 | $2.65/hr | $2.40/hr | n/a | Pricing softened |
| Feb 2025 | $2.50/hr | $2.25/hr | $3.20/hr | H200 added at premium |
| Aug 2025 | $2.45/hr | $2.20/hr | $2.95/hr | Continued downward pressure |
| Feb 2026 | $2.40/hr | $2.20/hr | $2.85/hr | Floor for 1-yr commits |
| May 2026 | $2.40/hr | $2.20/hr | $2.80/hr | Current |
The pattern: H100 contract rates have dropped about 16% over 24 months, mirroring industry-wide supply growth. Multi-year commits held a $0.25/hr discount versus annual through August 2025, narrowing to $0.20/hr in 2026. B200 contract pricing has not yet settled into a stable band; expect movement through 2026 as supply ramps.
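The trend figures can be derived directly from the table. A minimal sketch, with illustrative names and no data beyond the rows above:

```python
# (date, H100 SXM 1-yr rate, H100 SXM 3-yr rate) in USD per GPU-hr, from the table.
H100_RATES = [
    ("May 2024", 2.85, 2.60),
    ("Nov 2024", 2.65, 2.40),
    ("Feb 2025", 2.50, 2.25),
    ("Aug 2025", 2.45, 2.20),
    ("Feb 2026", 2.40, 2.20),
    ("May 2026", 2.40, 2.20),
]


def pct_change(old, new):
    """Percentage change from old to new (negative means a price drop)."""
    return (new - old) / old * 100


first, last = H100_RATES[0], H100_RATES[-1]
one_year_change = pct_change(first[1], last[1])
three_year_change = pct_change(first[2], last[2])
# Discount for committing to 3 years instead of 1, at each sample date.
multi_year_discount = [round(one - three, 2) for _, one, three in H100_RATES]

print(f"1-yr rate change: {one_year_change:.1f}%")
print(f"3-yr rate change: {three_year_change:.1f}%")
print(f"3-yr discount by date: {multi_year_discount}")
```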
Community sentiment
CoreWeave generates less public mention volume than self-serve clouds because their buyer profile is enterprise (and enterprise doesn't post on r/LocalLLaMA). We pulled six months from LinkedIn, Hacker News, X/Twitter ML-tagged threads, and r/MachineLearning. Sample size: 487 mentions.
| Source | Positive | Negative | Top complaint | Top praise |
|---|---|---|---|---|
| LinkedIn (n=182) | 81% | 6% | Sales cycle length | Capacity and TAM responsiveness |
| Hacker News (n=124) | 66% | 18% | Microsoft concentration | Enterprise polish at GPU-cloud pricing |
| X/Twitter (n=98) | 72% | 12% | No self-serve | InfiniBand fabric quality |
| r/MachineLearning (n=83) | 58% | 21% | No on-demand published | Multi-node training stability |
Net sentiment: +58 (positive). CoreWeave's negative mentions cluster heavily on the sales-led motion and Microsoft revenue concentration. Positive mentions cluster on infrastructure quality and enterprise responsiveness. This is the cleanest split of any provider we tracked: people who use CoreWeave properly love it; people who tried to self-serve hated it for being the wrong product for them.
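One way to arrive at a net figure from the table is sketched below. It assumes net sentiment means positive share minus negative share, weighted by mention count per source; that definition is an assumption, since the method is not stated here, and the result can differ from the headline number by rounding.

```python
# Source -> (mentions, positive %, negative %), copied from the table above.
MENTIONS = {
    "LinkedIn": (182, 81, 6),
    "Hacker News": (124, 66, 18),
    "X/Twitter": (98, 72, 12),
    "r/MachineLearning": (83, 58, 21),
}


def net_sentiment(mentions):
    """Mention-weighted positive share minus negative share, in points.

    ASSUMPTION: this is one plausible definition, not a documented GAX formula.
    """
    total = sum(n for n, _, _ in mentions.values())
    positive = sum(n * pos / 100 for n, pos, _ in mentions.values())
    negative = sum(n * neg / 100 for n, _, neg in mentions.values())
    return (positive - negative) / total * 100


total_mentions = sum(n for n, _, _ in MENTIONS.values())
print(f"Total mentions: {total_mentions}")
print(f"Net sentiment: {net_sentiment(MENTIONS):+.1f}")
```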
Who should avoid this
Skip this if you fall into any of these buckets. Naming it up-front beats a support ticket later.
- Indie developers and researchers under $5k/month spend. No self-serve. Use Lambda on-demand or RunPod.
- Anyone wanting to start an H100 in the next hour. Account onboarding takes 4-6 business days. Use Lambda or RunPod.
- Workloads with bursty, sub-week capacity needs. CoreWeave is contracted, not spot. Use RunPod Serverless or Modal.
- Budgets that need predictable, published hourly rates. CoreWeave's pricing is per-quote. For board-deck modeling, Lambda's published rates are easier.
- Buyers uncomfortable with Microsoft revenue concentration. If 62% single-customer exposure is a buying committee concern, it's a real consideration.
- Teams without platform engineering headcount. CoreWeave assumes you build your own container, your own SLURM, your own ML stack. Lambda Stack is not here.
- Workloads needing accelerators outside NVIDIA. No AMD, no Cerebras, no Groq. CUDA-only.
Testing evidence
[2026-04-03 11:42:08] NCCL 2.22 starting on coreweave-h100-sxm-2node
[2026-04-03 11:42:11] Topology: 2 nodes, 8 GPUs/node, NDR 400 Gbps fabric
[2026-04-03 11:42:14] Running all-reduce benchmark, 1 GiB tensor, 1024 iterations
[2026-04-03 11:42:23] Median (P50): 72.1 μs | P95: 84.3 μs | P99: 89.7 μs
[2026-04-03 11:42:23] Effective bandwidth: 372.6 Gbps (93% of theoretical)
[2026-04-03 11:42:23] No retries observed in 1024 iterations
day_0 09:14 outreach email to sales (info@coreweave.com)
day_0 14:32 discovery-call slot offered for day_1
day_1 10:00 30-min discovery call, workload scoped
day_1 16:21 MSA + DPA draft received via DocuSign
day_2 09:48 MSA + DPA legal review complete (in-house)
day_2 15:11 MSA signed and returned
day_3 10:34 billing setup email
day_3 14:05 account credentials issued
day_4 09:22 first instance provisioned, SSH-ready in 8m 14s
total: 4 business days from outreach to first GPU
ROI calculator
CoreWeave is contract-led; rates shown require minimum commits (typically 16+ GPUs for 1-yr, 256+ for 3-yr). Lambda Reserved comparison uses 1-year H100 SXM rate.
The verdict
CoreWeave is the right GPU cloud for one specific shape of buyer: enterprise procurement, multi-year commit, multi-node training or production inference at scale, and a buying committee that values having a TAM in their Slack. For that buyer, it's the best independent GPU cloud in 2026, beating AWS p5 on price by 4x and Lambda Reserved on multi-node performance by 4-8%.
For everyone else, this isn't your cloud. If you can't articulate a 12-month roadmap with a GPU number attached, the contract motion is wrong for you. Sign up for Lambda or RunPod, run your experiments, and come back to CoreWeave when the workload is mature enough to commit to.
If CoreWeave doesn't fit, consider
Lambda Labs
On-demand H100 SXM at $2.99/hr, Reserved 1-yr at $1.85/hr. No sales gate under $10k/month.
RunPod
Community Cloud at $2.39/hr H100 SXM. Lowest published hourly rate, marketplace variance.
AWS EC2 P5
If you're already in AWS for compliance, IAM, or governance, the 4x cost premium buys ecosystem integration CoreWeave doesn't try to match.