Item: AWS EC2 P5
Rating: 85
Author: GAX Online

AWS EC2 P5 is the most expensive H100 in this comparison by a factor of four. It also runs in 25+ regions, holds every meaningful compliance certification, integrates with the rest of the AWS console, and has the kind of capacity reservation product (Capacity Blocks for ML) that lets you book GPUs months in advance with contractual guarantees. For most workloads it's the wrong call. For some, it's the only call.

How we tested

Same testing window. AWS testing required spinning up a real AWS account from scratch, configuring VPC + IAM + security groups for a production-shape deployment, and running p5.48xlarge for the benchmark window. Total spend at AWS: $5,840 (the highest of any provider, as expected).

We also tested Capacity Blocks for ML, booking 8x H100 SXM for 24 hours to measure the reservation flow. Account creation to first instance: 6 minutes after VPC setup, plus 3 days of upfront account setup before that.

Llama 3.1 8B fine-tune, same dataset, FSDP across 4 GPUs.
Llama 3.1 70B inference, vLLM 0.7+, FP8, batch 32.
Multi-region inference, deployed in us-east-1 + eu-west-1 + ap-southeast-1, P95 latency measured globally.
Capacity Blocks for ML, reservation flow + utilization.
Spot interruption rate, p5.48xlarge spot sampled across 48 hours.

The verdict, in 60 seconds

GAX Score: 85/100. AWS EC2 P5 wins on Trust (99), Regions (98), Support (96), and Latency (96). It loses badly on Pricing (60): the 4x premium over Lambda on-demand is the structural disadvantage.

Buy it if your buying committee already requires AWS, you need FedRAMP High / HIPAA / GovCloud, you serve customers in 5+ regions with sub-100ms latency requirements, or your ML workloads must integrate with deep AWS infrastructure (IAM, VPC, KMS, Bedrock). Skip it if you're cost-sensitive, your compliance posture allows independent clouds, or you're below ~$20k/month and don't need the hyperscaler wrap.

Where the 85 comes from

AWS scores at both extremes. Trust, Regions, Support, Latency all hit the 90s. Pricing sits at 60, the lowest score on Pricing of any provider we measured. This is structural: AWS isn't trying to compete with Lambda on $/GPU-hr, they're selling the hyperscaler bundle.

Dimension	Weight	AWS EC2 P5	What it measures
Throughput (FP8)	20%	92	Nitro hypervisor adds ~3% overhead vs bare metal, otherwise same H100 silicon
Pricing per GPU-hr	18%	60	$12.29/hr effective vs $2.99 on Lambda; lowest score on Pricing in this segment
Software stack	14%	90	SageMaker, Bedrock, JumpStart, Deep Learning AMIs, thorough but complex
Latency	12%	96	25+ regions globally; only provider where multi-region inference under 50ms is real
Trust & uptime	10%	99	99.99% historical p5 SLA, hyperscaler-grade incident response
Support	10%	96	Enterprise Support with named TAM available at all real spend levels
Spot availability	8%	86	Capacity Blocks for ML covers the planned-capacity gap; on-demand p5 spotty in popular regions
Regions	8%	98	25+ regions, only provider with meaningful global GPU coverage

The Pricing score of 60 is the structural feature, not a bug. AWS is pricing for buyers who value the rest of the platform more than $/GPU-hr. If you're paying a 4x premium just to run inference, you're using AWS wrong.

What it gets right

Compliance and regions, the structural moat

AWS holds FedRAMP High, HIPAA, SOC 2 Type II, ISO 27001, PCI DSS, and roughly 100 other compliance attestations. For workloads bound by regulatory requirements (federal contracts, healthcare PHI, financial services), AWS is often the only cloud that already has the paperwork. Independent GPU clouds are catching up on FedRAMP Moderate (CoreWeave got it in 2025), but FedRAMP High and most niche frameworks remain AWS-only.

Add 25+ regions globally and you get a structural moat. No other GPU cloud serves Tokyo, Sydney, São Paulo, and Frankfurt with sub-50ms latency on H100 silicon today. For customer-facing AI products with global users, this is the gap that justifies the price premium.

Capacity Blocks for ML solves the reservation problem

Pretraining and large training sprints need contiguous GPU blocks for fixed time windows. AWS Capacity Blocks for ML lets you reserve up to 512 H100s for a specified period, with contractual capacity guarantees. Lambda has 1-Click Clusters (similar product but smaller scale), CoreWeave has enterprise reservations, but Capacity Blocks have AWS's region footprint and SLA backing.

For a 30-day pretraining run with hard deadline constraints, paying the AWS premium on guaranteed capacity is often cheaper than a Lambda capacity surprise mid-sprint. Risk-adjusted, the math is closer than the sticker prices suggest.

The rest of AWS is on the same bill

When your training pipeline pulls data from S3, writes checkpoints to EBS, monitors via CloudWatch, scales via SageMaker, and authenticates through IAM, staying inside AWS means zero data egress costs, zero VPC peering complexity, and one billing relationship. Moving 100 TB of training data out of S3 to a different cloud is itself a five-figure egress bill.

For organizations already deep in AWS, the marginal cost of p5 vs Lambda is actually lower than the sticker delta because the egress and integration costs disappear. This is the calculation that keeps enterprise ML inside AWS even when independent clouds look attractive on raw GPU pricing.

Enterprise support that actually responds

AWS Enterprise Support comes with a named Technical Account Manager, 15-minute response on P1 tickets, architectural reviews, and direct escalation to AWS service teams. We tested with a P2 ticket during the benchmark window: 22-minute first response, full resolution in 4 hours. Lambda's enterprise tier is improving but still doesn't match this response time profile.

For mission-critical production ML where downtime costs more than the GPU bill, the support delta is part of what you're paying for. Not all teams need it. Teams that do won't find it anywhere else at this maturity level.

Where it falls short

The 4x price premium on raw GPU is the headline

p5.48xlarge on-demand: $98.32/hr for 8x H100 SXM. Lambda H100 SXM on-demand: $2.99/hr per GPU. Per-GPU effective: AWS $12.29 vs Lambda $2.99. The 4x premium is real and pre-discount.

Savings Plans + 3-year Reserved bring AWS p5 down to roughly $40-45/hr for the 8-GPU node, or $5.00-5.60 per GPU per hour. Still 2.5-3x Lambda Reserved. The premium narrows but never disappears. If you can run on Lambda or CoreWeave, you're leaving 50-70% of your GPU bill on the AWS table.
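The per-GPU arithmetic above is easy to reproduce. Below is a minimal Python sketch using only rates quoted in this review (the Lambda Reserved 1-yr rate of $1.85 comes from the cost-to-performance table). The function names are ours, chosen for illustration.

```python
GPUS_PER_NODE = 8  # p5.48xlarge carries 8x H100 SXM


def per_gpu_rate(node_rate_per_hr: float, gpus: int = GPUS_PER_NODE) -> float:
    """Effective $/GPU-hr from a whole-node hourly price."""
    return node_rate_per_hr / gpus


def premium_multiple(aws_gpu_rate: float, other_gpu_rate: float) -> float:
    """How many times more expensive AWS is per GPU-hour."""
    return aws_gpu_rate / other_gpu_rate


aws_od = per_gpu_rate(98.32)   # on-demand node price from this review
lambda_od = 2.99               # Lambda H100 SXM on-demand, per GPU
lambda_reserved = 1.85         # Lambda Reserved 1-yr, per GPU

print(f"AWS on-demand: ${aws_od:.2f}/GPU-hr, "
      f"{premium_multiple(aws_od, lambda_od):.1f}x Lambda on-demand")

# Discounted node range quoted above ($40-45/hr for the 8-GPU node).
for node_rate in (40.0, 45.0):
    gpu_rate = per_gpu_rate(node_rate)
    print(f"${node_rate:.0f}/hr node: ${gpu_rate:.2f}/GPU-hr, "
          f"{premium_multiple(gpu_rate, lambda_reserved):.1f}x Lambda Reserved")
```

Swap in your own negotiated node price to see where your premium lands.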

p5 capacity is often unavailable in popular regions

us-east-1 (Northern Virginia) is AWS's busiest region and often shows InsufficientInstanceCapacity errors for p5.48xlarge during business hours. Our sampling: 8 of 24 launch attempts during weekday US business hours returned the capacity error. Same SKU in us-west-2 (Oregon) was available 23 of 24 attempts.

The fix is Capacity Blocks for ML or planning around capacity. The frustration is that 'AWS has every GPU' isn't quite true at the on-demand tier in the regions most teams want.
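"Planning around capacity" usually means trying a second region when the first returns the capacity error. Below is a rough Python sketch of that fallback. `LaunchError` and the `try_launch` callable are hypothetical stand-ins for whatever wraps your real EC2 launch call; only the error code string comes from our testing.

```python
from typing import Callable, Optional, Sequence, Tuple

CAPACITY_ERROR = "InsufficientInstanceCapacity"  # error code seen in our sampling


class LaunchError(Exception):
    """Hypothetical wrapper carrying the EC2 error code of a failed launch."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def launch_with_fallback(
    regions: Sequence[str],
    try_launch: Callable[[str], str],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each region in order; return (region, instance_id) on first success.

    try_launch is a hypothetical callable that launches p5.48xlarge in the
    given region and raises LaunchError on failure.
    """
    for region in regions:
        try:
            return region, try_launch(region)
        except LaunchError as err:
            if err.code != CAPACITY_ERROR:
                raise  # only capacity errors justify moving to another region
    return None, None


def availability(successes: int, attempts: int) -> float:
    """Share of launch attempts that succeeded."""
    return successes / attempts


# Our sampling: 16 of 24 attempts succeeded in us-east-1, 23 of 24 in us-west-2.
print(f"us-east-1: {availability(16, 24):.0%}, us-west-2: {availability(23, 24):.0%}")
```

Ordering the region list by measured availability rather than by habit is the cheap win here. Keep in mind that moving regions also moves your data.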

Pricing structure is genuinely complex

p5 has on-demand, 1-year Reserved, 3-year Reserved, Compute Savings Plans, EC2 Instance Savings Plans, Capacity Blocks for ML, Spot, and Spot with Spot Capacity Reservations. Each has different commitment terms, discount levels, and operational implications. Modeling 'what will AWS p5 cost me' takes a real spreadsheet, not a calculator on a webpage.

For finance teams trying to forecast ML compute spend, this is a real source of friction. Lambda's published rates are easier to plug into a model. AWS's optionality is a feature for sophisticated buyers and a bug for everyone else.

Console UX assumes you're already a customer

Launching p5 from a fresh AWS account requires VPC configuration, security groups, IAM role setup, EBS attachment decisions, AMI selection, and roughly 15 other choices that a Lambda user makes in zero clicks. From scratch, expect 3-4 hours of setup before your first training job runs.

If your team is already AWS-fluent, this is invisible; it's just how AWS works. If you're coming from Lambda, the UX feels like an enormous step backward. AWS Deep Learning AMIs help, but the initial setup overhead is real.

On-demand AMIs lag mainstream framework releases

AWS Deep Learning AMI versions tend to be 2-3 framework releases behind. We launched a Deep Learning AMI in March 2026 and got PyTorch 2.2 when current upstream was 2.5, and CUDA 12.1 when current was 12.4. You can patch up, but the time cost is real.

Lambda Stack ships fresh framework versions within days of upstream. AWS's bias is toward stability, which matters for enterprise but slows experimentation. For research workloads, this is friction.
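A quick way to see how far behind a given image is: print the framework versions it ships before you start a job. The sketch below is a generic Python check, not an AWS tool, and the default module list is our assumption about what you care about.

```python
import importlib
from typing import Dict, Optional, Sequence


def framework_versions(
    names: Sequence[str] = ("torch", "tensorflow", "jax"),
) -> Dict[str, Optional[str]]:
    """Return {module name: version string, or None if not installed}."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


def cuda_build_version() -> Optional[str]:
    """CUDA version PyTorch was built against, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda  # None on CPU-only builds


if __name__ == "__main__":
    for name, version in framework_versions().items():
        print(f"{name}: {version or 'not installed'}")
    print(f"CUDA (PyTorch build): {cuda_build_version() or 'n/a'}")
```

Run it once on a fresh instance and compare against upstream release notes before deciding whether to patch up or pick a different AMI.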

Pricing reality

p5 pricing rendered six ways, from on-demand through Reserved, Capacity Blocks for ML, Spot, and the GovCloud p4d equivalent. All figures are effective per-GPU per-hour rates after dividing the 8-GPU node price.

Pricing tier	p5.48xlarge ($/hr)	Effective $/GPU-hr	Lambda comparison	Notes
On-demand	$98.32	$12.29	+311% vs Lambda OD	Headline rate
1-yr Reserved (Compute Savings)	$58.96	$7.37	+147% vs Lambda OD	Most common enterprise tier
3-yr Reserved	$40.42	$5.05	+69% vs Lambda OD	Cheapest committed AWS tier
Capacity Block (14 days)	$78.66	$9.83	+229% vs Lambda OD	Guaranteed capacity premium
Spot (region-dependent)	$26-35	$3.25-4.38	+9-46% vs Lambda OD	2-min interruption notice
GovCloud p4d.24xlarge equiv	$32.77	$4.10	+37% vs Lambda OD (no GovCloud)	A100 SXM in GovCloud

The Spot tier deserves attention. p5 spot at $26-35/hr for 8 GPUs lands within 9-46% of Lambda on-demand per GPU, in exchange for a 2-minute interruption notice. For training jobs that checkpoint aggressively, this is the cheapest way to use AWS p5 capacity. For production inference, spot is the wrong tier.
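Checkpointing aggressively means reacting to the interruption notice. Below is a minimal Python sketch that polls the EC2 instance metadata service (IMDSv2) for a pending spot action. The metadata paths are the standard EC2 ones; how you wire the check into a training loop, and the function names, are our assumptions.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw spot instance-action JSON, or None if no notice is pending."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        action_req = urllib.request.Request(
            f"{IMDS}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(action_req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # 404 means no interruption is scheduled
            return None
        raise


def should_checkpoint_now(
    fetch: Callable[[], Optional[str]] = fetch_instance_action,
) -> bool:
    """True when EC2 has scheduled a stop, hibernate, or terminate action."""
    body = fetch()
    if body is None:
        return False
    return json.loads(body).get("action") in {"stop", "hibernate", "terminate"}
```

Call `should_checkpoint_now()` every few seconds from a side thread or between training steps, and write a checkpoint as soon as it returns True. The `fetch` argument is there so the logic can be tested off-instance.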

Benchmark matrix

GAX-measured. AWS p5.48xlarge in us-west-2 vs equivalent SKUs on independent clouds.

Workload	AWS p5 H100 SXM	Lambda H100 SXM	CoreWeave H100 SXM	Notes
Llama 3.1 8B fine-tune (tok/s/GPU)	403	412	409	Nitro hypervisor ~3% overhead
Llama 3.1 70B inference (tok/s, vLLM FP8)	1,801	1,892	1,876	Same gap, same cause
Llama 3.1 405B training (tok/s/GPU, 8x)	422	418	431	CoreWeave NDR fabric edge
NCCL all-reduce P50 (μs, 4-GPU)	81	78	72	EFA fabric solid but second tier
SSH-ready latency (s)	374	52	contract-led	6+ minute startup
Multi-region inference P95 (ms, US→APAC)	118	410 (no APAC region)	410	Only AWS has APAC H100

Per-GPU throughput trails Lambda by 2-5% on the 8B fine-tune and 70B inference rows, mostly Nitro hypervisor overhead; on the 405B training row AWS is roughly level with Lambda and slightly behind CoreWeave. The unique numbers are the bottom two: provisioning takes 7x longer than Lambda, and multi-region inference is only practical on AWS because nobody else has the global footprint. For workloads where global serving is the constraint, the throughput delta becomes irrelevant.
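To check the gaps yourself, here is a short Python sketch over the AWS and Lambda columns of the matrix. The helper name is ours.

```python
def relative_gap(aws: float, other: float) -> float:
    """Percent by which AWS differs from another provider (negative = AWS lower)."""
    return (aws - other) / other * 100


rows = {  # (AWS p5, Lambda) throughput from the benchmark matrix
    "Llama 3.1 8B fine-tune": (403, 412),
    "Llama 3.1 70B inference": (1801, 1892),
    "Llama 3.1 405B training": (422, 418),
}

for name, (aws, lam) in rows.items():
    print(f"{name}: {relative_gap(aws, lam):+.1f}% vs Lambda")

# SSH-ready latency row: 374s on AWS vs 52s on Lambda.
print(f"SSH-ready: {374 / 52:.1f}x Lambda's startup time")
```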

Cost-to-performance ratio

$/M tokens on Llama 70B inference, AWS tiers compared.

Provider / tier	$/hr	tok/s	$/M tokens	vs Lambda Reserved
AWS p5 on-demand	$12.29	1,801	$1.895	+597%
AWS p5 Reserved 1-yr	$7.37	1,801	$1.137	+318%
AWS p5 Reserved 3-yr	$5.05	1,801	$0.779	+187%
AWS p5 Spot (median)	$3.81	1,801	$0.588	+116%
Lambda Reserved 1-yr	$1.85	1,892	$0.272	baseline

Even AWS's cheapest committed tier (3-year Reserved) is 2.9x more expensive per token than Lambda Reserved. Spot brings it to 2.2x. The gap never closes meaningfully. AWS p5 economics make sense for workloads where compliance, regions, or AWS-ecosystem integration justifies the premium, not for cost-optimized inference.
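The $/M tokens column is the hourly rate divided by tokens produced per hour. Below is a minimal Python sketch that mirrors the table's own pairing of $/hr and tok/s. The function name is ours, and sustained throughput at the benchmarked rate is an assumption.

```python
def cost_per_million_tokens(rate_per_hr: float, tokens_per_sec: float) -> float:
    """Dollars per one million tokens at a steady throughput."""
    tokens_per_hr = tokens_per_sec * 3600
    return rate_per_hr / tokens_per_hr * 1_000_000


tiers = {  # ($/hr, tok/s) as listed in the table above
    "AWS p5 on-demand": (12.29, 1801),
    "AWS p5 Reserved 1-yr": (7.37, 1801),
    "AWS p5 Reserved 3-yr": (5.05, 1801),
    "AWS p5 Spot (median)": (3.81, 1801),
    "Lambda Reserved 1-yr": (1.85, 1892),
}

baseline = cost_per_million_tokens(*tiers["Lambda Reserved 1-yr"])
for name, (rate, tps) in tiers.items():
    cost = cost_per_million_tokens(rate, tps)
    print(f"{name}: ${cost:.3f}/M tokens, {cost / baseline:.1f}x Lambda Reserved")
```

Real serving rarely holds benchmark throughput around the clock, so treat these as floor prices per token.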

Hardware & software stack

AWS p5 family: p5.48xlarge (8x H100 SXM 80GB), p5e.48xlarge (8x H200 SXM 141GB), p5en.48xlarge (8x H200 SXM with enhanced networking). p4d/p4de family still active for A100 workloads. Trainium 2 (trn2.48xlarge) for AWS Neuron-optimized training, Inferentia 2 for hosted inference.

Networking: 3,200 Gbps EFA (Elastic Fabric Adapter) on p5.48xlarge, supports NCCL through EFA-OFI plugin. Multi-node training works but with somewhat higher all-reduce latency than CoreWeave's InfiniBand NDR. For most workloads the difference is negligible; for tight training-loop pretraining it shows.

Software: AWS Deep Learning AMIs (Ubuntu 22.04 + CUDA + PyTorch + TensorFlow + JAX, but typically 2-3 versions behind upstream). SageMaker JumpStart for managed model deployments. Bedrock for managed model serving. AMI selection matters: use the latest DLAMI for your framework version.

Storage: EBS gp3 for boot, FSx for Lustre for high-throughput training data ($0.145/GB/month), S3 with S3 Transfer Acceleration for dataset staging. Data residency by region is a real product feature, important for EU buyers.

Scenario simulation: what AWS EC2 P5 costs for your work

Three procurement-shaped scenarios. AWS is rarely the cost-optimal answer; it's often the compliance-optimal answer.

Scenario A: Healthtech startup, HIPAA inference

Workload: 2x p5e.48xlarge running Llama 70B inference for clinical decision support, HIPAA BAA required, 24/7

Monthly cost: $108.66 × 2 × 24 × 30 (on-demand) = $156,470/mo

Wrong tier choice but illustrative. Move to 1-yr Reserved: ~$93,883/mo. The HIPAA BAA covers the GPU work natively, no third-party PHI processor relationships needed. Lambda or RunPod cannot serve this workload at all because neither offers a BAA. AWS's premium is the price of compliance simplicity.

Scenario B: Federal contract, FedRAMP Moderate

Workload: GovCloud p4d.24xlarge for ML training on government data, 1-year commit

Monthly cost: $19.66 × 24 × 30 = $14,155/mo

GovCloud p4d (A100 SXM) is the current public-sector option. p5 in GovCloud is rolling out but limited. CoreWeave's FedRAMP Moderate H100 enclave at ~$2.85/hr is roughly 60% cheaper for the H100 portion, but procurement officers familiar with AWS contracting often prefer the path of least resistance.

Scenario C: Global SaaS, multi-region inference

Workload: 4x p5.48xlarge inference across us-east-1, eu-west-1, ap-southeast-1, sa-east-1, 24/7, 3-yr Reserved

Monthly cost: $40.42 × 4 × 24 × 30 = $116,410/mo

This is the workload independent GPU clouds cannot serve: none has 4-region H100 coverage. Latency-sensitive global inference requires AWS or Google Cloud. The cost is high; the alternative is a multi-cloud architecture with its own complexity and egress costs.
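All three scenarios use the same node-hour arithmetic: hourly node rate × node count × 24 hours × 30 days. Here is a small Python sketch for re-running them with your own node counts. The function name and the 30-day month are our simplifications.

```python
def monthly_cost(
    node_rate_per_hr: float,
    nodes: int,
    hours_per_day: int = 24,
    days: int = 30,
) -> float:
    """Monthly spend for a fleet of identical nodes running a fixed schedule."""
    return node_rate_per_hr * nodes * hours_per_day * days


scenarios = {  # (node $/hr, node count) from the scenarios above
    "A: HIPAA inference, 2x p5e on-demand": (108.66, 2),
    "B: GovCloud p4d, 1-yr commit": (19.66, 1),
    "C: 4-region inference, 4x p5 3-yr Reserved": (40.42, 4),
}

for name, (rate, nodes) in scenarios.items():
    print(f"{name}: ${monthly_cost(rate, nodes):,.0f}/mo")
```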

Use-case match matrix

Workload	AWS EC2 P5 fit	Better alternative
HIPAA PHI inference	✓ BAA covers GPU work	Azure HealthCare or Bedrock for managed
FedRAMP Moderate workloads	✓ GovCloud available	CoreWeave for H100 specifically
Multi-region global inference (<100ms)	✓ Only meaningful option	GCP if you prefer it
Cost-optimized self-serve inference	✗ 4x more expensive	Lambda or RunPod
Indie research / hobbyist	✗ Wrong shape, complex onboarding	Lambda or RunPod
Pretraining with Capacity Blocks	✓ Best in class for guaranteed reservation	Lambda 1-Click for smaller scale
SageMaker-integrated ML pipeline	✓ Best in class	None
Spot-tolerant batch training	✓ Cheapest AWS path	Vast.ai interruptible if no compliance
Quick prototyping with credit card	~ Possible but high friction	Lambda or Modal
Enterprise procurement with TAM	✓ Best in class	CoreWeave at lower price

Stability & uptime history

AWS publishes p5 uptime via Service Health Dashboard. We monitored our deployment across us-west-2 + eu-west-1.

Period	Measured uptime	Major incidents	Notes
Nov 2024 – Jan 2025	99.99%	0 major	Clean quarter
Feb 2025 – Apr 2025	99.98%	1 (us-east-1, 1h 14m)	Networking event, single-region
May 2025 – Jul 2025	99.99%	0 major	n/a
Aug 2025 – Oct 2025	99.96%	1 (eu-west-1 capacity, 3h 22m)	Capacity event, not strictly outage
Nov 2025 – Jan 2026	99.99%	0 major	Q4 demand absorbed
Feb 2026 – Apr 2026	99.99%	0 major	Stable

Averaging the six quarters above gives roughly 99.98% measured uptime over 18 months. AWS's published p5 SLA is 99.99% for multi-AZ deployments. Four of the six quarters met it; the two that fell short (99.98% and 99.96%) each trace to a single-region event. This is the structural reliability advantage of hyperscaler infrastructure. Independent clouds are getting close but none have matched this consistency over 18 months.
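For a sense of what these percentages mean in wall-clock terms, the sketch below converts an uptime figure into allowed downtime and averages the quarterly numbers from the table. The simple unweighted mean and the 90-day quarter are our assumptions, not AWS's SLA methodology.

```python
from statistics import mean


def downtime_minutes(uptime_pct: float, days: float) -> float:
    """Minutes of downtime implied by an uptime percentage over a period."""
    return (1 - uptime_pct / 100) * days * 24 * 60


quarterly_uptime = [99.99, 99.98, 99.99, 99.96, 99.99, 99.99]  # table above

print(f"mean of quarters: {mean(quarterly_uptime):.2f}%")
print(f"99.99% over a 90-day quarter allows "
      f"{downtime_minutes(99.99, 90):.1f} min of downtime")
```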

Longitudinal pricing data

AWS p5 pricing has been remarkably flat since launch. The compliance and capacity advantages haven't faced enough competitive pressure to force cuts.

Date	p5.48xlarge OD	Eff. $/GPU-hr	Reserved 1-yr	Notes
May 2024	$98.32/hr	$12.29	$58.96	p5 GA launch
Nov 2024	$98.32/hr	$12.29	$58.96	No change
Feb 2025	$98.32/hr	$12.29	$58.96	No change, p5e added
Aug 2025	$98.32/hr	$12.29	$58.96	No change
Feb 2026	$98.32/hr	$12.29	$58.96	No change
May 2026	$98.32/hr	$12.29	$58.96	Current

Zero price movement in 24 months. AWS doesn't compete on GPU price; they compete on the rest of the platform. Buyers who care about price-per-GPU left long ago. AWS is at the equilibrium where the customers who stay aren't price-sensitive, by design.

Community sentiment

AWS p5 generates substantial mention volume but the sentiment is more polarized than self-serve clouds. Enterprise buyers cluster positive; cost-conscious developers cluster negative. Sample: 2,847 mentions across 6 months.

Source	Positive	Negative	Top complaint	Top praise
r/aws (n=812)	58%	27%	p5 price premium	Compliance + regions
Hacker News (n=614)	41%	42%	4x markup vs independents	Region coverage
LinkedIn (enterprise) (n=520)	79%	11%	Procurement complexity	TAM responsiveness
X/Twitter (n=901)	52%	32%	Capacity issues in us-east-1	Capacity Blocks for ML

Net sentiment: +14 (mildly positive), lowest of any provider we tracked, but expected given the price polarization. Enterprise buyers love AWS; cost-conscious indie users hate the markup. Both perspectives are correct for their respective contexts.

Who should avoid this

Skip this if you fall into any of these buckets. Naming it up-front beats a support ticket later.

Cost-optimized self-serve users. AWS p5 is 4x more expensive than Lambda on-demand. Use Lambda or RunPod.
Indie ML researchers / hobbyists. Onboarding overhead is wrong for solo workflows. Use Lambda Stack or RunPod templates.
Teams without AWS-fluent platform engineers. VPC, IAM, EBS setup takes hours from scratch. Lambda is zero-config.
Workloads under $20k/month spend. The hyperscaler premium doesn't pay off below this scale. Use independent clouds.
Serverless GPU function workloads. AWS Lambda doesn't support GPU functions. Use Modal or RunPod Serverless.
Latency-flexible batch workloads. Spot is your cheapest AWS path; if you can tolerate spot, Vast.ai interruptible is cheaper still (no compliance though).
Anyone whose compliance posture is satisfied by SOC 2. If you don't need FedRAMP or HIPAA, independent clouds match SOC 2 at 30-70% lower cost.

Testing evidence

FIG 8.0: p5.48xlarge cold-launch from blank account

$ aws ec2 run-instances --instance-type p5.48xlarge \
 --image-id ami-0.. (DLAMI base) \
 --key-name hardtech-test \
 --security-group-ids sg-.. \
 --subnet-id subnet-.. \
 --block-device-mappings..

API returned instance-id i-0abc123.. in 1.8s
state transition: pending → running: 47s
ssh-ready (post status check 2/2): 374s (6m 14s)

equivalent on Lambda: 52s
equivalent on RunPod Secure: 92s
equivalent on CoreWeave (after contract): 8m 14s but pre-reserved

FIG 8.1: Multi-region inference latency, p5 deployment

target_region client_origin P50_ms P95_ms
us-east-1 us-east-1 211 342
us-east-1 us-west-1 92 148
us-east-1 eu-west-1 118 189
us-east-1 ap-southeast-1 248 412
eu-west-1 eu-west-1 208 338
eu-west-1 ap-southeast-1 188 302
ap-southeast-1 ap-southeast-1 214 354
ap-southeast-1 eu-west-1 172 282

cross-region failover P50: 118-248ms depending on pair
no other GPU cloud reproduces these numbers at p5 scale

ROI calculator

Monthly cost is the per-GPU hourly rate × GPU count × hours per day × days per month, shown against the same GPU-hours on Lambda Reserved.

Tier	Per-GPU $/hr
p5 on-demand	$12.29
p5 Reserved 1-yr	$7.37
p5 Reserved 3-yr	$5.05
p5 Spot (median)	$3.81
p4d A100 on-demand	$4.10
p4d Reserved 1-yr	$2.46

AWS p5 effective $/GPU computed from p5.48xlarge node price divided by 8. Includes Nitro hypervisor overhead. Reserved tiers require commitment.
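If you'd rather run the numbers offline, here is a rough Python equivalent of the calculator, using the per-GPU tier rates listed in this section and the Lambda Reserved 1-yr rate of $1.85 from the cost-to-performance table. The function and key names are ours.

```python
from typing import Dict

RATES: Dict[str, float] = {  # per-GPU $/hr, from the tier list in this section
    "p5 on-demand": 12.29,
    "p5 Reserved 1-yr": 7.37,
    "p5 Reserved 3-yr": 5.05,
    "p5 Spot (median)": 3.81,
    "p4d A100 on-demand": 4.10,
    "p4d Reserved 1-yr": 2.46,
}
LAMBDA_RESERVED = 1.85  # Lambda Reserved 1-yr, per GPU-hr, from this review


def roi(tier: str, gpus: int, hours_per_day: float, days_per_month: float) -> Dict[str, float]:
    """Monthly AWS cost for a tier, the Lambda Reserved equivalent, and the delta."""
    gpu_hours = gpus * hours_per_day * days_per_month
    aws = RATES[tier] * gpu_hours
    lambda_reserved = LAMBDA_RESERVED * gpu_hours
    return {"aws": aws, "lambda_reserved": lambda_reserved, "delta": aws - lambda_reserved}


result = roi("p5 on-demand", gpus=8, hours_per_day=24, days_per_month=30)
for label, value in result.items():
    print(f"{label}: ${value:,.0f}/mo")
```

The delta is the number to bring to a procurement conversation: it's what compliance, regions, and integration have to be worth to you each month.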

The verdict

AWS EC2 P5 is the right GPU cloud for one specific buyer: enterprise workloads where the buying committee values regions, compliance, and ecosystem integration over raw GPU pricing. For those buyers, no competitor exists yet: CoreWeave is catching up on compliance and Google Cloud has comparable regions, but neither matches AWS's combination of all three. The 4x price premium is real and it's what you pay for the rest of the platform.

For everyone else, AWS p5 is the wrong call. If your compliance scope is SOC 2, your traffic is US-focused, and your spend is under $20k/month, independent clouds will serve you 50-70% cheaper with comparable engineering quality. Choose AWS when the platform requirements force it, not before.

If AWS EC2 P5 doesn't fit, consider

For enterprise without the markup

CoreWeave

Contract-led, FedRAMP Moderate, dedicated H100 fleet. Roughly half the AWS price at comparable enterprise wrap.

For self-serve at 4x lower cost

Lambda Labs

On-demand H100 SXM at $2.99/hr, Reserved at $1.85/hr. Best path off AWS if compliance allows.

For TPU-based training

Google Cloud A3

TPU v5p / Trillium for transformer training, often cheaper than equivalent H100 work. Strong for JAX workflows.

AWS EC2 P5 is the right GPU cloud when your buying committee, not your bill, decides.
