ScaleOps Secures $130M to Enhance AI Computing Efficiency
The $130 Million Bet on AI Efficiency
ScaleOps just landed $130 million and an $800 million valuation without promising a new model or a bigger GPU cluster. Instead, the three-year-old Israeli startup sells the picks and shovels of the AI boom—software that squeezes every last cycle out of the iron that’s already humming in the cloud.
Insight Partners led the Series C; Lightspeed, NFX, Glilot, and Picture Capital piled back in. The round closed in two weeks, insiders say, a pace that tells you how badly operators want relief from runaway compute bills.
Chief executive Yodar Shafrir doesn’t pitch magic. He promises 80% savings by letting workloads borrow, lend, or return cores the way a modern grid shunts electricity. Shafrir sold Run:ai—another GPU efficiency play—to Nvidia last year for a reported $700 million. Same problem, new company, bigger checkbook.
How ScaleOps Plans to Revolutionize Cloud Computing
AI workloads don’t sit still. A training job that needs 512 GPUs at 2 a.m. may idle by breakfast. Legacy autoscalers, built for stateless web servers, can’t track that volatility, so engineers over-provision and hope. The waste shows up in the bill.
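To see why the reactive approach bleeds money, consider a toy simulation (all numbers invented for illustration, not ScaleOps data): a classic autoscaler scales up instantly to protect the queue, but waits out a cooldown before scaling down, and the idle hours in between go on the bill.

```python
# Toy model of a reactive autoscaler on a bursty GPU workload.
# All numbers are illustrative; this is not ScaleOps code.

demand = [8, 8, 512, 512, 512, 64, 8, 8, 8, 8, 8, 8]  # GPUs needed per hour

capacity = 8        # GPUs currently provisioned
cooldown = 3        # hours below capacity before the autoscaler dares scale down
hours_below = 0
idle_gpu_hours = 0

for hour, need in enumerate(demand):
    if need > capacity:
        capacity, hours_below = need, 0      # scale up fast to protect the queue
    elif need < capacity:
        hours_below += 1
        if hours_below >= cooldown:
            capacity, hours_below = need, 0  # finally scale down
    else:
        hours_below = 0
    idle_gpu_hours += capacity - need        # provisioned but unused

print(f"idle GPU-hours billed: {idle_gpu_hours}")  # 952 for this one burst
```

At a few dollars per GPU-hour, that single overnight burst is a four-figure write-off, and it repeats every night.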
ScaleOps keeps a live map of every container, queue, and GPU. When demand drifts, the platform re-allocates cores before the line goes vertical. No YAML rewrites, no midnight pages. Customers hand over the keys; the system drives from there.
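The article doesn’t publish ScaleOps’ internals, but the pattern it describes is a continuous reconciliation loop: observe real usage, compare it to what was requested, and patch the live spec when the gap is large enough to matter. A minimal sketch, with hypothetical names standing in for the real telemetry and patch APIs:

```python
# Sketch of a continuous-rightsizing loop. Workload, observe_cluster(), and
# the 20% headroom are hypothetical stand-ins, not ScaleOps APIs or defaults.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    requested_cores: float   # what the pod asks the scheduler for
    used_cores: float        # what it actually consumed last window

HEADROOM = 1.2   # keep 20% above observed use as a safety margin
DRIFT = 0.15     # ignore small wobbles; constant churn would thrash the scheduler

def observe_cluster() -> list[Workload]:
    # Stand-in for live telemetry; a real agent reads the metrics pipeline.
    return [Workload("embedder", requested_cores=16.0, used_cores=4.2),
            Workload("ranker", requested_cores=8.0, used_cores=7.6)]

def reconcile(workloads: list[Workload]) -> None:
    for w in workloads:
        target = w.used_cores * HEADROOM
        if abs(target - w.requested_cores) / w.requested_cores > DRIFT:
            print(f"{w.name}: {w.requested_cores:.1f} -> {target:.1f} cores")
            w.requested_cores = target   # in production: patch the live spec

reconcile(observe_cluster())   # a real agent runs this continuously
```

In this run the over-provisioned workload gets trimmed and the well-tuned one is left alone; the drift threshold is what keeps the loop from becoming its own source of churn.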
The granularity matters. A single mis-tuned replica can strand a $40k A100 for hours. Multiply that across regions and teams and you’re into seven-figure burn. ScaleOps claims it hands that money back without touching model accuracy or SLAs.
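The arithmetic behind that claim is easy to reproduce. Assuming a rough on-demand cloud rate (the article cites the card’s purchase price, not an hourly figure, so these inputs are guesses):

```python
# Back-of-envelope for the stranded-GPU claim. The hourly rate and counts
# are assumptions for illustration, not figures from ScaleOps.
a100_rate = 4.00              # $/GPU-hour, ballpark on-demand cloud pricing
stranded_hours_per_day = 6    # one mis-tuned replica pinning a card
replicas = 120                # spread across regions and teams

daily = a100_rate * stranded_hours_per_day * replicas
print(f"${daily:,.0f}/day  ->  ${daily * 365:,.0f}/year")
# $2,880/day -> $1,051,200/year: the seven-figure burn
```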
The AI-Driven Demand for Real-Time Resource Management
GPU utilization in the wild hovers around 30%. The rest is idle rent. Cloud dashboards will happily show you the red bars; they won’t move them. ScaleOps, Cast AI, Kubecost, and Spot all chase the same slack, but most stop at recommendations. Someone still has to click “approve.”
ScaleOps closes the loop. It watches latency, queue depth, and cost, then shifts workloads or parks the silicon. If a job slips, the rollback is automatic. That’s the difference between a report and a remedy.
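In code terms, the difference between a report and a remedy is roughly this shape: act, verify against the SLO, and undo automatically if the verification fails. A sketch under invented names and thresholds:

```python
# Act-then-verify loop, the shape of "closing the loop". MoveToSpot, the
# 250 ms SLO, and the fake latency probe are invented for this sketch.
from dataclasses import dataclass

@dataclass
class MoveToSpot:
    workload: str
    def apply(self) -> dict:
        print(f"moving {self.workload} to spot capacity")
        return {"workload": self.workload, "placement": "on-demand"}
    def rollback(self, snapshot: dict) -> None:
        print(f"restoring {snapshot['workload']} to {snapshot['placement']}")

def close_the_loop(action, read_latency_ms, slo_ms: float = 250.0) -> str:
    snapshot = action.apply()          # act first, unlike a dashboard
    if read_latency_ms() > slo_ms:     # then verify: did the job slip?
        action.rollback(snapshot)      # automatic remedy, no "approve" click
        return "rolled back"
    return "kept"

print(close_the_loop(MoveToSpot("ranker"), read_latency_ms=lambda: 310.0))
```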
The urgency keeps rising. Enterprises that once spun up a few dozen GPUs for quarterly retraining now keep thousands warm for fine-tuning, inference, and RAG. The meter spins whether the silicon’s thinking or just radiating heat.
How ScaleOps Stands Out
The competitive slide decks all converge on the same buzzwords: “autonomous,” “rightsizing,” “FinOps.” The catch is context. Moving a pod to cheaper spot instances looks clever until the checkpoint stalls and the whole training run restarts from epoch zero.
ScaleOps says it models the blast radius before it acts—traffic, dependencies, even the checkpoint interval. If the risk outweighs the savings, it waits. That caution buys trust, and trust buys time to go deeper into the stack.
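The article doesn’t disclose the model, but the decision it describes reduces to weighing expected savings against expected damage. A toy version, with every number invented:

```python
# Toy blast-radius gate. The probabilities and rates are invented; the point
# is the shape of the decision, not ScaleOps' actual risk model.
def expected_loss(restart_prob: float, checkpoint_interval_min: float,
                  gpu_rate_per_hr: float, gpus: int) -> float:
    # If the move stalls a checkpoint, you repay the work since the last one.
    lost_hours = restart_prob * (checkpoint_interval_min / 60)
    return lost_hours * gpu_rate_per_hr * gpus

def should_act(savings_next_hr: float, **risk) -> bool:
    return savings_next_hr > expected_loss(**risk)

# A spot migration that looks like easy money but isn't for this training job:
print(should_act(savings_next_hr=90.0,
                 restart_prob=0.25, checkpoint_interval_min=120,
                 gpu_rate_per_hr=4.0, gpus=64))   # False -> the platform waits
```

Here the $90 the migration would save over the next hour loses to the $128 of expected rework, so the gate holds; that is the caution the company is selling.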
“End-to-end” is the marketing phrase; inside the codebase it means the agent owns the node provisioner, the scheduler hooks, and the cost ledger. Competitors tend to pick one layer. Owning the whole chain lets ScaleOps trade 3% latency for 40% cost and still hit the SLA.
With the fresh $130 million, Shafrir will double the 90-person R&D team and open a second U.S. hub in Austin. The roadmap is blunt: support every accelerator vendor, every orchestrator, every cloud. The goal is to make the efficiency layer too cheap and too good to ignore—exactly the playbook that turned Run:ai into an Nvidia tuck-in.