IaaS Forecasts for 2026: Implications for SMB and Enterprise. Part 1

We see a clear split in how SMB and enterprise customers approach infrastructure across our customer projects. SMB environments are optimized for speed-to-market and predictable spend, typically combining managed services with a lean IaaS footprint. Enterprise architectures, meanwhile, prioritize governance, segmentation, and scalable platforms spanning multiple providers and regions. Both segments, however, are converging on the same question: how to adapt to rapidly changing regulations and roll out changes safely at scale.
Against that backdrop, the coming year promises to be an eventful one for infrastructure solutions. The changes will affect not only small and medium-sized businesses, but also large enterprise customers. IaaS is no longer merely a standalone tool of digital transformation - it has become one of the critical factors enabling organizations to respond quickly to shifting market conditions.
Economic efficiency and regulatory compliance remain top priorities not only for enterprises themselves, but also for infrastructure providers. Customers increasingly demand solutions that do more than simply implement modern technologies - they must also account for global security trends while remaining compliant with applicable legislation.
In this article, we attempt to analyze which trends are likely to become decisive in 2026. Our goal is to show how organizations of all sizes can leverage upcoming changes to their advantage, striking a balance between innovation, data protection, and commercial viability.
IaaS Trends and Innovation
The year 2025 continued the broader trajectory of recent years. Infrastructure as a Service (IaaS) keeps evolving at a rapid pace. Companies are increasingly pushing data processing closer to its source, most notably by extending cloud environments to the “edge” of the network. This enables faster processing of data coming from sensors, cameras, and industrial equipment.
There are two main reasons for this shift: economic and physical. If a solution must deliver results to the end user within milliseconds, routing data “across the ocean” simply makes no sense. Video streaming is the most illustrative example. It is far easier, faster, and cheaper to process video as close to the camera as possible, and then send only the processed output to the cloud.
This approach brings additional benefits. Edge environments can (and should) operate autonomously, synchronizing with the central cloud asynchronously rather than in lock-step with real-time processing. Even if connectivity to the central cloud is temporarily lost, this does not create significant issues. In effect, the edge acts as a form of fault masking: failures remain confined within the infrastructure and have little to no impact on end users.
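The store-and-forward pattern behind this autonomy can be sketched in a few lines of Python. Everything here is hypothetical: `process` stands in for real on-device analytics and `upload` for a cloud client. The point is that a failed upload simply buffers the compact result locally, so the edge keeps serving requests while the cloud is unreachable.

```python
import time
from collections import deque

class EdgeBuffer:
    """Store-and-forward sketch: process locally, sync to cloud when reachable.

    Hypothetical example; `process` and `upload` stand in for real
    edge analytics and a cloud client.
    """

    def __init__(self, upload, max_items=10_000):
        self.upload = upload                     # callable that may raise on network failure
        self.pending = deque(maxlen=max_items)   # oldest results dropped when full

    def handle(self, raw_frame):
        result = self.process(raw_frame)   # heavy work stays at the edge
        self.pending.append(result)        # queue only the compact result
        self.flush()                       # opportunistic sync; failure is non-fatal
        return result

    def process(self, raw_frame):
        # placeholder for real inference/aggregation on the device
        return {"summary": len(raw_frame), "ts": time.time()}

    def flush(self):
        while self.pending:
            try:
                self.upload(self.pending[0])
            except ConnectionError:
                return                     # cloud unreachable: keep buffering
            self.pending.popleft()
```

On reconnection, the next `handle` call drains the backlog in order, which is the "masking" behavior described above: the outage is visible only as delayed synchronization, not as a user-facing error.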
Another important advantage is the significant simplification of regulatory compliance. Cross-border data transfers often require adherence to a wide range of conditions, and in some cases are outright prohibited. Processing data within national borders removes an entire layer of legal and technical complexity. It is safe to assume that this trend will continue throughout 2026.
At the same time, demand for bare-metal infrastructure is expected to grow. Over the past year, the number of resource-intensive workloads has increased at an extraordinary pace. This includes high-performance computing (HPC) as well as AI/ML workloads, which require maximum performance not only from accelerators such as GPU or NPU, but from physical servers as a whole.
Several analytical studies, including assessments by Gartner and specialized bare-metal providers, indicated that by 2025 more and more organizations would actively incorporate bare-metal infrastructure into their cloud architectures, particularly for AI, HPC, and latency-sensitive workloads. Bare-metal is no longer a niche solution and is becoming a key architectural component of hybrid IaaS.
Practice has confirmed these forecasts. We expect this trend to continue into the coming year, as IaaS increasingly moves toward diversity and distribution. The need to adapt infrastructure to specific requirements, such as neural network inference, is now a major driver of architectural transformation, and this approach proves economically justified.
Pricing and Payment Models
Since we have touched on the financial aspect, it is worth examining in more detail the changes we are likely to see in the coming year. Most cloud providers offer a pay-as-you-go pricing model, where resources are billed per minute or per hour based on actual usage. This is an extremely flexible approach that works particularly well for variable workloads.
Many companies experience predictable spikes in demand - holiday sales being the most obvious example. Even if baseline infrastructure usage is relatively low, peak periods can trigger spikes that organizations are often unprepared for. This scenario aligns perfectly with pay-as-you-go: costs remain modest under normal conditions, while the infrastructure can seamlessly absorb sudden bursts of traffic and transactions.
However, this model has one significant drawback. When workloads are consistently high, pay-as-you-go becomes more expensive than virtually any alternative. It also does not protect customers from aggressive overselling on shared infrastructure: when many tenants spike simultaneously, resource contention becomes a systemic risk, exactly when workloads are most sensitive. For mission-critical systems, teams should consider dedicated solutions (or shared plans with a clearly defined cap on oversubscription) to keep performance predictable during peak periods.
Since events like large promotional campaigns are regular, many cloud providers allow customers to reserve resources for extended periods. This option benefits both sides. Reserved instances (or similar commitment-based plans) are offered at substantial discounts compared to on-demand pricing. Additionally, they act as a form of insurance against potential shortages of specific resources, such as GPU capacity during AI demand spikes. By committing long-term, customers not only reduce unit costs but also ensure guaranteed access to resources when they are needed most.
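The trade-off between the two billing models reduces to simple arithmetic: a reserved instance is billed for every hour of the commitment, while on-demand capacity is billed only for hours actually used. The sketch below is illustrative only, with made-up hourly rates; real pricing adds upfront fees, tiered discounts, and provider-specific credits.

```python
def break_even_utilization(on_demand_hourly: float,
                           reserved_hourly: float) -> float:
    """Fraction of the billing period above which a reserved instance
    is cheaper than pay-as-you-go at the same capacity.

    Illustrative arithmetic only; hourly rates here are hypothetical.
    Reserved is billed for all H hours, on-demand only for used hours:
        reserved_hourly * H < on_demand_hourly * used_hours
    so reserved wins once utilization (used_hours / H) exceeds the
    ratio of the two rates.
    """
    return reserved_hourly / on_demand_hourly

# Example: a 60% commitment discount means the reservation pays off
# once the instance is needed more than 40% of the time.
print(break_even_utilization(1.00, 0.40))  # → 0.4
```

In other words, the steadier the workload, the stronger the case for commitment-based plans; bursty workloads with low average utilization stay cheaper on pay-as-you-go.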
For providers, this model improves utilization and makes operations more predictable. Engineering teams can align capacity, maintenance windows, and resiliency initiatives with customer demand patterns over longer time horizons. The result is less firefighting and more planned infrastructure growth. Overall, both customers and vendors are striving for predictability and stability in the cloud. At scale, long-term commitments create operational and financial benefits for all parties involved.
New Services and Platforms
Throughout 2025, cloud platforms significantly expanded their service portfolios, moving well beyond traditional virtual machines, databases, and storage infrastructure. Most new services were directly or indirectly related to machine learning and AI. Alongside access to powerful GPU clusters, hyperscalers began offering turnkey services and platforms for deploying large-scale neural network models.
This was a response to exponential demand driven by rapid market growth. Models such as DeepSeek-R1, Qwen 3, and Kimi-K2 caused a major stir, demonstrating that AI integration into cloud platforms can not only automate tasks, but also deliver personalized recommendations while dramatically accelerating and reducing the cost of content generation.
The primary obstacle to adoption has been the substantial system requirements. For example, Kimi-K2 is available only on extremely powerful GPU servers with total VRAM capacities of around 1 TB for 8-bit quantized weights. Where demand exists, supply inevitably follows. Hyperscalers have therefore begun investing massive sums into new data centers designed specifically to host enormous GPU clusters.
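The ~1 TB figure follows from back-of-the-envelope arithmetic: at 8-bit quantization each weight takes one byte, so a model with roughly a trillion parameters needs on the order of a terabyte just for its weights. The sketch below shows only that calculation; it deliberately ignores KV cache, activations, and runtime overhead, which add substantially on top.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed just to hold model weights.

    Back-of-the-envelope only: ignores KV cache, activations,
    and runtime overhead.
    """
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30   # bytes -> GiB

# A model with roughly a trillion parameters at 8-bit quantization:
print(round(weight_memory_gib(1e12, 8)))   # → 931 (GiB, i.e. ~1 TB of VRAM)
```

This is why such models land on multi-GPU servers: no single accelerator comes close to that amount of VRAM, so the weights must be sharded across a full node.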
This is a long-term strategy, strongly supported by NVIDIA, the undisputed leader in GPU computing. The new Blackwell GPU architecture and the unified Blackwell Ultra platform are explicitly designed to enable next-generation “AI factories.”
Alongside the rapid growth of AI, industry-specific clouds have also gained momentum, targeting sectors such as fintech, healthcare, telecommunications, and others. The key distinction is that these clouds provide service portfolios preconfigured to meet regulatory requirements and align with industry-specific business processes.
As providers seek to address the broadest possible customer base, 2026 is likely to bring highly specialized offerings: from standalone generative services and multi-agent systems to fully managed Kubernetes clusters orchestrated by AI. At the same time, this shift introduces a non-trivial operational risk: AI-driven infrastructure often behaves like a black box. If something goes wrong, troubleshooting can be slowed by opaque decision logic, rapidly changing state, and configurations that are technically correct but difficult to explain.
Network Connectivity Costs
We’ve covered the structural reasons for rising network costs in more detail in a separate article.
Over the past year, the cost of network connectivity has been one of the most widely discussed topics. Data transfer has historically been a significant component of cloud spending, with egress traffic accounting for the largest share. High tariffs became so prominent that regulators intervened. In the EU, this resulted in the ambitious Data Act, which requires cloud providers to remove unjustified barriers and facilitate cloud-to-cloud migration.
Beyond simplifying provider switching and capping the associated fees, proposals were also made to require large content providers (including cloud platforms) to contribute financially to the load they generate on telecom networks. Most business associations warned that such costs would ultimately be passed on to customers, particularly in the SMB segment. As a result, the EU abandoned the idea of network fees, and a separate agreement with the United States, where a significant share of global traffic originates, ruled out similar charges for big tech.
While direct price increases were avoided, the overall growth in data transmission continues steadily. More users and devices generate more traffic, forcing providers to invest in backbone capacity expansion. Part of these costs will inevitably be passed on to customers - a topic we will explore in more detail in a future article.
This dynamic will drive increased transparency, along with more proactive customer behavior. Organizations will increasingly rely on peering and CDN solutions to optimize costs. While regulatory action has already helped prevent chaotic price hikes, effective traffic management will become the most powerful lever for significantly reducing cloud expenses in the near future.
What's coming in Part 2
- Reliability crisis & SLA tightening
- Billing becomes transparent
- Market splits: Big Three vs Neoclouds
- Security & Data sovereignty drive costs up
Read Part 2 ⇾