VAST Data today unveiled a new AI cloud architecture designed to deliver unprecedented levels of performance, quality of service, zero-trust security, and space/cost/power efficiency for the AI factory. Building on NVIDIA BlueField-3 data processing unit (DPU) technology, VAST Data’s parallel system architecture makes it possible to disaggregate the entirety of VAST’s operating system natively into AI computing machinery, transforming supercomputers into AI data engines.
The NVIDIA BlueField networking platform combines robust compute power and integrated hardware accelerators to create secure and software-defined accelerated computing infrastructure for AI. By outfitting each graphics processing unit (GPU) server with a dedicated NVIDIA BlueField DPU running a stateless container that powers the VAST parallel services operating system, this new architecture design embeds storage and database processing services directly into AI servers and delivers true linear data services designed to scale to hundreds of thousands of GPUs. Moreover, by removing multiple layers of x86 hardware and networking from VAST’s network-attached Data Platform infrastructure, this new AI factory architecture dramatically reduces the cost, footprint, and power associated with AI data services.
Through its collaboration with NVIDIA and this first-of-its-kind integration, VAST Data is:
- Maximising Data Centre Efficiency: VAST’s Disaggregated, Shared Everything (DASE) architecture leverages the processing power of NVIDIA BlueField-3 to require fewer independent compute and networking resources, reducing the power usage and data centre footprint for VAST infrastructure by 70 per cent. The combined end-to-end solution results in a net energy saving of over five per cent compared to deploying NVIDIA-powered supercomputers with the previous VAST distributed data services infrastructure.
- Enabling Unprecedented Quality of Service: By providing each GPU server with a dedicated and truly parallel storage and database container, this new AI factory architecture eliminates contention for data services infrastructure. VAST’s DASE architecture features extreme parallelism such that each NVIDIA BlueField-3 can read and write into shared namespaces of the VAST Data Platform without coordinating IO across containers. In essence, this architecture eliminates infrastructure contention at the most fundamental level. This contention-less architecture is essential for multi-tenant service providers who need to meet the contractual Service Level Objectives of their clients while also maximising the utilisation of all GPU computing assets.
- Enhancing Zero-Trust Security: This new AI factory architecture ensures that data and data management remain protected and isolated from host operating systems. Compared to AI computers that use parallel file system clients (which have an intimate understanding of the data services layer), VAST is able to eliminate many attack vectors in a multi-tenant environment by hosting industry-standard network attached services, object services, and database services from NVIDIA BlueField-3 DPUs via standard client protocols that do not expose the underlying Data Platform system topology – such as NFS, SMB, S3 and Apache Arrow.
- Delivering Block Storage Services: VAST systems, powered by the NVIDIA DOCA software framework that enables the rapid development of containerised services, now provide block storage services natively to host operating systems – combining with VAST’s file, object, and database services to provide a comprehensive set of data presentations to high-performance applications.
“We’re extremely proud to partner with NVIDIA to help industrialise AI computing,” said Jeff Denworth, co-founder at VAST Data. “This new architecture is the perfect showcase to express the parallelism of the VAST Data Platform. With NVIDIA BlueField-3 DPUs, we can now realise the full potential of our vision for disaggregated data centres that we’ve been working toward since the company was founded.”
This new VAST architecture – running VAST software on BlueField DPUs in the AI servers – is being tested and deployed first at CoreWeave, the leading specialised GPU cloud provider. VAST and CoreWeave began partnering in 2023 to build some of the world’s most scalable AI machinery and to help many of the world’s leading LLM builders and blue-chip enterprise customers build their own AI factories.
“With VAST’s operating system, next-generation accelerated computing solutions are paired with next-generation accelerated network infrastructure, enabling enterprises and service providers to benefit from simpler, more secure experiences with high-performance systems,” said Rob Davis, Vice President of Storage Technology at NVIDIA.
“VAST’s revolutionary architecture is a game-changer for CoreWeave, enabling us to fully disaggregate our data centres. We’re seamlessly integrating VAST’s advanced software directly into our GPU clusters,” said Peter Salanki, vice president of Engineering at CoreWeave. “Leveraging NVIDIA BlueField DPUs, we’ve been at the forefront of creating sophisticated, software-defined data centre abstractions. Now, by natively incorporating storage and database services onto BlueField, we’re not just streamlining our infrastructure but we are also elevating the user experience for our customers by removing bottlenecks in the AI data computing pipeline. CoreWeave is not just keeping pace with the future of cloud data management – we are defining it.”