Streamline PyTorch Machine Learning Workflows with Cloudian and AWS Hybrid Edge
Posted: Wednesday, Feb 28

Cloudian has announced an open-source software contribution that integrates PyTorch, the widely used machine learning library, with local Cloudian HyperStore S3-compatible storage.

This development is intended to accelerate machine learning workflows by allowing PyTorch users on AWS Hybrid Edge solutions to directly access local data lakes running on Cloudian storage, simplifying machine learning and reducing its cost.

 

Localised Machine Learning with AWS Hybrid Edge

Data scientists and AI developers are well acquainted with the challenges of data staging and the associated costs. This contribution simplifies the machine learning (ML) workflow and reduces costs by letting them run ML on data resident in local Cloudian S3-compatible object storage, without the need to move and stage the data into another system. The ML tasks can also run on local compute resources such as AWS Outposts and Local Zones. Cloudian is a certified Service Ready partner for AWS Outposts and Local Zones and is commercially available through the AWS Marketplace.

 

The Perks of Going Local

By bridging the gap between S3-compatible object storage systems and ML compute platforms, Cloudian is reducing the need for data migration as part of ML workflows. Data can now be analysed at the source.

The open-source contribution also eliminates the dependency on a dedicated parallel file system for machine learning workflows. By enabling direct access to a cost-effective, scalable data repository, Cloudian simplifies the ML process, reducing both the complexity and the cost of data analysis.

The key benefits of this development are:

  • Data Sovereignty: Proprietary training data can remain on-prem.
  • Simplified Workflow: Say goodbye to cumbersome data staging. This means real-time analysis and model training at reduced costs.
  • Seamless Integration: Direct use of PyTorch with Cloudian HyperStore S3-compatible storage ensures immediate data access without the time and expense of data migration.
  • Local Performance: Experience machine learning with low latency and high-speed data access, thanks to running models locally with AWS Outposts and Local Zones.

With this contribution, we offer the ML community a tool that integrates two of their most important needs: the computational power of PyTorch and fast, flexible access to training data on S3-compatible storage. By connecting these, we are enabling a more efficient and streamlined approach to machine learning.

 

Open Source and Ready to Use

Cloudian contributed enhancements to AWS Labs' open-source S3-Connector-for-PyTorch. The enhancements enable PyTorch ML algorithms to access data in Cloudian's HyperStore object storage system via the AWS S3 API. The enhanced S3 connector is available from the GitHub repositories of AWS Labs and Cloudian.
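As a rough illustration of what this kind of access could look like, the Python sketch below points the connector at a local S3-compatible endpoint and streams a few objects without staging them first. It is a minimal sketch under stated assumptions, not Cloudian's or AWS Labs' reference usage: the LocalS3Source helper, the endpoint URL, the bucket and the prefix are hypothetical placeholders, and the code assumes that the connector's S3IterableDataset.from_prefix call accepts an endpoint keyword, which should be checked against the installed version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalS3Source:
    """Hypothetical description of a dataset on a local S3-compatible store."""

    endpoint: str  # assumed URL of the local S3 endpoint (placeholder below)
    region: str    # region name the S3 client is configured with
    bucket: str
    prefix: str = ""

    @property
    def uri(self) -> str:
        # Build the s3://bucket/prefix URI passed to the connector.
        return f"s3://{self.bucket}/{self.prefix.lstrip('/')}"


def open_dataset(source: LocalS3Source):
    # Imported lazily so this module loads even where the connector
    # package (s3torchconnector) is not installed.
    from s3torchconnector import S3IterableDataset

    # Assumption: from_prefix accepts an `endpoint` keyword for
    # S3-compatible endpoints outside AWS.
    return S3IterableDataset.from_prefix(
        source.uri, region=source.region, endpoint=source.endpoint
    )


def preview(source: LocalS3Source, limit: int = 3) -> None:
    # Stream the first few objects directly from object storage,
    # with no intermediate staging copy.
    for index, obj in enumerate(open_dataset(source)):
        if index >= limit:
            break
        print(obj.key, len(obj.read()))


# Placeholder values; substitute those of an actual deployment.
source = LocalS3Source(
    endpoint="https://hyperstore.example.internal",
    region="us-east-1",
    bucket="training-data",
    prefix="/images/",
)
print(source.uri)
```

Calling preview(source) against a reachable endpoint would list the first few object keys and their sizes. In a training job, the dataset returned by open_dataset would typically be handed to a standard PyTorch DataLoader instead.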

With this move, Cloudian reinforces its commitment to supporting the advancement of machine learning technologies on AWS Hybrid Edge platforms.
