AI/Data Science Cloud Platform
InfinitiesSoft CloudFusion enables an integrated AI/Data Science Cloud Platform, with machine learning on Kubernetes built in to simplify complex deployments in AI, HPC, and Big Data. The platform is based on a private cloud (on-premises infrastructure) that can be expanded into a hybrid cloud, allocating additional compute, storage, and even GPU resources during peak periods (“cloud bursting”) through an advanced cloud management platform.
The AI/Data Science Cloud turnkey package combines the following:
Management Layer – the InfinitiesSoft CloudFusion cloud management platform, which dynamically allocates virtualized resources and schedules workloads. CloudFusion can also pool on-premises physical resources with those from public cloud services (AWS, Azure, Google Cloud, Alibaba Cloud, etc.) to enable cloud bursting (hybrid cloud).
Virtualization Layer – Docker + Kubernetes for virtualization of GPU resources (containers), OpenStack for virtualization of CPU resources (virtual machines), and BigTera VirtualStor Converger or Scaler for software-defined storage.
Hardware Layer – server hardware for the underlying on-premises private cloud infrastructure.
How it Works
The cloud platform enables data science teams, developers, and IT teams to simplify and streamline workloads through a single system.
AI/ML capabilities are already integrated into the cloud, so users can focus on AI/ML workloads rather than on system maintenance, tuning, and deployment scheduling. This reduces complexity and flattens the learning curve for adopting and mastering TensorFlow, Caffe, and other deep learning tools.
Containers make it easier, more secure, and faster for developers to develop, scale, and deliver AI applications. They also make it easier for data scientists to work with AI. Both Docker + Kubernetes and Singularity are container technologies that can be used in this system. Singularity is lightweight, non-IP-based, and designed around a single-user (HPC) model, making it ideal for non-interactive batch jobs. Kubernetes, by comparison, is heavier-weight and IP-based, allows multiple user connections, and is ideally suited to interactive jobs.
Kubernetes is fast becoming essential to AI work and is a key feature of this cloud platform. It is the most popular container orchestrator for machine learning workloads; most scenarios are set up to run in Kubernetes-managed containers because of its support for interactive use. Because Kubernetes containers can be scheduled and managed throughout their life cycle, it is also a favorite among developers and DevOps practitioners following continuous integration and continuous delivery (CI/CD) practices, and machine learning developers favor it for the same reasons.
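As a concrete illustration of lifecycle-managed workloads, the sketch below shows a minimal Kubernetes Job manifest for a one-off containerized training run. The image name, script path, and GPU count are placeholders for illustration, not part of the platform itself.

```yaml
# Minimal sketch of a Kubernetes Job for a one-off training run.
# Image, command, and GPU count are hypothetical placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: tf-train-example
spec:
  backoffLimit: 2                 # retry a failed pod up to two times
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: trainer
        image: tensorflow/tensorflow:latest-gpu   # assumed base image
        command: ["python", "/workspace/train.py"]
        resources:
          limits:
            nvidia.com/gpu: 1     # request one virtualized GPU
```

Submitting a manifest like this with `kubectl apply -f job.yaml` lets Kubernetes schedule the run, retry it on failure, and track it through completion.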
A growing ecosystem of open-source tools adds further appeal to using Kubernetes for data science work. For example, the open-source Kubeflow toolkit lets teams attach existing machine learning jobs to a cluster with little adaptation or integration work.
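To make this concrete, the sketch below shows a Kubeflow TFJob wrapping an unmodified TensorFlow training script; the image and script names are hypothetical. The Kubeflow operator injects the TF_CONFIG cluster specification into each replica, so the same code can run distributed across workers without manual setup.

```yaml
# Hypothetical Kubeflow TFJob attaching an existing TensorFlow job to the cluster.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: dist-train-example
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2                 # run two distributed workers
      restartPolicy: OnFailure
      template:
        spec:
          containers:
          - name: tensorflow      # Kubeflow expects this container name
            image: my-registry/train:latest   # placeholder image
            command: ["python", "train.py"]
```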
For Further Information
For further information about how we can help build an AI/Data Science enabled cloud platform for your organization, download our white paper, solution brochure, or contact us.