Technology
Red Hat Launches AI Inference Server and Expanded Portfolio to Accelerate Enterprise AI Adoption Across Cloud Ecosystems
Red Hat, the global leader in open-source solutions, unveiled major enhancements to its AI portfolio today, introducing the Red Hat AI Inference Server, third-party validated AI models, and the integration of Llama Stack and the Model Context Protocol (MCP). These advancements aim to simplify the deployment of generative AI (GenAI) and predictive AI solutions across hybrid cloud environments, empowering organizations to scale their AI capabilities with greater efficiency, consistency, and choice.
With the increasing complexity of the AI landscape, Red Hat is addressing the industry’s demand for flexibility and reliability through infrastructure-agnostic tools. The newly launched Red Hat AI Inference Server enables cost-effective, high-performance AI inference across hybrid environments, integrating seamlessly with Red Hat OpenShift AI and Red Hat Enterprise Linux AI or functioning as a standalone solution.
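Under the hood, the Red Hat AI Inference Server builds on the open-source vLLM project, which serves models behind an OpenAI-compatible REST API. As a minimal sketch of what a client call could look like against a locally hosted endpoint (the URL, API key, and model name below are illustrative placeholders, not documented Red Hat values):

```python
# Minimal sketch: querying a vLLM-style, OpenAI-compatible inference endpoint.
# The base_url, api_key, and model name are illustrative assumptions, not
# documented Red Hat AI Inference Server values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local inference endpoint
    api_key="none",                       # many self-hosted servers ignore the key
)

response = client.chat.completions.create(
    model="granite-3.1-8b-instruct",  # hypothetical model name
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Because the API surface matches the hosted services many teams already use, existing client code can often be pointed at a self-hosted endpoint with little more than a changed base URL.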
“Enterprise AI should be about choice and performance, not infrastructure limitations,” said a Red Hat spokesperson. “We are helping organizations overcome deployment bottlenecks and tap into the full potential of their AI investments.”
Validated AI Models and Efficient Deployment
Red Hat also introduced third-party validated models, available via Hugging Face, enabling users to select tested and optimized models for diverse use cases. These models are designed to minimize resource consumption through compression techniques while ensuring high-performance outcomes. Customers are also provided with deployment guidance to enhance reproducibility and reliability.
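In practice, pulling one of these compressed checkpoints works like any other Hugging Face download. A minimal sketch using the huggingface_hub client (the repository ID below is a hypothetical placeholder, not one of Red Hat's published validated models):

```python
# Minimal sketch: fetching a quantized model checkpoint from Hugging Face.
# The repository ID is a hypothetical placeholder, not a published
# Red Hat validated model.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="example-org/granite-8b-quantized-w4a16",  # hypothetical repo
    revision="main",
)
print(f"Model files downloaded to: {local_dir}")
```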
Llama Stack and MCP: Standardizing AI Agent Development
To facilitate smoother development of AI-powered agents and applications, Red Hat has integrated Meta’s Llama Stack and Anthropic’s Model Context Protocol (MCP) into its platform. These APIs provide a unified interface for inference, retrieval-augmented generation (RAG), model evaluation, and tool integration, supporting a standardized approach to agent workflows.
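To make the MCP side concrete, the protocol's reference Python SDK lets a client connect to a tool server and enumerate what it offers. A minimal sketch, assuming a hypothetical local server script (this shows the general MCP SDK, not a Red Hat-specific API):

```python
# Minimal sketch: listing the tools an MCP server exposes, using the
# reference Python SDK ("mcp" package). The server script name is a
# hypothetical placeholder.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python", args=["my_mcp_server.py"])  # hypothetical server
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```

Standardizing on this kind of tool discovery is what lets an agent built on one platform reach tools hosted on another without bespoke glue code.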
Upgrades to OpenShift AI and Enterprise Linux AI
The latest Red Hat OpenShift AI (v2.20) release includes:
- A technology preview of the optimized model catalog, supporting easy deployment of validated models.
- Distributed training via the Kubeflow Training Operator, optimized for PyTorch and InstructLab workloads.
- A feature store based on the Kubeflow Feast project, streamlining data management for training and inference; a brief Feast sketch follows this list.
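Since the feature store item is based on the open-source Feast project, Feast's Python API gives a feel for the workflow. A minimal sketch, with hypothetical feature names and entity keys:

```python
# Minimal sketch: reading online features with the open-source Feast API,
# which underlies OpenShift AI's feature store. Feature names and entity
# keys are hypothetical placeholders.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # assumes a Feast repo in the working directory
features = store.get_online_features(
    features=["customer_stats:avg_order_value"],  # hypothetical feature view
    entity_rows=[{"customer_id": 1001}],          # hypothetical entity key
).to_dict()
print(features)
```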
Red Hat Enterprise Linux AI 1.5 expands its foundation model capabilities and is now available on Google Cloud Marketplace in addition to AWS and Azure. It also introduces multi-language support (Spanish, German, French, and Italian) through InstructLab and allows users to bring their own teacher models for customization.
Additionally, Red Hat AI InstructLab on IBM Cloud is now generally available, streamlining the model-tuning process with improved scalability and user control.
Red Hat’s Vision: Any Model, Any Accelerator, Any Cloud
Red Hat continues to champion open innovation, aiming to create a universal inference platform free from infrastructure silos. Its evolving AI ecosystem supports the deployment of any model, on any accelerator, across any cloud, enabling enterprises to maximize performance and value in their GenAI deployments.