Enterprise AI

Nutanix Enterprise AI

Based on: NAI 2.3


Key Use Cases of Enterprise AI

Key use cases include:

Challenges with Enterprise AI

Typical challenges implementing Enterprise AI include:

Building an Enterprise AI platform

Building an Enterprise AI platform involves integrating multiple components and technologies to ensure scalability, reliability, and performance.

Enterprise AI Infrastructure Stack

Nutanix Enterprise AI

Nutanix Enterprise AI is a Kubernetes application that provides the AI platform component of the Enterprise AI stack, enabling IT organizations to deploy and manage LLMs and inference endpoints. Nutanix Enterprise AI can be deployed on:

How Nutanix Enterprise AI works

  1. Deploy and run Nutanix Enterprise AI on Kubernetes.
  2. Log in to the interface and deploy your choice of LLM from Hugging Face or NVIDIA, or import your own custom model.
  3. Create a secure endpoint and API key.
  4. Test the model directly from the UI before sending the API key to the application developer or data scientist.

From there, monitor and manage endpoint usage, infrastructure, events, and other metrics to understand how the organization is using AI and to troubleshoot any issues.
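For example, once an endpoint and API key exist, a developer can call the model over HTTPS. The sketch below assumes the endpoint exposes an OpenAI-compatible chat completions API; the URL, model name, and key shown are placeholders for the values generated in the Nutanix Enterprise AI UI:

```python
import requests

# Placeholder values: substitute the endpoint URL, API key, and model name
# generated for your deployment in the Nutanix Enterprise AI UI.
ENDPOINT_URL = "https://nai.example.com/api/v1/chat/completions"
API_KEY = "your-api-key"
MODEL_NAME = "llama-3-8b-instruct"

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": MODEL_NAME,
        "messages": [
            {"role": "user", "content": "When is the next company holiday?"}
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```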

Nutanix Enterprise AI Key Features

Example Use Case - Retrieval Augmented Generation

Open source LLMs, such as Meta’s Llama, are pre-trained on vast amounts of data from the internet, but may not know anything about your own organization. For example, if you asked about your next company holiday, it might know about national holidays, but not holidays specific to your organization. That’s where Retrieval Augmented Generation (RAG) comes in.

A crucial part of RAG is the document store and vector database. The typical workflow involves ingesting documents from file or object storage, processing them with a function that splits and embeds the content, and then storing these embeddings in a vector database. Both open source and commercial tools are available to streamline this process.
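As a rough illustration of that ingestion flow, the following sketch splits a document into chunks, embeds each chunk with an embedding model hosted on Nutanix Enterprise AI, and keeps the results in a simple in-memory store. The endpoint URL, model name, source file, and chunking parameters are all assumptions; a production pipeline would write the embeddings to a real vector database instead:

```python
import requests

EMBEDDING_URL = "https://nai.example.com/api/v1/embeddings"  # placeholder URL
API_KEY = "your-api-key"            # placeholder key
EMBED_MODEL = "nomic-embed-text"    # placeholder embedding model name

def embed(texts):
    """Embed a batch of text chunks via an OpenAI-compatible embeddings API."""
    resp = requests.post(
        EMBEDDING_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": EMBED_MODEL, "input": texts},
        timeout=60,
    )
    resp.raise_for_status()
    return [item["embedding"] for item in resp.json()["data"]]

def split_document(text, chunk_size=500, overlap=50):
    """Naive fixed-size character chunking; real pipelines typically split
    on sentence or section boundaries instead."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Ingest one document: split, embed, and keep (chunk, vector) pairs.
# In practice these writes would go to a vector database.
with open("holiday_policy.txt") as f:   # hypothetical source document
    document = f.read()

chunks = split_document(document)
vector_store = list(zip(chunks, embed(chunks)))
```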

Once your documents have been embedded, the end-user workflow of a RAG-enabled chatbot proceeds through the following steps (a code sketch follows the list).

  1. Ask Question
    • User asks a question to the chatbot.
  2. Create Embedding of Query
    • Instead of going directly to the inference API, the application will first create an embedding of the query using an embedding model hosted on Nutanix Enterprise AI.
  3. Search/Retrieval of Similar Content
    • With that embedding, the application will search for similar embeddings in the vector database that has been populated with the embeddings of source documents.
  4. Send Prompt to Inference API
    • The application augments the user’s prompt with the found context and sends this to a text generation model hosted on Nutanix Enterprise AI.
  5. Get Answer
    • The chatbot returns an answer to the user.

For more information on designing and implementing a Retrieval Augmented Generation workflow, check out the Nutanix Validated Design.

Other Resources

To learn more about Nutanix Enterprise AI and to see it in action, check out the following resources:
