
LLMs and Chat

Pegasus comes with an optional Chat UI for interacting with LLMs. This section covers how it works and the various supported options.

All AI calls in Pegasus go through Pydantic AI. You can configure the model used by setting the DEFAULT_AI_MODEL value in your settings.py or environment variables / .env file. The format is provider:model_name, for example:

DEFAULT_AI_MODEL="openai:gpt-5-mini"

The chat UI and all agents will use whatever is set in DEFAULT_AI_MODEL.

You will also need to set the appropriate API key environment variables for your chosen provider, e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY, etc. See the Pydantic AI model docs for available models and configuration.
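
As a rough illustration of the provider:model_name format and the matching API key, the sketch below splits the setting and reports which key variable is unset. It is not part of Pegasus or Pydantic AI: the helper names and the PROVIDER_KEY_VARS mapping are hypothetical, and only the two key names mentioned above are included.

```python
import os

# Assumption: illustrative mapping only. OPENAI_API_KEY and ANTHROPIC_API_KEY
# are the names mentioned in this section; check the Pydantic AI docs for others.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def split_model_setting(value):
    """Split a 'provider:model_name' string into (provider, model_name)."""
    # Split on the first colon only, so model names containing colons survive.
    provider, sep, model_name = value.partition(":")
    if not sep or not provider or not model_name:
        raise ValueError(f"Expected 'provider:model_name', got {value!r}")
    return provider, model_name


def missing_key_var(value, environ=None):
    """Return the name of the unset API key variable, or None if nothing is missing."""
    environ = os.environ if environ is None else environ
    provider, _ = split_model_setting(value)
    key_var = PROVIDER_KEY_VARS.get(provider)
    if key_var is not None and not environ.get(key_var):
        return key_var
    return None
```

Calling missing_key_var(os.environ.get("DEFAULT_AI_MODEL", "")) at startup would raise a ValueError for a malformed value and return the key name to set when the provider is one of the two listed.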

To run models like Mixtral or Llama3, you will need to run an Ollama server in a separate process.

  1. Download and run Ollama or use the Docker image
  2. Download the model you want to run:
    ollama pull llama3
    # or with docker
    docker exec -it ollama ollama pull llama3
    See the Pydantic AI Ollama docs for more details.
  3. Set DEFAULT_AI_MODEL in your .env file to point to the Ollama model. For example:
    DEFAULT_AI_MODEL="ollama:llama3"
  4. Restart your Django server.
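
If responses do not come back after these steps, it can help to confirm that the Ollama server is running and that the model has been pulled. The sketch below is one way to do that from Python. It assumes Ollama is listening on its default local address and that its model-listing endpoint returns a JSON object with a "models" list whose entries have a "name" field. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if you run it elsewhere.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_is_pulled(tags_payload, model_name):
    """Check whether model_name appears in a model-listing payload."""
    # A name without an explicit tag is treated as the ":latest" tag.
    wanted = model_name if ":" in model_name else f"{model_name}:latest"
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return wanted in names


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, model_is_pulled(fetch_tags(), "llama3") should be true once step 2 has completed. A connection error from fetch_tags suggests the server itself is not reachable.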

The Chat UI has several implementations, and your build configuration determines which one your project uses.

If you build with both asynchronous functionality and HTMX enabled, your project will use a websocket-based Chat UI. This version supports streaming responses and is the recommended option. It is also currently the only option that supports the chat widget, which can be embedded on any page.

If you build without asynchronous functionality enabled, the chat UI will instead use Celery and polling. The React version of the chat UI also uses Celery and polling. In both cases, Celery must be running to get responses from the LLM.
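
To make the polling approach concrete, here is a generic sketch of the pattern. It is not Pegasus's actual code. The caller repeatedly asks whether a background job has produced a result and gives up after a timeout. The check callable is a hypothetical stand-in for whatever asks the server whether the Celery task has finished.

```python
import time


def poll_until_ready(check, interval=1.0, timeout=60.0):
    """Call check() until it returns a non-None value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("No response before the timeout expired")
        # Wait before asking again so the server is not flooded with requests.
        time.sleep(interval)
```

The trade-off compared with the websocket version is that the reply only becomes visible on the next poll after the task finishes, and nothing comes back at all if no Celery worker is running to process the task.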

The chat widget is currently only available if your project is using HTMX and Async.

The chat widget is a small component that can be embedded on any page of your app. By default, it is included in your chat_home.html file, so you can view a demo of the widget from the “AI Chat” tab in your app.

To add the chat widget to a page, you can follow the example in chat_home.html. There are two steps:

First, include the component at the end of the <body> tag. If you’re extending the app_base.html file this will be at the end of the {% block app %} block.

{% block app %}
...
{% include "chat/components/chat_overlay.html" %}
{% endblock app %}

Then, load the ws_initialize script in your page JavaScript block, using the tag that matches your front-end build tool:

{% block page_js %}
<!-- If using vite -->
{% vite_asset 'assets/javascript/chat/ws_initialize.ts' %}
<!-- If using webpack -->
<script src="{% static 'js/chat-ws-initialize-bundle.js' %}" defer></script>
{% endblock page_js %}

If you want to add the chat widget to all pages in your app, you can add it to the base.html file. For all logged-in pages, you can add it to the app_base.html file.

As of version 2025.9.1, Pegasus includes a set of example agents that you can use as a foundation for building your own agents. These agents are built with Pydantic AI, and include:

  • A weather and location lookup agent, with tools to do geo-lookups and access current weather information.
  • A chatbot to interact with employee application data models, with tools to work with employee data.
  • A chatbot to interact with the system database, with an MCP tool to access Postgres data.
  • A tool to send emails.

For more information on these agents and how they work, you can watch this video:

Agents use the same DEFAULT_AI_MODEL setting as the chat UI. See Configuring your AI model above for details.

Previous versions of Pegasus used LiteLLM and had different configuration options including LLM_MODELS, DEFAULT_LLM_MODEL, and DEFAULT_AGENT_MODEL. As of version 2026.2.1, all AI calls go through Pydantic AI exclusively.

For documentation on the previous setup, see the older version of this page.
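
When upgrading an older project, a small check like the one below can flag leftover settings from the LiteLLM-based setup. This is a sketch rather than anything shipped with Pegasus: the function name is hypothetical, and the setting names are the ones listed above.

```python
# Setting names from the previous LiteLLM-based configuration, as listed above.
LEGACY_AI_SETTINGS = ("LLM_MODELS", "DEFAULT_LLM_MODEL", "DEFAULT_AGENT_MODEL")


def find_legacy_ai_settings(setting_names):
    """Return the legacy setting names that appear in setting_names."""
    present = set(setting_names)
    return [name for name in LEGACY_AI_SETTINGS if name in present]
```

You could pass it the keys of os.environ or the names defined in your settings module. Any names it returns can be replaced by a single DEFAULT_AI_MODEL value.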