Assistant LLM Setup
Configure .envs/local/backend.env to enable LLM-backed summaries on the simulation details page. If LLM support is disabled or misconfigured, the backend falls back to the deterministic metadata summary.
Required Settings
ASSISTANT_LLM_ENABLED=true
ASSISTANT_LLM_PROVIDER=ollama # ollama or livai
Use exactly one provider and configure only that provider's environment variables.
Local Ollama Setup
Recommended local default:
ASSISTANT_LLM_ENABLED=true
ASSISTANT_LLM_PROVIDER=ollama
ASSISTANT_OLLAMA_BASE_URL=http://localhost:11434
ASSISTANT_OLLAMA_MODEL=llama3.1:8b
ASSISTANT_OLLAMA_API_KEY=
ASSISTANT_LLM_TEMPERATURE=0.2
ASSISTANT_LLM_MAX_TOKENS=256
ASSISTANT_LLM_MAX_TOKENS=256 keeps local summaries concise on developer hardware. If unset, the backend runtime default is 2048.
Install and run Ollama:
- On macOS, install Ollama natively: https://docs.ollama.com/quickstart
- Pull a model:
make ollama-pull-fast # llama3.1:8b, faster local default
make ollama-pull-dev # gemma4:e4b, prompt-contract iteration
make ollama-pull-quality # gemma4:26b, quality checks
- Start Ollama in a separate terminal:
make ollama-serve
This runs ollama serve with OLLAMA_KEEP_ALIVE=-1, so models stay loaded while the server is running.
- Restart the backend:
make backend-run
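Before restarting the backend, it can help to confirm that the Ollama server is reachable and that the expected model has been pulled. The following is a minimal sketch, not part of SimBoard: the helper names `installed_models` and `fetch_tags` are hypothetical, and it assumes Ollama is listening on its default port and exposes its standard `GET /api/tags` endpoint.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> dict:
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        names = installed_models(fetch_tags())
    except OSError as exc:  # connection refused, timeout, DNS failure, HTTP error
        print(f"Ollama is not reachable: {exc}")
    else:
        print("llama3.1:8b pulled:", "llama3.1:8b" in names)
```

If the script reports that Ollama is not reachable, check that `make ollama-serve` is still running in its own terminal.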
Supported local model choices:
| Model | Use case |
|---|---|
| llama3.1:8b | Faster local summaries on typical developer hardware. |
| gemma4:e4b | Fast prompt-contract iteration. |
| gemma4:26b | Preferred quality checks. |
| gemma4:31b | Only for hardware that can support it. |
ASSISTANT_OLLAMA_BASE_URL=http://localhost:11434 is accepted and normalized internally to Ollama's OpenAI-compatible /v1 endpoint. Values that already include /v1 also work.
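The normalization described above could look roughly like the sketch below. This illustrates the behavior only and is not SimBoard's actual code: the function name `normalize_ollama_base_url` is hypothetical, and it assumes the only adjustment needed is appending `/v1` when it is absent.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_ollama_base_url(base_url: str) -> str:
    """Return base_url with a single trailing /v1 path segment and no trailing slash."""
    parts = urlsplit(base_url.strip())
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path = f"{path}/v1"
    # Drop any query string or fragment; only scheme, host, and path matter here.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(normalize_ollama_base_url("http://localhost:11434"))      # http://localhost:11434/v1
print(normalize_ollama_base_url("http://localhost:11434/v1/"))  # http://localhost:11434/v1
```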
On macOS, native Ollama is recommended. Docker Desktop on macOS does not support Ollama GPU acceleration, so Docker-based Ollama is useful only for CPU-only portability testing.
LivAI Setup
ASSISTANT_LLM_ENABLED=true
ASSISTANT_LLM_PROVIDER=livai
ASSISTANT_LIVAI_API_KEY=
ASSISTANT_LIVAI_MODEL=gpt-5.4
ASSISTANT_LIVAI_BASE_URL=https://livai-api.llnl.gov/
ASSISTANT_LLM_TEMPERATURE=0.2
ASSISTANT_LLM_MAX_TOKENS=8192
ASSISTANT_SNAPSHOT_MAX_CHARS=12000
For LivAI, ASSISTANT_LIVAI_API_KEY, ASSISTANT_LIVAI_MODEL, and ASSISTANT_LIVAI_BASE_URL are required.
Model Selection
| Model | Guidance |
|---|---|
| gpt-5.4 | Recommended full model. Completes structured output reliably and handles 8K+ token responses. |
| gpt-5.4-mini | Avoid for this workflow. It may truncate structured responses before completing all required fields such as limitations, citations, and suggested_followups. |
Token Budget Guidance
| Setting | Guidance |
|---|---|
| ASSISTANT_LLM_MAX_TOKENS | Use 4096 to 8192 for gpt-5.4; use 2048 for a mini model, if you use one despite its limitations. |
| ASSISTANT_SNAPSHOT_MAX_CHARS | Use 12000 to 16000 to balance detail and token budget; reduce to 8000 to 10000 for mini models. |
For current LivAI OpenAI-compatible chat endpoints, SimBoard omits ASSISTANT_LLM_TEMPERATURE for gpt-5* models because the endpoint rejects that parameter. ASSISTANT_LLM_MAX_TOKENS still applies.
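As a rough illustration of that rule, a request builder might drop the temperature only for `gpt-5*` models. This is a sketch under assumptions: `build_chat_params` is a hypothetical helper, and the `max_tokens` and `temperature` key names follow the common OpenAI-compatible convention rather than anything confirmed for SimBoard or LivAI.

```python
def build_chat_params(model: str, temperature: float, max_tokens: int) -> dict:
    """Build chat-completion parameters, omitting temperature for gpt-5* models."""
    params = {"model": model, "max_tokens": max_tokens}
    # Assumption from the text above: the endpoint rejects temperature for gpt-5* models.
    if not model.startswith("gpt-5"):
        params["temperature"] = temperature
    return params


print(build_chat_params("gpt-5.4", 0.2, 8192))
# {'model': 'gpt-5.4', 'max_tokens': 8192}
print(build_chat_params("llama3.1:8b", 0.2, 256))
# {'model': 'llama3.1:8b', 'max_tokens': 256, 'temperature': 0.2}
```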
Fallback Troubleshooting
After changing .envs/local/backend.env, restart the backend before testing again:
make backend-run
Common fallback reasons:
| Fallback reason | Meaning |
|---|---|
| fallback_reason=ollama_misconfigured | Missing ASSISTANT_OLLAMA_MODEL or ASSISTANT_OLLAMA_BASE_URL. |
| fallback_reason=livai_misconfigured | Missing ASSISTANT_LIVAI_API_KEY, ASSISTANT_LIVAI_MODEL, or ASSISTANT_LIVAI_BASE_URL. |
For Ollama, ASSISTANT_OLLAMA_API_KEY is optional for local runs and can stay blank unless an auth proxy requires it.
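To narrow down a misconfiguration before restarting, a small pre-flight check can report which required variables are empty for the selected provider. The sketch below mirrors the table above but is not the backend's own validation: `REQUIRED_VARS` and `missing_provider_vars` are hypothetical names, and it assumes blank values count as missing.

```python
import os

# Required variables per provider, taken from the fallback table above.
REQUIRED_VARS = {
    "ollama": ("ASSISTANT_OLLAMA_MODEL", "ASSISTANT_OLLAMA_BASE_URL"),
    "livai": (
        "ASSISTANT_LIVAI_API_KEY",
        "ASSISTANT_LIVAI_MODEL",
        "ASSISTANT_LIVAI_BASE_URL",
    ),
}


def missing_provider_vars(env) -> list[str]:
    """Return the required variables that are unset or blank for the chosen provider."""
    provider = env.get("ASSISTANT_LLM_PROVIDER", "").strip().lower()
    required = REQUIRED_VARS.get(provider)
    if required is None:
        raise ValueError(f"Unknown ASSISTANT_LLM_PROVIDER: {provider!r}")
    return [name for name in required if not env.get(name, "").strip()]


example_env = {
    "ASSISTANT_LLM_PROVIDER": "livai",
    "ASSISTANT_LIVAI_MODEL": "gpt-5.4",
    "ASSISTANT_LIVAI_BASE_URL": "https://livai-api.llnl.gov/",
}
print(missing_provider_vars(example_env))  # ['ASSISTANT_LIVAI_API_KEY']
```

To check the live environment, pass `os.environ` instead of `example_env` after loading .envs/local/backend.env into the shell.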