llamactl
A command-line interface for managing and interacting with ollama_proxy_3.
Author: Guilhem Lavaux
Copyright: CNRS
What is this?
llamactl is the companion CLI for ollama_proxy_3, a managed reverse proxy that sits in front of one or more Ollama-compatible instances running on dedicated or SLURM-managed HPC infrastructure.
Instead of talking directly to an Ollama server, users connect through the proxy, which handles authentication, server lifecycle (starting/stopping SLURM jobs), model management, and request routing. llamactl exposes all of that control surface from your terminal.
you ──► llamactl ──► ollama_proxy_3 ──► SLURM ──► Ollama instance(s)
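As a quick illustration of that flow (my-server is a placeholder name; every command here is covered in detail under Usage below):
# start a backend on the SLURM side, then expose it locally
llamactl start my-server
llamactl serve --server my-server
# standard Ollama clients pointed at http://localhost:11434 now go through the proxy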
Features
- Server lifecycle — start and stop Ollama backend servers via SLURM
- Job tracking — monitor server startup progress and model pulls via streaming (SSE)
- Model management — list available models, pull new ones onto a specific server
- Local proxy — run a local pass-through server on port 11434 so standard Ollama clients (e.g. ollama, Open WebUI) work transparently without reconfiguration
- User administration — manage users, passwords, admin roles, SLURM access, and API keys
- SLURM visibility — inspect node status and GPU availability
- Metrics — fetch Prometheus-format metrics from the proxy
Installation
Prerequisites: Rust 1.94+
cargo install --path .
# or build manually
cargo build --release
# binary: target/release/llamactl
Pre-built binaries for Linux x86_64 are attached to each release.
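A quick way to check the build is to print the help listing (assuming the standard --help flag generated for the CLI):
# list global options and available subcommands
llamactl --help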
Configuration
Set these environment variables to avoid passing flags on every invocation:
| Variable | Flag | Description |
|---|---|---|
| OLLAMA_PROX_API_URL | -u, --url | Base URL of the ollama_proxy_3 instance |
| OLLAMA_PROX_API_TOKEN | -t, --token | Bearer token for authentication |
export OLLAMA_PROX_API_URL=https://your-proxy.example.com
export OLLAMA_PROX_API_TOKEN=your-api-token
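The same values can also be passed per invocation with the flags from the table above, for example:
# equivalent one-off invocation using flags instead of environment variables
llamactl --url https://your-proxy.example.com --token your-api-token list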
Usage
llamactl [OPTIONS] <COMMAND>
Server management
# List all servers
llamactl list
# Show detailed info for a server
llamactl show my-server
# Start / stop a server (SLURM job)
llamactl start my-server
llamactl stop my-server
# Check SLURM node status and GPU availability
llamactl status-slurm my-server
llamactl status-slurm my-server --avail
Job tracking
# List pending startup jobs
llamactl progress list
# Stream startup progress for a job
llamactl progress query <JOB_ID>
# Cancel a job
llamactl progress cancel <JOB_ID>
Models
# List models (all servers, or filtered)
llamactl list-models
llamactl list-models my-server
# Pull a model onto a server
llamactl pull start llama3.2 my-server
# Check pull progress
llamactl pull status <JOB_ID>
Local proxy
Starts a local HTTP server on port 11434 that forwards requests to the upstream proxy with your credentials automatically injected. Any standard Ollama client pointed at http://localhost:11434 will work as-is.
# Forward all requests, let the proxy pick the server
llamactl serve
# Pin to a specific backend server
llamactl serve --server my-server
# Use a different port
llamactl serve --port 8080
# Print request/response debug info
llamactl serve --debug
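With the local proxy running, any standard Ollama client can be pointed at it. The commands below are an illustration using the stock ollama CLI and the standard Ollama /api/tags model-listing endpoint:
# list models through the local proxy with the stock Ollama client
OLLAMA_HOST=http://localhost:11434 ollama list
# or query the forwarded Ollama API directly
curl http://localhost:11434/api/tags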
Shell completions
Generate and install a completion script for your shell:
# Bash
llamactl completions bash > ~/.local/share/bash-completion/completions/llamactl
# Zsh (add to a directory on your $fpath)
llamactl completions zsh > ~/.zfunc/_llamactl
# then add `fpath=(~/.zfunc $fpath)` and `autoload -Uz compinit && compinit` to ~/.zshrc
# Fish
llamactl completions fish > ~/.config/fish/completions/llamactl.fish
# PowerShell
llamactl completions powershell >> $PROFILE
# Elvish
llamactl completions elvish
Proxy worker status & metrics
# Show proxy worker status
llamactl worker-status
# Fetch Prometheus-format metrics
llamactl metrics
User management (admin only)
# List users
llamactl user list
# Add / remove users
llamactl user add alice secret123
llamactl user remove alice
# Grant or revoke admin / SLURM access
llamactl user set-admin alice --is_admin true
llamactl user set-slurm-access alice --can_use_slurm true
# Change your own password
llamactl user password current-pass new-pass
# API key management
llamactl user api-key new "my-script"
llamactl user api-key list
llamactl user api-key remove <KEY_ID>
# Manage another user's keys (admin)
llamactl user api-key --user alice list
Relation to ollama_proxy_3
llamactl is a pure client — it has no business logic of its own beyond formatting requests and displaying responses. All state (servers, users, jobs, models) lives in ollama_proxy_3.
The proxy exposes two API namespaces that llamactl consumes:
- /proxy/v1/… — server lifecycle, job tracking, model pulls, SLURM status, metrics
- /proxy/v2/user/… — user and API key management
The serve subcommand additionally forwards the standard Ollama API (/api/…) so that unmodified Ollama-compatible tools can connect through the proxy without knowing about it.
License
MIT