Guilhem Lavaux edited this page 2026-04-05 09:17:24 +00:00

llamactl

A command-line interface for managing and interacting with ollama_proxy_3.

Author: Guilhem Lavaux
Copyright: CNRS

What is this?

llamactl is the companion CLI for ollama_proxy_3, a managed reverse proxy that sits in front of one or more Ollama-compatible instances running on dedicated or SLURM-managed HPC infrastructure.

Instead of talking directly to an Ollama server, users connect through the proxy, which handles authentication, server lifecycle (starting/stopping SLURM jobs), model management, and request routing. llamactl exposes all of that control surface from your terminal.

 you ──► llamactl ──► ollama_proxy_3 ──► SLURM ──► Ollama instance(s)

Basic usage

You can change your password, create API keys (labelled as tokens), and set up a local proxy server that behaves like Ollama over a pre-authenticated connection. The corresponding commands are:

  • Change password: llamactl user password <CURRENT_PASSWORD> <NEW_PASSWORD>
  • Create an API key: llamactl user api-key new [<OPTIONAL LABEL FOR KEY>]
  • Set up a local server: llamactl serve [-p <PORT>] [-s <REMOTE SERVER NAME TO USE>]

Features

  • Server lifecycle — start and stop Ollama backend servers via SLURM
  • Job tracking — monitor server startup progress and model pulls via streaming (SSE)
  • Model management — list available models, pull new ones onto a specific server
  • Local proxy — run a local pass-through server on port 11434 so standard Ollama clients (e.g. ollama, Open WebUI) work transparently without reconfiguration
  • User administration — manage users, passwords, admin roles, SLURM access, and API keys
  • SLURM visibility — inspect node status and GPU availability
  • Metrics — fetch Prometheus-format metrics from the proxy

Installation

Prerequisites: Rust 1.94+

cargo install --path .
# or build manually
cargo build --release
# binary: target/release/llamactl

Pre-built binaries for Linux x86_64 are attached to each release.
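
If the shell cannot find llamactl after cargo install, the install directory is probably missing from PATH. By default Cargo places installed binaries in $CARGO_HOME/bin, which falls back to ~/.cargo/bin when CARGO_HOME is unset. The helpers below are a hypothetical sketch for checking this; the function names are illustrative and not part of llamactl or Cargo.

```shell
# Hypothetical helpers, not part of llamactl or Cargo.

# Print the directory where `cargo install` places binaries by default.
cargo_bin_dir() {
    echo "${CARGO_HOME:-$HOME/.cargo}/bin"
}

# Succeed if the directory given as $1 is an entry of PATH.
path_has_dir() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

A possible use is `path_has_dir "$(cargo_bin_dir)" || export PATH="$(cargo_bin_dir):$PATH"`, which adds the directory only when it is absent.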

Configuration

Set these environment variables to avoid passing flags on every invocation:

Variable               Flag         Description
OLLAMA_PROX_API_URL    -u, --url    Base URL of the ollama_proxy_3 instance
OLLAMA_PROX_API_TOKEN  -t, --token  Bearer token for authentication

export OLLAMA_PROX_API_URL=https://your-proxy.example.com
export OLLAMA_PROX_API_TOKEN=your-api-token
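
A quick pre-flight check can catch a missing variable before the first command fails. The sketch below is a hypothetical shell helper (the name llamactl_check_env is illustrative and not part of llamactl). It only tests that the two variables listed above are non-empty and does not validate the URL or the token.

```shell
# Hypothetical helper, not part of llamactl: report which of the two
# configuration variables are missing from the environment.
llamactl_check_env() {
    missing=0
    for var in OLLAMA_PROX_API_URL OLLAMA_PROX_API_TOKEN; do
        # Indirect lookup of the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    # Exit status 0 when both are set, 1 otherwise.
    return "$missing"
}
```

It could be chained in front of any command, for example `llamactl_check_env && llamactl list`.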

Usage

llamactl [OPTIONS] <COMMAND>

Server management

# List all servers
llamactl list

# Show detailed info for a server
llamactl show my-server

# Start / stop a server (SLURM job)
llamactl start my-server
llamactl stop my-server

# Check SLURM node status and GPU availability
llamactl status-slurm my-server
llamactl status-slurm my-server --avail

Job tracking

# List pending startup jobs
llamactl progress list

# Stream startup progress for a job
llamactl progress query <JOB_ID>

# Cancel a job
llamactl progress cancel <JOB_ID>

Models

# List models (all servers, or filtered)
llamactl list-models
llamactl list-models my-server

# Pull a model onto a server
llamactl pull start llama3.2 my-server

# Check pull progress
llamactl pull status <JOB_ID>

Local proxy

Starts a local HTTP server (port 11434 by default, changeable with --port) that forwards requests to the upstream proxy and automatically injects your credentials. Any standard Ollama client pointed at http://localhost:11434 works as-is.

# Forward all requests, let the proxy pick the server
llamactl serve

# Pin to a specific backend server
llamactl serve --server my-server

# Use a different port
llamactl serve --port 8080

# Print request/response debug info
llamactl serve --debug
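
To check that the local server is forwarding requests, one option is to query a standard Ollama endpoint through it. The sketch below uses curl against /api/tags, the Ollama API route that lists models. The two function names are hypothetical, and the sketch assumes llamactl serve is already running in another terminal.

```shell
# Hypothetical helpers for talking to a running `llamactl serve`.

# Build the base URL of the local pass-through server.
# 11434 is the default port; pass another one as $1 if the server was
# started with `llamactl serve --port <PORT>`.
local_proxy_url() {
    port="${1:-11434}"
    echo "http://localhost:${port}"
}

# List models via the standard Ollama endpoint /api/tags.
# No Authorization header is sent here: as described above, the local
# server adds the credentials before forwarding the request.
list_models_via_proxy() {
    curl -fsS "$(local_proxy_url "${1:-}")/api/tags"
}
```

When the server runs on a non-default port, the ollama CLI can be pointed at it through its OLLAMA_HOST environment variable, for example `OLLAMA_HOST=localhost:8080 ollama list`.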

Proxy worker status & metrics

llamactl worker-status
llamactl metrics

User management (admin only)

# List users
llamactl user list

# Add / remove users
llamactl user add alice secret123
llamactl user remove alice

# Grant or revoke admin / SLURM access
llamactl user set-admin alice --is_admin true
llamactl user set-slurm-access alice --can_use_slurm true

# Change your own password (any user)
llamactl user password current-pass new-pass

# API key management
llamactl user api-key new "my-script"
llamactl user api-key list
llamactl user api-key remove <KEY_ID>

# Manage another user's keys (admin)
llamactl user api-key --user alice list

Relation to ollama_proxy_3

llamactl is a pure client — it has no business logic of its own beyond formatting requests and displaying responses. All state (servers, users, jobs, models) lives in ollama_proxy_3.

The proxy exposes two API namespaces that llamactl consumes:

  • /proxy/v1/… — server lifecycle, job tracking, model pulls, SLURM status, metrics
  • /proxy/v2/user/… — user and API key management

The serve subcommand additionally forwards the standard Ollama API (/api/…) so that unmodified Ollama-compatible tools can connect through the proxy without knowing about it.
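
For a rough idea of what serve does on your behalf, the sketch below sends the same kind of request directly to the upstream proxy with the token attached. It rests on two assumptions that this page does not confirm: that the proxy accepts the token as a standard HTTP Authorization: Bearer header, and that it serves the Ollama API under /api/ at its base URL. The function names are hypothetical.

```shell
# Hypothetical sketch, not part of llamactl.

# Build the Authorization header from the configured token.
# Assumption: the proxy expects a standard HTTP Bearer header.
proxy_auth_header() {
    printf 'Authorization: Bearer %s' "${OLLAMA_PROX_API_TOKEN:?token not set}"
}

# List models by calling the upstream proxy directly.
# Assumption: the Ollama route /api/tags is reachable at the base URL.
list_models_direct() {
    curl -fsS -H "$(proxy_auth_header)" \
        "${OLLAMA_PROX_API_URL:?URL not set}/api/tags"
}
```

With llamactl serve running, clients skip the header entirely and talk to the local port instead.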

License

MIT