# Ollama Proxy Server
A secure, load-balanced proxy server for Ollama, designed to route requests to multiple Ollama instances while enforcing authentication, rate limiting, and monitoring.
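The core routing idea can be sketched with a simple round-robin selector over a backend pool. This is an illustrative sketch only, not the project's actual code: the backend URLs and function name are assumptions, and the real proxy's selection strategy may differ.

```python
from itertools import cycle

# Hypothetical pool of Ollama instances; real deployments configure their own.
backends = cycle(["http://ollama-1:11434", "http://ollama-2:11434"])

def next_backend() -> str:
    # Round-robin: each incoming request is forwarded to the next instance.
    return next(backends)

print(next_backend())  # http://ollama-1:11434
print(next_backend())  # http://ollama-2:11434
print(next_backend())  # http://ollama-1:11434
```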
## Features
- ✅ Load Balancing – Distributes requests across multiple Ollama servers.
- ✅ Authentication – Secure API access with bcrypt-hashed keys.
- ✅ Queue Management – Limits concurrent requests to prevent overload.
- ✅ Prometheus Metrics – Monitor request counts, queue sizes, and active connections.
- ✅ Logging – Detailed request logging for auditing.
- ✅ Allowed Paths – Restricts access to specific Ollama endpoints.
## Requirements
- Python 3.11+
- FastAPI
- Uvicorn
- Prometheus Client
- bcrypt (for key hashing)
## Installation
1. Clone the repository and install the dependencies with Poetry:

```shell
git clone https://git.aquila-consortium.org/guilhem_lavaux/ollama_proxy_2
cd ollama_proxy_2
poetry shell  # requires the poetry shell plugin
poetry install
```

2. Start the proxy:

```shell
ollama_proxy
```
The proxy server is now running on its configured port.
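Once the proxy is up, clients authenticate each request with their API key. The sketch below shows what such a request might look like; the port, endpoint path, header name, and payload shape are assumptions (the path follows the standard Ollama API), so check your deployment's configuration.

```python
import json
import urllib.request

# Build an authenticated request to the proxy (URL, header name, and
# payload are illustrative assumptions, not the proxy's documented API).
payload = json.dumps({"model": "llama3", "prompt": "Hello"}).encode()
req = urllib.request.Request(
    "http://localhost:8080/api/generate",
    data=payload,
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # key issued by the proxy admin
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending it requires a running proxy:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```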