Note: This is a quick reference of common endpoints. For the complete API reference, see docs/API.md in the repository.

Endpoints under /v1:

| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/models | List available models |
| POST | /v1/chat/completions | Chat completion (streaming) |
| POST | /v1/messages | Messages-style completion |

Endpoints under /api:

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/tags | List all models |
| POST | /api/chat | Chat completion |
| POST | /api/generate | Text generation |
| POST | /api/show | Model info |
| GET | /api/ps | Loaded models |

Mesh endpoints:

| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Mesh status |
| GET | /nodes | All nodes with models |
| GET | /models | Model list |
| POST | /chat | Mesh routing |
| POST | /pull | Download model |
OpenAI-compatible request:
curl -X POST http://localhost/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": true
  }'
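
With "stream": true, OpenAI-compatible servers usually reply with server-sent events: one `data: {...}` line per chunk, ending with `data: [DONE]`. The sketch below assumes this server follows that convention, which this quick reference does not confirm. `iter_stream_text` is a hypothetical helper name; it only parses lines that have already been read, so the HTTP call is left to whichever client is in use.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming lines.

    Assumption: each chunk arrives as 'data: <json>' and the stream
    ends with 'data: [DONE]' (the OpenAI convention, not confirmed
    for this server).
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured server output.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

Keeping the parser separate from the transport means the same function can be fed from any client that exposes the response as an iterable of decoded lines.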
Use these names to route to the best available model:
| Name | Description |
|---|---|
| parley:best | Most capable model |
| parley:code | Best code model |
| parley:fast | Lowest latency |
| parley:reason | Best reasoning |
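
These routing names presumably go wherever a concrete model name would, for example in the "model" field of the chat completion request shown earlier. That placement is an assumption, as are the helper name `build_chat_request` and its validation step. The following is only a sketch of building such a payload.

```python
import json

# Routing names listed in this quick reference.
ROUTING_ALIASES = {"parley:best", "parley:code", "parley:fast", "parley:reason"}


def build_chat_request(model: str, prompt: str, stream: bool = False) -> str:
    """Return a JSON body shaped like the curl example's payload.

    Rejecting unknown 'parley:' names is a choice made for this sketch;
    the server itself may accept names that are not listed here.
    """
    if model.startswith("parley:") and model not in ROUTING_ALIASES:
        raise ValueError(f"unknown routing alias: {model}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return json.dumps(body)


print(build_chat_request("parley:code", "Write a binary search in Go."))
```

Checking the alias on the client side catches typos such as `parley:cod` before a request is sent, at the cost of needing an update whenever the list of routing names changes.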