API Reference

Note: This is a quick reference to the most common endpoints. For the complete API reference, see `docs/API.md` in the repository.

OpenAI Compatible

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/v1/models` | List available models |
| POST | `/v1/chat/completions` | Chat completion (streaming) |
| POST | `/v1/messages` | Messages-style completion |
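
As a minimal sketch, the model list can be fetched with a plain GET. The `BASE_URL` variable and the `endpoint_url` and `list_models` helpers are illustrative names, not part of the API. The default host mirrors the `http://localhost` used elsewhere in this reference, and the response format is not parsed here because this page does not specify it.

```shell
# Hypothetical base URL; override it if the server listens elsewhere.
BASE_URL="${BASE_URL:-http://localhost}"

# Join the base URL and an endpoint path, dropping one trailing slash on the base.
endpoint_url() {
  printf '%s%s\n' "${BASE_URL%/}" "$1"
}

# GET /v1/models. -f makes curl exit non-zero on HTTP errors;
# -sS hides the progress meter but still prints error messages.
list_models() {
  curl -fsS "$(endpoint_url /v1/models)"
}
```

Calling `list_models` prints whatever the server returns, so the output can be piped into a JSON tool of your choice.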

Ollama Compatible

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/api/tags` | List all models |
| POST | `/api/chat` | Chat completion |
| POST | `/api/generate` | Text generation |
| POST | `/api/show` | Model info |
| GET | `/api/ps` | Loaded models |
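
The sketch below builds a request body for `/api/chat`. It assumes the compatible endpoint accepts the same fields as Ollama's own chat API (`model`, `messages`, `stream`); this page does not confirm that, so check `docs/API.md` before relying on it. The helper names and the `BASE_URL` variable are hypothetical.

```shell
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
# Newlines and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: ollama_chat_body MODEL PROMPT
# Prints a non-streaming chat body (field names assumed from Ollama's API).
ollama_chat_body() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# POST the body to /api/chat; "-d @-" tells curl to read the body from stdin.
ollama_chat() {
  ollama_chat_body "$1" "$2" |
    curl -fsS -X POST "${BASE_URL:-http://localhost}/api/chat" \
      -H 'Content-Type: application/json' -d @-
}
```

For example, `ollama_chat llama3.1:8b "hello"` would send a single user message. Escaping the prompt first keeps quotes in user text from breaking the JSON.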

Mesh Native

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/health` | Mesh status |
| GET | `/nodes` | All nodes with models |
| GET | `/models` | Model list |
| POST | `/chat` | Mesh routing |
| POST | `/pull` | Download model |
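
A script that depends on the mesh may want to wait until `/health` responds before sending requests. The following is a sketch under one assumption: that a successful HTTP status from `/health` means the mesh is reachable. The response body is not inspected because its format is not described here. `wait_for_mesh` and `BASE_URL` are hypothetical names.

```shell
# Usage: wait_for_mesh [ATTEMPTS] [DELAY_SECONDS]
# Polls GET /health until curl reports success or the attempts run out.
wait_for_mesh() {
  attempts=${1:-10}
  delay=${2:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f turns HTTP error statuses into a non-zero exit code;
    # -o /dev/null discards the body.
    if curl -fsS -o /dev/null "${BASE_URL:-http://localhost}/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

A typical use would be `wait_for_mesh 30 2 || exit 1` at the top of a deployment or test script.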

Chat Completion Example

OpenAI-compatible request:

curl -X POST http://localhost/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": true
  }'
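
With `"stream": true`, the response arrives incrementally. The filter below is a sketch that assumes the stream uses the same server-sent-events framing as OpenAI's API: one `data: <json>` line per chunk, ending with `data: [DONE]`. That framing is an assumption, not something stated on this page. `sse_payloads` is a hypothetical helper name.

```shell
# Read an event stream on stdin and print each JSON chunk on its own line.
# Stops at the "data: [DONE]" sentinel; blank and non-data lines are skipped.
sse_payloads() {
  cr=$(printf '\r')
  while IFS= read -r sse_line; do
    sse_line=${sse_line%"$cr"}          # tolerate CRLF line endings
    case "$sse_line" in
      'data: [DONE]') break ;;
      'data: '*) printf '%s\n' "${sse_line#data: }" ;;
    esac
  done
}
```

To use it, add `-sN` to the curl command above (`-N` disables curl's output buffering) and pipe the result into `sse_payloads`.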

Model Aliases

Use these aliases in place of a concrete model name to route a request to the best available model in each category:

parley:best — Most capable model
parley:code — Best code model
parley:fast — Lowest latency
parley:reason — Best reasoning
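
As a sketch of how an alias might be used, the helpers below check a name against the four aliases listed above and build a chat body with the alias in the `model` field. Passing the alias through `model` on `/v1/chat/completions` is an assumption based on the request example in this reference. `is_model_alias` and `alias_chat_body` are hypothetical names.

```shell
# Succeeds only for the four aliases listed in this section.
is_model_alias() {
  case "$1" in
    parley:best|parley:code|parley:fast|parley:reason) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: alias_chat_body ALIAS PROMPT
# The prompt is inserted as-is, so it must not contain double quotes
# or backslashes in this sketch.
alias_chat_body() {
  if ! is_model_alias "$1"; then
    printf 'unknown alias: %s\n' "$1" >&2
    return 1
  fi
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":true}\n' "$1" "$2"
}
```

The body can then be sent with `alias_chat_body parley:code "hello" | curl -sN -X POST http://localhost/v1/chat/completions -H 'Content-Type: application/json' -d @-`. Validating the name on the client side catches typos before a request is sent.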