Running the WoolyAI Server
Prerequisites
- A host machine with a compatible GPU (NVIDIA or AMD currently)
- A license.json file. You can get it from https://woolyai.com/signup/
- Docker installed on the GPU host machine.
- Choose the proper Docker image for your GPU vendor (see the quick check below):
  - NVIDIA: woolyai/server:nvidia-latest (x86_64 & aarch64)
  - AMD: woolyai/server:amd-latest (x86_64)
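Before moving on, you can confirm the host actually sees its GPUs and optionally pre-pull the image. This is a minimal sketch, assuming the vendor's driver tools (nvidia-smi or rocm-smi) are already installed on the host; the run commands below also pull the image automatically via --pull always:
# NVIDIA host: confirm the driver and GPUs are visible
nvidia-smi
# AMD host: confirm ROCm sees the GPUs
rocm-smi
# Optionally pre-pull the image for your vendor
docker pull woolyai/server:nvidia-latest   # or woolyai/server:amd-latest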
Setup
- Create a directory for the server VRAM cache
mkdir woolyai-server-vram-cache
- Create the server config file:
woolyai-server-config.toml:
[SERVER]
LISTEN_ADDR = tcp::443
# Optional SSL endpoint. Uncomment after placing certfile.pem in working dir.
# LISTEN_ADDR = ssl::443
# SSL_CERT_FILE = certfile.pem
# SSL_KEY_FILE = certfile.pem
########################
# Controller integration (leave blank if not using a controller).
########################
## Note: You can comma separate multiple controller URLs
# CONTROLLER_NATS_URL = nats://localhost:4222
# NODE_NAME must be unique across all nodes in the cluster
# NODE_NAME = my-node
# NODE_ID will be auto-generated from NODE_NAME if not set (must be a valid UUID)
# NODE_ID = 159e6f46-9398-11f0-bca3-6b6ea1493108
# NODE_ADDRESS is the address of the node the client will connect to
# NODE_ADDRESS = 127.0.0.1
# Global cache behaviour: OFF, RECORD, or REPLAY (default).
GLOBAL_CACHE_MODE = OFF
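If you plan to enable the SSL endpoint, a certfile.pem must exist before the server starts. Below is a minimal sketch for creating a self-signed certificate for testing only, assuming the server accepts a combined key-and-certificate PEM (which is what the commented SSL_CERT_FILE/SSL_KEY_FILE settings above suggest):
# Generate a self-signed key and certificate (testing only; use a CA-issued cert in production)
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=$(hostname)" -keyout key.pem -out cert.pem
# SSL_CERT_FILE and SSL_KEY_FILE both point to certfile.pem, so combine them into one file
cat cert.pem key.pem > certfile.pem
Note that the run commands below do not mount a certificate into the container; if you enable SSL you will likely need an additional -v bind for certfile.pem (the in-container path is not documented here).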
- Make sure you have the woolyai-server-license.json file in the current directory. You can get it from WoolyAI support.
- Run the container:
NVIDIA
docker run -d --name woolyai-server \
--gpus all \
--network=host \
--pull always \
--entrypoint /usr/local/bin/server-entrypoint.bash \
-v "./woolyai-server-vram-cache:/home/automation/.wooly/shared_mem:rw" \
-v "./woolyai-server-config.toml:/home/automation/.wooly/config:ro" \
-v "./woolyai-server-license.json:/home/automation/.wooly/license.json:ro" \
woolyai/server:nvidia-latest
To delete the container, run docker rm -f woolyai-server.
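If you only want to expose a subset of the host's GPUs to the server, Docker's --gpus flag accepts a device list. This is a plain Docker option rather than a documented WoolyAI setting, and it assumes the server simply uses whichever GPUs are visible inside the container. Adjust the indices for your host and replace --gpus all with, for example:
--gpus '"device=0,1"'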
Supported Architectures
- RTX PRO 6000 Blackwell Server Edition
- GB300, B300
- GB200, B200
- GH200, H200, H100
- L4, L40, L40S
- A100, A30
- A40, A10, A16, A2
AMD
warning
AMD support is in beta. We are currently working on optimizing the performance of the Unified WoolyAI Client Container to bring it as close to native as possible on AMD. Watch for more updates.
Supported Architectures
- MI300
- MI325X
- MI350
docker run -d --name woolyai-server \
--device /dev/kfd --device /dev/dri --security-opt seccomp=unconfined \
--network=host \
--pull always \
--entrypoint /usr/local/bin/server-entrypoint.bash \
-v "./woolyai-server-vram-cache:/home/automation/.wooly/shared_mem:rw" \
-v "./woolyai-server-config.toml:/home/automation/.wooly/config:ro" \
-v "./woolyai-server-license.json:/home/automation/.wooly/license.json:ro" \
woolyai/server:amd-latest
To delete the container, run docker rm -f woolyai-server.
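If the AMD container starts but cannot access the GPUs, a common ROCm-in-Docker tweak (an assumption here, not part of the documented command) is to add the host's video and render groups to the container by appending the following flags to the docker run command above:
--group-add video --group-add render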
- Check the logs with docker logs woolyai-server to make sure the server started properly. You should see "server listening on" if it worked.
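For a quick scripted check of that log line (the exact wording may vary between server versions):
docker logs woolyai-server 2>&1 | grep -m1 "server listening on"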
info
The woolyai-server-vram-cache folder is optional; it is where you can cache models in VRAM with the VRAM Model Cache Tool, using the woolyai-vram-model-cache --root ./woolyai-server-vram-cache ... command.
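Because the cache directory is bind-mounted from the host, you can inspect what has been cached directly on the host, for example:
ls -lh ./woolyai-server-vram-cache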
FAQ
- There is no need to go into the container.
- You can see logs with:
docker logs -f woolyai-server