Set Up The Server

Prerequisites

  • A host machine with a compatible GPU (NVIDIA currently)
  • A license.json file. You can get it from https://woolyai.com/signup/
  • Docker installed on the GPU host machine.
  • Choose the proper Docker image from the WoolyAI Server Docker Hub. We provide images for NVIDIA at specific driver versions; generally, pick the image whose driver version is as close as possible to your host's (see the check below).
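To see which driver version the host is running before picking an image tag, one option is:

nvidia-smi --query-gpu=driver_version --format=csv,noheader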

Setup

  1. Create a directory for the server VRAM cache
mkdir woolyai-server-vram-cache
  2. Create the server config file, woolyai-server-config.toml:
[SERVER]

LISTEN_ADDR = tcp::443
# Optional SSL endpoint. Uncomment after placing certfile.pem in working dir.
# LISTEN_ADDR = ssl::443
# SSL_CERT_FILE = certfile.pem
# SSL_KEY_FILE = certfile.pem

########################
# Controller integration (leave blank if not using a controller).
########################
## Note: You can comma separate multiple controller URLs
# CONTROLLER_NATS_URL = nats://localhost:4222
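## Example with multiple controllers (the hostnames below are placeholders):
# CONTROLLER_NATS_URL = nats://nats-1:4222,nats://nats-2:4222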
# NODE_NAME must be unique across all nodes in the cluster
# NODE_NAME = my-node
# NODE_ID will be auto-generated from NODE_NAME if not set (must be a valid UUID)
# NODE_ID = 159e6f46-9398-11f0-bca3-6b6ea1493108
# NODE_ADDRESS is the address of the node the client will connect to
# NODE_ADDRESS = 127.0.0.1

# Global cache behaviour: OFF, RECORD, or REPLAY (default).
GLOBAL_CACHE_MODE = OFF
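If you enable the optional SSL endpoint above, the server expects the certificate file referenced by SSL_CERT_FILE/SSL_KEY_FILE to be in its working directory. As a rough sketch only (a self-signed certificate, the keyfile.pem name, and the CN are all assumptions; adjust SSL_CERT_FILE/SSL_KEY_FILE to the files you actually use, and mount them into the container when you run it below), you could generate one with openssl:

openssl req -x509 -newkey rsa:4096 -nodes -days 365 \
  -keyout keyfile.pem -out certfile.pem \
  -subj "/CN=my-woolyai-server"

The sample config points both SSL_CERT_FILE and SSL_KEY_FILE at certfile.pem, which implies a single combined PEM file; if your setup requires that layout, one way to get it is to concatenate the key and certificate into one file.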
  3. Make sure you have the woolyai-server-license.json file in the current directory. You can get it from WoolyAI support.

  4. Run the container

NVIDIA

docker run -d --name woolyai-server \
--gpus all \
--network=host \
--pull always \
--entrypoint /usr/local/bin/server-entrypoint.bash \
-v "./woolyai-server-vram-cache:/home/automation/.wooly/shared_mem:rw" \
-v "./woolyai-server-config.toml:/home/automation/.wooly/config:ro" \
-v "./woolyai-server-license.json:/home/automation/.wooly/license.json:ro" \
woolyai/server:cuda12.9.1-latest
  5. Check the logs with docker logs woolyai-server to make sure the server started properly. You should see "server listening on" if it worked.
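For example, to filter the logs for the startup message:

docker logs woolyai-server 2>&1 | grep -i "server listening on"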
Info: The woolyai-server-vram-cache folder (optional) is where you can cache models in VRAM with the VRAM Model Cache Tool. This is done with the woolyai-vram-model-cache --root ./woolyai-server-vram-cache . . . command.

FAQ

  • There is no need to go into the container.
  • You can see logs with: docker logs -f woolyai-server
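  • The config file is bind-mounted read-only, so edits to woolyai-server-config.toml on the host do not apply to the running container automatically. Assuming the server re-reads its config on startup, restarting the container should be enough: docker restart woolyai-server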