I want to experiment more with local models to understand their limits, so I want them to be easy to install and run. That suggests using ollama. I don’t have a beefy MacBook Pro, so I’d like to run them on my local Linux server.
These are some instructions for setting up ollama to run on a local Debian server, so that you can access it from your laptop on the same local subnet (which will almost always be the case if they are connected via the same home router).
Why not just use ollama’s default linux install script? Two reasons:
- Hosting on the subnet. By default, the install script sets up ollama so that if you run it on your Linux server, you can access it only from that machine, not from other hosts on your subnet.
- Compatibility/Uninstallability. By default, it installs ollama’s different components into different subdirectories (bin/, lib/, share/), all placed off of either /usr/local, /usr, or even /, depending on what it finds in $PATH.
If the install script chooses to install under /usr or /, I have no idea how this fits with Ubuntu etc., but it certainly violates Debian’s guidelines for where to install third-party software! If the installer chooses to install under /usr/local, that is more compatible, but it still makes ollama relatively hard to uninstall or upgrade, since its files do not end up under a single subdirectory that can simply be deleted. The best way, I think, is to follow the Filesystem Hierarchy Standard (FHS), which indicates that an “add-on application software package” should go in /opt/ollama.
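Concretely, if everything goes to plan, the layout after installation should look roughly like this (my assumption about the tarball’s contents, plus the files the steps below create), with almost everything under /opt/ollama:
/opt/ollama/bin/ollama               # the ollama binary from the tarball
/opt/ollama/lib/                     # bundled libraries from the tarball
/opt/ollama/data/                    # home directory of the ollama service user; pulled models land under here (in .ollama/)
/usr/local/bin/ollama                # a symlink into /opt/ollama/bin, for convenience
/etc/systemd/system/ollama.service   # the service definition
Uninstalling then amounts to removing the service file, the symlink, and the /opt/ollama tree.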
Why be so fussy? Stability. (That’s why I run Debian: it’s so boringly stable.) The point is to maximize the odds of not destabilizing Debian’s own package management system.
That’s the motivation. Here are the steps.
Installing a local Ollama server on Debian
- First, create the installation directory:
sudo mkdir -p /opt/ollama
This intentionally deviates from the default install script, in order to follow Debian’s convention of using the FHS.
- Create system user/group:
sudo useradd -r -s /bin/false -U -m -d /opt/ollama/data ollama
sudo usermod -a -G ollama $USER
This follows the install script’s security model but changes the home directory to /opt/ollama/data for better organization.
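If you want to sanity-check the new account before moving on (optional), standard tools will show it; note that your own shell only picks up the new group membership after you log out and back in:
getent passwd ollama   # should show /opt/ollama/data as the account's home directory
id ollama              # should show the dedicated ollama user and its ollama group
groups                 # your current shell lists the ollama group only after you log out and back in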
- Download and extract:
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o /tmp/ollama-linux-amd64.tgz
sudo tar -C /opt/ollama -xzf /tmp/ollama-linux-amd64.tgz
(You will need to update the download link and file appropriately if your architecture is not amd64.)
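If you want to make the architecture choice explicit, something like the following sketch should work; I have only tested the amd64 path, and I am assuming ollama also publishes an ollama-linux-arm64.tgz for 64-bit ARM machines:
# pick the tarball based on the machine architecture (assumes only amd64 and arm64 builds exist)
ARCH=$(uname -m)
case "$ARCH" in
  x86_64)  OLLAMA_TGZ=ollama-linux-amd64.tgz ;;
  aarch64) OLLAMA_TGZ=ollama-linux-arm64.tgz ;;
  *)       echo "unsupported architecture: $ARCH" >&2; exit 1 ;;
esac
curl -L "https://ollama.com/download/${OLLAMA_TGZ}" -o "/tmp/${OLLAMA_TGZ}"
sudo tar -C /opt/ollama -xzf "/tmp/${OLLAMA_TGZ}"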
- Create symlink:
sudo ln -sf /opt/ollama/bin/ollama /usr/local/bin/ollama
We symlink from /usr/local/bin, which I assume is already in your PATH, to the binary installed under /opt/ollama.
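You can check that the symlink resolves to the FHS location and that the binary runs (the server is not running yet, so the client may also print a warning about not being able to reach it):
readlink -f /usr/local/bin/ollama   # should print /opt/ollama/bin/ollama
ollama --version                    # prints the client version; it may warn that no server is running yet, which is fine at this stage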
- Create systemd service file at
/etc/systemd/system/ollama.service:
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/opt/ollama/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
[Install]
WantedBy=default.target
This systemd service configuration file deviates from the default install script in two respects. First, it updates the path to the binary to match our FHS-based location. Second, it sets the environment variable OLLAMA_HOST to the “unspecified” address 0.0.0.0, which configures the server to accept connections from other computers. (Technically, it tells the server to listen not just on the loopback interface, which never receives traffic from external hosts, but on all interfaces, including the ethernet, wifi, etc. interfaces which other machines may send to.)
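As an aside, if you would rather not listen on every interface, my understanding is that you can bind to a single address instead by setting OLLAMA_HOST to that address; for example, assuming (hypothetically) that your server’s LAN address is 192.168.1.50:
# alternative: listen only on the LAN interface rather than on all interfaces
Environment="OLLAMA_HOST=192.168.1.50:11434"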
- Enable and start service:
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama
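Before switching to the laptop, it is worth confirming on the server that the service started and is listening on 0.0.0.0 rather than only on loopback:
systemctl status ollama                  # should report "active (running)"
journalctl -u ollama -n 20 --no-pager    # recent logs, useful if the service failed to start
sudo ss -tlnp | grep 11434               # should show a listener on 0.0.0.0:11434, not 127.0.0.1:11434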
- To verify reachability:
On your server, use ollama to pull a model and run it:
$ ollama pull deepseek-r1
$ ollama run deepseek-r1
On your mac, use curl to verify you can reach it:
$ LINUX_HOST=box.local.
$ curl -X POST http://${LINUX_HOST}:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello, world!"}]}'
You will need to update LINUX_HOST to the local IP address of your
Linux server. (My server’s address above is a multicast DNS address,
which works thanks to avahi-daemon.)
If everything is working you will see something like the following:
$ curl -X POST http://${LINUX_HOST}:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "deepseek-r1", "messages": [{"role": "user", "content": "Hello, world!"}]}'
{"id":"chatcmpl-390","object":"chat.completion","created":1741032140,"model":"deepseek-r1","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"\u003cthink\u003e
\u003c/think\u003e
Hello, world! How can I assist you today? 😊"},"finish_reason":"stop"}],"usage":{"prompt_tokens":7,"completion_tokens":18,"total_tokens":25}}
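As an additional lightweight check from the laptop, you can also hit ollama’s native API; /api/tags lists the models the server has pulled, so the response should mention deepseek-r1:
$ curl http://${LINUX_HOST}:11434/api/tags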
- Eventually, for uninstallation:
sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
sudo rm /usr/local/bin/ollama
sudo rm -rf /opt/ollama
sudo userdel -r ollama
Security
To emphasize what might be obvious, this setup configures your Debian machine to automatically run an unsecured server, accessible from any other host on your subnet. So you might not want this if your home subnet is an unsecured environment, or if, like some crazy man, your server is directly exposing all its ports on a public IP address, without your router serving as a de facto firewall by limiting which ports are exposed.
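If you want a basic guardrail anyway, one option (untested by me, and assuming your LAN uses the common 192.168.1.0/24 range and that ufw is your firewall of choice) is to accept connections to port 11434 only from your own subnet:
sudo apt install ufw
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh                                              # don't lock yourself out of the server
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp  # ollama reachable only from the LAN
sudo ufw enable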
Questions, Someday/maybe next steps
I’m new to ollama so I haven’t figured out these points yet, which are probably obvious:
- How do I start a model only to serve it via API, without starting a text chat session?
- How do I configure it so that models run indefinitely?
- How do I easily determine ahead of time which models I can run on my particular system?
- How hard would it be for me to build a setup like this, which did not require sudo access, and which presented a secured connection, so I could run these models with a userspace-only account on a cloud server?