You know, for a while there, it felt like the really cutting-edge AI stuff, like ChatGPT, was always just out of reach, tethered to the internet. You could run other AI models offline on your Mac, sure, but the big one? Not so much. Well, that's all changed, and it's pretty exciting.
There's this open version of ChatGPT, called gpt-oss, that's been released. The really neat part is that its 'weights' – think of them as the brain's learned knowledge – are publicly available. You can just download them. OpenAI actually put out two versions: a super-powerful one, gpt-oss-120b, which needs some serious graphics card muscle, and a lighter, more manageable one, gpt-oss-20b. This smaller model is the one that's perfect for most of us with Apple Silicon Macs (M1, M2, M3 chips). It runs smoothly even with 16GB of RAM, making it accessible to a lot more people.
So, how do you actually get this running on your own machine, offline? It's surprisingly straightforward, especially if you're comfortable with the Terminal app on your Mac. You'll need a few things:
- An Apple Silicon Mac (M1, M2, or M3). 16GB of RAM or more is recommended for gpt-oss-20b; it may load on an 8GB machine, but expect heavy swapping and slow responses.
- Enough free disk space for the model download — on the order of 13GB for gpt-oss-20b.
- An internet connection, but only for the initial setup.
Getting Started: The Essential Tools
First up, you'll likely need Homebrew. If you've ever installed software on your Mac using the Terminal, you might already have it. It's basically a package manager that makes installing other software a breeze. If you don't have it, opening Terminal and pasting a simple command will get it set up. You can check if it's ready by typing brew doctor.
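If you don't already have Homebrew, the whole setup is the one-liner published at brew.sh, followed by a quick health check:

```shell
# Install Homebrew using the official script from brew.sh
# (skip this if `brew --version` already works on your Mac)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Verify the installation is healthy
brew doctor
```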
Next, and this is the key player, you need Ollama. Think of Ollama as the engine that lets you run these AI models locally. Installing it is as simple as typing brew install ollama in your Terminal. Once it's installed, you'll want to start it up with ollama serve. It's best to keep this running in the background.
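In practice, getting Ollama up and running is just these two commands:

```shell
# Install Ollama via Homebrew
brew install ollama

# Start the Ollama server; leave this Terminal window open,
# since models are served from this background process
ollama serve
```

If you'd rather not keep a Terminal window open, the Homebrew formula can also be run as a background service (`brew services start ollama`), though you should check `brew services list` to confirm it's supported on your setup.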
Downloading and Chatting with gpt-oss-20b
With Ollama ready to go, it's time to bring the gpt-oss-20b model onto your Mac. In your Terminal, you'll type ollama pull gpt-oss:20b. This command downloads the model, and it might take a little while depending on your internet speed. Once that's done, you can start the model by typing ollama run gpt-oss:20b.
And just like that, you'll see a prompt appear right in your Terminal. You can start typing your questions or prompts, hit Enter, and get responses directly back. It's pretty wild to have this powerful AI working right there on your machine, no internet needed after the initial download. To exit the chat, you can press Control + D. To jump back in later, just use the ollama run gpt-oss:20b command again.
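The full download-and-chat workflow described above boils down to:

```shell
# One-time download of the model (needs internet; can take a while)
ollama pull gpt-oss:20b

# Start an interactive chat session right in the Terminal
ollama run gpt-oss:20b

# Inside the session: type a prompt and press Enter to get a response.
# Press Control+D (or type /bye) to exit; rerun `ollama run gpt-oss:20b`
# any time to pick the conversation back up.
```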
For a smoother, more visual experience, Ollama also offers a clean chat interface. Opening the Ollama app itself (rather than the command-line tool) gives you a chat window where you can select gpt-oss:20b from a dropdown and chat away. It feels a bit more like using a dedicated app.
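Beyond the chat window, the `ollama serve` process also exposes a local HTTP API on port 11434, which is handy if you want to script prompts instead of typing them. A minimal sketch (the prompt text here is just an example):

```shell
# Send a one-off prompt to the local model over Ollama's HTTP API
# (assumes `ollama serve` is running on the default port 11434)
curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "Explain what a package manager does in one sentence.",
  "stream": false
}'
```

The response comes back as JSON, which makes it easy to wire the model into shell scripts or other tools on your Mac.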
Keeping Things Tidy: Managing Your Models
As you start downloading different AI models, you might want to keep track of what you have installed. The command ollama list will show you everything. If you decide you don't need a particular model anymore, you can easily remove it with ollama rm gpt-oss:20b.
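Both housekeeping commands in one place:

```shell
# See every model currently installed, with its size and tag
ollama list

# Remove a model you no longer need to free up disk space
ollama rm gpt-oss:20b
```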
A Little Extra: The Web Interface Option
For those who prefer a full web-based interface, there's an optional extra step: running a web UI in Docker. This involves installing Docker, pulling the web UI's image, and running a container that talks to your local Ollama server. It's a bit more involved, but it offers a polished browser experience for chatting.
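As a sketch, one popular choice for this is Open WebUI. The image name, port mapping, and flags below are assumptions you should verify against that project's documentation before running:

```shell
# Run Open WebUI in Docker and let it reach Ollama on the host
# (image name and flags assumed; check the Open WebUI docs)
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

# Then open http://localhost:3000 in your browser and
# select gpt-oss:20b from the model picker
```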
Other Paths to Local AI
While Ollama is arguably the most straightforward way, it's not the only path. If you're a developer comfortable with Python, you can use libraries like Transformers to run models. This gives you a lot more control for custom workflows or research. For those needing an OpenAI-compatible API locally, vLLM is another option to explore.
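For the vLLM route, the rough shape is a pip install followed by its `serve` command. Note that vLLM primarily targets Linux with NVIDIA GPUs, and the Hugging Face model id below is an assumption to check against the model card:

```shell
# Install vLLM into a Python environment (Linux/NVIDIA is its main target)
pip install vllm

# Serve the model behind an OpenAI-compatible API on localhost
# (model id `openai/gpt-oss-20b` is an assumption; verify on Hugging Face)
vllm serve openai/gpt-oss-20b
```

Once it's running, anything that speaks the OpenAI API — SDKs, existing scripts, third-party tools — can be pointed at the local endpoint instead of the cloud.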
It's genuinely amazing how accessible these powerful AI tools are becoming, allowing us to experiment and integrate them into our own workflows without constant reliance on cloud services. The ability to run ChatGPT-like models locally opens up a whole new world of possibilities for privacy, offline use, and custom development.
