For most of the last two years, using good AI meant renting it. You sent your data to a large cloud service such as OpenAI, Google or Anthropic, and paid by the query while your information travelled to a server you did not own. That is starting to change, and the shift matters for any small business that has held back on AI because of cost or because of where its data ends up. On 3 June, Google released Gemma 4 12B, an open model it describes as small enough to run locally on a consumer laptop with 16GB of memory.
The headline is not that another model exists. It is that a genuinely capable one now runs on the kind of hardware a small business already has on its desks. Gemma 4 handles text, images and even audio, is released under an open licence that lets a business deploy it however it likes, and is built to drive the sort of multi-step, agentic work that until recently needed a data centre. The model does the work on the machine in front of you, not on a server in another country.
And it is not just the vendors saying so. This week an essay titled "Running local models is good now" shot to the top of Hacker News, the forum where software builders gather, with a clear message from people who do this for a living: the open models you can run yourself have quietly crossed the line from interesting toy to useful tool. When the builders shift their view at the same time as the hardware requirements fall, that is usually the signal that a capability has arrived.
What actually changed
Two things moved at once. The models got smaller without getting dumber, and the everyday machines got fast enough to run them. Google says Gemma 4 12B reaches close to the quality of a model more than twice its size, while fitting in the memory of a normal laptop. It is the first model of its size the company has given native audio input, so it can listen as well as read and look. Practically, that means a single model on local hardware can read a document, look at a photo and take a spoken instruction, the building blocks of real work, without anything being uploaded.
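A rough back-of-envelope calculation shows why a 12-billion-parameter model can sit inside a 16GB laptop. The sketch below assumes the weights are stored at reduced precision. The bit widths are illustrative, not figures from Google, and the estimate counts only the weights, ignoring the extra working memory a running model needs.

```python
# Rough estimate of the memory needed just to hold a model's weights.
# Assumptions (not from Google): the bit widths below are illustrative,
# and the working memory a running model also needs is ignored.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight  # billions of bytes = GB

LAPTOP_MEMORY_GB = 16  # the laptop size named in the article

for bits in (16, 8, 4):
    size = weight_memory_gb(12, bits)
    verdict = "fits" if size < LAPTOP_MEMORY_GB else "does not fit"
    print(f"{bits}-bit weights: about {size:.0f} GB, {verdict} in {LAPTOP_MEMORY_GB} GB")
```

On these assumptions, full 16-bit weights would need about 24 GB, while 8-bit and 4-bit versions come in at roughly 12 GB and 6 GB. In practice the operating system and other software also need memory, so the real headroom is tighter than the raw numbers suggest.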
Open licensing is the other half of the story. Because Gemma 4 is released openly, a business is not locked into one provider's pricing, terms or uptime. The model can be run privately, shaped to a specific job, and kept running even when the internet is not. That is a different relationship with the technology than paying a monthly bill for access to something you can never see inside.
Why this matters for an Australian small business
The two biggest reasons owners give for not using AI more are privacy and cost, and local models answer both. On privacy, the cleanest way to keep customer records, quotes, contracts and health or financial details out of a third party's hands is to never send them there. When the model runs on your own machine, sensitive information stays inside the business, which makes much of the worry, and a fair part of the compliance work under the Privacy Act, far simpler to manage. On cost, a model you run yourself does not bill you per query, so the expense does not climb every time the team leans on it harder.
Andrew Ng, who is about as practical a voice on AI for business as you will find, has long made the case that open models give companies options that a closed service cannot: control over your own data, freedom from a single vendor, and costs that do not scale with every use. Ng frames it as a question of who holds the leverage. Renting intelligence is fine until the price, the rules or the model behind the curtain change without your say.
- Sensitive customer and financial data can be worked on by AI without ever leaving the business, which makes privacy obligations easier to meet.
- The cost is the hardware you already own rather than a bill that grows every month as usage climbs.
- The tools keep working when the connection drops: in the van, on site, or in a back office with patchy internet.
- You are not tied to one provider's pricing or terms, so a change at their end does not upend how you operate.
- The same private setup can read documents, handle images and take spoken instructions, the groundwork for automating real tasks in-house.
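The cost point can be made concrete with a simple break-even sum. The sketch below is illustrative only: the per-query price and the monthly running cost are hypothetical placeholders, not real prices from any provider, and the function names are invented for this example.

```python
# Compare a pay-per-query cloud bill with a fixed local running cost.
# Every figure here is a hypothetical placeholder, not a real price.

def monthly_cloud_cost(queries_per_month: int, price_per_query: float) -> float:
    """Cloud cost grows in proportion to usage."""
    return queries_per_month * price_per_query

def break_even_queries(local_monthly_cost: float, price_per_query: float) -> float:
    """Monthly query volume above which running locally is cheaper."""
    return local_monthly_cost / price_per_query

PRICE_PER_QUERY = 0.02     # hypothetical dollars per query
LOCAL_MONTHLY_COST = 40.0  # hypothetical: power and upkeep for a machine you already own

for queries in (500, 2_000, 10_000):
    cloud = monthly_cloud_cost(queries, PRICE_PER_QUERY)
    print(f"{queries:>6} queries: cloud ${cloud:.2f} vs local ${LOCAL_MONTHLY_COST:.2f}")

print(f"Break-even: {break_even_queries(LOCAL_MONTHLY_COST, PRICE_PER_QUERY):.0f} queries a month")
```

The shape matters more than the numbers: the cloud line rises with every query while the local line stays flat, so the heavier the team's use, the stronger the case for running the model in-house. Swap in your own figures to see where the lines cross for your business.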
Capable is not the same as configured
Here is the honest part. The fact that a strong model can run on a laptop does not mean it is plug and play, or that dropping one onto a staff machine will quietly transform the business. The real work lies in choosing the right model for the job, getting it running reliably on the hardware you have, connecting it to your actual work so it is useful rather than a novelty, and keeping it secure and current. That is where local AI either earns its keep or becomes an expensive distraction. The capability is now within reach for a small business. Turning it into something that genuinely saves time and protects data is a craft, and it is easy to get wrong in ways that quietly cost you.
The most capable AI a small business can use no longer has to live on someone else's server. The question is no longer whether you can run it in-house, but whether it is set up to actually help.
So the move is not to rush out and install something. It is to look honestly at the work where keeping data private or cutting a growing AI bill would matter most, and decide where running AI on your own terms is worth doing. That is the same instinct behind putting AI to work on the repetitive parts of customer service: the prize is real, and the value is in doing it properly.
This is exactly the kind of work we do at NextAura. We help Australian small businesses work out where AI belongs, then build and run the automations that take real work off the team, whether that lives in the cloud or privately on your own hardware. If keeping your data in-house and your costs predictable sounds like the version of AI you have been waiting for, get in touch, and we will work out what fits and stand it up so it holds.