Back to blogAI Agents

An AI Agent Will Do What It's Told, Even by a Stranger

The same AI agents that can now run your software can also be talked into the wrong thing by a hidden instruction in an email or on a web page. Here is the safety reality, and where the advantage is.

Ananya Rao

AI Strategy & Ways of Working

27 June 20265 min read

An AI Agent Will Do What It's Told, Even by a Stranger

AI agents have just crossed an important line. The newest ones can see a screen and use software the way a person does, clicking, typing and moving between the apps your business already runs on. Google introduced this ability in Gemini in late June, and it genuinely changes what an agent can take off your team. The capability is real, and it is worth adopting. This post is about the part that gets far less airtime: the same agent that can act for you can also be talked into acting against you.

That risk stopped being theoretical this week. One developer opened their AI assistant to the public as a challenge and watched more than two thousand people try to talk it into doing things it should not, an experiment that quietly became one of the most-read stories among builders. Around the same time, the security world noted that as soon as agents could control a computer, attackers started probing them. The lesson underneath both is the same, and every owner about to hand an agent access to an inbox, a booking system or a payment screen should sit with it for a minute.

The weakness has a name: prompt injection. An AI agent follows instructions written in plain language, and it cannot reliably tell your instructions apart from anyone else's. If it reads a web page, an email, a customer review or a document as part of doing its job, a hidden line of text in that content can redirect it. The unsettling part is that nothing gets hacked in the usual sense. No password is stolen, no system is broken into. The agent simply does what it was told, by the wrong person, because it was built to be helpful and it took the instruction at face value.

Why this is not the security you already know

For decades, security has been about keeping the outside world out: strong passwords, locked-down servers, firewalls. An agent turns that model inside out. To be useful, it has to invite the outside world in, reading the emails customers send, the web pages it researches, the files it is handed. The attack does not break down the door, it arrives inside the very content the agent is meant to read. That is a different problem from the one we wrote about recently, where attackers used AI to find and exploit holes in neglected websites. This one rides in through the front door you opened on purpose.

Voices who guide businesses on adoption have been making a version of this point for a while. Andrew Ng, who spends his energy pushing practical, sensible AI use rather than hype, keeps returning to the same idea: the hard part is rarely the model itself, it is the judgement and the guardrails you build around it. With agents that can now act on real systems, that judgement has stopped being optional.

What it means for a small business

Most owners are being sold the dream of an agent that does everything, with access to everything, straight out of the box. The agents that actually earn their keep look almost the opposite: narrow, supervised, and trusted with only what they need. An agent wired into your email, your calendar and your payments with no limits and no oversight is not a productivity gain, it is a liability sitting quietly until the day it reads the wrong message. The capability is the easy part. The difference between a quiet advantage and an expensive mistake is entirely in how the agent is set up and watched.

What good looks like

You do not need to avoid agents to stay safe, and you should not, because the upside is too large. You need to deploy them the way a careful business hands out keys, with limits and oversight built in from the start. Here is what good looks like once an agent is set up properly:

The agent has only the access it needs for the job at hand, so a bad instruction has a small blast radius instead of a free run of your business.
Anything irreversible, sending money, deleting records, emailing a customer, waits for a person to approve it rather than happening on the agent's say-so.
Whatever the agent reads from the outside world is treated as untrusted by default, the same way a sensible person treats an unexpected email asking for a favour.
Every action the agent takes is logged in plain sight, so a mistake is spotted in minutes rather than discovered a month later.
The setup is watched and adjusted as the agent meets the messy real world, not configured once and quietly forgotten.

An AI agent does not really get hacked. It gets talked into it, and it will believe a stranger as readily as it believes you.NextAura

None of this is a reason to sit out the shift. Agents that can operate your software are one of the biggest quiet advantages a small team can pick up right now, and we have written about that upside separately. The reason to take the risk seriously is precisely because the prize is worth having. Get the guardrails right and you keep the advantage without inheriting the danger.

This is exactly the work we do at NextAura. We build AI agents for Australian small businesses with the limits, oversight and clear records that keep them useful rather than dangerous, and we stay accountable for the setup long after the demo. If you want the time saved without the exposure, get in touch and we will deploy it properly, then keep an eye on it while you get back to running the business.

AI AgentsAI SecurityPrompt InjectionSmall Business

Ready when you are

Got a project in mind?

Tell us where you are headed. We will come back with a scope, a price, and a launch date you can plan around.

Book a free consultation

Keep reading

All articles

AI Agents

AI Can Now Operate the Software Your Business Runs On

A new tool lets an AI agent see a screen and use software the way a person does: clicking, typing and moving across the apps your business already runs on. The prize is the quiet screen work it can take off your team.

26 June 20266 min read

AI Agents

Now You Can Tag an AI Teammate Into Your Work Chat

Anthropic just put an AI you can tag into a team chat, where it picks up a job and works through it on its own. The capability is real and arriving fast. The advantage is in deciding what to hand it, and how.

24 June 20266 min read

AI Agents

The Web Is Being Rebuilt for AI Agents. Can Yours Be Found?

The big technology companies are quietly agreeing on how AI agents will find and use businesses across the web. For an Australian small business, getting found is about to mean more than ranking on a page a person reads.

23 June 20265 min read