In 2022, Jake Moffatt asked Air Canada's website chatbot about bereavement fares after a family member's death. The chatbot told him he could book at full price and apply for the reduced fare retroactively within 90 days. That was wrong; Air Canada's actual policy required the discount to be requested before travel.
When Moffatt tried to claim the refund, Air Canada denied it. In a remarkable legal argument, the airline told a Canadian tribunal that its chatbot was "a separate legal entity that is responsible for its own actions," and that the company could not be held liable for what it said. In February 2024, the tribunal rejected this argument and ordered Air Canada to honor the discount. Legal scholars have since cited the case as an early marker of how AI liability is evolving in commercial settings.
The airline lost a case that never should have happened. A single human review step before the chatbot committed to a refund policy would have caught the error. Instead, an autonomous system made a binding promise, and no one was watching.
This is the predictable outcome of removing humans from decision-making loops.
The Automation Spectrum Nobody Talks About
The conversation around AI automation usually presents two options: manual work or full automation. You either do everything yourself, or you "set it and forget it." Most vendors position their tools on the fully autonomous end because it sounds impressive.
But there is a spectrum between those extremes, and the right answer for most businesses is somewhere in the middle.
Level 1: Fully Manual. You do everything. Every email, every decision, every action. This is where most solopreneurs start, and where many stay because they do not trust the alternatives.
Level 2: AI-Assisted. AI helps you do the work faster. It drafts responses, summarizes documents, suggests categorizations. But you execute every action yourself.
Level 3: Supervised Autonomy. AI executes actions independently for low-risk decisions. High-risk decisions pause for your review. You only review exceptions, not every action.
Level 4: Fully Autonomous. AI handles everything without human intervention. No approval queues, no review steps, no oversight.
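One way to make the spectrum concrete is as a routing policy. The sketch below is illustrative; the level names, `Action` fields, and return values are assumptions for the example, not the vocabulary of any specific tool:

```python
from dataclasses import dataclass
from enum import Enum

class AutonomyLevel(Enum):
    FULLY_MANUAL = 1      # human decides and executes everything
    AI_ASSISTED = 2       # AI drafts, human executes
    SUPERVISED = 3        # AI executes low-risk, human reviews high-risk
    FULLY_AUTONOMOUS = 4  # AI executes everything, no oversight

@dataclass
class Action:
    description: str
    high_risk: bool

def route(action: Action, level: AutonomyLevel) -> str:
    """Decide who handles a proposed action under a given autonomy level."""
    if level is AutonomyLevel.FULLY_MANUAL:
        return "human_executes"
    if level is AutonomyLevel.AI_ASSISTED:
        return "ai_drafts_human_executes"
    if level is AutonomyLevel.SUPERVISED:
        # Only the risky actions pause; the rest run on their own.
        return "human_reviews" if action.high_risk else "ai_executes"
    return "ai_executes"  # Level 4: nothing pauses

refund = Action("issue $500 refund", high_risk=True)
print(route(refund, AutonomyLevel.SUPERVISED))  # high-risk, so it pauses
```

Note that the only branch with a risk check is Level 3: that conditional is the entire difference between supervised autonomy and the two extremes.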
Most AI automation tools push you toward Level 4. That is a mistake for anyone who values their reputation, client relationships, or sleep quality. If you want to see the real-world consequences, our analysis of why AI agents go rogue documents incidents where autonomous agents deleted data, ran up $47,000 bills, and took down production systems. Even mature platforms like Zapier and Make (covered in our tool comparison) execute workflows without any approval layer.
Real Costs of Going Fully Autonomous
The obvious cost of fully autonomous AI is the public one: things go wrong where customers can see them. But the hidden costs are subtler and often more damaging.
Cost 1: Trust Erosion
When an AI sends a wrong email, processes an incorrect refund, or misclassifies a support ticket, it does not just cost you the immediate correction. It erodes the trust your customers have in your business.
For solopreneurs and small teams, trust is everything. You do not have a brand marketing budget to rebuild reputation. Every interaction matters disproportionately. This is exactly why human review is the missing piece in AI automation -- the approval step catches errors before they reach your customers.
Cost 2: Silent Failures
The scariest automation failures are the ones you never notice. A lead that gets categorized as "cold" when it was actually your biggest potential client. A support ticket that gets an automated response when it needed a personal call. An invoice follow-up that sounds aggressive to a client going through a tough month.
These failures compound silently. By the time you notice the pattern, the damage is done.
Cost 3: Compliance and Liability
Depending on your industry, automated actions can create real legal exposure. The question of whether your AI agent should be sending that email without explicit approval is one worth asking before granting broad permissions. Automated financial communications, healthcare-adjacent decisions, or legal document handling without human review can violate regulations you did not even know applied to you. The EU AI Act now classifies certain automated decision-making systems by risk level, and the obligations increase significantly when there is no human oversight mechanism. In the US, the FTC's 2024 enforcement sweep against deceptive AI claims and schemes is a reminder that regulators still treat exaggerated or misleading AI marketing (and the harms that follow) as the company's problem, not the model's.
Cost 4: The Anxiety Tax
Here is the paradox: fully autonomous AI is supposed to reduce your workload, but it often increases your anxiety. When you know an AI is making decisions without your oversight, you spend mental energy worrying about what it might do wrong. You second-guess whether the automation is working correctly, and you end up checking in far more often than you planned to.
You traded manual work for anxiety. That is not progress. We explored this dynamic in more detail in what "set it and forget it" automation gets wrong.
The Trust Gradient: A Better Approach
Instead of choosing between full control and full automation, adopt a trust gradient. This means different actions get different levels of oversight based on their risk and the AI's demonstrated accuracy.
How it works in practice:
- Start with everything requiring approval. Every action the AI proposes goes through your review queue. This is temporary, but it is how you calibrate
- Observe patterns. After a week, you will notice that certain decisions are always correct. Email categorization might be 95% accurate. Draft responses for scheduling requests might be 98% accurate
- Selectively reduce oversight. Let the consistently accurate actions run autonomously. Keep high-risk actions in your approval queue
- Monitor with confidence scoring. Each action gets a confidence score based on how certain the AI is about its decision. Low-confidence actions pause for review. High-confidence actions proceed automatically
- Maintain safety nets. Even for autonomous actions, sampling a percentage for random review catches drift before it becomes a pattern
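The gradient above can be sketched as a simple review router. The threshold and sampling rate below are illustrative knobs, not settings from any particular product:

```python
import random

APPROVE_THRESHOLD = 0.95   # below this, the action pauses for human review
SAMPLE_RATE = 0.05         # randomly audit 5% of autonomous actions

def needs_review(confidence: float, rng: random.Random) -> bool:
    """Pause low-confidence actions; spot-check a random slice of the rest."""
    if confidence < APPROVE_THRESHOLD:
        return True
    # Safety net: even high-confidence actions are occasionally sampled,
    # which catches drift before it becomes a pattern.
    return rng.random() < SAMPLE_RATE

rng = random.Random(42)
for conf in (0.97, 0.72, 0.99):
    print(conf, "review" if needs_review(conf, rng) else "auto")
```

In practice you would tighten or loosen `APPROVE_THRESHOLD` per action type as the accuracy data comes in, which is exactly the calibration step the list above describes.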
This approach gives you the time savings of automation without the anxiety of losing control.
The goal is not zero human involvement. It is zero unnecessary human involvement.
What Confidence Scoring Actually Means
When we talk about confidence scoring at Rills, we mean something specific: every time a workflow runs, each step produces a confidence score for that particular execution. It is a dynamic assessment of that run's specific input data, not a static rating of the workflow itself.
For example, an email categorization step might score:
- 97% for "I want to purchase your enterprise plan" (clear buying signal)
- 72% for "Can you tell me more about how this works?" (ambiguous intent)
- 45% for "My nephew said you might be able to help with something" (vague, contextual)
The first one auto-executes. The second and third pause for your review. Same workflow, different outcomes based on the actual data.
Over time, the system learns from your approvals and rejections. When you correct a misclassification, Rills surfaces a suggested optimization to the underlying prompts and models, and similar future inputs get more accurate scores. The result: you approve fewer and fewer actions over time because the agent keeps getting better at the task.
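A minimal sketch of the feedback side, assuming a simple rolling-accuracy heuristic (this is not Rills' actual learning mechanism, which also tunes prompts and models; the window size and bounds are made up for illustration):

```python
from collections import deque

class FeedbackTracker:
    """Track recent approve/reject outcomes for one action type and
    loosen the review threshold as demonstrated accuracy improves."""

    def __init__(self, window: int = 50, base_threshold: float = 0.95):
        self.outcomes = deque(maxlen=window)  # True = approved as-is
        self.base_threshold = base_threshold

    def record(self, approved: bool) -> None:
        self.outcomes.append(approved)

    def threshold(self) -> float:
        if len(self.outcomes) < 10:           # not enough evidence yet
            return self.base_threshold
        accuracy = sum(self.outcomes) / len(self.outcomes)
        # High demonstrated accuracy earns a lower review bar,
        # but never below a 0.70 floor.
        return max(0.70, self.base_threshold - (accuracy - 0.90))

tracker = FeedbackTracker()
for _ in range(20):
    tracker.record(True)      # 20 straight approvals
print(round(tracker.threshold(), 2))  # → 0.85
```

The shape is what matters: the review bar only moves after the system has earned it with a run of correct decisions, and a streak of rejections pushes it back up.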
When Full Automation Actually Makes Sense
A note on cost first: with action credit pricing, adding approval steps to your workflows costs nothing -- so being cautious never increases your bill.
To be fair, there are scenarios where full automation is appropriate:
- Logging and record-keeping. Writing to a database or log file has no customer-facing impact
- Internal notifications. Sending yourself a Slack message about a new lead does not need approval
- Data formatting. Converting a CSV to a specific format is deterministic and verifiable
- Scheduled maintenance. Archiving old records or running backups
The common thread: these actions are low-risk, reversible, and do not involve external communication. For everything else, some level of oversight is worth the minimal time investment.
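The common-thread criteria can be expressed as a simple predicate. The action kinds and fields here are hypothetical examples, not an exhaustive taxonomy:

```python
from dataclasses import dataclass

@dataclass
class ProposedAction:
    kind: str
    reversible: bool
    external_facing: bool   # reaches a customer or third party

# Illustrative allowlist matching the scenarios above.
LOW_RISK_KINDS = {"log_write", "internal_notification", "data_format", "archive"}

def safe_to_automate(action: ProposedAction) -> bool:
    """Low-risk, reversible, and not customer-facing: the common thread."""
    return (action.kind in LOW_RISK_KINDS
            and action.reversible
            and not action.external_facing)

print(safe_to_automate(ProposedAction("log_write", True, False)))    # True
print(safe_to_automate(ProposedAction("refund_email", True, True)))  # False
```

Anything that fails this check defaults to the approval queue, which keeps the failure mode conservative: the worst case is an unnecessary review, not an unreviewed mistake.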
Moving from Anxiety to Confidence
If you are currently doing everything manually because you do not trust automation, supervised autonomy is your on-ramp. You do not have to leap to full automation. You can walk there, one review at a time.
Our step-by-step guide to building your first workflow takes less than ten minutes and walks through exactly this kind of cautious starting point.
Start with a single workflow. Review every action for a week. Watch the AI learn. Lower your confidence thresholds as your trust grows. Within a month, you will have an automation that saves you hours per week while keeping your quality standards intact.
The hidden cost of fully autonomous AI is not just the mistakes it makes. It is the trust it never earns. Supervised autonomy earns that trust through demonstrated performance, one decision at a time.
Want to see how it works? Explore Rills and set up your first supervised workflow. Human approvals are always free. You only pay for the actions that create real value.
Ready to automate your workflows?
Eliminate monitoring anxiety with AI agents that propose actions while you stay in control. Start your free trial today.
Start Free Trial
No credit card required to sign up