When the Intern Gets the Keys to the Building
The one where we talk about what happens when the helpful assistant stops assisting and starts deciding.
[The Series: Part 1, Part 2, Part 3, Bonus Sana Post]
tl;dr:
Moving from "AI that suggests" to "AI that acts" isn't an upgrade. It's a fundamentally different risk profile, and most HR tech teams have zero infrastructure for it.
Back in December, I published a Safety Checklist. Five questions to ask your vendors before buying. This is the harder list: three questions you have to answer yourself before deploying, plus the foundational data problem underneath all of it.
Your current change management process was designed for humans. Agentic AI is going to look at it, politely nod, and then set it on fire.
The Promotion Nobody Voted On
In Part 2, I introduced you to my digital intern. Permanently caffeinated. Never asks for PTO. Occasionally confidently wrong about whether the Egyptians invented pizza.
You liked the intern. I liked the intern. The intern was great.
Here's the problem: someone on your leadership team just watched a vendor demo where the intern wasn't fetching coffee anymore. The intern was approving purchase orders. The intern was adjusting compensation ranges. The intern was making decisions about people's careers while a sales engineer narrated the whole thing like a nature documentary.
And now that leader is in your office asking: "Why aren't we doing this?"
The intern didn't get smarter. The intern got promoted. And nobody in your organization voted on it, built a job description for it, or thought about what happens when the newly promoted intern does something catastrophically wrong at 2am on a Saturday with no human in the loop.
That's what this post is about.
The Jump
The distance between "AI that recommends" and "AI that acts" is not a step. It's a canyon.
On one side, you have a tool that says, "Hey, based on the data, you might want to consider adjusting this comp range." A human reads it, thinks about it, and decides. That's a suggestion. That's a really smart sticky note.
On the other side, you have an agent that adjusts the comp range. Autonomously. Based on logic that someone approved at some point, theoretically, probably during an implementation meeting where half the room was checking email and the other half was trying to figure out the conference room AV system.
Most HR tech teams have spent decades building change management processes, approval chains, and governance structures for humans making decisions. Humans who hesitate. Humans who call their colleague and say, "Does this look right to you?" Humans who have a gut feeling that something is off, even if they can't articulate why.
Agents don't hesitate. Agents don't have gut feelings. Agents have logic, and they execute that logic with the enthusiasm of a golden retriever who just discovered the gate is open.
Nobody has built a governance model for that. And the vendors selling you agentic capabilities are not going to build it for you. That's not a knock on them; it's not their job. It's yours.
So let's build it.
The Checklist Was the Entrance Exam. This Is the Course.
Back in December, I published a five-point AI Safety Checklist. These were the questions you ask your vendor before you buy anything. Data retention. Bias audits. Explainability. Security inheritance. Kill switches.
That was the entrance exam. Those questions protect you from buying the wrong thing.
This is different. These are the questions you have to answer about yourself before you deploy the right thing. Because you can pass every vendor checkpoint with flying colors and still crash the car if you haven't built the road.
Three questions. They sound simple. They are not.
1. "Who Gave It Permission? Does That Permission Still Mean What They Think It Means?" (Authority)
Every agent acts on authority. Someone, somewhere, at some point, said "yes, you can do this."
The question is: who? And when? And did they understand what they were saying yes to?
Because here's what happens in practice. During implementation, someone approves a capability called "Talent Optimization." It sounds great. Everybody nods. It goes into the statement of work as an enabled feature. Eighteen months later, the vendor has expanded what "Talent Optimization" means. Now the agent is auto-adjusting job posting language based on candidate conversion data. It's rewriting your job descriptions. In real time. Based on what gets clicks.
And nobody remembers approving that. Because technically, they didn't. They approved a category. The vendor filled in the details later. And the details now include an AI rewriting your EVP messaging to optimize for engagement metrics that may or may not align with what your talent acquisition strategy actually says.
This happens all the time with regular software. Feature creep is not new. But feature creep with a deterministic system means someone eventually notices and opens a ticket. Feature creep with an autonomous agent means the agent has been making decisions for weeks before anyone realizes the scope drifted.
The fix: A living authority register. Not a one-time implementation sign-off. A document (a real one, not a slide deck someone presented once at a steering committee meeting and then buried in a SharePoint folder called "Archive - DO NOT DELETE - Final v3") that says exactly what each agent is authorized to do, who approved it, when it was last reviewed, and what the boundaries are. Reviewed quarterly at minimum.
I know. Quarterly reviews. I can feel you closing the browser tab. Stay with me.
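If "a document" sounds too squishy, here's roughly what one entry in that register could look like if you kept it as structured data instead of a slide. This is a sketch, not a standard: the field names, the agent, the approver, and the boundaries are all made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class AuthorityRegisterEntry:
    """One row in the living authority register. Field names are illustrative."""
    agent_name: str                # e.g., "Talent Optimization Agent"
    capability: str                # the specific action, not the marketing category
    approved_by: str               # a named human, not "the steering committee"
    approved_on: date
    last_reviewed: date
    review_cadence_days: int = 90  # quarterly at minimum
    boundaries: dict = field(default_factory=dict)  # hard limits the agent may not cross

    def review_is_overdue(self, today=None) -> bool:
        today = today or date.today()
        return today - self.last_reviewed > timedelta(days=self.review_cadence_days)

# Example entry: the capability is named specifically, not as a category.
entry = AuthorityRegisterEntry(
    agent_name="Talent Optimization Agent",
    capability="Rewrite job posting language based on candidate conversion data",
    approved_by="J. Smith, VP Talent Acquisition",
    approved_on=date(2025, 1, 15),
    last_reviewed=date(2025, 1, 15),
    boundaries={"may_change_evp_messaging": False, "requires_human_publish": True},
)

if entry.review_is_overdue():
    print(f"Review overdue for: {entry.capability}")
```

The exact shape matters less than the habit: every capability gets a name, a named approver, a review date, and limits you can point to when the scope starts to drift.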
2. "Who's Watching What It Did? Can They Explain Why It Did It?" (Oversight + Explainability)
In the Safety Checklist, I told you to demand explainability from your vendors. A "Glass Box," not a black one. Every output should come with a reasoning chain.
That was the vendor's job. This is yours: who on your team is actually reading those reasoning chains?
Agents act fast. Humans review slow. If your agent runs daily and your review cadence is monthly, you have 30 days of unaudited decisions compounding on top of each other. That's not a gap. That's a canyon with a gift shop at the bottom selling "I Survived Our AI Governance Program" t-shirts.
The scenario: An AI agent is recommending compensation adjustments across open requisitions based on market data, pipeline conversion rates, and internal equity benchmarks. It runs every day. It touches 200 reqs over three weeks.
Recruiters see the recommendations. They look reasonable. They have a little confidence score next to them, which (as we discussed in Part 3) makes humans go absolutely limp with compliance. "The system says 87%? Must be right." They adjust the comp ranges. Hiring managers approve. Offers go out.
Three weeks later, someone in Total Rewards finally reviews the output. They discover the market data feed was pulling from a dataset that included contract roles, which inflated the benchmarks by 12%. Every comp adjustment for the last three weeks was built on a bad foundation.
You are no longer auditing a recommendation. You are unwinding a decision chain. Offers have been extended. Some have been accepted. And here's the part that should make your legal team's ears perk up: if those inflated benchmarks skewed differently across demographics (and they almost certainly did, because market data for contract roles doesn't distribute evenly across job families), you may have just created an adverse impact pattern. Not because anyone intended to discriminate. Because nobody was watching closely enough to catch the drift.
That's not an AI bias problem in the way the Checklist discussed it. You already asked the vendor for the bias audit. They passed. This is a monitoring problem. The model was fine. The data feeding it shifted. And nobody caught it because nobody was assigned to catch it at the speed the agent was moving.
The fix: The review cadence has to match the action cadence. If the agent acts daily, someone (a human, with judgment, and the authority to say "stop") reviews daily. And that person needs to be able to read the reasoning chain — not just see the output, but understand why the agent made the call it made. If your reviewer can't explain the agent's logic to a skeptical VP in plain English, the review is theater.
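If you want "the review cadence matches the action cadence" to be enforceable rather than aspirational, the check itself is not complicated. A minimal sketch, assuming a hypothetical action log where each agentic decision carries a timestamp and an optional human sign-off:

```python
from datetime import datetime, timedelta

# Hypothetical action log entries; the structure is illustrative.
actions = [
    {"id": "req-1042", "acted_at": datetime(2025, 3, 3, 6, 0), "reviewed_at": None},
    {"id": "req-1043", "acted_at": datetime(2025, 3, 3, 6, 0), "reviewed_at": datetime(2025, 3, 3, 14, 0)},
]

REVIEW_WINDOW = timedelta(hours=24)  # if the agent acts daily, a human reviews daily

def unreviewed_backlog(actions, now=None):
    """Return decisions that have aged past the review window without human sign-off."""
    now = now or datetime.now()
    return [
        a for a in actions
        if a["reviewed_at"] is None and now - a["acted_at"] > REVIEW_WINDOW
    ]

backlog = unreviewed_backlog(actions)
if backlog:
    print(f"{len(backlog)} agent decisions are past the review window. Pause and catch up.")
```

The point isn't the code. The point is that "are we keeping up with the agent?" becomes a number someone looks at every day, not a feeling someone has every quarter.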
3. "What's the Damage Protocol?" (Rollback)
The Safety Checklist gave you the kill switch — the vendor-side toggle that lets you shut it off. Good. You need that.
But turning it off is step one. What about the damage that's already done?
Your agent is going to make a mistake. It is a statistical certainty. And I need you to internalize something: the mistake will not announce itself. There will be no pop-up that says "ERROR: I JUST DID SOMETHING DUMB." The agent will execute the wrong action with the exact same confidence it executes the right one. That's the whole thing about machines. They don't sweat. They don't pause. They don't get a weird feeling in their stomach when something doesn't add up.
You know who does get a weird feeling in their stomach? Brenda in Payroll. And Brenda is going to catch this. At 4:47pm on a Friday. And Brenda is going to send you an email with the subject line "Question???" And three question marks is never good. Three question marks means Brenda found a body.
The scenario: An agent flags a batch of employees for a compliance-related action based on certification expiration dates. The field hasn't been audited in eight months. Some of the dates are stale. Some are flat-out wrong because someone fat-fingered an entry during a mass upload and nobody caught it because (and this is the part where I stare directly into the camera) nobody audits EIB data after the upload.
The agent doesn't know the data is stale. The agent sees an expired certification, matches it to a business rule, and fires the action. Forty employees get compliance notices. Twenty of them have valid, current certifications. They are, understandably, upset. Some of them call HR. Some of them call their manager. One of them calls a lawyer. (There's always one. If you've worked in HR Tech long enough, you know that one person who has a lawyer on speed dial like it's a pizza place.)
And here's where the regulatory landscape makes this worse: depending on the nature of the compliance action and your jurisdiction, you may have just triggered a notification obligation. NYC Local Law 144 requires bias audits and candidate notice when automated employment decision tools are used in hiring and promotion. The EU AI Act classifies AI systems used in employment and worker management as "high-risk." You didn't just send a wrong notice; you potentially sent a wrong notice using a process that regulators are actively watching.
The fix: A rollback playbook for every agentic capability you enable. Not "open a ticket." A documented protocol that answers: what gets reversed, who authorizes the reversal, how do we communicate to affected employees, who owns the post-mortem, and what's our regulatory notification obligation? That protocol needs to exist before the agent goes live. Treat it like a disaster recovery plan. Because that's what it is.
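One way to keep the playbook from becoming a slide: capture it as data the team can actually pull up at 4:47pm on a Friday. Everything below, from the capability name to the template number to the date, is a placeholder; the shape is the point.

```python
# A rollback playbook captured as data rather than a deck. All values are placeholders;
# the point is that every question has a written answer before go-live.
rollback_playbook = {
    "capability": "Certification-based compliance notices",
    "what_gets_reversed": [
        "Retract notices sent to employees with valid certifications",
        "Clear compliance flags written back to the worker record",
    ],
    "reversal_authorized_by": "HR Compliance Lead",
    "employee_communication": "Corrective email from HRBP within 24 hours, template COMM-07",
    "post_mortem_owner": "HR Technology Director",
    "regulatory_notification": "Review with Legal whether a notice obligation was triggered",
    "last_rehearsed": "2025-02-20",  # a playbook that has never been rehearsed is a wish
}

for step, detail in rollback_playbook.items():
    print(f"{step}: {detail}")
```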
The Floor Under the Agent (Data Foundation)
I've saved the most boring and most important part for last.
Your agent inherits every sin in your data.
Every stale supervisory org. Every orphaned position. Every job profile that hasn't been audited since the original implementation when someone said, "We'll clean that up in Phase 2," and Phase 2 never came because Phase 2 is a myth. Phase 2 is the HR Tech equivalent of "we should get together sometime." It's never happening.
The agent doesn't know your data is bad. The agent treats your data as ground truth. It walks across your data foundation like it's solid concrete. If there are holes, the agent falls through them at machine speed. And it doesn't even know it's falling. It just keeps executing, confidently, all the way down.
Bad data in a traditional system creates bad reports. Bad data under an autonomous agent creates bad actions. Reports you can correct. Actions you have to undo. And some actions (an offer extended, a notice sent, a compliance flag triggered) are really, really hard to undo.
The minimum before you go agentic:
Supervisory org audit. When was the last time you verified that your org structure in the system matches reality? If the answer involves the phrase "during implementation," that's not an answer. That's a confession.
Job profile hygiene. Are your job profiles current, consistently structured, and actually reflective of the roles people are doing? Or are they a museum exhibit of what someone thought the role was in 2019?
Data field freshness. Every field the agent will read needs a last-verified date. If you can't tell me when a data point was last validated, the agent shouldn't be acting on it. (A sketch of this check follows the list.)
Integration integrity. If the agent is pulling data from multiple systems, are those systems in sync? Or are you running on "Schrödinger's Data" (a concept I covered earlier in this series) where the same employee exists in three systems with three different job titles and nobody knows which one is authoritative?
You don't need perfect data. (Perfect data is another myth, right up there with Phase 2 and "we'll migrate the historical data later.") You need audited data. You need to know where the holes are before the agent starts walking.
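Here's the freshness check mentioned above, as a sketch. The field names, dates, and the 180-day threshold are invented; what matters is that "when was this last verified?" becomes a question the system can answer before the agent acts on the field.

```python
from datetime import date, timedelta

# Hypothetical field metadata: every field the agent reads carries a last-verified date.
field_freshness = {
    "certification_expiration_date": date(2024, 7, 1),
    "supervisory_org": date(2025, 1, 10),
    "job_profile": date(2019, 6, 1),
}

MAX_AGE = timedelta(days=180)  # pick a threshold you can defend, not one you like

def stale_fields(freshness, today=None):
    """Return fields whose last verification is older than the allowed age."""
    today = today or date.today()
    return [name for name, verified in freshness.items() if today - verified > MAX_AGE]

for name in stale_fields(field_freshness):
    print(f"Agent should not act on '{name}' until it is re-verified.")
```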
Your Change Management Process Is Kindling
I promised in Part 3 that we'd talk about why your current change management process is going to look at agentic AI and burst into flames.
Here's why.
Your change management process was designed for a world where a human is the decision-maker at every step. A change gets proposed. A stakeholder reviews it. An approval chain fires. Someone communicates it to the affected population. Training happens. Feedback gets collected. The cycle repeats.
That entire model assumes one thing: time. Time for review. Time for feedback. Time for someone to say, "Wait, I don't think this is right." Time for Brenda to get that feeling in her stomach.
Agentic AI compresses time to near-zero. The agent proposes, evaluates, and acts in the space between your first sip of coffee and your second. There is no "review window." There is no "let's run this by the team." The agent ran it by itself, and it approved unanimously (because it was the only one in the room, and it doesn't understand the concept of a dissenting opinion).
You don't need a new version of your change management process. You need to acknowledge that the old one was designed for a species that moves at human speed, and then build something new for a world where decisions happen at machine speed.
What does that look like? It looks like pre-approved decision boundaries instead of case-by-case approvals. It looks like automated exception detection instead of manual review queues. It looks like real-time monitoring dashboards instead of monthly steering committee decks. It looks like treating your governance model as a living system, not a binder on a shelf.
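To make "pre-approved decision boundaries" concrete, here's a rough sketch of the check that sits between the agent's proposal and the action. The thresholds are invented; in practice they come out of the authority register, and anything outside them routes to a human instead of executing.

```python
# Pre-approved decision boundaries: the agent's proposed action is checked against
# limits humans signed off on in advance. Thresholds here are illustrative only.
BOUNDARIES = {
    "max_comp_range_adjustment_pct": 5.0,
    "max_records_touched_per_day": 50,
}

def within_boundaries(proposed: dict) -> bool:
    """Allow the action only if every pre-approved limit holds; otherwise escalate."""
    if abs(proposed["adjustment_pct"]) > BOUNDARIES["max_comp_range_adjustment_pct"]:
        return False
    if proposed["records_touched"] > BOUNDARIES["max_records_touched_per_day"]:
        return False
    return True

proposal = {"adjustment_pct": 12.0, "records_touched": 200}
if not within_boundaries(proposal):
    print("Outside pre-approved boundaries: queue for human review instead of executing.")
```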
It looks like work. A lot of it. I'm sorry. I wish I could tell you there's a shortcut. The shortcut is called "let the agent do whatever it wants and hope for the best," and the technical term for that strategy is "career-limiting."
The AI hasn't changed. The models are getting better, sure, but the fundamental dynamic is the same as it was in Part 2 when I was using it to draft user instructions and untangle calc fields. It's a tool. A very fast, very confident, occasionally wrong tool.
What changed is the authority. You moved the intern from the desk next to yours (where you could see the screen, check the work, and catch the errors) to a corner office with signing authority and a door that closes.
The governance model is how you make sure that promotion is earned. Incrementally. With guardrails. With review cadences that match the speed of the decisions being made. With rollback plans that exist before you need them. With data foundations that have been audited this calendar year.
Not all at once on a Tuesday afternoon because someone in the C-suite saw a demo and sent you an email with "THOUGHTS??" in the subject line.
Build the governance. Audit the data. Match the review speed to the action speed. Plan the rollback before you need it.
And for the love of everything, answer the questions before you flip the switch.
That's the series. Four (okay, it ended up being five) parts. The vocabulary, the plumbing, the literacy gap, and the governance. If you've made it through all of them, you're better prepared than most of the HR tech teams I talk to. And if you read the Safety Checklist back in December, you've now got both sides: the questions you ask them and the questions you answer yourself.
If this series helped you think differently about AI in your environment, share it. Forward it. Post it. Do the LinkedIn screenshot thing. Build the governance your organization needs before someone else builds urgency you can't control.
And if you're sitting there thinking, "This is great, Mike, but how do I actually sell this to my leadership?" Yeah. I hear you. I might have another post in me on that. No promises. (Okay, soft promise.)
— Mike
Director HR Tech | Intern Supervisor
The Department of First Things First. For the people who do the work.
P.S. Justin asked me what "governance" means. I told him it's like the rules in his video game that keep other players from cheating. He said, "So... it doesn't work?" Kid might have a future in consulting.



