Yannick onsite at VD Mayr



Why We Built a Workflow-Governed Agent System for Recruiting Automation
There is now a fairly standard implementation pattern behind many so-called “AI agents”: the model receives context, decides whether to invoke a tool, observes the tool result, and repeats until it decides the task is complete. OpenAI’s Agents SDK describes this explicitly as an agent loop, and Anthropic distinguishes these model-directed systems from workflows, where execution paths are predefined in code.
That distinction is important for production systems, because “agentic” and “autonomous” are not interchangeable. A system can use LLM-based routing, reasoning, decomposition, and synthesis while still keeping control flow, state transitions, and side effects outside model-directed execution.
That is the architecture we chose for recruiting automation.
Instead of pursuing a fully autonomous agent, we chose to build a workflow-governed agent system: a multi-stage architecture in which specialized LLM calls perform bounded cognitive tasks, while orchestration, sequencing, and mutation of system state remain code-controlled. This design is closer to what Berkeley AI Research describes as a compound AI system than to a single autonomous agent.
Architecture overview
Our runtime is organized into three stages:
- parallel classification
- domain execution
- response synthesis
Each stage uses LLMs differently, and each stage has a narrow contract.
1) Parallel classification layer
The first layer performs concurrent inference across several specialized classifiers: safety, jailbreak detection, language identification, retrieval eligibility, intent routing, and workflow classification.
This layer is best understood as a combination of routing and parallelization in Anthropic’s workflow taxonomy. The model helps classify the request, but it does not dynamically construct arbitrary downstream plans. The available execution paths are predefined, typed, and enforced by code.
Two engineering properties matter here:
Latency isolation. Since these classifiers run concurrently, end-to-end latency is closer to the slowest classifier than to the sum of all classifier runtimes.
Error isolation. Each classifier has a narrow prompt surface and a single responsibility, which makes failures easier to detect, evaluate, and retrain than a monolithic orchestration prompt.
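A minimal Python sketch of the latency property, under stated assumptions: the classifier names come from the text, but `run_classifier`, the labels, and the simulated delays are hypothetical stand-ins for real model calls. It only illustrates that concurrent classifiers finish in roughly the time of the slowest one.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassifierResult:
    name: str
    label: str


async def run_classifier(name: str, message: str, label: str, delay_s: float) -> ClassifierResult:
    # Hypothetical stand-in for one single-purpose LLM call on `message`;
    # the sleep simulates model latency and the label is hard-coded.
    await asyncio.sleep(delay_s)
    return ClassifierResult(name=name, label=label)


async def classify(message: str) -> dict[str, ClassifierResult]:
    # Illustrative classifier set with made-up labels and latencies.
    tasks = [
        run_classifier("safety", message, "safe", 0.05),
        run_classifier("jailbreak", message, "none", 0.10),
        run_classifier("language", message, "de", 0.05),
        run_classifier("intent", message, "apply_for_job", 0.10),
    ]
    # gather() runs the classifiers concurrently and preserves input order.
    results = await asyncio.gather(*tasks)
    return {r.name: r for r in results}


def timed_classify(message: str) -> tuple[dict[str, ClassifierResult], float]:
    # Returns the classifier outputs plus the wall-clock time of the whole layer.
    start = time.perf_counter()
    results = asyncio.run(classify(message))
    return results, time.perf_counter() - start
```

With these simulated delays the sequential total would be 0.30 seconds, while the concurrent run should take close to 0.10 seconds, the latency of the slowest classifier.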
2) Domain execution layer
After classification, a code-level dispatcher selects the relevant domain handlers.
This layer reflects our clearest design choice: keeping execution bounded rather than model-directed. The model is not deciding, step by step, which tool to call next in an open loop. Instead, execution follows a constrained graph:
- read-only operations may fan out in parallel
- state-mutating operations execute sequentially
- side effects are performed through code-governed handlers
- outputs are written into typed data structures rather than appended to an unconstrained reasoning transcript
This makes execution semantics more explicit and easier to inspect. The system does not rely on the model to remember hidden state, infer permissible transitions, or decide whether certain checks are required.
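The constrained graph described above could be sketched as follows. This is an illustration, not the production dispatcher: the intent names, handler names, and the `ROUTES` table are hypothetical, and the handlers return canned values instead of touching real systems.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ExecutionResult:
    # Typed container for outputs, instead of a free-form transcript.
    reads: dict[str, str] = field(default_factory=dict)
    applied_mutations: list[str] = field(default_factory=list)


async def fetch_job_details() -> str:
    return "example job details"


async def fetch_open_slots() -> str:
    return "example interview slots"


async def save_answer(result: ExecutionResult) -> None:
    result.applied_mutations.append("save_answer")


async def advance_stage(result: ExecutionResult) -> None:
    result.applied_mutations.append("advance_stage")


READ_HANDLERS = {"job_details": fetch_job_details, "open_slots": fetch_open_slots}
MUTATION_HANDLERS = {"save_answer": save_answer, "advance_stage": advance_stage}

# The route table is fixed in code; a classifier can only pick a key from it.
ROUTES = {
    "answer_screening_question": (["job_details", "open_slots"], ["save_answer", "advance_stage"]),
    "ask_about_job": (["job_details"], []),
}


async def dispatch(intent: str) -> ExecutionResult:
    if intent not in ROUTES:
        raise ValueError(f"no execution path defined for intent {intent!r}")
    read_names, mutation_names = ROUTES[intent]
    result = ExecutionResult()
    # Read-only handlers have no side effects, so they can fan out in parallel.
    values = await asyncio.gather(*(READ_HANDLERS[name]() for name in read_names))
    result.reads = dict(zip(read_names, values))
    # State-mutating handlers run one at a time, in the order the route defines.
    for name in mutation_names:
        await MUTATION_HANDLERS[name](result)
    return result
```

An intent that is not in the route table raises an error rather than triggering an improvised plan, which is the point of keeping the graph in code.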
For recruiting workflows, that matters because the runtime is not just answering questions. It is also participating in a stateful business process: collecting required information, validating completion criteria, applying stage logic, and triggering downstream actions.
3) Response synthesis layer
The final stage takes structured outputs from prior stages and renders a user-facing response.
This stage is intentionally separated from execution. Its role is linguistic, not operational: translate workflow state into a clear conversational response; adapt tone and phrasing; preserve multilingual quality; and explain next steps.
It does not:
- choose new execution paths
- mutate workflow state
- reinterpret transition rules
- bypass required process steps
That separation of concerns is one of the main advantages of the architecture. It allows the system to benefit from LLM fluency without giving the response model authority over control flow.
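One way to make this separation concrete, sketched in Python under assumptions: `WorkflowSnapshot`, `build_prompt`, and `fake_llm` are hypothetical names, and the model call is replaced by a stub so the example runs offline. The synthesis step receives a frozen snapshot and returns text only.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Callable


@dataclass(frozen=True)
class WorkflowSnapshot:
    # Read-only view of workflow state handed to the synthesis stage.
    stage: str
    next_step: str
    language: str


def build_prompt(snapshot: WorkflowSnapshot) -> str:
    return (
        f"Reply in language '{snapshot.language}'. "
        f"The candidate is at stage '{snapshot.stage}'. "
        f"Explain that the next step is: {snapshot.next_step}."
    )


def synthesize(snapshot: WorkflowSnapshot, llm: Callable[[str], str]) -> str:
    # The synthesis step returns text only; it holds no handlers and no mutable state.
    return llm(build_prompt(snapshot))


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call (assumption for this sketch).
    return f"[model reply for: {prompt}]"


def try_to_mutate(snapshot: WorkflowSnapshot) -> bool:
    # Returns True if the snapshot rejected the write, as a frozen dataclass should.
    try:
        snapshot.stage = "hired"  # type: ignore[misc]
    except FrozenInstanceError:
        return True
    return False
```

Because the snapshot is frozen, any attempt to change the stage from inside the synthesis code fails, while tone and phrasing remain entirely up to the model.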
Why we did not use a fully autonomous agent loop
The main reason is that for the kinds of recruiting workflows we care about, unconstrained model-directed execution introduces the wrong tradeoffs.
1) Process fidelity is more important than conversational initiative
In a general assistant, initiative can be a feature. In recruiting, it can become a defect.
A screening flow often contains required questions, specific evaluation steps, mandatory disclosures, and deterministic transition conditions. A fully autonomous agent may infer that a question is redundant because the candidate already mentioned something adjacent to it. That may be conversationally efficient, but it can violate standardization requirements or downstream scoring assumptions.
A workflow-governed system is designed to reduce that class of error: the model may adapt how a step is communicated, but not whether the step exists.
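A small sketch of that split, with hypothetical step names and a template function standing in for the phrasing model: code decides which step comes next, and the phrasing function only ever receives that one step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    key: str
    required: bool


# Hypothetical screening flow; real flows are defined by the process owner.
SCREENING_STEPS = (
    Step("work_permit", required=True),
    Step("language_level", required=True),
    Step("preferred_shift", required=False),
)


def next_step(answers: dict[str, str]) -> Optional[Step]:
    # Code decides whether a step exists: the first unanswered required step wins,
    # even if the candidate mentioned something adjacent in free text.
    for step in SCREENING_STEPS:
        if step.required and step.key not in answers:
            return step
    return None


def phrase_question(step: Step, tone: str) -> str:
    # Stand-in for an LLM call: wording may vary, the step itself may not.
    templates = {
        "formal": "Could you please tell us about your {topic}?",
        "casual": "Quick one: what's your {topic}?",
    }
    return templates[tone].format(topic=step.key.replace("_", " "))
```

In this sketch a free-text note such as "speaks good German" does not count as an answer to `language_level`; only an explicit answer under that key completes the step.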
2) Stateful execution is easier to reason about in code than in prompt state
State-heavy processes degrade quickly when too much execution logic is delegated to conversational context. In a long-running agent loop, the system must continually preserve and reinterpret latent state across turns and tool results.
By contrast, a typed workflow architecture externalizes state:
- progress is explicit
- transition conditions are explicit
- side effects are explicit
- failure recovery is explicit
That makes the system easier to test, easier to audit, and easier to modify.
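Externalized state can be as simple as an enum plus a transition table. The stages and allowed transitions below are assumptions chosen for illustration, not the actual recruiting workflow.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    SCREENING = "screening"
    DOCUMENTS = "documents"
    INTERVIEW = "interview"
    REJECTED = "rejected"


# Explicit transition conditions: anything not listed here is not allowed.
ALLOWED = {
    Stage.SCREENING: {Stage.DOCUMENTS, Stage.REJECTED},
    Stage.DOCUMENTS: {Stage.INTERVIEW, Stage.REJECTED},
    Stage.INTERVIEW: set(),
    Stage.REJECTED: set(),
}


@dataclass
class Application:
    stage: Stage = Stage.SCREENING
    history: list[tuple[Stage, Stage]] = field(default_factory=list)

    def transition(self, target: Stage) -> None:
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"illegal transition {self.stage.value} -> {target.value}")
        # Progress is recorded explicitly, which makes it testable and auditable.
        self.history.append((self.stage, target))
        self.stage = target
```

A unit test can then assert that an illegal transition is rejected and leaves the state untouched, without involving a model at all.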
3) Reliability still drops sharply in realistic enterprise tasks
Recent benchmark evidence suggests that realistic multi-step enterprise execution remains difficult for frontier systems. In EnterpriseOps-Gym, a March 2026 benchmark covering 1,150 expert-curated tasks across eight enterprise domains, the best reported model reached 37.4% success. That result highlights the gap between impressive local agent behaviors and dependable end-to-end task completion in production-style environments.
The lesson is not that agentic techniques are useless. It is that long-horizon model-directed execution introduces many failure surfaces at once: decomposition, action choice, parameter selection, result interpretation, policy adherence, and state consistency.
Our architecture narrows that error surface by assigning different responsibilities to different components and keeping the most sensitive parts of execution outside autonomous control.
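A back-of-the-envelope illustration of why many failure surfaces compound. It assumes every step succeeds independently with the same probability, which real systems do not guarantee, and the numbers are purely illustrative rather than taken from any benchmark.

```python
def end_to_end_success(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps
```

Under that simplifying assumption, a 95% per-step success rate gives roughly 77% over 5 model-directed steps and roughly 36% over 20. Moving steps out of model control and into deterministic code reduces the number of factors in that product.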
Why this matters specifically in recruiting
Recruiting automation differs from general-purpose assistance in one important way: the conversational participant is not the only stakeholder.
The recruiter, employer, or process owner defines:
- required steps
- evaluation criteria
- transition rules
- compliance boundaries
- acceptable automation behavior
The candidate interacts with the system, but does not define its operating semantics.
That makes recruiting an example of process-governed automation, not pure user-driven assistance.
In that setting, giving the model broad autonomy can create a mismatch between conversational optimization and process correctness. A model may optimize for brevity, empathy, or local coherence while violating requirements that matter to the actual system owner.
A workflow-governed agent system resolves that by separating:
- what must happen — encoded in workflow logic
- how it is communicated — handled by LLMs
Safety and compliance are easier to enforce structurally
Another reason we preferred workflow control is that safety checks can be mandatory pipeline steps rather than optional model decisions.
Moderation, jailbreak detection, policy classification, and scope validation all run because the runtime executes them — not because the model chooses to. That can make prompt injection less effective, since the model does not own the decision of whether protective checks run.
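A sketch of checks as mandatory pipeline steps. The two check functions are deliberately naive keyword matchers and are purely hypothetical; a real system would use dedicated classifiers. What matters here is the structure: the loop over `MANDATORY_CHECKS` runs in code before any handler is reached.

```python
from typing import Callable

Check = Callable[[str], bool]  # returns True when the message passes


def moderation_check(message: str) -> bool:
    # Naive placeholder; a real check would call a moderation classifier.
    return "offensive-term" not in message.lower()


def jailbreak_check(message: str) -> bool:
    # Naive placeholder; a real check would call a jailbreak classifier.
    return "ignore previous instructions" not in message.lower()


MANDATORY_CHECKS: tuple[tuple[str, Check], ...] = (
    ("moderation", moderation_check),
    ("jailbreak", jailbreak_check),
)


def handle(message: str, handler: Callable[[str], str]) -> str:
    # Every check runs on every message; nothing in the message can skip this loop.
    failed = [name for name, check in MANDATORY_CHECKS if not check(message)]
    if failed:
        return "blocked: " + ", ".join(failed)
    return handler(message)
```

A message that asks the system to skip its checks is still passed through all of them, because the decision to run them never reaches a model.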
This is particularly relevant in hiring. Under the EU AI Act, AI systems used for recruitment or selection of natural persons are explicitly classified as high-risk in Annex III, and the main obligations for those systems apply from 2 August 2026. In a high-risk context, system properties like traceability, human oversight, and technical documentation become architectural concerns, not just policy aspirations.
A workflow-based architecture helps because it produces explicit intermediate artifacts:
- classifier outputs
- selected execution path
- collected structured data
- triggered rules
- resulting state transition
That is closer to an auditable system record than a free-form interleaving of reasoning and tool traces.
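The five artifacts listed above map naturally onto a typed record. The field names mirror the list; the concrete values in the usage note and the JSON log format are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    classifier_outputs: dict[str, str]
    selected_path: str
    collected_data: dict[str, str]
    triggered_rules: list[str]
    state_transition: tuple[str, str]  # (from_stage, to_stage)


def to_log_line(record: AuditRecord) -> str:
    # sort_keys gives a stable field order, which makes log lines easy to diff.
    return json.dumps(asdict(record), sort_keys=True)
```

For example, a turn in which the intent classifier returned `apply_for_job` and the application moved from `screening` to `documents` would serialize into a single JSON line that can be stored and reviewed later.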
Tradeoffs
This architecture is intentionally less flexible than a fully autonomous agent.
It cannot improvise arbitrary new capabilities outside the defined execution graph. If a capability is not represented in the workflow, the system will not invent it. That is a limitation, but in a regulated, stateful, business-process domain, it is often the right limitation.
The upside is stronger guarantees around:
- process fidelity
- state consistency
- safety enforcement
- auditability
- operational predictability
For recruiting automation, we have found those properties more valuable than unconstrained model autonomy.
Conclusion
We did not reject agentic techniques. We use them extensively — for routing, classification, synthesis, and bounded decision-making. What we rejected was fully autonomous control over execution.
The result is an architecture where LLMs contribute intelligence inside narrow interfaces, while code retains authority over workflow structure, state mutation, and side effects. For recruiting automation, that has been a better engineering fit for us than a single autonomous agent loop.

What commerce innovations mean for applicant management
Yesterday, OpenAI introduced a new feature for ChatGPT: Instant Checkout. Users in the US can now buy products from Etsy or over a million Shopify retailers directly in chat, without detours via shopping carts or external sites. With Stripe as the technical foundation, this creates an agentic protocol for commerce that seamlessly integrates checkout, payment and fulfillment into a conversation.
That sounds like e-commerce at first. But anyone who looks deeper realizes that this is exactly the paradigm shift that we are also experiencing in recruiting.
From click to conversation
In classic online retail, customers click through catalogs, shopping carts and checkout funnels. Now everything happens in a conversation with an AI that presents products, answers questions and finalizes the purchase.
Recruiting was similar for a long time:
- Search for a job ad
- Open the form
- Enter data
- Submit
- Wait
Today, the process is shifting here too: agentic AI makes it possible for applicants to apply directly via chat. Whether via WhatsApp, SMS or other messengers, the application becomes a fluid dialogue.
Recruiting = Commerce
The parallel is striking:
- Product data in retail corresponds to job data in recruiting.
- Checkout in e-commerce corresponds to the application.
- Fulfillment (shipping, delivery) corresponds to the onboarding phase.
Just as ChatGPT only conveys information between buyer and retailer during checkout, recruiting AI conveys the relevant data between applicant and employer.
The result: less friction, more speed and the candidate experience that applicants expect today.
Agentic AI in applicant management
Instant Checkout creates an open standard for commerce that allows secure, powerful and easy-to-integrate payments in chat. We transfer exactly this idea to applicant management:
- Dialogue instead of a funnel: Applicants upload documents, answer questions, prove qualifications — all in chat.
- Agentic processes: AI understands the context, checks requirements in real time (e.g. language level, certificate of conduct), coordinates appointments and ensures that all stakeholders remain informed.
- Security and data protection: Just like with a payment token, the applicant only shares the data that is necessary — and remains in control at all times.
Agent readiness for recruiting
Instant Checkout creates a new field in e-commerce: agent readiness. Retailers must structure product data so that AI agents can understand and use it correctly.
The same applies to recruiting:
- Clean job data: clear role profiles, requirements, start dates, working hours.
- Machine-readable documents: contracts, proofs, certificates.
- Transparent processes: interview slots, feedback logic, feedback rules.
Only companies that prepare their recruiting data in such a way that agentic AI can work with it will benefit from this development in the future.
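As a rough illustration of what "clean job data" could mean in practice, here is a minimal Python sketch. The `JobPosting` fields follow the list above (role, requirements, start date, working hours), but the exact schema, the validation rules, and the JSON output are assumptions, not an established standard.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class JobPosting:
    title: str
    requirements: tuple[str, ...]
    start_date: date
    weekly_hours: float

    def __post_init__(self) -> None:
        # Reject incomplete postings early so an agent never works with gaps.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if not self.requirements:
            raise ValueError("at least one requirement is needed")
        if self.weekly_hours <= 0:
            raise ValueError("weekly_hours must be positive")


def to_json(job: JobPosting) -> str:
    payload = asdict(job)
    # Dates are not JSON-serializable by default, so emit ISO 8601 strings.
    payload["start_date"] = job.start_date.isoformat()
    return json.dumps(payload)
```

A posting that passes validation can be handed to an agent as structured data, instead of leaving it to parse start dates and working hours out of free text.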
Conclusion: Instant recruiting is the next stage of evolution
What we are observing in retail is the blueprint for recruiting: away from click paths and towards intent-driven conversations with AI agents.
Just as ChatGPT completes the buying process in chat, applicant management will soon work like this:
- Find jobs
- Apply directly in chat
- Screening, interview, contract — all in one stream
The result: Recruiting in 60 minutes instead of 60 days.
Companies that start making their processes agent-enabled today secure the decisive advantage: finding and hiring the right employees faster, before the competition does.

Why agentic AI is reinventing recruiting
In June 2025, McKinsey released its new CEO playbook, "Seizing the Agentic AI Advantage". The report confirms what many of us have long observed in practice: generative AI is widely used, but so far it has had barely any noticeable impact on company performance.
McKinsey calls this the gen AI paradox: almost 80% of companies are already using AI, but a similar share report that they see no measurable effect on their results.
The reason is simple. To date, most companies have relied on horizontal use cases — chatbots, copilots, text summaries. These are easy to introduce, but only provide diffuse, superficial benefits. Real impact comes from vertical use cases — i.e. AI that is deeply embedded in a company's core processes. That's where agents come in.
The change: From assistants to agents
As McKinsey describes, real added value is only created when processes are automated end-to-end. Agents differ from simple co-pilots because they:
- Understand goals and break them down into subtasks
- Interact with people and systems
- Execute workflows independently
- Adapt in real time as conditions change
This is not just an increase in efficiency — it is a transformation of the way we work.
Recruiting as a prime example
At Paul's Job, we experience this every day. Recruiting is one of the most communication-intensive processes in any organization. A single hire involves:
- Candidates who submit documents, answer questions, coordinate appointments
- Recruiters who pre-screen candidates, check documents, and clarify availability
- Hiring managers who make decisions
- Assistants or coordinators who manage calendars and processes
If this process is run manually, it consumes enormous time and management capacity. In industries such as nursing, facility management or retail — where service performance depends directly on sufficient personnel — this has serious consequences:
- Positions remain unfilled.
- Companies can serve fewer customers or patients.
- Competitors win candidates simply because they are faster.
What happens when you automate recruiting
Now let's imagine an end-to-end recruiting agent. Candidates apply directly via WhatsApp or SMS. Paul, the AI agent, guides them through the process:
- Collects documents (driver's license, language certificates, work permits)
- Checks requirements (language level via voice message, availability, certificates)
- Automatically coordinates interviews with hiring managers
- Prepares contracts as soon as all criteria are met
The result: From initial contact to a signed contract in less than 60 minutes.
This is not a chatbot bolted onto old processes. This is a process that has been rethought from the ground up around agents: exactly the vertical deployment that McKinsey identifies as the missing piece in today's AI strategies.
Why this is particularly important for service industries
In labor-intensive companies, faster recruiting is not just an HR indicator — it directly determines performance:
- Care services with more nurses can serve more patients.
- Retailers and logisticians can staff branches and warehouses more reliably.
- Security and facility firms can fulfill contracts on time, without expensive delays.
Recruiting speed is a strategic competitive advantage, especially in times of skill and labor shortages.
The bigger lesson for CEOs
The McKinsey playbook concludes with a clear message: The experimental phase is over. CEOs must have the courage to fundamentally rethink processes with agents. That means:
- Embed agents into core workflows, not just as an add-on
- Redesign processes to utilize the strengths of agents — parallel operations, real-time customization, scalability
- Create governance to ensure trust, security, and acceptance
Recruiting shows what that can look like: From weeks of communication to hiring in less than an hour. It's not a gimmick, it's process innovation.
Conclusion
The gen AI paradox cannot be solved with additional co-pilots or chatbots. The solution is for companies to select a few highly relevant processes, redesign them end-to-end, and let agents take over.
Recruiting is an excellent example. Anyone who automates here not only gains efficiency, but also real competitiveness. That is the real strength of Agentic AI.

Will AI replace the recruiter job — yes or no?
Let's be honest: we hear this question all the time. Will AI replace recruiters?
Our answer is clear: No.
But — the role is fundamentally changing.
With Paul, we can already see how much work can be taken off recruiters' hands. Communicating with applicants, collecting documents, coordinating interviews, checking basic requirements: an AI agent can do all of this faster, smarter and more consistently.
What does that mean for recruiters? It creates space for the tasks that really matter:
- Build the right teams.
- Strategically support specialist departments.
- Build stronger applicant engagement (almost like HR marketing).
- And very important: managing AI.
Because AI doesn't run on autopilot. With Paul, there is always a human in the loop. Paul suggests decisions, such as whether a candidate fits the profile, but the final say rests with a human. Recruiters thus become AI managers: they review, control and refine what the system does.
Yes, the job profile is changing. But that is exactly where the opportunity lies: better decisions, higher quality and more time for what recruiting really means — the human side.
So no — AI doesn't replace recruiters. But it makes them better, faster and more effective than ever before.

Restart successful
After a tough summer, things turned around in July. That was when I met our current lead investor, Jesse Jeng from Scalehouse Capital. Jesse was instantly convinced, not only by our vision but also by our story. My exit with Softgarden, the fact that Yannick is on board, and our clear idea of how recruiting with agentic AI will work in the future: it all fit together.
In August and September, we did it: we closed our pre-seed financing for Paul's Job. A total of 1.4 million euros, led by Scalehouse Capital and supported by a great group of super angels: Alexander Brühl, Rainer Hofmann, Andreas Junck, Carsten Reetz and Michael May.
At the same time, we were also able to take the work we had already put into the product in 2024 and 2025 into the new company. Honestly, it wasn't cheap either 😉 — but it was worth it because it gave us a strong foundation on which we could immediately continue building.
And we didn't start alone: customers such as Teamwork, Alloheim and VD Mayr, with whom we had worked before, stayed with us through this turbulent period. We are incredibly grateful that they have remained loyal to us. On this basis, we know that we can count on our customers.
Today, as three founders — Yannick, Putu and I — we stand on really clean foundations. Without legacy issues, with a clear setup and a mission that drives us all: to completely rethink recruiting with agentic AI.
This restart feels like a fresh start, only better, because we now carry everything we have learned with us. We're stronger, more focused and have the right basis to build something really big.

A tough summer
The summer of 2025 was not an easy time for me. I'd like to share openly what happened here.
I joined a team in 2024 that had already worked on HR products. I met Putu there. Together, we started working on a new vision: a recruiting system that uses agentic AI to completely rethink the application process.
However, the team I joined had a very complex past, with issues that had nothing to do with this new product. It was precisely these inherited issues that made it extremely difficult to advance our vision within that structure.
During this time, I also brought Yannick into the team. We had worked closely together at Softgarden before, and I knew that with his experience in sales, he was just the right person to bring that vision forward.
But at some point, we were faced with a harsh reality: The team's complicated past made it impossible to get new funding. We would have liked to continue — but we simply had no chance anymore. We were forced to stop even though we still had so much to do.
That was a real low point. But at the same time, it was also a moment of clarity. It became clear to us that if we wanted to achieve this vision, we had to restart it, and start unencumbered.
In the next article, you will read exactly how we succeeded, and why we are stronger and freer today than ever before.