The race to deploy multi-agent AI systems inside enterprise operations is accelerating. Customer support pipelines, document processing workflows, decision-support tools — organizations are assembling networks of specialized AI agents faster than they are thinking through the infrastructure that holds those networks together. According to McKinsey's State of AI 2025, a survey of nearly 2,000 executives across 105 countries, 62% of organizations are already experimenting with or actively scaling agentic AI, yet only 39% report measurable business impact at the enterprise level. The gap is not a model problem but an architecture problem. The dominant deployment pattern is direct coupling: agents wired to APIs, databases, retrieval pipelines, and prompt templates, with business logic, tool execution, and context management all fused inside a single component, a structure that holds together in a prototype and frays predictably under the pressures of production scale.
Igor Zuykov, Chief Software Engineer at G-71 Inc., has spent more than 18 years building large-scale distributed systems for major financial institutions, including treasury management platforms and an Asset and Liability Management system at T-Pro, the IT consulting firm he founded. At G-71, he serves as the architect behind LeaksID, a document security platform that can identify the source of a leak from a single photograph, a unique method protected by a US patent. His newly published peer-reviewed study in The American Journal of Engineering and Technology, “Architectural Principles for Multi-Agent Systems Based on the Model Context Protocol,” argues that the industry is investing in model capability while leaving the underlying architecture unexamined. A senior member of IEEE, a fellow of Hackathon Raptors, and twice appointed as a judge at the AITEX AI technology competition, Zuykov has spent his career asking the same question he now brings to agentic AI: not whether a system works in demonstration, but whether it holds up when scale, security, and real operational complexity enter the picture.
In this article, Mr. Zuykov explains the four structural principles that separate enterprise-grade multi-agent architecture from its fragile alternatives, and why governance built into the system from the start is the only approach that survives operational reality.
The failure mode Zuykov describes in his research is not specific to AI. It is the same structural problem that surfaces whenever teams treat integration as a local concern rather than an architectural one. Direct coupling works in the early phases of a project, when the scope is narrow and the team is small enough to hold the whole system in their heads.
What his research documents is what happens when that scope expands. Every new data source increases coupling across the entire system, adding maintenance overhead that is invisible at first and expensive later. Testing narrows because agent logic and external dependencies cannot be examined independently. Security governance turns reactive, since access paths multiply faster than they can be audited. Zuykov characterizes the cumulative effect precisely: direct integration of agents with external tools and data pipelines leads to “fragmented capabilities and uneven governance mechanisms”, a pattern that only compounds when single-agent systems grow into multi-agent architectures.
That dynamic was visible not only in his own projects but across the field. At AITEX, evaluating solutions across AI infrastructure and data security, Zuykov encountered technically sophisticated submissions whose architecture hadn't been designed to separate what an agent knows from what it can reach, and that couldn't demonstrate stable behavior when dependencies changed. “You see teams that spent months on model quality and days on infrastructure,” Zuykov says. “Then a data source changes format or a third-party API updates its authentication, and the whole agent pipeline stops. Not because the model failed, but because nobody designed the boundary between the agent and the thing it was reaching into.” The failure mode he documents in his research and the one he observed in submitted solutions are the same one, appearing at different scales.
What distinguishes Zuykov's treatment of MCP from most coverage of the protocol is the architectural frame he applies rather than the mechanics he describes. “MCP should be treated not merely as an auxiliary integration protocol,” he writes in his study published in The American Journal of Engineering and Technology, “but as an autonomous architectural layer that establishes a formalized topology of interaction among hosts, clients, and servers.” The distinction matters because a connectivity tool gets evaluated on the range of systems it can reach. In contrast, an architectural layer gets evaluated on the structural properties it enforces across everything built on top of it.
The topology works through three components. The host is the runtime environment that coordinates tasks and manages execution context. The client is the protocol implementation inside the host, maintaining a stateful connection to a specific MCP server. The server exposes capabilities through a standardized interface: tools, resources, and prompt templates. The application communicates through clients that handle discovery and routing rather than reaching directly into external systems.
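The three-component topology can be sketched in plain Python. The sketch below is a schematic model, not the actual MCP SDK; every class and method name is invented for illustration, and the real protocol adds transport, sessions, and capability negotiation on top of this shape.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class MCPServer:
    """Exposes capabilities through a standardized interface."""
    name: str
    _tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def register_tool(self, tool_name: str, fn: Callable[..., Any]) -> None:
        self._tools[tool_name] = fn

    def list_tools(self) -> List[str]:          # discovery
        return sorted(self._tools)

    def call_tool(self, tool_name: str, **kwargs: Any) -> Any:  # execution
        if tool_name not in self._tools:
            raise KeyError(f"{self.name} does not expose '{tool_name}'")
        return self._tools[tool_name](**kwargs)


@dataclass
class MCPClient:
    """Protocol implementation inside the host; one link to one specific server."""
    server: MCPServer

    def discover(self) -> List[str]:
        return self.server.list_tools()

    def invoke(self, tool_name: str, **kwargs: Any) -> Any:
        return self.server.call_tool(tool_name, **kwargs)


class Host:
    """Runtime environment: coordinates tasks, routes through clients."""
    def __init__(self) -> None:
        self.clients: Dict[str, MCPClient] = {}

    def connect(self, server: MCPServer) -> None:
        self.clients[server.name] = MCPClient(server)

    def route(self, server_name: str, tool_name: str, **kwargs: Any) -> Any:
        # The application never reaches into external systems directly.
        return self.clients[server_name].invoke(tool_name, **kwargs)


# Usage: the host discovers and routes; the agent never touches the backend.
billing = MCPServer("billing")
billing.register_tool("recent_charges",
                      lambda customer_id: [{"id": customer_id, "amount": 9.99}])
host = Host()
host.connect(billing)
charges = host.route("billing", "recent_charges", customer_id="c-42")
```

The point of the shape is that swapping the billing backend changes only the server; the host and the agents on top of it are untouched.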
During his work as an integrator on behalf of IT services and consulting firms, Zuykov spent years building connections between enterprise systems across major financial institutions, including Barclays, UniCredit, and Raiffeisen. What he observed repeatedly was what accumulates without formal mediation: cross-system dependencies that look clean in architecture diagrams and create incidents in production, where a change to one component propagates through integrations that were never designed with explicit boundaries. The capability layer MCP establishes addresses exactly this for agentic AI: specialized agents share infrastructure without each one absorbing the full complexity of its environment, and the layer can be audited as a coherent whole rather than reconstructed from scratch inside every agent that needs external access.
The four architectural principles Zuykov derives from MCP's design are mutually reinforcing rather than independently applicable. Each addresses a specific failure mode that emerges when the capability layer is left unstructured, and together they define what a governed multi-agent system actually requires.
The core problem Zuykov identifies in direct-coupled architectures is what he calls "the fusion of planning, reasoning, data retrieval, and action execution within a single agent boundary": when the same component interprets goals, calls tools, manages failures, and synthesizes outputs, a change to any one function potentially affects all the others, and in multi-agent systems this entanglement multiplies with every specialist agent added. MCP resolves this by assigning cognitive and infrastructural work to distinct parts of the topology, where the agent handles reasoning and synthesis while servers handle execution and data access. Zuykov encountered this requirement long before working on agentic systems, in traditional microservices environments. As a team leader and software architect at T-Pro, he built a Privileged Access Management system from scratch, along with an Asset and Liability Management system and a digital loan risk monitoring platform, all deployed on Kubernetes, following SOLID principles and standard microservices patterns. The common thread across them was a strict separation of concerns: infrastructure-level policy enforcement, including security, session auditing, and service discovery, lived apart from application business logic. Services communicated through well-defined APIs, and changes to one service did not force changes to others. MCP brings the same discipline to agentic AI.
Agents are exposed to prompt injection, data leakage, and privilege escalation by the nature of what they do, and when multiple agents coexist, layering controls onto an architecture not designed for them produces enforcement gaps that widen as the system grows. MCP provides least privilege as a structural default instead: agents carry no inherent access to external systems and can only interact with capabilities a connected server explicitly exposes, while authentication, authorization, and input sanitization are enforced at the server layer. Host-mediated isolation ensures servers cannot observe the full conversation history and cannot interact directly with one another. Long before agentic AI, this principle was already standard in microservices environments: policy enforcement lived at the infrastructure layer through API gateways and service meshes, kept entirely separate from business logic. As he notes in his research, MCP's design "relocates governance to a declarative, auditable layer", the same architectural separation that made those systems auditable, now applied to agents.
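What "least privilege as a structural default" looks like at the server boundary can be shown in a few lines. The agent identifiers, capability names, and sanitization rule below are all invented for illustration; the point is only that authorization and input handling live in the server, not in the agent.

```python
# Illustrative allow-list: which agent may invoke which capability.
# In a real deployment this would be declarative, versioned configuration.
ALLOWED = {
    "retriever-agent": {"crm.read_profile", "billing.recent_charges"},
    "reply-agent": {"templates.get"},
}


def enforce(agent_id: str, capability: str) -> None:
    """Server-side check: agents carry no inherent access."""
    if capability not in ALLOWED.get(agent_id, set()):
        raise PermissionError(f"{agent_id} may not call {capability}")


def sanitize(value: str) -> str:
    """Crude input sanitization at the server boundary (illustrative only)."""
    return value.replace("\x00", "").strip()


def crm_read_profile(agent_id: str, customer_id: str) -> dict:
    """A CRM server capability: enforcement happens before any execution."""
    enforce(agent_id, "crm.read_profile")
    customer_id = sanitize(customer_id)
    return {"customer_id": customer_id, "tier": "standard"}


# The retriever is on the allow-list; the reply agent is not.
profile = crm_read_profile("retriever-agent", " c-42 ")

denied = False
try:
    crm_read_profile("reply-agent", "c-42")
except PermissionError:
    denied = True
```

Because the check sits inside the server, adding a new agent adds one allow-list entry to audit, not a new copy of the enforcement logic.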
Without a shared protocol layer, agent ecosystems fragment: tools built for one framework stay bound to its abstractions, the same capability gets reimplemented for each new agent stack, and the overhead of maintaining parallel integrations consumes the efficiencies that specialization was supposed to create. MCP breaks this pattern by treating servers as framework-agnostic providers. Any conforming agent can use a compatible server regardless of the framework it runs on, meaning a database, search, or file server can serve multiple agents and applications from a single implementation. Over time, this creates what Zuykov describes as “a reusable infrastructure library that can be audited, improved, and shared across teams” — not a set of bespoke integrations rebuilt from scratch for every new use case.
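The reuse pattern is simple to sketch: one server implementation, written once, consumed by any number of agent stacks through the same discovery-and-invoke contract. The server class and agent functions below are illustrative stand-ins, not real framework code.

```python
from typing import Any, Dict, List


class KnowledgeBaseServer:
    """One search implementation, shared by every conforming agent."""

    def __init__(self, articles: Dict[str, str]) -> None:
        self._articles = articles

    # The contract any conforming client relies on:
    def list_tools(self) -> List[str]:
        return ["search"]

    def call_tool(self, name: str, **kwargs: Any) -> List[str]:
        if name != "search":
            raise KeyError(name)
        query = kwargs["query"].lower()
        return [title for title, body in self._articles.items()
                if query in body.lower()]


# A single server instance, used unchanged by two unrelated agent stacks.
kb = KnowledgeBaseServer({
    "Refund policy": "Refunds are issued within 14 days.",
    "Billing FAQ": "Double charges are reversed automatically.",
})


def support_agent(server: KnowledgeBaseServer, query: str) -> List[str]:
    return server.call_tool("search", query=query)       # stack A


def audit_agent(server: KnowledgeBaseServer, query: str) -> List[str]:
    return server.call_tool("search", query=query)       # stack B
```

Neither agent knows or cares how search is implemented; replacing the in-memory lookup with a real index changes nothing on their side.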
In typical implementations, prompt strings live inside application code, retrieval heuristics drift as data changes, and different agents working on related tasks diverge silently in how they interpret shared context. Each of these is a local decision that accumulates into system-wide liability. MCP replaces this with a declarative model in which agents request context through structured identifiers without needing to know how retrieval or template management is implemented behind the interface. Context becomes, in the researcher's formulation, “a first-class asset that is discoverable, versioned, and administratively controlled”, updated centrally and consumed consistently across every agent that references it, without redeployment.
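A versioned template registry makes the idea concrete: agents request context by identifier, the registry resolves the current version, and a central update is reflected everywhere on the next request. This sketch is an illustration of the principle, not the protocol's actual resource or prompt interface.

```python
from typing import Dict, List, Optional


class TemplateRegistry:
    """Context as a first-class asset: discoverable, versioned, centrally managed."""

    def __init__(self) -> None:
        self._templates: Dict[str, List[str]] = {}

    def publish(self, name: str, body: str) -> int:
        """Add a new version centrally; returns the new version number."""
        self._templates.setdefault(name, []).append(body)
        return len(self._templates[name])

    def get(self, name: str, version: Optional[int] = None) -> str:
        """Agents resolve a structured identifier, not an implementation."""
        versions = self._templates[name]
        return versions[(version or len(versions)) - 1]


registry = TemplateRegistry()
v1 = registry.publish("billing_dispute_reply",
                      "Hello {name}, we are reviewing the charge.")

# Every agent that references the identifier sees the same context.
latest = registry.get("billing_dispute_reply")

# One central update; consumers pick it up on the next request, no redeploy.
v2 = registry.publish("billing_dispute_reply",
                      "Hello {name}, the charge of {amount} is under review.")
```

Pinning a version number is also possible, which is what makes changes auditable: the registry records exactly which text each version carried.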
Zuykov demonstrates how these principles work together through a reference architecture built around customer support, a domain demanding enough in its access control, personalization, and compliance requirements to make the architectural choices visible.
The system contains three specialist agents operating within a shared host environment. The Classifier Agent analyzes incoming customer messages to determine intent and extract key entities, identifying, for instance, that a complaint about a double charge indicates a billing-related intent tied to a specific customer identifier. The Retriever Agent takes that output and assembles the context needed to address it: customer profile, subscription history, recent transactions, and relevant knowledge base articles. The Reply Generator Agent receives this assembled context and produces a personalized draft response, drawing on a versioned template appropriate to the identified intent.
Five MCP servers handle the external interactions: CRM, Ticketing, Billing, Knowledge Base, and Response Template. Each exposes only what its role requires. None has visibility into the others' data, and none has access to the conversation history held by the host. When the Retriever Agent calls the CRM Server, the server manages authorization and returns only what the requesting agent is permitted to see. A compromised Knowledge Base Server cannot reach customer financial data because the protocol boundaries between servers are not traversable. When the classification logic needs updating, the prompt is revised in the Response Template Server and immediately reflected across every agent that references it — no redeployment required.
The workflow also includes a human review step before any draft response is sent, for cases involving billing disputes, policy interpretation, or reputational sensitivity. The host coordinates the handoff, and the agents themselves are unaware of it. This is what governance built into the architecture from the start actually produces: not a constraint imposed on a system that wasn't designed for it, but a structural property the system supports without modification.
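The end-to-end workflow reads naturally as a host-coordinated pipeline, which the sketch below traces with stubs. Everything here is illustrative: the agent and server functions stand in for real implementations, server calls are shown inline for brevity (in MCP they go through host-managed clients), and the human-review gate is a simple predicate the host applies without the agents' knowledge.

```python
from typing import Any, Dict, List

# --- MCP server stubs: each exposes only what its role requires --------------
def crm_server(customer_id: str) -> Dict[str, str]:
    return {"name": "Dana", "plan": "Pro"}                # stubbed profile

def billing_server(customer_id: str) -> List[Dict[str, float]]:
    return [{"charge": 9.99}, {"charge": 9.99}]           # the double charge

def template_server(intent: str) -> str:
    return ("Hello {name}, we are reviewing the duplicate "
            "charge on your {plan} plan.")


# --- Specialist agents: reasoning and synthesis only -------------------------
def classifier_agent(message: str) -> Dict[str, str]:
    intent = "billing_dispute" if "charged twice" in message else "general"
    return {"intent": intent, "customer_id": "c-42"}

def retriever_agent(classification: Dict[str, str]) -> Dict[str, Any]:
    cid = classification["customer_id"]
    return {"profile": crm_server(cid), "charges": billing_server(cid)}

def reply_agent(classification: Dict[str, str], context: Dict[str, Any]) -> str:
    template = template_server(classification["intent"])
    return template.format(**context["profile"])


# --- Host: coordinates the pipeline and the review gate ----------------------
REVIEW_INTENTS = {"billing_dispute"}   # illustrative policy, held by the host

def handle(message: str) -> Dict[str, Any]:
    classification = classifier_agent(message)
    context = retriever_agent(classification)
    draft = reply_agent(classification, context)
    # The gate lives in the host; no agent is aware of it.
    needs_review = classification["intent"] in REVIEW_INTENTS
    return {"draft": draft, "needs_review": needs_review}


result = handle("I was charged twice this month")
```

Note where the review decision sits: changing which intents require a human touches one host-level set, not any agent's logic.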
The architectural logic is sound, but the ecosystem around it is not yet mature. Centralized security oversight remains limited, multi-tenant environments are vulnerable to configuration drift, and authentication frameworks across clients and servers still need consolidation. Server distribution carries its own risks: spoofing, unofficial packages, and outdated versions can each compromise the capability layer, and because that layer is often a shared dependency, a single affected server cascades across every agent that relies on it. Signing, provenance, trusted catalogs, and continuous validation remain open problems, not settled practice.
“The real test for any architecture is not whether it works when everything goes right,” Zuykov says. “It is whether it gives you a clear picture of what happened when something goes wrong — and whether you can fix that one thing without rebuilding everything around it. That is what separating the capability layer from the reasoning layer actually buys you.”
For enterprise teams deciding where to invest as agentic AI moves from experimentation to production, the distinction is practical: MCP today is not a finished governance model. The ecosystem is still building the tooling to support one. What it offers is an architecture that enforces separation, bounded access, and auditability now, before the standards catch up. Zuykov's full argument, along with his earlier study on making RAG-based document analysis robust enough for enterprise use, is available in The American Journal of Engineering and Technology.