A hardened, multi-agent shield for AI-to-AGI systems. The premise is simple and old: never trust a single guard. Three guardians, three coding styles, one central hub. Bypass any one and three more spawn, randomized on the spot, until the surface the attacker must defeat exceeds their patience.
Each guardian is implemented in a different language family with a different parser, sandbox, and ruleset. A jailbreak crafted for one is unlikely to land on the others.
Any successful bypass spawns three new randomized guardians. Stages: 3 → 9 → 27. At 27 the system enters protective shutdown rather than allow further compromise.
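The escalation ladder can be sketched as a small state machine. This is a minimal illustration, not the actual implementation. The class name `Escalation`, the method `report_bypass`, and the random guardian ids are hypothetical. It assumes that each bypass tops the pool up to the next stage count, and that a bypass at the 27-guardian stage is what triggers protective shutdown. The text does not pin down either detail.

```python
import random
from dataclasses import dataclass, field

STAGES = (3, 9, 27)  # guardian counts per stage, taken from the text


@dataclass
class Escalation:
    """Hypothetical escalation ladder: 3 -> 9 -> 27 -> protective shutdown."""
    stage: int = 0
    shutdown: bool = False
    rng: random.Random = field(default_factory=random.Random)
    guardians: list = field(default_factory=list)

    def __post_init__(self):
        self.guardians = self._spawn(STAGES[0])

    def _spawn(self, count):
        # Random 32-bit ids stand in for randomized guardian configurations.
        return [f"g-{self.rng.getrandbits(32):08x}" for _ in range(count)]

    def report_bypass(self):
        """Escalate one stage; assumed to shut down on a bypass at the last stage."""
        if self.shutdown:
            return
        if self.stage == len(STAGES) - 1:
            self.shutdown = True
            self.guardians = []
            return
        self.stage += 1
        missing = STAGES[self.stage] - len(self.guardians)
        self.guardians.extend(self._spawn(missing))
```

Calling `report_bypass()` three times on a fresh `Escalation()` walks the pool through 9 and 27 guardians and then sets `shutdown` to `True`.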
The central hub does not trust any guardian's verdict alone — it requires k-of-n agreement on policy-relevant decisions. Disagreement is itself a signal.
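A k-of-n check of this kind might look like the sketch below. The names `Verdict` and `quorum` are illustrative assumptions, as is the choice to count only ALLOW votes toward the threshold. The disagreement flag is returned separately so the hub can treat a split vote as a signal even when the threshold is met.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"


def quorum(verdicts, k):
    """Return (decision, disagreement) for a mapping of guardian name -> Verdict.

    The decision is ALLOW only if at least k guardians voted ALLOW.
    disagreement is True whenever the guardians did not all vote the same way.
    """
    if not 1 <= k <= len(verdicts):
        raise ValueError("k must be between 1 and the number of guardians")
    counts = Counter(verdicts.values())
    disagreement = len(counts) > 1
    decision = Verdict.ALLOW if counts[Verdict.ALLOW] >= k else Verdict.BLOCK
    return decision, disagreement
```

With three guardians voting ALLOW, ALLOW, BLOCK, a threshold of `k=2` allows the request and flags the disagreement, while `k=3` blocks it.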
Prompt injection · jailbreak chains · multi-turn social engineering · automated bot floods · poisoned tool outputs.
Cerberus emits its verdicts into the OCTOREFLEX loop and the Constitutional Code Store. Every block becomes a sealed, auditable record.
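One common way to make records tamper-evident is a hash chain, sketched below. This is an assumption about what "sealed" could mean here, not a description of the Constitutional Code Store. The functions `seal` and `verify` and the record fields are hypothetical. Only `hashlib` and `json` from the Python standard library are used.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def seal(prev_hash, verdict):
    """Serialize a verdict dict and bind it to the previous record's hash."""
    body = json.dumps(verdict, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    return {"prev": prev_hash, "body": body, "hash": digest}


def verify(chain):
    """Return True if every record links to its predecessor and is unmodified."""
    prev = GENESIS
    for record in chain:
        expected = hashlib.sha256((prev + record["body"]).encode("utf-8")).hexdigest()
        if record["prev"] != prev or record["hash"] != expected:
            return False
        prev = record["hash"]
    return True
```

Because each hash covers the previous one, editing any earlier record invalidates every record after it, which is what makes the log auditable.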
Default-deny under disagreement, latency excess, or rate budget breach. Safety first. Performance second.
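The three deny conditions can be expressed as a single fail-closed gate. The sketch below is illustrative only. `Budget`, `decide`, and the threshold fields are hypothetical names, and "disagreement" is read here as fewer than k guardians voting to allow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    k: int                        # allow votes required (assumed meaning of agreement)
    max_latency_ms: float         # time the guardians may take to reach a verdict
    max_requests_per_window: int  # rate budget for the current window


def decide(allow_votes, latency_ms, requests_in_window, budget):
    """Return (allowed, reason). Every failed check denies; nothing fails open."""
    if requests_in_window > budget.max_requests_per_window:
        return False, "rate budget breach"
    if latency_ms > budget.max_latency_ms:
        return False, "latency excess"
    if allow_votes < budget.k:
        return False, "disagreement"
    return True, "quorum reached"
```

The request is allowed only on the last line, after every check has passed, so a slow or overloaded system denies rather than degrades.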
Cerberus rides the Waterfall firewall, reads from the Open Constraint Enforcement Engine, and reports into the Triumvirate's mutual-check loop.