⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security
Latent Space: The AI Engineer Podcast
2 DAYS AGO
In a world where AI safety is often reduced to brittle guardrails and closed-door evaluations, two figures stand out for challenging the status quo: Pliny the Liberator and John V. They represent a growing movement that prioritizes radical transparency, open-source collaboration, and deep technical intuition in the pursuit of meaningful AI security.
Pliny and John advocate for universal jailbreaks as tools to expose latent model behaviors, rejecting the notion that guardrails equate to real safety. They distinguish between hard, single-input jailbreaks and soft, multi-turn attacks: methods long known in hacker communities but only recently acknowledged by academia. Through projects like the open-source Libertas repository, they employ techniques such as predictive reasoning cascades and 'steered chaos' to push models beyond their training distributions.

Their collective, BT6, vets members on both skill and integrity, insisting on full transparency; they turned down closed bounties like Anthropic's Constitutional AI challenge over its lack of data openness. They also warn that segmented sub-agents can weaponize models like Claude for orchestrated real-world attacks, a threat Pliny anticipated months before official disclosures.

Ultimately, they argue that true AI safety lies not in restricting models but in full-stack, system-level defenses and open research grounded in meatspace realities.
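The hard/soft distinction the episode draws maps cleanly onto code. Below is a minimal, hypothetical sketch of a red-team probe harness: `hard_probe` fires one self-contained adversarial input, while `soft_probe` threads an entire conversation through the model so each turn builds on the last. The `ChatModel` callable, the function names, and the refusal heuristic are all illustrative assumptions, not anything from the episode, BT6, or a real library.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]                    # {"role": "user" | "assistant", "content": ...}
ChatModel = Callable[[List[Message]], str]  # any chat model: transcript in, reply out

def hard_probe(model: ChatModel, prompt: str) -> str:
    """A 'hard' jailbreak attempt: a single, self-contained adversarial input."""
    return model([{"role": "user", "content": prompt}])

def soft_probe(model: ChatModel, turns: List[str]) -> List[Message]:
    """A 'soft' jailbreak attempt: a multi-turn exchange that gradually steers
    the model, carrying the full transcript forward on every turn."""
    transcript: List[Message] = []
    for turn in turns:
        transcript.append({"role": "user", "content": turn})
        reply = model(transcript)
        transcript.append({"role": "assistant", "content": reply})
    return transcript

def refused(reply: str) -> bool:
    """Crude refusal heuristic, for logging only; real evaluations need
    something far more robust than substring matching."""
    markers = ("i can't", "i cannot", "i'm not able", "i won't")
    return any(m in reply.lower() for m in markers)
```

The useful signal in the soft case is where along the transcript `refused` flips from True to False, i.e. the turn at which the conversation drifted the model outside the behavior its guardrails were trained on.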
01:53 Freedom and transparency are critical as AI becomes an extension of human cognition
04:27 Why equating guardrails with safety is a mistake
14:53 Soft jailbreaks: multi-turn processes that gradually steer models toward liberation
16:22 Turning down a $30k bounty to stand for open-source AI data principles
23:35 Models can use natural language for social engineering in attacks
26:56 BT6: a white-hat hacker collective vetting members on skill and integrity