A Test Program Designed to Lose Money Ran in Production for 45 Minutes. Knight Capital Didn’t Survive It.

Dead code behind a repurposed flag, waiting for the input nobody tested. Knight Capital had one. Most codebases in 2025 carry hundreds like it.

Apr 22, 2026

Before 2003, Knight Capital ran a test program called Power Peg. It was an internal tool for market simulation: a deliberately destructive trading algorithm that bought high and sold low to move stock prices up and down, so Knight’s other algorithms could be validated against a controlled, moving target. Power Peg was never meant to see a live market. In 2003, Knight stopped using it. The code stayed in the codebase. Nobody removed it.

In 2005, during an unrelated refactor, an engineer moved a tracking function to an earlier point in the system’s execution sequence, disconnecting it from Power Peg. That function had one job: count the shares each order had already filled, and stop sending new orders once the total matched the parent order’s target. Without it, Power Peg would send orders forever. Nobody tested whether Power Peg still worked after the move. Why would they? Power Peg was retired.
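
To make the role of that tracking function concrete, here is a minimal sketch of a parent order being worked through child orders, with and without the cumulative-fill check. Every name in it (`ParentOrder`, `work_order`, `check_fills`, `max_children`) is hypothetical and not taken from Knight's code. It assumes each child order fills completely, and it adds a cap on child orders only so the broken case terminates.

```python
from dataclasses import dataclass


@dataclass
class ParentOrder:
    symbol: str
    target_shares: int
    filled_shares: int = 0  # cumulative fills across all child orders


def work_order(parent, child_size, check_fills=True, max_children=1000):
    """Send child orders for a parent order; return how many were sent."""
    sent = 0
    while sent < max_children:  # cap exists only so this sketch terminates
        if check_fills and parent.filled_shares >= parent.target_shares:
            break  # the brake: cumulative fills have reached the target
        if check_fills:
            # Never send more than what is still unfilled.
            size = min(child_size, parent.target_shares - parent.filled_shares)
        else:
            size = child_size
        parent.filled_shares += size  # assumption: every child fills completely
        sent += 1
    return sent


with_brake = ParentOrder("XYZ", target_shares=1000)
without_brake = ParentOrder("XYZ", target_shares=1000)
print(work_order(with_brake, 100), with_brake.filled_shares)
print(work_order(without_brake, 100, check_fills=False), without_brake.filled_shares)
```

With the check in place the loop stops at the target. Without it, the only thing that stops the loop here is the artificial cap, which a production router would not have.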

Seven years of commits piled on top. The code stayed in production, broken in a way that was invisible because nobody had any reason to run it.

In July 2012, NYSE announced a new Retail Liquidity Program, giving market makers roughly a month to prepare. An engineer writing Knight’s RLP integration reused the flag that had once controlled Power Peg to activate the new functionality. When the flag was set to yes, RLP would now activate, replacing Power Peg. The intent was that the old Power Peg code would be removed at the same time. It wasn’t.
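
The hazard of a repurposed flag can be sketched in a few lines: the same flag value means different things to different builds. The function names, the `"Y"` value, and the order shape below are illustrative assumptions, not Knight's actual interfaces.

```python
def old_build_route(order):
    # Old meaning of the flag: run the retired test algorithm.
    return "power_peg" if order.get("flag") == "Y" else "default"


def new_build_route(order):
    # Repurposed meaning of the same flag: run the new RLP path.
    return "rlp" if order.get("flag") == "Y" else "default"


# Seven servers on the new build, one still on the old build.
servers = [new_build_route] * 7 + [old_build_route]
order = {"symbol": "XYZ", "flag": "Y"}
results = [route(order) for route in servers]
print(results)
```

Nothing about the order tells the sender which meaning a given server will apply. A fresh flag name would have made the old build ignore the order instead of misreading it.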

On July 27, 2012, a Knight Capital technician began deploying the new RLP code to eight production servers running SMARS (Smart Market Access Routing System), the algorithmic order router that handled roughly 1% of all U.S. equity trading volume. The deployment was manual, done a few servers at a time over several days. The technician copied the code to seven servers. The eighth was missed. No second technician reviewed the deployment. Knight had no written procedures that required such a review.
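
A missed server is the kind of gap an automated post-deployment check is meant to close. The sketch below is one hedged way to do it, assuming each server can report a SHA-256 digest of the artifact it is running. The server names and artifact contents are invented for illustration.

```python
import hashlib


def fingerprint(artifact: bytes) -> str:
    """Return a SHA-256 hex digest identifying a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def stale_servers(expected: str, deployed: dict) -> list:
    """Return the names of servers whose digest differs from the expected build."""
    return sorted(name for name, digest in deployed.items() if digest != expected)


new_build = b"router build with RLP"        # hypothetical artifact contents
old_build = b"router build with Power Peg"  # hypothetical artifact contents

# Seven servers received the new build; the eighth was missed.
deployed = {f"server-{i}": fingerprint(new_build) for i in range(1, 8)}
deployed["server-8"] = fingerprint(old_build)

stale = stale_servers(fingerprint(new_build), deployed)
print(stale)
```

A release gate that refuses to proceed while `stale` is non-empty does the job of the second reviewer Knight's procedures did not require.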

Starting at 8:01 AM Eastern, the morning of August 1, Knight’s internal system generated 97 automated email alerts referencing SMARS and the error “Power Peg disabled.” Nobody was watching that inbox as an alert channel. The emails sat unread as the market opened.

At 9:30 AM Eastern, the New York Stock Exchange opened for trading and orders carrying the repurposed flag began reaching SMARS. Seven SMARS servers executed RLP as intended. The eighth ran Power Peg: a test program designed to lose money on purpose, now operating in a live market, against real counterparties, with no brake, no monitor, and no idea it was supposed to stop.

Over the next forty-five minutes, Knight Capital’s eighth server processed 212 parent orders and routed millions of child orders into the market, resulting in over four million trades across 154 stocks and more than 397 million shares. Knight’s response made it worse: believing the new RLP code was the problem, the team uninstalled it from the seven correctly deployed servers, which caused those servers to run Power Peg as well. All eight were now executing the dead code against a live market.

The firm took a loss of more than $460 million. Its stock dropped over 70% in two business days. On August 5, Knight raised $400 million in rescue financing led by Jefferies. Four months later, it agreed to be acquired by GETCO; the deal closed in 2013 and the Knight Capital name ceased to exist. Knight had gone, in forty-five minutes, from the largest equity trader on the NYSE to a footnote in compliance textbooks.

The SEC’s cease-and-desist proceedings in October 2013 laid out the technical chain in detail. Power Peg code preserved in production after its 2003 retirement. Its safety mechanism moved to a different part of the codebase in 2005, with no test on the dead code left behind. A flag repurposed in 2012 without auditing or removing the code it used to gate. A manual deployment with no written procedure requiring peer review. An alerting system that fired 97 times in the 90 minutes before market open and reached nobody with authority to stop the launch. Each a distinct hole. The holes aligned.

Pete Hodgson’s 2017 essay on feature toggles was explicit about this class of flag. Release toggles should be short-lived, removed as soon as the feature fully ships, and never repurposed. The flag and the code it gates should be deleted together. Hodgson also named the underlying math: N flags create 2^N possible system states, while test coverage grows linearly at best. The gap compounds.

The math is elementary. Here is what it looks like.
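
A short sketch makes the gap visible. It counts the possible on/off combinations for N independent boolean flags and sets that beside the number of tests needed to exercise each flag on and off in isolation. The helper names are illustrative, and the isolated-test count is a simplifying assumption about how flags are usually tested.

```python
from itertools import product


def all_states(flag_names):
    """Yield every on/off combination of the given flags as a dict."""
    for values in product((False, True), repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


def state_count(n_flags):
    """Number of distinct system states for n independent boolean flags."""
    return 2 ** n_flags


for n in (1, 10, 20, 100):
    isolated_tests = 2 * n  # each flag tested on and off, others left at default
    print(f"{n:>3} flags: {isolated_tests:>3} isolated tests, "
          f"{state_count(n)} possible states")
```

Ten flags already give 1,024 states against 20 isolated tests, and every further flag doubles the first number while adding two to the second.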

The industry read the part about shipping velocity and skipped the part about discipline. In 2025, a flag management vendor estimated that 20 trillion feature flag evaluations happen daily across the industry. Each new flag doubles the state space. Removals rarely keep pace. Modern codebases routinely carry hundreds of flags, many without owners, many without expiration, many whose original purpose is known only to engineers who left years ago. Every one of them is a small Power Peg, waiting for an input that looks just different enough from what anyone tested.

The practices that would have saved Knight are available today. Assign an owner and an expiration date to every flag at creation; Power Peg’s flag had neither, so it outlived the team that wrote it. Never repurpose a flag. When code is retired, remove the flag and the gated code together in the same commit; Knight left the code and reused the flag. Require peer review for flag-state transitions in production; Knight’s deployment procedure required no second pair of eyes. When a flag is finally removed, the removal goes through CI with a test that asserts the old code path is unreachable; had Power Peg been deleted under that rule when it was retired in 2003, the eighth server would have had no dead code left to run.
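
The last of those practices can be approximated with a static check in CI. The sketch below uses Python's standard `ast` module to confirm that a retired name is neither defined nor referenced in a source file. That is a weaker proxy than proving a path unreachable, and the `mentions` helper and the sample sources are hypothetical.

```python
import ast


def mentions(source: str, name: str) -> bool:
    """Return True if `name` is defined or referenced anywhere in `source`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == name:
            return True
        if isinstance(node, ast.Attribute) and node.attr == name:
            return True
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == name:
            return True
    return False


# Hypothetical module before the cleanup: the flag still gates the dead code.
BEFORE = '''
def power_peg(order):
    return "buy high, sell low"

def route(order):
    return power_peg(order) if order["flag"] == "Y" else "default"
'''

# The same module after the flag and the gated code are removed together.
AFTER = '''
def route(order):
    return "default"
'''

print(mentions(BEFORE, "power_peg"), mentions(AFTER, "power_peg"))
```

A CI job would run the same check over every file in the repository and fail the build if any retired name reappears.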

Look at your own flag dashboard. How many flags were added last quarter? How many were removed? How many have no owner? Knight had one flag, one piece of forgotten code, nine years of accumulating risk. Most codebases in 2025 carry hundreds of flags under similar discipline.
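
Those questions can be turned into a routine audit. The sketch below assumes a flag registry where each entry records an owner and an expiration date. The `Flag` type, the `audit` function, and the sample flags are all invented for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Flag:
    name: str
    owner: Optional[str]     # None means nobody is responsible for it
    expires: Optional[date]  # None means it was created without an end date


def audit(flags, today):
    """Map each problematic flag name to the list of issues found."""
    problems = {}
    for flag in flags:
        issues = []
        if not flag.owner:
            issues.append("no owner")
        if flag.expires is None:
            issues.append("no expiration")
        elif flag.expires < today:
            issues.append("expired")
        if issues:
            problems[flag.name] = issues
    return problems


registry = [
    Flag("new_checkout", "alice", date(2026, 9, 1)),
    Flag("legacy_pricing", None, None),
    Flag("old_banner", "bob", date(2024, 1, 1)),
]
report = audit(registry, today=date(2026, 4, 22))
print(report)
```

Run on a schedule, a report like this keeps the count of ownerless and expired flags in front of the team instead of leaving it to memory.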

Somewhere in your codebase, there is a Power Peg. A flag that got repurposed, code that got left behind, years of commits piling on top. Knight didn’t call theirs a powder keg. Nobody ever does.
