Articles have been written about private fill information for quite some time.
ESMA even asked trading firms and exchanges about their thoughts on private fill information, and summarized its findings starting on page 106 of this document. It makes for a fun read.
Still, the full importance of this mechanism is vastly underappreciated and understated even by many industry professionals, which makes this a good starting point for this little space.
What is a private fill
These points will be clear to most, but they bear repeating.
Every exchange has two categories of real-time channels:
"Public" channels: subscribers (typically for a hefty fee) receive data about the visible order book, trades etc.
"Private" channels: users send commands and the exchange sends replies to those commands and updates to any ensuing orders.
Every update to the displayed order book results in messages being sent down both channels:
On one hand, the public needs to know how the order book has changed.
The participants involved in the change of the book need to know what happened to their orders.
And here is the crucial bit: there is nothing, at least in principle, that requires updates corresponding to the same event to be disseminated across the two types of channel at the same time.
"Private fill information", or private fills in short, are:
Messages transmitted on a private channel
Indicating that one’s order has been filled, in full or in part
Whose arrival at colocated customers precedes the arrival of the public messages conveying the same information.
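The definition above can be made concrete with a small sketch. It assumes a participant timestamps both the private fill message and the matching public message at the same colocated capture point, and that the two messages have already been matched to one event. The names FillObservation, lead_ns and is_private_fill are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FillObservation:
    """Arrival times (in nanoseconds) of the two messages describing one fill.

    Hypothetical structure: both timestamps are assumed to come from the
    same capture point, so they are directly comparable.
    """
    private_arrival_ns: int  # fill message on the private (order entry) channel
    public_arrival_ns: int   # corresponding update on the public (market data) channel


def lead_ns(obs: FillObservation) -> int:
    """Lead of the private message over the public one; positive means private came first."""
    return obs.public_arrival_ns - obs.private_arrival_ns


def is_private_fill(obs: FillObservation) -> bool:
    """Under the definition in the text, a fill is 'private' when it precedes the public message."""
    return lead_ns(obs) > 0
```

With this convention, a fill that arrives after the public message simply has a negative lead and carries no informational head start.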
Private fills and the race to negative
For a long time, people have talked about the "race to zero" in HFT when referring to the latency of reacting to events. For many years, however, one exchange operator has systematically and deliberately taken that race to negative. The exchange in question is no less than the CME Group.
For the casual reader, the CME Group operates venues where many of the most liquid and consequential futures (and options, and more) trade massive volumes every day. On these venues, private fills are not a matter of chance or of a casual approach to technology: they are a deliberate effort that CME monitors and refers to publicly under the umbrella term "market dynamics". That suggests that they and some of their customers consider the issue to be quite important. It is worth spending some thought on why, which can be done through some links and some contrasts.
A short aside on the toxicity of flow
The much-debated maker-taker fee structure has arisen in many markets worldwide as a consequence, primarily, of the fact that liquidity providers face too much adverse selection on their resting orders. The model is simple. When a displayed order trades:
The participant who placed the order on the book receives a rebate.
The other participant, who is called the taker (for taking liquidity from the book), pays a fee.
This adverse selection can come from a lot of sources, but a critical one for liquidity providers goes like this:
LP rests orders on instrument X
Something happens on instrument Y, which is strongly correlated to X
In the extreme case, the instrument can even be the same, just across different venues.
Given this information, LP would like to cancel or reprice its order ("its" because every high-volume LP is an algorithm these days), but another participant is faster and trades against LP.
The above is a good chunk of why low latency matters in trading. It is also why rebates are quite large. The gap between the cheapest taker fee and the largest rebate (the relevant gap, as the concern is with high-volume participants) has to be high enough that opportunities for takers are not excessively frequent, and small dislocations happen a lot.
CME's solution
For exchange operators, the maker-taker model has two big drawbacks:
Margins get smaller: for each trade, you pay a good chunk of what you collect from one side to participants on the other side.
Trading is reduced: unwanted as it would be by the liquidity providers, all the flow coming from low-latency reactions is fee-paying flow all the same.
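The margin point can be put in numbers with a minimal sketch. The fee levels used in the usage note are hypothetical and do not come from any exchange's fee schedule; the function names are likewise illustrative. The comparison case, where both sides pay the same fee and nobody receives a rebate, is included only as a contrast.

```python
from decimal import Decimal


def maker_taker_net(taker_fee: Decimal, maker_rebate: Decimal) -> Decimal:
    """Exchange revenue per contract traded when the maker is paid a rebate.

    The exchange collects the taker fee and hands part of it to the maker.
    """
    return taker_fee - maker_rebate


def symmetric_net(fee_per_side: Decimal) -> Decimal:
    """Exchange revenue per contract traded when both sides pay the same fee."""
    return 2 * fee_per_side


# Hypothetical numbers, for illustration only.
example_taker_fee = Decimal("0.30")
example_rebate = Decimal("0.20")
example_net = maker_taker_net(example_taker_fee, example_rebate)
```

With these made-up inputs the exchange keeps 0.10 per contract under maker-taker, against 0.60 if it could charge 0.30 to each side. Decimal is used so the arithmetic stays exact.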
Many exchange operators consider this a fact of life and deal with it, but CME (with significant nudging from a major customer) came up with something different, which amounts to a whole new economic model. They structured their systems to practically guarantee that, for each trade:
The taker receives his fill information first
Next, a certain number of private fills are disseminated, in order of their book priority prior to the trade.
There is a small, curious exception to this discussed in the first link below.
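The ordering described above can be sketched as follows. This is a toy model, not a description of CME's actual systems: the parameter private_before_public stands in for the undocumented "certain number" of private fills that go out ahead of the public message, the exception mentioned above is not modelled, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RestingFill:
    owner: str
    priority: int  # 0 = front of the queue at the traded price before the trade
    filled_qty: int


def dissemination_order(
    taker: str,
    fills: List[RestingFill],
    private_before_public: int,
) -> List[str]:
    """Return the assumed sequence of messages for one trade event.

    Assumption: the taker is notified first, then the first
    `private_before_public` resting orders in book priority, then the
    public trade message, then the remaining resting orders.
    """
    ranked = sorted(fills, key=lambda f: f.priority)
    sequence = [f"private:{taker}"]
    sequence += [f"private:{f.owner}" for f in ranked[:private_before_public]]
    sequence.append("public:trade")
    sequence += [f"private:{f.owner}" for f in ranked[private_before_public:]]
    return sequence
```

In this model, the owners that appear before "public:trade" are the ones holding a private fill in the sense defined earlier; everyone after it learns nothing the public feed has not already said.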
While the model led to some abuse and some unwanted behavior, changes were put in place to allay those concerns, and the result is really quite special. It allows CME to:
Charge both sides of each transaction with a symmetric fee - no maker rebates, at least not on the most liquid and important contracts!
Not reduce trading volume dramatically: many participants will use private fill information only to cancel orders, but some go further and initiate trades on CME on the back of that information.
There are even frequent instances of cascades of trades: a participant gets a fill, decides it does not like it, and trades against someone further back in the queue, who ends up doing the same until the whole price level is gone.
Reward liquidity providers: given an early-enough place in queue that an order will get a private fill, the value of that order starts including the value of information it might earn, and orders further back in the queue have some probability of making it further ahead, etc.
At this point, this model might look like a marvellous innovation. It is good to abandon that perspective quickly. To that end, consider the following, in ascending order of importance:
Transparency: CME do not document these "market dynamics". Lack of documentation protects incumbents. The high volumes, low fees and large footprint required to acquire a good amount of private fills that can be monetized have the same effect.
Fairness: priority at the top of the book is often determined by races in the nanoseconds. The advantage of one queue spot in private fill dynamics, however, is often very large. Besides, if the reader is not keen on markets full of participants trading on private information, it should be quite disappointing to learn that one of the main pillars of the U.S. markets sells private information to trade on, and sells it more cheaply to its largest customers.
Markets do not trade in isolation, least of all highly liquid futures markets. Private information obtained in one market can be used to the detriment of participants in other markets.
This is problematic, as the model incentivizes participants to post the smallest quantity that still earns private fill information, and then to go and extract value elsewhere.
The last point is salient, and one would expect it to prompt vigorous lobbying with regulators against this model by other exchange operators. Ultimately, CME externalise the cost of their flow toxicity, and other exchange operators largely have to make up for that. Still, industry experience and lore suggest that CME have every reason to protect this model: several firms spend eight figures a year in fees and realized losses from orders placed with the sole purpose of gathering private fills. This is an easily justifiable sum given how useful that information can be. The larger a firm's footprint, the more opportunities it has to monetize a private fill, and therefore the higher the value of each such fill.
Some upcoming posts will illustrate through data some of the consequences of this model and its specific implementation on some of CME's own instruments. Some topics on the horizon:
Concrete illustrations of types of events where reactions to private fills are obvious and very profitable.
A blueprint for how to extract information from CME private fills.
Spoiler alert: just looking at the difference between receipt time and TransactTime is not enough at all.
Ample empirical evidence that participants reacting to private fills often take a lot of liquidity for a consistent profit. While only CME and the CFTC might be able to know for a fact, the author's information indicates that most large proprietary trading firms are consistently and conspicuously net liquidity takers.
The role of private fill information for queue formation.
The interested reader is invited to watch this space.
A few notes
Some exchanges have adopted architectures where private fills are rare, if present at all. Eurex is a good example of that. Participants can compete on latency, and transparency is brought to that competition via a multitude of high-accuracy timestamps. As the liquidity on Eurex is deep, this is a good counterexample for those who might claim that the CME model is needed to maintain deep markets.
CME's design makes it so that inferring fills from market data is also very useful: participants should not wait until they have received private information about a fill before reacting. Instead, they should arbitrate between private channels, where the information is explicit and will lead or lag the public channel depending on order priority, and the public channel, where the information is still very much present, although encoded differently.
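A minimal sketch of the inference from the public channel follows. It assumes strict price-time (FIFO) matching at the price level, which not every product uses, and that the participant has tracked the quantity resting ahead of its own order accurately. The function name and parameters are hypothetical.

```python
def inferred_fill(qty_ahead: int, own_qty: int, traded_qty: int) -> int:
    """Quantity of our resting order implied to be filled by a public trade.

    Assumptions (not taken from the source): FIFO matching, the trade
    printed at our order's price, and `qty_ahead` is the exact quantity
    queued in front of us just before the trade.
    """
    # Whatever the trade consumed beyond the queue ahead must have reached us.
    reached_us = max(0, traded_qty - qty_ahead)
    # We cannot be filled for more than we posted.
    return min(own_qty, reached_us)
```

A participant can act on whichever arrives first: a non-zero inferred_fill computed from the public feed, or the explicit fill on the private channel. Which one leads depends on the order's place in the queue, as described above.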