As we have exhaustively covered, getting accurate estimates of the actual economic throughput of public blockchains is not a trivial task. Due to the existence of mixers, self-churn, privacy enhancements, spam, and change outputs (in UTXO chains), raw estimates of transactional value are often misstated by a factor of 10 or more.

Today, we’re pleased to make our adjusted transaction value (USD) estimates available to you on our Charts and Data pages. Initially we are rolling this out for UTXO chains (Bitcoin, Bitcoin Cash, Litecoin, Zcash, Dash, Decred, Verge, and so on) and Ethereum. From the perspective of the blockchain, all transactions are equally valid – but from the perspective of an analyst or economist, it’s useful to isolate only the meaningful economic transactions, so that’s what we’ve done here. We have different methods for the UTXO chains and for Ethereum.

Our UTXO chain methodology

For UTXO chains, we used two heuristics to cull non-meaningful transactions:

  • the early spent output heuristic from Narayanan et al. (2017): outputs that are spent again within 4 blocks of being created are subtracted to avoid double-counting. This removes the self-churn typical of exchanges, mixers, and stress tests
  • the “obvious change” heuristic: outputs that are cycled directly back into the originating address are considered change and culled
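
As a rough illustration of how the two heuristics combine, here is a minimal sketch. The `Output` record, its field names, and the exact boundary used for "within 4 blocks" (a height difference strictly below the window) are assumptions made for this example, not a description of our production pipeline.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass
class Output:
    """Hypothetical record for one transaction output."""
    value: float                      # amount carried by the output
    address: str                      # address receiving the output
    created_height: int               # block height of the creating transaction
    spent_height: Optional[int]       # block height where it was spent, None if unspent
    input_addresses: FrozenSet[str]   # addresses that funded the creating transaction


def is_early_spent(out: Output, window: int = 4) -> bool:
    # Early spent output heuristic: the output is spent again shortly after
    # it was created. The strict "<" boundary is an assumption.
    return out.spent_height is not None and out.spent_height - out.created_height < window


def is_obvious_change(out: Output) -> bool:
    # Obvious change heuristic: the output returns to an address that
    # also funded the transaction.
    return out.address in out.input_addresses


def adjusted_volume(outputs: Iterable[Output], window: int = 4) -> float:
    # Sum only the outputs that survive both heuristics.
    return sum(
        o.value
        for o in outputs
        if not is_early_spent(o, window) and not is_obvious_change(o)
    )
```

Summing `value` over every output gives the raw figure; `adjusted_volume` gives the culled one, still in native units before any USD conversion.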

Note that we index the early spent output heuristic to Bitcoin, the chain it was originally designed for. We cull outputs spent again within 4 blocks on Bitcoin and scale that window by block time on other chains, which works out to 16 blocks for Litecoin and 40 blocks for Dogecoin.
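
The arithmetic behind those block counts can be sketched as follows. The target block times are the well-known protocol values for each chain; the function name and the rounding rule for chains whose ratio is not a whole number are assumptions for illustration.

```python
# Target block intervals in seconds.
BLOCK_TIME_SECONDS = {
    "bitcoin": 600,   # 10 minutes
    "litecoin": 150,  # 2.5 minutes
    "dogecoin": 60,   # 1 minute
}


def scaled_window(chain: str, bitcoin_window: int = 4) -> int:
    """Convert a window expressed in Bitcoin blocks into blocks on another chain.

    The window covers the same wall-clock time (4 * 10 = 40 minutes by default).
    Rounding to the nearest block is an assumption for non-integer ratios.
    """
    window_seconds = bitcoin_window * BLOCK_TIME_SECONDS["bitcoin"]
    return round(window_seconds / BLOCK_TIME_SECONDS[chain])
```

With the default window, `scaled_window("litecoin")` gives 16 and `scaled_window("dogecoin")` gives 40, matching the figures above.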

We chose not to use the “non-obvious change” heuristic (Meiklejohn et al. 2013), in which outputs are culled if they are cycled back into originating addresses after several hops, because it required the manual removal of false positives.

Picking appropriate heuristics to subtract “non-economic volume” is a delicate balance between removing noise and avoiding false positives. We want to isolate traffic which is clearly meaningful without being too aggressive in removing valid transactions. Unfortunately, some of the more interesting heuristics ran the risk of flagging many false positives, which would then have had to be added back in.

We want scalable processes, not highly idiosyncratic ones. If we later identify methods that fit this tradeoff, we will add them in.

We’re very pleased with our initial efforts here and our output culling has a meaningful effect on volume. We will continue to refine our methods and are considering adding in additional heuristics.

Our Ethereum methodology

We have done two things with our Ethereum estimates, one which increases the economic volume and one which decreases it. The first thing we’ve done is add indirect transfers of ETH to the ETH transaction volume figures. That will increase both the ‘raw’ and ‘adjusted’ figures (slightly). The second is tackling the issue of the Ethereum mixer.

It is well established that Ethereum has hosted a colossal mixer for large portions of the last 18 months. This was first established by cyber.fund and subsequently by bloxy.info. We had run into long transaction chains on Etherscan and decided to repeat the analysis and confirm the existence of the mixer ourselves.

Since Ethereum is an account-based system, not a UTXO system, the UTXO heuristics don’t apply. However, Ethereum is also a ripe candidate for the culling of non-economic volume due to the prevalence of this mixer and its influence on the chain.

The mixer detection methodology has been fairly well documented previously, but we will briefly detail our efforts here. First, we parsed the entire Ethereum chain and SQL-ized it (that’s the first step in virtually any blockchain analysis). Then we identified all “one-time addresses”: addresses for which less than one day elapsed between the first-ever and last-ever transaction. Then we built a graph in which one-time addresses are vertices and two vertices share an edge if there is a transaction between the two addresses. We found that the largest connected component of this graph contained 96.7% of all one-time addresses and 97% of all the volume flowing through one-time addresses.
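
The steps above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than our actual implementation: transfers are represented as hypothetical `(sender, receiver, timestamp_seconds, value)` tuples, an address with a single transaction is treated as one-time (its lifetime is zero), and the graph is undirected.

```python
from collections import defaultdict, deque

ONE_DAY = 24 * 60 * 60  # seconds


def one_time_addresses(transfers, max_lifetime=ONE_DAY):
    """Addresses whose first and last transactions are less than max_lifetime apart."""
    first, last = {}, {}
    for sender, receiver, ts, _value in transfers:
        for addr in (sender, receiver):
            first[addr] = min(first.get(addr, ts), ts)
            last[addr] = max(last.get(addr, ts), ts)
    return {addr for addr in first if last[addr] - first[addr] < max_lifetime}


def largest_component(transfers, vertices):
    """Largest connected component of the graph whose vertices are one-time addresses."""
    # An edge exists only when both ends of a transfer are one-time addresses.
    graph = defaultdict(set)
    for sender, receiver, _ts, _value in transfers:
        if sender in vertices and receiver in vertices:
            graph[sender].add(receiver)
            graph[receiver].add(sender)

    seen, best = set(), set()
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best
```

Dividing the size of the returned component by the number of one-time addresses gives the kind of share quoted above. On the full chain a graph library or a union-find structure would likely be a better fit than this in-memory search.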

We then assume that this largest connected component is the mixer, and arrive at figures for the mixer which are virtually identical to those found by bloxy.info and cyber.fund.

Thus, our adjusted volume (USD) figures for Ethereum are those in which the mixer is absent. They are dramatically different from the unadjusted figures.
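
To make the raw versus adjusted distinction concrete, here is a minimal sketch. It assumes the same hypothetical `(sender, receiver, timestamp_seconds, value)` transfer tuples and a set of addresses already attributed to the mixer. Excluding any transfer that touches a mixer address is one plausible reading of "the mixer is absent", chosen here for illustration only.

```python
def raw_and_adjusted_volume(transfers, mixer_addresses):
    """Return (raw, adjusted) volume, where adjusted drops transfers touching the mixer."""
    raw = sum(value for _sender, _receiver, _ts, value in transfers)
    adjusted = sum(
        value
        for sender, receiver, _ts, value in transfers
        if sender not in mixer_addresses and receiver not in mixer_addresses
    )
    return raw, adjusted
```

Both figures are in native units here; a USD series would multiply each transfer by a price for its timestamp before summing.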

We will follow up with a longer post addressing our mixer methodology in depth but we are comfortable saying the following:

  • we have sanity-checked the mixer transactions and do not think we are mistaking normal transactional activity for mixer volume; these transactions are highly idiosyncratic
  • we do not think this is an unintentional artifact of Ethereum; this appears deliberate

Regarding the use of the ‘mixer’ nomenclature – we are not alleging that this is linked to any illegal activity or money-laundering; but it does look like a deliberate effort to mix coins and obfuscate their origin and destination. We are using the term neutrally – you could call it a stirrer or blender if you want.

To conclude, our raw estimates include the impact of the mixer, and our adjusted estimates remove the mixer volume. We believe the adjusted estimates are more meaningful from an economic perspective.

We’re pleased to offer better estimates for economic volume for both UTXO chains and Ethereum, and we will keep refining our adjustments and broadening their scope in the coming weeks and months.