Sietch - technology to reduce metadata leakage of Zcash Protocol transactions https://hush.is
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

227 lines
14 KiB

5 years ago
# Sietch
5 years ago
## What is Sietch?
5 years ago
Sietch is the codename for a new set of RPC functionality and consensus rules related to
5 years ago
using shielded funds. Shielded transaction creation is modified above
the level of the Zcash Protocol itself, no changes to that
layer are made. Additionally, there are no changes at how the RPC layer
is used, i.e. Sietch is implemented in hushd internals and existing software
which uses `hush-cli` and/or the RPC interface are not required to change
in any way. This preserves compatibility with all existing Zcash protocol
ecosystem software at a very low level.
4 years ago
Sietch defines certain wallet and transaction conditions that must be
5 years ago
upheld for transactions to be accepted by the network as well as making all clients have
new default behavior that is more privacy-preserving. This involves both
4 years ago
new wallet behavior that is non-consensus (currently implemented) and also future consensus rules
to enforce this new default behavior.
5 years ago
Sietch does not modify anything about Zcash internals such as zk-SNARKs,
cryptographic primitives or changing any of the deep zero knowledge math involved, such
4 years ago
as Elliptic Curves parameters, datatypes or other cryptographic machinery.
Any Zcash Protocol coin could adopt this "upgrade" to Sietch-enabled Hush Protocol,
5 years ago
though it should be noted that Hush was the first pure Sapling Zcash Protocol
coin and so our implementation ignores Sprout zaddrs. This is because we thankfully
have no Sprout zaddrs or transactions on our v3 mainnet.
Sietch is compatible with older sprout zaddrs and supporting
it is left as an exercise to the interested privacy coin.
5 years ago
5 years ago
## How is Sietch implemented?
Sietch is written in C++ and inside the hushd full node code, which means
all existing code that interacts with hush can benefit *with no changes*.
Some optional/advanced features in the future
may be implemented at the GUI wallet level. The goal is that the user does not need
to think or do anything different, to have more privacy when
making transactions.
5 years ago
Since Sietch works at the hushd level, all existing code, such as GUI wallets,
mining pools and exchanges, require no code changes. They will make a shielded
transaction as they always have, via the same exact RPC methods and parameters.
From the perspective of the average GUI wallet user, It Just Works.
There are no additional mandatory steps for the user when sending
5 years ago
each transaction. No mandatory wallet maintenance will be needed
to keep their privacy. No confusing options to "get right" on each transaction.
The same goes for exchanges, mining pools and all Hush users.
5 years ago
No changes at all are required by end users to enable Sietch, other than upgrading
to compatible Hush software.
5 years ago
There will be some options for advanced users, to
5 years ago
tweak the settings, but using them will not be required for increased privacy.
Hush is dedicated to all future privacy features being defaulted to ON instead
of asking our users to opt-in to privacy, which we know is a fools errand.
5 years ago
Users will not have to opt-in to Sietch and there will be no way to turn off
4 years ago
the functionality. Additionally, Sietch will be enabled on mainnet as an opt-in feature
at first, as a first wave of adoption and for testing. When all Hush users have updated
to software with Sietch-enabled Hush protocol, a new consensus rule will go into effect
which will prevent older non-Sietch clients. Additionally, Hush will increase it's
PROTOCOL\_VERSION with the Sietch feature, which will also allow Sietch-enabled Hush nodes
to effeciently block potentially malicious non-Sietch hushd clients at the p2p level.
5 years ago
## What are the goals of Sietch?
* Make linkability analysis of fully shielded (z2z) transactions drastically more expensive
* Allow people to make zaddr xtns safely/privately without thinking about
metadata leakage or advanced techniques
5 years ago
* Prevent average users from making some blockchain operations which give out too much metadata
* Break assumptions in blockchain analyst software
* Require blockchain analysts to write new software
* Increase computational cost of current blockchain analysis
* Increase the privacy of all funds in the Hush shielded pool
5 years ago
* Introduce non-determinism to counter ITM/Metaverse style metadata attacks
5 years ago
5 years ago
## What are the limitations/downsides of Sietch?
5 years ago
5 years ago
The goal of Sietch is increased privacy at the cost of increased blockspace usage per transaction,
5 years ago
increased CPU time and RAM to validate transcations and as a secondary effect, increased sync times.
5 years ago
Sietch potentially adds inputs and outputs to transactions to increase their privacy,
and so the maximum number of actual recipients is lower when Sietch is enabled. In practice,
transactions can still send to over 1000 recipients even with Sietch protections, so it
doesn't effect normal usage.
5 years ago
5 years ago
How much longer will average zxtn take? Research: How many zouts while staying under <10s?
5 years ago
5 years ago
Potentially more txfees if the transaction becomes so large to require more than the default fee.
No increased RAM usage, as creating ShieldedSpends is a serial process, but shielded transactions
5 years ago
will take longer in CPU seconds, since there will be more recipients by default.
5 years ago
## Implementation Details
5 years ago
These RPCs are modified directly:
5 years ago
```
z_sendmany
z_shieldcoinbase
z_mergetoaddress
5 years ago
```
All Hush RPCs that make shielded transactions have Sietch enabled.
5 years ago
These RPCs interact with zaddr xtns and may report different or additional info after this change:
```
z_viewtransaction
z_listtransactions
listunspent
```
4 years ago
These new RPCs were added in Sietch:
```
z_listnullifiers
```
5 years ago
### z\_sendmany Rule of Seven
Our goal is to be non-deterministic while also inserting enough zouts such that we have at least N=7 zouts.
Since the normal case is to have 2 zouts (one recipient for a z2z and one change zout back to sending address),
the normal case will be to add 5 zouts to a normal z2z xtn.
4 years ago
* For a z=>t, we must add 7 zouts
* For a t=>z we must add 6 zouts
* For a t=>(z,z) we must add 5 zouts
* For a z=>z we must add 6 zouts
5 years ago
The reason N=7 is chosen is because of the simple fact that `6!=720` while `7!=5040`. This parameter is chosen
in response to the ITM attack, which relies on a small number of zouts and doing combinatorial algorithms on all
possibilities. These combinatorial algorithms increase in state space for each link in a long chain of transactions.
Traditionally each xtn only has a few zouts and so because `2!=2` and `3!=6`, the combinatorial explosion does not
have a chance to slow down the ITM attack.
Now, consider 5040 choices at each link in a chain, and say we are studying a chain of length L=10.
Compared to pre-Sietch, we would have `2^10=1024` possibilities versus
```
5040^10 = 10575608481180064985917685760000000000
````
possibilities. Even for short chains of xtns, the combinatorial explosion of possibility renders the ITM attack
extremely expensive. It limits it to searching for linkability metadata in only short chains with lots of
additional supporting metadata, effectively raising the bar for attack and removing many potential attackers
who do not have sufficient resources. In pre-Sietch code, the ITM attack can potentially research very long chains,
dozens, hundreds and perhaps thousands of transactions in length, with commodity hardware.
Sietch exponenentially increases the cost of doing this, in RAM, CPU and running-time. Either an attacker would
need botnet or supercomputer-level resources to attack the same length chains as pre-Sietch code, or more likely,
de-anonymizing attackers will focus on studying short linkability chains with a lot of additional metadata from
timing analysis, amount analysis and potentially passive or active dust attacks.
5 years ago
## Non-determinism
5 years ago
Combinatorial explosion can only protect us so much. It is only one layer of defense and surely will not save us
from the inevitable quantum computers which are being optimized every day.
5 years ago
When adding zutxos, to break the ITM/Metaverse metadata attacks at a deep level, we must break a deep assumption
that is baked deep into Bitcoin wallet behavior: determinism.
5 years ago
Given the exact same wallet.dat, Bitcoin, Zcash and Hush will act in *exactly the same way* when sending a transaction, every
single time. Given all input data, one can predict the behavior of what will happen. Obviously, this is a very good idea in Bitcoin,
for total supply and issuance accountability. But this also contributes to Bitcoin being a "surveillance coin", that perfectly
preserves metadata until the end of time.
5 years ago
Zcash had no reason to change this and they did not, and so this behavior is baked in deep to all Bitcoin and Zcash forks.
In light of the ITM/Metaverse attack, this determinism is considered dangerous by the author. The reason is that the
ITM/Metaverse attack utilizes the fact that wallet operations are predictable to extract more metadata than previously thought possible from z2z and t=>z xtns. The only way to prevent that is to break the assumption of predictable wallet behavior.
4 years ago
## Downsides/Issues/Attacks
Code which uses purely raw transactions and broadcasts them to the network needs to do extra work to support Sietch.
This is why SDL has it's own implementation of Sietch, and any kind of hardware wallet support that uses raw transactions
in the future, would need to learn about Sietch.
### Metadata Attacks Against Sietch
Yes, Sietch can be metadata-attacked itself!
TLDR: Worst case is that an attacker steals wallet.dat files and does an immense amount of work to reduce privacy back to pre-Sietch levels, using the ITM attack.
This research became [Attacking Zcash Protocol For Fun And Profit](https://attackingzcash.com)
4 years ago
At first there was a single Sietch implementation of 200 zaddrs that were fixed. If the wallet.dat owning those zaddrs were stolen, that could be used to delete all the Sietch "privacy dust" from the transaction graph, making the job of blockchain analysts and/or ITM attackers much easier. This theoretical attack is what prompted dynamic Sietch addresses and also a better way of generating zaddrs inside SDL: using [HIP39](https://git.hush.is/hush/hips/src/branch/master/hip-0039.md) seed-phrases to generate a single zaddr and then delete the seed phrase. This method leaves no wallet.dat on disk to steal and the private key material for the zaddr only existed in memory a short time.
4 years ago
Additionally, each zaddr is not linkable to each other since they were derived from different seedphrases. SDL uses this method in production currently.
Originally, `hushd` used 200 static zaddrs and SDL 10,000 static (HIP39-derived) zaddrs in `SDL`. hushd generates random public keys to derive each zaddr,
achieving unlinkability while avoiding the added cost of generating seedphrases.
There is no wallet.dat to steal to recover data for most Sietch zoutputs. This leaves a dead-end in blockchain analysis software, that prevents algorithms from being effective.
Dynamic Sietch zaddrs make the entire process much more secure by preventing analysts/attackers from even knowing the zaddrs that could potentially be a Sietch output. These dynamic Sietch zaddrs are generated at run-time and private keys never even written to disk, nor part of the `hdseed` of any wallet.dat in the case of `SDL`.
4 years ago
4 years ago
## Conclusions
Sietch-enabled Hush Protocol can be thought of as using the ideas of combinatorial explosion and non-determinism to thwart brand-new
blockchain analysis techniques. Non-determinism is the stronger weapon, but it does not add enough privacy unless we add in the appropriate amount of combinatorial explosion to linkability analysis. Together they are a potent weapon which also give us a knob
to turn to increase future security, i.e. the minimum number of zaddr outputs allowed in a transaction.
This is why both techniques complement each other and have a greater privacy improvement when used together.
5 years ago
4 years ago
## Current Implementations
Dynamic Sietch addresses are in production in both hushd and SDL currently. More blockchain history uses the
new dynamic addresses, so any attacks against the first implementations have been mitigated.
Originally there were 4(!) implementations of Sietch in Hush world, 2 inside of `hushd` internals and 2 for `SilentDragonLite` which uses raw transactions and not the RPC interface of `z_sendmany`. Each of the 2 implementations has a static (drawing from a fixed pool of Sietch zaddrs) and a dynamic version (dynamically generating Sietch zaddrs at run-time). The static implementations went into production as of `Hush 3.3.0` and `SilentDragonLite 1.1.3`
## Frequently Asked Questions
### Why do some Sietch tx's have only 3 shielded outputs?
z_shieldcoinbase, which shields coinbase (mining) funds, uses 3 outputs instead of the normal sietch ~8 zouts because it's trivial to see that it's not a normal z2z, because it has taddr inputs. Trying to blend in with other z2z tx's just won't work. So instead, metadata leakage is reduced by creating 3 shielded outputs (zouts) which makes it harder for attackers to figure things out. 8 outputs could have been used for these transactions as well, but that doesn't hide the fact that they are shielding mining funds, takes up more bytes on the blockchain and makes the transactions slower to create. The reason ~8 was a good choice for z2z tx's is because sometimes you are sending to a few addresses, and ~8 zouts hides that metadata leakage. But when you shield coinbase funds, you are always sending to a single address, so 3 zouts is sufficient.
## Does Sietch protect transactions with more than 8 outputs?
Currently it does not. Originally this question was asked roughly as "If a shielded tx has M outputs where M>8, will Sietch create a transaction with M+8 outputs?" The answer is "No" because that wouldn't actually reduce metadata leakage. For example, if you make a ztx with 10 outputs, and Sietch always turned that into 18 outputs, then an attacker would know there was originally 10 outputs and 8 were added. Always adding a fixed number of outputs does not reduce metadata leakage.
## Is it possible for Sietch to reduce metadata leakage on transactions with more than 8 outputs?
Yes. This will be implemented as a future improvement to how Sietch works.