Difference between revisions of "Thin Client Security"

From Bitcoin Wiki
(See talk page)
(tweak category sort key)
 
(12 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
Recently there have been a number of proposals for bitcoin clients which do not store a complete copy of every block in the entire block chain.  This page will refer to all such clients as "thin clients".  This page is meant to be a place to try to make sense of the security and trust implications of the various schemes.
 
  
== Block Height vs. Depth ==
+
== Full Node vs. Thin Clients ==
  
 
It is important to distinguish between block height verification and block depth verification.
 
  
A client verifies the height H of a block by checking that there are H block '''before''' it, all of which are well-formed and obey the maximum-difficulty-adjustment-rate rule.  Currently only the Satoshi client,  libbitcoin, and btcd do block height verification.  Block height is the fundamental anchor of trustless security in the Bitcoin system.
+
A full node client verifies that all preceding blocks are valid in order to guarantee that a transaction is valid.  Currently only the Satoshi client,  libbitcoin, and btcd do full node verification.  Full nodes are the fundamental anchor of trustless security in the Bitcoin system.
  
A client verifies the depth D of a block by checking that there are D blocks '''after''' it (also called "confirmations"), all of which are well-formed.  SPV clients substitute block depth for block height as a transaction validity check.  All clients use block depth as a measure of the liklihood of a [[Chain_Reorganization|block chain reorganization]] producing a new longer fork which excludes the transaction.
+
A client verifies the depth D of a block by checking that there are D blocks '''after''' it (also called "confirmations"), all of which are well-formed.  Thin clients don't verify the preceding blocks, they use the number of confirmations (whether they are valid or not) as a measure of the likelihood of a [[Chain_Reorganization|block chain reorganization]] producing a new longer fork which excludes the transaction.
  
See also [https://bitcointalk.org/index.php?topic=88208.msg987429#msg987429 some comments on probabilistic verification of block height].
+
== Full Node Clients ==
 
 
== Full-Node Clients ==
 
  
 
The "thick" bitcoin client downloads a copy of the entire chain, including all transactions (not just headers).  It will be used as the reference point for security comparisons below.
 
  
A full-node client uses the [[Protocol_rules#Blocks well-formed|difficultywise-longest]] valid block chain it can find. A transaction's ''depth'' (the number of blocks ''after'' it) is used to determine the likelihood of the transaction being double-spent due to the emergence of a longer fork.
+
A full-node client uses the [[Protocol_rules#Blocks well-formed|difficultywise-longest]] valid block chain it can find. A transaction's ''depth'' (the number of blocks or confirmations ''after'' it) is used to determine the likelihood of the transaction being double-spent due to the emergence of a longer fork.
  
 
==== [[bitcoind|bitcoin-qt]] ====
 
Line 21: Line 19:
 
==== [https://github.com/conformal/btcd btcd] ====
 
  
== Header-Only Clients ==
+
==== [[Libbitcoin|libbitcoin-server]] ====
 +
 
 +
=== Block Retention ===
  
These client downloads a complete copy of the headers for all blocks in the entire block chain.  This means that the download and storage requirements scale linearly with the amount of time since bitcoin was invented; it would be preferable to have the scaling be logarithmic or even constant.
+
Once a full-chain client has downloaded the entire chain, it typically retains it (as the Satoshi client did/does).
  
=== Simplified Payment Verification (SPV) ===
+
Satoshi's original paper mentions the possibility of pruning individual transactions, which allows for full nodes which verify the entire transaction history but do not retain it. Because users are required to download and verify the block chain from some other node initially, this change isn't costless.
  
This scheme is described in section 8 of the [http://bitcoin.org/bitcoin.pdf original bitcoin whitepaper].
+
== Thin Clients ==
  
==== Block '''Depth''' as a Transaction Validity Check ====
+
This client downloads a complete copy of the headers for all blocks in the entire block chain.  This means that the download and storage requirements scale linearly with the amount of time since Bitcoin was invented.
  
As Satoshi writes, "[the thin client] can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it."  If we take "X" to be the "number of blocks added after it", then SPV essentially trusts that a transaction X blocks deep in the chain does not have inputs which were already spent further back in the chain. Therefore, the validity of a transaction is determined by its depth -- i.e. how many blocks come ''after'' it.  Other thin client protocols also include this assumption.
+
This scheme is described in section 8 of the [http://bitcoin.org/bitcoin.pdf original bitcoin whitepaper].
  
This is very different from the trust model in the "thick" client: the thick client verifies that a transaction's inputs are unspent by actually checking the whole chain up to that point -- there is no "X blocks deep" involved here.  The thick client uses "X blocks deep" (aka "confirmations") only once it has already decided that a transaction is valid (i.e. no [[Double-spending|double-spends]]).  At that point it uses "X blocks deep" to decide how likely it is that a longer fork in the chain will emerge which excludes that transaction.
+
==== Block Depth Check ====
  
It is very important to understand how the same property ("X blocks deep") is used to verify two different properties in the thick client and SPV cases.  '''The thick client never uses block depth as a measure of transaction validity; the SPV client does'''.
+
As Satoshi writes, "[the thin client] can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it."  If we take "X" to be the "number of blocks added after it", then a thin client essentially trusts that a transaction X blocks deep will be costly to forge.
  
This is a concern in a situation where an SPV client is subjected to a double-spend attack by somebody who controls its network connection.  For example, suppose you are at a wi-fi cafe and are paying for something using your smartphone -- the cafe owner controls your network connection. Satoshi acknowledges this implicitly when he writes that "the verification is reliable as long as honest nodes control the network" -- to be completely pedantic, this means that the verification is reliable as long as honest nodes control '''the part of the network that the SPV client is able to communicate with'''.  In an attack-by-ISP scenario this may not be a sufficiently strong security property.  The attacker would not need to overpower "the rest of the network" because the client is unable to communicate with it.
+
This is very different from the trust model in the "thick" client: the thick client verifies that a transaction's inputs are unspent by actually checking the whole chain up to that point -- there is no "X blocks deep" involved here. At that point it uses "X blocks deep" to decide how likely it is that a longer fork in the chain will emerge which excludes that transaction.
  
  
 
==== [[BitCoinJ|bitcoinj]] ====
 
  
Simplified Payment Verification is the verification mechanism used in [[BitCoinJ|bitcoinj]].
+
A security analysis of some of the issues in bitcoinj can be found [https://bitcoinj.github.io/security-model here]; however:
 
 
A security analysis of some of the issues in bitcoinj can be found [http://code.google.com/p/bitcoinj/wiki/SecurityModel here]; however:
 
  
 
* The claim that "picking 10 nodes and requiring all of them to be consistent needs much less trust" overlooks the problem of [https://en.bitcoin.it/wiki/Weaknesses#Cancer_nodes "cancer nodes"] and [http://en.wikipedia.org/wiki/Sybil_attack Sybil attacks].
 
Line 50: Line 48:
  
 
==== [https://bitcointalk.org/index.php?topic=128055.0 picocoin] ====
 
 
Simplified Payment Verification is the verification mechanism used in picocoin.
 
  
 
The library (libccoin) that picocoin is based on includes code for validating scripts and blocks; this could potentially be used to implement a full-chain client.
 
Line 68: Line 64:
 
If such UOT hashes were included in the block chain, a client which shipped with a [https://en.bitcoin.it/wiki/Vocabulary checkpoint] block that had a UOT would only need to download blocks after the checkpoint.  Moreover, once the client had downloaded those blocks and confirmed their UOTs, it could discard all but the most recent block containing a UOT.
 
  
This would also let a thin client reduce the question of "is this output unspent" to the question of "is this block super-well-formed" where "well-formed" means "well-formed according to the normal block chain rules and additionally has an Unused Output Tree which is accurate and truthful".  This is still a long way from the low level of trust involved in the thick client, but it is a major improvement over all existing proposals.
+
Hostile miners may insert blocks into the chain which have what claims to be a UOT, but which is actually invalid.  It is unlikely that such blocks could be kept out of the chain because, again, this would require adding a new block validity criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money.  Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted.
 
 
It is unlikely that bitcoin would ever arrive at a state where every single block had a UOT, since this would require upgrading 100% of the miners on the network, or else convincing enough miners to reject blocks which do not contain a UOT.  The latter strategy risks creating block chain forks, which can be expensive (in reward terms) to miners.  Therefore, any UOT strategy would need to cope with the fact that not every block contains a UOT.
 
 
 
Hostile miners may insert blocks into the chain which have what claims to be a UOT, but which is actually invalid.  It is unlikely that such blocks could be kept out of the chain because, again, this would require adding a new block well-formedness criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money.  Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted.
 
  
 
Note that at the present moment no standard format for such Unused Output Tree hashes has been agreed upon, nor do any of the blocks in the chain contain them.  The [https://bitcointalk.org/index.php?topic=91954 ultraprune] feature added to bitcoind-0.8 maintains a similar data structure on the client's disk.  It does not put this data structure or its hash anywhere in the block chain.
 
Line 78: Line 70:
 
== Server-Trusting Clients ==
 
  
These clients involve some (usually low) level of trust in the server they rely upon.  Mechanisms for authenticating the server, and for confirming that the server has not been compromised, are usually not explained.
+
These clients involve a high level of trust in the server they rely upon.  Mechanisms for authenticating the server, and for confirming that the server has not been compromised, are usually not explained.
  
 
All thin clients listed below currently connect to a single server, and are vulnerable to an attack similar to a double-spend. The attack can be run by that single server - the server can just lie to them that they received a Bitcoin transaction, and they, assuming the server does not lie, perform some service, transfer funds or send goods without actually receiving any Bitcoin in exchange. Therefore, they are implicitly trusting it.
 
Line 94: Line 86:
 
* The [[Weaknesses#Sybil_attack|sybil attack (also known as "cancer nodes")]] paragraph explains some of the issues with thin clients that base security on trusting whatever "a majority of the IP addresses I can see" say.
 
 
* [http://bitcoin.stackexchange.com/questions/2613/how-secure-are-various-models-of-bitcoin-clients related discussion on Stack Exchange]
 
* A hypothesized [https://bitcointalk.org/index.php?topic=134318.msg1441171#msg1441171 intermediate security class] between SPV and full-chain validation.
+
* A hypothesized [https://bitcointalk.org/index.php?topic=134318.msg1441171#msg1441171 intermediate security class] between thin clients and full-chain validation.
  
 
<references>
 
<references>
  
 
[[Category:Technical]]
 
[[category:Clients]]
+
[[category:Clients| ]]
 
[[Category:Security]]
 

Latest revision as of 08:20, 22 May 2018

Recently there have been a number of proposals for bitcoin clients which do not store a complete copy of every block in the entire block chain. This page will refer to all such clients as "thin clients". This page tries to make sense of the security and trust implications of the various schemes.

Full Node vs. Thin Clients

It is important to distinguish between verifying the blocks that come before a transaction (full validation) and counting the blocks that come after it (block depth verification).

A full node client verifies that all preceding blocks are valid in order to guarantee that a transaction is valid. Currently only the Satoshi client, libbitcoin, and btcd do full node verification. Full nodes are the fundamental anchor of trustless security in the Bitcoin system.

A client verifies the depth D of a block by checking that there are D blocks after it (also called "confirmations"), all of which are well-formed. Thin clients don't verify the preceding blocks; they use the number of confirmations (whether or not those blocks are valid) as a measure of the likelihood of a block chain reorganization producing a new longer fork which excludes the transaction.

Full Node Clients

The "thick" bitcoin client downloads a copy of the entire chain, including all transactions (not just headers). It will be used as the reference point for security comparisons below.

A full-node client uses the difficultywise-longest valid block chain it can find. A transaction's depth (the number of blocks or confirmations after it) is used to determine the likelihood of the transaction being double-spent due to the emergence of a longer fork.
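The link between depth and double-spend risk can be made concrete with the calculation given in section 11 of the original bitcoin whitepaper. The following is a minimal Python sketch of that calculation; the function name is ours, and it assumes the attacker holds a fixed share q of the total hash power and starts z blocks behind.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with hash-power share q ever overtakes
    a chain that is z blocks ahead (model from whitepaper section 11)."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("q must be in [0, 1] and z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0  # an attacker with half or more of the hash power always catches up
    lam = z * (q / p)  # expected attacker progress while the honest chain mines z blocks
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


if __name__ == "__main__":
    # Risk falls off quickly as the transaction gets deeper.
    for depth in (0, 1, 2, 6):
        print(depth, attacker_success_probability(0.10, depth))
```

Note that this only estimates the chance of a longer fork emerging; it says nothing about whether the transaction was valid in the first place.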

bitcoin-qt

btcd

libbitcoin-server

Block Retention

Once a full-chain client has downloaded the entire chain, it typically retains it (as the Satoshi client did/does).

Satoshi's original paper mentions the possibility of pruning individual transactions, which allows for full nodes which verify the entire transaction history but do not retain it. Because users are required to download and verify the block chain from some other node initially, this change isn't costless.

Thin Clients

A thin client of this kind downloads a complete copy of the headers for all blocks in the entire block chain. This means that its download and storage requirements scale linearly with the amount of time since Bitcoin was invented.

This scheme is described in section 8 of the original bitcoin whitepaper.
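The linear growth of a header-only client's storage can be estimated with simple arithmetic. The sketch below assumes the 80-byte serialized block header and the protocol's ten-minute target block spacing; actual block intervals vary, so the result is only a rough estimate, and the function name is ours.

```python
HEADER_BYTES = 80               # size of one serialized block header
TARGET_SPACING_SECONDS = 600    # target of one block every ten minutes


def estimated_header_storage_bytes(years: float) -> int:
    """Approximate bytes of headers accumulated after the given number of years."""
    seconds = years * 365 * 24 * 60 * 60
    blocks = int(seconds // TARGET_SPACING_SECONDS)
    return blocks * HEADER_BYTES


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(years, estimated_header_storage_bytes(years))
```

Under these assumptions the header chain grows by roughly 4.2 MB per year.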

Block Depth Check

As Satoshi writes, "[the thin client] can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it." If we take "X" to be the "number of blocks added after it", then a thin client essentially trusts that a transaction X blocks deep will be costly to forge.
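"Linking it to a place in the chain" is done with a Merkle branch: the client hashes the transaction together with the sibling hashes supplied by a peer and compares the result with the Merkle root in a block header it already holds. The following is a minimal Python sketch of that check. The function names are ours, hashes are handled in raw internal byte order, and the helper that builds a branch is included only so the example is self-contained (a real thin client receives the branch from a peer).

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used for Bitcoin Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    if len(level) % 2:
        level = level + [level[-1]]  # an odd level duplicates its last hash
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves: List[bytes], index: int) -> List[bytes]:
    """Sibling hashes needed to link leaves[index] to the root."""
    level = list(leaves)
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # the sibling of the current node
        level = _next_level(level)
        index //= 2
    return branch


def root_from_branch(leaf: bytes, branch: List[bytes], index: int) -> bytes:
    """Recompute the root from a leaf and its branch; this is the thin client's check."""
    node = leaf
    for sibling in branch:
        node = dsha256(sibling + node) if index & 1 else dsha256(node + sibling)
        index >>= 1
    return node


if __name__ == "__main__":
    txids = [dsha256(bytes([i])) for i in range(5)]  # stand-in transaction hashes
    root = merkle_root(txids)
    print(root_from_branch(txids[3], merkle_branch(txids, 3), 3) == root)
```

The check proves only that the transaction is committed to by that header. It does not prove that the transaction's inputs were unspent.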

This is very different from the trust model in the "thick" client: the thick client verifies that a transaction's inputs are unspent by actually checking the whole chain up to that point -- there is no "X blocks deep" involved here. Only once it has decided that a transaction is valid does it use "X blocks deep" (confirmations) to decide how likely it is that a longer fork in the chain will emerge which excludes that transaction.
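The thick client's check can be illustrated with a toy replay of the chain that tracks the set of unspent outputs. This is a deliberately simplified Python sketch: the Tx type and replay_chain function are hypothetical, and scripts, amounts, signatures and coinbase rules are all ignored (a transaction with no inputs stands in for a coinbase).

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Outpoint = Tuple[str, int]  # (txid, output index)


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: Tuple[Outpoint, ...]
    n_outputs: int


def replay_chain(blocks: List[List[Tx]]) -> Set[Outpoint]:
    """Replay every block in order and return the resulting unspent-output set.

    Raises ValueError if any input refers to an output that does not exist
    or has already been spent earlier in the chain.
    """
    unspent: Set[Outpoint] = set()
    for height, block in enumerate(blocks):
        for tx in block:
            for outpoint in tx.inputs:
                if outpoint not in unspent:
                    raise ValueError(
                        f"block {height}: {tx.txid} spends missing or spent output {outpoint}"
                    )
                unspent.remove(outpoint)
            for i in range(tx.n_outputs):
                unspent.add((tx.txid, i))
    return unspent


if __name__ == "__main__":
    coinbase = Tx("cb", (), 1)
    spend = Tx("a", (("cb", 0),), 2)
    print(replay_chain([[coinbase], [spend]]))
```

No block depth appears anywhere in this check, which is the point of the comparison above.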


bitcoinj

A security analysis of some of the issues in bitcoinj can be found here; however:

  • The claim that "picking 10 nodes and requiring all of them to be consistent needs much less trust" overlooks the problem of "cancer nodes" and Sybil attacks.
  • Many of the security claims are qualified by some form of "if you don't think an attacker controls your internet connection"; see the previous section for a discussion of why this is problematic.

picocoin

The library (libccoin) that picocoin is based on includes code for validating scripts and blocks; this could potentially be used to implement a full-chain client.

Electrum

Electrum fetches blockchain information from Electrum servers, bitcoin nodes that index the blockchain by address. Electrum performs Simplified Payment Verification to check the transactions returned by servers. For this, it fetches blockchain headers from about 10 random servers. In addition, Electrum servers are authenticated by SSL, in order to protect users from MITM attacks.
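The header check that any SPV-style client relies on is the proof-of-work test: the double SHA-256 of the 80-byte header, read as a little-endian integer, must not exceed the target encoded in the header's compact "bits" field. The following Python sketch shows that test using only the standard library. It is not taken from Electrum or any other client, the function names are ours, and the sign bit of the compact encoding is ignored for simplicity.

```python
import hashlib

# The 80-byte header of the Bitcoin genesis block, used here as sample input.
GENESIS_HEADER = bytes.fromhex(
    "01000000"                                                          # version
    + "00" * 32                                                         # previous block hash
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"  # Merkle root
    + "29ab5f49"                                                        # timestamp
    + "ffff001d"                                                        # bits (compact target)
    + "1dac2b7c"                                                        # nonce
)


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' encoding into the full integer target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def header_hash(header: bytes) -> bytes:
    """Double SHA-256 of the serialized header (internal byte order)."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def check_header_pow(header: bytes) -> bool:
    """True if the header's hash meets the target that the header itself claims."""
    if len(header) != 80:
        raise ValueError("a block header is exactly 80 bytes")
    bits = int.from_bytes(header[72:76], "little")
    target = bits_to_target(bits)
    return target > 0 and int.from_bytes(header_hash(header), "little") <= target


if __name__ == "__main__":
    print(header_hash(GENESIS_HEADER)[::-1].hex())
    print(check_header_pow(GENESIS_HEADER))
```

A complete client would also check that each header links to its predecessor and that the claimed target follows the difficulty adjustment rules; this sketch covers only the hash comparison.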

Unused Output Tree in the Block chain (UOT)

There have been several proposals to form a tree of unused transaction outputs at each block in the chain, hash it as a Merkle tree, and encode the root hash in the block chain (probably as part of the coinbase input). This will be called an Unused Output Tree (UOT). The first appears to be this one by gmaxwell, who called it an "open transaction tree", although the term "open" is now taken to mean "not yet mined into the block chain" rather than "unspent". The first detailed proposal so far appears to be Alberto Torres' proposal; etotheipi's ultimate block chain compression is a variant of this.

If such UOT hashes were included in the block chain, a client which shipped with a checkpoint block that had a UOT would only need to download blocks after the checkpoint. Moreover, once the client had downloaded those blocks and confirmed their UOTs, it could discard all but the most recent block containing a UOT.

Hostile miners may insert blocks into the chain which contain what claims to be a UOT but is actually invalid. It is unlikely that such blocks could be kept out of the chain, because this would require adding a new block validity criterion, and miners implementing this new criterion would risk "mining on the wrong side" of a fork, which could cost them a lot of money. Therefore, any UOT strategy would need to cope with the fact that not every block containing a UOT entry can be trusted.

Note that at the present moment no standard format for such Unused Output Tree hashes has been agreed upon, nor do any of the blocks in the chain contain them. The ultraprune feature added to bitcoind-0.8 maintains a similar data structure on the client's disk. It does not put this data structure or its hash anywhere in the block chain.

Server-Trusting Clients

These clients involve a high level of trust in the server they rely upon. Mechanisms for authenticating the server, and for confirming that the server has not been compromised, are usually not explained.

All thin clients listed below currently connect to a single server, and are vulnerable to an attack similar to a double-spend. The attack can be run by that single server: the server can simply claim that the client has received a Bitcoin transaction, and the client, assuming the server is honest, then performs some service, transfers funds or sends goods without actually receiving any Bitcoin in exchange. Therefore, these clients are implicitly trusting the server.

Future enhancements have been suggested that will have the client talk to multiple servers, broadcasting transactions to all of them and querying all of them. Unfortunately it is well known to security researchers that this does not actually increase security; it simply makes the exploits more complicated and difficult to find. Security researchers have a name for this phenomenon: it is called a "Sybil attack"[1]. This post on bitcointalk explains how some governments (notably Iran and China) already perform these sorts of attacks on their own citizens, with the coerced assistance of SSL certificate authorities.

Clients with a checkpoint (even a very old one) that download and validate the headers for the whole block chain are not vulnerable to Sybil attacks in the following sense: they can always ensure that an attack would cost more than the amount being stolen.
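One way to see why header validation resists Sybil attacks is that a client can rank the header chains it is offered by the total proof of work they represent rather than by how many servers vouch for them. The Python sketch below illustrates this under strong simplifying assumptions: each candidate chain is reduced to the list of compact "bits" values of its headers, the headers are assumed to have already passed linkage and proof-of-work checks, and difficulty adjustment rules are ignored. The function names and the per-block work formula (2^256 divided by target plus one) are our choices for the illustration.

```python
from typing import Dict, List


def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' encoding into the full integer target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def block_work(bits: int) -> int:
    """Expected number of hashes needed to find a block at this target."""
    return (1 << 256) // (bits_to_target(bits) + 1)


def chain_work(bits_per_header: List[int]) -> int:
    return sum(block_work(bits) for bits in bits_per_header)


def best_chain(candidates: Dict[str, List[int]]) -> str:
    """Pick the candidate with the most cumulative work, ignoring vote counts."""
    return max(candidates, key=lambda name: chain_work(candidates[name]))


if __name__ == "__main__":
    candidates = {"honest-server": [0x1D00FFFF] * 3}
    # Nine colluding servers offer a longer chain that took almost no work to build.
    for i in range(9):
        candidates[f"sybil-{i}"] = [0x207FFFFF] * 5
    print(best_chain(candidates))
```

Here nine of the ten servers agree with each other, yet the single chain backed by real work is selected; outvoting it would require the attacker to actually perform a comparable amount of hashing.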

BCCAPI

Other

<references>