During the project "Creating and signing a standard raw Bitcoin transaction", I spent some time working with Pay-To-Script-Hash (P2SH) addresses. In this article, I have selected relevant excerpts that deal with P2SH addresses and their security, and made some additional notes where necessary.
The rest of this article contains these excerpts and notes.
My then-current LocalBitcoins receiving address was a Pay-To-Script-Hash (P2SH) address. I constructed and signed a second transaction (tx2) that moved the available bitcoin from the target address to my LocalBitcoins account. I studied P2SH addresses in order to accomplish this. The transaction was broadcast and added to the Bitcoin blockchain.
Standard Pay-To-Public-Key-Hash (P2PKH) addresses start with a '1' character.
P2SH addresses start with a '3' character.
- Studied P2SH addresses in order to discover how to send a transaction output to a P2SH address (in this case, my then-current LocalBitcoins receiving address). Learned that P2SH was a soft fork that specified a new convention for how to use the components of a P2PKH address. Learned that the most common use for P2SH is to implement multi-signature addresses. Decided that P2SH addresses replaced the cryptographic guarantee of P2PKH addresses with a promise made by the Bitcoin miners to observe a different cryptographic guarantee. Concluded that breaking the guarantee of P2PKH addresses requires a hard fork, whereas breaking the guarantee of P2SH addresses only requires another soft fork, and that therefore P2SH addresses are not as secure as P2PKH addresses.
- A soft fork is a narrowing of the protocol. It is a convention for the use of an item, but does not actually change the overall behaviour of the item.
- A hard fork changes the overall behaviour of an item in some way.
- The Notes / Discoveries section contains the following parts, which cover the items in this summary in more detail:
[...]
-- P2SH multi-signature addresses - security
-- P2SH multi-signature addresses - results from this project
P2SH multi-signature addresses - security
Note: The following section is conjecture based on hearsay. It has not been tested.
Summary: Soft forks can be unmade as easily as they were made. They are necessarily easier to make or unmake than a hard fork. P2SH addresses, including P2SH multi-signature addresses, are only protected by a soft fork, not by the original protocol, and are therefore a less secure store of value than are P2PKH addresses.
[Hearsay] Pay-to-script-hash (P2SH) transactions were created in 2012 to let a spender create a pubkey script containing a hash of a second script, the redeem script.
[Hearsay] P2SH multi-signature is a subtype of P2SH.
[Hearsay] There was an earlier multi-signature transaction type, in which the entire redemption script was stored in the address. [0]
P2SH multi-signature has replaced it in actual use - in this case only the hash of the redemption script is stored in the address.
[Hearsay] P2SH multi-signature script format:
outputScript: OP_HASH160 {Hash160(redeemScript)} OP_EQUAL
scriptSig: {sig} [signature] [signature] ... {redeemScript}
Note: In code, outputScript might be called "scriptPubKey", as it occupies the same position and role as the scriptPubKey for P2PKH outputs.
Let "nodes using the original ruleset" be "nodes".
Let "nodes using a soft-fork ruleset" be "soft-fork nodes".
P2SH is a soft fork, i.e. a narrowing of the original protocol, with something extra added on (the promise to only mine a transaction if the signatures in the scriptSig meet the conditions of the redeemScript). P2SH transactions are a convention that the miners promise to enforce.
The soft fork is a convention on how to use the items scriptPubKey and scriptSig from the original ruleset / protocol.
When a new transaction appears that spends from a P2SH multi-signature address, nodes will verify that the redeemScript in the scriptSig hashes to the hash value in the outputScript. Soft-fork nodes will additionally verify that the conditions in the redeemScript (i.e. M of N signatures by specified public keys) are satisfied by the [signature]s.
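The difference between these two checks can be modelled in a few lines of Python 3. This is a toy sketch under stated assumptions, not real Script evaluation: toy_hash160 is a stand-in (double SHA-256 truncated to 20 bytes) for the real HASH160 operation (SHA-256 followed by RIPEMD-160), the redeemScript is a plain text string rather than serialized opcodes, and signature_is_valid is a hypothetical stub that takes the place of ECDSA verification. All function names are mine, chosen for illustration.

```python
import hashlib


def toy_hash160(data: bytes) -> bytes:
    # Stand-in for HASH160 (assumption: real nodes use RIPEMD160(SHA256(data))).
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:20]


def signature_is_valid(signature: bytes, public_key: bytes) -> bool:
    # Hypothetical stub: a "signature" is valid if it names its public key.
    return signature == b"signed-by-" + public_key


def node_accepts(script_sig: list, script_hash: bytes) -> bool:
    # Original ruleset: only check that the last scriptSig item (the
    # redeemScript) hashes to the value stored in the outputScript.
    redeem_script = script_sig[-1]
    return toy_hash160(redeem_script) == script_hash


def soft_fork_node_accepts(script_sig: list, script_hash: bytes) -> bool:
    # Soft-fork ruleset: the hash check, plus the M-of-N condition.
    if not node_accepts(script_sig, script_hash):
        return False
    tokens = script_sig[-1].split()      # toy format: b"M key1 ... keyN N"
    m, public_keys = int(tokens[0]), tokens[1:-1]
    signed_keys = {
        key
        for key in public_keys
        for sig in script_sig[:-1]
        if signature_is_valid(sig, key)
    }
    return len(signed_keys) >= m


redeem_script = b"2 keyA keyB keyC 3"    # toy 2-of-3 redeemScript
script_hash = toy_hash160(redeem_script)
signed = [b"signed-by-keyA", b"signed-by-keyC", redeem_script]
unsigned = [b"junk", b"junk", redeem_script]

print(node_accepts(signed, script_hash), soft_fork_node_accepts(signed, script_hash))
# True True
print(node_accepts(unsigned, script_hash), soft_fork_node_accepts(unsigned, script_hash))
# True False
```

In this model, a spend that reveals the correct redeemScript but carries no valid signatures passes the hash-only check and fails only the additional soft-fork check.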
To nodes, the scriptSig looks like an invalid signature. They expect that the scriptSig contains a signature and a public key, that the public key verifies the signature, and that the signature was made by the private key corresponding to the public key. The redeemScript fulfills none of these conditions. Nodes won't accept the transaction and will not mine it or re-broadcast it. However, they won't reject blocks that contain it.
The soft fork (and, in fact, any soft fork) is only enforced because 51% of the mining hash power agrees to enforce it. If the majority of the hash power were to stop enforcing the extra conditions in the scriptSig redeemScript, anyone could create a new transaction that transferred bitcoin from a P2SH address to an original P2PKH address that they controlled.
Note: This transaction would not be valid under the original ruleset, because the hypothetical private key corresponding to the redeemScript hash would be unknown, so this transaction could not be validly signed. This seizure would therefore have to be done with the cooperation of 51% of the mining power, who would have to accept (or at least not reject) this new transaction. This would essentially be another, different, soft fork.
The people with the incentive to eventually try to do this "soft fork rollback" would be some emergent miner conglomerate / alignment, whenever it became worthwhile to do so without too much risk. This rollback would not affect those who use P2PKH addresses - it would therefore be a limited seizure of funds that would probably not greatly affect the market value of the Bitcoin ecosystem in the long term. The P2PKH addresses would continue to be a reasonably safe store of value, regardless of whether the soft-fork P2SH addresses are enforced or not.
Note: Any change of the ruleset that allowed a miner conglomerate to seize funds stored in P2PKH addresses would be a hard fork. In the event of a hard fork, it is likely that the original ruleset would be preserved by a subsection of the mining hash power, and that there would be a fork in the chain. The hard-fork chain would likely see a steady decrease in its market price (due to being an unsafe store of value). The original-ruleset chain would continue as normal.
Conclusion: Soft forks can be unmade as easily as they were made. They are necessarily easier to make or unmake than a hard fork. P2SH addresses, including P2SH multi-signature addresses, are only protected by a soft fork, not by the original protocol, and are therefore a less secure store of value than are P2PKH addresses.
Notes on hard forks:
- Miners can't easily know which fork would win, and mining on the eventual loser chain means that all mining rewards from that chain would be lost, so they are highly incentivised to never change the new block acceptance ruleset, as they can't be sure that the majority of other miners would also make the exact same change. It's a prisoner's dilemma situation.
-- Note: If a miner cartel exists that controls 51% of the hash power and can effectively coordinate its internal activity, then it could perhaps enforce a change in the block acceptance ruleset.
- Anyone who maintains a Bitcoin node is also obliged to stick to the existing ruleset for the acceptance of new blocks, for the same game-theoretic reason.
P2SH multi-signature addresses - results from this project
To send to a P2SH multi-signature address, I don't need to worry about how to spend from it. I just need to construct an outputScript from the P2SH address and use it in the transaction output.
The P2SH outputScript is:
OP_HASH160 PUSHDATA(20) [20-byte op_hash160 hash of redeemScript] OP_EQUAL
- 20 in hex is 0x14.
The address format of P2SH addresses is:
base58-encode( [one-byte version] + [20-byte hash of redeemScript] + [4-byte checksum] )
On the main network, the version byte is 0x05.
Note: Because the '05' byte is always added at the front of the script hash, a P2SH address will never start with a leading zero byte, so the count-leading-zeros section of the base58check algorithm will never have an effect on the address during conversion.
Version byte + 20-byte hash with the minimum possible integer value:
"05"+"00"*20 =
050000000000000000000000000000000000000000
- in base58check_encoding (in which the 4-byte checksum is added), this is:
31h1vYVSYuKP6AhS86fbRdMw9XHieotbST
- minimum value as a decimal integer:
7307508186654514591018424163581415098279662714880
Version byte + 20-byte hash with the maximum possible integer value:
"05"+"FF"*20 =
05FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
- in base58check_encoding (in which the 4-byte checksum is added), this is:
3R2cuenjG5nFubqX9Wzuukdin2YfBbQ6Kw
- maximum value as a decimal integer:
8769009823985417509222108996297698117935595257855
So: A string of 21 bytes that starts with the '05' byte, once converted into base58check encoding (with a 4-byte checksum added by the encoding function), always starts with the base58 symbol '3'.
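The same conclusion can be checked with plain integer arithmetic, without computing any checksum. The following is a minimal Python 3 sketch (the names lowest, highest and base58_digits are mine). It treats the 25 bytes that get encoded (version byte, 20-byte hash, 4-byte checksum) as one big-endian integer and looks at the most significant base58 digit of the smallest and largest possible values.

```python
BASE58_SYMBOLS = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Whatever the hash and checksum are, the 25-byte integer lies between
# these two bounds.
lowest = 0x05 << (24 * 8)            # 05 followed by 24 bytes of 00
highest = (0x06 << (24 * 8)) - 1     # 05 followed by 24 bytes of FF


def base58_digits(value: int) -> list:
    # Returns the base58 digits of value, most significant digit first.
    digits = []
    while value > 0:
        value, remainder = divmod(value, 58)
        digits.append(remainder)
    return digits[::-1]


for bound in (lowest, highest):
    digits = base58_digits(bound)
    print(len(digits), digits[0], BASE58_SYMBOLS[digits[0]])
# 34 2 3
# 34 2 3
```

Both bounds have 34 base58 digits and the same leading digit (2, which is the symbol '3'), so every value in between does as well. A lone byte 0x05 would instead map to BASE58_SYMBOLS[5], which is '6'.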
I decoded my then-current LocalBitcoins receiving address
36CQfj2Yt54sZttJYTb5ywuS7YGEQLfzCE
and found that the redeemScript hash inside it was:
316f8d5c41d88d3a0df179fc5ad765d57f7f4667
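The decoding step can be sketched as follows. This is a minimal Python 3 version written for this article (the test script further down is Python 2 and only encodes); the function name base58check_decode is my own, and the checksum rule is the one described above (first four bytes of the double SHA-256 of the version byte plus hash).

```python
import hashlib

BASE58_SYMBOLS = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_decode(address: str):
    # Convert the base58 string to an integer.
    value = 0
    for symbol in address:
        value = value * 58 + BASE58_SYMBOLS.index(symbol)
    # Each leading '1' symbol stands for one leading zero byte.
    n_zeros = len(address) - len(address.lstrip("1"))
    body = value.to_bytes((value.bit_length() + 7) // 8, "big")
    raw = b"\x00" * n_zeros + body
    # Split off the 4-byte checksum and verify it.
    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("checksum mismatch")
    # payload = [one-byte version] + [20-byte hash]
    return payload[0], payload[1:]


version, script_hash = base58check_decode("36CQfj2Yt54sZttJYTb5ywuS7YGEQLfzCE")
print(hex(version), script_hash.hex())
```

For a main-network P2SH address, the returned version should be 0x05 and the remaining 20 bytes are the redeemScript hash.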
I then constructed an outputScript from the redeemScript hash.
outputScript:
- OP_HASH160: a9
- PUSHDATA: 14
- [derived property] PUSHDATA decimal value: 20
- redeemScript hash (occupies the public_key_hash position of a P2PKH output): 31 6f 8d 5c 41 d8 8d 3a 0d f1 79 fc 5a d7 65 d5 7f 7f 46 67
- OP_EQUAL: 87
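Put together as bytes, the fields listed above can be assembled like this. A minimal Python 3 sketch; the function name p2sh_output_script is hypothetical, and the opcode values are the ones listed above.

```python
OP_HASH160 = 0xA9
PUSHDATA_20 = 0x14   # push the next 20 bytes onto the stack
OP_EQUAL = 0x87


def p2sh_output_script(script_hash_hex: str) -> str:
    # Build the P2SH outputScript (as a hex string) from a redeemScript hash.
    script_hash = bytes.fromhex(script_hash_hex)
    if len(script_hash) != 20:
        raise ValueError("redeemScript hash must be exactly 20 bytes")
    script = bytes([OP_HASH160, PUSHDATA_20]) + script_hash + bytes([OP_EQUAL])
    return script.hex()


print(p2sh_output_script("316f8d5c41d88d3a0df179fc5ad765d57f7f4667"))
# a914316f8d5c41d88d3a0df179fc5ad765d57f7f466787
```

The resulting script is always 23 bytes long: 1 opcode byte, 1 push byte, 20 hash bytes, and 1 final opcode byte.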
I was able to construct and sign a valid transaction that transferred bitcoin from the P2PKH target address to the P2SH multi-signature LocalBitcoins receiving address. This was transaction 2.
Excerpts from:
bitcoin.org/en/developer-guide
This page has a pop-up:
"BETA: This documentation has not been extensively reviewed by Bitcoin experts and so likely contains numerous errors."
P2SH Scripts
[...]
pay-to-script-hash (P2SH) transactions were created in 2012 to let a spender create a pubkey script containing a hash of a second script, the redeem script.
The basic P2SH workflow [...] looks almost identical to the P2PKH workflow. Bob creates a redeem script with whatever script he wants, hashes the redeem script, and provides the redeem script hash to Alice. Alice creates a P2SH-style output containing Bob's redeem script hash.
[...]
When Bob wants to spend the output, he provides his signature along with the full (serialized) redeem script in the signature script. The peer-to-peer network ensures the full redeem script hashes to the same value as the script hash Alice put in her output; it then processes the redeem script exactly as it would if it were the primary pubkey script, letting Bob spend the output if the redeem script does not return false.
[...]
The hash of the redeem script has the same properties as a pubkey hash - so it can be transformed into the standard Bitcoin address format with only one small change to differentiate it from a standard address. This makes collecting a P2SH-style address as simple as collecting a P2PKH-style address. The hash also obfuscates any public keys in the redeem script, so P2SH scripts are as secure as P2PKH pubkey hashes.
Standard Transactions
After the discovery of several dangerous bugs in early versions of Bitcoin, a test was added which only accepted transactions from the network if their pubkey scripts and signature scripts matched a small set of believed-to-be-safe templates, and if the rest of the transaction didn't violate another small set of rules enforcing good network behavior. This is the IsStandard() test, and transactions which pass it are called standard transactions.
Non-standard transactions - those that fail the test - may be accepted by nodes not using the default Bitcoin Core settings. If they are included in blocks, they will also avoid the IsStandard test and be processed.
[...]
The most common use of P2SH is the standard multisig pubkey script.
[...]
Pubkey script: OP_HASH160 <Hash160(redeemScript)> OP_EQUAL
Signature script: <sig> [sig] [sig...] <redeemScript>
This script combination looks perfectly fine to old nodes as long as the script hash matches the redeem script. However, after the soft fork is activated, new nodes will perform a further verification for the redeem script. They will extract the redeem script from the signature script, decode it, and execute it with the remaining stack items(<sig> [sig] [sig..]part). Therefore, to redeem a P2SH transaction, the spender must provide the valid signature or answer in addition to the correct redeem script.
Key points:
- To send to a P2SH multi-signature address, I don't need to worry about how to spend from it. I just need to construct a script from the P2SH address and use it in the transaction output.
-- The script is: OP_HASH160 [Hash160(redeemScript)] OP_EQUAL
- P2SH multi-signature works in this way: A multi-signature redeemScript will be included in the scriptSig of a new transaction that spends from a P2SH multi-signature address. The miners will: a) confirm that the redeemScript hashes to the value in the P2SH multi-signature address, and b) execute the script and confirm that the requisite multiple signatures are included in the new scriptSig and are valid.
- Old nodes (running the original ruleset) will only confirm that the redeemScript hashes to the value in the P2SH multi-signature address. I think that this happens because the redeemScript is occupying the position of a public key in a standard P2PKH scriptSig, so they treat it as a public key. From the perspective of old nodes, the redeemScript will simply be an invalid public key for the supplied signature data in the rest of the scriptSig. They won't relay a transaction containing a payment to a P2SH multi-signature address, or mine it, but I think that they will accept blocks containing it. Notably, old nodes would also accept blocks containing later transactions that (now that the redeemScript is publicly known) spend from the P2SH multi-signature address but do not include valid signatures (because they wouldn't actually run and validate the redeemScript according to the new soft-fork ruleset).
- P2SH multi-signature is a soft fork, i.e. a narrowing of the original protocol, with something extra added on (the promise to only mine a transaction if the signatures in the scriptSig meet the conditions of the redeemScript). Essentially, P2SH multi-signature transactions are a convention that the miners promise to enforce.
-- Note: This is true of original P2PKH (Pay-to-Public-Key-Hash) signatures as well. They only have validity because miners promise to only mine transactions that have valid signatures.
--- Bitcoin values stored on the blockchain have a market value because this promise is kept.
--- Hm. A reversal of the soft fork could eventually occur, in which a majority of the miners stop enforcing the redeemScript conditions but continue to enforce the original P2PKH conditions. They could then mine transactions that transfer any bitcoin stored in P2SH multi-signature addresses to their own P2PKH addresses.
---- Why would they do this? Well, a rising Bitcoin price might eventually make it worthwhile to do so. This would be a reversal of a soft fork (a convention), not a hard fork (a fundamental change). A hard fork is dangerous for miners because it might make long-term holders, who are interested primarily in Bitcoin as a store of value, decide to leave the chain, collapsing the price and severely damaging the miners' investment in equipment etc. A soft fork is not as dangerous for miners. Notably, a reversal of a soft fork is also not as dangerous for miners - they simply stop enforcing some additional rules. The people who use only P2PKH addresses would not be threatened and would probably stay on the Bitcoin blockchain, rather than selling or switching to holding value on another blockchain. The market value of Bitcoin would probably remain high.
Interesting additional point from the earlier excerpt:
- If non-standard transactions are included in blocks, they will also avoid the IsStandard() test.
Thought: The new ruleset for P2SH multi-signature may not be enforced during validation of transactions in new blocks received from the network. Miners might only be refusing to mine non-valid P2SH multi-signature transactions, but they might accept a block from another miner that had already mined some (i.e. they may not have bothered to implement code that would reject this block). Profitably reversing the soft-fork might not even require 51% consensus among miners.
Some reading suggests that there was an earlier multi-signature transaction type, in which the entire redemption script was stored in the address. P2SH multi-signature has replaced it in actual use - in this case only the hash of the redemption script is stored in the address.
Excerpts from:
github.com/bitcoin/bips/blob/master/bip-0016.mediawiki
BIP: 16
Layer: Consensus (soft fork)
Title: Pay to Script Hash
Author: Gavin Andresen <gavinandresen@gmail.com>
Comments-Summary: No comments yet.
Comments-URI: http://github.com/bitcoin/bips/wiki/Comments:BIP-0016
Status: Final
Type: Standards Track
Created: 2012-01-03
[...]
Abstract
This BIP describes a new "standard" transaction type for the Bitcoin scripting system, and defines additional validation rules that apply only to the new transactions.
Motivation
The purpose of pay-to-script-hash is to move the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to the redeemer.
The benefit is allowing a sender to fund any arbitrary transaction, no matter how complicated, using a fixed-length 20-byte hash that is short enough to scan from a QR code or easily copied and pasted.
Specification
A new standard transaction type that is relayed and included in mined blocks is defined:
OP_HASH160 [20-byte-hash-value] OP_EQUAL
[20-byte-hash-value] shall be the push-20-bytes-onto-the-stack opcode (0x14) followed by exactly 20 bytes.
[...]
Backwards Compatibility
These transactions are non-standard to old implementations, which will (typically) not relay them or include them in blocks.
Old implementations will validate that the {serialize script}'s hash value matches when they validate blocks created by software that fully support this BIP, but will do no other validation.
Some notes:
- Status == Final. I assume that this BIP was fully implemented.
- "The purpose of pay-to-script-hash is to move the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to the redeemer." I hadn't properly appreciated this motive before.
- Confirmation that old nodes won't relay these transactions or mine them.
- Confirmation that old nodes will verify the hash in the transaction if it is in a new block that they receive, but won't verify the signature.
- The script that I need to construct (after decoding the P2SH multi-signature address) is: OP_HASH160 PUSHDATA(20) [20-byte-hash-value] OP_EQUAL
20 in hex is 0x14.
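Going the other way, the fixed shape of this script makes it easy to recognise. The following Python 3 sketch checks a serialized output script against the pattern in the specification quoted above (23 bytes: a9, 14, twenty hash bytes, 87). The function names are hypothetical, and the P2PKH script used for comparison is the standard OP_DUP OP_HASH160 [hash] OP_EQUALVERIFY OP_CHECKSIG form.

```python
def is_p2sh_output_script(script: bytes) -> bool:
    # OP_HASH160 (0xa9), push-20-bytes (0x14), 20 bytes, OP_EQUAL (0x87)
    return (
        len(script) == 23
        and script[0] == 0xA9
        and script[1] == 0x14
        and script[22] == 0x87
    )


def extract_script_hash(script: bytes) -> bytes:
    # Return the 20-byte redeemScript hash from a P2SH output script.
    if not is_p2sh_output_script(script):
        raise ValueError("not a P2SH output script")
    return script[2:22]


script_hash_hex = "316f8d5c41d88d3a0df179fc5ad765d57f7f4667"
p2sh_script = bytes.fromhex("a914" + script_hash_hex + "87")
p2pkh_script = bytes.fromhex("76a914" + script_hash_hex + "88ac")

print(is_p2sh_output_script(p2sh_script))    # True
print(is_p2sh_output_script(p2pkh_script))   # False
print(extract_script_hash(p2sh_script).hex())
# 316f8d5c41d88d3a0df179fc5ad765d57f7f4667
```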
Excerpts from:
github.com/bitcoin/bips/blob/master/bip-0013.mediawiki
BIP: 13
Layer: Applications
Title: Address Format for pay-to-script-hash
Author: Gavin Andresen <gavinandresen@gmail.com>
Comments-Summary: No comments yet.
Comments-URI: http://github.com/bitcoin/bips/wiki/Comments:BIP-0013
Status: Final
Type: Standards Track
Created: 2011-10-18
[...]
Abstract
This BIP describes a new type of Bitcoin address to support arbitrarily complex transactions. Complexity in this context is defined as what information is needed by the recipient to respend the received coins, in contrast to needing a single ECDSA private key as in current implementations of Bitcoin.
In essence, an address encoded under this proposal represents the encoded hash of a script, rather than the encoded hash of an ECDSA public key.
[...]
Specification
The new bitcoin address type is constructed in the same manner as existing bitcoin addresses (see Base58Check encoding):
base58-encode: [one-byte version][20-byte hash][4-byte checksum]
Version byte is 5 for a main-network address, 196 for a testnet address. The 20-byte hash is the hash of the script that will be used to redeem the coins. And the 4-byte checksum is the first four bytes of the double SHA256 hash of the version and hash.
[...]
The leading version bytes are chosen so that, after base58 encoding, the leading character is consistent: for the main network, byte 5 becomes the character '3'. For the testnet, byte 196 is encoded into '2'.
[...]
Backwards Compatibility
This proposal is not backwards compatible, but it fails gracefully-- if an older implementation is given one of these new bitcoin addresses, it will report the address as invalid and will refuse to create a transaction.
[...]
See Also
BIP 12: OP_EVAL, the original P2SH design
BIP 16: Pay to Script Hash (aka "/P2SH/")
BIP 17: OP_CHECKHASHVERIFY, another P2SH design
Some notes:
- Status == Final.
- "if an older implementation is given one of these new bitcoin addresses, it will report the address as invalid and will refuse to create a transaction." I didn't realise that old nodes wouldn't even create transactions that spend to P2SH multi-signature addresses, although, thinking about it, it makes sense. Old nodes will see the new address format as invalid.
- The first byte will always be 5.
-- Hm. In Satoshi's Base58 encoding, 0x00 bytes are '1', but 0x05 bytes are '6', not '3'.
--- Ah. The first byte is not exactly byte 0x05. It's really 0x05 with 24 following bytes (the 20-byte script hash and the 4-byte checksum), i.e. it has a positional value in base58 due to its position in the byte string.
Minimum value:
"05"+"00"*20 =
050000000000000000000000000000000000000000
- in base58check_encoding, this is:
31h1vYVSYuKP6AhS86fbRdMw9XHieotbST
- minimum value as a decimal integer:
7307508186654514591018424163581415098279662714880
Maximum value:
"05"+"FF"*20 =
05FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
- in base58check_encoding, this is:
3R2cuenjG5nFubqX9Wzuukdin2YfBbQ6Kw
- maximum value as a decimal integer:
8769009823985417509222108996297698117935595257855
Hm. So: A string of 21 bytes that starts with the '05' byte, once converted into base58check encoding (with a 4-byte checksum added by the encoding function), always starts with the base58 symbol '3'.
Note: Because the '05' byte is always added at the front of the script hash, a P2SH address will never start with a leading zero byte, so the count-leading-zeros section of the base58check algorithm will never have an effect on the address during conversion.
To get the above results, I wrote a short test script in Python that used parts of generate_bitcoin_address.py.
Here is the test script and an example of its use:
test.py
#!/opt/local/bin/python

from binascii import hexlify, unhexlify
import ecdsa
from pypy_sha256 import sha256
from bjorn_edstrom_ripemd160 import RIPEMD160


def base58check(input_hex):
    input_bytes = unhexlify(input_hex)
    # calculate checksum
    digest = sha256(input_bytes).digest()
    digest2 = sha256(digest).digest()
    digest_hex = hexlify(digest2)
    checksum_hex = digest_hex[:8] # first 4 bytes
    item_hex = input_hex + checksum_hex
    item_base58_str = convert_hex_to_base58(item_hex)
    n = count_leading_zero_bytes(input_hex)
    item_base58_str = n * '1' + item_base58_str
    return item_base58_str


def count_leading_zero_bytes(input_hex):
    count = 0
    for i in range(0, len(input_hex), 2):
        byte = input_hex[i:i+2]
        if byte == '00':
            count += 1
        else:
            break
    return count


def convert_hex_to_base58(input_hex):
    input_int = int(input_hex, 16)
    # the string base58_symbols can be accessed as a 0-indexed list of characters.
    base58_symbols = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
    output_str = ''
    item_int = input_int
    while item_int > 0:
        item_int, remainder = divmod(item_int, 58)
        # use remainder as index for accessing the corresponding base58 symbol.
        output_str += base58_symbols[remainder]
    output_str = ''.join(reversed(output_str))
    return output_str


# the second assignment overrides the first; comment one out to switch test values.
input_hex = "050000000000000000000000000000000000000000"
input_hex = "05FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF"
output = base58check(input_hex)
print ""
print "input_hex: %s" % input_hex
print "len(input_hex)/2 = %d" % (len(input_hex)/2.0)
print "int(input_hex, 16) = %d" % (int(input_hex, 16))
print "result: %s" % output
print ""
aineko:work stjohnpiano$ chmod 700 test.py
aineko:work stjohnpiano$ ./test.py
input_hex: 05FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
len(input_hex)/2 = 21
int(input_hex, 16) = 8769009823985417509222108996297698117935595257855
result: 3R2cuenjG5nFubqX9Wzuukdin2YfBbQ6Kw