EIP-2124: Fork identifier for chain compatibility checks #2125

New issue

Closed

Listed in

#5111

Closed

EIP-2124: Fork identifier for chain compatibility checks#2125

Listed in

Labels

opened

· edited by karalabe

Member

Discussion thread for #2124.

Abstract

There are many public and private Ethereum networks, but the discovery protocol doesn't differentiate between them. The only way to check if a peer is good or bad (same chain or not), is to establish a TCP/IP connection, wrap it with RLPx cryptography, then execute an eth handshake. This is an extreme cost to bear if it turns out that the remote peer is on a different network and it's not even precise enough to differentiate Ethereum and Ethereum Classic. This cost is magnified for small networks, where a lot more trial and errors are needed to find good nodes.

Even if the peer is on the same chain, during non-controversial consensus upgrades, not everybody updates their nodes in time (developer nodes, leftovers, etc). These stale nodes put a meaningless burden on the peer-to-peer network, since they just latch on to good nodes, but don't accept upgraded blocks. This causes valuable peer slots and bandwidth to be lost until the stale nodes finally update. This is a serious issue for test networks, where leftovers can linger for months.

This EIP proposes a new identity scheme to both precisely and concisely summarize the chain's current status (genesis and all applied forks). The conciseness is particularly important to make the identity useful across datagram protocols too. The EIP solves a number of issues:

If two nodes are on different networks, they should never even consider connecting.
If a hard fork passes, upgraded nodes should reject non-upgraded ones, but NOT before.
If two chains share the same genesis, but not forks (ETH / ETC), they should reject each other.

This EIP does not attempt to solve the clean separation of 3-way-forks! If at the same future block number, the network splits into three (non-fork, fork-A and fork-B), separating the forkers from each another will need case-by-case special handling. Not handling this keeps the proposal pragmatic, simple and also avoids making it too easy to fork off mainnet.

To keep the scope limited, this EIP only defines the identity scheme and validation rules. The same scheme and algorithm can be embedded into various networking protocols, allowing both the eth/6x handshake to be more precise (Ethereum vs. Ethereum Classic); as well as the discovery to be more useful (reject surely peers without ever connecting).

karalabe

changed the title ~~[-]EIP-2124: Zero RTT netsplit on chain mismatch[/-]~~ EIP-2124: Zero RTT chain compatibility check

on Jun 20, 2019

karalabe

changed the title ~~[-]EIP-2124: Zero RTT chain compatibility check[/-]~~ EIP-2124: Fork identifier for chain compatibility checks

on Jul 3, 2019

ritzdorf

Hi @karalabe ,

very interesting proposal. I had a look after it was mentioned in the allcoredevs call.

I have a question:
Assuming that a fork is happening at block 100, when is this fork considered for the FORK_HASH?

After the local block number is >= 100
or
After the local block number is >= 99?

I assume that you intended the first case. Then, it seems to me that the edge case of Istanbul-Ropsten case is not properly covered. At block B-1 the situation would be:

Updated nodes would have:
Past Forks: A
Future Forks: B

Stale Nodes would have:
FORK_HASH: A
FORK_NEXT:

The mining power of the updated nodes is very weak, hence they stall at B-1.
The stale nodes proceed and hence reach B, B+1, B+2...
All the updated nodes will still connect to the stale nodes.

I see two possible solutions X and Y:

X: Considering the fork for the FORK_HASH on the block before.
In the Istanbul-Ropsten case this would mean at block B-1:
Updated nodes would have:
Past Forks: A,B
Future Forks:
and hence would no longer connect to stale nodes.

Y: Instead of sending FORK_NEXT, the current block number (BLOCK_NUM) is sent. As FORK_NEXT is used to tell "whether a remote node is out of sync or whether its software is stale.", this seems easier to do with a block number. New decision tree:

FORK_HASH matches local past forks:

No local future fork
- connect
Local future fork > remote BLOCK_NUM
- connect
Local future fork <= remote BLOCK_NUM
- don't connect (this is the improvement over your proposal, it detects remote stale nodes on a fork with mining power)
  FORK_HASH is subset of local post forks:
remote BLOCK_NUM >= last local past fork
don't connect
remote BLOCK_NUM < last local past fork
connect (here it potentially doesn't detect stale nodes on a fork with 0 mining power)
FORK_HASH is a superset of local past forks:
(local past forks + subset S of local future forks) matches FORK_HASH:
- Local future fork not in S > remote BLOCK_NUM:
  - don't connect (Example: a freshly started ropsten node connecting to a stale ropsten node)
- else
  - connect
(local past forks + subset of local future forks) never matches FORK_HASH:
- don't connect
  Else: (e.g. different genesis)
don't connect

I can also transform it into a table if you prefer that. There are a few cases that are different.

See you in Osaka

ritzdorf

Hi @karalabe,

as discussed in Osaka BLOCK_NUM might not be ideal. So disregard my proposal from above.

Quick Question

Hence, I still have the following question:
Assuming that a fork is happening at block 100, when is this fork considered for the FORK_HASH?

After the local block number is >= 100
or
After the local block number is >= 99?

I assume that you intended the first case. Then, it seems to me that the edge case of Istanbul-Ropsten case is not properly covered. At block B-1 the situation would be:

Updated nodes would have:
Past Forks: A
Future Forks: B

Stale Nodes would have:
FORK_HASH: A
FORK_NEXT:

The mining power of the updated nodes is very weak, hence they stall at B-1.
The stale nodes proceed and hence reach B, B+1, B+2...
All the updated nodes will still connect to the stale nodes.

Of course, as soon as the updated nodes find a single block (and manage to distribute this block), the issue is resolved. Is this sufficient from your point of view?

Comments on the Table

I found the following entries missing inside the table, even though they are probably clear.

Past forks	Future forks	Remote FORK_HASH	Remote FORK_NEXT	Connect	Reason
A	B,C1	A+B	C2	Yes	Local out of sync, differing future forks, but those are uncertain.
A	B,C	A	B	Yes	Possibly out of sync. Possibly stale.
A	B	A+B	C	Yes	Local out of sync.
A	B1	A+B2		No	Someone needs software update.
A		B		No	Different genesis blocks

Here are the cases that are not clear to me, they are taken from your table:

Past forks	Future forks	Remote FORK_HASH	Remote FORK_NEXT	Connect	Reason
A		A	B	Yes?	What if B < local block number. Then, we know that one client is stale.
A	B1	A	B2	Yes?	What if B2 < local block number. Then, we know that one client is stale.

karalabe

MemberAuthor

@ritzdorf This is a very nice corner case indeed, thank you!

In your expanded example with Ropsten Istanbul, getting stuck on fork-1 block indeed is nasty and does have the side effect of not detecting the split until the forked chain passes that barrier too. Your solution of marking fork-1 as "already forked" and including it in FORK_HASH would potentially solve it, but I find it a bit odd. It also doesn't address your second 2 valid questions.

Wrt the second two cases you brought up, I think both are valid corner cases. If we would have the first special case of your table handled correctly, that would also solve the Ropsten Istanbul splitting correctly, because non-updaters would have split off from not-yet-forked Istanbul nodes, even though the Istanbul ones would not detect the issue. This case however cannot be solved by the more complex FORK_HASH modification.

Perhaps a better alternative solution would be to extend the validation code a bit. The fork ID fields would remain the same as is now, but the validator would also take into consideration the local chain head. Then:

Your extended issue would become equivalent to the first special case from the table. The non-forking nodes would detect that they are past the Istanbul FORK_NEXT, and thus reject Istanbul peers. It's a one way street (i.e. Istanbul peers wouldn't reject non-Istanbul ones), but this special case should exist for only a single-block window, so I think it's not thatbad. At least as long as the non-forking chain also progresses (interesting issue for Clique, ha, need to consider this too).
The second special case is also covered by the local head comparison the same way.

Does this make sense to you? (still need to figure out Clique, might need that initial solution too after all)

ritzdorf

Thanks for your feedback @karalabe.

Your solution of marking fork-1 as "already forked" and including it in FORK_HASH would potentially solve it, but I find it a bit odd.

I agree that it is odd. It was more like a brainstormed suggestion.

Does this make sense to you? (still need to figure out Clique, might need that initial solution too after all)

Yes. Thank you. That is roughly what I had in mind. Sorry for not writing it more clearly.

(still need to figure out Clique)

With Clique do you mean a set of non-forking nodes that are isolated from the non-forking miners?

Wouldn't they eventually find the non-forking miners and thereby resolve this?
As soon as the forking nodes pass the fork block they anyway wouldn't connect to the Clique any longer, so it is a case that only exists for a single-block window, right?

karalabe

mentioned this in 2 pull requests

on Oct 30, 2019

djrtwo

Contributor

taking the first 4 bytes of a Keccak256 hash (seems odd)

Can you explain a bit more why this "seems odd"?

Taking a slice of bytes from a cryptographic hash function is a safe and reasonable way to get well-distributed bytes on some domain.

Agreed CRC32 is fine for the task, but avoiding additional dependencies is nice.

karalabe

MemberAuthor

This ship sailed long ago :)

That said, I wrote that it seems odd because we're discarding part of the checksum vs. crc32 would keep all the entropy.

djrtwo

Contributor

This ship sailed long ago :)

No worries! Just chiming in while going over the specs

ethereum

deleted a comment from mikyR

on Feb 7, 2021

poojaranjan

Contributor

Watch an overview of the 'fork identifier', its purpose, and usage for the Ethereum chain in syncing nodes by @fjl
https://www.youtube.com/watch?v=2Yg-MX0ubJQ

fulldecent

Contributor

Can this please be implemented in two or more competing implementations before this is published as final status?

fulldecent

mentioned this

on May 14, 2021

Revert EIP-191 to review #3572

karalabe

MemberAuthor

It's implemented by all clients.

…

On Fri, May 14, 2021, 22:53 William Entriken ***@***.***> wrote: - Can this please be implemented in two or more competing implementations before this is published as final status? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#2125 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA7UGMMKJGCWPCMZMFEB6TTNV5STANCNFSM4HZI3O2Q> .

fulldecent

Contributor

Thank you. Can specific clients (at least some) please be noted in the EIP text?

MicahZoltu

Contributor

EIPs should not be linking out to client implementations. See the template at https://raw.githubusercontent.com/ethereum/EIPs/master/eip-template.md and note that there is no longer an Implementations section, there is now only a reference implementation section which is intended to contain an inline implementation and it is optional.

fulldecent

Contributor

It is okay that EIP reviews (which are unpaid, I'm assuming) do not accept a responsibility of verifying technical aspects of EIPs.

But if EIPs are to be anything other than a pastebin, then any interop specifications (i.e. all EIPs) should include multiple reference implementation. And they should be verified by the community (e.g. the Last Call process).

karalabe

MemberAuthor

Specifications almost never say how to implement something, only what to implement. Reference code is a courtesy in the spec world, not a requirement. Would you expect the yellow paper to include multiple implementations of the entire Ethereum EVM?

…

On Sun, May 16, 2021, 05:57 William Entriken ***@***.***> wrote: It is okay that EIP reviews (which are unpaid, I'm assuming) do not accept a responsibility of verifying technical aspects of EIPs. But if EIPs are to be anything other than a pastebin, then any interop specifications (i.e. all EIPs) should include multiple reference implementation. And they should be verified by the community (e.g. the Last Call process). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#2125 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA7UGJI4EWSYMKBMQZMI5TTN4YCJANCNFSM4HZI3O2Q> .