ChainSync - synchronizing the Blockchain
What is blockchain synchronization?
Blockchain synchronization ("sync") is a key part of a blockchain system. It handles retrieval and propagation of blocks and transactions (messages), and thus is in charge of distributed state replication. This process is security critical -- problems here can be catastrophic to the operation of a blockchain.
What is ChainSync?
ChainSync is the protocol Filecoin uses to synchronize its blockchain. It is specific to Filecoin's choices in state representation and consensus rules, but is general enough that it can serve other blockchains. ChainSync is a group of smaller protocols, which handle different parts of the sync process.
Terms and Concepts
- LastCheckpoint: the last hard social-consensus oriented checkpoint that ChainSync is aware of. This consensus checkpoint defines the minimum finality, and a minimum of history to build on. ChainSync takes LastCheckpoint on faith, and builds on it, never switching away from its history.
- TargetHeads: a list of BlockCIDs that represent blocks at the fringe of block production. These are the newest and best blocks ChainSync knows about. They are "target" heads because ChainSync will try to sync to them. This list is sorted by "likelihood of being the best chain" (eg for now, simply ChainWeight).
- BestTargetHead: the single best chain head BlockCID to try to sync to. This is the first element of TargetHeads.
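These concepts can be sketched as a minimal data structure (the field and type names below are illustrative assumptions, not a prescribed layout):

```go
package main

import (
	"fmt"
	"sort"
)

// BlockCID identifies a block; ChainWeight is the sort key standing in for
// "likelihood of being the best chain".
type BlockCID string
type ChainWeight uint64

type TargetHead struct {
	CID    BlockCID
	Weight ChainWeight
}

// ChainSyncState holds the checkpoint taken on faith and the candidate heads.
type ChainSyncState struct {
	LastCheckpoint BlockCID
	TargetHeads    []TargetHead // kept sorted best-first
}

// AddTargetHead records a newly observed head and re-sorts by weight.
func (s *ChainSyncState) AddTargetHead(h TargetHead) {
	s.TargetHeads = append(s.TargetHeads, h)
	sort.Slice(s.TargetHeads, func(i, j int) bool {
		return s.TargetHeads[i].Weight > s.TargetHeads[j].Weight
	})
}

// BestTargetHead is simply the first element of TargetHeads.
func (s *ChainSyncState) BestTargetHead() (BlockCID, bool) {
	if len(s.TargetHeads) == 0 {
		return "", false
	}
	return s.TargetHeads[0].CID, true
}

func main() {
	s := &ChainSyncState{LastCheckpoint: "checkpointCID"}
	s.AddTargetHead(TargetHead{CID: "headA", Weight: 100})
	s.AddTargetHead(TargetHead{CID: "headB", Weight: 250})
	best, _ := s.BestTargetHead()
	fmt.Println(best) // the highest-weight head
}
```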
ChainSync Summary
At a high level, ChainSync does the following:

Part 1: Verify internal state (INIT state below)
- SHOULD verify data structures and validate local chain
- Resource-expensive verification MAY be skipped at nodes' own risk

Part 2: Bootstrap to the network (BOOTSTRAP)
- Step 1. Bootstrap to the network, and acquire a "secure enough" set of peers (more details below)
- Step 2. Bootstrap to the BlockPubsub channels
- Step 3. Listen and serve on Graphsync

Part 3: Synchronize trusted checkpoint state (SYNC_CHECKPOINT)
- Step 1. Start with a TrustedCheckpoint (defaults to GenesisCheckpoint)
- Step 2. Get the block it points to, and that block's parents
- Step 3. Graphsync the StateTree

Part 4: Catch up to the chain (CHAIN_CATCHUP)
- Step 1. Maintain a set of TargetHeads (BlockCIDs), and select the BestTargetHead from it
- Step 2. Synchronize to the latest heads observed, validating blocks towards them (requesting intermediate points)
- Step 3. As validation progresses, TargetHeads and BestTargetHead will likely change, as new blocks arrive at the production fringe, and some target heads or paths to them may fail to validate
- Step 4. Finish when the node has "caught up" with BestTargetHead (retrieved all the state, linked to the local chain, validated all the blocks, etc.)

Part 5: Stay in sync, and participate in block propagation (CHAIN_FOLLOW)
- Step 1. If security conditions change, go back to Part 4 (CHAIN_CATCHUP)
- Step 2. Receive, validate, and propagate received Blocks
- Step 3. Now with greater certainty of having the best chain, finalize Tipsets, and advance chain state
libp2p Network Protocols
As a networking-heavy protocol, ChainSync makes heavy use of libp2p. In particular, we use three sets of protocols:

- libp2p.PubSub: a family of publish/subscribe protocols to propagate recent Blocks. The concrete protocol choice impacts ChainSync's effectiveness, efficiency, and security dramatically. For Filecoin v1.0 we will use libp2p.Gossipsub, a recent libp2p protocol that combines features and learnings from many excellent PubSub systems. In the future, Filecoin may use other PubSub protocols. Important note: it is entirely possible for Filecoin nodes to run multiple versions simultaneously. That said, this specification requires that Filecoin nodes MUST connect and participate in the main channel, using libp2p.Gossipsub.
- libp2p.PeerDiscovery: a family of discovery protocols, to learn about peers in the network. This is especially important for security because network "bootstrap" is a difficult problem in peer-to-peer networks. The set of peers we initially connect to may completely dominate our awareness of other peers, and therefore all state. We use a union of PeerDiscovery protocols, as each by itself is not secure or appropriate for users' threat models; the union of these provides a pragmatic and effective solution. Discovery protocols marked as required MUST be included in implementations and will be provided by implementation teams. Protocols marked as optional MAY be provided by implementation teams, but can be built independently by third parties to augment bootstrap security.
- libp2p.DataTransfer: a family of protocols for transferring data. Filecoin nodes must run libp2p.Graphsync.
More concretely, we use these protocols:
- libp2p.PeerDiscovery
  - (required) libp2p.BootstrapList: a protocol that uses a persistent and user-configurable list of semi-trusted bootstrap peers. The default list includes a set of peers semi-trusted by the Filecoin community.
  - (optional) libp2p.KademliaDHT: a DHT protocol that enables random queries across the entire network
  - (required) libp2p.Gossipsub: a pub/sub protocol that includes "prune peer exchange" by default, disseminating peer info as part of operation
  - (optional) libp2p.PersistentPeerstore: a connectivity component that keeps persistent information about peers observed in the network throughout the lifetime of the node. This is useful because it lets us resume and continually improve bootstrap security.
  - (optional) libp2p.DNSDiscovery: to learn about peers via DNS lookups to semi-trusted peer aggregators
  - (optional) libp2p.HTTPDiscovery: to learn about peers via HTTP lookups to semi-trusted peer aggregators
  - (optional) libp2p.PEX: a general-use peer exchange protocol, distinct from pubsub peer exchange, for 1:1 ad hoc peer exchange
- libp2p.PubSub
  - (required) libp2p.Gossipsub: the concrete libp2p.PubSub protocol ChainSync uses
- libp2p.DataTransfer
  - (required) libp2p.Graphsync: the data transfer protocol nodes must support for providing blockchain and user data
  - (optional) BlockSync: a blockchain data transfer protocol that can be used by some nodes
Subcomponents
Aside from libp2p, ChainSync uses or relies on the following components:

Libraries:
- ipld: data structures, selectors, and protocols
  - ipld.GraphStore: local persistent storage for chain datastructures
  - ipld.Selector: a way to express requests for chain data structures
  - ipfs.GraphSync: a general-purpose ipld datastructure syncing protocol

Data Structures:
- Data structures in the chain package: Block, Tipset, Chain, Checkpoint, ...
- chainsync.BlockCache: a temporary cache of blocks, to constrain resources expended
- chainsync.AncestryGraph: a datastructure to efficiently link Blocks, Tipsets, and PartialChains
- chainsync.ValidationGraph: a datastructure for efficient and secure validation of Blocks and Tipsets
Graphsync in ChainSync
ChainSync is written in terms of Graphsync. ChainSync adds blockchain- and Filecoin-specific synchronization functionality that is critical for Filecoin security.
Rate Limiting Graphsync responses (SHOULD)
When running Graphsync, Filecoin nodes must respond to Graphsync queries. Filecoin requires nodes to provide critical data structures to others; otherwise the network will not function. During ChainSync, it is in operators' interests to provide the data structures critical to validating, following, and participating in the blockchain they are on. However, this has limitations, and some level of rate limiting is critical for maintaining security in the presence of attackers who might issue large Graphsync requests to cause DOS.
We recommend the following:
- Set and enforce batch size rate limits. Force selectors to be shaped like LimitedBlockIpldSelector(blockCID, BatchSize) for a single constant BatchSize = 1000. Nodes may push for this equilibrium by only providing BatchSize objects in responses, even for pulls much larger than BatchSize. This forces subsequent pulls to be run, re-rooted appropriately, and hints to other parties that they should be requesting with that BatchSize.
- Force all Graphsync queries for blocks to be aligned along cacheable boundaries. In conjunction with a BatchSize, implementations should aim to cache the results of Graphsync queries, so that they may propagate them to others very efficiently. Aligning on certain boundaries (eg specific ChainEpoch limits) increases the likelihood that many parties in the network will request the same batches of content. Another good cacheable boundary is the entire contents of a Block (BlockHeader, Messages, Signatures, etc.).
- Maintain per-peer rate limits. Use bandwidth usage to decide whether to respond, and how much, on a per-peer basis. Libp2p already tracks bandwidth usage in each connection. This information can be used to impose rate limits in Graphsync and other Filecoin protocols.
- Detect and react to DOS: restrict operation. The safest implementations will likely detect and react to DOS attacks. Reactions could include:
  - Smaller Graphsync.BatchSize limits
  - Fewer connections to other peers
  - Rate limiting total Graphsync bandwidth
  - Assigning Graphsync bandwidth based on a peer priority queue
  - Disconnecting from, and not accepting connections from, unknown peers
  - Introspecting Graphsync requests and filtering/denying/rate-limiting suspicious ones
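A minimal sketch of the batch-size cap and a per-peer bandwidth check (the helper names and the PeerBandwidth stand-in are assumptions; a real node would hook into libp2p's bandwidth metrics):

```go
package main

import "fmt"

// BatchSize is the single constant batch size recommended above.
const BatchSize = 1000

// PeerBandwidth is a stand-in for libp2p's per-connection bandwidth tracking.
type PeerBandwidth struct {
	BytesServed int64
	Limit       int64
}

// clampBatch limits any pull to at most BatchSize objects, forcing larger
// pulls to be re-issued as subsequent, re-rooted requests.
func clampBatch(requested int) int {
	if requested > BatchSize {
		return BatchSize
	}
	return requested
}

// shouldServe imposes a per-peer rate limit based on bytes already served.
func shouldServe(p PeerBandwidth, responseSize int64) bool {
	return p.BytesServed+responseSize <= p.Limit
}

func main() {
	fmt.Println(clampBatch(5000))                                             // 1000
	fmt.Println(shouldServe(PeerBandwidth{BytesServed: 900, Limit: 1000}, 50)) // true
}
```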
Previous BlockSync protocol
Prior versions of this spec recommended a BlockSync protocol. This protocol definition is available here. Filecoin nodes are libp2p nodes, and therefore may run a variety of other protocols, including this BlockSync protocol. As with anything else in Filecoin, nodes MAY opt to use additional protocols to achieve the results. That said, nodes MUST implement the version of ChainSync as described in this spec in order to be considered implementations of Filecoin. Test suites will assume this protocol.
ChainSync State Machine
ChainSync uses the following conceptual state machine. Since this is a conceptual state machine, implementations MAY deviate from implementing precisely these states, or dividing them strictly. Implementations MAY blur the lines between the states. If so, implementations MUST ensure security of the altered protocol.
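The states and the transitions described in the sections below can be sketched as follows (a conceptual model only; the condition names are assumptions summarizing the security conditions of each state):

```go
package main

import "fmt"

// State enumerates the conceptual ChainSync states.
type State int

const (
	Init State = iota
	Bootstrap
	SyncCheckpoint
	ChainCatchup
	ChainFollow
)

// Conditions summarizes the security conditions that gate transitions.
type Conditions struct {
	LocalStateVerified  bool // INIT -> BOOTSTRAP
	BootstrapSecure     bool // BOOTSTRAP -> SYNC_CHECKPOINT / CHAIN_CATCHUP
	HaveCheckpointState bool // Blocks/StateTree for LastTrustedCheckpoint are local
	CheckpointVerified  bool // SYNC_CHECKPOINT -> CHAIN_CATCHUP
	GapsClosed          bool // CHAIN_CATCHUP -> CHAIN_FOLLOW
	FellBehind          bool // CHAIN_FOLLOW -> CHAIN_CATCHUP (partition or gap limits)
}

// Next returns the next conceptual state given the current conditions,
// or the same state if no transition condition is met.
func Next(s State, c Conditions) State {
	switch s {
	case Init:
		if c.LocalStateVerified {
			return Bootstrap
		}
	case Bootstrap:
		if c.BootstrapSecure {
			if c.HaveCheckpointState {
				return ChainCatchup
			}
			return SyncCheckpoint
		}
	case SyncCheckpoint:
		if c.CheckpointVerified {
			return ChainCatchup
		}
	case ChainCatchup:
		if c.GapsClosed {
			return ChainFollow
		}
	case ChainFollow:
		if c.FellBehind {
			return ChainCatchup
		}
	}
	return s
}

func main() {
	// Bootstrap is secure but checkpoint state is not yet local.
	fmt.Println(Next(Bootstrap, Conditions{BootstrapSecure: true}) == SyncCheckpoint) // true
}
```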
ChainSync FSM: INIT
- beginning state: no network connections, not synchronizing
- local state is loaded: internal data structures (eg chain, cache) are loaded
- LastTrustedCheckpoint is set to the latest network-wide accepted TrustedCheckpoint
- FinalityTipset is set to the finality achieved in a prior protocol run
  - Default: if no later FinalityTipset has been achieved, set FinalityTipset to LastTrustedCheckpoint

Chain State and Finality:
- In this state, the chain MUST NOT advance beyond whatever the node already has.
- No new blocks are reported to consumers.
- The chain state provided is whatever was loaded from prior executions (worst case is LastTrustedCheckpoint).

security conditions to transition out:
- local state and data structures SHOULD be verified to be correct
  - this means validating any parts of the chain or StateTree the node has, from LastTrustedCheckpoint on
- LastTrustedCheckpoint is well-known across the Filecoin network to be a true TrustedCheckpoint
  - this SHOULD NOT be verified in software; it SHOULD be verified by operators
  - Note: we ALWAYS have at least one TrustedCheckpoint, the GenesisCheckpoint.

transitions out:
- once done verifying things: move to BOOTSTRAP
ChainSync FSM: BOOTSTRAP
- network.Bootstrap(): establish connections to peers until we satisfy the security requirement
  - for better security, use many different libp2p.PeerDiscovery protocols
- BlockPubsub.Bootstrap(): establish connections to BlockPubsub peers
  - The subscription is for both peer discovery and to start selecting best heads. Listening on pubsub from the start keeps the node informed about potential head changes.
- Graphsync.Serve(): set up a Graphsync service that responds to others' queries

Chain State and Finality:
- In this state, the chain MUST NOT advance beyond whatever the node already has.
- No new blocks are reported to consumers.
- The chain state provided is whatever was loaded from prior executions (worst case is LastTrustedCheckpoint).

security conditions to transition out:
- Network connectivity MUST have reached the security level acceptable for ChainSync
- BlockPubsub connectivity MUST have reached the security level acceptable for ChainSync
- "on time" blocks MUST be arriving through BlockPubsub

transitions out:
- once bootstrap is deemed secure enough:
  - if the node does not have the Blocks or StateTree corresponding to LastTrustedCheckpoint: move to SYNC_CHECKPOINT
  - otherwise: move to CHAIN_CATCHUP
ChainSync FSM: SYNC_CHECKPOINT
While in this state:
- ChainSync is well-bootstrapped, but does not yet have the Blocks or StateTree for LastTrustedCheckpoint
- ChainSync issues Graphsync requests randomly to its peers for the Blocks and StateTree for LastTrustedCheckpoint:
  - ChainSync's counterparts in other peers MUST provide the state tree. It is only semi-rational to do so, so ChainSync may have to try many peers. Some of these requests MAY fail.

Chain State and Finality:
- In this state, the chain MUST NOT advance beyond whatever the node already has.
- No new blocks are reported to consumers.
- The chain state provided is the available Blocks and StateTree for LastTrustedCheckpoint.

Important Notes:
- ChainSync needs to fetch several blocks: the Block pointed at by LastTrustedCheckpoint, and its direct Block.Parents.
- Nodes only need hashing to validate these Blocks and StateTrees -- no block validation or state machine computation is needed.
- The initial value of LastTrustedCheckpoint is GenesisCheckpoint, but it MAY be a value later in chain history.
- LastTrustedCheckpoint enables efficient syncing by making the implicit economic consensus of chain history explicit.
- By allowing fetching of the StateTree of LastTrustedCheckpoint via Graphsync, ChainSync can yield much more efficient syncing than comparable blockchain synchronization protocols, as syncing and validation can start there.
  - Nodes DO NOT need to validate the chain from GenesisCheckpoint. LastTrustedCheckpoint MAY be a value later in chain history.
  - Nodes DO NOT need to, but MAY, sync StateTrees earlier than LastTrustedCheckpoint as well.

Pseudocode 1: a basic version of SYNC_CHECKPOINT:
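A minimal sketch of that loop follows (the Peer type and pull helper are hypothetical stand-ins for a peer set and Graphsync pulls; hash verification is assumed to happen inside the pull):

```go
package main

import (
	"errors"
	"fmt"
)

type CID string

// Peer is a hypothetical stand-in for a connected libp2p peer.
type Peer struct {
	ID      string
	HasTree bool
}

// pullStateTree simulates a Graphsync pull of the checkpoint's Blocks and
// StateTree from one peer; real pulls verify hashes as content arrives.
func pullStateTree(p Peer, checkpoint CID) error {
	if !p.HasTree {
		return errors.New("peer did not provide state tree")
	}
	return nil
}

// syncCheckpoint tries peers until one provides the Blocks and StateTree
// for LastTrustedCheckpoint. Requests MAY fail: it is only semi-rational
// for peers to serve this, so many peers may need to be tried.
func syncCheckpoint(checkpoint CID, peers []Peer) (Peer, error) {
	for _, p := range peers {
		if err := pullStateTree(p, checkpoint); err == nil {
			return p, nil
		}
	}
	return Peer{}, errors.New("no peer provided the checkpoint state")
}

func main() {
	peers := []Peer{{ID: "a"}, {ID: "b", HasTree: true}}
	p, err := syncCheckpoint("checkpointCID", peers)
	fmt.Println(p.ID, err) // b <nil>
}
```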
security conditions to transition out:
- StateTree for LastTrustedCheckpoint MUST be stored locally and verified (hashing is enough)

transitions out:
- once the node receives and verifies the complete StateTree for LastTrustedCheckpoint: move to CHAIN_CATCHUP
ChainSync FSM: CHAIN_CATCHUP
While in this state:
- ChainSync is well-bootstrapped, and has an initial trusted StateTree to start from
- ChainSync is receiving latest Blocks from BlockPubsub
- ChainSync starts fetching and validating blocks
- ChainSync has unvalidated blocks between ChainSync.FinalityTipset and ChainSync.TargetHeads

Chain State and Finality:
- In this state, the chain MUST NOT advance beyond whatever the node already has: FinalityTipset does not change.
- No new blocks are reported to consumers/users of ChainSync yet.
- The chain state provided is the available Blocks and StateTree for all available epochs, especially the FinalityTipset.
- Finality must not move forward here because there are serious attack vectors where a node can be forced to end up on the wrong fork if finality advances before validation is complete up to the block production fringe.
- Validation must advance, all the way to the block production fringe:
  - Validate the whole chain, from FinalityTipset to BestTargetHead
  - The node can reach BestTargetHead only to find out it was invalid; it then has to update BestTargetHead with the next best one, and sync to it (without having advanced FinalityTipset yet, as otherwise we may end up on the wrong fork)

security conditions to transition out:
- Gaps between ChainSync.FinalityTipset ... ChainSync.BestTargetHead have been closed:
  - All Blocks and their content MUST be fetched, stored, linked, and validated locally. This includes BlockHeaders, Messages, etc.
  - Bad heads have been expunged from ChainSync.TargetHeads. Bad heads include heads that initially seemed good but turned out invalid, or heads that ChainSync has failed to connect (ie. cannot fetch ancestors connecting back to ChainSync.FinalityTipset within a reasonable amount of time).
  - All blocks between ChainSync.FinalityTipset ... ChainSync.TargetHeads have been validated. This means all blocks before the best heads.
- Not under a temporary network partition

transitions out:
- once gaps between ChainSync.FinalityTipset ... ChainSync.TargetHeads are closed: move to CHAIN_FOLLOW
  - (Perhaps moving to CHAIN_FOLLOW when 1-2 blocks back in validation may be OK. We don't know we have the right head until we validate it, so if other heads of similar height are right/better, we won't know until then.)
ChainSync FSM: CHAIN_FOLLOW
While in this state:
- ChainSync is well-bootstrapped, and has an initial trusted StateTree to start from
- ChainSync fetches and validates blocks
- ChainSync is receiving and validating latest Blocks from BlockPubsub
- ChainSync DOES NOT have unvalidated blocks between ChainSync.FinalityTipset and ChainSync.TargetHeads
- ChainSync MUST drop back to another state if security conditions change
- Keep a set of gap measures:
  - BlockGap is the number of remaining blocks to validate between the validated blocks and BestTargetHead (ie how many epochs we need to validate to have validated BestTargetHead; does not include null blocks)
  - EpochGap is the number of epochs between the latest validated block and BestTargetHead (includes null blocks)
  - MaxBlockGap = 2: how many blocks ChainSync may fall behind before switching back to CHAIN_CATCHUP (does not include null blocks)
  - MaxEpochGap = 10: how many epochs ChainSync may fall behind before switching back to CHAIN_CATCHUP (includes null blocks)
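These gap checks, together with the partition condition below, can be sketched as (parameter values from above; the function shape is an assumption):

```go
package main

import "fmt"

const (
	MaxBlockGap = 2  // blocks behind before dropping back (excludes null blocks)
	MaxEpochGap = 10 // epochs behind before dropping back (includes null blocks)
)

// shouldDropToCatchup reports whether CHAIN_FOLLOW must fall back to
// CHAIN_CATCHUP, given the current gap measures or a detected partition.
func shouldDropToCatchup(blockGap, epochGap int, partitioned bool) bool {
	return partitioned || blockGap > MaxBlockGap || epochGap > MaxEpochGap
}

func main() {
	fmt.Println(shouldDropToCatchup(1, 3, false))  // false: within limits
	fmt.Println(shouldDropToCatchup(3, 3, false))  // true: BlockGap > MaxBlockGap
	fmt.Println(shouldDropToCatchup(0, 11, false)) // true: EpochGap > MaxEpochGap
}
```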
Chain State and Finality:
- In this state, the chain MUST advance as all the blocks up to BestTargetHead are validated.
- New blocks are finalized as they cross the finality threshold (ValidG.Heads[0].ChainEpoch - FinalityLookback)
- New finalized blocks are reported to consumers.
- The chain state provided includes the Blocks and StateTree for the Finality epoch, as well as candidate Blocks and StateTrees for unfinalized epochs.

security conditions to transition out:
- Temporary network partitions (see Detecting Network Partitions).
- Encountering gaps of > MaxBlockGap or > MaxEpochGap between the validated set and a new ChainSync.BestTargetHead

transitions out:
- if a temporary network partition is detected: move to CHAIN_CATCHUP
- if BlockGap > MaxBlockGap: move to CHAIN_CATCHUP
- if EpochGap > MaxEpochGap: move to CHAIN_CATCHUP
- if node is shut down: move to INIT
Block Fetching, Validation, and Propagation
Notes on changing TargetHeads
while syncing
TargetHeads
while syncingTargetHeads
is changing, asChainSync
must be aware of the best heads at any time. reorgs happen, and our first set of peers could've been bad, we keep discovering others.Hello protocol is good, but it's polling. Unless node is constantly polllng, won't see all the heads.
BlockPubsub
gives us the realtime view into what's actually going on.Weight can also be close between 2+ possible chains (long-forked), and
ChainSync
must select the right one (which, we may not be able to distinguish until validating all the way)
fetching + validation are strictly faster per round on average than blocks produced/block time (if they're not, will always fall behind), so we definitely catch up eventually (and even quickly). The last couple rounds can be close ("almost got it, almost got it, there").
General notes on fetching Blocks
- ChainSync selects and maintains a set of the most likely heads to be correct from among those received via BlockPubsub. As more blocks are received, the set of TargetHeads is reevaluated.
- ChainSync fetches Blocks, Messages, and StateTree through the Graphsync protocol.
- ChainSync maintains sets of Blocks/Tipsets in Graphs (see ChainSync.id).
- ChainSync gathers a list of TargetHeads from BlockPubsub, sorted by likelihood of being the best chain (see below).
- ChainSync makes requests for chains of BlockHeaders to close gaps between TargetHeads.
- ChainSync forms partial unvalidated chains of BlockHeaders, from those received via BlockPubsub, and those requested via Graphsync.
- ChainSync attempts to form fully connected chains of BlockHeaders, parting from StateTree, toward observed Heads.
- ChainSync minimizes resource expenditures to fetch and validate blocks, to protect against DOS attack vectors.
- ChainSync employs Progressive Block Validation, validating different facets at different stages of syncing.
- ChainSync delays syncing Messages until they are needed. Much of the structure of the partial chains can be checked and used to make syncing decisions without fetching the Messages.
Progressive Block Validation
- Blocks may be validated in progressive stages, in order to minimize resource expenditure.
- Validation computation is considerable, and a serious DOS attack vector. Secure implementations must carefully schedule validation and minimize the work done by pruning blocks without validating them fully.
- ChainSync SHOULD keep a cache of unvalidated blocks (ideally sorted by likelihood of belonging to the chain), and delete unvalidated blocks when they are passed by FinalityTipset, or when ChainSync is under significant resource load.
- These stages can be used partially across many blocks in a candidate chain, in order to prune out clearly bad blocks long before actually doing the expensive validation work.
Progressive Stages of Block Validation
- BV0 - Syntax: serialization, typing, value ranges.
- BV1 - Plausible Consensus: plausible miner, weight, and epoch values (eg from chain state at b.ChainEpoch - consensus.LookbackParameter).
- BV2 - Block Signature
- BV3 - Beacon entries: valid random beacon entries have been inserted in the block (see beacon entry validation).
- BV4 - ElectionProof: a valid election proof was generated.
- BV5 - WinningPoSt: correct PoSt generated.
- BV6 - Chain ancestry and finality: verify block links back to trusted chain, not prior to finality.
- BV7 - Message Signatures
- BV8 - State tree: parent tipset message execution produces the claimed state tree root and receipts.
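Running these stages in order and pruning on the first failure can be sketched as follows (the check functions are hypothetical; each stage is assumed to be cheaper than the next, so cheap checks drop bad blocks before expensive ones run):

```go
package main

import "fmt"

// Stage names the progressive validation stages BV0..BV8.
type Stage int

const (
	BV0Syntax Stage = iota
	BV1PlausibleConsensus
	BV2BlockSignature
	BV3BeaconEntries
	BV4ElectionProof
	BV5WinningPoSt
	BV6ChainAncestry
	BV7MessageSignatures
	BV8StateTree
)

// Check is a single validation stage; real checks get increasingly expensive.
type Check func() bool

// validateProgressively runs stages in order and stops at the first failure,
// returning the failed stage. Missing stages are treated as passing, so the
// stages can be applied partially across many candidate blocks.
func validateProgressively(checks map[Stage]Check) (Stage, bool) {
	for s := BV0Syntax; s <= BV8StateTree; s++ {
		if c, ok := checks[s]; ok && !c() {
			return s, false
		}
	}
	return BV8StateTree, true
}

func main() {
	checks := map[Stage]Check{
		BV0Syntax:         func() bool { return true },
		BV2BlockSignature: func() bool { return false }, // bad signature: prune early
	}
	failed, ok := validateProgressively(checks)
	fmt.Println(ok, failed == BV2BlockSignature) // false true
}
```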
Notes:
- In CHAIN_CATCHUP, if a node is receiving/fetching hundreds or thousands of BlockHeaders, validating signatures can be very expensive, and can be deferred in favor of other validation. (ie with lots of BlockHeaders coming in through the network pipe, we don't want to be bound on signature verification; other checks can help dump bad blocks on the floor faster (BV0, BV2).)
- In CHAIN_FOLLOW, we're not receiving thousands; we're receiving maybe a dozen or two dozen packets in a few seconds. We receive the CID with signature and address first (ideally fitting in one packet), and can afford to (a) check whether we already have the CID (if so, done -- cheap), or (b) if not, check whether the signature is correct before fetching the header (an expensive computation, but checking one signature is far faster than checking a ton). In practice, which one to do likely depends on miner trade-offs. We'll recommend something, but let miners decide, because one strategy or the other may be much more effective depending on their hardware, their bandwidth limitations, or their propensity to getting DOSed.
Progressive Block Propagation (or BlockSend)
In order to make block propagation more efficient, we trade off network round trips for bandwidth usage.

Motivating observations:
- Block propagation is one of the most security-critical points of the whole protocol.
- Bandwidth usage during block propagation is the biggest rate limiter for network scalability.
- The time it takes for a Block to propagate to the whole network is a critical factor in determining a secure BlockTime.
- Blocks propagating through the network should take as few sequential round trips as possible, as these round trips impose serious block time delays. However, interleaved round trips may be fine. Meaning that block.CIDs may be propagated on their own, without the header, then the header without the messages, then the messages.
- Blocks will propagate over a libp2p.PubSub. libp2p.PubSub.Messages will most likely arrive multiple times at a node. Therefore, using only the block.CID here could make this very cheap in bandwidth (more expensive in round trips).
- Blocks in a single epoch may include the same Messages, and duplicate transfers can be avoided.
- Messages propagate through their own MessagePubsub, and nodes have a significant probability of already having a large fraction of the messages in a block. Since messages are the bulk of the size of a Block, this can present great bandwidth savings.
Progressive Steps of Block Propagation
IMPORTANT NOTES:
- These steps can be effectively pipelined. The receiver is in control of what to pull, and when. It is up to them to decide when to trade off RTTs for bandwidth.
- If the sender is propagating the block at all to receiver, it is in their interest to provide the full content to receiver when asked. Otherwise the block may not get included at all.
- Lots of security assumptions here -- this needs to be hyper-verified, in both spec and code.
- sender is a filecoin node running ChainSync, propagating a block via Gossipsub (as the originator, as another peer in the network, or just a Gossipsub router).
- receiver is the local filecoin node running ChainSync, trying to get the blocks.
- For receiver to Pull things from sender, receiver must connect to sender. Usually sender is sending to receiver because of the Gossipsub propagation rules. receiver could choose to Pull from any other node they are connected to, but it is most likely sender will have the needed information; they usually will be more well-connected in the network.
Step 1. (sender) Push BlockHeader:
- sender sends block.BlockHeader to receiver via Gossipsub: bh := Gossipsub.Send(h block.BlockHeader)
- This is a light-ish object (<4KB).
- receiver receives bh. This has many fields that can be validated before pulling the messages. (See Progressive Block Validation.)
- BV0, BV1, BV2, and BV3 validation takes place before propagating bh to other nodes.
- receiver MAY receive many advertisements for each winning block in an epoch in quick succession. This is because (a) many want propagation as fast as possible, (b) many want to make those network advertisements as light as reasonable, (c) we want to enable receiver to choose who to ask it from (usually the first party to advertise it, and that's what this spec will recommend), and (d) we want to be able to fall back to asking others if that fails (fail = don't get it in 1s or so).

Step 2. (receiver) Pull MessageCids:
- upon receiving bh, receiver checks whether it already has the full block for bh.BlockCID. if not:
- receiver requests bh.MessageCids from sender: bm := Graphsync.Pull(sender, SelectAMTCIDs(b.Messages))

Step 3. (receiver) Pull Messages:
- If receiver DOES NOT already have all the messages for b.BlockCID, then:
  - If receiver has some of the messages: receiver requests the missing Messages from sender: Graphsync.Pull(sender, SelectAll(bm[3], bm[10], bm[50], ...))
  - If receiver does not have any of the messages (the default safe but expensive thing to do): receiver requests all Messages from sender: Graphsync.Pull(sender, SelectAll(bh.Messages)) (This is the largest amount of stuff.)

Step 4. (receiver) Validate Block:
- The only remaining thing to do is to complete Block Validation.
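The receiver side of steps 2-3 can be sketched as follows (the types and the local store are hypothetical stand-ins; real pulls would go through Graphsync):

```go
package main

import "fmt"

type CID string

// BlockHeader is a pared-down stand-in for the real header: just the block
// CID and the CIDs of its messages.
type BlockHeader struct {
	BlockCID    CID
	MessageCids []CID
}

// Store is a stand-in for the node's local block and message store.
type Store struct {
	Blocks   map[CID]bool
	Messages map[CID]bool
}

// missingMessages returns only the message CIDs we do not already have, so
// Step 3 pulls the minimum necessary (messages are the bulk of a block).
func missingMessages(s Store, bh BlockHeader) []CID {
	var missing []CID
	for _, c := range bh.MessageCids {
		if !s.Messages[c] {
			missing = append(missing, c)
		}
	}
	return missing
}

// handleHeader implements the receiver decision: stop if we already have the
// full block; otherwise report which messages still need to be pulled.
func handleHeader(s Store, bh BlockHeader) (pull []CID, alreadyHave bool) {
	if s.Blocks[bh.BlockCID] {
		return nil, true // Step 2: already have the full block, done (cheap)
	}
	return missingMessages(s, bh), false // Step 3: pull only what's missing
}

func main() {
	s := Store{Blocks: map[CID]bool{}, Messages: map[CID]bool{"m1": true}}
	pull, have := handleHeader(s, BlockHeader{BlockCID: "b1", MessageCids: []CID{"m1", "m2"}})
	fmt.Println(have, pull) // false [m2]
}
```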
Calculations
Security Parameters
- Peers >= 32 (direct connections)
- ideally Peers >= {64, 128}
Pubsub Bandwidth
These bandwidth calculations are used to motivate choices in ChainSync.

If you imagine that you will receive the header once per gossipsub peer (or, if lucky, half of them), and that there are EC.E_LEADERS = 10 blocks per round, then we're talking about the difference between:

2MB in <5s may not be worth saving -- and maybe gossipsub can be much better about suppressing duplicates.
Notes (TODO: move elsewhere)
Checkpoints
- A checkpoint is the CID of a block (not a tipset list of CIDs, or a StateTree).
- The reason a block is OK is that it uniquely identifies a tipset.
- Using tipsets directly would make checkpoints harder to communicate. We want to make checkpoints a single hash, as short as we can have it. They will be shared in tweets, URLs, emails, printed into newspapers, etc. Compactness, ease of copy-paste, etc. matter.
- We'll make human-readable lists of checkpoints, and making "lists of lists" is more annoying.
- When we have EC.E_PARENTS > 5 or = 10, tipsets will get annoyingly large.
- The big quirk/weirdness with blocks is that the block must also be in the chain. (If you relaxed that constraint, you could end up in a weird case where a checkpoint isn't in the chain, which is weird and violates assumptions.)
Bootstrap chain stub
The mainnet filecoin chain will need to start with a small chain stub of blocks.
We must include some data in different blocks.
- We do need a genesis block -- we derive randomness from the ticket there. Rather than special-casing, it is easier and less complex to ensure a well-formed chain always, including at the beginning.
- A lot of code expects lookbacks, especially actor code. Rather than introducing a bunch of special-case logic for what happens ostensibly once in network history (special-case logic which adds complexity and likelihood of problems), it is easiest to assume the chain is always at least X blocks long, and that the system lookback parameters are all fine and don't need to be scaled at the beginning of the network's history.
PartialGraph
The PartialGraph of blocks.

Q: Is the graph necessarily connected, or is this just a bag of blocks, with each disconnected subgraph being reported in heads/tails?

A: The latter. The partial graph is a DAG fragment -- including disconnected components. (A visual example shows four example PartialGraphs, with their Heads and Tails; note they aren't tipsets.)