mirror of
https://github.com/nspcc-dev/neofs-node.git
synced 2026-03-01 04:29:10 +00:00
Revise blockchain height check on startup #1058
Originally created by @cthulhu-rider on GitHub (Jul 6, 2023).
Inner Ring and Storage nodes check that the height of the underlying blockchain is greater than or equal to the latest encountered one, which is optionally persisted in local storage (config and config respectively).
The app requests the current height by RPC, compares the result with the persisted one, and fails if the local value is greater.
Which chain is stuck?
According to @aprasolova's experience, we can encounter the following error in the log:
The message does not show which chain (main or side) is stuck. It's proposed to reflect the blockchain kind in this log message.
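One possible shape for such a message, as a sketch only. The `chainKind` type, the `stuckChainError` type and the message wording are assumptions made for this example; they do not reproduce the actual log line or any existing neofs-node type.

```go
package main

import "fmt"

// chainKind is a hypothetical label telling the two chains apart.
type chainKind string

const (
	mainChain chainKind = "main"
	sideChain chainKind = "side"
)

// stuckChainError carries the chain kind together with both heights so that
// the log line says which chain is behind.
type stuckChainError struct {
	kind      chainKind
	current   uint32
	persisted uint32
}

func (e stuckChainError) Error() string {
	return fmt.Sprintf("%s chain is behind: current height %d, persisted height %d",
		e.kind, e.current, e.persisted)
}

// checkChainHeight returns a stuckChainError when the chain of the given kind
// reports a height lower than the persisted one.
func checkChainHeight(kind chainKind, current, persisted uint32) error {
	if current < persisted {
		return stuckChainError{kind: kind, current: current, persisted: persisted}
	}
	return nil
}
```

A typed error also lets callers inspect the kind programmatically instead of parsing the message.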
Await or not await
It's possible that the chain node is still synchronizing its state and hasn't reached the up-to-date state yet. In this case the NeoFS node fails immediately. Instead, it could wait within some context (global or with some sane deadline) and free the admin from having to periodically restart the app.
btw, in the code the check function is called awaitHeight, a name that implies a background wait, but in fact it does not wait.
Maybe there are other signs that would allow NeoFS to understand what exactly is happening at the moment and distinguish between a freeze and synchronization, for example If so, then we could improve the behavior and admin UX. @AnnaShaleva @roman-khimov
Blockchain reset
If the chain was reset and the admin restarts the node, it will fail until the fresh chain reaches a height not less than the persisted one. In this case it's not obvious to the admin that the local state should be reset too. As a possible solution, we could also take into account the blockchain network magic, but it may also be left untouched by the reset.
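A sketch of the network magic idea, under the assumption that the magic is persisted next to the height. `chainState` and `reconcile` are hypothetical names for this example. As noted above, a reset that keeps the same magic is not detected by this approach.

```go
package main

import "fmt"

// chainState is a hypothetical persisted record: the network magic is stored
// next to the last seen height.
type chainState struct {
	Magic  uint32
	Height uint32
}

// reconcile compares the saved state with what the RPC node reports.
// A different magic is treated as a different (reset) network, so the saved
// height is discarded. With the same magic the usual height rule applies.
func reconcile(saved chainState, magic, height uint32) (chainState, error) {
	if saved.Magic != magic {
		return chainState{Magic: magic, Height: height}, nil
	}
	if height < saved.Height {
		return saved, fmt.Errorf("chain height %d is behind persisted height %d (magic %d)",
			height, saved.Height, magic)
	}
	return chainState{Magic: magic, Height: height}, nil
}
```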
@carpawell commented on GitHub (Jul 6, 2023):
There is some detail about it. It did wait in #798, but also stopped waiting in the same PR. So mb @532910 has some info about it (and the issue in general).
@cthulhu-rider commented on GitHub (Jul 6, 2023):
i also started to think about connection switch in multi-RPC setting. @carpawell ur an expert of this currently, pls explain how this reconn could affect our state sync
@roman-khimov commented on GitHub (Jul 19, 2023):
This block counter can't be perfect since local state can be dropped at any time. But it helps in some ways, so:
No 100% reliable way to do that. But the StartWhenSynchronized RPC option helps somewhat: at least the node is supposed to be up to date when it starts serving RPC (so this problem shouldn't happen at all).
Just forget this for now.