Adopt possibly incomplete NeoFS SEARCH results in NeoFSBlockFetcher and upload-bin CLI command #1404

Open
opened 2025-12-28 17:16:19 +00:00 by sami · 8 comments
Owner

Originally created by @AnnaShaleva on GitHub (Oct 24, 2024).

Current Behavior

Some blocks are uploaded to NeoFS, then the script restarts. After the restart it begins uploading from block 0 instead of from the latest incomplete batch:

2024-10-23 14:46:48.603	Chain block height: 6231784
2024-10-23 14:47:00.828	Processing batch from 0 to 9999
2024-10-23 14:47:00.828	First block of latest incomplete batch uploaded to NeoFS container: 0
2024-10-23 14:50:56.510	Processing batch from 10000 to 19999
2024-10-23 14:50:56.510	Successfully uploaded batch of blocks: from 0 to 9999
2024-10-23 14:55:19.178	Processing batch from 20000 to 29999
2024-10-23 14:55:19.178	Successfully uploaded batch of blocks: from 10000 to 19999
2024-10-23 15:00:07.874	Processing batch from 30000 to 39999
2024-10-23 15:00:07.874	Successfully uploaded batch of blocks: from 20000 to 29999
2024-10-23 15:03:35.567	Processing batch from 40000 to 49999
2024-10-23 15:03:35.567	Successfully uploaded batch of blocks: from 30000 to 39999
2024-10-23 15:04:49.432	Chain block height: 6231850
2024-10-23 15:07:34.927	Processing batch from 0 to 9999
2024-10-23 15:07:34.927	First block of latest incomplete batch uploaded to NeoFS container: 0
2024-10-23 15:12:00.468	Processing batch from 10000 to 19999
2024-10-23 15:12:00.468	Successfully uploaded batch of blocks: from 0 to 9999
2024-10-23 15:16:50.336	Processing batch from 20000 to 29999
2024-10-23 15:16:50.336	Successfully uploaded batch of blocks: from 10000 to 19999
2024-10-23 15:21:10.256	Processing batch from 30000 to 39999
2024-10-23 15:21:10.256	Successfully uploaded batch of blocks: from 20000 to 29999
2024-10-23 15:25:27.258	Processing batch from 40000 to 49999
2024-10-23 15:25:27.258	Successfully uploaded batch of blocks: from 30000 to 39999
2024-10-23 15:30:14.827	Processing batch from 50000 to 59999

The pattern repeats.

Expected Behavior

Re-upload must start from the latest incomplete batch.

Possible Solution

Find the problem in fetchLatestMissingBlockIndex and fix it.


@AnnaShaleva commented on GitHub (Oct 24, 2024):

One more example:

2024-10-24 06:10:41.740	Successfully uploaded batch of blocks: from 1960000 to 1969999
2024-10-24 06:14:17.712	Processing batch from 1980000 to 1989999	
2024-10-24 06:14:17.712	Successfully uploaded batch of blocks: from 1970000 to 1979999	
2024-10-24 06:18:11.192	Processing batch from 1990000 to 1999999	
2024-10-24 06:18:11.192	Successfully uploaded batch of blocks: from 1980000 to 1989999	
2024-10-24 06:21:47.636	Processing batch from 2000000 to 2009999	
2024-10-24 06:21:47.636	Successfully uploaded batch of blocks: from 1990000 to 1999999	
2024-10-24 06:25:07.098	Processing batch from 2010000 to 2019999
2024-10-24 06:25:07.098	Successfully uploaded batch of blocks: from 2000000 to 2009999	
2024-10-24 06:29:13.781	Processing batch from 2020000 to 2029999	
2024-10-24 06:29:13.781	Successfully uploaded batch of blocks: from 2010000 to 2019999	
2024-10-24 06:33:18.434	Processing batch from 2030000 to 2039999	
2024-10-24 06:33:18.434	Successfully uploaded batch of blocks: from 2020000 to 2029999	
2024-10-24 06:36:05.190	upload error: failed to initiate object upload: connection: no healthy client
2024-10-24 06:37:08.145	Chain block height: 6235291	
2024-10-24 06:54:58.627	Processing batch from 0 to 9999	
2024-10-24 06:54:58.627	First block of latest incomplete batch uploaded to NeoFS container: 0	
2024-10-24 07:00:32.594	Processing batch from 10000 to 19999	
2024-10-24 07:00:32.594	Successfully uploaded batch of blocks: from 0 to 9999	
2024-10-24 07:05:19.672	Processing batch from 20000 to 29999	
2024-10-24 07:05:19.672	Successfully uploaded batch of blocks: from 10000 to 19999	
2024-10-24 07:08:32.479	Processing batch from 30000 to 39999	
2024-10-24 07:08:32.479	Successfully uploaded batch of blocks: from 20000 to 29999	
2024-10-24 07:10:25.644	Processing batch from 40000 to 49999	
2024-10-24 07:10:25.644	Successfully uploaded batch of blocks: from 30000 to 39999	
2024-10-24 07:10:48.870	upload error: failed to initiate object upload: connection: no healthy client	
2024-10-24 07:11:50.608	Chain block height: 6235419

@AnnaShaleva commented on GitHub (Oct 24, 2024):

But sometimes it works differently (logs are from the same mainnet service):

2024-10-24 06:37:08.145	Chain block height: 6235291	
2024-10-24 06:54:58.627	Processing batch from 0 to 9999	
2024-10-24 06:54:58.627	First block of latest incomplete batch uploaded to NeoFS container: 0
2024-10-24 07:00:32.594	Processing batch from 10000 to 19999	
2024-10-24 07:00:32.594	Successfully uploaded batch of blocks: from 0 to 9999	
2024-10-24 07:05:19.672	Processing batch from 20000 to 29999	
2024-10-24 07:05:19.672	Successfully uploaded batch of blocks: from 10000 to 19999	
2024-10-24 07:08:32.479	Processing batch from 30000 to 39999	
2024-10-24 07:08:32.479	Successfully uploaded batch of blocks: from 20000 to 29999
2024-10-24 07:10:25.644	Processing batch from 40000 to 49999	
2024-10-24 07:10:25.644	Successfully uploaded batch of blocks: from 30000 to 39999	
2024-10-24 07:10:48.870	upload error: failed to initiate object upload: connection: no healthy client	
2024-10-24 07:11:50.608	Chain block height: 6235419	
2024-10-24 07:18:58.201	failed to fetch the latest missing block index from container: search of index files failed for batch with indexes from 2480000 to 2489999: failed to initiate object search: session: init session: status: code = 1024 message = connection to the RPC node has been lost	
2024-10-24 07:20:01.123	Chain block height: 6235449	
2024-10-24 07:29:07.555	Processing batch from 40000 to 49999	
2024-10-24 07:29:07.555	First block of latest incomplete batch uploaded to NeoFS container: 40000	
2024-10-24 07:29:45.742	Processing batch from 50000 to 59999	
2024-10-24 07:29:45.742	Successfully uploaded batch of blocks: from 40000 to 49999	
2024-10-24 07:30:14.433	Processing batch from 60000 to 69999	
2024-10-24 07:30:14.433	Successfully uploaded batch of blocks: from 50000 to 59999	
2024-10-24 07:31:04.556	Processing batch from 70000 to 79999	
2024-10-24 07:31:04.556	Successfully uploaded batch of blocks: from 60000 to 69999	
2024-10-24 07:31:42.554	upload error: failed to initiate object upload: connection: no healthy client
2024-10-24 07:32:44.920	Chain block height: 6235496

@AliceInHunterland commented on GitHub (Oct 24, 2024):

This may be connected to #3615.


@AnnaShaleva commented on GitHub (Oct 24, 2024):

Well, of course it's a bug in fetchLatestMissingBlockIndex only if our batches are full and don't have gaps.

> can be connected https://github.com/nspcc-dev/neo-go/issues/3615

Check the data currently uploaded for N3 mainnet and see whether there are gaps in the batches.
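A gap check of this kind could look roughly like the sketch below. It assumes the block indexes found in the container for one batch have already been collected into a slice; `missingInBatch` is a hypothetical helper, not an existing neo-go function.

```go
package main

// missingInBatch returns the block indexes in [start, start+size) that do
// not appear in found. The found slice may be unsorted and may contain
// duplicates or indexes outside the batch; those are simply ignored.
func missingInBatch(start, size uint32, found []uint32) []uint32 {
	seen := make(map[uint32]struct{}, len(found))
	for _, idx := range found {
		seen[idx] = struct{}{}
	}
	var missing []uint32
	for i := uint64(start); i < uint64(start)+uint64(size); i++ {
		if _, ok := seen[uint32(i)]; !ok {
			missing = append(missing, uint32(i))
		}
	}
	return missing
}
```

An empty result means the batch is full; any returned index marks a gap that would make a "latest missing block" search land earlier than expected.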


@AnnaShaleva commented on GitHub (Oct 24, 2024):

Checked: this depends on the resolution of #3615.


@AnnaShaleva commented on GitHub (Oct 24, 2024):

To resolve this issue, we need to adopt the SEARCH completeness marker once https://github.com/nspcc-dev/neofs-node/issues/2721 is implemented.
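How such a marker might be consumed is sketched here purely as an illustration. The `searchResult` type and its `Complete` field are assumptions made for this example; the actual shape of the marker depends on how neofs-node#2721 is implemented. The point is that an empty or short result proves absence only when the answer is known to be complete.

```go
package main

// searchResult is a hypothetical shape for a SEARCH response that carries a
// completeness marker; the real field names may differ.
type searchResult struct {
	IDs      []string // object IDs returned by the search
	Complete bool     // true if the result is known to be complete
}

type batchState int

const (
	batchUploaded batchState = iota // all expected objects were found
	batchMissing                    // complete answer, objects are really absent
	batchUnknown                    // incomplete answer, absence is not proven
)

// classifyBatch decides what a SEARCH result says about a batch that should
// contain expected objects.
func classifyBatch(res searchResult, expected int) batchState {
	switch {
	case len(res.IDs) >= expected:
		return batchUploaded
	case res.Complete:
		return batchMissing
	default:
		return batchUnknown
	}
}
```

Under this scheme only `batchMissing` would justify re-uploading a batch; `batchUnknown` would call for a retry or an error.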


@AnnaShaleva commented on GitHub (Oct 24, 2024):

See also https://github.com/nspcc-dev/neofs-node/issues/2721#issuecomment-2435121374: for situations where all SNs from the REP policy are dead, we need to shut down BlockFetcher, because it won't be able to receive even block OIDs and proceed with adding blocks to the chain.


@AnnaShaleva commented on GitHub (Nov 11, 2024):

Within the scope of this issue we need to revert the changes made by https://github.com/nspcc-dev/neo-go/issues/3670 and fall back from per-object SEARCH to SEARCH over a range of objects.
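To illustrate the difference in request count, the sketch below splits a span of block indexes into ranges so that each range can be covered by a single SEARCH instead of one SEARCH per block. The `blockRange` type and `searchRanges` function are hypothetical; how a range is turned into actual NeoFS search filters is left out.

```go
package main

// blockRange is an inclusive range of block indexes to be covered by one
// SEARCH request.
type blockRange struct{ First, Last uint32 }

// searchRanges splits [first, last] into consecutive ranges of at most step
// blocks each. It returns nil for an empty span or a zero step.
func searchRanges(first, last, step uint32) []blockRange {
	if step == 0 || first > last {
		return nil
	}
	var ranges []blockRange
	// uint64 arithmetic keeps the loop safe near the top of the uint32 range.
	for lo := uint64(first); lo <= uint64(last); lo += uint64(step) {
		hi := lo + uint64(step) - 1
		if hi > uint64(last) {
			hi = uint64(last)
		}
		ranges = append(ranges, blockRange{First: uint32(lo), Last: uint32(hi)})
	}
	return ranges
}
```

For a 10000-block batch this means one request per batch rather than 10000 requests, at the cost of having to trust that the ranged result is complete.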

Reference
nspcc-dev/neo-go#1404