Seed node address can be blacklisted #1521

Closed
opened 2025-12-28 17:16:44 +00:00 by sami · 8 comments
Owner

Originally created by @AnnaShaleva on GitHub (May 19, 2025).

Originally assigned to: @AnnaShaleva on GitHub.

Current Behavior

Node height stalls because the node is unable to fetch new blocks. getpeers shows that all known peers are blacklisted:

anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "getpeers", "params": ["0x398dbf659b3ea356845aafdc54227833550855e8956f5a9eae338c025dfa8a77"] }' http://192.168.43.94:7112 | json_pp 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   447  100   314  100   133   4535   1920 --:--:-- --:--:-- --:--:--  6478
{
   "id" : 1,
   "jsonrpc" : "2.0",
   "result" : {
      "bad" : [
         {
            "address" : "127.0.0.1",
            "port" : 20333
         },
         {
            "address" : "192.168.43.94",
            "port" : 20333
         },
         {
            "address" : "192.168.43.92",
            "port" : 20333
         },
         {
            "address" : "192.168.43.95",
            "port" : 20333
         },
         {
            "address" : "192.168.43.93",
            "port" : 20333
         },
         {
            "address" : "127.0.0.1",
            "port" : 57392
         }
      ],
      "connected" : [],
      "unconnected" : []
   }
}

It's known that some of these peers are seed nodes.

Expected Behavior

At least seed nodes must not be blacklisted.

Possible Solution

Find the bug in this code, fix it:
https://github.com/nspcc-dev/neo-go/blob/66449003b3cd3e69c18fc4b03d8185407f25744d/pkg/network/discovery.go#L174-L181

sami 2025-12-28 17:16:44 +00:00
  • closed this issue
  • added the bug, S4, I4, U0 labels
Author
Owner

@AnnaShaleva commented on GitHub (May 19, 2025):

IR node config:

root@m181lt-ir-2:~# cat /mnt/metadata/local_state/config/config-ir.yaml

# Logger section
logger:
  level: warn # Logger level: one of "debug", "info" (default), "warn", "error", "dpanic", "panic", "fatal"

# Wallet settings
wallet:
  path: /morph/local_state/config/wallet-ir.json  # Path to NEP-6 NEO wallet file
  address: NYsRrQeVmqREiT59i9QmRX3M1nwU3vC46Y  # Account address in the wallet; ignore to use default address
  password: one  # Account password in the wallet

# Profiler section
pprof:
  enabled: false
  address: :6060  # Endpoint for application pprof profiling; disabled by default
  shutdown_timeout: 30s  # Timeout for profiling HTTP server graceful shutdown

# Application metrics section
prometheus:
  enabled: true
  address: localhost:9090  # Endpoint for application prometheus metrics; disabled by default
  shutdown_timeout: 30s  # Timeout for metrics HTTP server graceful shutdown

# Toggling the sidechain-only mode
without_mainnet: true

# Neo main chain RPC settings
mainnet:
  endpoints: # List of websocket RPC endpoints in mainchain; ignore if mainchain is disabled
    - ws://main-chain:30333/ws

# Neo side chain RPC settings
fschain:
  dial_timeout: 5s # Timeout for RPC client connection to sidechain
  reconnections_number: 5  # number of reconnection attempts
  reconnections_delay: 5s  # time delay b/w reconnection attempts
  validators: # List of hex-encoded 33-byte public keys of sidechain validators to vote for at application startup
    - 0244e23d9ac8f91fedcaf487c8249bff2eda17c898298d22794c0b8929a047a24d
    - 02d014f67915bacc5bf9890d786f29a27c721cd3d6de008d19e7ddc1bf08bcf86a
    - 03b2194f7e0cf23c65509ae2ee5caf3b1d27c449cd36ea545d93a0aa3e871ff4ef
    - 03dcc97069aa4bdecb05a47c68fe35a221ab2673b0be38469a3ad5ac9328339329
  consensus:
    magic: 15405
    committee:
      - 0244e23d9ac8f91fedcaf487c8249bff2eda17c898298d22794c0b8929a047a24d
      - 02d014f67915bacc5bf9890d786f29a27c721cd3d6de008d19e7ddc1bf08bcf86a
      - 03b2194f7e0cf23c65509ae2ee5caf3b1d27c449cd36ea545d93a0aa3e871ff4ef
      - 03dcc97069aa4bdecb05a47c68fe35a221ab2673b0be38469a3ad5ac9328339329
    storage:
      type: boltdb
      path: /morph/local_state/fschain.bolt
    time_per_block: 1s
    max_traceable_blocks: 200000
    seed_nodes:
      - :20333
      - 192.168.43.95:7111
      - 192.168.43.93:7111
      - 192.168.43.92:7111
    rpc:
      max_gas_invoke: 200
      max_websocket_clients: 512
      session_pool_size: 512
      listen:
        - ":30333"
    p2p_notary_request_payload_pool_size: 5000
    p2p:
      dial_timeout: 3s
      proto_tick_interval: 2s
      listen:
        - ":20333"
      peers:
        min: 2
        max: 9
        attempts: 20
      ping:
        interval: 3s
        timeout: 9s
    set_roles_in_genesis: true # Optional flag for designating P2PNotary and NeoFSAlphabet roles to all
      # genesis block validators. The validators depend on 'committee' and, if set, 'validators_history'.
      # Must be 'true' or 'false'.

fschain_autodeploy: true # Optional flag to run auto-deployment procedure of the FS chain. By default,
  # the chain is expected to be deployed/updated in the background (e.g. via NeoFS ADM tool).
  # If set, must be 'true' or 'false'.

# Internal worker pool configurations
workers:
  netmap: 100 # Default is 10 which is not enough for large clusters.

nns:
  system_email: ops@nspcc.io

# Network time settings
timers:
  stop_estimation:
    mul: 1  # Multiplier in x/y relation of when to stop basic income estimation within the epoch
    div: 4  # Divider in x/y relation of when to stop basic income estimation within the epoch
  collect_basic_income:
    mul: 1  # Multiplier in x/y relation of when to start basic income asset collection within the epoch
    div: 2  # Divider in x/y relation of when to start basic income asset collecting within the epoch
  distribute_basic_income:
    mul: 3  # Multiplier in x/y relation of when to start basic income asset distribution within the epoch
    div: 4  # Divider in x/y relation of when to start basic income asset distribution within the epoch

# Storage node GAS emission settings
emit:
  storage:
    amount: 1000000000  # Fixed8 value of sidechain GAS emitted to all storage nodes once per GAS emission cycle; disabled by default

# Audit settings
audit:
  pdp:
    max_sleep_interval: 100ms  # Maximum timeout between object.RangeHash requests to the storage node

# Settlement settings
settlement:
  basic_income_rate: 100000000  # Optional: override basic income rate value from network config; applied only in debug mode
  audit_fee: 100000  # Optional: override audit fee value from network config; applied only in debug mode

control:
  grpc:
    endpoint: ":16512"

experimental:
  chain_meta_data: true

node:
  persistent_state:
    path: /morph/local_state/neofs-ir-state # Path to application state file

@AnnaShaleva commented on GitHub (May 19, 2025):

So one side of the problem is that the seed list contains 192.168.43.95:7111, 192.168.43.93:7111 and 192.168.43.92:7111, whereas the peers marked as "bad" have the same addresses but different ports, e.g. 192.168.43.93:20333. This is because the nodes are running inside Docker containers; here are the port mappings:

d4a4fb518cbe   docker.morphbits.io/morph/console:0.18.1   "/morph/bin/init-mor…"   38 hours ago   Up 38 hours   0.0.0.0:7140->5100/tcp, [::]:7140->5100/tcp, 0.0.0.0:7120->8080/tcp, [::]:7120->8080/tcp, 0.0.0.0:7130->9080/tcp, [::]:7130->9080/tcp, 0.0.0.0:80->9100/tcp, [::]:80->9100/tcp, 0.0.0.0:7111->20333/tcp, [::]:7111->20333/tcp, 0.0.0.0:7112->30333/tcp, [::]:7112->30333/tcp   morph

So technically the node doesn't treat 192.168.43.93:20333 as the seed node 192.168.43.93:7111, because we use raw string comparison here:
https://github.com/nspcc-dev/neo-go/blob/66449003b3cd3e69c18fc4b03d8185407f25744d/pkg/network/discovery.go#L174
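
The mismatch can be illustrated with a small Go sketch. This is not the neo-go code: isSeedExact and isSeedByHost are hypothetical helpers written for this example, and the seed list is copied from the seed_nodes section of the IR config above. Host-only matching is shown only as a contrast; it is an assumption, not a proposed fix, since it would treat any port on a seed host as the seed.

```go
package main

import (
	"fmt"
	"net"
)

// seeds mirrors the non-local seed_nodes entries from the IR config.
var seeds = []string{
	"192.168.43.95:7111",
	"192.168.43.93:7111",
	"192.168.43.92:7111",
}

// isSeedExact models a raw string comparison against the seed list.
// Hypothetical helper, not part of neo-go.
func isSeedExact(addr string) bool {
	for _, s := range seeds {
		if s == addr {
			return true
		}
	}
	return false
}

// isSeedByHost compares only the host part and ignores the port.
// Hypothetical helper, shown for contrast only.
func isSeedByHost(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false
	}
	for _, s := range seeds {
		seedHost, _, err := net.SplitHostPort(s)
		if err == nil && seedHost == host {
			return true
		}
	}
	return false
}

func main() {
	// Address as seen by peers: the container port, not the mapped one.
	addr := "192.168.43.93:20333"
	fmt.Println(isSeedExact(addr))  // false
	fmt.Println(isSeedByHost(addr)) // true
}
```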


@roman-khimov commented on GitHub (May 19, 2025):

announcedPort in https://github.com/nspcc-dev/neo-go/blob/master/docs/node-configuration.md#p2p-configuration?


@roman-khimov commented on GitHub (May 19, 2025):

But why does it prevent connection to seeds?


@AnnaShaleva commented on GitHub (May 19, 2025):

That's probably not the real cause of the bug. I'm looking at the code, and it looks like 192.168.43.93:20333 should never be included in the list of bad peers, but somehow it is.


@AnnaShaleva commented on GitHub (May 19, 2025):

A side fact, just for the record. The same situation is reproduced on other CNs in this network, although not all CNs are marked as "bad", e.g. here's the response from CN1:

anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ curl -d '{ "jsonrpc": "2.0", "id": 1, "method": "getpeers", "params": ["0x398dbf659b3ea356845aafdc54227833550855e8956f5a9eae338c025dfa8a77"] }' http://192.168.43.92:7112 | json_pp 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   582  100   449  100   133   6537   1936 --:--:-- --:--:-- --:--:--  8558
{
   "id" : 1,
   "jsonrpc" : "2.0",
   "result" : {
      "bad" : [
         {
            "address" : "127.0.0.1",
            "port" : 55108
         },
         {
            "address" : "127.0.0.1",
            "port" : 20333
         },
         {
            "address" : "192.168.43.92",
            "port" : 20333
         },
         {
            "address" : "192.168.43.94",
            "port" : 20333
         }
      ],
      "connected" : [
         {
            "address" : "192.168.43.95",
            "lastknownheight" : 66067,
            "port" : 20333,
            "useragent" : "/NEO-GO:/"
         },
         {
            "address" : "192.168.43.93",
            "lastknownheight" : 66067,
            "port" : 20333,
            "useragent" : "/NEO-GO:/"
         }
      ],
      "unconnected" : [
         {
            "address" : "192.168.43.95",
            "port" : 20333
         }
      ]
   }
}

Also note, there's another divergence from expected behaviour: peer 192.168.43.95:20333 is marked as both "connected" and "unconnected".
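
A check for this kind of overlap can be scripted. The Go sketch below parses a trimmed copy of the "result" object from the CN1 response and reports addresses that appear in both the "connected" and "unconnected" lists. The type and function names (peer, getPeersResult, overlapping, parseResult) are made up for this example and are not neo-go types; the numeric port field is an assumption based on the output shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// peer holds the fields of a getpeers entry that matter for this check.
type peer struct {
	Address string `json:"address"`
	Port    uint16 `json:"port"`
}

// getPeersResult models the "result" object of the getpeers response.
type getPeersResult struct {
	Bad         []peer `json:"bad"`
	Connected   []peer `json:"connected"`
	Unconnected []peer `json:"unconnected"`
}

// overlapping returns the sorted "host:port" keys present in both lists.
func overlapping(a, b []peer) []string {
	seen := make(map[string]bool, len(a))
	for _, p := range a {
		seen[fmt.Sprintf("%s:%d", p.Address, p.Port)] = true
	}
	var out []string
	for _, p := range b {
		key := fmt.Sprintf("%s:%d", p.Address, p.Port)
		if seen[key] {
			out = append(out, key)
		}
	}
	sort.Strings(out)
	return out
}

// cn1Result is a trimmed copy of the CN1 "result" object from this comment.
const cn1Result = `{
  "bad": [{"address": "192.168.43.94", "port": 20333}],
  "connected": [
    {"address": "192.168.43.95", "lastknownheight": 66067, "port": 20333, "useragent": "/NEO-GO:/"},
    {"address": "192.168.43.93", "lastknownheight": 66067, "port": 20333, "useragent": "/NEO-GO:/"}
  ],
  "unconnected": [{"address": "192.168.43.95", "port": 20333}]
}`

// parseResult decodes the JSON text into getPeersResult.
func parseResult(raw string) (getPeersResult, error) {
	var res getPeersResult
	err := json.Unmarshal([]byte(raw), &res)
	return res, err
}

func main() {
	res, err := parseResult(cn1Result)
	if err != nil {
		panic(err)
	}
	fmt.Println(overlapping(res.Connected, res.Unconnected)) // [192.168.43.95:20333]
}
```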


@AnnaShaleva commented on GitHub (May 19, 2025):

> it looks like 192.168.43.93:20333 should never be included in the list of bad peers, but somehow it's included.

OK, this is explainable and reproducible on privnet: when a CN disconnects from the network, its peers mark its connection address (192.168.43.93:20333) as "unconnected". Each peer then iterates over its unconnectedAddrs map and eventually tries to connect to this address. The attempt fails connRetries times (since the CN is offline), after which the peer marks this address as "bad".
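
The path from "unconnected" to "bad" can be modelled roughly as follows. This Go sketch is a simplified model written for illustration, not the neo-go implementation: only the names unconnectedAddrs and connRetries come from the comment, while the discoverer type, its methods, the badAddrs field and the retry value of 3 are assumptions.

```go
package main

import "fmt"

// connRetries is the number of failed attempts after which an address
// is marked bad. The value 3 is an assumption for this sketch.
const connRetries = 3

// discoverer is a hypothetical, simplified model of address bookkeeping.
type discoverer struct {
	unconnectedAddrs map[string]int  // address -> remaining attempts
	badAddrs         map[string]bool // addresses that ran out of attempts
}

func newDiscoverer() *discoverer {
	return &discoverer{
		unconnectedAddrs: make(map[string]int),
		badAddrs:         make(map[string]bool),
	}
}

// unregisterConnected models a peer going offline: its connection address
// goes back to the unconnected pool with a full retry budget.
func (d *discoverer) unregisterConnected(addr string) {
	d.unconnectedAddrs[addr] = connRetries
}

// registerFailure records one failed dial; once the budget is spent,
// the address moves from the unconnected pool to the bad list.
func (d *discoverer) registerFailure(addr string) {
	left, ok := d.unconnectedAddrs[addr]
	if !ok {
		return
	}
	left--
	if left <= 0 {
		delete(d.unconnectedAddrs, addr)
		d.badAddrs[addr] = true
		return
	}
	d.unconnectedAddrs[addr] = left
}

func main() {
	d := newDiscoverer()
	addr := "192.168.43.93:20333"

	d.unregisterConnected(addr) // the CN drops off the network
	for i := 0; i < connRetries; i++ {
		d.registerFailure(addr) // every reconnection attempt fails
	}
	fmt.Println(d.badAddrs[addr]) // true
}
```

Nothing in this model knows about seeds, so an address reached through the container port ends up in the bad list like any other peer.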

But it doesn't explain why the node stops attempting to connect to seed nodes.


@AnnaShaleva commented on GitHub (May 19, 2025):

Goroutines dump of aborted neofs-ir process:
neofs-ir_ABRT_dump.md (https://github.com/user-attachments/files/20299243/neofs-ir_ABRT_dump.md)

Reference
nspcc-dev/neo-go#1521