Cannot PUT object to a node that has no space left even though the policy allows it #738

Closed
opened 2025-12-28 17:20:32 +00:00 by sami · 13 comments

Originally created by @anikeev-yadro on GitHub (Sep 9, 2022).

Expected Behavior

When we PUT an object to a node that has no space left, the object should be uploaded to another node through the internal interface.

Current Behavior

When we PUT an object to a node that has no space left, we get the error "no space left".

Steps to Reproduce (for bugs)

  1. Fill the device that holds the shards on node3 so that no space is left.
admin@vedi:~$ df -h
/dev/vdb            30G          28G     0          100% /srv
  2. Create a container with a policy that allows storing objects only on nodes 1, 2 and 4 (see the network map below).
anikeev@NB-1670:~/neofs/neofs-testcases$ sudo ./neofs-cli --rpc-endpoint 172.26.160.204:8080 --wallet ../wallet.json container create --name f_test_rep2_RU_FI --policy "REP 2 in NODES SELECT 2 FROM C AS NODES FILTER Country eq “Russia” OR Country eq “Finland” as C" --await
[sudo] password for anikeev:
Enter password >
container ID: raW3aUxPDw56aQ8xbryUuihKjj41KyPCJRcJymEfu1Q
awaiting...
container has been persisted on sidechain
  3. Try to PUT an object to node3. This fails with the error:
admin@perf-load-01:~/xk6$ neofs-cli --rpc-endpoint 172.26.160.204:8080 -w ../wallet.json object put --file /tmp/object50M-SW_FI.sample --cid raW3aUxPDw56aQ8xbryUuihKjj41KyPCJRcJymEfu1Q
Enter password >
create session: can't open session: status: code = 1024 message = no space left on device
  4. Try to PUT an object to node4. The object is uploaded successfully:
admin@perf-load-01:~/xk6$ neofs-cli --rpc-endpoint 172.26.163.168:8080 -w ../wallet.json object put --file /tmp/object50M-SW_FI.sample --cid raW3aUxPDw56aQ8xbryUuihKjj41KyPCJRcJymEfu1Q
Enter password >
 121679872 / 524288000 [======================================>---------------------------------------------------------------------------------------------------------------------------------]  23.21% 00m57s

Network map

Node 1: qWygg5EQNf247B7KAL7a8XowpRsm7FKchgJ6KaFxiGYK ONLINE [/dns4/node3.neofs/tcp/8080]
        Continent: Europe
        Country: Sweden
        CountryCode: SE
        Deployed: YACZROKH
        Location: Stockholm
        Price: 10
        SubDiv: Stockholms län
        SubDivCode: AB
        UN-LOCODE: SE STO
Node 2: rCNa2Wis4PT8V3ADVgjuzdjeK5UEoAEiua4RSUAbfyz8 ONLINE [/dns4/node2.neofs/tcp/8080]
        Continent: Europe
        Country: Russia
        CountryCode: RU
        Deployed: YACZROKH
        Location: Saint Petersburg (ex Leningrad)
        Price: 10
        SubDiv: Sankt-Peterburg
        SubDivCode: SPE
        UN-LOCODE: RU LED
Node 3: 21EoJZ5ifb6yzYHbrPYiMCaz7de37BbfhjhgCnUYG1fDb ONLINE [/dns4/node1.neofs/tcp/8080]
        Continent: Europe
        Country: Russia
        CountryCode: RU
        Deployed: YACZROKH
        Location: Moskva
        Price: 10
        SubDiv: Moskva
        SubDivCode: MOW
        UN-LOCODE: RU MOW
Node 4: 21W4qGtdumLmswdYmYNfmqmAAUULnMysVGiU7dffjvW3B ONLINE [/dns4/node4.neofs/tcp/8080]
        Continent: Europe
        Country: Finland
        CountryCode: FI
        Deployed: YACZROKH
        Location: Helsinki (Helsingfors)
        Price: 10
        SubDiv: Uusimaa
        SubDivCode: 18
        UN-LOCODE: FI HEL
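
As a rough illustration only (this is not the actual NeoFS placement algorithm), the sketch below applies the intended Country filter of the container policy (Russia or Finland) to the attributes listed in the network map. The `Node` class and the `candidates` helper are hypothetical names introduced for this example, and the final "select 2 of the matching nodes" step is deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    map_index: int  # "Node N" number as printed in the network map
    address: str    # announced address from the network map
    country: str    # value of the Country attribute


# Attributes copied from the network map in this issue.
NETMAP = [
    Node(1, "/dns4/node3.neofs/tcp/8080", "Sweden"),
    Node(2, "/dns4/node2.neofs/tcp/8080", "Russia"),
    Node(3, "/dns4/node1.neofs/tcp/8080", "Russia"),
    Node(4, "/dns4/node4.neofs/tcp/8080", "Finland"),
]


def candidates(netmap, allowed_countries=("Russia", "Finland")):
    """Return the nodes whose Country attribute passes the filter."""
    return [node for node in netmap if node.country in allowed_countries]


if __name__ == "__main__":
    for node in candidates(NETMAP):
        print(node.map_index, node.address, node.country)
```

With this map the filter keeps the nodes announced as node2.neofs, node1.neofs and node4.neofs, and excludes node3.neofs (Sweden). Note that the "Node N" index in the map does not match the host name, which is why the steps refer to hosts 1, 2 and 4.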

Your Environment

Versions:

tatlin-object-v0.1.1-nb-20220901.2

NeoFS Storage node
Version: v0.31.0-66-gd8a00c36
GoVersion: go1.18.4

NeoFS Adm
Version: v0.31.0-3-ge07921fc
GoVersion: go1.18.4

Server setup and configuration:
cloud, 4 VMs, 4 SN, 4 http gw, 4 s3 gw

Operating System and version (uname -a):
linux vedi 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64 GNU/Linux

sami 2025-12-28 17:20:32 +00:00
  • closed this issue
  • added the bug, S4, U3, I4 labels

@anikeev-yadro commented on GitHub (Sep 9, 2022):

bug_1773.zip: https://github.com/nspcc-dev/neofs-node/files/9534626/bug_1773.zip

@anikeev-yadro commented on GitHub (Sep 9, 2022):

Because we found a mistake in the policy string, I re-ran the test with the following policy:
REP 2 IN NODES SELECT 2 FROM C AS NODES FILTER 'Country' EQ 'Russia' OR 'Country' EQ 'Finland' AS C

Steps to reproduce:
1. Create a container with a policy that allows storing objects on nodes 1, 2 and 4:

anikeev@NB-1670:~/neofs/neofs-testcases$ sudo ./neofs-cli --rpc-endpoint 172.26.160.204:8080 --wallet ../wallet.json container create --name f_test_rep2_RU_FI_new --policy "REP 2 IN NODES SELECT 2 FROM C AS NODES FILTER 'Country' EQ 'Russia' OR 'Country' EQ 'Finland' AS C" --await
Enter password >
container ID: EBjYGc9wjJceesvoo1aF2eL7gNWsytFsPthdtWgSV1hC
awaiting...
container has been persisted on sidechain

2. Try to PUT an object to node3, which has no space left on the data device. This leads to the error:

admin@perf-load-01:~$ date;neofs-cli --rpc-endpoint 172.26.160.204:8080 -w wallet.json object put --file /tmp/object75M-RU_FI.sample --cid EBjYGc9wjJceesvoo1aF2eL7gNWsytFsPthdtWgSV1hC
Fri 09 Sep 2022 13:01:49 UTC
Enter password >
create session: can't open session: status: code = 1024 message = no space left on device

Logs:
no_space_left_bug.zip: https://github.com/nspcc-dev/neofs-node/files/9536276/no_space_left_bug.zip

@cthulhu-rider commented on GitHub (Sep 12, 2022):

The problem is described by this error:

create session: can't open session: status: code = 1024 message = no space left on device

To execute a PUT operation, NeoFS CLI must be able to open a NeoFS session with the storage node. The main session artifact, a generated private key, requires some storage space. Currently there is no way to create an in-memory session, so the described behavior is expected.

I guess we should support in-memory sessions for such cases.


@cthulhu-rider commented on GitHub (Sep 13, 2022):

Currently (neofs-node@v0.31.0) storage nodes can be configured to use in-memory sessions only (see https://github.com/nspcc-dev/neofs-node/pull/1781).

I suggest considering support for in-memory sessions at the request level.
@fyrchik @carpawell @realloc

@realloc commented on GitHub (Sep 13, 2022):

I'd suggest falling back to an in-memory session when the node is unable to persist the token to the on-disk session store.


@carpawell commented on GitHub (Sep 13, 2022):

> I suggest to consider supporting in-memory sessions at the request level.

@cthulhu-rider, do you mean adding a flag to the request that forces a node to use in-memory sessions? And should the CLI try a regular request first, then retry with that new flag in case of "no space left on device"?

> I'd suggest falling back to in-memory when the node is unable to persist token to the on-disk session store.

Won't it lead to unexpected, impossible-to-track results, such as losing some sessions on restart? It seems some migration process would be required then. Moreover, do we need to be able to create sessions with a node that cannot store any objects?


@fyrchik commented on GitHub (Sep 13, 2022):

Falling back to in-memory sessions is bad from the user's POV; I'd like the software to behave exactly as the configuration prescribes. Maybe we could add an option like on-overflow: refuse, on-overflow: use memory, on-overflow: drop oldest, with refuse being the default.
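
To make the three proposed behaviors concrete, here is a minimal sketch of how such an option could act when the on-disk session store is full. Everything in it is an assumption made for illustration: `BoundedDisk`, `SessionStore` and the entry-count capacity are hypothetical stand-ins, the option values are taken from the proposal above, and none of this reflects the actual neofs-node implementation or configuration.

```python
import errno
from collections import OrderedDict


class BoundedDisk:
    """Hypothetical stand-in for an on-disk session store with limited room."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of stored sessions
        self.entries = OrderedDict()  # insertion order doubles as age

    def put(self, session_id, key):
        if len(self.entries) >= self.capacity:
            raise OSError(errno.ENOSPC, "no space left on device")
        self.entries[session_id] = key

    def drop_oldest(self):
        if self.entries:
            self.entries.popitem(last=False)


class SessionStore:
    """Applies the proposed on-overflow option when the disk is full."""

    def __init__(self, disk, on_overflow="refuse"):
        if on_overflow not in ("refuse", "use memory", "drop oldest"):
            raise ValueError(f"unknown on-overflow value: {on_overflow}")
        self.disk = disk
        self.on_overflow = on_overflow
        self.memory = {}  # sessions kept only in RAM, lost on restart

    def create(self, session_id, key):
        """Store the session key and report where it ended up."""
        try:
            self.disk.put(session_id, key)
            return "persistent"
        except OSError as err:
            # Unrelated errors and the default "refuse" mode surface as-is.
            if err.errno != errno.ENOSPC or self.on_overflow == "refuse":
                raise
        if self.on_overflow == "use memory":
            self.memory[session_id] = key
            return "in-memory"
        # "drop oldest": evict one persisted session, then retry once.
        self.disk.drop_oldest()
        self.disk.put(session_id, key)
        return "persistent"
```

Returning the storage kind from `create` is one way to keep the fallback visible to the caller instead of silently reporting success.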


@realloc commented on GitHub (Sep 13, 2022):

Having an option to allow fallback seems good. It should not come as a surprise if it is visible in the logs.


@cthulhu-rider commented on GitHub (Sep 13, 2022):

> do you mean adding some flag to the request that forces a node to use in-mem sessions?

Not exactly. IMO, from the user's side, the server must not respond with status OK for a temporarily stored session unless the user has explicitly requested it.

Given this, node behavior can be implemented and configured in any of the proposed ways; the API client should not depend on it.


@anikeev-yadro commented on GitHub (Sep 20, 2022):

In the original config, all nodes had a single disk holding data, cache and metadata.

I had to reproduce the issue with another config that is closer to real life.
Node3 has 2 disks:
- the first physical (system) disk with cache and metadata
- the second physical disk with data

      blobstor:
      - path: /data/neofs/data0/blobovnicza   # on /dev/vdb
        type: blobovnicza
      - path: /data/neofs/data0
        type: fstree
      metabase:
        path: /srv/neofs/meta0/metabase0.db   # on /dev/vda
      pilorama:
        path: /srv/neofs/meta0/pilorama0.db
      writecache:
        path: /srv/neofs/meta0/write_cache0

The data disk /dev/vdb has no space left.
Now I can PUT an object to the node if the replication policy allows creating replicas on other nodes.
But I cannot PUT an object if the replication policy does not allow creating replicas on other nodes. This is strange, because the node has enough space in the write cache.

Sep 20 07:46:47 glagoli neofs-node[594]: 2022-09-20T07:46:47.208Z        warn        engine/put.go:112        could not put object in shard        {"shard": "VQrS8znyyq4goHwm8vXFwv", "error": "could not put object to BLOB storage: mkdir /data/neofs/data2/5/s: no space left on device"}
Sep 20 07:46:47 glagoli neofs-node[594]: 2022-09-20T07:46:47.229Z        warn        engine/put.go:112        could not put object in shard        {"shard": "KE1bG4rzWQeUg2T3GpXARc", "error": "could not put object to BLOB storage: mkdir /data/neofs/data0/5/s: no space left on device"}
Sep 20 07:46:47 glagoli neofs-node[594]: 2022-09-20T07:46:47.376Z        warn        engine/put.go:112        could not put object in shard        {"shard": "BZJwTshaGfZk57eY7dW3tA", "error": "could not put object to BLOB storage: mkdir /data/neofs/data1/5/s/L: no space left on device"}
Sep 20 07:46:47 glagoli neofs-node[594]: 2022-09-20T07:46:47.382Z        warn        engine/put.go:112        could not put object in shard        {"shard": "ELGmwQ37PNw9ey9p9BzF5v", "error": "could not put object to BLOB storage: mkdir /data/neofs/data3/5/s: no space left on device"}
admin@glagoli:~$ df -h
Filesystem       Size         Used Avail         Use% Mounted on
udev               2,0G            0  2,0G            0% /dev
tmpfs              394M         636K  393M            1% /run
/dev/vda3           49G          14G   33G           29% /
tmpfs              2,0G            0  2,0G            0% /dev/shm
tmpfs              5,0M            0  5,0M            0% /run/lock
/dev/vda2         1021M          49M  973M            5% /boot
/dev/vdb           9,8G         9,3G     0          100% /data
tmpfs              394M            0  394M            0% /run/user/1000

@fyrchik commented on GitHub (Sep 20, 2022):

As we discussed with @anikeev-yadro , the problem here was possibly due to the default writecache.capacity value.


@anikeev-yadro commented on GitHub (Sep 21, 2022):

> As we discussed with @anikeev-yadro , the problem here was possibly due to the default writecache.capacity value.

You are right: the PUT operation works fine as long as the write cache itself has free capacity (which is not the same as the disk holding the write cache having free space).
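
The distinction can be shown with a simplified model, given here only as a sketch: `WriteCacheModel` and its fields are hypothetical, the numbers are made up, and the real write cache logic in neofs-node is more involved. The point is that a configured capacity limit can reject an object even when the underlying disk has plenty of free space.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class WriteCacheModel:
    capacity: int  # configured size limit in bytes (models writecache.capacity)
    used: int = 0  # bytes currently held in the cache

    def try_put(self, size: int, disk_free: int) -> bool:
        """Accept the object only if both the cache limit and the disk allow it."""
        if self.used + size > self.capacity:
            return False  # cache limit reached, regardless of free disk space
        if size > disk_free:
            return False  # the disk itself is full
        self.used += size
        return True


if __name__ == "__main__":
    # Illustrative values: a nearly full cache on a disk with 33 GiB free.
    cache = WriteCacheModel(capacity=100 * MIB, used=80 * MIB)
    print(cache.try_put(50 * MIB, disk_free=33 * 1024 * MIB))
    print(cache.try_put(10 * MIB, disk_free=33 * 1024 * MIB))
```

In this model the 50 MiB object is rejected because it would exceed the configured limit, while the 10 MiB object fits, even though the free disk space is identical in both calls.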


@roman-khimov commented on GitHub (Oct 22, 2025):

Obsolete.

Reference
nspcc-dev/neofs-node#738