Revise write-cache role, functionality, etc #1021

Closed
opened 2025-12-28 17:21:31 +00:00 by sami · 4 comments
Owner

Originally created by @notimetoname on GitHub (May 11, 2023).

It's an epic now, let's solve it

  • https://github.com/nspcc-dev/neofs-node/issues/3076
  • https://github.com/nspcc-dev/neofs-node/issues/3077
  • https://github.com/nspcc-dev/neofs-node/issues/1754
  • https://github.com/nspcc-dev/neofs-node/issues/1501
  • https://github.com/nspcc-dev/neofs-node/issues/1150
  • https://github.com/nspcc-dev/neofs-node/issues/1528
  • https://github.com/nspcc-dev/neofs-node/issues/1822
  • https://github.com/nspcc-dev/neofs-node/issues/3099
  • https://github.com/nspcc-dev/neofs-node/issues/3100
  • https://github.com/nspcc-dev/neofs-node/issues/3101
  • #3201

Original issue

There is no strict theory (or at least I'm not aware of one) that describes how the write-cache should work, what problems it should solve, etc.

At least:

  • it has strange limits for the objects it holds: its capacity divided by the average object it can store (writecache.go#L111-L113);
  • the only way to free up space in WC is to exceed its object number (init.go#L19-L37);
  • it is not a "write" cache but a "read-write" cache (see the previous point), which means it is always filled with something even when there is no load at all (although it could, e.g., flush objects to the blobstor to be ready for the next load peak instead);
  • it has some interfaces (writecache.go#L23) that are not used at all;
  • its initialization may take an incredibly long time because it iterates over every object it stores (init.go#L39).
sami 2025-12-28 17:21:31 +00:00
Author
Owner

@notimetoname commented on GitHub (May 12, 2023):

Well, WC is even a racer:

WARNING: DATA RACE
Read at 0x00c000270043 by goroutine 31:
  testing.(*common).logDepth()
      /usr/local/go/src/testing/testing.go:889 +0x4e7
  testing.(*common).log()
      /usr/local/go/src/testing/testing.go:876 +0xa4
  testing.(*common).Logf()
      /usr/local/go/src/testing/testing.go:927 +0x6a
  testing.(*T).Logf()
      <autogenerated>:1 +0x75
  go.uber.org/zap/zaptest.testingWriter.Write()
      /home/carpawell/go/pkg/mod/go.uber.org/zap@v1.24.0/zaptest/logger.go:130 +0x12c
  go.uber.org/zap/zaptest.(*testingWriter).Write()
      <autogenerated>:1 +0x7e
  go.uber.org/zap/zapcore.(*ioCore).Write()
      /home/carpawell/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/core.go:99 +0x199
  go.uber.org/zap/zapcore.(*CheckedEntry).Write()
      /home/carpawell/go/pkg/mod/go.uber.org/zap@v1.24.0/zapcore/entry.go:255 +0x2ce
  go.uber.org/zap.(*Logger).Debug()
      /home/carpawell/go/pkg/mod/go.uber.org/zap@v1.24.0/logger.go:212 +0x6d
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).flushDB()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:137 +0x40a
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).runFlushLoop.func1()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:51 +0x12b

Previous write at 0x00c000270043 by goroutine 8:
  testing.tRunner.func1()
      /usr/local/go/src/testing/testing.go:1433 +0x7e4
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:476 +0x32
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1493 +0x47

Goroutine 31 (running) created at:
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).runFlushLoop()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:42 +0x204
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).Init()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/writecache.go:148 +0x38
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.TestFlush.func1()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush_test.go:67 +0x882
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.TestFlush.func4()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush_test.go:103 +0x83
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1493 +0x47

@roman-khimov commented on GitHub (Nov 20, 2024):

What it can do is:

  • help with load spikes
  • reduce latency for smaller write loads
  • smooth out writes to the slower medium (push data to it at a steadier, more constant rate)

That's about it. There is no magic: it cannot make writing faster in general; eventually you'll run out of cache space and drop down to the primary storage's performance level. It only helps if it's located on a faster drive, since there is no magic technology that makes writing to the same medium faster. Usually this means primary storage on an HDD with the write-cache on an SSD, which is a nice combination; HDD/HDD and SSD/SSD setups won't give any benefit.

What we can do to improve it is:

  • drop BoltDB from it completely (providing flush/migration in the new version); it sucks for SSDs, as we know from #2814
  • make its flush loop actually delete the objects it holds (currently that looks to be done only on Init, which defeats the purpose completely)
  • adjust the flusher's behavior to utilize all of the underlying blobstor's capacity (make it more aggressive in general)

This will remove the artificial limits and mostly solve the init problem at the same time (most of the time the cache will be empty).


@roman-khimov commented on GitHub (Dec 28, 2024):

Other things to consider:

  • make it an engine-level thing (not shard level)
  • don't mess with metabase storage IDs, keep writecache index in memory only

@roman-khimov commented on GitHub (Mar 6, 2025):

> don't mess with metabase storage IDs, keep writecache index in memory only

#2888.

> make it an engine-level thing (not shard level)

#3210.
