Graceful shutdown if metrics/pprof/control service is not available #1155

Open
opened 2025-12-28 17:21:58 +00:00 by sami · 0 comments
Owner

Originally created by @carpawell on GitHub (Nov 16, 2023).

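Is your feature request related to a problem? Please describe.

If something goes wrong in http.Serve, we just die instantly: in metrics (https://github.com/nspcc-dev/neofs-node/blob/f865d17a93e6a5e7ec165f6da7374fc073f73d2c/cmd/neofs-node/metrics.go#L31), in pprof (https://github.com/nspcc-dev/neofs-node/blob/f865d17a93e6a5e7ec165f6da7374fc073f73d2c/cmd/neofs-node/pprof.go#L30) and in control (https://github.com/nspcc-dev/neofs-node/blob/f865d17a93e6a5e7ec165f6da7374fc073f73d2c/cmd/neofs-node/control.go#L69). These services are among the most important for us (they let us understand whether the other services work at all and make decisions based on statistics), but they are not so important that any error should kill the application with something like os.Exit(1).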

Describe the solution you'd like

Find a better way to handle these errors, e.g. canceling the app context and waiting for the other services to shut down.
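
A minimal sketch of that idea, not the actual neofs-node code: the helper name serveOrCancel, the "metrics" label, and the simulateFailure driver are all hypothetical and exist only for illustration. It assumes the service is a plain net/http server and that the application already has a cancelable root context. Instead of exiting the process, an unexpected Serve error cancels that context so the rest of the application can stop in order.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveOrCancel (hypothetical helper) runs srv on ln in a goroutine.
// On an unexpected Serve error it cancels the app context rather than
// exiting the process. The returned channel is closed when Serve returns.
func serveOrCancel(cancel context.CancelFunc, name string, srv *http.Server, ln net.Listener) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		err := srv.Serve(ln)
		// http.ErrServerClosed is the normal result of Shutdown or Close.
		if err != nil && !errors.Is(err, http.ErrServerClosed) {
			fmt.Printf("%s service failed: %v\n", name, err)
			cancel()
		}
	}()
	return done
}

// simulateFailure starts a server, breaks its listener, and reports
// whether the app context was canceled as a result.
func simulateFailure() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}
	done := serveOrCancel(cancel, "metrics", srv, ln)

	ln.Close() // makes Accept fail, so Serve returns a non-shutdown error

	select {
	case <-ctx.Done():
		// A real app would now stop its other services, e.g. by calling
		// Shutdown on each server with a deadline.
		<-done
		return true
	case <-time.After(5 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("context canceled after failure:", simulateFailure())
}
```

Whether a failed metrics or pprof listener should stop the whole node, or only be logged and retried, is a separate design decision; the sketch only shows the "cancel and wait" variant.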

Describe alternatives you've considered

Do nothing; it works fine almost all the time.

Additional context

After https://github.com/nspcc-dev/neofs-node/issues/2585, https://github.com/nspcc-dev/neofs-node/issues/2428 and similar changes, canceling the context may no longer be the best way to do it: metrics will be the first thing to start, and if they die, other services may not be initialized yet. Initialization sometimes takes a while and no Init takes a context.Context, so no shutdown will be performed until every Init has finished.
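
One possible way around this, sketched under assumptions: if initialization steps accepted a context.Context, the init loop could check it between steps and abort early. The initStep type, runInit, demoAbort, and the step names "storage" and "grpc" are hypothetical and do not come from neofs-node.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// initStep is a hypothetical initialization step that accepts a context.
type initStep struct {
	name string
	run  func(ctx context.Context) error
}

// runInit executes steps in order and refuses to start the next step
// once ctx is canceled. It returns the names of the completed steps.
func runInit(ctx context.Context, steps []initStep) ([]string, error) {
	var completed []string
	for _, s := range steps {
		if err := ctx.Err(); err != nil {
			return completed, fmt.Errorf("init aborted before %q: %w", s.name, err)
		}
		if err := s.run(ctx); err != nil {
			return completed, fmt.Errorf("init %q: %w", s.name, err)
		}
		completed = append(completed, s.name)
	}
	return completed, nil
}

// demoAbort simulates the metrics service dying right after it starts:
// the first step cancels the context, so later steps never run.
// It returns the number of completed steps and whether init was canceled.
func demoAbort() (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	steps := []initStep{
		{"metrics", func(context.Context) error { cancel(); return nil }},
		{"storage", func(context.Context) error { return nil }},
		{"grpc", func(context.Context) error { return nil }},
	}
	done, err := runInit(ctx, steps)
	return len(done), errors.Is(err, context.Canceled)
}

func main() {
	n, canceled := demoAbort()
	fmt.Println("completed steps:", n, "canceled:", canceled)
}
```

This only helps between steps; a single long-running Init would still need to watch ctx.Done() itself to stop promptly.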
