Unverified Commit d54feeb1 authored by Svyatoslav Nikolsky, committed by GitHub

Fixed RPC subscriptions leak when subscription stream is finished (#4533)


We recently changed our bridge configuration for Rococo <> Westend,
and our new relayer has started to submit transactions roughly every
`30` seconds. Eventually it gets stuck in a limbo state where it can't
submit any more transactions: all `author_submitAndWatchExtrinsic` calls
fail with the following error: `ERROR bridge Failed to send
transaction to BridgeHubRococo node: Call(ErrorObject { code:
ServerError(-32006), message: "Too many subscriptions on the
connection", data: Some(RawValue("Exceeded max limit of 1024")) })`.

Some links for those who want to explore:
- the server side (node) has a strict limit on the number of active
subscriptions. It fails to open a new subscription once this limit is
hit. The limit is set to `1024` by default;
- internally, this limit is a semaphore with `limit` permits;
- a semaphore permit is acquired in the first link;
- the permit is "returned" when the `SubscriptionSink` is dropped;
- the `SubscriptionSink` is dropped when this `polkadot-sdk` function
returns. In other words, when the connection is closed, the stream is
finished, or the internal subscription buffer limit is hit;
- the subscription has an internal buffer, so sending an item consists
of two steps: reading an item from the underlying stream and sending it
over the connection;
- when the underlying stream is finished, `inner_pipe_from_stream`
wants to ensure that all items are sent to the subscriber. So it waits
until the current send operation completes and then sends all remaining
items from the internal buffer. Once that is done, the function returns,
the `SubscriptionSink` is dropped, the semaphore permit is released and
we are ready to accept new subscriptions;
- unfortunately, the code just calls `pending_fut.await.is_err()` to
ensure that the current send operation completes. But if there is no
current send operation (which is normal), `pending_fut` is set to a
terminated future and the `await` never completes. Hence no return from
the function, no drop of the `SubscriptionSink`, no release of the
semaphore permit, and no new subscriptions allowed once the number of
subscriptions hits the limit.
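
The permit accounting described in the list above can be modelled with a small, std-only sketch. This is not the node's code: `Limit`, `Permit` and `accepted` are hypothetical names, and an atomic counter stands in for the real semaphore. It only shows why a handler that never returns (and so never drops its permit) eventually makes every new subscription fail.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the per-connection subscription limit.
struct Limit {
    available: Arc<AtomicUsize>,
}

/// Hypothetical stand-in for the permit owned by a `SubscriptionSink`.
struct Permit {
    available: Arc<AtomicUsize>,
}

impl Limit {
    fn new(limit: usize) -> Self {
        Limit { available: Arc::new(AtomicUsize::new(limit)) }
    }

    /// Takes one permit, or returns `None` when the limit is hit.
    fn try_acquire(&self) -> Option<Permit> {
        self.available
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit { available: Arc::clone(&self.available) })
    }
}

impl Drop for Permit {
    /// Dropping the permit "returns" it to the limit.
    fn drop(&mut self) {
        self.available.fetch_add(1, Ordering::AcqRel);
    }
}

/// Tries to open `attempts` subscriptions one after another and returns
/// how many were accepted. With `leak == true` the permits are never
/// dropped, which models handlers that never return.
fn accepted(limit: usize, attempts: usize, leak: bool) -> usize {
    let limit = Limit::new(limit);
    let mut stuck = Vec::new();
    let mut ok = 0;
    for _ in 0..attempts {
        if let Some(permit) = limit.try_acquire() {
            ok += 1;
            if leak {
                stuck.push(permit); // held for as long as the "handler" lives
            }
            // otherwise `permit` is dropped here and the slot is free again
        }
    }
    ok
}

fn main() {
    println!("healthy: {}", accepted(1024, 2000, false)); // healthy: 2000
    println!("leaking: {}", accepted(1024, 2000, true)); // leaking: 1024
}
```

With one submission roughly every 30 seconds, 1024 leaked permits would be exhausted after about 1024 × 30 s ≈ 8.5 hours, which is consistent with the relayer working for a while before getting stuck.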

I've illustrated the issue with a small test: you can verify that if,
e.g., the stream is initially empty, the
`subscription_is_dropped_when_stream_is_empty` test hangs because
`pipe_from_stream` never exits.
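
The hang itself can be reproduced without any async runtime. The sketch below is a std-only model, not the `polkadot-sdk` code: `PendingSend`, `poll_n` and both `*_wait_finishes` helpers are hypothetical, and `PendingSend(None)` plays the role of the terminated `pending_fut`. Checking for the terminated state before awaiting is shown as one plausible guard, not necessarily the change made in this commit.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical model of the "current send operation" slot.
/// `None` means no send is in flight (the "terminated" state).
struct PendingSend(Option<Result<(), ()>>);

impl PendingSend {
    fn is_terminated(&self) -> bool {
        self.0.is_none()
    }
}

impl Future for PendingSend {
    type Output = Result<(), ()>;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        match self.0.take() {
            Some(result) => Poll::Ready(result),
            // A terminated future stays pending and never wakes its task.
            None => Poll::Pending,
        }
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` up to `times` times and returns its output if it completes.
fn poll_n<F: Future + Unpin>(fut: &mut F, times: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for _ in 0..times {
        if let Poll::Ready(out) = Pin::new(&mut *fut).poll(&mut cx) {
            return Some(out);
        }
    }
    None
}

/// Models awaiting the slot unconditionally: `true` if the wait finished
/// within `budget` polls, `false` if it is still waiting.
fn unguarded_wait_finishes(mut pending: PendingSend, budget: usize) -> bool {
    poll_n(&mut pending, budget).is_some()
}

/// Same step, but the await is skipped when nothing is in flight.
fn guarded_wait_finishes(mut pending: PendingSend, budget: usize) -> bool {
    pending.is_terminated() || poll_n(&mut pending, budget).is_some()
}

fn main() {
    // No send in flight: the unguarded wait never finishes.
    println!("{}", unguarded_wait_finishes(PendingSend(None), 1000)); // false
    println!("{}", guarded_wait_finishes(PendingSend(None), 1000)); // true
    // A send in flight completes either way.
    println!("{}", unguarded_wait_finishes(PendingSend(Some(Ok(()))), 1000)); // true
}
```

In the model, an empty stream corresponds to `PendingSend(None)` from the very start, which is the scenario the test name describes.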
parent b00e1681