Avoid defer here. It is only slightly faster, but this function is
called frequently.
Before:
$ go test -bench=BenchmarkSendQueue -benchtime=2s
BenchmarkSendQueue-4 15901834 151 ns/op
After:
$ go test -bench=BenchmarkSendQueue -benchtime=2s
BenchmarkSendQueue-4 15859948 147 ns/op
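
For illustration, a minimal sketch of the two styles being compared. It
assumes the deferred call is a mutex unlock, which the message above does
not state. The counter type and its methods are hypothetical and are not
taken from the Snowflake source.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical stand-in for a mutex-guarded type whose
// method sits on a hot path.
type counter struct {
	mu sync.Mutex
	n  int
}

// incDefer releases the lock with defer. This is the safer form when the
// function has several return paths or may panic.
func (c *counter) incDefer() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return c.n
}

// incExplicit releases the lock explicitly, skipping the small amount of
// bookkeeping that defer adds on every call.
func (c *counter) incExplicit() int {
	c.mu.Lock()
	c.n++
	n := c.n // copy the result while the lock is still held
	c.mu.Unlock()
	return n
}

func main() {
	c := &counter{}
	fmt.Println(c.incDefer(), c.incExplicit())
}
```

The explicit form is only reasonable when the critical section is short
and has a single exit, as in this sketch.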
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40177
This should increase the maximum amount of inflight data and hopefully
the performance of Snowflake, especially for clients geographically
distant from proxies and the server.
Introduce a waitgroup and done channel to ensure that both the read and
write goroutines for turbotunnel connections terminate when the
connection is closed.
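
A rough sketch of the pattern, not the actual Snowflake code: the
tunnelConn type, its fields, and the loop method are hypothetical names
chosen for this example. Closing the done channel signals both loops to
stop, and the waitgroup lets Close block until they have returned.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// tunnelConn is a hypothetical connection with one read loop and one
// write loop.
type tunnelConn struct {
	done     chan struct{} // closed exactly once to signal shutdown
	once     sync.Once     // guards close(done) against repeated Close calls
	wg       sync.WaitGroup
	exited   int32 // number of loops that have observed done
	incoming chan []byte
	outgoing chan []byte
}

func newTunnelConn() *tunnelConn {
	c := &tunnelConn{
		done:     make(chan struct{}),
		incoming: make(chan []byte),
		outgoing: make(chan []byte),
	}
	c.wg.Add(2)
	go c.loop(c.incoming) // stands in for the read goroutine
	go c.loop(c.outgoing) // stands in for the write goroutine
	return c
}

// loop handles packets until done is closed.
func (c *tunnelConn) loop(packets <-chan []byte) {
	defer c.wg.Done()
	for {
		select {
		case <-c.done:
			atomic.AddInt32(&c.exited, 1)
			return
		case <-packets:
			// A real loop would forward the packet here.
		}
	}
}

// Close signals both loops to stop, waits for them to return, and
// reports how many loops exited.
func (c *tunnelConn) Close() int32 {
	c.once.Do(func() { close(c.done) })
	c.wg.Wait()
	return atomic.LoadInt32(&c.exited)
}

func main() {
	c := newTunnelConn()
	fmt.Println("goroutines exited:", c.Close())
}
```

Using sync.Once around close(done) makes Close safe to call more than
once, since closing an already closed channel panics.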
The client opts into turbotunnel mode by sending a magic token at the
beginning of each WebSocket connection (before sending even the
ClientID). The token is just a random byte string I generated. The
server peeks at the token and, if it matches, uses turbotunnel mode.
Otherwise, it unreads the token and continues in the old
one-session-per-WebSocket mode.
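
The server-side check might look roughly like the sketch below. The token
value is a placeholder invented for this example (the real token is not
given here), and detectMode and classify are hypothetical helpers. This
sketch uses bufio.Reader.Peek so that nothing needs to be unread when the
token does not match. The actual server may implement the unread step
differently.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// magicToken is a placeholder value for illustration only.
var magicToken = []byte{0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef}

// detectMode peeks at the start of the stream. If the magic token is
// present, it is consumed and turbo is true. Otherwise the stream is left
// untouched so the legacy code path still sees the ClientID and all data.
func detectMode(r io.Reader) (turbo bool, rest io.Reader, err error) {
	br := bufio.NewReader(r)
	head, err := br.Peek(len(magicToken))
	if err != nil && err != io.EOF {
		return false, br, err
	}
	// A short stream (io.EOF) cannot contain the token: treat it as legacy.
	if bytes.Equal(head, magicToken) {
		if _, err := br.Discard(len(magicToken)); err != nil {
			return false, br, err
		}
		return true, br, nil
	}
	return false, br, nil
}

// classify is a small helper for demonstration: it reports the detected
// mode and the bytes that remain for the chosen code path.
func classify(input []byte) (bool, string) {
	turbo, rest, err := detectMode(bytes.NewReader(input))
	if err != nil {
		return false, ""
	}
	b, err := io.ReadAll(rest)
	if err != nil {
		return false, ""
	}
	return turbo, string(b)
}

func main() {
	withToken := append(append([]byte{}, magicToken...), []byte("client-data")...)
	fmt.Println(classify(withToken))
	fmt.Println(classify([]byte("legacy-client-data")))
}
```

Because the token is checked before anything else is read, old clients
that never send it continue to work unchanged.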