This was only useful to us when we first implemented the feature, to be
able to support proxies that hadn't yet updated, when we had a single
Snowflake bridge. Now that we have multiple bridges, it is unecessary as
proxies that don't send their accepted relay pattern are rejected
anyway.
Client timeouts are currently counted from when the client is matched
with a proxy. Instead, count client timeouts from the moment when the
request is received.
Closes#40449
This fixes a pointer bug in our broker sqs code by enabling the loopvar
feature https://go.dev/wiki/LoopvarExperiment
See tpo/anti-censorship/pluggable-transports/snowflake#40363
This sets up CI to allow the use of the GitLab Dependency Proxy which
caches images pulled from DockerHub, in order to bypass rate-limiting.
The DOCKER_REGISTRY_URL variable is set dynamically by the
check_dependency_proxy_access job defined in dependency_proxy.yml such
that only pipelines triggered by users with the requisite access will be
configured to use the proxy, while all others will continue to pull from
DockerHub as before.
When DOCKER_REGISTRY_URL is pre-set in a project's CI/CD variable
settings, the extra job is skipped and the dependency proxy is used
always, unconditionally.
To avoid breaking CI pipelines on 3rd-party GitLab instances, we only
include the dependency proxy template on gitlab.tpo
See: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40335
...and adjust the `totalFailedConnections` metric name
and description.
This commit should make the periodic stats log messages
and the relevant metric look less scary to users:
P2P connection failures are relatively frequent and are usually
not indicative of the proxy operator having done something wrong.
So let's tone the wording down.
See the discussion: https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/516#note_3173677.
This migrates from using `atomic.LoadInt64` on `int64`
to making the `clients` field itself `atomic.Int64`.
Also `count` now takes `*tokens_t` by reference,
which fixes a linter warning.
It's not clear to me why it warned about this,
but I simplified it anyway.
And `failedConnectionCount`.
Convert the `int` / `uint` to `atomic.Int32` / `atomic.Uint32`.
The race was discovered by running a proxy with the `-race` flag.
Although it does not look like that there are situations
where it is critical to use a mutex, because it's only used
when performing rendezvous with a proxy, which doesn't happen
too frequently,
let's still do it just to be sure.
Some of the changes do not appear to have a potential race condition,
so there it is purely a refactor,
while in others (e.g. in broker.go and in proxy/lib/snowflake.go)
we do use the same variable from multiple threads / functions.
Our metrics were undercounting client polls by missing the case where
clients are matched with a snowflake but receive a timeout before the
snowflake responds with its answer. This change adds a new metric,
called client-snowflake-timeout-count, to the 24 hour broker stats and a
new "timeout" status label for prometheus metrics.
For the summary log and for Prometheus metrics.
Log output example:
> In the last 1h0m0s, there were 7 completed successful connections. 2 connections failed. Traffic Relayed ↓ 321 KB (0.10 KB/s), ↑ 123 KB (0.05 KB/s).
To test that the broker responds with a proxy answer if available, have
only one valid client offer to ensure metrics will always be in the
first multiple of 8.
Fixes a bug where socksAcceptLoop was reusing the same client config
when processing arguments from multiple SOCKS connections, causing
different bridge lines to clobber each other.