Some of the changes do not appear to involve a potential race condition,
so for those it is purely a refactor,
while in others (e.g. in broker.go and in proxy/lib/snowflake.go)
we do use the same variable from multiple threads / functions.
i.e. if no bridge list file is provided, the relay pattern
would not get set.
AFAIK this is not a breaking change because the broker
can't be used as a library, unlike client and server.
So the assignment of proxies is based on the load. The number of clients
is rounded down to 8. Existing proxies that don't report the number
of clients will be assigned clients equally with new proxies until the
latter reach 8 clients, which is acceptable as the existing proxies do
have a maximum capacity of 10.
Fixes #40048
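A minimal sketch of load-based assignment, not the broker's actual code. It assumes "rounded down to 8" means rounding the reported count down to the nearest multiple of 8, and that proxies sit in a min-heap keyed on that value. The names proxy, loadHeap, roundDown and leastLoaded are hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
)

// proxy is a hypothetical record for a polling proxy.
type proxy struct {
	id      string
	clients int // reported client count after rounding
}

// roundDown assumes the rounding is to the nearest lower multiple of 8.
func roundDown(clients int) int { return clients / 8 * 8 }

// loadHeap is a min-heap ordered by rounded client count.
type loadHeap []*proxy

func (h loadHeap) Len() int            { return len(h) }
func (h loadHeap) Less(i, j int) bool  { return h[i].clients < h[j].clients }
func (h loadHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *loadHeap) Push(x interface{}) { *h = append(*h, x.(*proxy)) }
func (h *loadHeap) Pop() interface{} {
	old := *h
	p := old[len(old)-1]
	*h = old[:len(old)-1]
	return p
}

// leastLoaded returns the id of the proxy with the lowest rounded load,
// or "" when no proxies are available.
func leastLoaded(reported map[string]int) string {
	h := &loadHeap{}
	for id, n := range reported {
		heap.Push(h, &proxy{id: id, clients: roundDown(n)})
	}
	if h.Len() == 0 {
		return ""
	}
	return heap.Pop(h).(*proxy).id
}

func main() {
	// 17 rounds down to 16 and 5 rounds down to 0, so "idle" is chosen.
	fmt.Println(leastLoaded(map[string]int{"busy": 17, "idle": 5}))
}
```

Under this reading, a proxy that reports nothing counts as 0 and ties with any new proxy that has fewer than 8 clients.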
Send the client poll request and response in a json-encoded format in
the HTTP request body rather than sending the data in HTTP headers. This
will pave the way for using domain-fronting alternatives for the
Snowflake rendezvous.
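As an illustration only, the sketch below moves a client poll through the HTTP body as JSON. The field names, the /client path and the echo-style answer are assumptions for the example, not the real message format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical message shapes; the real fields may differ.
type clientPollRequest struct {
	Offer string `json:"offer"`
}

type clientPollResponse struct {
	Answer string `json:"answer,omitempty"`
	Error  string `json:"error,omitempty"`
}

// clientOffers reads the request from the body instead of from headers.
func clientOffers(w http.ResponseWriter, r *http.Request) {
	var req clientPollRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "malformed request", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	// A placeholder answer stands in for the proxy's real answer.
	json.NewEncoder(w).Encode(clientPollResponse{Answer: "answer-for-" + req.Offer})
}

// poll sends one request through the handler and decodes the reply.
func poll(offer string) (clientPollResponse, error) {
	body, err := json.Marshal(clientPollRequest{Offer: offer})
	if err != nil {
		return clientPollResponse{}, err
	}
	rec := httptest.NewRecorder()
	clientOffers(rec, httptest.NewRequest(http.MethodPost, "/client", bytes.NewReader(body)))
	var resp clientPollResponse
	err = json.NewDecoder(rec.Body).Decode(&resp)
	return resp, err
}

func main() {
	resp, err := poll("sdp-offer")
	fmt.Println(resp.Answer, err)
}
```

Because nothing travels in headers, the same bytes could be carried over another rendezvous channel without changing the message types.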
Doesn't seem like it needs to exist outside of the metrics struct.
Also, the call to logMetrics is moved to the constructor. A metrics
instance is only created when a BrokerContext is created, which only
happens at startup. The synchronization that ensures it is only done once
is left in place for documentation purposes, since it doesn't hurt, but it
also seems redundant.
The default prometheus registry exports data that may be useful for
side-channel attacks. This removes all of the default metrics and makes
sure we are only reporting snowflake metrics from the broker.
This change adds a prometheus exporter for our existing snowflake broker
metrics. Current values for the metrics can be fetched by sending a GET
request to /prometheus.
As we now partition proxies by NAT type, our stats are more useful if they
capture how many proxies of each type we have, and information on
whether we have enough proxies of the right NAT type for our clients.
This change adds proxy counts by NAT type and binned counts of denied clients by NAT type.
This will allow browser-based proxies that are unable to determine their
NAT type to conservatively label themselves as restricted NATs if they
fail to work with clients that have restricted NATs.
Now when proxies poll, they provide their NAT type to the broker. This
introduces a new snowflake heap of just restricted snowflakes that the
broker can pull from if the client has a known, unrestricted NAT. All
other clients will pull from a heap of snowflakes with unrestricted or
unknown NAT topologies.
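The selection rule can be sketched as below. Plain slices stand in for the two heaps to keep the example short, and the type and method names are hypothetical.

```go
package main

import "fmt"

const (
	natRestricted   = "restricted"
	natUnrestricted = "unrestricted"
	natUnknown      = "unknown"
)

// broker keeps two pools; the real broker uses heaps rather than slices.
type broker struct {
	restricted []string // proxies that reported a restricted NAT
	other      []string // proxies with unrestricted or unknown NAT
}

// addProxy files a polling proxy under the pool matching its NAT type.
func (b *broker) addProxy(id, nat string) {
	if nat == natRestricted {
		b.restricted = append(b.restricted, id)
		return
	}
	b.other = append(b.other, id)
}

// match hands out a restricted proxy only to clients with a known,
// unrestricted NAT. Every other client draws from the other pool.
func (b *broker) match(clientNAT string) (string, bool) {
	pool := &b.other
	if clientNAT == natUnrestricted {
		pool = &b.restricted
	}
	if len(*pool) == 0 {
		return "", false
	}
	id := (*pool)[0]
	*pool = (*pool)[1:]
	return id, true
}

func main() {
	b := &broker{}
	b.addProxy("r1", natRestricted)
	b.addProxy("u1", natUnrestricted)
	b.addProxy("k1", natUnknown)
	fmt.Println(b.match(natUnrestricted)) // drawn from the restricted pool
	fmt.Println(b.match(natRestricted))   // drawn from the other pool
	fmt.Println(b.match(natUnknown))      // drawn from the other pool
}
```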
Added another lock to the metrics struct to synchronize accesses to the
broker stats. There's a possible race condition if stats are updated at
the same time they are being logged.
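A minimal sketch of the idea, assuming a single counter and a sync.Mutex. The struct layout and the log line format are illustrative, not the actual metrics code.

```go
package main

import (
	"fmt"
	"sync"
)

// metrics guards its counters with a lock so updates and logging
// never observe each other half way through.
type metrics struct {
	lock              sync.Mutex
	clientDeniedCount int
}

// recordDenied is called from request handlers.
func (m *metrics) recordDenied() {
	m.lock.Lock()
	defer m.lock.Unlock()
	m.clientDeniedCount++
}

// logLine is called from the logging goroutine and takes the same lock.
func (m *metrics) logLine() string {
	m.lock.Lock()
	defer m.lock.Unlock()
	return fmt.Sprintf("client-denied-count %d", m.clientDeniedCount)
}

// hammer updates the counter from n goroutines, then formats it.
func hammer(n int) string {
	var m metrics
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.recordDenied()
		}()
	}
	wg.Wait()
	return m.logLine()
}

func main() {
	fmt.Println(hammer(100))
}
```

Without the lock, running this under the race detector (go run -race) would be expected to flag the unsynchronized increments.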
There's a race condition in the broker where both the proxy and the
client processes try to pop/remove the same snowflake from the heap.
This patch adds synchronization to prevent simultaneous accesses to
snowflakes.
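The sketch below shows one way such synchronization could look, assuming a single mutex around a container/heap and an index field that marks a snowflake as already taken. All names are hypothetical and the ordering by id is arbitrary.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
)

type snowflake struct {
	id    string
	index int // position in the heap, or -1 once taken out
}

type snowflakeHeap []*snowflake

func (h snowflakeHeap) Len() int           { return len(h) }
func (h snowflakeHeap) Less(i, j int) bool { return h[i].id < h[j].id }
func (h snowflakeHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
	h[i].index, h[j].index = i, j
}
func (h *snowflakeHeap) Push(x interface{}) {
	s := x.(*snowflake)
	s.index = len(*h)
	*h = append(*h, s)
}
func (h *snowflakeHeap) Pop() interface{} {
	old := *h
	s := old[len(old)-1]
	s.index = -1
	*h = old[:len(old)-1]
	return s
}

type broker struct {
	lock       sync.Mutex
	snowflakes snowflakeHeap
}

func (b *broker) add(id string) *snowflake {
	b.lock.Lock()
	defer b.lock.Unlock()
	s := &snowflake{id: id}
	heap.Push(&b.snowflakes, s)
	return s
}

// popForClient is the client path: take any available snowflake.
func (b *broker) popForClient() *snowflake {
	b.lock.Lock()
	defer b.lock.Unlock()
	if b.snowflakes.Len() == 0 {
		return nil
	}
	return heap.Pop(&b.snowflakes).(*snowflake)
}

// removeOnTimeout is the proxy path: withdraw a snowflake that waited too long.
func (b *broker) removeOnTimeout(s *snowflake) bool {
	b.lock.Lock()
	defer b.lock.Unlock()
	if s.index < 0 {
		return false // a client already took it
	}
	heap.Remove(&b.snowflakes, s.index)
	return true
}

// contend races both paths over one snowflake and counts the winners.
func contend() int {
	b := &broker{}
	s := b.add("proxy-1")
	results := make(chan bool, 2)
	go func() { results <- b.popForClient() != nil }()
	go func() { results <- b.removeOnTimeout(s) }()
	wins := 0
	for i := 0; i < 2; i++ {
		if <-results {
			wins++
		}
	}
	return wins
}

func main() {
	fmt.Println(contend()) // exactly one path wins
}
```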
Proxies now include information about what type they are when they poll
for client offers. The broker saves this information along with
snowflake ids and outputs it on the /debug page.
Switch to containing all communication between the proxy and the broker
in the HTTP response body. This will make things easier if we ever use
something other than HTTP to communicate between different actors in the
snowflake system.
Other changes to the protocol are as follows:
- requests are accompanied by a version number so the broker can be
backwards compatible if desired in the future
- all responses are 200 OK unless the request was badly formatted
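To illustrate the two points above, here is a hedged sketch of a proxy poll handler. The field names, the /proxy path and the "no match" status string are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Hypothetical message shapes carrying a version number.
type proxyPollRequest struct {
	Sid     string `json:"Sid"`
	Version string `json:"Version"`
}

type proxyPollResponse struct {
	Status string `json:"Status"`
	Offer  string `json:"Offer"`
}

func proxyPolls(w http.ResponseWriter, r *http.Request) {
	var req proxyPollRequest
	err := json.NewDecoder(r.Body).Decode(&req)
	if err != nil || req.Sid == "" || req.Version == "" {
		// Only a badly formatted request gets a non-200 status.
		w.WriteHeader(http.StatusBadRequest)
		return
	}
	// "No client available" is a normal outcome, so it is reported in
	// the body of a 200 response rather than through the status code.
	json.NewEncoder(w).Encode(proxyPollResponse{Status: "no match"})
}

// statusFor runs one request through the handler and returns the HTTP status.
func statusFor(body string) int {
	rec := httptest.NewRecorder()
	proxyPolls(rec, httptest.NewRequest(http.MethodPost, "/proxy", strings.NewReader(body)))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(`{"Sid":"abc","Version":"1.0"}`))
	fmt.Println(statusFor("not json"))
}
```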
Many of our log messages were being used to generate metrics, but are
now being aggregated and logged to a separate metrics log file and so we
don't need them in the regular logs anymore.
This addresses the goal of ticket #30830, to remove unnecessary messages
and keep broker logs for debugging purposes.
The broker /debug page was displaying proxy IDs and roundtrip times. As
serna pointed out in bug #31460, the proxy IDs can be used to launch a
denial of service attack. As the metrics team pointed out on #21315, the
round trip time average can be potentially sensitive.
This change displays only proxy counts and uses ID lengths to
distinguish between standalone proxy-go instances and browser-based
snowflake proxies.
This implements a handler at https://[snowflake-broker]/metrics for the
snowflake CollecTor module to fetch stats from the broker. Logged
metrics are copied out to the response with a text/plain; charset=utf-8
content type. This implements bug #31376.
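A rough sketch of such a handler follows. The open callback is a hypothetical stand-in for opening the metrics log file, and the sample metrics line is made up.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// metricsHandler copies whatever open returns into the response body.
func metricsHandler(open func() (io.Reader, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		src, err := open()
		if err != nil {
			http.Error(w, "metrics unavailable", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		io.Copy(w, src)
	}
}

// fetch issues a GET against the handler and returns content type and body.
func fetch(logged string) (string, string) {
	h := metricsHandler(func() (io.Reader, error) {
		return strings.NewReader(logged), nil
	})
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	ct, body := fetch("client-denied-count 8\n")
	fmt.Println(ct)
	fmt.Print(body)
}
```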
Added three new metrics:
- proxyIdleCount counts the number of times a proxy polls and receives
no snowflakes
- clientDeniedCount counts the number of times a client requested a
snowflake but none were available
- clientProxyMatchCount counts the number of times a client successfully
received a snowflake
So far, we request a certificate each time we start the broker. Let's
Encrypt maintains several rate limiters and if we exceed one of them, we
won't get a certificate. Worse, since we don't store certificates, we
won't even be able to use an old one.
This patch uses autocert's DirCache structure to cache certificates on
disk.
This patch fixes <https://bugs.torproject.org/30512>.
MaxBytesReader is only documented for server side reads, so we're using
a local limitedRead function instead that uses an io.LimitedReader.
Declared limits in a commented constant.