Annotate the ProxyPollTotal prometheus metric with the proxy type so
that we can track, by implementation, the counts of proxies that are
matched and that answer. This will help us catch bugs specific to a
proxy implementation or deployment.
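Roughly, with prometheus/client_golang (the label names and helper here
are assumptions, and the broker's real counters may additionally round
their values):

    import "github.com/prometheus/client_golang/prometheus"

    // Hypothetical declaration: one counter, labeled by proxy type
    // and by what happened to the poll.
    var ProxyPollTotal = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Namespace: "snowflake",
            Name:      "proxy_poll_total",
            Help:      "Proxy polls, by proxy type and status",
        },
        []string{"type", "status"},
    )

    func recordProxyPoll(proxyType, status string) {
        ProxyPollTotal.With(prometheus.Labels{
            "type":   proxyType, // e.g. "webext" or "standalone"
            "status": status,    // e.g. "matched"
        }).Inc()
    }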
This adds a prometheus metric that tracks snowflake proxy answers. If
the client has not timed out before the proxy responds with an answer,
the proxy type is recorded along with a status of "success". If the
client has timed out, the type is left blank and the status is recorded
as "timeout".
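A hedged sketch of that answer path; the channel, parameter, and metric
names are illustrative, not the broker's actual API:

    import (
        "time"

        "github.com/prometheus/client_golang/prometheus"
    )

    var proxyAnswerTotal = prometheus.NewCounterVec(
        prometheus.CounterOpts{Name: "snowflake_proxy_answer_total"},
        []string{"type", "status"},
    )

    // awaitAnswer waits for the matched proxy's answer, recording
    // "success" with its type, or "timeout" with a blank type if the
    // client gives up first.
    func awaitAnswer(answers <-chan string, proxyType string, timeout time.Duration) (string, bool) {
        select {
        case answer := <-answers:
            proxyAnswerTotal.With(prometheus.Labels{"type": proxyType, "status": "success"}).Inc()
            return answer, true
        case <-time.After(timeout):
            proxyAnswerTotal.With(prometheus.Labels{"type": "", "status": "timeout"}).Inc()
            return "", false
        }
    }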
The goal of these metrics is to help us determine how many proxies fail
to respond and to help narrow down which proxy implementations are
causing client timeouts.
We were not previously clearing the map we keep of seen IP addresses,
so our unique proxy IP counts represented cumulative churn since the
last broker process restart rather than unique IP counts per day.
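For illustration, a minimal sketch of the fix, assuming a map of seen
proxy IPs that is flushed along with the daily stats (the field and
method names are made up):

    import (
        "fmt"
        "io"
        "sync"
    )

    type metrics struct {
        lock     sync.Mutex
        proxyIPs map[string]bool // proxy IPs seen in the current 24h window
    }

    func (m *metrics) printAndReset(w io.Writer) {
        m.lock.Lock()
        defer m.lock.Unlock()
        fmt.Fprintf(w, "snowflake-ips-total %d\n", len(m.proxyIPs))
        // The bug: without this reset, the map only ever grew, so the
        // "daily" count was really churn since the last restart.
        m.proxyIPs = make(map[string]bool)
    }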
This avoids an information leak: if two countries have the same binned
count but are not listed in alphabetical order, you can infer that the
first had a larger count than the second before binning.
Example: AA=25,BB=27
Without binning, these should sort descending by count:
BB=27,AA=25
But with binning, the counts are the same, so it should sort ascending
by country code:
AA=32,BB=32
Before this change, BB would sort before AA even after binning, which
lets you infer that the count of BB was greater than the count of AA
*before* binning:
BB=32,AA=32
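A sketch of the intended comparison, applied to the already-binned
counts:

    import "sort"

    type countryCount struct {
        Country string
        Count   int // binned (rounded up to a multiple of 8)
    }

    // Sort descending by binned count; break ties ascending by country
    // code, so equal bins reveal nothing about the pre-binning order.
    func sortCounts(cc []countryCount) {
        sort.Slice(cc, func(i, j int) bool {
            if cc[i].Count != cc[j].Count {
                return cc[i].Count > cc[j].Count
            }
            return cc[i].Country < cc[j].Country
        })
    }

With the AA/BB example above, this yields AA=32,BB=32.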
Move the record types closer to where they are used.
Use a strings.Builder rather than repeatedly concatenating strings
(which creates garbage).
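E.g. (illustrative, with countryCount as in the sorting sketch above):

    import (
        "fmt"
        "strings"
    )

    // Build "AA=32,BB=32,..." in a single buffer instead of allocating
    // a new string on every += concatenation.
    func formatCounts(cc []countryCount) string {
        var b strings.Builder
        for _, c := range cc {
            fmt.Fprintf(&b, "%s=%d,", c.Country, c.Count)
        }
        return strings.TrimSuffix(b.String(), ",")
    }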
Use the value that m.Range already provides us, don't look it up again
with LoadAndDelete.
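Sketch of the pattern, assuming counters stored in a sync.Map as
*atomic.Uint64 values:

    import (
        "sync"
        "sync/atomic"
    )

    func drain(m *sync.Map, emit func(key string, count uint64)) {
        m.Range(func(key, value any) bool {
            // Range already hands us the value; fetching it again
            // with LoadAndDelete is redundant.
            emit(key.(string), value.(*atomic.Uint64).Load())
            m.Delete(key)
            return true
        })
    }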
Add documentation comments.
Now that both SQS and AMP cache pull remote addresses from the SDP,
avoid decoding the ClientPollRequest twice by extracting the remote
address in ClientOffers.
The remote address for AMP cache rendezvous is always geolocated to the
AMP cache server address. For more accurate metrics on where this
rendezvous method is used and working, we can pull the remote address
directly from the client SDP sent in the poll request.
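A hypothetical sketch of the idea; DecodeClientPollRequest, the Offer
field, and remoteAddrFromSDP stand in for the broker's actual decoding
of the poll body:

    // Decode the poll request once in ClientOffers and derive the
    // remote address there, instead of re-decoding the
    // ClientPollRequest inside each rendezvous handler.
    func clientOffers(body []byte) error {
        req, err := messages.DecodeClientPollRequest(body)
        if err != nil {
            return err
        }
        // remoteAddrFromSDP is a made-up helper; see the
        // GetCandidateAddrs sketch at the end of this log.
        remoteAddr := remoteAddrFromSDP(req.Offer)
        // ... match with a proxy and update client stats using
        // remoteAddr instead of the HTTP / AMP cache peer address ...
        _ = remoteAddr
        return nil
    }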
This renames the metrics update functions to UpdateProxyStats and
UpdateClientStats, names that are more accurate and clearer than the
previous CountryStats and RendezvousStats.
This is a large change to how the snowflake broker metrics are
implemented. This change removes all uses of mutexes from the metrics
implementation in favor of atomic operations on counters stored in
sync.Map.
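The core pattern is roughly:

    import (
        "sync"
        "sync/atomic"
    )

    type counters struct{ m sync.Map }

    // Inc atomically increments the counter for key, creating it on
    // first use; no mutex is required.
    func (c *counters) Inc(key string) {
        v, _ := c.m.LoadOrStore(key, new(atomic.Uint64))
        v.(*atomic.Uint64).Add(1)
    }

    func (c *counters) Get(key string) uint64 {
        if v, ok := c.m.Load(key); ok {
            return v.(*atomic.Uint64).Load()
        }
        return 0
    }

After the first increment of a key, the hot path is just a map load
plus one atomic add.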
There is a small change to the actual metrics output. We used to count
the same proxy IP multiple times in our snowflake-ips-total and
snowflake-ips country stats if the same proxy IP address polled more
than once with different proxy types. This overcounting of the number
of unique proxy IP addresses is now fixed.
If a unique proxy IP polls with more than one proxy type or NAT type,
those polls will still be counted once for each proxy type or NAT type
in our per-proxy-type and per-NAT-type stats (e.g.,
snowflake-ips-nat-restricted and snowflake-ips-nat-unrestricted).
This was only useful to us when we first implemented the feature, to
support proxies that hadn't yet updated, back when we had a single
Snowflake bridge. Now that we have multiple bridges, it is unnecessary:
proxies that don't send their accepted relay pattern are rejected
anyway.
Client timeouts are currently counted from when the client is matched
with a proxy. Instead, count client timeouts from the moment when the
request is received.
Closes #40449
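Sketched, with illustrative names:

    import "time"

    func handlePoll(answers <-chan string, clientTimeout time.Duration) (string, bool) {
        startTime := time.Now() // request received
        // ... match the client with a proxy; this can take a while ...
        select {
        case answer := <-answers:
            return answer, true
        case <-time.After(clientTimeout - time.Since(startTime)):
            // The budget is measured from request receipt, so time
            // spent waiting for a match counts against the timeout.
            return "", false
        }
    }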
Some of the changes do not involve a potential race condition, so there
the change is purely a refactor, while in others (e.g., in broker.go
and in proxy/lib/snowflake.go) the same variable really is used from
multiple threads / functions.
Our metrics were undercounting client polls by missing the case where
clients are matched with a snowflake but receive a timeout before the
snowflake responds with its answer. This change adds a new metric,
called client-snowflake-timeout-count, to the 24-hour broker stats and a
new "timeout" status label for prometheus metrics.
To test that the broker responds with a proxy answer when one is
available, use only one valid client offer so that the binned metrics
always fall within the first multiple of 8.
Our SQS tests were not concurrency-safe, and we hadn't noticed until
now because we were processing incoming SQS queue messages sequentially
rather than in parallel.
This fix removes the log output checks, which were prone to error
anyway, and relies instead on gomock's expected function calls and
strategic use of the context cancel function for each test.
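The shape of the fix, with a made-up queue interface; MockQueueClient
and runPollLoop are hypothetical stand-ins for the generated mocks and
the code under test:

    import (
        "context"
        "testing"

        "github.com/golang/mock/gomock"
    )

    func TestPollLoop(t *testing.T) {
        ctrl := gomock.NewController(t)
        ctx, cancel := context.WithCancel(context.Background())
        client := NewMockQueueClient(ctrl) // hypothetical generated mock
        client.EXPECT().
            ReceiveMessage(gomock.Any()).
            DoAndReturn(func(ctx context.Context) ([]string, error) {
                cancel() // end the loop deterministically after one call
                return []string{"poll"}, nil
            }).
            MinTimes(1)
        runPollLoop(ctx, client) // returns once ctx is canceled
    }

The expectation replaces fragile log-output assertions, and the cancel
call gives each test a deterministic end point.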
We're losing a lot of messages from the broker SQS queue because they
exceed their maximum lifetime before being read and processed by the
broker. This change speeds up that process by increasing the size of
messagesChn and processing the messages within a goroutine.
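Roughly (the buffer size and names are illustrative):

    import "context"

    func pollQueue(ctx context.Context, receive func() string, handle func(string)) {
        messagesChn := make(chan string, 20) // larger buffer than before
        go func() {
            for msg := range messagesChn {
                go handle(msg) // don't block the reader on slow handling
            }
        }()
        for ctx.Err() == nil {
            messagesChn <- receive()
        }
        close(messagesChn)
    }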
I.e., if no bridge list file is provided, the relay pattern would not
get set. AFAIK this is not a breaking change, because the broker can't
be used as a library, unlike the client and server.
Implement better client IP per rendezvous method tracking for clients
Add tests for added code, fix existing tests
chore(deps): update module github.com/miekg/dns to v1.1.58
Implement better client IP tracking for http and ampcache
Add tests for added code, fix existing tests
Implement GetCandidateAddrs from SDP (see the sketch after this list)
Add getting client IP for SQS
Bug fixes
Bug fix for tests
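For reference, a hedged sketch of what extracting candidate addresses
from an SDP offer can look like; the real GetCandidateAddrs may use a
proper SDP parser and additional filtering:

    import (
        "net"
        "strings"
    )

    func GetCandidateAddrs(sdp string) []net.IP {
        var addrs []net.IP
        for _, line := range strings.Split(sdp, "\n") {
            line = strings.TrimSpace(line)
            // a=candidate:<foundation> <component> <transport>
            //   <priority> <connection-address> <port> typ <type> ...
            if !strings.HasPrefix(line, "a=candidate:") {
                continue
            }
            fields := strings.Fields(line)
            if len(fields) >= 5 {
                if ip := net.ParseIP(fields[4]); ip != nil {
                    addrs = append(addrs, ip)
                }
            }
        }
        return addrs
    }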