60 commits

Author SHA1 Message Date
Cecylia Bocovich
766988b77b
Add proxy type label to ProxyPollTotal metric
Annotate ProxyPollTotal prometheus metrics with the proxy type so that
we can track counts of proxies that are matched and that answer by
implementation. This will help us catch bugs by implementation or
deployment.
2025-09-09 18:18:42 -04:00
Cecylia Bocovich
d08efc34c3
Add prometheus metric for proxy answer counts
This adds a prometheus metric that tracks snowflake proxy answers. If
the client has not timed out before the proxy responds with an answer,
the proxy type is recorded along with a status of "success". If the
client has timed out, the type is left blank and the status is recorded
as "timeout".

The goal of these metrics is to help us determine how many proxies fail
to respond and to help narrow down which proxy implementations are
causing client timeouts.
2025-09-09 12:41:27 -04:00
Cecylia Bocovich
c8b0b31601
Clear map of seen proxy IP addresses
We were not previously clearing the map we keep of seen IP addresses,
which resulted in our unique proxy IP counts representing churn rather
than unique IP counts per day, except during broker process restarts.
2025-08-20 09:35:32 -04:00
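The seen-IP bookkeeping described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the broker's code: the `ipTracker` type and its `record` and `reset` methods are hypothetical names, and the sketch assumes the seen set lives in a `sync.Map` alongside an atomic counter. It shows why the set has to be emptied at the end of each reporting period: otherwise an IP seen on day one is never counted again.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// ipTracker is a hypothetical type for illustration only.
type ipTracker struct {
	seen  sync.Map      // IP string -> struct{}
	total atomic.Uint64 // unique IPs seen in the current period
}

// record counts ip once per period and reports whether it was new.
func (t *ipTracker) record(ip string) bool {
	_, loaded := t.seen.LoadOrStore(ip, struct{}{})
	if !loaded {
		t.total.Add(1)
	}
	return !loaded
}

// reset empties the seen set and returns the count for the period that
// just ended. Without the Range/Delete step, IPs from earlier periods
// would stay in the set and later periods would only count newcomers.
func (t *ipTracker) reset() uint64 {
	t.seen.Range(func(k, _ any) bool {
		t.seen.Delete(k)
		return true
	})
	return t.total.Swap(0)
}

func main() {
	var t ipTracker
	t.record("192.0.2.1")
	t.record("192.0.2.1") // duplicate, not counted again
	t.record("192.0.2.2")
	fmt.Println(t.reset())             // count for the first period
	fmt.Println(t.record("192.0.2.1")) // counted again after the reset
}
```

The reset in this sketch is not atomic with respect to concurrent `record` calls; a real implementation would need to decide how to handle polls that arrive while the period rolls over.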
David Fifield
fd42bcea8a Comment typo. 2025-08-19 14:50:23 +00:00
David Fifield
74c39cc8e9
Bin country stats counts before sorting, not after.
This avoids an information leak where, if two countries have the same
count but are not in alphabetical order, you know the first one had a
larger count than the second one before binning.

Example: AA=25,BB=27

Without binning, these should sort descending by count:
BB=27,AA=25

But with binning, the counts are the same, so it should sort ascending
by country code:
AA=32,BB=32

Before this change, BB would sort before AA even after binning, which
lets you infer that the count of BB was greater than the count of AA
*before* binning:
BB=32,AA=32
2025-08-19 09:58:51 -04:00
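The difference between the two orderings in the commit above can be reproduced with a short sketch. The names `record`, `binCount`, `binThenSort`, and `sortThenBin` are illustrative, and the bin width of 8 is an assumption chosen because it maps 25 and 27 to 32, as in the commit's example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type record struct {
	cc    string
	count uint64
}

// binCount rounds up to a multiple of 8 (assumed bin width).
func binCount(n uint64) uint64 { return (n + 7) / 8 * 8 }

// sortRecords orders descending by count, then ascending by country code.
func sortRecords(rs []record) {
	sort.Slice(rs, func(i, j int) bool {
		if rs[i].count != rs[j].count {
			return rs[i].count > rs[j].count
		}
		return rs[i].cc < rs[j].cc
	})
}

func format(rs []record) string {
	parts := make([]string, len(rs))
	for i, r := range rs {
		parts[i] = fmt.Sprintf("%s=%d", r.cc, r.count)
	}
	return strings.Join(parts, ",")
}

// binThenSort bins first, so ties are broken only by country code.
func binThenSort(in []record) string {
	rs := make([]record, len(in))
	for i, r := range in {
		rs[i] = record{cc: r.cc, count: binCount(r.count)}
	}
	sortRecords(rs)
	return format(rs)
}

// sortThenBin is the leaky order: the output order still reflects the
// exact counts even though only binned values are printed.
func sortThenBin(in []record) string {
	rs := append([]record(nil), in...)
	sortRecords(rs)
	for i := range rs {
		rs[i].count = binCount(rs[i].count)
	}
	return format(rs)
}

func main() {
	in := []record{{cc: "AA", count: 25}, {cc: "BB", count: 27}}
	fmt.Println(binThenSort(in)) // AA=32,BB=32
	fmt.Println(sortThenBin(in)) // BB=32,AA=32
}
```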
David Fifield
ed3bd99df6 Rename displayCountryStats to formatAndClearCountryStats.
The old name did not make it clear that the function has the side effect
of clearing the map.
2025-08-15 19:24:58 +00:00
David Fifield
75daf2210f Refactor displayCountryStats.
Move the record types closer to where they are used.

Use a strings.Builder rather than repeatedly concatenating strings
(which creates garbage).

Use the value that m.Range already provides us, don't look it up again
with LoadAndDelete.

Add documentation comments.
2025-08-15 19:24:58 +00:00
David Fifield
6e0e5f9137 Express records.Less more clearly. 2025-08-15 19:24:58 +00:00
David Fifield
fed11184c7 Have records.Less express the order we want directly.
The ordering is descending by count, then ascending by cc. Express that
directly, rather than specifying the opposite ordering and using
sort.Reverse.
2025-08-15 19:24:58 +00:00
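A comparator that states the ordering directly might look like the sketch below. The `record` and `records` types are written for illustration and are not copied from the broker source; the input is borrowed from the snowflake-ips example in the 2020 sorting commit further down this log.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type record struct {
	cc    string
	count uint64
}

type records []record

func (r records) Len() int      { return len(r) }
func (r records) Swap(i, j int) { r[i], r[j] = r[j], r[i] }

// Less orders descending by count, then ascending by country code,
// so no sort.Reverse wrapper is needed.
func (r records) Less(i, j int) bool {
	if r[i].count != r[j].count {
		return r[i].count > r[j].count
	}
	return r[i].cc < r[j].cc
}

// sortedString sorts a copy of rs and renders it as "CC=n,CC=n,...".
func sortedString(rs records) string {
	sorted := append(records(nil), rs...)
	sort.Sort(sorted)
	parts := make([]string, len(sorted))
	for i, r := range sorted {
		parts[i] = fmt.Sprintf("%s=%d", r.cc, r.count)
	}
	return strings.Join(parts, ",")
}

func main() {
	rs := records{
		{"CA", 1}, {"DE", 1}, {"AR", 1}, {"NL", 1},
		{"FR", 1}, {"GB", 2}, {"US", 4}, {"CH", 1},
	}
	fmt.Println(sortedString(rs)) // US=4,GB=2,AR=1,CA=1,CH=1,DE=1,FR=1,NL=1
}
```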
David Fifield
b058b10a94 Express binCount using integer operations.
No need to bring a float64 into this.
2025-08-15 19:24:58 +00:00
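Rounding a count up to its bin without floating point can be done with a single integer expression. The sketch below assumes a bin width of 8, which is consistent with the AA=25 and BB=27 inputs both becoming 32 in the binning commit earlier in this log; the actual function body may differ.

```go
package main

import "fmt"

// binCount rounds count up to the next multiple of 8 using only integer
// arithmetic. Adding 7 before the truncating division is what turns
// "round down" into "round up". The addition wraps around for counts
// within 7 of the uint64 maximum, which is ignored in this sketch.
func binCount(count uint64) uint64 {
	return (count + 7) / 8 * 8
}

func main() {
	for _, n := range []uint64{0, 1, 8, 25, 27, 33} {
		fmt.Printf("%d -> %d\n", n, binCount(n))
	}
}
```

Exact multiples of 8 map to themselves, and zero stays zero, so an empty counter is still reported as 0.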
Cecylia Bocovich
58b1d48e54
Increment prometheus proxy_total count once per IP
This fixes a regression from !574 that did not check whether the IP was
unique before incrementing the counter.

Closes https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40470
2025-07-10 10:41:26 -04:00
Cecylia Bocovich
1d73e14f34
Rename metrics update functions
This changes the metrics update functions to UpdateProxyStats and
UpdateClientStats, which is more accurate and clear than the previous
CountryStats and RendezvousStats names.
2025-06-24 13:12:10 -04:00
Cecylia Bocovich
78cf8e68b2
Simplify broker metrics and remove mutexes
This is a large change to how the snowflake broker metrics are
implemented. This change removes all uses of mutexes from the metrics
implementation in favor of atomic operations on counters stored in
sync.Map.

There is a small change to the actual metrics output. We used to count
the same proxy ip multiple times in our snowflake-ips-total and
snowflake-ips country stats if the same proxy ip address polled more
than once with different proxy types. This was an overcounting of the
number of unique proxy IP addresses that is now fixed.

If a unique proxy ip polls with more than one proxy type or nat type,
these polls will still be counted once for each proxy type or nat type
in our proxy type and nat type specific stats (e.g.,
snowflake-ips-nat-restricted and snowflake-ips-nat-unrestricted).
2025-06-24 13:12:10 -04:00
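The general pattern of mutex-free counters kept in a sync.Map can be sketched as below. This is a hedged illustration of the technique, not the broker's implementation: the `counters` type, its methods, and the key name are hypothetical, and the sketch assumes Go 1.19 or later for `atomic.Uint64`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counters maps a metric name to a *atomic.Uint64. No mutex is needed:
// sync.Map handles concurrent access to the map, and each counter is
// updated atomically.
type counters struct {
	m sync.Map
}

// inc increments the named counter, creating it on first use.
// LoadOrStore guarantees all goroutines end up sharing one counter
// per key, even if several race to create it.
func (c *counters) inc(key string) {
	v, _ := c.m.LoadOrStore(key, new(atomic.Uint64))
	v.(*atomic.Uint64).Add(1)
}

// get returns the current value, or 0 if the counter does not exist.
func (c *counters) get(key string) uint64 {
	v, ok := c.m.Load(key)
	if !ok {
		return 0
	}
	return v.(*atomic.Uint64).Load()
}

func main() {
	var c counters
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc("example-count") // hypothetical metric name
		}()
	}
	wg.Wait()
	fmt.Println(c.get("example-count")) // 100
}
```

One trade-off of this shape is that `inc` allocates a fresh counter on every call and discards it when the key already exists; a `Load` before the `LoadOrStore` would avoid that at the cost of a little more code.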
Cecylia Bocovich
57dc276e48
Update broker metrics to count matches, denials, and timeouts
Our metrics were undercounting client polls by missing the case where
clients are matched with a snowflake but receive a timeout before the
snowflake responds with its answer. This change adds a new metric,
called client-snowflake-timeout-count, to the 24 hour broker stats and a
new "timeout" status label for prometheus metrics.
2025-03-11 12:36:27 -04:00
meskio
f64f234eeb
New ptutil/safeprom doesn't have Rounded in the type names
This version fixes the test issue of double registering metrics.

* Closes: #40367
2024-07-11 17:45:57 +02:00
meskio
a9df5dd71a
Use ptutil for safelog and prometheus rounded metrics
* Related: #40354
2024-05-09 16:24:33 +02:00
Michael Pu
b512e242e8 Implement better client IP per rendezvous method tracking for clients

Add tests for added code, fix existing tests

chore(deps): update module github.com/miekg/dns to v1.1.58

Implement better client IP tracking for http and ampcache

Add tests for added code, fix existing tests

Implement GetCandidateAddrs from SDP

Add getting client IP for SQS

Bug fixes

Bug fix for tests
2024-03-09 13:36:25 -05:00
Michael Pu
5f5cbe6431
Prune metrics that are reported for rendezvous
Signed-off-by: Cecylia Bocovich <cohosh@torproject.org>
2024-01-31 14:34:32 -05:00
Anthony Chang
dbecefa7d2
Move RendezvousMethod field to messages.Arg 2024-01-31 14:34:29 -05:00
Michael Pu
26ceb6e20d
Add metrics for tracking rendezvous method
Update tests for metrics

Add rendezvous_method to Prometheus metrics

Update broker spec docs with rendezvous method metrics

Bug fix
2024-01-31 14:34:29 -05:00
David Fifield
6393af6bab
Remove proxy churn measurements from broker.
We've done the analysis we planned to do on these measurements.

A program to analyze the proxy churn and extract hour-by-hour
intersections is available at:
https://github.com/turfed/snowflake-paper/tree/main/figures/proxy-churn

Closes #40280.
2023-10-09 16:16:05 +01:00
meskio
82cc0f38f7
Move the development to gitlab
Related: tpo/anti-censorship/team#86
2023-05-31 10:01:47 +02:00
Shelikhoo
36f03dfd44
Record proxy type for proxy relay stats 2022-09-23 13:08:13 +01:00
Shelikhoo
fa7d1e2bb7
Add distinct IP counter to metrics 2022-06-16 14:58:12 +01:00
Shelikhoo
a4bbb728e6
Fix not zero metrics for 1.3 values 2022-06-16 14:06:58 +01:00
Shelikhoo
dd61e2be0f
Add Proxy Relay URL Metrics Collection 2022-06-16 14:06:57 +01:00
Shelikhoo
b78eb74e42
Add Proxy Relay URL Rejection Metrics 2022-06-16 14:06:57 +01:00
Shelikhoo
7caab01785
Fixed desynchronized comment and behavior for log interval
In 64ce7dff1b, the log interval is modified while the comment is left unchanged.
2022-06-16 14:06:57 +01:00
Shelikhoo
b391d98679
Add Proxy Relay URL Support Counting Metrics Output 2022-06-16 14:06:57 +01:00
meskio
b265bd3092
Make it easier to extend the list of known proxy types
And include iptproxy as a valid proxy type.
2022-03-21 19:23:49 +01:00
meskio
4396d505a3
Use tpo geoip library
The geoip implementation has now been moved to its own library to be
shared between projects.
2021-10-04 12:24:55 +02:00
Arlo Breault
7ef49272fa Remove sync.Once from around logMetrics
Follow up to 160ae2d

Analysis by @dcf,

> I don't think the sync.Once around logMetrics is necessary anymore.
Its original purpose was to inhibit logging on later file handles of
metrics.log, if there were more than one opened. See 171c55a9 and #29734
(comment 2593039) "Making a singleton *Metrics variable causes problems
with how Convey does tests. It shouldn't be called more than once, but
for now I'm using sync.Once on the logging at least so it's explicit."
Commit ba4fe1a7 changed it so that metrics.log is opened in main, used
to create a *log.Logger, and that same instance of *log.Logger is passed
to both NewMetrics and NewBrokerContext. It's safe to share the same
*log.Logger across multiple BrokerContext.
2021-05-20 15:39:30 -04:00
Arlo Breault
160ae2dd71 Make promMetrics not a global
Doesn't seem like it needs to exist outside of the metrics struct.

Also, the call to logMetrics is moved to the constructor.  A metrics
instance is only created when a BrokerContext is created, which only
happens at startup.  The sync of only doing that once is left for
documentation purposes, since it doesn't hurt, but also seems redundant.
2021-05-18 20:07:43 -04:00
Cecylia Bocovich
af6e2c30e1 Replace default with custom prometheus registry
The default prometheus registry exports data that may be useful for
side-channel attacks. This removes all of the default metrics and makes
sure we are only reporting snowflake metrics from the broker.
2021-04-26 14:18:50 -04:00
Cecylia Bocovich
2a310682b5 Add new gauge to show currently available proxies 2021-04-26 14:18:50 -04:00
Cecylia Bocovich
92bd900bc5 Implement binned counts for polling metrics 2021-04-26 14:07:55 -04:00
Cecylia Bocovich
83ef0b6f6d Export snowflake broker metrics for prometheus
This change adds a prometheus exporter for our existing snowflake broker
metrics. Current values for the metrics can be fetched by sending a GET
request to /prometheus.
2021-04-22 10:39:35 -04:00
Philipp Winter
5efcde5187
Sort snowflake-ips stats by country count.
We currently don't sort the snowflake-ips metrics:

    snowflake-ips CA=1,DE=1,AR=1,NL=1,FR=1,GB=2,US=4,CH=1

To facilitate eyeballing our metrics, this patch sorts snowflake-ips by
value.  If the value is identical, we sort by string, i.e.:

    snowflake-ips US=4,GB=2,AR=1,CA=1,CH=1,DE=1,FR=1,NL=1

This patch fixes tpo/anti-censorship/pluggable-transports/snowflake#40011
2020-11-27 11:20:40 -08:00
Cecylia Bocovich
3c3317503e Update broker stats to include info on NAT types
As we now partition proxies by NAT type, our stats are more useful if they
capture how many proxies of each type we have, and information on
whether we have enough proxies of the right NAT type for our clients.
This change adds proxy counts by NAT type and binned counts of denied clients by NAT type.
2020-08-24 09:39:17 -04:00
Cecylia Bocovich
06298eec73 Added another lock to protect broker stats
Added another lock to the metrics struct to synchronize accesses to the
broker stats. There's a possible race condition if stats are updated at
the same time they are being logged.
2019-12-05 10:17:20 -05:00
Cecylia Bocovich
94de69aa36 Updated broker specification and comments 2019-11-28 13:52:58 -05:00
Cecylia Bocovich
97554e03e4 Updated proxyType variable name for readability 2019-11-28 13:52:58 -05:00
Cecylia Bocovich
981abffbd9 Add proxy type to stats exported by broker 2019-11-28 13:52:58 -05:00
Shane Howearth
3cfceb3755 Handle generated errors in broker 2019-10-08 10:13:29 -04:00
Cecylia Bocovich
f3be34a459 Removed extraneous log messages
Many of our log messages were being used to generate metrics, but are
now being aggregated and logged to a separate metrics log file and so we
don't need them in the regular logs anymore.

This addresses the goal of ticket #30830, to remove unnecessary messages
and keep broker logs for debugging purposes.
2019-09-19 16:48:14 -04:00
Cecylia Bocovich
8f2dc3563b Added a metric that sums available snowflakes
Added another metrics item that counts the total available snowflakes
(unique by IP address)
2019-06-25 09:33:45 -04:00
Cecylia Bocovich
f779013b2d Fixed small formatting errors of log output
- removed trailing ","s
- removed unnecessary space before seconds
2019-06-14 17:09:06 -04:00
Cecylia Bocovich
0767a637c1 Changed variable names/types to be more reasonable
Also moved the geoip check to occur after we've made sure the proxy IP
hasn't yet been recorded. This will cut down on unnecessary
computation.
2019-06-14 17:00:31 -04:00
Cecylia Bocovich
92d61f2555 Added a comment for the metrics specification 2019-06-12 10:17:55 -04:00
Cecylia Bocovich
fe3356a54d Unit tests for metrics code
Added unit tests for metrics logging. Refactored the logMetrics()
function to allow for easier testing
2019-06-12 10:14:21 -04:00