TL;DR: The current implementation uses a 32K buffer size, for a total of 64K of buffers per connection, but each read/write is less than 2K according to my measurements.

# Background

The Snowflake proxy uses a particularly hot function, `copyLoop` (proxy/lib/snowflake.go), to proxy data between a Tor relay and a connected client. This is currently done using the `io.Copy` function to write all incoming data both ways. Looking at the `io.Copy` implementation, it internally uses `io.CopyBuffer`, which in turn defaults to a buffer of size 32K for copying data (I checked, and the current implementation allocates 32K every time).

Since `snowflake-proxy` is intended to be run in a very distributed manner, on as many machines as possible, minimizing the CPU and memory footprint of each proxied connection would be ideal, as would maximizing throughput for clients.

# Hypothesis

There might exist a buffer size `X` that is more suitable for use in `copyLoop` than 32K.

# Testing

## Using tcpdump

Assuming you use `-ephemeral-ports-range 50000:51000` for `snowflake-proxy`, you can capture the UDP packets being proxied using

```sh
sudo tcpdump -i <interface> udp portrange 50000-51000
```

which will provide a `length` value for each packet captured. One good starting value for `X` could then be slightly larger than the largest captured packet, assuming one packet is copied at a time. Experimentally, I found this value to be 1265 bytes, which would make `X = 2K` a possible starting point.

## Printing actual read sizes

The following snippet was added in `proxy/lib/snowflake.go`:

```go
// Taken straight from the standard library's io.copyBuffer,
// with a log line added on each read.
func copyBuffer(dst io.Writer, src io.Reader, buf []byte) (written int64, err error) {
	// If the reader has a WriteTo method, use it to do the copy.
	// Avoids an allocation and a copy.
	if wt, ok := src.(io.WriterTo); ok {
		return wt.WriteTo(dst)
	}
	// Similarly, if the writer has a ReadFrom method, use it to do the copy.
	if rt, ok := dst.(io.ReaderFrom); ok {
		return rt.ReadFrom(src)
	}
	if buf == nil {
		size := 32 * 1024
		if l, ok := src.(*io.LimitedReader); ok && int64(size) > l.N {
			if l.N < 1 {
				size = 1
			} else {
				size = int(l.N)
			}
		}
		buf = make([]byte, size)
	}
	for {
		nr, er := src.Read(buf)
		if nr > 0 {
			log.Printf("Read: %d", nr) // THIS IS THE ONLY DIFFERENCE FROM io.copyBuffer
			nw, ew := dst.Write(buf[0:nr])
			if nw < 0 || nr < nw {
				nw = 0
				if ew == nil {
					ew = errors.New("invalid write result")
				}
			}
			written += int64(nw)
			if ew != nil {
				err = ew
				break
			}
			if nr != nw {
				err = io.ErrShortWrite
				break
			}
		}
		if er != nil {
			if er != io.EOF {
				err = er
			}
			break
		}
	}
	return written, err
}
```

and `copyLoop` was amended to use this instead of `io.Copy`. The `Read: BYTES` lines were saved to a file using this command:

```sh
./proxy -verbose -ephemeral-ports-range 50000:50010 2>&1 >/dev/null | awk '/Read: / { print $4 }' | tee read_sizes.txt
```

I got the result:

- min: 8
- max: 1402
- median: 1402
- average: 910.305

Suggested buffer size: 2K. Current buffer size: 32768 (32K, experimentally verified).

## Using a Snowflake proxy in Tor Browser with Wireshark

I also captured the traffic with Wireshark and concluded that all packets sent were < 2K.

# Conclusion

As per the commit, I suggest changing the buffer size to 2K. Some things I have not been able to answer:

1. Does this make a big impact on performance?
2. Are there any unforeseen consequences? What happens if a packet is > 2K? (I think the Go standard library just splits the packet, but someone please confirm.)
# Snowflake

Pluggable Transport using WebRTC, inspired by Flashproxy.
## Structure of this Repository

- `broker/` contains code for the Snowflake broker
- `doc/` contains Snowflake documentation and manpages
- `client/` contains the Tor pluggable transport client and client library code
- `common/` contains generic libraries used by multiple pieces of Snowflake
- `proxy/` contains code for the Go standalone Snowflake proxy
- `probetest/` contains code for a NAT probetesting service
- `server/` contains the Tor pluggable transport server and server library code
## Usage
Snowflake is currently deployed as a pluggable transport for Tor.
### Using Snowflake with Tor
To use the Snowflake client with Tor, you will need to add the appropriate `Bridge` and `ClientTransportPlugin` lines to your torrc file. See the client README for more information on building and running the Snowflake client.
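As a rough illustration, such a torrc configuration might look like the following. The binary path, address, port, and fingerprint below are placeholders, not real values; consult the client README for the actual bridge lines.

```
# Hypothetical example -- all values are placeholders.
UseBridges 1
ClientTransportPlugin snowflake exec /path/to/snowflake-client
Bridge snowflake 192.0.2.3:80 <bridge-fingerprint>
```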
### Running a Snowflake Proxy

You can contribute to Snowflake by running a Snowflake proxy. You have the option of running a proxy in your browser or as a standalone Go program. See our community documentation for more details.
### Using the Snowflake Library with Other Applications
Snowflake can be used as a Go API, and adheres to the v2.1 pluggable transports specification. For more information on using the Snowflake Go library, see the Snowflake library documentation.
## Test Environment
There is a Docker-based test environment at https://github.com/cohosh/snowbox.
## FAQ
Q: How does it work?
In the Tor use-case:
- Volunteers visit websites which host the "snowflake" proxy. (just like flashproxy)
- Tor clients automatically find available browser proxies via the Broker (the domain fronted signaling channel).
- Tor client and browser proxy establish a WebRTC peer connection.
- Proxy connects to some relay.
- Tor occurs.
More detailed information about how clients, Snowflake proxies, and the Broker fit together is on the way...
Q: What are the benefits of this PT compared with other PTs?
Snowflake combines the advantages of flashproxy and meek. Primarily:
- It has the convenience of Meek, but can support magnitudes more users with negligible CDN costs. (Domain fronting is only used for brief signaling / NAT-piercing to set up the P2P WebRTC DataChannels which handle the actual traffic.)
- Arbitrarily high numbers of volunteer proxies are possible, as in flashproxy, but NATs are no longer a usability barrier - no need for manual port forwarding!
Q: Why is this called Snowflake?
It utilizes the "ICE" negotiation via WebRTC, and also involves a great abundance of ephemeral and short-lived (and special!) volunteer proxies...
## More info and links
We have more documentation in the Snowflake wiki and at https://snowflake.torproject.org/.
# Android AAR Reproducible Build Setup
Using `gomobile`, it is possible to build Snowflake as shared libraries for all the architectures supported by Android. This build is defined in .gitlab-ci.yml and runs in GitLab CI. It is also possible to run this setup in a virtual machine using Vagrant: run `vagrant up` to create and provision the VM, then `vagrant ssh` to get into the VM and use it as a development environment.
## uTLS Settings

Snowflake communicates with the broker, which serves as a signaling server, over a TLS-based domain-fronted connection. This connection may be identified by its use of the Go TLS stack.

uTLS is a software library designed to imitate the TLS Client Hello fingerprints of browsers and other popular software, in order to evade censorship based on TLS Client Hello fingerprinting. Enable it with `-utls-imitate`; you can use `-version` to see a list of supported values.

Depending on client and server configuration, it may not always work as expected, as not all extensions are correctly implemented.

You can also remove the SNI (Server Name Indication) from the Client Hello with `-utls-nosni` to evade censorship, though not all servers support this.