16:03:07 <onyinyang> #startmeeting tor anti-censorship meeting
16:03:07 <MeetBot> Meeting started Thu Sep 12 16:03:07 2024 UTC.  The chair is onyinyang. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:07 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:03:07 <onyinyang> hello everyone!
16:03:07 <onyinyang> here is our meeting pad: https://pad.riseup.net/p/r.9574e996bb9c0266213d38b91b56c469
16:03:15 <meskio> now yes, hello :)
16:03:16 <onyinyang> sorry for the delay everyone >.<
16:03:35 <meskio> no prob, just a couple of minutes, we are setting very high standards on punctuality
16:03:37 <shelikhoo> hi~
16:03:42 <cohosh> hi
16:06:14 <onyinyang> ok so first we have an announcement: https://bridges.torproject.org and moat are now served by rdsys
16:06:23 <shelikhoo> yeah!
16:06:27 <onyinyang> yay!
16:06:42 <meskio> the email distributor is still having issues, and we did roll back to bridgedb until we solve them
16:06:47 <meskio> but moat and https are migrated
16:06:54 <meskio> almost there to deprecate bridgedb
16:07:18 <onyinyang> nice work
16:07:25 <shelikhoo> nice!
16:08:13 <onyinyang> next we have some discussion points
16:08:41 <onyinyang> The first one is: shadow integration for snowflake catching some good stuff
16:09:01 <onyinyang> I believe this is from cohosh
16:09:41 <cohosh> i wanted to bring up the usefulness of this kind of test, especially since we're doing a lot more dependency upgrades these days
16:10:32 <cohosh> catching the breaking webrtc dependency update was almost by accident; i haven't confirmed this, but shadow supports a limited set of system calls and socket options that sometimes overlaps with older versions of android, and i think that's what's causing the tests to fail
16:10:57 <meskio> yeah, that reminds me I need to bring back the integration tests into rdsys, those are very useful
16:11:08 <cohosh> but anyway, it was sort of a trial run to do this with snowflake and see if it's useful and it does seem useful enough to push for using it for other projects
16:11:56 <meskio> yes
16:12:01 <cohosh> i'm not sure i have anything more to add
16:12:31 <shelikhoo> just want to say I find integration tests really useful
16:12:45 <shelikhoo> eof
16:13:59 <onyinyang> ok nice, thanks cohosh
16:14:18 <onyinyang> next discussion point is: https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/315#note_3055755
16:14:18 <onyinyang> this UDP-like transport mode merge request introduces a regression for snowflake
16:14:18 <onyinyang> we should release a version before it is merged
16:14:31 <shelikhoo> yes, this is from shell
16:14:58 <shelikhoo> the current situation is that for simplicity, tcp like transport mode is being removed from snowflake in this merge request
16:15:33 <dcf1> this may be a misunderstanding
16:16:03 <shelikhoo> and one of the APIs we have for the previous version of snowflake no longer makes sense, as a mandatory field for the current UDP-like version of snowflake is not present in the previous version of snowflake
16:16:13 <dcf1> For the purposes of review, I wanted this merge request !315 to ignore backward compatibility issues
16:16:43 <dcf1> that is, to *replace* the current TCP-like mode with the new UDP-mode, with no option for version negotiation or anything like that
16:17:20 <dcf1> When this new feature is merged, it will retain compatibility with the old TCP-like mode, but that is a complication I wanted to separate from the basic functionality of the new UDP-like mode
16:18:16 <dcf1> My idea was that we could set up a temporary, independent, testing installation, based on the changes of !315, to test performance and think thoughtfully about the code changes, before working on making it backward compatible with the existing deployment
16:19:11 <shelikhoo> yes, the misunderstanding I currently have with ignoring compatibility issues is that the final design goal must already be present when developing an intermediate product
16:19:31 <shelikhoo> otherwise we'll run into issues when actually trying to add the compatibility back
16:19:33 <dcf1> no, I disagree on that point, and that is the kind of thinking I want to avoid
16:20:08 <shelikhoo> okay, I will just go ahead and ignore all these regression issues as well for review purposes
16:20:31 <shelikhoo> and find a way to fix them later
16:20:45 <dcf1> the central issue is the design questions around the UDP-like mode. as a reviewer, I want to see exactly what changes happen, for example
16:20:48 <dcf1> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/315/diffs#f29769643d0f510550b42e7b91446bbb901b6991_238_249
16:21:21 <dcf1> which changes `ordered` from `true` to `false`. I want to be able to see exactly those changes, not obscured with newly added if/else statements
16:21:30 <dcf1> to help let you know where I am coming from
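(For context, a minimal sketch of the kind of one-line change dcf1 points to above — not the actual diff from !315 — assuming the pion/webrtc v3 API that snowflake builds on; the channel label and function name here are placeholders.)

```go
package sketch

import "github.com/pion/webrtc/v3"

// newUnorderedDataChannel creates a data channel configured for the
// UDP-like mode. Sketch only: in the TCP-like mode, ordered would be true.
func newUnorderedDataChannel(pc *webrtc.PeerConnection) (*webrtc.DataChannel, error) {
	ordered := false // the change under review: was true in the TCP-like mode
	return pc.CreateDataChannel("snowflake", &webrtc.DataChannelInit{
		Ordered: &ordered,
	})
}
```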
16:21:44 <shelikhoo> yes.. okay I understood this part now
16:22:01 <shelikhoo> for that change I am reverting it after reading the source code of kcp
16:22:12 <shelikhoo> but anyway I understand what is happening with the review now
16:22:12 <dcf1> so yes, from my point of view, please treat this merge request as an incompatible upgrade, as if we were going to start a new independent deployment. then we will add in the if/else afterwards.
16:22:41 <shelikhoo> yes! I will do that and ignore any regression issue for now
16:22:42 <shelikhoo> over
16:23:09 <dcf1> yeah! and I think the MR is already in a pretty good state where the design issues are almost settled and we can start doing the backward compatibility
16:23:12 <dcf1> thank you
16:23:27 <onyinyang> anything more on this topic?
16:23:33 <shelikhoo> eof from me
16:24:01 <onyinyang> ok great. We have a few interesting links
16:24:45 <onyinyang> we don't always talk about all of them so I guess if anyone wants to discuss anything in particular, now's your chance!
16:25:02 <WofWca[m]> Sorry to come in unannounced, but there is a significant drop in unrestricted proxies, according to metrics.... (full message at <https://matrix.org/oftc/media/v1/media/download/AWnPFVSD0HMie05bbjYS9kaQjXshA0VLFecpOVIuuMcRFpFmDYE-19V54338MDUxBpaOpO1A8mpUmtIlNNs_XhNCeR6MTgwAAG1hdHJpeC5vcmcvaGpzSFZoeE94ZUZFTGtncW5CbGtDd3ph>)
16:25:23 <meskio> WofWca[m]: you are welcome to come to these meetings :)
16:27:32 <cohosh> thanks for keeping an eye on this WofWca[m]
16:27:50 <shelikhoo> unsure why matrix loves to cut the message and show a longer link.. but yes, I think many things could have happened with the nat type test
16:27:58 <dcf1> ❤️ WofWca[m] shelikhoo thank you for your record keeping around the deployment, it really helps for things like this
16:28:18 <dcf1> https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40383
16:28:32 <cohosh> i haven't had a chance to catch up with the NAT testing discussions that have been happening over the last weekend and this week
16:28:33 <shelikhoo> yes! I try to keep a public journal of server actions
16:30:42 <shelikhoo> I think there could be many reasons why it failed, but for candidate gathering, one possibility is that when there is packet loss it takes some time before retrying
16:30:47 <cohosh> i just checked the prometheus metrics and our unrestricted proxy capacity looks pretty good
16:31:13 <dcf1> So it's probably a safe default to revert the deployment, and then regroup and think about what might be going wrong, if there's something going wrong
16:31:32 <dcf1> cohosh: if you're looking at prometheus, that might be more robust than the broker /metrics.
16:31:54 <shelikhoo> but otherwise, the nat type of the remote end does not matter for the time it takes to gather candidates for ourselves
16:32:27 <shelikhoo> but it would matter when it comes to connecting with the remote peer
16:32:28 <cohosh> i see a slight drop in idle polls for unrestricted proxies, but it's not half of what it was
16:33:20 <cohosh> it's honestly barely noticeable, i'm looking at the past 7 days
16:33:38 <cohosh> polls != unique IPs, but I'd expect them to be related
16:34:26 <shelikhoo> I think one thing we could do is to increase the timeout to 60 seconds and deploy again
16:34:32 <shelikhoo> and then observe consequences
16:34:49 <shelikhoo> unless someone could directly find out the root cause of the poll drop
16:35:05 <shelikhoo> or we wish to ignore the drop
16:37:13 <dcf1> If it doesn't look like it's urgent, we could just watch it over the weekend and fix a time early next week to reassess
16:37:26 <cohosh> that would be my suggestion
16:37:37 <meskio> +1
16:37:40 <cohosh> while we're here, WofWca[m] thanks for all the snowflake contributions lately :)
16:37:48 <shelikhoo> yes
16:38:02 <shelikhoo> thank you WofWca[m]!!!
16:38:03 <cohosh> some of them are very large, like the changes to distributed snowflake server support, and i think we should carve out time to discuss them at a meeting
16:38:24 <cohosh> i haven't had time to sit down and really think about them yet, but maybe next week or the week after if enough people are around
16:38:31 <WofWca[m]> Glad I'm welcome!
16:38:31 <WofWca[m]> Thanks for reviewing, and for making it so easy to work with in the first place!
16:38:48 <cohosh> they are the kind of changes that should have multiple eyes on them, not just one reviewer
16:39:12 <WofWca[m]> cohosh: Sure, let's do that sometime.
16:39:29 <cohosh> :)
16:40:22 <onyinyang> thanks for bringing this up WofWca[m] and nice to see you here, feel free to come again anytime :)
16:40:38 <onyinyang> ok let's move on to the reading group
16:40:49 <meskio> just a short notice on the links
16:40:58 <onyinyang> oh ok, sure
16:41:01 <meskio> the passport link sounds interesting as an idea for a signaling channel
16:41:05 <meskio> I added it to the wiki
16:41:12 <meskio> https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Signaling-Channels
16:41:15 <onyinyang> nice
16:41:18 <dcf1> I agree, I was thinking that for a signaling channel.
16:41:20 <shelikhoo> yes! nice!
16:41:27 <meskio> dcf1: thanks for bringing it
16:41:45 <meskio> eof
16:41:47 <cohosh> oh nice
16:42:38 <cohosh> is that a papers, please reference lol
16:42:47 <meskio> XD
16:43:55 <onyinyang> speaking of papers. . .
16:44:07 <dcf1> No presentation video of SpotProxy yet
16:44:09 <dcf1> https://www.usenix.org/conference/usenixsecurity24/presentation/kon
16:44:13 <onyinyang> today's reading group is "SpotProxy: Rediscovering the Cloud for Censorship Circumvention"
16:44:32 <dcf1> onyinyang: do you want me to give a brief intro?
16:44:38 <onyinyang> thanks for providing the link dcf1
16:44:41 <dcf1> or you have one?
16:45:02 <meskio> yes, please, give an intro
16:45:20 <onyinyang> sure. I forgot the reading group was today but read the paper a couple of months ago. my summary might be a bit hazy
16:45:20 <dcf1> the idea here is to run circumvention proxies on "spot VMs" in the cloud
16:45:45 <dcf1> What a "spot VM" means is that you get relatively inexpensive VPS service, but it can be taken away at any moment according to load
16:46:02 <dcf1> that's why they are cheaper, you don't have any guarantee of permanence
16:46:31 <dcf1> but of course, impermanent proxies are no problem when you build on an infrastructure like Snowflake that expects it
16:46:59 <dcf1> and that's what they do in this paper, they build on cheap, unreliable spot VMs, with implementations using Snowflake and Wireguard
16:47:34 <dcf1> (recall Wireguard effectively has its own "turbo tunnel" reliability layer, because it works at a lower layer in the stack; effectively the TCP stacks at both ends are doing what KCP does in snowflake)
16:49:05 <dcf1> The architecture (Figure 1) is pretty similar to what we're used to with Snowflake, they have an analog to the broker called the Controller, and an analog to the bridge that consolidates all the ephemeral proxy instances into a reliable central connection
16:49:43 <dcf1> one of the cool things about this paper is the "active migration". It turns out, you get a little bit of notice (around 1 minute, I think), when your spot VM is going to go away
16:50:22 <dcf1> using that little bit of advance notice, they do another rendezvous (in effect) *inside* the already established channel, which avoids the need to re-rendezvous through an actual rendezvous channel as we do in snowflake
16:50:55 <dcf1> the other part of this paper I found interesting was the optimization they perform to try and maintain use of the cheapest spot VMs
16:51:09 <dcf1> eof
16:51:11 <shelikhoo> I think with an unreliable, packet-based proxy like wireguard, making the proxy tolerant of network interruptions is easier
16:51:11 <shelikhoo> The part I find unrealistic about this paper is the cost reduction. I think the spot server only needs to forward traffic and does not need to be a top-tier server at all, so a smaller server would still work. As with most proxy systems, the traffic cost would be the primary cost of operating it.
16:51:49 <shelikhoo> there is another paper about reducing the cost of traffic with source ip address forging
16:51:50 <dcf1> Yes, they say in Section 11.1 "egress fees dominate"
16:52:19 <dcf1> btw this is the spotproxy source code
16:52:20 <dcf1> https://github.com/spotproxy-project/spotproxy
16:52:27 <dcf1> and this is their hacked snowflake
16:52:30 <dcf1> https://github.com/unknown-cstdio/iceball
16:52:47 <meskio> I wonder how much the 'rendezvous in the channel' is really needed, as in, is it that much slower to do a re-rendezvous?
16:53:05 <meskio> if we didn't need the re-rendezvous, the relocator would not be needed for the snowflake case
16:53:17 <shelikhoo> personally I think spot vm would be quite useful when it comes to getting more unrestricted proxies when there is a spike in usage
16:53:18 <meskio> but yes, I agree I'm not sure the cost is that much reduced
16:53:38 <meskio> they only talk about AWS and the likes that charge you per network transfer
16:53:43 <dcf1> meskio: for me it's a question of potential detectability. hitting the rendezvous channel as often as we do in snowflake is probably a bit unusual.
16:53:51 <meskio> it is usually cheaper to host bridges/proxies in unmetered VMs
16:54:18 <shelikhoo> yes, unmetered VMs sometimes have worse network performance with China
16:54:19 <meskio> dcf1: I see, that makes sense
16:54:30 <shelikhoo> as they are not paying the paid peering cost with Chinese ISPs
16:54:38 <dcf1> shelikhoo: yeah, even without the cost optimization stuff, and the active migration stuff, it would be easy to run snowflake proxies on spot VMs that are 100% compatible with the existing deployment
16:55:25 <shelikhoo> I am unsure if we are allowed to operate snowflake proxies in this way, but I do think we could prepare a script to do so
16:55:54 <dcf1> we have the idea of multiplexing over multiple snowflake simultaneously -- even without advance notice of a snowflake proxy going away, we could do something like the "active migration" to minimize the number of rendezvouses and the handoff time between proxies
16:55:58 <dcf1> like:
16:56:07 <dcf1> 1. do a normal rendezvous, acquire snowflake proxy A
16:56:20 <dcf1> 2. using the connection already established with proxy A, request another proxy B
16:56:47 <dcf1> 3. use just proxy A (or use A and B at the same time, whatever), and as soon as either of A or B disconnects,
16:57:01 <dcf1> request another proxy C through the still existing WebRTC channel.
16:57:22 <dcf1> The only time you would need to do another full rendezvous would be if both of your current proxies happen to disconnect at the exact same time.
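(A rough Go sketch of the scheme dcf1 just outlined; the Proxy interface and its methods are hypothetical, not existing snowflake code. The point is only the control flow: a full broker rendezvous is needed again only when both proxies are lost at once.)

```go
package sketch

// Proxy is a hypothetical handle to one snowflake proxy connection.
type Proxy interface {
	// RequestAnotherProxy asks for a fresh proxy over this proxy's
	// already-established channel (the in-band "rendezvous").
	RequestAnotherProxy() (Proxy, error)
	// Done is closed when this proxy disconnects.
	Done() <-chan struct{}
}

// maintainPair holds on to two live proxies. It starts from one obtained
// via a normal broker rendezvous and thereafter replaces whichever proxy
// disconnects by asking the surviving one, so a full rendezvous is only
// needed again if both proxies fail at the same time.
func maintainPair(rendezvous func() (Proxy, error)) error {
	a, err := rendezvous() // 1. normal rendezvous, acquire proxy A
	if err != nil {
		return err
	}
	b, err := a.RequestAnotherProxy() // 2. acquire proxy B through A
	if err != nil {
		return err
	}
	for {
		// 3. as soon as either proxy disconnects, request a replacement
		// through the channel that is still up.
		select {
		case <-a.Done():
			a, err = b.RequestAnotherProxy()
		case <-b.Done():
			b, err = a.RequestAnotherProxy()
		}
		if err != nil {
			// Both channels are gone: only now do a full rendezvous.
			if a, err = rendezvous(); err != nil {
				return err
			}
			if b, err = a.RequestAnotherProxy(); err != nil {
				return err
			}
		}
	}
}
```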
16:57:54 <cohosh> i like this idea
16:57:58 <meskio> +1
16:58:04 <shelikhoo> yes! nice!
16:58:04 <dcf1> There actually was such a feature in flash proxy, I believe: hold onto 2 proxies at a time, use just one of them, switch to the other when that one disconnects
16:59:06 <meskio> spotproxy could be an interesting tool to have around ready if censors start blocking unrestricted proxies, as those might be easier to list and more stable in IPs
16:59:29 <meskio> I mean modified to provide proxies to the current network
16:59:35 <dcf1> yes
16:59:59 <shelikhoo> we could have an issue about creating a script to install and run snowflake-proxy in a fully automated way
17:00:20 <shelikhoo> which can be supplied with cloud-init or an automated ssh connection
17:01:50 <meskio> yes, it would be interesting to evaluate if we can reuse things from spotproxy, the cost arbitrage could be useful
17:02:10 <meskio> and the respawning of proxies when they get removed
17:02:21 <meskio> so maybe we don't want to rewrite it from scratch
17:02:30 <meskio> (I haven't checked their code, so not sure)
17:02:59 <dcf1> right. the other nice advantage (for us) of spot VMs is you naturally get changing IP addresses.
17:03:52 <dcf1> Oh, here's their reference [6] on graceful shutdown of spot VMs on Google Cloud. "Spot VMs terminate 30 seconds after receiving a termination notice."
17:03:55 <dcf1> https://cloud.google.com/kubernetes-engine/docs/concepts/spot-vms#termination-graceful-shutdown
17:03:56 <theodorsm> Regarding holding 2 proxies, do we have any stats for the average number of available proxies per day? I can only remember seeing the usual average users per day. Do we have an idea of how many available proxies there usually are per user?
17:04:00 <meskio> yes, in the paper they mention actively rotating them, but I'm not sure this is really needed as censors tend to be slow at updating their block lists
17:04:09 <dcf1> meskio: I tend to agree.
17:04:37 <shelikhoo> the fastest a censor from China has blocked a website is about 15 minutes
17:04:38 <cohosh> theodorsm: we do! we have some static metrics on this and some prometheus metrics, which are mostly useful if you scrape the endpoint consistently
17:04:58 <cohosh> https://snowflake-broker.torproject.net/metrics
17:05:12 <cohosh> https://snowflake-broker.torproject.net/prometheus
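(For anyone following along, a tiny Go sketch of scraping that prometheus endpoint on a fixed interval, since the data is most useful when collected consistently; in practice you would point an actual Prometheus server at it, and the 5-minute interval and plain print-out here are arbitrary choices for illustration.)

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Poll the broker's prometheus endpoint periodically.
	for range time.Tick(5 * time.Minute) {
		resp, err := http.Get("https://snowflake-broker.torproject.net/prometheus")
		if err != nil {
			fmt.Println("scrape failed:", err)
			continue
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Printf("%s: scraped %d bytes of metrics\n",
			time.Now().Format(time.RFC3339), len(body))
	}
}
```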
17:05:12 <theodorsm> Cool, thanks!
17:05:34 <dcf1> theodorsm: do you mean like this, this is number of unique proxy IP addresses per day, which doesn't quite translate to "proxies per client"
17:05:37 <dcf1> https://www.bamsoftware.com/papers/snowflake/#proxies
17:05:37 <cohosh> i wonder if this kind of system would go well with the push notification work
17:05:58 <cohosh> which is a unidirectional channel that can send updates to clients
17:06:15 <shelikhoo> cohosh: the client should already have a snowflake connection
17:06:25 <shelikhoo> so a push update channel is kind of secondary here
17:06:35 <shelikhoo> I think
17:06:43 <dcf1> cohosh: what information is being pushed? A new proxy identity?
17:06:46 <cohosh> shelikhoo: i'm thinking more about as an extension to irl's work on rotating obfs4 bridge IPs
17:06:55 <dcf1> aha
17:06:57 <shelikhoo> oh, that is true!
17:07:02 <cohosh> not related to snowflake
17:07:03 <shelikhoo> cohosh you are right
17:07:17 <cohosh> the main problem with that was how to get the updated bridges to people
17:07:31 <cohosh> and this would be even more agile, but push notifications might make it possible
17:07:53 <dcf1> that's a good idea
17:07:57 <onyinyang> I was reading the paper with my lox hat on since the paper cites Lox as a solution for limiting bridge enumeration
17:07:57 <onyinyang> but Spotproxy as is, wouldn't really work with Lox directly.
17:07:57 <onyinyang> I thought it might be an interesting system to explore building an anonymous reputation through proxy usage though
17:09:10 <theodorsm> dcf1: not specifically, more whether there are any stats that could indicate if using more proxies per user would be feasible. "Are there enough proxies?" kinda
17:09:15 <irl> Installing and running snowflake proxy on cloud VMs is already something I was planning to add to dynamic bridges
17:09:28 <cohosh> oh hey irl! it's been a while :)
17:09:46 <irl> I am in the having ideas phase of writing a concept note for funding to do dynamic bridges dev because it’s been a while and things need some updates
17:10:09 <irl> Would be happy to sync about this so we all go in the same direction
17:10:10 <cohosh> awesome, you might really like this paper we're discussing then
17:10:42 <shelikhoo> yes... onyinyang I think there will need to be some updates, I mean the point of lox is to prevent bridges from being blocked, but with dynamic proxies we won't worry about bridges being blocked that much anymore
17:10:52 <onyinyang> yeah exactly
17:11:46 <cohosh> well i think it's interesting to think about this in the context of lox too, because a fresh set of bridge IPs to replace blocked bridges is important
17:11:56 <onyinyang> I'm not sure if trying to build reputation makes sense, but maybe it's something like: you only get new proxies if you've actually used old ones
17:12:01 <cohosh> my understanding is that lox slows down the rate that they're blocked, but doesn't prevent it
17:12:27 <cohosh> and maybe the rate at which bridges exit and leave the network is good enough, but maybe not with open registrations
17:12:48 <onyinyang> this is something we considered with lox, but we didn't want to focus on building reputation through bridge operators because that would be a big ask for bridge operators to support
17:12:49 <meskio> having dynamic bridges as level 0 bridges in lox might be useful
17:12:53 <onyinyang> but here that is not a constraint
17:12:54 <meskio> as those will be the ones more blocked
17:13:15 <meskio> and might be ok for them to be slower
17:14:43 <onyinyang> anyway, this is pretty orthogonal to what is more immediately possible with snowflake-like applications
17:15:09 <meskio> yep
17:15:11 <onyinyang> and we're also about ~15 min overtime, so maybe we should end the discussion here with plans to follow up on some of these ideas!
17:15:17 <cohosh> having the inside knowledge too that the IP address changed could really speed up replacement in lox, since we already have a notion of that, and it's difficult to know why a bridge is unreachable (went away vs was blocked)
17:15:40 <onyinyang> 100%
17:16:14 <shelikhoo> eof from me
17:16:36 <onyinyang> thanks everyone for the great discussion today!
17:16:37 <onyinyang> #endmeeting