16:00:20 <onyinyang> #startmeeting tor anti-censorship meeting
16:00:21 <MeetBot> Meeting started Thu Apr 3 16:00:20 2025 UTC. The chair is onyinyang. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:21 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:21 <onyinyang> hello everyone!
16:00:21 <onyinyang> here is our meeting pad: https://pad.riseup.net/p/r.9574e996bb9c0266213d38b91b56c469
16:00:26 <meskio> hello
16:00:27 <shelikhoo> hi~hi~
16:01:08 <cohosh> hi
16:04:26 <onyinyang> ok, let's get started
16:04:46 <onyinyang> the first discussion point is: Give @renovate-bot guest access to anti-censorship group for it to use dependency proxy src shell
16:06:01 <meskio> we discussed it last week
16:06:05 <meskio> shelikhoo: did you enable it?
16:06:10 <onyinyang> ahh ok, sorry
16:06:10 <meskio> if so it looks like is not that noisy
16:06:28 <shelikhoo> it was done
16:06:30 <meskio> I was worried that this will trigger renovate to start sending MRs in all our repos
16:06:31 <shelikhoo> and from last week
16:06:42 <meskio> nice, then I guess we are fine with it
16:07:11 <onyinyang> ok cool, let's move on to the next point then
16:07:14 <shelikhoo> yes
16:07:27 <onyinyang> That's mine: Should we move the amp library in snowflake to ptutil?
16:07:50 <onyinyang> We discussed this at some point but I don't remember if we came to a conclusion
16:08:25 <onyinyang> we are enabling registration through amp cache for conjure though and it is currently pointing to the snowflake amp library
16:08:26 <meskio> I recall talking about it in private and there were some concerns on bringing too many dependencies on ptutil
16:08:32 <onyinyang> which is fine, but a bit strange
16:08:41 <meskio> and if it was worth it to create its own library for signaling channels
16:08:49 <onyinyang> yes, I think this was the discussion
16:09:55 <onyinyang> I think it would be nice to have a signalling channel library ofc :)
16:10:19 <meskio> sounds good to me
16:10:33 <shelikhoo> yeah. I agree with having a signaling library
16:10:37 <shelikhoo> that would be nice
16:11:26 <onyinyang> ok, so we will leave it in snowflake for now
16:11:46 <onyinyang> and put "build a signalling library with all the things" on our roadmap
16:11:52 <onyinyang> I guess XD
16:12:37 <meskio> I think is fine to start with ampcache
16:12:51 <meskio> and create a simple signaling library with that and grow it as needed
16:13:05 <onyinyang> yeah, that makes sense
16:14:18 <shelikhoo> yes....
16:14:44 <onyinyang> ok. Is there anything else to discuss today before the reading group discussion?
16:15:12 <shelikhoo> nothing from me
16:16:37 <onyinyang> ok, let's move on to the reading group then
16:16:56 <dcf1> I invited some people to the reading group discussion, I don't know if they are here.
16:17:00 <onyinyang> Today we're discussing Differential Degradation Vulnerabilities in Censorship Circumvention Systems https://arxiv.org/abs/2409.06247
16:17:05 <dcf1> The Protozoa authors, and the authors of this paper.
16:17:15 <onyinyang> oh nice!
16:17:28 <shelikhoo> nice!
16:17:31 <dcf1> Zhen replied to say they couldn't make it to the discussion, but I will send a transcript afterward.
16:17:40 <dcf1> I didn't hear from Vitaly.
16:18:00 <meskio> :)
16:18:32 <meskio> do someone wants to start with a summary?
16:18:46 <dcf1> If no one has one prepared, I will type one quick.
16:18:57 <meskio> that will be nice, thanks
16:19:26 <dcf1> This paper presents a class of attacks called "differential degradation" attacks and demonstrates them against Snowflake and Protozoa.
16:20:27 <dcf1> The key idea is that even if a circumvention protocol is "behaviorally independent" in the sense of "Learning to Behave", there are attacks that can degrade the performance of the tunneled traffic,
16:21:15 <dcf1> even though its external behavior may remain the same as a normal instance of the cover protocol, and (especially) even when a normal instance of the cover protocol is less affected by the degradation (this is the "differential" part).
16:22:11 <dcf1> They make a claim: "detection is not necessary for blocking". This means that it may be possible to attack all instances of some protocol equally, without trying to distinguish which are covert and which are not,
16:22:43 <dcf1> and the natural channel properties will cause the covert tunnels to suffer more (even if they remain externally indistinguishable; i.e., behaviorally independent).
16:23:14 <dcf1> The attack on Snowflake, however (and IMO), doesn't really fit into the "differential degradation" class.
16:23:52 <dcf1> They just block WebRTC data channels -- but to their credit they show that blocking WebRTC data channels doesn't affect the usability of a selection of other WebRTC applications.
16:24:22 <dcf1> The attack on Protozoa involves parsing SRTP packets to identify video frame boundaries, and selectively dropping video frames.
16:24:48 <dcf1> Under this attack, normal WebRTC video keeps working ok, while the Protozoa tunnel slows a lot.
16:25:12 <dcf1> They make another strong claim (Section 7.1) that *no* existing cover application can satisfy all three of:
16:25:17 <dcf1> 1. secure against differential degradation
16:25:20 <dcf1> 2. high-capacity
16:25:24 <dcf1> 3. doesn't create new protocol channels
16:26:00 <dcf1> They say that Snowflake and Protozoa satisfy 2 and 3 but not 1; Balboa satisfies 1 and 2
16:26:27 <dcf1> They create a new protocol which they call Ciliate, which satisfies 1 and 2 but not 3.
16:26:45 <dcf1> Ciliate is like Protozoa but it works over TLS, specifically RTMPS.
16:27:09 <dcf1> The reason it doesn't satisfy 3 is that you need to create two video streams, one in each direction, which is abnormal for RTMPS.
16:27:21 <dcf1> End summary
16:28:06 <meskio> I think is very interesting the concept of blocking without detecting
16:28:09 <cohosh> thanks! i thought this paper was interesting, though i'm skeptical of some of the claims
16:28:12 <dcf1> I have a few particular discussion points but anyone go ahead.
16:28:18 <meskio> we kind of already see this in china with the UDP packet drops
16:28:24 <meskio> that make snowflake unusable
16:28:33 <meskio> but probably they are not targeting snowflake
16:28:36 <shelikhoo> I think for snowflake we need to admit that snowflake is more like web-torrent traffic rather than web-conferencing traffic
16:29:01 <cohosh> yes true, i'm so curious about how disproportionate the effects of packet loss are on snowflake vs other webrtc datachannels
16:29:13 <meskio> good point, I wonder why they didn't look into web-torrent, maybe is not common enough
16:29:30 <dcf1> Yeah. On my first reading, I misunderstood the point. I thought the paper was claiming that a protocol can be behaviorally independent with regard to a passive adversary, but not behaviorally independent against an active adversary.
16:30:12 <dcf1> But it's actually more sophisticated and interesting than that: they're saying that *even if* the protocol remains behaviorally independent under active attack, the traffic in the tunnel could be unequally affected in a way that makes it unusable.
16:30:55 <cohosh> that was one of the claims i'm not sure about, have we seen blocking of just webrtc data channels?
16:31:05 <dcf1> They did identify a couple of applications that *did* break when data channels were blocked, but they were relatively obscure.
16:31:28 <shelikhoow> I think there was some analysis on how udp based traffic reacts to packet loss, like comparing hysteria2 and quic
16:31:36 <dcf1> Section 5.4: Snapdrop, Sharedrop, FilePizza.
16:32:02 <meskio> AFAIK PeerTube uses web-torrent, and is not very obscure
16:32:03 <cohosh> if we haven't, it doesn't necessarily mean that censors are unwilling to block it, but it does make me skeptical of how easy it is and how trivial the tradeoff is for censors
16:32:19 <shelikhoow> but they are not in this paper
16:32:47 <shelikhoow> but https://www.petsymposium.org/foci/2025/foci-2025-0001.php
16:32:55 <dcf1> cohosh: Yes, to me this was a big point that I would have liked to see engaged with more, the lack of blocking of WebRTC data channels in the real world.
16:33:16 <cohosh> i'm curious about their methodology for finding data channel applications
16:33:27 <dcf1> They say (Section 5.2) "This attack is thus available even to very weak real-world censors". If that's so, why don't they do it?
16:33:29 <cohosh> this seems like it would be a good use case for those university network tap papers we've seen
16:33:44 <dcf1> Or the bigger following question: What do the censors know that we don't?
16:34:58 <meskio> it can also be that censors didn't think data channels were so easy to block
16:34:59 <cohosh> meskio: ah good point, i hadn't realized peertube was webtorrent
16:35:10 <dcf1> And this point stands out especially, because they emphasize a mismatch between research and practice, saying (and citing "Grounding circumvention in empiricism") that censors prefer single-packet and single-flow attacks, not sophisticated multi-flow or traffic analysis attacks
16:35:35 <dcf1> in order to underscore the practicality of differential degradation attacks.
16:36:14 <dcf1> To highlight a mismatch of research and practice, but then to posit an apparently trivial attack that nevertheless hasn't been observed in practice (to our knowledge), stands out.
16:36:50 <dcf1> meskio: Yes, that is another possibility, that censors haven't invested the time or knowledge to know that it is in fact as easy to perform as is claimed.
16:36:54 <shelikhoow> personally I don't fully agree with the point that censor does not conduct "sophisticated multi-flow or traffic analysis attacks" at all, it is more of a bias introduced by the limitations of researchers to discover such censor behaviour
16:38:02 <shelikhoow> like the slow reacting ip,port blocking system that is active in China but to the best of my knowledge is not documented by well known paper
16:38:38 <shelikhoow> since the block will observe the traffic for a long time, and then block ip address, it is difficult to run controlled experiment
16:39:13 <shelikhoow> even if it is widely reported by users to exist to some degree
16:39:52 <cohosh> yeah we also never narrowed down the cause of the temporary snowflake block in russia either: https://github.com/net4people/bbs/issues/422
16:40:03 <dcf1> Regardless of censors' real preferences and capabilities, the fact that this paper has implementations of the attacks is a point in its favor.
16:40:55 <shelikhoow> as for data channel, yes it is true that our usage of webrtc is not perfectly blending it into web-conferencing traffic
16:40:56 <dcf1> The Protozoa attack presumably has a fair amount of source code, which I'm not sure is online, but the Snowflake one just takes some netfilter rules (Figure 6)
16:40:58 <cohosh> yeah i was so impressed by the implementation work here
16:42:25 <meskio> yes is impressive how simple it is
16:42:46 <meskio> I think the paper is a nice read
16:43:10 <cohosh> yeah i agree, i enjoyed reading this
16:43:17 <shelikhoow> yes!
16:43:20 <dcf1> In any case, I'm not too worried about data channel attacks in Snowflake, as, though it's not trivial to change, conceptually snowflake can run over media streams like TorKameleon.
16:43:26 <dcf1> Or https://archive.torproject.org/websites/lists.torproject.org/pipermail/anti-censorship-team/2023-February/000284.html
16:43:43 <cohosh> for the protozoa attack, i would be curious to see how just naively applying existing throttling techniques would affect the usability of circumvention tools
16:43:47 <onyinyang> Yes, I also thought it was nice to read. I'm glad I'm in this reading group so I can hear more informed/critical takes than I came up with on my own XD
16:43:49 <cohosh> like the throttling russia has done
16:44:18 <dcf1> cohosh: Yes, and it was really interesting that they say straightforward whole-packet dropping does *not* work against Protozoa.
16:44:46 <dcf1> 6.2: "The naive approach ... is to introduce high packet losses. It does not work against Protozoa."
16:44:47 <cohosh> yeah that surprised me
16:45:01 <dcf1> Which is why they had to resort to parsing the SRTP and finding video frames.
16:46:07 <dcf1> It would also have been interesting to see whether there are simple countermeasures that Protozoa could deploy to defend against the attack. I.e., is it really fundamental, or did they just find a current weak point?
16:46:41 <dcf1> I'm a bit skeptical of the "can't have all three properties" claim, it doesn't seem well motivated with a survey or similar.
16:47:15 <dcf1> If it were possible to modify Protozoa to resist this particular attack, will there always be another attack waiting, or would that make Protozoa satisfy all 3 requirements and disprove the claim?
16:48:18 <shelikhoow> dcf1: I think for Protozoa to resist this attack, it must have packet loss compensation, which will have a different packet loss reaction compared to video stream
16:48:21 <cohosh> agreed, i can't think of a fundamental reason why this tradeoff would hold
16:48:26 <shelikhoow> let's say hysteria2
16:48:34 <dcf1> The most thought-provoking claim of the paper, to me, is "detection is not necessary for blocking". Of all of it, this is the point I would like to see more developed.
16:49:07 <meskio> agree
16:49:12 <dcf1> shelikhoow: no, protozoa uses traffic replacement, its external behavior will still be indistinguishable. the attacks in this paper are not to induce different external behavior, it's to make the channel inside unusable (if I understand it right).
16:49:39 <dcf1> The "detection is not necessary for blocking" makes me think of throttling, which you can view through that lens.
16:49:59 <shelikhoow> yes... so the sending rate is deterministic...
16:50:09 <dcf1> Throttling affects everything equally, and it is only because of the nature of different types of applications that some remain more or less usable under throttling, and some become unusable.
16:50:34 <dcf1> Throttling may simply be a crude and yet somewhat effective type of differential degradation attack.
16:50:59 <dcf1> Or, you could view things like the Protozoa frame dropping as being a more refined version of throttling.
16:51:33 <dcf1> But: then I got really confused when, in other places, they cite attacks that, to me, are clear examples of detection, and they claim those attacks do not count as detection.
16:51:54 <dcf1> Section 3.3: "For the recent blocking of fully encrypted channels, GFW uses a even cheaper technique: it blocks all traffic that fails a single-flow entropy test (with a few exemption rules). This effectively blocks VMess, obfs4, and Shadowsocks with few false positives [50]. This attack is a pure blocking attack that does not require detection, and a real-world example of a channel mismatch."
16:52:24 <dcf1> And Section 7.2, citing the same GFW fully encrypted traffic attack: "Recent studies [50] show that detection-free disruption has already been adopted by real-world censors."
16:52:41 <dcf1> I was like, huh? How does entropy measurement not count as detection?
16:52:50 <shelikhoow> maybe "detection" means identifying the exact type of traffic
16:52:51 <cohosh> i wonder to what extent russia's traffic policing differs from their naive packet drop method
16:52:57 <shelikhoow> like VMess or Shadowsocks
16:53:25 <shelikhoow> and entropy is a general characteristic of these traffics
16:53:36 <dcf1> shelikhoow: yes, that is the generous reading, I think. They're saying, okay, within the class of high-entropy protocols, we don't care about what *kind* of fully encrypted protocol it is.
16:53:46 <shelikhoow> yes....
16:54:01 <dcf1> But to me, saying that kind of thing doesn't count as "detection" is a stretch, and undercuts what I thought was a more interesting point.
16:54:48 <dcf1> It's still a classifier, and like all classifiers makes some number of Type I and Type II errors.
16:55:22 <dcf1> One last point I had. This paper is at pains to distinguish itself from "Cover Your ACKs" (https://censorbib.nymity.ch/#Geddes2013a), but I'm not so sure.
16:56:18 <dcf1> Section 1 says "unlike Cover Your ACKs attacks", but then Section 8 says "Cover Your ACKs ... is an early example of differential degradation"
16:56:43 <meskio> I was confused by their solution, replacing webRTC by TLS brings its own source of problems, like you now need a valid certificate (I guess is kind of free to block self signed ones), and the certificate needs to be different on each connection to avoid fingerprinting, ...
16:56:48 <dcf1> Cover Your ACKs has a sentence which is how I would characterize this paper: "Even with perfect emulation of the cover channel, these systems can be vulnerable to attacks that detect or disrupt the covert communications while having no effect on legitimate cover traffic."
16:57:14 <dcf1> And Section 4 in Cover Your ACKs is "Differential Error Tolerance" which is practically the title of this paper.
16:57:54 <dcf1> There's no real harm in that, though, as Cover Your ACKs is old now and it's good to see the thinking applied to contemporary proposals.
16:58:16 <dcf1> onyinyang: I am interested in hearing the takes you came up with on your own :)
16:58:32 <onyinyang> I mostly just believed them XD
16:58:43 <onyinyang> rookie mistake
16:58:44 <onyinyang> lol
16:59:54 <cohosh> well it is good and interesting work :) i was glad to see this paper
16:59:56 <dcf1> :)
17:00:15 <dcf1> Yes, there's nothing really wrong about it, I think, it's worth reading and thinking about.
17:00:29 <meskio> +1
17:00:38 <onyinyang> I thought the point that collateral damage and tradeoffs for a censor in these kinds of attacks may not be as straightforward as one thinks
17:01:23 <cohosh> yeah that's a good thought
17:01:24 <dcf1> Section 3.3 introduces a mini taxonomy of attack types that I don't think I've seen in exactly this way before.
17:01:26 <onyinyang> So to me it was an additional consideration that should be applied when developing circumvention tools
17:01:37 <dcf1> "constant-packet", "single-flow", "multi-flow"
17:04:47 <meskio> I think is a useful taxonomy, but still there are some "easy" multi-flow things that we see GFW doing while is not doing more complicated single-flow ones
17:05:18 <cohosh> what are the easy multi-flow things?
17:05:19 <meskio> so is hard to categorize censors just by that taxonomy, but maybe one extra to cross over
17:05:57 <meskio> cohosh: shelikhoow mentioned before how GFW does block some endpoints after time, I assume doing multi-flow analysis
17:06:24 <meskio> while there is tons of papers about doing machine learning on single-flow ones and we don't see censors doing that
17:06:52 <shelikhoo_> It was multi-flow but I wouldn't say it is easy...
17:07:06 <dcf1> The "graph-represented behaviors" from the last reading group (https://github.com/net4people/bbs/issues/455), even though it's not documenting real observed behavior, could also be considered a kind of statistical multi-flow analysis.
17:07:06 <shelikhoo_> since we have yet to find out what it is actually doing
17:07:11 <meskio> fair, is not easy
17:07:24 <cohosh> do we know for sure it is a multi-flow attack?
17:07:36 <cohosh> i guess we suspect because of the delay
17:07:42 <dcf1> And some that I don't have handy about netflow and traffic sampling.
17:09:31 <dcf1> And the "host-based analysis" of "On Precisely Detecting" (https://github.com/net4people/bbs/issues/312).
17:09:40 <dcf1> (Also not documenting real observed behavior.)
17:10:39 <cohosh> i hope we see more analysis of these differential degradation attacks. i'd love to see some analysis of packet loss on snowflake data channels for one
17:11:02 <cohosh> any ideas of where else to look?
17:11:19 <dcf1> Yeah. Or any applications under general throttling. (As I think the differential degradation can be generalized to include throttling.)
17:12:15 <shelikhoow_> yes, I would love to see a paper about detecting whether a traffic is web torrenting/file sharing or snowflake
17:12:42 <meskio> maybe things to add to https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Research-ideas?edit=true
17:13:09 <cohosh> if there's ever a QUIC based transport this is probably something to keep in mind
17:13:25 <dcf1> cohosh: yes, that's a good point about snowflake data channels. It makes me wonder if these authors also tried packet dropping of DTLS before doing the full blocking tests. I will ask when I send the transcript.
17:16:14 <meskio> I guess the silence means that we are done with the discussion
17:16:23 <onyinyang> I was just going to say that, yes
17:16:27 <onyinyang> I guess we can finish for today
17:16:29 <dcf1> Thanks Zhen and Vitaly for the paper.
17:16:33 <dcf1> Thanks cohosh for suggesting it.
17:16:34 <onyinyang> Thanks for the great discussion everyone
17:16:40 <shelikhoow_> yes, thanks for the amazing research
17:16:43 <onyinyang> and yes! Thanks to the authors of the paper!
17:16:46 <meskio> nice paper
17:17:01 <onyinyang> #endmeeting