16:00:20 <onyinyang> #startmeeting tor anti-censorship meeting
16:00:21 <MeetBot> Meeting started Thu Apr  3 16:00:20 2025 UTC.  The chair is onyinyang. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:21 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:21 <onyinyang> hello everyone!
16:00:21 <onyinyang> here is our meeting pad: https://pad.riseup.net/p/r.9574e996bb9c0266213d38b91b56c469
16:00:26 <meskio> hello
16:00:27 <shelikhoo> hi~hi~
16:01:08 <cohosh> hi
16:04:26 <onyinyang> ok, let's get started
16:04:46 <onyinyang> the first discussion point is:  Give @renovate-bot guest access to anti-censorship group for it to use dependency proxy src shell
16:06:01 <meskio> we discussed it last week
16:06:05 <meskio> shelikhoo: did you enable it?
16:06:10 <onyinyang> ahh ok, sorry
16:06:10 <meskio> if so, it looks like it is not that noisy
16:06:28 <shelikhoo> it was done
16:06:30 <meskio> I was worried that this will trigger renovate to start sending MRs in all our repos
16:06:31 <shelikhoo> and from last week
16:06:42 <meskio> nice, then I guess we are fine with it
16:07:11 <onyinyang> ok cool, let's move on to the next point then
16:07:14 <shelikhoo> yes
16:07:27 <onyinyang> That's mine: Should we move the amp library in snowflake to ptutil?
16:07:50 <onyinyang> We discussed this at some point but I don't remember if we came to a conclusion
16:08:25 <onyinyang> we are enabling registration through amp cache for conjure though and it is currently pointing to the snowflake amp library
16:08:26 <meskio> I recall talking about it in private and there were some concerns about bringing too many dependencies into ptutil
16:08:32 <onyinyang> which is fine, but a bit strange
16:08:41 <meskio> and whether it was worth it to create its own library for signaling channels
16:08:49 <onyinyang> yes, I think this was the discussion
16:09:55 <onyinyang> I think it would be nice to have a signalling channel library ofc :)
16:10:19 <meskio> sounds good to me
16:10:33 <shelikhoo> yeah. I agree with having a signaling library
16:10:37 <shelikhoo> that would be nice
16:11:26 <onyinyang> ok, so we will leave it in snowflake for now
16:11:46 <onyinyang> and put "build a signalling library with all the things" on our roadmap
16:11:52 <onyinyang> I guess XD
16:12:37 <meskio> I think it is fine to start with ampcache
16:12:51 <meskio> and create a simple signaling library with that and grow it as needed
16:13:05 <onyinyang> yeah, that makes sense
16:14:18 <shelikhoo> yes....
16:14:44 <onyinyang> ok. Is there anything else to discuss today before the reading group discussion?
16:15:12 <shelikhoo> nothing from me
16:16:37 <onyinyang> ok, let's move on to the reading group then
16:16:56 <dcf1> I invited some people to the reading group discussion, I don't know if they are here.
16:17:00 <onyinyang> Today we're discussing Differential Degradation Vulnerabilities in Censorship Circumvention Systems https://arxiv.org/abs/2409.06247
16:17:05 <dcf1> The Protozoa authors, and the authors of this paper.
16:17:15 <onyinyang> oh nice!
16:17:28 <shelikhoo> nice!
16:17:31 <dcf1> Zhen replied to say they couldn't make it to the discussion, but I will send a transcript afterward.
16:17:40 <dcf1> I didn't hear from Vitaly.
16:18:00 <meskio> :)
16:18:32 <meskio> does someone want to start with a summary?
16:18:46 <dcf1> If no one has one prepared, I will type one quick.
16:18:57 <meskio> that will be nice, thanks
16:19:26 <dcf1> This paper presents a class of attacks called "differential degradation" attacks and demonstrates them against Snowflake and Protozoa.
16:20:27 <dcf1> The key idea is that even if a circumvention protocol is "behaviorally independent" in the sense of "Learning to Behave", there are attacks that can degrade the performance of the tunneled traffic,
16:21:15 <dcf1> even though its external behavior may remain the same as a normal instance of the cover protocol, and (especially) even when a normal instance of the cover protocol is less affected by the degradation (this is the "differential" part).
16:22:11 <dcf1> They make a claim: "detection is not necessary for blocking". This means that it may be possible to attack all instances of some protocol equally, without trying to distinguish which are covert and which are not,
16:22:43 <dcf1> and the natural channel properties will cause the covert tunnels to suffer more (even if they remain externally indistinguishable; i.e., behaviorally independent).
16:23:14 <dcf1> The attack on Snowflake, however (and IMO), doesn't really fit into the "differential degradation" class.
16:23:52 <dcf1> They just block WebRTC data channels -- but to their credit they show that blocking WebRTC data channels doesn't affect the usability of a selection of other WebRTC applications.
16:24:22 <dcf1> The attack on Protozoa involves parsing SRTP packets to identify video frame boundaries, and selectively dropping video frames.
16:24:48 <dcf1> Under this attack, normal WebRTC video keeps working ok, while the Protozoa tunnel slows a lot.
16:25:12 <dcf1> They make another strong claim (Section 7.1) that *no* existing cover application can satisfy all three of:
16:25:17 <dcf1> 1. secure against differential degradation
16:25:20 <dcf1> 2. high-capacity
16:25:24 <dcf1> 3. doesn't create new protocol channels
16:26:00 <dcf1> They say that Snowflake and Protozoa satisfy 2 and 3 but not 1; Balboa satisfies 1 and 2
16:26:27 <dcf1> They create a new protocol which they call Ciliate, which satisfies 1 and 2 but not 3.
16:26:45 <dcf1> Ciliate is like Protozoa but it works over TLS, specifically RTMPS.
16:27:09 <dcf1> The reason it doesn't satisfy 3 is that you need to create two video streams, one in each direction, which is abnormal for RTMPS.
16:27:21 <dcf1> End summary
16:28:06 <meskio> I think the concept of blocking without detecting is very interesting
16:28:09 <cohosh> thanks! i thought this paper was interesting, though i'm skeptical of some of the claims
16:28:12 <dcf1> I have a few particular discussion points but anyone go ahead.
16:28:18 <meskio> we kind of already see this in china with the UDP packet drops
16:28:24 <meskio> that make snowflake unusable
16:28:33 <meskio> but probably they are not targeting snowflake
16:28:36 <shelikhoo> I think for snowflake we need to admit that snowflake is more like web-torrent traffic rather than web-conferencing traffic
16:29:01 <cohosh> yes true, i'm so curious about how disproportionate the effects of packet loss are on snowflake vs other webrtc datachannels
16:29:13 <meskio> good point, I wonder why they didn't look into web-torrent, maybe it is not common enough
16:29:30 <dcf1> Yeah. On my first reading, I misunderstood the point. I thought the paper was claiming that a protocol can be behaviorally independent with regard to a passive adversary, but not behaviorally independent against an active adversary.
16:30:12 <dcf1> But it's actually more sophisticated and interesting than that: they're saying that *even if* the protocol remains behaviorally independent under active attack, the traffic in the tunnel could be unequally affected in a way that makes it unusable.
16:30:55 <cohosh> that was one of the claims i'm not sure about, have we seen blocking of just webrtc data channels?
16:31:05 <dcf1> They did identify a couple of applications that *did* break when data channels were blocked, but they were relatively obscure.
16:31:28 <shelikhoow> I think there was some analysis on how udp based traffic reacts to packet loss, like comparing hysteria2 and quic
16:31:36 <dcf1> Section 5.4: Snapdrop, Sharedrop, FilePizza.
16:32:02 <meskio> AFAIK PeerTube uses web-torrent, and is not very obscure
16:32:03 <cohosh> if we haven't, it doesn't necessarily mean that censors are unwilling to block it, but it does make me skeptical of how easy it is and how trivial the tradeoff is for censors
16:32:19 <shelikhoow> but they are not in this paper
16:32:47 <shelikhoow> but https://www.petsymposium.org/foci/2025/foci-2025-0001.php
16:32:55 <dcf1> cohosh: Yes, to me this was a big point that I would have liked to see engaged with more, the lack of blocking of WebRTC data channels in the real world.
16:33:16 <cohosh> i'm curious about their methodology for finding data channel applications
16:33:27 <dcf1> They say (Section 5.2) "This attack is thus available even to very weak real-world censors". If that's so, why don't they do it?
16:33:29 <cohosh> this seems like it would be a good use case for those university network tap papers we've seen
16:33:44 <dcf1> Or the bigger following question: What do the censors know that we don't?
16:34:58 <meskio> it can also be that censors didn't think data channels were so easy to block
16:34:59 <cohosh> meskio: ah good point, i hadn't realized peertube was webtorrent
16:35:10 <dcf1> And this point stands out especially, because they emphasize a mismatch between research and practice, saying (and citing "Grounding circumvention in empiricism") that censors prefer single-packet and single-flow attacks, not sophisticated multi-flow or traffic analysis attacks
16:35:35 <dcf1> in order to underscore the practicality of differential degradation attacks.
16:36:14 <dcf1> To highlight a mismatch of research and practice, but then to posit an apparently trivial attack that nevertheless hasn't been observed in practice (to our knowledge), stands out.
16:36:50 <dcf1> meskio: Yes, that is another possibility, that censors haven't invested the time or knowledge to know that it is in fact as easy to perform as is claimed.
16:36:54 <shelikhoow> personally I don't fully agree with the point that censors do not conduct "sophisticated multi-flow or traffic analysis attacks" at all, it is more of a bias introduced by the limited ability of researchers to discover such censor behaviour
16:38:02 <shelikhoow> like the slow-reacting ip,port blocking system that is active in China but, to the best of my knowledge, is not documented by any well-known paper
16:38:38 <shelikhoow> since the blocker observes the traffic for a long time, and only then blocks the ip address, it is difficult to run a controlled experiment
16:39:13 <shelikhoow> even if it is widely reported by users to exist to some degree
16:39:52 <cohosh> yeah we also never narrowed down the cause of the temporary snowflake block in russia either: https://github.com/net4people/bbs/issues/422
16:40:03 <dcf1> Regardless of censors' real preferences and capabilities, the fact that this paper has implementations of the attacks is a point in its favor.
16:40:55 <shelikhoow> as for data channels, yes it is true that our usage of webrtc does not blend perfectly into web-conferencing traffic
16:40:56 <dcf1> The Protozoa attack presumably has a fair amount of source code, which I'm not sure is online, but the Snowflake one just takes some netfilter rules (Figure 6)
16:40:58 <cohosh> yeah i was so impressed by the implementation work here
16:42:25 <meskio> yes, it is impressive how simple it is
16:42:46 <meskio> I think the paper is a nice read
16:43:10 <cohosh> yeah i agree, i enjoyed reading this
16:43:17 <shelikhoow> yes!
16:43:20 <dcf1> In any case, I'm not too worried about data channel attacks in Snowflake, as, though it's not trivial to change, conceptually snowflake can run over media streams like TorKameleon.
16:43:26 <dcf1> Or https://archive.torproject.org/websites/lists.torproject.org/pipermail/anti-censorship-team/2023-February/000284.html
16:43:43 <cohosh> for the protozoa attack, i would be curious to see how just naively applying existing throttling techniques would affect the usability of circumvention tools
16:43:47 <onyinyang> Yes, I also thought it was nice to read. I'm glad I'm in this reading group so I can hear more informed/critical takes than I came up with on my own XD
16:43:49 <cohosh> like the throttling russia has done
16:44:18 <dcf1> cohosh: Yes, and it was really interesting that they say straightforward whole-packet dropping does *not* work against Protozoa.
16:44:46 <dcf1> 6.2: "The naive approach ... is to introduce high packet losses. It does not work against Protozoa."
16:44:47 <cohosh> yeah that surprised me
16:45:01 <dcf1> Which is why they had to resort to parsing the SRTP and finding video frames.
16:46:07 <dcf1> It would also have been interesting to see whether there are simple countermeasures that Protozoa could deploy to defend against the attack. I.e., is it really fundamental, or did they just find a current weak point?
16:46:41 <dcf1> I'm a bit skeptical of the "can't have all three properties" claim, it doesn't seem well motivated with a survey or similar.
16:47:15 <dcf1> If it were possible to modify Protozoa to resist this particular attack, will there always be another attack waiting, or would that make Protozoa satisfy all 3 requirements and disprove the claim?
16:48:18 <shelikhoow> dcf1: I think for Protozoa to resist this attack, it must have packet loss compensation, which will react to packet loss differently from a video stream
16:48:21 <cohosh> agreed, i can't think of a fundamental reason why this tradeoff would hold
16:48:26 <shelikhoow> let's say hysteria2
16:48:34 <dcf1> The most thought-provoking claim of the paper, to me, is "detection is not necessary for blocking". Of all of it, this is the point I would like to see more developed.
16:49:07 <meskio> agree
16:49:12 <dcf1> shelikhoow: no, protozoa uses traffic replacement, its external behavior will still be indistinguishable. the attacks in this paper are not to induce different external behavior, it's to make the channel inside unusable (if I understand it right).
16:49:39 <dcf1> The "detection is not necessary for blocking" makes me think of throttling, which you can view through that lens.
16:49:59 <shelikhoow> yes... so the sending rate is deterministic...
16:50:09 <dcf1> Throttling affects everything equally, and it is only because of the nature of different types of applications that some remain more or less usable under throttling, and some become unusable.
16:50:34 <dcf1> Throttling may simply be a crude and yet somewhat effective type of differential degradation attack.
16:50:59 <dcf1> Or, you could view things like the Protozoa frame dropping as being a more refined version of throttling.
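A toy calculation of the throttling analogy above, in Python. Everything here is an illustrative assumption and not taken from the paper: the function names `cover_quality` and `tunnel_goodput`, the loss rate, and the model itself, in which a cover application can use every frame that arrives while a tunnel needs all frames of a message to arrive and otherwise resends the whole message. It is not a model of Protozoa or Snowflake (the paper reports that naive packet loss does not work against Protozoa); it only sketches how an impairment applied equally to every flow can hurt one kind of payload far more than another.

```python
def cover_quality(loss: float) -> float:
    """Fraction of frames a loss-tolerant cover application still receives.

    Assumption: every frame is dropped independently with probability `loss`,
    and the application simply plays whatever arrives.
    """
    return 1.0 - loss


def tunnel_goodput(loss: float, frames_per_message: int) -> float:
    """Fraction of sent frames that end up carrying useful tunnel data.

    Assumption: a tunneled message spans `frames_per_message` frames and is
    only usable if all of them arrive; otherwise the whole message is resent.
    One attempt succeeds with probability (1 - loss) ** frames_per_message,
    so the expected number of attempts is the reciprocal of that value and
    the useful fraction of the channel equals the success probability.
    """
    return (1.0 - loss) ** frames_per_message


if __name__ == "__main__":
    loss = 0.10  # hypothetical uniform drop rate applied to every flow
    print(f"cover application keeps {cover_quality(loss):.0%} of its frames")
    for k in (1, 5, 20):
        print(f"tunnel, {k:2d} frames per message: "
              f"{tunnel_goodput(loss, k):.0%} goodput")
```

Under these assumptions, a 10% drop rate leaves the cover application with 90% of its frames, while a tunnel whose messages span 20 frames is left with roughly 12% goodput, even though nothing distinguished the two flows.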
16:51:33 <dcf1> But: then I got really confused when, in other places, they cite attacks that, to me, are clear examples of detection, and they claim those attacks do not count as detection.
16:51:54 <dcf1> Section 3.3: "For the recent blocking of fully encrypted channels, GFW uses a even cheaper technique: it blocks all traffic that fails a single-flow entropy test (with a few exemption rules). This effectively blocks VMess, obfs4, and Shadowsocks with few false positives [50]. This attack is a pure blocking attack that does not require detection, and a real-world example of a channel mismatch."
16:52:24 <dcf1> And Section 7.2, citing the same GFW fully encrypted traffic attack: "Recent studies [50] show that detection-free disruption has already been adopted by real-world censors."
16:52:41 <dcf1> I was like, huh? How does entropy measurement not count as detection?
16:52:50 <shelikhoow> maybe "detection" means identifying the exact type of traffic
16:52:51 <cohosh> i wonder to what extent russia's traffic policing differs from their naive packet drop method
16:52:57 <shelikhoow> like VMess or Shadowsocks
16:53:25 <shelikhoow> and entropy is a general characteristic of this kind of traffic
16:53:36 <dcf1> shelikhoow: yes, that is the generous reading, I think. They're saying, okay, within the class of high-entropy protocols, we don't care about what *kind* of fully encrypted protocol it is.
16:53:46 <shelikhoow> yes....
16:54:01 <dcf1> But to me, saying that kind of thing doesn't count as "detection" is a stretch, and undercuts what I thought was a more interesting point.
16:54:48 <dcf1> It's still a classifier, and like all classifiers makes some number of Type I and Type II errors.
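To make the classifier point concrete, here is a minimal Python sketch of a single-flow entropy test. It is not the GFW's actual rule; the Shannon-entropy measure, the `THRESHOLD` value, and the function names are assumptions chosen only for illustration.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical cutoff, not a value taken from any real censor.
THRESHOLD = 7.0


def looks_fully_encrypted(payload: bytes) -> bool:
    """Binary decision on one payload: flag it if it looks random."""
    return shannon_entropy(payload) > THRESHOLD


if __name__ == "__main__":
    samples = {
        "plaintext HTTP": b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
        "long random payload": os.urandom(4096),
        "short random payload": os.urandom(16),
    }
    for name, payload in samples.items():
        print(f"{name}: {shannon_entropy(payload):.2f} bits/byte, "
              f"flagged={looks_fully_encrypted(payload)}")
```

The short random payload is never flagged, because n bytes cannot show more than log2(n) bits per byte of empirical entropy, so the test misses it; in the other direction, any sufficiently long high-entropy payload can be flagged whether or not it belongs to a circumvention protocol. Both kinds of error come from the threshold decision, which is what makes this a detector.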
16:55:22 <dcf1> One last point I had. This paper is at pains to distinguish itself from "Cover Your ACKs" (https://censorbib.nymity.ch/#Geddes2013a), but I'm not so sure.
16:56:18 <dcf1> Section 1 says "unlike Cover Your ACKs attacks", but then Section 8 says "Cover Your ACKs ... is an early example of differential degradation"
16:56:43 <meskio> I was confused by their solution, replacing webRTC with TLS brings its own set of problems, like you now need a valid certificate (I guess it is kind of free to block self-signed ones), and the certificate needs to be different on each connection to avoid fingerprinting, ...
16:56:48 <dcf1> Cover Your ACKs has a sentence which is how I would characterize this paper: "Even with perfect emulation of the cover channel, these systems can be vulnerable to attacks that detect or disrupt the covert communications while having no effect on legitimate cover traffic."
16:57:14 <dcf1> And Section 4 in Cover Your ACKs is "Differential Error Tolerance" which is practically the title of this paper.
16:57:54 <dcf1> There's no real harm in that, though, as Cover Your ACKs is old now and it's good to see the thinking applied to contemporary proposals.
16:58:16 <dcf1> onyinyang: I am interested in hearing the takes you came up with on your own :)
16:58:32 <onyinyang> I mostly just believed them XD
16:58:43 <onyinyang> rookie mistake
16:58:44 <onyinyang> lol
16:59:54 <cohosh> well it is good and interesting work :) i was glad to see this paper
16:59:56 <dcf1> :)
17:00:15 <dcf1> Yes, there's nothing really wrong about it, I think, it's worth reading and thinking about.
17:00:29 <meskio> +1
17:00:38 <onyinyang> I noted the point that collateral damage and tradeoffs for a censor in these kinds of attacks may not be as straightforward as one thinks
17:01:23 <cohosh> yeah that's a good thought
17:01:24 <dcf1> Section 3.3 introduces a mini taxonomy of attack types that I don't think I've seen in exactly this way before.
17:01:26 <onyinyang> So to me it was an additional consideration that should be applied when developing circumvention tools
17:01:37 <dcf1> "constant-packet", "single-flow", "multi-flow"
17:04:47 <meskio> I think it is a useful taxonomy, but still there are some "easy" multi-flow things that we see GFW doing while it is not doing more complicated single-flow ones
17:05:18 <cohosh> what are the easy multi-flow things?
17:05:19 <meskio> so it is hard to categorize censors just by that taxonomy, but maybe one extra to cross over
17:05:57 <meskio> cohosh: shelikhoow mentioned before how GFW does block some endpoints after time, I assume doing multi-flow analysis
17:06:24 <meskio> while there are tons of papers about doing machine learning on single-flow ones and we don't see censors doing that
17:06:52 <shelikhoo_> It was multi-flow but I wouldn't say it is easy...
17:07:06 <dcf1> The "graph-represented behaviors" from the last reading group (https://github.com/net4people/bbs/issues/455), even though it's not documenting real observed behavior, could also be considered a kind of statistical multi-flow analysis.
17:07:06 <shelikhoo_> since we have yet to find out what it is actually doing
17:07:11 <meskio> fair, is not easy
17:07:24 <cohosh> do we know for sure it is a multi-flow attack?
17:07:36 <cohosh> i guess we suspect because of the delay
17:07:42 <dcf1> And some that I don't have handy about netflow and traffic sampling.
17:09:31 <dcf1> And the "host-based analysis" of "On Precisely Detecting" (https://github.com/net4people/bbs/issues/312).
17:09:40 <dcf1> (Also not documenting real observed behavior.)
17:10:39 <cohosh> i hope we see more analysis of these differential degradation attacks. i'd love to see some analysis of packet loss on snowflake data channels for one
17:11:02 <cohosh> any ideas of where else to look?
17:11:19 <dcf1> Yeah. Or any applications under general throttling. (As I think the differential degradation can be generalized to include throttling.)
17:12:15 <shelikhoow_> yes, I would love to see a paper about detecting whether traffic is web torrenting/file sharing or snowflake
17:12:42 <meskio> maybe things to add to https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Research-ideas?edit=true
17:13:09 <cohosh> if there's ever a QUIC based transport this is probably something to keep in mind
17:13:25 <dcf1> cohosh: yes, that's a good point about snowflake data channels. It makes me wonder if these authors also tried packet dropping of DTLS before doing the full blocking tests. I will ask when I send the transcript.
17:16:14 <meskio> I guess the silence means that we are done with the discussion
17:16:23 <onyinyang> I was just going to say that, yes
17:16:27 <onyinyang> I guess we can finish for today
17:16:29 <dcf1> Thanks Zhen and Vitaly for the paper.
17:16:33 <dcf1> Thanks cohosh for suggesting it.
17:16:34 <onyinyang> Thanks for the great discussion everyone
17:16:40 <shelikhoow_> yes, thanks for the amazing research
17:16:43 <onyinyang> and yes! Thanks to the authors of the paper!
17:16:46 <meskio> nice paper
17:17:01 <onyinyang> #endmeeting