16:00:12 <cohosh> #startmeeting tor anti-censorship meeting
16:00:12 <MeetBot> Meeting started Thu May 28 16:00:12 2026 UTC. The chair is cohosh. Information about MeetBot at https://wiki.debian.org/MeetBot.
16:00:12 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:20 <cohosh> hello
16:00:23 <meskio[mds]> hello
16:00:35 <cohosh> here is our meeting pad: https://pad.riseup.net/p/r.9574e996bb9c0266213d38b91b56c469
16:00:41 <onyinyang[mds]> hihi
16:00:47 <cohosh> let us know if you'd like a link to the editable pad
16:01:15 <gaba[mds]> hi
16:01:48 <cohosh> first we have an announcement
16:01:57 <Shelikhoo[mds]> hi~hi~
16:02:01 <cohosh> from meskio[mds] i think?
16:02:05 <meskio[mds]> yes
16:02:18 <meskio[mds]> I put it as an announcement, as I don't think we need to discuss it
16:02:36 <meskio[mds]> anyway, I've been cleaning up and evaluating the list of signaling channels we have collected in the wiki
16:02:48 <meskio[mds]> https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Signaling-Channels/channels
16:02:53 <meskio[mds]> and I wrote some conclusions:
16:03:05 <meskio[mds]> https://gitlab.torproject.org/tpo/anti-censorship/team/-/work_items/192#note_3418348
16:03:16 <meskio[mds]> feel free to comment in the issue if you have comments on them
16:03:31 <cohosh> thanks!
16:03:40 <cohosh> next up is a discussion
16:03:57 <cohosh> oops i missed another announcement
16:03:59 <Shelikhoo[mds]> The next announcement is from me: here is a written report on the TLS fingerprint diversification/imitation system
16:03:59 <Shelikhoo[mds]> feedback is more than welcome
16:04:03 <Shelikhoo[mds]> sorry....
16:04:10 <cohosh> thanks
16:04:20 <Shelikhoo[mds]> I typed it too late, it was my mistake
16:04:25 <cohosh> ok now onto the discussion
16:04:44 <cohosh> Guardian project asked me if we'd be willing to allow some new proxy type names for orbot
16:04:45 <Shelikhoo[mds]> yes
16:04:48 <cohosh> to show up in our metrics
16:05:08 <cohosh> so far, orbot proxies have reported the "iptproxy" proxy type
16:05:37 <cohosh> but now that more applications are using iptproxy and that library has exposed a means to change it, they are interested in setting more specific proxy type names
16:06:08 <cohosh> they proposed having not just a single "orbot" proxy name, but "orbot-ios" and "orbot-android"
16:07:10 <cohosh> i don't have a strong opinion, these metrics are useful to us for distinguishing different implementations, because we can track down bugs or configuration updates
16:07:50 <cohosh> the main thing i'd worry about is the storage requirements on prometheus metrics, but we could truncate these type strings to "orbot" for those
16:07:53 <Shelikhoo[mds]> I didn't think of a reason not to add these two names as well. I am happy with adding these 2 news
16:08:21 <Shelikhoo[mds]> I am happy with adding these 2 new names
16:08:42 <meskio[mds]> yes, the prometheus scalability if we keep adding names is worrisome
16:08:49 <cohosh> well they wanted potentially:
16:08:52 <cohosh> android/386
16:08:52 <cohosh> android/amd64
16:08:52 <cohosh> android/arm
16:08:52 <cohosh> android/arm64
16:08:54 <cohosh> darwin/amd64
16:08:57 <cohosh> darwin/arm64
16:08:58 <meskio[mds]> two more names should be fine as we don't have so many, but I wonder if we should have two separated ones
16:08:59 <cohosh> ios/amd64
16:09:02 <cohosh> ios/arm64
16:09:05 <cohosh> which seems a bit excessive to me
16:09:19 <cohosh> we could ask them to stick with android/darwin/ios
16:09:36 <Shelikhoo[mds]> oh, that is a little too much...
16:09:51 <dcf1> haha we're not running a telemetry service
16:10:07 <Shelikhoo[mds]> considering for each name, we are storing maybe 10 new items
16:10:11 <meskio[mds]> we could report to prometheus only orbot and keep in the collector metrics the full list....
16:10:23 <Shelikhoo[mds]> and not just a single counter for that specific type
16:10:46 <meskio[mds]> because we do produce collector metrics with the proxies, don't we?
16:10:56 <cohosh> yeah, i think we should allow new identifiers in our metrics to the extent they are useful to us for debugging
16:11:08 <meskio[mds]> I mean, if those metrics will be useful for guardian project maybe it's good to collect them...
16:11:28 <Shelikhoo[mds]> or we can support the name prefix thing I suggested earlier, and add maybe 2-3 new prefixes
16:11:38 <cohosh> their motivation is to see trends that would help them direct their outreach efforts
16:12:24 <meskio[mds]> we had discussed the idea of supporting a prefix in the past, maybe it's time to do it
16:12:25 <Shelikhoo[mds]> so they can use the long detailed name in their codebase
16:12:39 <dcf1> so they want to count the proportion of os/arch among users who have turned on kindness mode?
16:13:01 <cohosh> dcf1: yes
16:13:03 <Shelikhoo[mds]> and we can, if proven necessary, add more prefixes or names to count in a more detailed way
16:13:45 <Shelikhoo[mds]> or have varying levels of detail for different metrics
16:13:58 <Shelikhoo[mds]> like have only a count for longer names
16:14:10 <dcf1> I mean, do they have to do that counting via tor metrics descriptors? If it's fine-grained user counting, maybe they should have their users opt in to that and do it themselves?
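The idea raised above (report only "orbot" to prometheus while keeping the full name in the collector metrics) could be sketched roughly as follows. This is a hypothetical illustration, not the broker's actual code: `COARSE_TYPES`, `coarse_type`, and `record_poll` are invented names, and the allowlist contents and the "unknown" fallback are assumptions.

```python
from collections import Counter

# Hypothetical allowlist of coarse proxy types; not taken from the broker source.
COARSE_TYPES = ("orbot", "iptproxy", "standalone", "webext", "badge")


def coarse_type(proxy_type):
    """Collapse a detailed name such as "orbot-ios" to its allowlisted prefix."""
    for prefix in COARSE_TYPES:
        if proxy_type == prefix or proxy_type.startswith(prefix + "-"):
            return prefix
    # Assumption: anything outside the allowlist shares one bucket.
    return "unknown"


def record_poll(proxy_type, collector_counts, prometheus_counts):
    """Count the full name for CollecTor and only the prefix for Prometheus."""
    collector_counts[proxy_type] += 1
    prometheus_counts[coarse_type(proxy_type)] += 1


collector_counts = Counter()
prometheus_counts = Counter()
for name in ("orbot-ios", "orbot-android", "orbot-android", "webext"):
    record_poll(name, collector_counts, prometheus_counts)
```

With this split, the number of Prometheus series grows only with the allowlist, while the per-platform detail stays available in the descriptor output.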
16:14:23 <Shelikhoo[mds]> and aggregated prefix for detailed match/nat info
16:14:47 <cohosh> yeah i think asking them to do their own telemetry is a reasonable thing to do
16:15:00 <dcf1> like, in that case why not add system light/dark theme to the id string
16:16:01 <dcf1> I don't mean to shut it down out of hand. But we should reflect on what purpose the proxy type distinctions are actually meant to serve, that will help guide our decisions.
16:16:33 <Shelikhoo[mds]> I do think adding a consent step will make data less useful, as different users may have different tendencies for telemetry
16:16:41 <cohosh> yep, for us the proxy types have been useful to track down bugs and feature rollouts
16:16:46 <dcf1> Like, we used it to make graphs in the research paper, we're using it for whatever we're using it for now, what do we need it to do, and is gp's proposed change compatible with that?
16:17:08 <cohosh> we don't distinguish between chrome and firefox webext installs for example, but that might actually be useful for debugging purposes
16:17:17 <cohosh> because of the different webrtc library implementations
16:17:27 <cohosh> and requirements for the installation of webextensions
16:17:50 <cohosh> so i could see a world in which distinguishing between platforms on a high level (mac vs ios vs android) would be useful
16:18:18 <cohosh> there's also differences in how users update the app on different app stores
16:18:35 <cohosh> and we've used these metrics to help debug rollouts to changes in proxy configurations in the past
16:19:20 <cohosh> given this, i'd be okay with something like orbot-[android|ios|mac] in CollecTor metrics and just the orbot prefix in prometheus
16:21:26 * cohosh tracks down issue where these metrics came in handy for debugging
16:21:42 <Shelikhoo[mds]> yeah, I think we can maybe add the prefix support as requested, and initially use a few prefixes
16:22:18 <Shelikhoo[mds]> and if necessary, add longer and narrower prefixes
16:22:32 <cohosh> https://gitlab.torproject.org/tpo/anti-censorship/team/-/work_items/142
16:22:44 <dcf1> as a question of design, there's the question of whether to break it into separate fields, rather than have the N*M*...*Q*R combinatorial explosion of field1-field2-field3
16:23:06 <cohosh> here's one where we lost a bunch of snowflake proxies ^
16:24:53 <dcf1> otherwise it becomes a poorly specified string embedding of structured data inside of what is already structured data (the tor descriptor)
16:25:27 <dcf1> or I guess you *do* want fine-grained counts of every possible field1-field2-field3 combination, not just a count of field1, a count of field2, etc.
16:25:34 <cohosh> dcf1: can you elaborate on what that would look like? at the moment all of our metrics are (-) separated fields
16:26:06 <cohosh> would this be an overhaul of metrics output to make it more structured?
16:26:27 <dcf1> ok so first let me say the way we do the snowflake descriptors has always struck me as wrong
16:26:40 <dcf1> snowflake-ips-iptproxy 37682
16:26:40 <dcf1> snowflake-ips-standalone 4888
16:26:40 <dcf1> snowflake-ips-webext 88772
16:26:40 <dcf1> snowflake-ips-badge 10219
16:26:40 <dcf1> snowflake-ips-total 139706
16:26:58 <dcf1> I think it's obvious that this should be more like how snowflake-ips works
16:27:03 <dcf1> snowflake-ips DE=51628,US=16865,IN=10444
16:27:07 <dcf1> I.e.
16:27:43 <dcf1> snowflake-ips-type iptproxy=37682,standalone=4888,webext=88772,badge=10219
16:28:11 <dcf1> Why? because we also have separate metrics which are not supposed to mix with the above ones:
16:28:22 <dcf1> snowflake-ips-nat-restricted 111526
16:28:22 <dcf1> snowflake-ips-nat-unrestricted 18181
16:28:22 <dcf1> snowflake-ips-nat-unknown 52827
16:28:33 <dcf1> This should obviously be
16:28:51 <dcf1> snowflake-ips-nat restricted=111526,unrestricted=18181,unknown=52827
16:29:49 <dcf1> Why? Because how is a parser supposed to know that "nat-restricted" is not just another proxy type, like "iptproxy", "standalone", etc.? There's no way to tell, that information has to be hardcoded in the parser.
16:30:39 <cohosh> ok, i am in support of this change
16:30:51 <dcf1> There's similar confusion with client-denied-count, client-unrestricted-denied-count / client-ampcache-count, client-sqs-count, etc. Parsers just have to know which counts belong to different pools.
16:30:57 <cohosh> time for a @type snowflake-stats 2.0 maybe
16:31:00 <dcf1> It may be too late to change it now.
16:31:03 <dcf1> I don't know.
16:31:57 <dcf1> But if suddenly there are dozens of variations on snowflake-ips-orbot-ios-amd64, snowflake-ips-orbot-android-amd64, on to N*M*..., all that information has to be encoded in parsers.
16:32:22 <meskio[mds]> I think this change is a great idea, but we'll need to coordinate with the metrics team so they update their side before we do it
16:32:26 <dcf1> Or, like, institute heuristics like "if it starts with 'nat-', then it belongs to the nat pool of statistics"
16:33:11 <dcf1> What I was suggesting earlier, of splitting into separate fields, I think doesn't work. Sorry, I hadn't fully thought it out.
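The keyed layout dcf1 proposes above can be read by one generic routine, with no per-keyword knowledge of which pool a count belongs to. Below is a minimal sketch of that idea. The `snowflake-ips-type` and `snowflake-ips-nat` keywords are the ones proposed in this discussion, not lines that exist in published descriptors, and `parse_keyed_line` is a hypothetical helper name.

```python
def parse_keyed_line(line):
    """Split "keyword k1=v1,k2=v2" into (keyword, {k1: v1, k2: v2})."""
    keyword, _, rest = line.strip().partition(" ")
    counts = {}
    for item in rest.split(","):
        if not item:
            continue  # tolerate a keyword that has no entries
        key, _, value = item.partition("=")
        counts[key] = int(value)
    return keyword, counts


# The example lines quoted in the discussion above.
keyword, by_type = parse_keyed_line(
    "snowflake-ips-type iptproxy=37682,standalone=4888,webext=88772,badge=10219"
)
_, by_nat = parse_keyed_line(
    "snowflake-ips-nat restricted=111526,unrestricted=18181,unknown=52827"
)
```

A new proxy type then shows up as a new key under an existing keyword, so a parser written this way would not need updating when one is added.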
16:33:37 <cohosh> ok no worries
16:34:02 <cohosh> i wouldn't mind an overhaul of snowflake metrics output if it's possible but we can leave that for a separate discussion
16:34:06 <dcf1> What I was thinking is, if the proposal is that each proxy should report some string like "orbot-ios-amd64", then maybe it would be better to report that as different fields, like "type=orbot","os=ios","arch=amd64"
16:34:41 <cohosh> got it, this is more like what prometheus allows
16:34:43 <cohosh> with labels
16:34:55 <cohosh> but we have a state space explosion problem there
16:34:59 <dcf1> It still makes sense for individual proxy reporting to split it into structured fields instead of having a poorly specified "stringly typed" thing, but for aggregate statistics counting I guess you would want to smoosh them all back together anyway.
16:35:22 <dcf1> I'll show you how the parsing code works in snowflake-graphs
16:36:07 <dcf1> https://gitlab.torproject.org/dcf/snowflake-graphs/-/blob/799d931084b428fa4612dc1e947a9432b9ac35a8/snowflake-stats#L146
16:36:35 <dcf1> Basically the parser needs an exhaustive whitelist of every descriptor field that might be observed, because it has to know, for each one, which statistics it affects.
16:37:13 <dcf1> E.g. "snowflake-ips-badge"? That alters desc.snowflake_type_ips. "snowflake-ips-nat-restricted"? That alters desc.snowflake_nat_type_ips.
16:37:30 <dcf1> Which is a consequence of the "bag of words" way in which snowflake-stats descriptors are currently structured.
16:37:36 <cohosh> yeah i have worked with that code a little bit and it can be painful
16:37:43 <dcf1> Sorry for the rant.
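The trade-off described above (structured fields per proxy report, but joined back together for aggregate counting) might look something like this. It is only a sketch under assumptions: the field names `type`, `os`, and `arch` come from the example in the discussion, while `FIELD_ORDER` and `tally` are hypothetical and do not correspond to any existing broker or snowflake-graphs code.

```python
from collections import Counter

# Hypothetical field names, taken from the "type=orbot","os=ios","arch=amd64" example.
FIELD_ORDER = ("type", "os", "arch")


def tally(reports):
    """Return per-field counts and counts of each joined field combination."""
    per_field = {field: Counter() for field in FIELD_ORDER}
    combined = Counter()
    for report in reports:
        present = []
        for field in FIELD_ORDER:
            if field in report:
                per_field[field][report[field]] += 1
                present.append(report[field])
        # Joining the fields recovers the "orbot-ios-arm64" style string.
        combined["-".join(present)] += 1
    return per_field, combined


reports = [
    {"type": "orbot", "os": "ios", "arch": "arm64"},
    {"type": "orbot", "os": "android", "arch": "arm64"},
    {"type": "webext"},
]
per_field, combined = tally(reports)
```

The per-field counters stay small, while the combined counter is where the N*M*... growth appears, which is the state space concern raised for prometheus labels.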
16:37:48 <cohosh> no problem at all
16:38:36 <cohosh> i see we have two orthogonal problems: the (lack of) conventions we're currently using for metrics data, and which fields to allow
16:39:46 <cohosh> because no matter what we have a problem when we allow an arbitrary number of nested fields and identifiers, for prometheus at least, we have some constraints
16:40:03 <cohosh> i can make an issue to discuss a better metrics structure, that could be a longer term project
16:40:17 <dcf1> Currently I'm not able to generate new snowflake stats CSVs because of https://gitlab.torproject.org/tpo/network-health/metrics/collector-rs/-/issues/48, but I expect that I will have to add new field support for the other new proxy type we added.
16:41:19 <cohosh> the bloco type, yeah
16:42:06 <cohosh> ok so the two things we could tell orbot are: we're willing to add just 'orbot' as a new proxy type, to distinguish from general 'iptproxy' use
16:42:09 <cohosh> or
16:42:32 <cohosh> we tell them we'll allow some small number of specific strings like orbot-[ios|mac|android]
16:42:42 <cohosh> or we tell them we're blocked on a metrics overhaul
16:43:07 <cohosh> sorry that's 3 different options heh
16:43:35 <dcf1> I think option 1, add 'orbot', is not objectionable. It should probably have been that from the start.
16:43:48 <dcf1> And maybe option 2 is ok, too.
16:44:06 <meskio[mds]> I don't have strong opinions here
16:44:10 <dcf1> Otherwise, if it's potentially dozens of new distinct types, yes, we probably want to change our metrics format.
16:44:13 <Shelikhoo[mds]> In general I think maybe adding 3 new names is fine...
16:44:37 <cohosh> ok
16:44:38 <Shelikhoo[mds]> I don't think we want to block them on metrics overhaul as... it might take a while
16:44:51 <cohosh> thanks for the perspective on this
16:47:31 <cohosh> i'll tell them orbot is ok for now
16:48:05 <meskio[mds]> +1
16:48:23 <cohosh> and we can add more specifics later if the tradeoffs are worth it
16:48:31 <cohosh> ok let's end with interesting links
16:48:37 <cohosh> https://github.com/net4people/bbs/issues/603#issuecomment-4529373091
16:49:20 <theodorsm> I added it, rumors about relay traffic in russia, might block all p2p traffic not going through approved TURN servers
16:49:51 <Shelikhoo[mds]> oh no..
16:50:14 <cohosh> thanks for sharing it
16:50:53 <Shelikhoo[mds]> thanks for sharing this!
16:51:12 <cohosh> ok, anything else before we end the meeting for today?
16:52:01 <Shelikhoo[mds]> EOF from me
16:52:12 <meskio[mds]> nothing from me
16:52:26 <cohosh> #endmeeting