13:03:00 <hiro> #startmeeting network-health 2026-01-19
13:03:00 <MeetBot> Meeting started Mon Jan 19 13:03:00 2026 UTC. The chair is hiro. Information about MeetBot at https://wiki.debian.org/MeetBot.
13:03:00 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:03:11 <sarthikg[mds]> o/
13:03:21 <hiro> #link https://pad.riseup.net/p/tor-nethealthteam-2026-keep
13:03:24 <hiro> the new pad everyone!
13:04:07 <hiro> all right who wants to start with this week updates?
13:04:13 <GeKo> the pad is dead, long live the new pad!
13:04:18 <GeKo> i can go
13:04:24 <hiro> lol
13:04:37 <GeKo> i did more work on p183 wrt the anomaly report writing
13:04:49 <GeKo> and created a project for the documentation
13:05:09 <GeKo> which i put in my private namespace for now as we need to think where we want to have the final work
13:05:36 <GeKo> we have the result on gitlab pages: https://tor-anomaly-detection-59beaa.pages.torproject.net/
13:05:43 <hiro> nice that was great GeKo (IRC)
13:05:45 <GeKo> i plan to add more stuff over the next weeks
13:05:50 <juga> nice
13:05:55 <GeKo> as i still have plenty of material
13:06:03 <GeKo> but for the sponsor part i think we are done
13:06:20 <hiro> I have been wondering if this could be a better format for our documentation that the current wiki
13:06:32 <juga> +1 :)
13:06:33 <GeKo> yeah, we could do that as well
13:06:34 <hiro> but this can be a wider discussion for an in person meetup maybe
13:07:19 <GeKo> i've started thinking about what data we need for the anomaly algorithms we want to imlplement
13:07:24 <GeKo> *implement
13:07:43 <GeKo> hiro: what do you and sarthikg[mds] plan for the user data on the new website?
13:08:05 <GeKo> right now we have some .csv data available for the graphs
13:08:14 <hiro> I am creating the user data computing logic in parser-rs
13:08:15 <GeKo> how is that supposed to look like with the website 2.0?
13:08:29 <hiro> so far so good, but I am not able to estimate properly snowflake users
13:08:45 <hiro> I spent all friday trying to understand why was that
13:08:59 <GeKo> i guess i need to look at that then
13:09:14 <GeKo> because that's what we basically need for what juga is working on
13:09:37 <GeKo> i was wondering whether we would have ideally some materialized table or view
13:09:47 <hiro> ok, for the tuning of the algorithm we can still use what we have in the csv from the current metrics.tpo
13:09:59 <GeKo> holding that data so we can easily query that for arbitrary timeframes
13:10:30 <hiro> yes so https://gitlab.torproject.org/tpo/network-health/metrics/datastore/-/blob/main/stats_tables.sql?ref_type=heads these would be the tables
13:10:40 <GeKo> yeah, but ideally we would query the db for that i think
13:11:02 <hiro> and should map what we have right now on the csv (I mean same columns)
13:11:23 <GeKo> okay, great we do have that available then already
13:11:38 <hiro> so the daily_relay_users has accurate numbers
13:11:52 <hiro> but not the daily_bridge_users
13:12:15 <GeKo> okay, interesting. i'll take a closer look at those numbers
13:12:28 <hiro> oh well, it does, but when there is a country or transport that is more predominant in one country I am doing a mistake it seems when estimating the interval windows
13:12:39 <GeKo> because i was concerned we calculate them in a way that they would differ from what we get on the website
13:13:06 <hiro> so like if you look at russia for example the snowflake estimation is completly wrong, but not the other transports
13:13:07 <juga> the input for the tool only needs date and number of users/clients
13:13:44 <GeKo> yep
13:13:59 <GeKo> okay, i'll update team#393 accordingly
13:13:59 <tor> Uhm, which one of [tpo/anti-censorship/team, tpo/applications/team, tpo/community/team, tpo/core/team, tpo/network-health/metrics/team, tpo/network-health/team, tpo/operations/team, tpo/team, tpo/tpa/team, tpo/ux/team, tpo/web/team] did you mean?
13:14:04 <hiro> so I am trying to figure out why
13:14:19 <GeKo> sorry, tor my friend: network-health/team#393
13:14:29 <GeKo> that's all from me
13:14:37 <hiro> thank you!
13:14:43 <juga> on mi side, i'm just focused now on the anomaly tool, so far getting different values in each function for r and python
13:15:01 <sarthikg[mds]> my update: was migrating the code to the new tables (json ingested with rust), but there were a bunch of changes in the tables, hence tracking down each change requires a whole rewrite of the structs representing the json. I am mostly done with that, and subsequently the getters to the right types.
13:15:07 <sarthikg[mds]> also, i did a small experiment with changing the aggregator to use clickhouse streams instead of batches. the performance is like 20x faster. both changes paired are almost done, and in local testing.
13:15:13 <sarthikg[mds]> lastly, I have tried to simplify the logic for calculating each field based on if the router is new or old. it would be a great help if everyone could review this, and suggest if i am missing any edge-cases. (descriptorParser doesn't handle stuff correctly at times)
13:15:13 <sarthikg[mds]> https://gitlab.torproject.org/tpo/network-health/metrics/aggregator-rs/-/blob/feat/error-handling/src/service/bridge/resolver/is_running.rs?ref_type=heads
13:15:58 <hiro> @juga check those "nan" values as we said, in my experience moving from R (or other mathematical programming environments) to some programming language, is where one finds issues
13:16:35 <juga> hiro: yeah, i'm checking that, but even not having nan as input or output, i'm getting different valuse...
13:16:52 <juga> probably the functions i use, i wonder if they're close enough, will continue investigating
13:17:13 <hiro> sarthikg: I think GeKo (IRC) would be the person to help you with that... and file the appropriate issues on descriptorParser side if that's the case too ;)
13:18:50 <sarthikg[mds]> hiro: sure, i'll check with GeKo (IRC) once!
13:19:13 <GeKo> sounds good!
13:21:06 <hiro> ooook! Rohithh do you have an update? I do not want to pressure you but since you are around you might want to give yours too?
13:21:41 <Rohithh[mds]> on my side I'm still working on nsa bandwidth route will keep updating
13:22:43 <Rohithh[mds]> about the progress
13:22:53 <hiro> ok thank you!
13:23:23 <hiro> on my side I am working on the stats for relay and bridge users and solving a bunch of issues with current metrics.tpo that I haven't finished last week.
13:24:00 <hiro> alright does anyone have something else for this week meeting?
13:24:18 * juga is good
13:24:31 <GeKo> me too
13:24:50 * hiro is groot too
13:24:55 <sarthikg[mds]> me too
13:25:05 <hiro> ok if everyone is groot, I'll end the meeting
13:25:05 <hiro> #endmeeting