13:03:00 <hiro> #startmeeting network-health 2026-01-19
13:03:00 <MeetBot> Meeting started Mon Jan 19 13:03:00 2026 UTC.  The chair is hiro. Information about MeetBot at https://wiki.debian.org/MeetBot.
13:03:00 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:03:11 <sarthikg[mds]> o/
13:03:21 <hiro> #link https://pad.riseup.net/p/tor-nethealthteam-2026-keep
13:03:24 <hiro> the new pad everyone!
13:04:07 <hiro> all right who wants to start with this week updates?
13:04:13 <GeKo> the pad is dead, long live the new pad!
13:04:18 <GeKo> i can go
13:04:24 <hiro> lol
13:04:37 <GeKo> i did  more work on p183 wrt the anomaly report writing
13:04:49 <GeKo> and created a project for the documentation
13:05:09 <GeKo> which i put in my private namespace for now as we need to think where we want to have the final work
13:05:36 <GeKo> we have the result on gitlab pages: https://tor-anomaly-detection-59beaa.pages.torproject.net/
13:05:43 <hiro> nice that was great GeKo (IRC)
13:05:45 <GeKo> i plan to add more stuff over the next weeks
13:05:50 <juga> nice
13:05:55 <GeKo> as i still have plenty of material
13:06:03 <GeKo> but for the sponsor part i think we are done
13:06:20 <hiro> I have been wondering if this could be a better format for our documentation that the current wiki
13:06:32 <juga> +1 :)
13:06:33 <GeKo> yeah, we could do that as well
13:06:34 <hiro> but this can be a wider discussion for an in person meetup maybe
13:07:19 <GeKo> i've started thinking about what data we need for the anomaly algorithms we want to imlplement
13:07:24 <GeKo> *implement
13:07:43 <GeKo> hiro: what do you and sarthikg[mds] plan for the user data on the new website?
13:08:05 <GeKo> right now we have some .csv data available for the graphs
13:08:14 <hiro> I am creating the user data computing logic in parser-rs
13:08:15 <GeKo> how is that supposed to look like with the website 2.0?
13:08:29 <hiro> so far so good, but I am not able to estimate properly snowflake users
13:08:45 <hiro> I spent all friday trying to understand why was that
13:08:59 <GeKo> i guess i need to look at that then
13:09:14 <GeKo> because that's what we basically need for what juga is working on
13:09:37 <GeKo> i was wondering whether we would have ideally some materialized table or view
13:09:47 <hiro> ok, for the tuning of the algorithm we can still use what we have in the csv from the current metrics.tpo
13:09:59 <GeKo> holding that data so we can easily query that for arbitrary timeframes
13:10:30 <hiro> yes so https://gitlab.torproject.org/tpo/network-health/metrics/datastore/-/blob/main/stats_tables.sql?ref_type=heads these would be the tables
13:10:40 <GeKo> yeah, but ideally we would query the db for that i think
13:11:02 <hiro> and should map what we have right now on the csv (I mean same columns)
13:11:23 <GeKo> okay, great we do have that available then already
13:11:38 <hiro> so the daily_relay_users has accurate numbers
13:11:52 <hiro> but not the daily_bridge_users
13:12:15 <GeKo> okay, interesting. i'll take a closer look at those numbers
13:12:28 <hiro> oh well, it does, but when there is a country or transport that is more predominant in one country I am doing a mistake it seems when estimating the interval windows
13:12:39 <GeKo> because i was concerned we calculate them in a way that they would differ from what we get on the website
13:13:06 <hiro> so like if you look at russia for example the snowflake estimation is completly wrong, but not the other transports
13:13:07 <juga> the input for the tool only needs date and number of users/clients
13:13:44 <GeKo> yep
13:13:59 <GeKo> okay, i'll update team#393 accordingly
13:13:59 <tor> Uhm, which one of [tpo/anti-censorship/team, tpo/applications/team, tpo/community/team, tpo/core/team, tpo/network-health/metrics/team, tpo/network-health/team, tpo/operations/team, tpo/team, tpo/tpa/team, tpo/ux/team, tpo/web/team] did you mean?
13:14:04 <hiro> so I am trying to figure out why
13:14:19 <GeKo> sorry, tor my friend: network-health/team#393
13:14:29 <GeKo> that's all from me
13:14:37 <hiro> thank you!
13:14:43 <juga> on mi side, i'm just focused now on the anomaly tool, so far getting different values in each function for r and python
13:15:01 <sarthikg[mds]> my update: was migrating the code to the new tables (json ingested with rust), but there were a bunch of changes in the tables, hence tracking down each change requires a whole rewrite of the structs representing the json. I am mostly done with that, and subsequently the getters to the right types.
13:15:07 <sarthikg[mds]> also, i did a small experiment with changing the aggregator to use clickhouse streams instead of batches. the performance is like 20x faster. both changes paired are almost done, and in local testing.
13:15:13 <sarthikg[mds]> lastly, I have tried to simplify the logic for calculating each field based on if the router is new or old. it would be a great help if everyone could review this, and suggest if i am missing any edge-cases. (descriptorParser doesn't handle stuff correctly at times)
13:15:13 <sarthikg[mds]> https://gitlab.torproject.org/tpo/network-health/metrics/aggregator-rs/-/blob/feat/error-handling/src/service/bridge/resolver/is_running.rs?ref_type=heads
13:15:58 <hiro> @juga check those "nan" values as we said, in my experience moving from R (or other mathematical programming environments) to some programming language, is where one finds issues
13:16:35 <juga> hiro: yeah, i'm checking that, but even not having nan as input or output, i'm getting different valuse...
13:16:52 <juga> probably the functions i use, i wonder if they're close enough, will continue investigating
13:17:13 <hiro> sarthikg: I think GeKo (IRC) would be the person to help you with that... and file the appropriate issues on descriptorParser side if that's the case too ;)
13:18:50 <sarthikg[mds]> hiro: sure, i'll check with GeKo (IRC)  once!
13:19:13 <GeKo> sounds good!
13:21:06 <hiro> ooook! Rohithh do you have an update? I do not want to pressure you but since you are around you might want to give yours too?
13:21:41 <Rohithh[mds]> on my side I'm still working on nsa bandwidth route will keep updating
13:22:43 <Rohithh[mds]> about the progress
13:22:53 <hiro> ok thank you!
13:23:23 <hiro> on my side I am working on the stats for relay and bridge users and solving a bunch of issues with current metrics.tpo that I haven't finished last week.
13:24:00 <hiro> alright does anyone have something else for this week meeting?
13:24:18 * juga is good
13:24:31 <GeKo> me too
13:24:50 * hiro is groot too
13:24:55 <sarthikg[mds]> me too
13:25:05 <hiro> ok if everyone is groot, I'll end the meeting
13:25:05 <hiro> #endmeeting