16:00:00 #startmeeting network-health 09/06/2021
16:00:00 Meeting started Mon Sep 6 16:00:00 2021 UTC. The chair is GeKo. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:00 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:04 hello everyone!
16:00:14 let's see how many folks are around today
16:00:25 i think we won't have a gaba nor a dgoulet today
16:00:30 and gus is out, too
16:00:43 but i guess hiro is around :)
16:00:47 pad is at http://kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/tor-nethealthteam-2021.1-keep#L65
16:01:02 or better
16:01:04 kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/tor-nethealthteam-2021.1-keep
16:01:09 o/
16:01:20 hihi
16:05:35 okay, let's go
16:05:55 hiro: do you have anything we should chat about?
16:06:20 uhm I'd like to close that support document about metricsport if that's possible and deploy those website changes
16:06:30 right
16:06:31 but I guess we can take more time thinking about it
16:06:41 i meant to ping dgoulet tomorrow
16:06:45 we don't want to send the wrong information to people
16:06:54 figuring out whether he can give us some sentences
16:07:06 irl was also mentioning that it was a good topic for a blog post
16:07:21 hrm
16:07:37 fine with me
16:07:52 i guess we should at least write a mail to tor-relays@, though
16:08:07 once we merged all the related changes and things are live
16:08:19 i am not sure whether we have the time for a blog post
16:08:23 yes I was thinking tor-project but also tor-relays makes sense I wasn't sure about the blog post
16:08:36 but i am not opposed to it
16:08:37 hehe yes same
16:08:40 :)
16:09:23 but, anyway, we should aim for this week getting all the work done here and merged
16:09:32 sounds good
16:09:40 there is no need to drag this longer
16:10:14 I have also started outlining okrs. I created a milestone in the network health team space. I hope that's ok.
16:11:00 if that helps you working on that part, fine with me :)
16:11:15 I wasn't sure if that should have gone in the metrics space instead
16:11:18 but I guess it can be moved
16:11:50 it's good having it at the network-health level
16:12:59 hiro: do you have any opinion on https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40021 ?
16:13:15 i'd just move it to base and would trust irl here
16:13:28 in that nothing breaks in the other projects
16:14:09 yeah I think that is ok
16:14:19 great
16:14:34 i'll write a patch then and put it on your review plate later this week
16:14:48 sounds good I'll check if we set that somewhere in some other metrics project
16:14:54 k
16:15:15 the final thing i had was the collector issue
16:15:22 how do we proceed here?
16:15:26 yes
16:15:39 just waiting until rob stops flooding and seeing whether that's been the problem?
16:15:50 so I spent some time during the past few days thinking of how we could improve logging of the checker
16:16:04 but even so, it can be used to just confirm what we can read in the logs of the downloader
16:16:16 it's taking more time to download server descriptor and extra-info descriptor
16:16:26 right
16:16:28 when does rob's experiment finish?
16:16:55 the current iteration is supposed to stop on 09/08
16:16:59 this wed
16:17:15 then a week for the advertized bw to get back to "normal" levels"
16:17:19 *levels
16:17:25 then one week off
16:17:30 and then we'd start again
16:17:38 with two weeks flooding
16:18:04 uhm
16:18:34 so I was reading through https://research.torproject.org/techreports/modern-collector-2018-12-19.pdf where irl and karsten did put together issues with the current collection model that we have
16:18:57 and being i/o intensive is one of them. the main metrics services rely a lot on disk and network
16:19:45 and what we see is in line with an excessive load imo
16:20:00 what does "Missing too many referenced descriptors" actually mean? do we really lose them?
16:20:21 in the sense that they aren't collected and don't show up in our arhived data?
16:20:22 it means they couldn't be downloaded for some reason
16:20:31 and collector will queue and try again later
16:20:33 *archived
16:20:38 aha, okay
16:20:50 so it's not as bad as actually losing them?
16:22:15 I am not sure if at the end of the data data on those relays is recovered somehow or we just lost it
16:22:33 I guess we should go over the archived data to know and see if we are missing
16:22:35 anything
16:22:47 s/data/day
16:23:30 i guess that would help me at least understanding how severe the problem is
16:24:09 ok then this is what I am going to do tomorrow. I guess it makes sense to know for the next experiment
16:24:16 even if it is just the experiment
16:24:47 yep, thanks
16:25:08 sounds like a good next step
16:25:32 and then we can wait this week and check whether things get better once the experiment is off
16:26:57 okay, that's all i had
16:27:04 do you have anything else?
16:27:09 nope I am fine
16:27:14 great!
16:27:21 thanks and ttyl o/
16:27:25 #endmeeting