16:00:18 <GeKo> #startmeeting network-health
16:00:18 <MeetBot> Meeting started Mon Jun 7 16:00:18 2021 UTC. The chair is GeKo. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:18 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:53 <dgoulet> o/
16:00:54 <GeKo> alright, let's get started for the weekly meeting
16:00:59 <GeKo> https://pad.riseup.net/p/tor-netteam-2021.1-keep is the pad
16:01:07 <GeKo> ah, no
16:01:09 <ggus> hey!
16:01:12 <GeKo> let me get the right onw
16:01:14 <GeKo> *one
16:01:27 <irl[m]> hi
16:01:34 <GeKo> kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/tor-nethealthteam-2021.1-keep
16:02:31 <gaba> hi!
16:04:29 <GeKo> okay, let's get started
16:04:45 <GeKo> if someones needs to add things to the pad, please do while we are chatting here
16:05:19 <GeKo> irl[m]: so, for the metrics.tpo outages what are the next steps here?
16:05:40 <irl[m]> as yet, no idea, i'm still mostly working on collector outage
16:05:58 <GeKo> i've filed https://gitlab.torproject.org/tpo/metrics/team/-/issues/15
16:06:01 <gaba> irl[m]: what specifically you did to bring metrics.tpo back?
16:06:08 <irl[m]> turned it off and on again
16:06:17 <GeKo> and added an item on the status page to indicate we have issues
16:06:33 <irl[m]> i've not restarted it since the weekend, and it looks happy again now
16:06:44 <GeKo> i had issues this morning
16:07:00 <GeKo> showing a 502 proxy error on relay search
16:07:10 <irl[m]> that would imply it's a load-related problem then
16:07:20 <irl[m]> until i've got the prometheus stuff set up i have no visibility of any of this
16:07:23 <GeKo> so it seems theses problems are at least only intermittently happening
16:07:34 <GeKo> what is the ticket for that?
16:07:52 <GeKo> or do we need to create one?
16:08:54 <GeKo> irl[m]: ^
16:10:04 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40280 is the ticket that is blocking the prometheus exporter being set up for collector, then i was going to add collector into the prometheus, and then go from there
16:10:04 <irl[m]> it's a whole new thing, to replace the old metrics nagios that seems to have been turned off while i was gone and nothing replaced it
16:10:04 <irl[m]> i think the biggest problem here is that i only knew metrics was broken because someone told me
16:10:04 <irl[m]> not a single alert was triggered anywhere
16:10:16 <irl[m]> the second problem is that the logs are very noisy, because metrics-web does a lot more than it did when the logging was initially devised, so without monitoring you just have a mountain of logs
16:10:30 <irl[m]> you don't know where to look because you have no timestamp
16:10:37 <GeKo> i see
16:10:54 <irl[m]> i'll look to see if we made a ticket for the larger thing
16:11:01 <GeKo> thanks
16:11:16 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40216 is related
16:11:24 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40274 is related
16:11:31 <GeKo> right
16:11:34 <irl[m]> there isn't a "project" ticket as such that i can see
16:11:42 <GeKo> i remember the last one
16:12:04 <irl[m]> the ticket would probably be titled "Monitor Metrics services with Prometheus"
16:12:36 <irl[m]> anarcat has set up some git stuff to make it easier for me to directly write the prometheus configs and have them deployed
16:12:45 <GeKo> can we take some shortcuts here so that the issue potentially buggging metrics.tpo is caught first?
16:13:00 <GeKo> i am not sure what logging infra needs to get set up for that
16:13:14 <GeKo> as i don't really know all the pieces involved here
16:13:40 <irl[m]> yes, i need to refresh my knowledge of blackbox exporter and then write the config for that
16:13:47 <irl[m]> instead of matrix it will just send me emails, which is better than nothing
16:14:05 <GeKo> but if collector is e.g. not involved in the metrics.tpo outage we could postpone setting prometheus alerts up for that one
16:14:23 <GeKo> and start with a different part first
16:14:38 <irl[m]> right yes, that is the plan
16:14:48 <GeKo> great
16:14:52 <gaba> so the idea is to do this in nagios?
16:14:55 <gaba> and not prometheus
16:14:59 <irl[m]> no, prometheus
16:15:02 <gaba> ook
16:15:10 <irl[m]> it all used to be in nagios but i guess people didn't like nagios as it got turned off
16:15:44 <irl[m]> the prometheus is being used by anti-censorship too, so there's redundancy of knowledge
16:15:53 <GeKo> yeah
16:16:07 <GeKo> i don't know anything about why the nagios part got turned off
16:16:35 <GeKo> but we should not start with it again if we move to prometheus i guess
16:17:03 <GeKo> okay
16:17:11 <GeKo> that's anything i had for that item
16:17:19 <GeKo> the other is the roadmap
16:17:25 <GeKo> http://kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/IutVYvgMq9614nDk-KFm
16:17:29 <gaba> yes, not sure. We never talked about retiring nagios
16:17:46 <GeKo> i've cleaned it up and created tickets and we started triaging them
16:18:11 <GeKo> so for this week it would be useful if any of you could go over it and think about things that are missing
16:18:20 <GeKo> or even mis-categorized
16:18:26 <GeKo> arma2: mikeperry: ^
16:18:51 <GeKo> we tried to put things in Needed and Wanted etc. according to what we came up during the meeting
16:19:05 <GeKo> and by me thinking about it afterwards
16:19:10 <GeKo> but things are not set in stone
16:19:28 <GeKo> so, if there is anything we should fix here, let gaba or me know
16:19:45 <GeKo> ggus: should we deal with the remaining community items?
16:19:49 <ggus> GeKo: yes
16:20:10 <GeKo> so, i have the meetup in Needed
16:20:20 <GeKo> anything else we should put into that?
16:20:31 <GeKo> the otf fellow and operator census work?
16:20:48 <ggus> yes, the operator census work is needed.
16:20:59 <GeKo> to we have a ticket for that work?
16:21:14 <ggus> mmmh, let me check
16:21:21 <gaba> arma2: we also assigned you a ticket.
16:22:00 <GeKo> and could easily assign more :)
16:22:31 <gaba> :)
16:23:51 <GeKo> ggus: no need to find it now, if it takes too long (yeah gitlab search is horrible)
16:23:51 <ggus> GeKo: https://gitlab.torproject.org/tpo/community/team/-/issues/39
16:23:58 <GeKo> :)
16:24:33 <GeKo> there are three items in the wanted section
16:24:43 <GeKo> which i marked with "XXX Ticket"
16:24:57 <GeKo> i guess we a) want to have them as wanted
16:25:10 <GeKo> and b) there should be tickets for them?
16:25:28 <GeKo> could you file them if they are missing and add the links to the pad?
16:25:47 <GeKo> i'll clean up things around them afterwards
16:26:43 <ggus> > Understand where relay operators try to go to get support (UX side)
16:26:48 <ggus> who created this one?
16:27:01 <GeKo> dunno
16:27:13 <GeKo> maybe arma2
16:27:16 <gaba> i think it was arma2
16:27:32 <ggus> ok, i will create a ticket for that one. this will also part of the new fellow
16:27:41 <ggus> part of the work
16:27:45 <GeKo> nice
16:27:48 <GeKo> thanks
16:28:10 <GeKo> the final item i had to discuss is the website blocking tor one
16:28:36 <GeKo> i guess we can keep that as wanted given that we have a gsoc project running
16:28:44 <GeKo> which is providing the infra for that
16:29:18 <GeKo> ggus: at some point we should connect both worlds the advocacy one with the tools one
16:29:28 <ggus> and the comms world too
16:29:28 <GeKo> so the former can start using the latter
16:29:31 <GeKo> yes
16:29:48 <ggus> when this project will be released?
16:30:05 <GeKo> i'll leave that for you to decide when the right time is to get started with that
16:30:11 <GeKo> let me see
16:31:02 <GeKo> i am actually not sure when gsoc ends
16:31:13 <GeKo> but i think end of july
16:31:16 <GeKo> or begin of august
16:31:26 <GeKo> https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/GSoC-2021 is the page for the project
16:32:05 <GeKo> ggus: i'll put you as the comms liasion on the pad, too
16:32:11 <GeKo> not just the community one :)
16:32:33 <GeKo> and we can then put that item on the whishlist for comms folks
16:32:33 <ggus> ok!
16:32:41 <GeKo> i'll create a ticket after the meeting
16:32:47 <GeKo> and then we can take it from there
16:33:11 <GeKo> but the tool is a thing (or will be) and it can be useful for the advocavy part i think
16:33:18 <GeKo> *advocacy
16:33:20 <ggus> GeKo: it would be nice to have woswos and _ranchak_ presenting both projects during a Tor demo day.
16:33:28 <GeKo> right!
16:33:29 <ggus> GeKo: yeah!
16:33:31 <woswos> if you have any wishlist for the gsoc project, please let me/us know
16:33:34 <GeKo> good idea
16:33:52 <GeKo> woswos: you could think about the demo day idea, too
16:34:07 <GeKo> would be awesome to have it presented there
16:34:38 <woswos> is there a link for getting more information about it?
16:34:43 <ggus> woswos: yes, one sec
16:34:47 <GeKo> ggus: that's all i had from my side
16:36:04 <ggus> woswos: example - https://lists.torproject.org/pipermail/tor-project/2021-February/003047.html
16:36:12 <GeKo> while ggus is looking for the link let me know if there is anything else to discuss today
16:36:33 <ggus> we will announce the next demo day on torproject mailing list. but it should happen in august
16:36:58 <gaba> nice
16:36:59 <ggus> 5 - 10 minutes, open for community members, small crowd (~40 ppl)
16:37:02 <gaba> end of August
16:37:09 <woswos> thanks for the link
16:37:51 <GeKo> ggus: are we good wrt the roadmap for now?
16:38:17 <ggus> GeKo: yes, i will create the 2 tickets that are missing
16:38:26 <GeKo> thanks
16:38:47 <GeKo> okay. i heard nothing getting raised for discussion
16:38:57 <GeKo> so thanks for being here and have a nice week
16:38:59 <GeKo> #endmeeting