16:59:31 <ahf> #startmeeting Network team meeting, 31 January 2022
16:59:31 <MeetBot> Meeting started Mon Jan 31 16:59:31 2022 UTC.  The chair is ahf. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:31 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:59:35 <ahf> hello hello welcome
16:59:41 <ahf> our pad is at https://pad.riseup.net/p/tor-netteam-2022.1-keep
16:59:46 <nickm> hihi
16:59:50 <Diziet> Evening all.
16:59:51 <dgoulet> hi
16:59:58 <jnewsome> o/
17:00:20 <mikeperry> o/
17:00:36 <ahf> how are our boards doin' at https://gitlab.torproject.org/groups/tpo/core/-/boards ?
17:02:09 <nickm> I think we have a few things to untangle at arti, but I also think we're on it
17:02:19 <hiro> o/
17:02:22 <nickm> I also need to pick up a little more hacking work once I'm done with reviews
17:02:28 <nickm> (hi hiro!)
17:02:41 <ahf> i need to finish those two s30 tickets so i can get out of that this week
17:02:45 <juga> o/
17:02:52 <Diziet> Today has been a busy review day (and fixing random stuff day) for me.  Volunteers turn up at the weekend and Do Stuff!
17:03:02 <ahf> may need some help understanding how we handle our flag calculations at some point from either nick, david or mike
17:03:27 <ahf> Diziet: :-D
17:03:59 <ahf> ok i don't see anything looking off on the board at least
17:04:05 <nickm> ahf: as in flags in the directory votes?
17:04:13 <eta> o/
17:04:21 <ahf> nickm: yes, in this case for the bridge auth
17:04:26 <nickm> (hi, juga, eta!)
17:04:30 <ahf> we have an annoying bug that i cannot reproduce locally with stable flag assignments there
17:04:35 <ahf> o/
17:04:47 <ahf> on gman's bridge auth
17:05:10 <ahf> nickm: maybe i can prod you about it wed/thu this week if you have a moment there? i think i could use your brainz there
17:05:18 * gman999 seeped in $work but around-ish
17:05:29 <nickm> sure. I think I understand how it's supposed to work, though the others may have a better handle on how it really behaves
17:05:34 <ahf> gman999: don't need you, yet, but we will need to test some things at some point :-/
17:05:45 <ahf> i'll ping you, probably over email this week on this
17:05:45 <gman999> cool cool
17:05:49 <gman999> np
17:05:55 <ahf> nickm: awesome
17:06:00 <ahf> dgoulet: anything on releases this week?
17:06:41 <dgoulet> not really. I'll let mikeperry update us about s61 which is in relation to 047
17:06:57 <ahf> excellent
17:07:04 <ahf> we can do that in the s61 section
17:07:21 <ahf> ugh, it's not the first monday of the month.. it's the last
17:07:52 <ahf> doesn't look like there are any uncaught items from other teams
17:08:14 <ahf> eta added an excellent item today to our discussion section:
17:08:18 <ahf> [2022-01-31] giving external contributors more access / authority to merge (specifically trinity for their work on CI, etc) ~eta
17:08:47 <ahf> we *finally* have a project where a lot of external people seem able to contribute a bit every now and then
17:09:05 <ahf> and we have never really found a good way to handle that in our team in general, but now that we have arti we should probably start doing something here
17:09:42 <ahf> we have in the org the ability to make people a "core contributor", which happens every now and then, so i think the goal with people we get in this way is that eventually they should become that
17:09:43 <ahf> but
17:09:52 <ahf> what can we do to make it more attractive here and now to contribute to the project(s) ?
17:10:01 <Diziet> We should come up with a skeleton process for making someone into a maintainer.  It doesn't have to be heavyweight, but we should write down the steps (and which parts of the discussion should be public) etc.
17:10:17 <ahf> yeah
17:10:53 <ahf> it seems to me, for example with trinity, that they already have some domains they contribute a lot to: infrastructure, testing, and CI, right?
17:10:56 <eta> yeah
17:10:58 <Diziet> Inevitably it will involve some discussions of someone as a person so there needs to be a substantial element not done in the glare of publicity.
17:11:13 <eta> like, why I asked the question in the first place was because I'd kind of rather they review CI MRs
17:11:30 <eta> they know a fair deal more about it than me anyhow, since they've worked on it a fair deal
17:11:47 <Diziet> I got assigned a CI MR from trinity for review this morning and was like "urrrr no idea, err, let me throw it at nickm"...
17:11:56 <ahf> maybe we should do the easy thing then for now while we build up a process for this and ask them if they are up for taking on the review task for CI/test ?
17:12:02 <ahf> they still need a reviewer for their own changes tho
17:12:05 <Diziet> eta: That's really valuable expertise
17:12:41 <Diziet> ahf: Right, having a 2nd eye on everything is a good idea, even for maintainers.
17:13:47 <ahf> so i think there are two things in this: first, we need to find someone who is up for asking trinity if they are up for this. if nobody wants to, i'm cool with doing it, and we should say this is a new thing for us that we are trying to build. second, i will have to figure out if we can make the review assignment bot a bit smarter about who to assign things to based on files touched
17:14:09 <Diziet> SGTM on both counts
17:14:26 <ahf> does anybody want to talk with trinity here or should i? :-)
17:14:58 * eta doesn't feel a strong desire to, but can if required
17:15:06 <ahf> i can do it, that is fine
17:15:26 <ahf> ok, i think that was this item for now. i'll start a discussion on the process in our team and then prod trinity here and now
17:15:33 <ahf> and look into what the triage-bot can be made to do
17:15:34 <ahf> ok!
17:15:38 <Diziet> :-)
17:15:40 <ahf> mikeperry: wanna do s61 stuff?
17:15:47 <mikeperry> ok
17:17:02 <mikeperry> so last week, I began running sims after switching the simulator over to negotiation, and while simultaneously checking this nagging issue of some guard relays having large-ish circuit queues, I noticed that something in the negotiation branch made it worse
17:17:39 <mikeperry> jnewsome and I suspect that the geoip file update could have changed the network characteristics of the network model that the simulator builds, but I am making a list of other potential issues from the diff since rebase as well
17:18:03 <mikeperry> so we will be trying to confirm the cause of that regression while I prepare the branch for review
17:18:37 <mikeperry> while preparing the branch and checking the spec, I also noticed a missing piece from onion service negotiation, which dgoulet quickly fixed
17:19:02 <ahf> hm, interesting with the geoip file
17:19:05 <mikeperry> but I need to clean up commit structure now and make sure everything is clean, so review is not annoying
17:19:33 <mikeperry> so I am a bit behind on that
17:20:08 <mikeperry> I will be updating the sim plan  soon wrt this investigation, as well as onion service testing, and other things the alpha needs tested wrt negotiation
17:20:24 <ahf> cool!
17:20:37 <mikeperry> jnewsome: how is the onion service sim support? hiro was asking me about what/when to do there?
17:21:39 <jnewsome> mikeperry: i need to make another pass over it and send it back to rob for review. I'll try to make sure I get that done today
17:21:43 <mikeperry> with this regression, there's now a few things that need simming. those can be done first, to find that regression, while the onion service stuff moves further along
17:22:15 <mikeperry> or we could run a full scale sim with onion services just to see how it behaves. I can check for some things with just scripts, before graphs exist
17:23:05 <ahf> i think i may have just forgotten to check this off in my list, but do reviews happen in parallel with this investigation or do we continue on the code review after this has been looked into?
17:23:12 <mikeperry> ok. let me know. once we have that, it might be good just to kick off a full size sim, so hiro can have a pipeline output with results to look at graphing
17:23:13 * ahf have not gone over GL yet today
17:24:21 <mikeperry> ahf: the regression is extremely minor. I think running sims to track down the commit that caused it is ok to do in parallel with code review
17:24:30 <ahf> excellent
17:24:45 <mikeperry> but I also have some things to do on the branch and the spec to make it cleaner and easier to review
17:24:58 <ahf> ok, you wanna do that first?
17:25:06 <mikeperry> so I am juggling a few things here. it may be another couple days before MRs are ready
17:25:11 <ahf> ok!
17:25:16 <ahf> let's wait a few days then, no worries
17:25:18 <ahf> cool
17:25:28 <ahf> i assume you'll prod us on irc when things are ready
17:26:03 <mikeperry> we also have the report metrics, which were in fact impacted by something that happened last quarter. either the DoS and/or relay removal made our performance indicators worse, in the report
17:26:36 <mikeperry> it looks like we mentioned the DoS in the indicators table, but maybe both are worth mentioning in the report, if we are not going to filter the dates out of the indicator metrics
17:26:46 <mikeperry> gaba: that report is due like, today, yeah?
17:26:52 <GeKo> yes
17:26:54 <gaba> yes
17:27:06 <mikeperry> what time?
17:27:14 <gaba> well. I already sent it to bekeela
17:27:20 <gaba> if we are changing something, it should be now
17:27:20 <mikeperry> do you think we should mention the DoS and relay removal in more places, or is what we have enough?
17:27:23 <gaba> yes
17:27:23 <mikeperry> ok
17:27:31 <gaba> let's add that into the narrative
17:27:35 <gaba> the summary
17:27:48 <mikeperry> ok. I can put a paragraph in there right after the meeting
17:27:52 <gaba> thanks!
17:27:56 <mikeperry> on the nextcloud link, yeah? that is still canonical?
17:28:24 <gaba> yes
17:28:27 <mikeperry> ok
17:28:29 <gaba> still it is the right place
17:30:23 <mikeperry> so I will be updating the simulator plan for the alpha today. that is top priority for me, so we're less scatterbrained about what we're doing for the negotiation branch and this overload issue
17:31:03 <mikeperry> there may be some switching of ordering there, depending on how the onion service graphing comes along vs other things we can test sooner
17:31:21 <ahf> shouldn't we just mention this DoS and relay removal in the next report?
17:31:32 <ahf> seems a bit close to the deadline to revise a doc today if we are sending it off?
17:31:35 <mikeperry> hiro,jnewsome: let's follow up later once rob does that review and/or we have something that can run in a full-size sim
17:31:51 <jnewsome> mikeperry: sg
17:32:04 <mikeperry> ahf: the problem is they will look at our metrics, like they always do, and ask "why did things get worse", like they always do
17:32:06 <gaba> ahf: the report is already done. We are only adding one paragraph
17:32:45 <mikeperry> so we need a paragraph that says we had two diff attacks on the network, so they at least know why
17:32:54 <ahf> ah
17:32:55 <ahf> ok
17:33:02 <ahf> makes sense
17:34:16 <mikeperry> gotta cross the T's on this stuff. this is why we're trying to create a process wrt ticket tags and tracking these date ranges. this stuff has to go into the report if it changes metrics, which it did
17:34:31 <ahf> ya
17:34:33 <mikeperry> so we need to make sure we see that earlier next time, I guess. it is a pretty clear change
17:35:39 <ahf> anything else for today?
17:35:48 <mikeperry> we're also a little handicapped not having geko on full capacity for that :/
17:36:02 <GeKo> i am back! :)
17:36:09 <mikeperry> juga,geko: anything wrt sbws and network-health?
17:36:31 <GeKo> we could use your input on https://gitlab.torproject.org/tpo/network-health/sbws/-/issues/40119 for sbws
17:36:39 <juga> yes
17:36:40 <GeKo> to figure out the prio of that one
17:37:06 <juga> also, new gabelmoo's CDF graph
17:37:15 <juga> none of them is urgent
17:37:25 <GeKo> otherwise not much from me. i made it out alive of the tor browser dungeon and ramping up on n-h work again
17:37:49 <mikeperry> oh yes, that difference with the weight sum is likely gabelmoo's latency vs longclaw. I meant to comment but I have been very distracted with the report, negotiation, and this overload cell queue issue
17:38:01 <juga> np
17:38:03 <mikeperry> but juga's guess is likely right
17:38:27 <juga> so you would say no need to investigate more and it's fine?
17:39:19 <mikeperry> yeah, it is the kind of thing that we should look into after sbws switches to congestion control
17:39:31 <juga> ok
17:39:46 <mikeperry> before that point, performance in sbws is *heavily* dominated by latency of dirauth to fast relays
17:40:55 <mikeperry> the graph in https://gitlab.torproject.org/tpo/network-health/metrics/analysis/-/issues/33077#note_2773414 looks more sane, btw
17:41:21 <mikeperry> it looks a lot more in-line, and sbws is measuring some relays as faster, whereas torflow was not
17:41:35 <juga> i see
17:41:46 <mikeperry> sbws is the red line, yeah?
17:41:55 <juga> 1min
17:42:05 <mikeperry> I think this is ok
17:42:13 <juga> yes
17:42:29 <mikeperry> and yeah, the previous discrepancy was likely because of the network attacks and flooding experiments in those months
17:42:34 <mikeperry> many variables to control for, heh
17:42:43 <juga> yep
17:42:50 <mikeperry> but I think this is good
17:42:57 <mikeperry> it looks like sbws is working, from this last graph
17:43:09 <juga> (phew ;))
17:43:14 <GeKo> \o/
17:43:25 <ahf> :-)
17:44:41 <mikeperry> ok. I think that is it. I am of course displeased by how chaotic this is right now, but I will organize the sim plan at least, so it's more clear what we're doing there
17:45:25 <ahf> i don't think you should be displeased by anything. there are many gears that need to fit together in all of this and we are not that far off the plan
17:45:52 <ahf> i have one small thing i forgot to ask
17:45:56 <mikeperry> yeah I didn't expect to have these kinds of sim issues at this point. but we will get to the bottom of these issues!
17:46:29 <ahf> nickm, eta, Diziet: could y'all do 17 BBB tomorrow for arti api discussion? everybody else is welcome too, but based on my experience with the last round of this topic i will say that you need to have a good idea of rust if you want to dive into this conversation
17:46:49 <ahf> mikeperry: \o/
17:46:49 <Diziet> I'm free then
17:46:52 <eta> ahf: wfm
17:46:58 <nickm> I can do 1700.
17:47:06 <ahf> awesome, let's go for that then
17:47:09 <nickm> (that's what you meant, right?)
17:47:12 <ahf> yes
17:47:34 <Diziet> nickm: Not 17 BBB's all at once.
17:47:40 <ahf> ok, thanks all for a good meeting. next monday is the meeting where s61 is split into its own meeting since it's the first monday of february
17:47:58 <ahf> i'll merge the january and december report for the forum as december was pretty low in stuff, but we can talk about that next week
17:48:00 <ahf> thanks all o/
17:48:01 <ahf> #endmeeting