16:59:46 <ahf> #startmeeting Network team meeting, 20 december 2021
16:59:46 <MeetBot> Meeting started Mon Dec 20 16:59:46 2021 UTC. The chair is ahf. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:46 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:59:49 <ahf> hello hello
16:59:56 <ahf> our pad is at https://pad.riseup.net/p/tor-netteam-2021.1-keep
17:00:20 <ahf> last meeting of 2021!
17:00:27 <juga> o/
17:00:40 * ahf pokes jnewsome, dgoulet, nickm, mikeperry, eta
17:00:53 <dgoulet> o/
17:01:13 <mikeperry> o/
17:01:23 <jnewsome> o/
17:01:24 <ahf> i am gonna do this real quick because i think everybody is trying to get their stuff done before checking out
17:01:40 <ahf> i don't see anything on the boards that requires any special attention, it seems like people are OK there
17:01:54 * jnewsome is on vacation starting today but available to check in or answer q's
17:02:11 <nickm> hello
17:02:22 <ahf> jnewsome: ah! enjoy the vacation then! :-D
17:02:48 <ahf> dgoulet: anything we need to talk about re last week's releases? i added some bigger discussions about comms on releases that we can take up in the new year
17:03:06 <dgoulet> nothing on my radar
17:03:39 <ahf> i don't see any tasks from other teams
17:03:59 <ahf> no announcements, no discussions
17:04:21 <ahf> the first meeting in 2022 will be monday the 10th at 17 UTC
17:04:36 <ahf> ian will start on the 6th and begin ramping up on arti things
17:04:40 <mikeperry> oh I had not updated the pad yet with s61
17:04:44 <ahf> mikeperry: you want to talk about s61?
17:04:53 <ahf> no worries
17:05:09 <mikeperry> sure
17:06:16 <mikeperry> jnewsome and hiro switched us to a new baseline for the sim, using a flooded network period in sept, and jnewsome created some sim parameters for consensus params
17:06:54 <mikeperry> I have basically tuned to within shadow's variance between runs. there's still some more tuning to double-check and re-run, but we have some pretty good results
17:07:38 <mikeperry> with 1ms KIST and the flooded network model (which should more accurately represent capacities), we're seeing some really high throughputs, and not much impact on latency
17:07:48 <mikeperry> https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/jobs/73790/artifacts/file/public/hk/tornet.plot.pages.pdf
17:07:53 <mikeperry> https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/jobs/73790/artifacts/file/public/de/tornet.plot.pages.pdf
17:08:22 <mikeperry> I will continue to queue up some remaining whole-network sims over the break
17:08:53 <mikeperry> and in january, we will start looking at negotiation, circewma, final calibration, and then mixed network sims
17:09:05 <ahf> nice, really nice
17:09:13 <ahf> first time i look at this pdf
17:09:14 <dgoulet> epic
17:09:54 <mikeperry> yeah. 1ms KIST plus congestion control really takes the speed limits away
17:09:57 <ahf> so what is meant here by baseline? it's what a model should be able to do? and experiment is what tor is able to do?
17:10:32 <mikeperry> the baseline is a sim of 0.4.6 stock, using a network model derived from relay capacities during Rob's flooding experiment
17:10:47 <mikeperry> and public tor is a graph of the onionperfs from the live network during that period
17:11:15 <mikeperry> so there are still some small discrepancies in the sim vs public tor, which we will dig into in Jan
17:11:20 <ahf> i see
17:11:32 <ahf> ya, not a big difference
17:12:15 <mikeperry> but yeah, some xfers have up to 300Mbit throughput, according to the sim
17:13:02 <mikeperry> and this causes very little/no additional latency
17:13:15 <mikeperry> circewma may occasionally be turning on still, though
17:14:00 <ahf> gonna be very exciting to see this out in the hands of our users \o/
17:14:10 <mikeperry> for reference, we are almost done with round4 in the sim plan, thanks to the cloud runners jnewsome added the past couple weeks
17:14:13 <mikeperry> https://gitlab.torproject.org/mikeperry/tor/-/blob/cc_shadow_experiments_v2/SHADOW_EXPERIMENTS.txt#L586
17:14:41 <mikeperry> round 5 is our goal in January, before we release an alpha with negotiation and new default params
17:14:57 <mikeperry> round 6 is the sims to complete before a stable
17:15:42 <anarcat> mikeperry: how are runners?
17:15:44 <anarcat> jnewsome: ^
17:16:21 <jnewsome> anarcat: mostly good; the new one runs a lot slower for some reason, but I haven't had time to investigate
17:16:35 <jnewsome> we changed the label of it to 'shadow-small' for now
17:16:39 <mikeperry> anarcat: we put the two runners you made in 'shadow-small', since they are 4-5X slower than the beefy one we have in 'shadow'. jnewsome is using them for test sims
17:16:46 <jnewsome> https://gitlab.torproject.org/jnewsome/sponsor-61-sims/-/issues/12
17:17:10 <mikeperry> I will run a sim in both 'shadow-small' and 'shadow' at the end of round4, just to see how the results compare between the two sets
17:17:17 <jnewsome> also I think we'll need to start garbage collecting old sim results from the persistent volume pretty soon
17:17:33 <mikeperry> there still is a bit of variance between runs in shadow in general tho. it is making the final tuning a bit tricky
17:17:33 <anarcat> the beefy one is a machine that is possibly a decade newer than the other
17:17:40 <anarcat> i am not surprised by 4-5x slowdown
17:17:53 <mikeperry> we are at the point where changing things is not really making as much difference as the variance in shadow
17:18:06 <mikeperry> which is good. it means we're pretty much as well-tuned as we can get
17:18:20 <mikeperry> at least, with the current shadow run length and run count
17:18:22 <jnewsome> we can reduce variance if we need to. easiest is to run more trials per experiment
17:18:50 <jnewsome> but using a larger network also makes a pretty big difference. simulating > 30m would also probably help (rob usually uses 60m)
17:19:06 <anarcat> chi-node-14 has 80 cores, chi-node-12 has 24... with your workload, it's bound to make at least a 4x difference, even disregarding the difference in CPU generation
17:19:27 <anarcat> chi-node-12: https://paste.anarc.at/publish/2021-12-20-Fweg1rPpketidZlNg2QTxZjReyIrFxJqRIBXZ9EGlcU/clipboard.txt
17:19:39 <mikeperry> yeah, in round5 in january, maybe we can try larger networks/more runs to investigate some of these ties
17:19:47 <anarcat> chi-node-14 https://paste.anarc.at/publish/2021-12-20-oeK7_ZvgYuyT8qbgv4c6DtiPHfoJ5znkgTJ-70t20gM/clipboard.txt
17:21:21 <ahf> anything else for today?
17:21:22 <jnewsome> anarcat: yeah, maybe that's enough to explain the performance difference
17:21:51 <mikeperry> geko and juga have anything? are they here today?
17:22:02 <mikeperry> juga has been examining sbws graphs
17:22:44 <juga> hey
17:22:49 <juga> no news today
17:22:55 <mikeperry> sbws looks good.. geko and I wanted to look at a gabelmoo graph from before and after the change, overlaid
17:22:56 <juga> still processing gabelmoo's data
17:23:00 <mikeperry> ok
17:23:08 <GeKo> i am
17:23:24 <GeKo> but not much in network health land...
17:23:30 <mikeperry> ok
17:23:59 <mikeperry> well that might be it then. I am very excited for january
17:24:40 <mikeperry> and 2022 in general
17:24:59 <ahf> ditto
17:25:24 <ahf> ok folks, let's call it then. hope everybody will have a really nice holiday, and thanks for all the awesome work this year \o/
17:25:35 <ahf> #endmeeting
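
A note for readers on the "1ms KIST" setting discussed above: below is a minimal, hypothetical torrc sketch of how such an interval could be set on a single test relay. It assumes Tor's standard Schedulers and KISTSchedRunInterval options; it is not the team's actual Shadow experiment configuration, which lives in the linked sponsor-61-sims repository and SHADOW_EXPERIMENTS.txt.

    # hypothetical test-relay torrc sketch (not the experiment config used in these sims)
    Schedulers KIST               # use only the KIST scheduler
    KISTSchedRunInterval 1 msec   # run the scheduling pass every 1 ms; 0 (the default) defers to the consensus

On the live network the interval is normally driven by the KISTSchedRunInterval consensus parameter rather than per-relay overrides, which is presumably how the new default params mentioned for the upcoming alpha would be rolled out network-wide.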