14:59:14 <karsten> #startmeeting metrics team meeting
14:59:14 <MeetBot> Meeting started Thu Aug 27 14:59:14 2020 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:59:14 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:59:16 <karsten> hi mikeperry!
14:59:18 <dennis_jackson> o/
14:59:22 <karsten> hi dennis_jackson!
14:59:22 <jnewsome> o/
14:59:31 <karsten> hi jnewsome!
14:59:42 <mikeperry> hello all
15:00:01 <gaba> hi!
15:00:10 <karsten> https://pad.riseup.net/p/tor-metricsteam-2020.1-keep <- pad
15:00:12 <karsten> hi gaba!
15:00:28 <acute> hi everyone!
15:00:37 <karsten> hi acute!
15:02:05 <karsten> okay, I added two topics for today. do we have more?
15:03:38 <karsten> let's start, and if more topics come up, append them to the agenda.
15:03:44 <karsten> OnionPerf 0.7 release
15:03:51 <mikeperry> I added one. if you're going on leave at end of sept we should try to do some trial experiments on CBT to make sure the workflow is ok and I understand how to get data, etc
15:04:01 <karsten> today's the final date of the current roadmap.
15:04:04 <karsten> mikeperry: sounds great!
15:04:27 <karsten> for 0.7, we have two changes according to the change log.
15:04:47 <karsten> https://gitlab.torproject.org/tpo/metrics/onionperf/-/blob/develop/CHANGELOG.md
15:05:17 <karsten> I'm wondering what to do about #33399 here.
15:05:37 <karsten> mikeperry, you mentioned on #33420 that we'll want to drop timeouts together with guards.
15:05:45 <karsten> we're not doing that yet.
15:05:54 <karsten> should we try to include that in 0.7?
15:06:11 <karsten> otherwise, if we change it later, the behavior of 0.7 and 0.8+ will be different.
15:06:26 <mikeperry> hrm yah
15:06:38 <karsten> I'm just not sure what happens if we send a DROPTIMEOUTS command and tor doesn't understand that.
15:06:50 <karsten> or, if we can handle that, how we should handle that.
15:07:03 <karsten> ignore that we cannot drop timeouts, or die?
15:07:22 <mikeperry> just adding in a DROPTIMEOUTS call where DROPGUARDS is done should be sufficient.. but yeah if tor doesn't support it, or if we're not properly able to remove measurements before a new timeout is learned, then the data is not useful
15:07:26 <karsten> (dying seems harsh, just throwing it out here.)
15:07:40 <acute> die with a warning then?
15:08:20 <karsten> maybe?
15:08:25 <mikeperry> dying seems ok so long as we only die if --drop-guards was specified (ie we don't try DROPTIMEOUTS if --drop-guards was enabled)
15:08:34 <karsten> yes, right.
15:08:41 <mikeperry> err don't try=>only try...
15:08:48 <acute> yes, exactly
15:09:10 <karsten> okay, let's try that.
15:09:29 <karsten> who picks #33399? would be good to get this resolved this week.
15:09:44 <karsten> (I can pick it if nobody wants.)
15:10:30 <karsten> okay, picking!
15:10:45 <karsten> that's all for 0.7 from me.
15:11:04 <karsten> moving on to the next roadmap?
15:11:27 <karsten> Roadmap for OnionPerf 0.8
15:11:43 <karsten> I'll be gone in 3.5 weeks.
15:12:02 <karsten> and ideally we'll have 0.8 out earlier than that, so that we can start new measurements with 0.8.
15:12:34 <karsten> how about we go through the tickets on the board and see which one fits into 0.8?
15:12:48 <karsten> https://gitlab.torproject.org/tpo/metrics/onionperf/-/boards
15:13:01 <gaba> sounds good
15:13:14 <karsten> going from right to left, top to bottom:
15:13:21 <karsten> tpo/metrics/onionperf#33260
15:13:36 <karsten> we're almost there, let's include it.
15:13:50 <acute> +1
15:14:21 <mikeperry> great. so long as we can remove a list of fingerprints, then I can test the Fast and Guard relay cutoffs too
15:14:26 <karsten> tpo/metrics/onionperf#33399 is already part of 0.7.
15:14:36 <mikeperry> assuming I can get access to data and apply those filters + graph
15:14:52 <karsten> mikeperry: that should work, yes.
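A minimal sketch of the handling agreed at 15:08 to 15:09: only attempt DROPTIMEOUTS when --drop-guards was given, and stop with a warning if tor rejects the command. This is not OnionPerf's actual code. The names `drop_guards_and_timeouts` and `ControlCommandRejected` are hypothetical, and the `controller` argument is only assumed to offer a stem-style `msg()` method whose reply has `is_ok()`.

```python
import logging


class ControlCommandRejected(RuntimeError):
    """Raised when tor does not accept a control command we depend on."""


def drop_guards_and_timeouts(controller, drop_guards_enabled):
    """Send DROPGUARDS, then DROPTIMEOUTS, but only if guard dropping is on.

    `controller` is an assumption: any object with msg(command) returning a
    reply that has is_ok(), as stem's Controller does.
    Returns the list of commands that were sent.
    """
    sent = []
    if not drop_guards_enabled:
        # Without --drop-guards we never touch timeouts either.
        return sent
    for command in ("DROPGUARDS", "DROPTIMEOUTS"):
        reply = controller.msg(command)
        sent.append(command)
        if not reply.is_ok():
            # "Die with a warning": the data would not be useful otherwise.
            logging.warning("tor rejected %s: %s", command, reply)
            raise ControlCommandRejected(command)
    return sent
```

The caller decides what "dying" means, for example by letting the exception end the measurement process.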
15:15:11 <karsten> mikeperry: you can also start trying that as soon as #33260 is merged to the develop branch, if you want.
15:15:44 <karsten> speaking of, should we include a section in the readme for filtering?...
15:15:56 <mikeperry> ok. that would be great if you can walk me through that process as soon as develop is ready
15:15:57 <karsten> let's put that on the list...
15:16:04 <karsten> yes, will do!
15:16:09 <acute> karsten: happy to have a go at this
15:16:24 <karsten> acute: the readme?
15:16:30 <acute> yes
15:16:36 <karsten> cool! commenting on the ticket now.
15:17:21 <karsten> done.
15:17:36 <karsten> tpo/metrics/onionperf#34231
15:17:42 <mikeperry> is https://gitlab.torproject.org/tpo/metrics/onionperf/-/issues/33328 a dup?
15:18:17 <gaba> yes, it is the objective from the project
15:18:19 <karsten> mikeperry: that's a "Project" ticket.
15:18:43 <karsten> I have been ignoring those as well as I could.
15:19:07 <mikeperry> oh from trac's old parent tickets?
15:19:32 <acute> does this mean the objective is complete once we implement #33260?
15:20:27 <mikeperry> #33260 sounds like it meets what I need. if so, then yes
15:20:44 <mikeperry> so long as it doesn't explode if the list of relays to remove is too large or something like that
15:20:58 <karsten> we never tried, but it shouldn't. :)
15:21:13 <karsten> filtering by fingerprints was the most basic way of filtering we came up with.
15:21:13 <mikeperry> then I can just generate fingerprint lists in stem on the side and filter arbitrarily that way
15:21:22 <karsten> right. that was the plan.
15:21:38 <karsten> the original plan was to import tor descriptors and do more sophisticated filters in onionperf.
15:21:57 <karsten> we can still do that at a later time. but for now, using stem to generate fingerprints and handing those over to onionperf is the way to go.
15:22:51 <karsten> okay, going back to tpo/metrics/onionperf#34231:
15:22:58 <karsten> acute: should we move that to backlog?
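A rough sketch of the "generate fingerprint lists on the side" idea from 15:21. It assumes relay objects with `fingerprint` and `flags` attributes, as stem's router status entries provide. The selection rule (relays lacking the Fast or Guard flag) is just one hypothetical criterion, and the one-fingerprint-per-line output is an assumption about what a filter would accept, not a documented OnionPerf format.

```python
def fingerprints_lacking_flags(relays, required_flags=("Fast", "Guard")):
    """Return sorted fingerprints of relays missing any of the required flags.

    Each relay is assumed to have .fingerprint (str) and .flags (iterable of
    str), which is how stem exposes router status entries.
    """
    required = set(required_flags)
    return sorted(
        relay.fingerprint
        for relay in relays
        if not required.issubset(relay.flags)
    )


def format_fingerprint_list(fingerprints):
    """Render fingerprints one per line (assumed format for a filter file)."""
    return "".join(fingerprint + "\n" for fingerprint in fingerprints)
```

With stem, the `relays` argument could plausibly come from `Controller.get_network_statuses()`; the resulting text would then be written to a file and handed to the fingerprint filter from #33260.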
15:23:19 <karsten> with the reasoning that we already have a way to map tgen and tor parts.
15:23:35 <karsten> we can still do the more elegant way later, but it's not a blocker right now.
15:23:40 <acute> I don't think there is any rush to include it in 0.8
15:23:49 <acute> so we can
15:23:52 <karsten> okay. moving it.
15:24:18 <karsten> tpo/metrics/onionperf#33420
15:24:37 <karsten> I'd like to keep that for 0.8.
15:25:07 <karsten> it's also related to mikeperry's trial experiment/analysis idea.
15:25:26 <mikeperry> yeah I will likely need to work with that before you come back
15:25:37 <mikeperry> so making sure it does stuff properly first is wise
15:25:45 <karsten> yep. let's keep it then.
15:26:47 <karsten> tpo/metrics/onionperf#40001
15:27:08 <karsten> I wonder if I could have some help with that.
15:27:39 <karsten> for example, part of this documentation includes the setup of our long-running instances.
15:27:44 <mikeperry> I can help by trying to use the docs and whining and crying when I get confused :)
15:27:53 <karsten> yes, that _is_ helpful!
15:28:10 <karsten> let's try that as soon as filters are in the develop branch, okay?
15:28:15 <mikeperry> ok
15:28:30 <acute> ok, so I've actually not set up one of our onionperfs
15:28:50 <acute> but I did examine the setup of op-ab, so I could attempt to draft something
15:28:51 <karsten> would you want to do that together with me, and we write the documentation as we go?
15:29:02 <acute> yes, that sounds great!
15:29:08 <karsten> awesome!
15:30:20 <karsten> great. let's pick a date and time offline.
15:30:36 <acute> cool!
15:30:56 <karsten> tpo/metrics/onionperf#33421
15:31:17 <karsten> it's still a lot of work.
15:31:37 <karsten> and the first part would be to understand how exactly guards work.
15:31:39 <karsten> ;)
15:31:56 <karsten> mikeperry: maybe you could help with the first part there?
15:31:56 <dennis_jackson> :P
15:32:09 <mikeperry> also one of the experiments I want to do is use more than one guard at once. this should improve long-tail performance
15:32:12 <karsten> https://gitlab.torproject.org/tpo/metrics/onionperf/-/issues/33421#note_2706521
15:32:43 <mikeperry> via torrc Num*Guards settings
15:33:08 <karsten> that sounds doable.
15:33:19 <karsten> adding more torrc options is easy in onionperf.
15:33:53 <karsten> the hard part of this issue is to find out what exactly in the tor logs we'd like to process in onionperf.
15:34:07 <karsten> well, as the comment on the issue says.
15:34:26 <mikeperry> can't we just use GUARD events and compare to circuit path lines from the control port?
15:34:45 <karsten> the GUARD events, even the recently fixed ones, are possibly insufficient for this.
15:34:54 <mikeperry> like if I have a data file that records GUARD events, and also path lines, in theory I can do checks on that myself
15:35:01 <mikeperry> oh
15:35:36 <karsten> again, this whole guards thing is a mystery. with all the different sets of candidates, primary guards, and so on.
15:35:42 <karsten> maybe I'm wrong, and they are sufficient.
15:35:56 <karsten> that would be the good result of this first analysis.
15:36:12 <karsten> the not-so-good result would be that we'll have to fix GUARD events even more.
15:36:20 <karsten> because we're not going to parse tor logs in onionperf, just torctl logs.
15:36:43 <dennis_jackson> Question: Are the experiments intended to find bugs in how Tor handles Guards etc?
15:37:10 <dennis_jackson> If not, just using stem directly steps over that issue right? At least, that's what I've done to avoid having to dig into the issue too much
15:37:27 <karsten> how did you use stem?
15:37:35 <dennis_jackson> Programmatically building the circuits I wanted directly
15:37:52 <karsten> ah, that would be a huge change to what onionperf does right now.
15:37:57 <mikeperry> onionperf lets tor itself choose paths
15:38:18 <dennis_jackson> Sure yes, but that's why I asked what you want to measure
15:38:28 <karsten> we might use stem to ask tor what guards it uses.
15:38:34 <karsten> and log that.
15:38:39 <mikeperry> so we need to record those paths, and the output of GUARD events, and see if tor is doing the right thing when we tell it to use 1 guard, or 2 guards, or 3 guards
15:38:44 <karsten> that would work around relying on events.
15:38:47 <dennis_jackson> ah okay
15:38:56 <mikeperry> this GUARD event is a sad stateful mess
15:39:19 <karsten> how about this: I can spend a few hours on this to get this analysis started.
15:39:36 <karsten> I'm just not sure if we'll get it resolved in time for 0.8.
15:39:37 <mikeperry> it should have just told us what Tor thinks the current guards are right now instead of all this stateful per-guard UP/DOWN info
15:40:05 <karsten> never too late to add another event type...
15:40:43 <mikeperry> UP/DOWN also seem not necessarily correlated with in-use
15:40:54 <mikeperry> they might just mean possible to use
15:41:09 <mikeperry> same thing for BAD/GOOD
15:41:37 <karsten> mikeperry: do you want to take a closer look at this first and comment on the ticket before I do something there?
15:42:43 <karsten> in any case, let's keep it in the roadmap, though it might turn out to be too big for 0.8.
15:43:13 <karsten> quickly looking through "Backlog".
15:43:33 <mikeperry> yeah I think this GUARD event requires knowledge of prop271 internal tor state to make use of
15:43:35 <karsten> I don't think there's room for more, if we want to finish in 2-2.5 weeks.
15:43:47 <karsten> mikeperry: sounds like it.
15:43:48 <mikeperry> it is just telling us about *potential* guards, not the one that prop271 decided was the best
15:43:56 <mikeperry> if I am reading the patch right
15:44:04 <mikeperry> ugh I should have looked at that earlier
15:44:19 <karsten> do you want to write a better tor patch, and we run that in onionperf for a while?
15:45:20 <mikeperry> yeah I should at least try.. but you're right.. it is not clear how the hell this GUARD event is *supposed* to tell you the current in-use guard(s)
15:45:28 <mikeperry> because of all this primary secondary business
15:46:05 <karsten> can I assign the issue to you for possible next steps?
15:46:08 <mikeperry> yah
15:47:30 <karsten> done.
15:47:33 <karsten> thanks!
15:47:48 <karsten> okay, I think that's the plan for 0.8 then.
15:48:02 <karsten> anything else on the roadmap topic?
15:48:32 <karsten> moving on:
15:48:33 <karsten> CBT trial experiments/analysis
15:49:04 <karsten> 15:05:25 <+mikeperry> I added one. if you're going on leave at end of sept we should try to do some trial experiments on CBT to make sure the workflow is ok and I understand how to get data, etc
15:49:29 <karsten> assuming we implement something like I suggested on tpo/metrics/onionperf#33420 today,
15:49:30 <mikeperry> so for that, I will want to run my own onionperf instance and examine and graph output most likely
15:50:13 <mikeperry> it already smells fishy if those values you posted on tpo/metrics/onionperf#33420 are real
15:50:22 <karsten> they are.
15:50:47 <karsten> I wonder if it's easier to just analyze the torctl logs directly first.
15:51:05 <karsten> if the goal is to get a feel of the data.
15:51:56 <karsten> can you write down what exactly you're interested in, and I run a quick analysis on the torctl logs locally?
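A minimal sketch of what "analyze the torctl logs directly" (15:50) could look like for circuit build timeouts. It pulls the key=value fields out of `BUILDTIMEOUT_SET` control events as defined in the Tor control protocol. The function name is hypothetical, and the layout of a torctl log line (anything may precede the `650 BUILDTIMEOUT_SET` part) is an assumption rather than a documented OnionPerf format.

```python
import re

_KEY_VALUE = re.compile(r"(\w+)=(\S+)")
_MARKER = "650 BUILDTIMEOUT_SET "


def parse_buildtimeout_set(line):
    """Return a dict for a line containing a BUILDTIMEOUT_SET event, else None.

    The dict holds the event TYPE (e.g. COMPUTED) plus every KEY=VALUE field,
    converted to float where possible.
    """
    start = line.find(_MARKER)
    if start < 0:
        return None
    fields = line[start + len(_MARKER):].split()
    if not fields:
        return None
    event = {"TYPE": fields[0]}
    for field in fields[1:]:
        match = _KEY_VALUE.fullmatch(field)
        if not match:
            continue
        key, value = match.groups()
        try:
            event[key] = float(value)
        except ValueError:
            # Keep unexpected non-numeric values as plain strings.
            event[key] = value
    return event


# Made-up sample line for illustration only; the leading timestamps are an
# assumed log prefix, and the values are not real measurements.
SAMPLE = ("2020-08-27 15:00:00 1598540400.00 650 BUILDTIMEOUT_SET COMPUTED "
          "TOTAL_TIMES=1000 TIMEOUT_MS=1500 XM=300 ALPHA=1.5 "
          "CUTOFF_QUANTILE=0.800000 TIMEOUT_RATE=0.05 "
          "CLOSE_MS=60000 CLOSE_RATE=0.01")
```

Collecting `CUTOFF_QUANTILE` and `TIMEOUT_RATE` per event over time would give a first feel for the data before anything is added to OnionPerf itself.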
15:52:01 <mikeperry> well the goal is to make sure I know enough onionperf kungfu to be able to diagnose and fix the issue later, as well as tune the quantile value via a custom tor patch
15:53:09 <mikeperry> for the tuning, I want to see what different values of cutoff_quantile do to the actual timeout rate
15:53:13 <karsten> I'm just not sure if we should add all the buildtimeout values to graphs and/or the CSV output.
15:53:22 <mikeperry> and to TTFB and throughput metrics
15:54:14 <karsten> okay. in that case let's try to do this in onionperf, as part of #33420.
15:54:30 <mikeperry> if it is simpler to keep the fields you already put in, that is sufficient
15:54:44 <mikeperry> I can do debugging and analysis with torctl logs, as you said, yah
15:54:50 <mikeperry> of a custom onionperf
15:55:08 <karsten> well, that might be easier.
15:55:16 <karsten> we can later add new stuff to onionperf for this.
15:55:27 <karsten> but knowing what exactly we're interested in would help with that.
15:55:50 <mikeperry> when we do the full experiment on the live network, we will want to be able to mark which sections of the onionperf graphs used what cutoff_quantile
15:56:01 <karsten> you don't even need a custom onionperf (in terms of patched). onionperf already writes torctl logs containing all those events.
15:56:16 <mikeperry> and also know what their timeout_rate (and what onionperf thinks is the timeout+failure rates) at those times
15:56:23 <mikeperry> ah ok
15:56:49 <karsten> sounds like we'll need to discuss that more.
15:56:52 <mikeperry> I do need to patch tor if I want to change cutoff_quantile locally (as opposed to network-wide in consensus)
15:56:53 <karsten> (90 seconds left)
15:57:00 <karsten> oh, right.
15:57:08 <karsten> I mean, sounds plausible. I wouldn't know for sure.
15:57:24 <karsten> let's use the last minute for the last topic:
15:57:28 <karsten> Simply Secure and Tor UX are running a survey to collect user feedback about the metrics website. Please, participate! I plan to email lists the next month to encourage people to do it (antonela)
15:57:32 <karsten> https://tools.simplysecure.org/survey/index.php?r=survey/index&sid=39865&lang=en
15:57:45 <karsten> I added something to the metrics website for that.
15:57:48 <antonela> yes, thanks Karsten for pushing the banner live!
15:57:50 <karsten> with a link.
15:57:53 <karsten> sure!
15:58:02 <antonela> this work has OTF funding and given the current situation, the work has a stop order. We will use the survey to collect info until things are back to normal.
15:58:14 <antonela> I'll email the lists the next month to call for participation
15:58:22 <karsten> sounds great!
15:58:26 <karsten> thanks for this!
15:58:28 <antonela> all people here should jump in!
15:58:34 <antonela> of course
15:58:46 <karsten> time's up! I think there's another meeting after this.
15:58:58 <karsten> thanks, everyone! talk to you next week. o/
15:59:01 <acute> will do :)
15:59:10 <karsten> #endmeeting