14:42:37 <karsten> #startmeeting metrics team
14:42:37 <MeetBot> Meeting started Thu May  4 14:42:37 2017 UTC.  The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:42:37 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:42:51 <tjr> Aside from 'new graphs = new code', what CSV time-granularity format wouldn't require new code?
14:43:24 <karsten> well, all graphs use 1 data point per UTC day.
14:43:34 <karsten> so, graphing that would be easiest.
14:44:04 <karsten> but some .csv files already grow quite big, and I don't know how much data you want to include per data point.
14:44:15 <karsten> I was mainly thinking of possible ways to reduce that.
14:44:23 <tjr> Oh, whoops. Yes, okay, I got you now. (I forgot that 1 day _is_ a smoothed function of 24 consensuses)
14:44:57 <tjr> I think everything would work fine at day-level granularity
14:45:13 <karsten> okay, then maybe start with that and possibly optimize later.
14:45:35 <tjr> So my next question is what should the next steps be?
14:46:11 <tjr> I want to make interactive graphs using javascript controls and d3.js.
14:46:41 <tjr> So that's the clientside stuff. Would that be easy to port into the Metrics frontend?
14:48:25 <karsten> there's no established process for adding graphs to metrics, so I'm not exactly sure what the next step would be.
14:48:42 <karsten> I could imagine that we discuss the .csv file format a bit more in the next step.
14:48:52 <karsten> like, can we avoid dynamic column sets?
14:49:17 <karsten> and maybe there will be similar issues once we look at actual files.
14:49:19 <karsten> or formats.
14:50:15 <tjr> I guess I can refactor the schema... it will add additional clientside processing (which will slow graph generation) though, since we'll have to walk over the data and re-assemble it into something with dynamic columns for graphing in d3
14:50:18 <karsten> regarding the client side, I could imagine just writing some R code based on your .csv file format and use the existing web code.
14:50:48 <tjr> Hm... would that be something your team does if/when you want to adopt the graphs?
14:50:58 <karsten> R code? yes.
14:51:18 <tjr> Cool :)
14:51:20 <karsten> see the last part in my mail where I wrote which parts we'd need help with.
14:51:52 <tjr> Okay
14:51:57 <tjr> This seems easy enough then...
14:52:15 <tjr> The main thing I need to do is refactor the database schema
14:52:18 <karsten> do you have a specific graph you want to start with, or do you think it's easier to do them all together?
14:52:35 <karsten> regarding the database, can you use psql on henryi?
14:53:03 <tjr> That is a question for weasel or someone similar
14:53:17 <tjr> Fallback Directory Authorities Running is a trivial graph compared to any dirauth-related one
14:54:02 <karsten> how do you know which directories are fallback directories?
14:54:06 <karsten> from the tor sources?
14:54:30 <tjr> I ask stem and stem pulls them dynamically from source, I believe
14:54:51 <tjr> After I refactor the schema I'll make my own interactive graphs, and then just show you the python generation code, the python convert-to-csv code, and the d3 code; and you can decide if/when you want to adopt these. And if/when you do I'll help with the database schema, descriptions, and data format
14:54:52 <karsten> okay, that makes it slightly more difficult for us to port this to metrics.
14:55:29 <karsten> sounds like a fine plan.
14:55:52 <tjr> cool
14:56:20 <karsten> so, regarding graph choice, it might be best to start with data that is contained in votes and consensuses.
14:56:47 <karsten> because we also don't have a good process for adding new data sources to collector/metrics yet. ;)
14:57:00 <karsten> okay, shall we move on?
14:57:40 <tjr> fine by me
14:57:54 <karsten> great!
14:57:56 <karsten> hiro: hey
14:58:01 <hiro> hey
14:58:07 <karsten> shall we quickly talk about onionperf?
14:58:09 <hiro> sure
14:58:27 <karsten> okay. :) looks like the three op-?? are quite stable now.
14:58:31 <hiro> yes
14:58:39 <karsten> op-hk, op-nl, op-us.
14:58:55 <karsten> and we have 2-3 more in the queue, right?
14:59:01 <hiro> the ideal dev-ops part here would be to do some orchestration. I was starting to look into that before I took the time off to finish my phd
14:59:18 <karsten> (did you succeed? :))
14:59:32 <hiro> (yes submitted - it's basically over till the defence)
14:59:38 <karsten> yay!!
14:59:41 <hiro> I have to catch up with irl
14:59:51 <hiro> but I think the -ab instance is online already
15:00:04 <karsten> I read something about issues there.
15:00:06 <hiro> as the subdomain was created before my break
15:00:14 <karsten> I didn't look though.
15:00:22 <hiro> I will check the data, catch up with him and get back to you regarding that
15:00:26 <karsten> great!
15:00:30 <karsten> what about op-se?
15:00:38 <hiro> will update the ticket anyway
15:00:40 <karsten> we recently lost siv.
15:01:00 <karsten> which was the torperf instance running on the op-se host.
15:01:24 <hiro> yes, I have to catch up with ln5 too regarding that
15:01:45 <karsten> where "lost" means there was a problem that I didn't want to fix anymore because we were moving over to op-se anyway.
15:01:50 <karsten> so I took it out.
15:01:53 <karsten> ok.
15:01:54 <hiro> so I have also seen your ticket regarding the old tor-perf
15:02:11 <hiro> that will be retired right?
15:02:18 <karsten> all torperfs are retired by now.
15:02:25 <karsten> moria, siv, and torperf (ferrinii).
15:02:35 <hiro> ok got it
15:02:53 <hiro> also the onionperf.tpo is retired by now
15:02:53 <karsten> the last is phantomtrain.
15:02:57 <karsten> okay, good.
15:03:13 <karsten> I was wondering if we should leave phantomtrain out of collector/metrics and keep it as testing instance.
15:03:21 <karsten> I didn't talk to rob about this plan yet.
15:03:36 <hiro> testing for onionperf?
15:03:40 <karsten> yes.
15:03:46 <hiro> ok
15:03:53 <karsten> he'll surely want to test new client models etc.
15:04:21 <karsten> and that probably shouldn't happen on a production system.
15:04:21 <hiro> well, there is no problem creating test instances on greenhost
15:04:45 <karsten> also, we'd have to redo all the tarballs to include historic data from phantomtrain.
15:04:59 <hiro> if we just want to have a testing environment
15:05:10 <karsten> and that might be useful as well. but I think rob wants to test on his own machine.
15:05:12 <hiro> we can create an op-dev
15:05:15 <hiro> ah ok
15:05:41 <karsten> but I realize that we should have this discussion together with rob. so I'll move this to email and copy you, okay?
15:06:07 <karsten> by the way, note the "Server (beta)" option here: https://metrics.torproject.org/torperf.html
15:06:18 <karsten> we're now plotting onion server performance.
15:06:36 <karsten> it's still in beta, because it's not reviewed yet. but it exists.
15:07:03 <hiro> so neat
15:07:27 <karsten> hmm, maybe I need to look at the drop there: https://metrics.torproject.org/torperf.html?start=2017-02-03&end=2017-05-04&source=op-hk&server=onion&filesize=50kb
15:07:31 <hiro> I have to check hk instance.. I see no data there
15:07:39 <hiro> oh yes that's what I meant
15:08:07 <hiro> it's consistent
15:08:14 <hiro> I think we are missing some data
15:08:22 <karsten> yes. looks like an issue with the new onionperf module. beta...
15:08:34 <karsten> I'll look into that. probably a problem on the metrics-web side.
15:09:04 <karsten> alright. so much about onionperf for today?
15:09:45 <hiro> I think that's all about it
15:09:51 <karsten> great. thanks!
15:10:04 <hiro> thanks again
15:10:17 <karsten> Samdney: still here? :)
15:10:19 <karsten> https://trac.torproject.org/projects/tor/query?status=!closed&component=^Metrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=owner&col=type&col=priority&col=version&order=priority
15:10:20 <Samdney> yes
15:10:55 <karsten> https://trac.torproject.org/projects/tor/query?status=accepted&status=assigned&status=merge_ready&status=needs_information&status=needs_review&status=needs_revision&status=new&status=reopened&component=%5EMetrics%2Fmetrics-lib&group=milestone&col=id&col=summary&col=component&col=status&col=type&order=priority
15:11:05 <karsten> (different columns)
15:11:23 <karsten> how's your java.nio?
15:12:27 <Samdney> mmm, my last java code was a "long" time ago. :)
15:12:45 <Samdney> I'm learning fast :)
15:13:32 <Samdney> I think it should work.
15:13:47 <karsten> so, the easy part of the java.nio related ticket (or tickets?) is that you don't have to learn as much about metrics-lib before producing something useful.
15:14:31 <karsten> https://trac.torproject.org/projects/tor/ticket/17831
15:14:40 <Samdney> ah
15:14:42 <karsten> https://trac.torproject.org/projects/tor/ticket/21751
15:14:54 <karsten> not exactly java.nio but related to performance improvements.
15:15:29 <Samdney> ok, I will have a look at this in the next days and see what is best for me to start with
15:15:36 <Samdney> thank you for your suggestions
15:15:50 <karsten> I don't know if they're good suggestions.
15:16:15 <karsten> maybe take a look, don't sink too much time into them, and give me feedback on what you had really expected from an easy ticket?
15:16:53 <karsten> I'm just thinking that a ticket like #19640 might be more difficult if you're not as familiar with metrics-lib.
15:17:19 <Samdney> metrics-lib is ok. I spent some time with it :)
15:17:21 <karsten> so, maybe start with #21751 which comes with a simple patch.
15:17:24 * hiro knows why there are drops on op-hk
15:17:44 <karsten> (did I say simple?! I meant rudimentary!)
15:18:00 <Samdney> I will be afk in a few minutes. Will look at this tomorrow.
15:18:05 <karsten> Samdney: okay, cool. let me know how this goes!
15:18:08 <karsten> hiro: oh?
15:18:22 <Samdney> ok, maybe will send you an email :)
15:18:38 <karsten> sure!
15:19:08 <hiro> so we have "good" data since the 11th of April, before it was the testing fase when I was not understanding what was happening between routing the traffic and having time outs
15:19:26 <hiro> so that data was imported on collector but should be deleted
15:19:31 <karsten> oh.
15:19:36 <hiro> *phase
15:19:55 <karsten> I really wonder why that got imported...
15:19:58 <hiro> I saw iwakeh saying that. i.e. deleting the data
15:20:01 <karsten> good catch!
15:20:04 <karsten> yes, and I deleted it.
15:20:13 <hiro> on all of them?
15:20:18 <karsten> but maybe it was still on the metrics host..
15:20:23 <hiro> ahh i see
15:20:31 <karsten> should be easy to fix.
15:20:38 <hiro> yay
15:20:39 <karsten> thanks for spotting that!
15:21:03 <karsten> alright, are we done?
15:21:06 <hiro> yep
15:21:22 <karsten> perfect. thanks, and let's talk more next week! bye!
15:21:29 <Samdney> bye!
15:21:34 <karsten> #endmeeting