14:00:22 #startmeeting metrics team
14:00:22 Meeting started Thu Sep 15 14:00:22 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 hi iwakeh!
14:00:35 https://pad.riseup.net/p/3M7VyrTVgjlF
14:03:20 iwakeh: let me know when you're done with the agenda pad.
14:03:33 I think this is all now.
14:03:42 okay, cool, let's start.
14:03:45 * Shiny prototype (karsten)
14:03:52 https://tor-metrics.shinyapps.io/webstats/
14:03:53 looks really cool.
14:04:02 and it was really easy to write.
14:04:09 there are some open questions though.
14:04:21 how easy? just R?
14:04:32 right now, it's hosted by a third party, though we could host such a server ourselves.
14:04:40 but this seemed fine for the prototype.
14:04:42 just R.
14:04:46 is it available as source?
14:04:52 the shiny server?
14:05:00 ah, yes.
14:05:32 questions are?
14:05:41 https://www.rstudio.com/products/shiny/shiny-server/
14:05:59 I think the main question is requirements to clients.
14:06:10 javascript
14:06:18 this runs in Tor Browser, but I think only on medium-something level.
14:06:45 well, I had to enable scripts.
14:06:50 yes, javascript. so, that's the main question.
14:06:57 not one we can answer today.
14:07:16 but one we should answer before moving forward.
14:07:19 we could take a look at the source
14:07:35 to see whether we can work around that?
14:07:43 and decide then. They might use familiar tools in the background.
14:07:45 yes.
14:08:04 or to get an idea how to provide the service.
14:08:14 with other tools/servers.
14:08:37 wow, ok.
14:08:42 shall I add this to my list.
14:08:43 that would be quite the project though.
14:08:55 first just see what is used.
14:09:00 feel free to do that, just don't put it right on number 1. ;)
14:09:12 two and a half.
14:09:16 ;-)
14:09:17 heh
14:09:39 okay, I mainly put it out there as a way to look at this particular data set and to make some first experiences with the tool.
14:10:02 other questions: obtaining data.
14:10:09 ok?
14:10:26 this application comes with its own data exported from a database on my server, shipped with the application bundle.
14:10:43 what we'd like to do is fetch data from a server somewhere. but don't fetch it every single time.
14:10:50 or we might even want to use a database for this.
14:10:59 which might be possible if we use our own shiny server.
14:11:04 but I didn't look yet. prototype.
14:11:16 I keep these
14:11:27 questions in mind when looking at shiny.
14:11:44 cool! otherwise, I think it's powerful enough for the things we want to do.
14:11:55 yes, looks neat.
14:12:04 and it would be really cool not to have to develop all that stuff.
14:12:10 and instead focus on the R code.
14:12:19 let me quickly upload that to give you an idea:
14:12:37 cool.
14:13:02 http://paste.debian.net/823844/
14:13:06 two files.
14:13:20 well, plus about.html, but that only contains the description part.
14:14:01 less than 70 lines! nice.
14:14:15 yep!
14:14:42 okay, so much about shiny.
14:14:50 moving on?
14:14:55 sure.
14:15:00 * Bridge descriptors (karsten)
14:15:24 bridge descriptors until 2016-05 will be ready later today or tomorrow at the latest, depending on how fast this `xz -9e` finishes the rest.
14:15:33 how do we proceed?
14:15:50 with a parallel instance?
14:16:08 yes, with just the minimal patch to sanitize tcp ports?
14:16:33 did you try that?
14:17:03 well, I ran a version with that patch plus a few more to not run out of memory.
14:17:34 so, a hotfix release for these all?
14:18:05 hmmmmm
14:18:10 yes, we could do that.
14:18:22 branch?
14:18:37 oh, wait, those other patches are not ready to be merged yet.
14:18:44 just the sanitize tcp ports patch is.
14:18:49 ah, ok
14:19:04 but we only need those other patches to batch-process months and years of tarballs.
14:19:14 okay, let me prepare a branch for you to review.
14:19:21 fine.
14:19:47 then we release that, I set up a new instance and resume sanitizing descriptors from 2016-06 till today.
14:20:06 sounds like a good plan.
14:20:08 it might even be done before seattle.
14:20:13 :-)
14:20:32 here's something else: would you be able to verify the new tarballs and see if they contain the right descriptors?
14:20:37 well, some samples.
14:20:44 sure.
14:21:03 cool. how about I encrypt and upload a few months of descriptors?
14:21:05 You refer to the date issue?
14:21:11 date issue?
14:21:35 descriptors from a different month in tarball xy.
14:21:38 with descriptors being sorted into the wrong month tarball? no, that was only an issue with relay descriptors.
14:21:55 I'm just thinking of things I overlooked.
14:22:04 things I left in, or something.
14:22:24 so, basically I should look at the tar and find oddities?
14:22:31 yes! :)
14:22:49 keeping in mind that we used different secrets for sanitizing IP addresses than last time.
14:22:58 so, the 10.x.y.z addresses are all different.
14:23:28 I think, I didn't review the tars before.
14:23:41 Or did I?
14:23:42 that's perfect. no assumptions. :)
14:23:47 not sure, maybe not.
14:24:00 alright, let me upload something and you take a look.
14:24:08 and I prepare the branch for the hotfix release.
14:24:09 fine.
14:24:16 great! moving on?
14:24:21 yes.
14:24:24 * wiki changes according to planning and discussions in Berlin (iwakeh)
14:24:40 I updated a few pages; links follow
14:24:57 https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam#ReleasesandMilestones
14:25:20 https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Documentation
14:26:16 some formatting in faq and update to the road-map questions.
14:26:47 and a draft for the new volunteers page discussed in Berlin
14:26:50 https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Volunteers
14:26:55 yes, thanks for that!
14:27:26 idea: tag easy tasks with open timeframe as 'metrics-help'
14:27:46 so they can be listed on the volunteering page.
14:29:07 I think I could do something like that after seattle.
14:29:31 but this and next week (and during the seattle week) I'll be mostly distracted by other stuff.
14:29:42 And in Seattle maybe contribute this to the volunteer discussion?
14:29:57 The ideas behind the page.
14:30:02 yes, certainly!
14:30:41 I'll link the volunteer page from the other documentation.
14:31:04 sounds good!
14:31:26 the other changes were just updates; so next topic?
14:31:43 okay.
14:31:44 * logging for operators concerns collector, onionoo, ... (iwakeh)
14:32:11 I should have opened a new issue, but as the log-mailing was the reason for me to think about this I changed the
14:32:23 description of #20128
14:32:59 hmm? when did you change it?
14:33:00 operators will have to think about choices like
14:33:00 logging framework implementation
14:33:00 log-level settings
14:33:00 logging environment, e.g. path settings etc.
14:33:27 i.e. no more commits, because of the log level etc.
14:33:40 as in #20079
14:33:58 well, we'll still have to give them reasonable defaults.
14:34:31 which can simply be our choices.
14:34:39 reasonable pointers. Of course, no default trace setting.
14:34:46 yes. :)
14:34:51 What I want to get to is
14:35:13 to separate development and operation (even of the main instances).
14:35:27 I do not use the default log for example
14:35:32 that's why
14:35:47 my mirror wasn't affected by the trace setting.
14:36:15 operators could even prefer not to use logback, but some other slf4j implementation.
14:36:22 and still, for development, we should make reasonable choices when and how often to use which level, for example.
14:36:35 and be consistent between modules and products.
14:36:36 yes, that's true.
14:36:46 it was bad that the torperf module logged everything on trace.
14:36:53 err, maybe it even still does.
14:36:56 it is*.
14:37:06 i think so.
14:37:19 I'd be happy to change those things if we have reasonable guidelines.
14:37:26 but, the ticket basically wants
14:37:46 to reduce the comfort a little to enforce thinking.
14:38:00 on the operators' part.
14:38:12 hehe
14:38:17 to avoid blind usage of new log-settings.
14:38:20 :-)
14:38:32 and I
14:38:44 assume good operators with a decent
14:39:09 knowledge of linux/other operating system and java would prefer that.
14:39:22 I'm all for leaving choices.
14:39:35 forcing choices.
14:39:58 An operator will have to choose actively
14:40:11 what logging implementation and setting to use.
14:40:22 and, they can be sure it won't
14:40:27 well, what if they don't want to make choices about logging? we could just use "no logging" as default choice.
14:40:31 change on a simple jar update.
14:40:40 or "stdout logging".
14:40:59 no logging is default if no implementation is supplied.
14:41:09 It's an slf4j setting.
14:41:25 yes. I noticed that it puts out a warning and stays silent.
14:41:42 right. and we have the documentation
14:41:43 and we could even provide a jar to shut off that warning.
14:42:00 for our choice which is logback, at the moment.
14:42:03 I don't know whether that's the right choice.
14:42:06 default choice*.
14:42:31 I would provide the jars for logback as extra lib in the release.
14:42:48 The logging framework configuration should be decoupled from CollecTor, i.e.
14:42:48 remove default logback.xml from collector-.jar
14:42:48 add an example of logback.xml to src/main/resources
14:42:48 provide the two logback-{classic,core}.jars with a release, but remove them from collector-.jar
14:42:48 add more logging info to the operating guide
14:43:37 ah, you replaced the whole description?
14:43:43 That also for the first Onionoo release.
14:43:54 it doesn't say so in the diff.
14:43:57 yes, I did. should have been new ticket.
14:44:06 now I know.
14:45:23 okay. I'm not sure what's the right thing to do here. I should read the ticket in more detail and think about it.
14:45:37 yes :-)
14:45:48 regarding the mailing ..
14:46:38 it's just an example that works for me as the server's settings are right.
14:47:07 But, I'll respond with more detail to your comment in the ticket.
14:47:29 it would be good to learn something from this regarding how we should be using ERROR log messages in the code.
14:47:43 definitely!
14:47:44 it's one data point how people are using logging.
14:48:02 if you can provide some ideas for how we should be using logging, I'd be happy to adapt some code.
14:48:04 I hardly get ERRORs; the only one is
14:48:32 when I remove collector.properties. So, the level decision is a very important question.
14:48:39 agreed!
14:48:49 new ticket for levels?
14:48:54 yes, please.
14:49:03 here's something somewhat related:
14:49:10 but, not much more elaboration on the mailing as it is only logback?
14:49:28 should we think about packaging collector et al. in a more standard way, including start script?
14:49:48 I could supply a start script.
14:49:51 but
14:50:01 I'm just thinking that such a package might determine how we should be doing logging and configuration and so on.
14:50:17 and it would be sad to redo many things now and have to redo them again later.
14:50:38 the configuration should be the operator's task.
14:51:08 I think we might rather need a special small operating
14:51:18 git repo for the main collector instance?
14:51:37 for the config?
14:51:49 or run from git? (not sure what you mean)
14:51:53 well, the server admins use git for their configs.
14:52:06 right.
14:52:09 not run from git, but document your settings there.
14:52:14 oh, sure.
14:52:26 and that way separates operation.
14:52:31 but still, as packagers, we'd have to make the decision where the config file lives.
14:52:39 I can use a different setup on the mirror.
14:52:40 or where log files go.
14:52:51 or what the start script does.
14:53:10 well, ideally we could use a simple java-packager for first
14:53:22 install and ask questions on first setup.
14:53:55 well, maybe? I never used one of those.
14:53:56 this would write the first config according to the supplied answers.
14:54:11 ah, or maybe I did. I don't remember what tool exactly I used.
14:54:22 I tried packaging metrics-lib a while ago. well, years ago probably.
14:54:41 that doesn't need a setup
14:54:50 and I'm not saying we must do this now. I'm just thinking whether we should do that before changing many things regarding logging and configuration and so on.
14:54:59 no, that's why I used that. it was really easy. ;)
14:55:40 well, the ticket suggests de-coupling of operational setup from what we package.
14:56:19 first state all the operational questions in the operators guide
14:56:39 and later we have the option of moving the question into a packager.
14:56:50 ok!
14:56:57 with that in mind, this sounds like a good process.
14:57:14 and, no accidental full disks anymore!
14:57:21 hehe
14:57:33 trace logs did not fill disks entirely. not this time!
14:57:50 eventually
14:57:51 okay, I'll re-read that ticket.
14:58:00 the disk would have been filled ...
14:58:07 thanks.
14:58:08 yes, but I receive warnings from nagios now.
14:58:11 ;)
14:58:24 I have a script
14:58:39 for start, status and the like.
14:58:44 oh, nice.
14:58:59 could be added to resources?
14:59:04 sure!
14:59:59 alright.
15:00:00 that's all for this topic.
15:00:15 let's have another meeting next week before seattle?
15:00:31 is that ok timewise for you?
15:00:41 yes, sure.
15:00:46 flight is on sat.
15:00:52 fine.
15:01:14 and feel free to set priority of tickets to high if I should be looking sooner.
15:01:22 and during the seattle meeting?
15:01:37 you probably won't have time there?
15:02:07 I don't know if things need to be discussed then?
15:02:27 Topics that arise in Seattle?
15:02:29 7 am.
15:02:41 7am?
15:02:43 is that right?
15:02:56 14 utc = 7 pacific time.
15:03:13 ah, ok. meet later?
15:04:04 maybe 14 utc?
15:04:16 or, let's talk about that early in the seattle week via email.
15:04:25 that's fine.
15:04:33 by mail then.
15:04:34 okay, cool! but next meeting next week.
15:04:44 type to you then :-)
15:04:53 heh, thanks for coming. bye! :)
15:04:58 #endmeeting
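Note on the "14 utc = 7 pacific time" arithmetic near the end of the log: the sketch below checks that conversion, assuming Pacific Daylight Time (UTC-7) is in effect, which holds in late September. The date used and the helper name `utc_to_pdt` are illustrative assumptions, not something stated in the meeting.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time (UTC-7) applies, as it does in late September.
PDT = timezone(timedelta(hours=-7), "PDT")


def utc_to_pdt(when_utc):
    """Convert a timezone-aware UTC datetime to Pacific Daylight Time."""
    return when_utc.astimezone(PDT)


# Hypothetical meeting slot: 14:00 UTC on a date chosen only for illustration.
meeting_utc = datetime(2016, 9, 29, 14, 0, tzinfo=timezone.utc)
meeting_local = utc_to_pdt(meeting_utc)

print(meeting_local.strftime("%H:%M %Z"))  # prints "07:00 PDT"
```

A fixed offset keeps the sketch free of any time zone database; for dates near a daylight-saving switch, a real zone lookup would be the safer choice.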