14:00:22 <karsten> #startmeeting metrics team
14:00:22 <MeetBot> Meeting started Thu Sep 15 14:00:22 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:26 <karsten> hi iwakeh!
14:00:35 <karsten> https://pad.riseup.net/p/3M7VyrTVgjlF
14:03:20 <karsten> iwakeh: let me know when you're done with the agenda pad.
14:03:33 <iwakeh> I think this is all now.
14:03:42 <karsten> okay, cool, let's start.
14:03:45 <karsten> * Shiny prototype (karsten)
14:03:52 <karsten> https://tor-metrics.shinyapps.io/webstats/
14:03:53 <iwakeh> looks really cool.
14:04:02 <karsten> and it was really easy to write.
14:04:09 <karsten> there are some open questions though.
14:04:21 <iwakeh> how easy? just R?
14:04:32 <karsten> right now, it's hosted by a third party, though we could host such a server ourselves.
14:04:40 <karsten> but this seemed fine for the prototype.
14:04:42 <karsten> just R.
14:04:46 <iwakeh> is it available as source?
14:04:52 <iwakeh> the shiny server?
14:05:00 <karsten> ah, yes.
14:05:32 <iwakeh> questions are?
14:05:41 <karsten> https://www.rstudio.com/products/shiny/shiny-server/
14:05:59 <karsten> I think the main question is requirements to clients.
14:06:10 <iwakeh> javascript
14:06:18 <karsten> this runs in Tor Browser, but I think only on medium-something level.
14:06:45 <iwakeh> well, I had to enable scripts.
14:06:50 <karsten> yes, javascript. so, that's the main question.
14:06:57 <karsten> not one we can answer today.
14:07:16 <karsten> but one we should answer before moving forward.
14:07:19 <iwakeh> we could take a look at the source
14:07:35 <karsten> to see whether we can work around that?
14:07:43 <iwakeh> and decide then. They might use familiar tools in the background.
14:07:45 <iwakeh> yes.
14:08:04 <iwakeh> or to get an idea how to provide the service.
14:08:14 <iwakeh> with other tools/servers.
14:08:37 <karsten> wow, ok.
14:08:42 <iwakeh> shall I add this to my list.
14:08:43 <karsten> that would be quite the project though.
14:08:55 <iwakeh> first just see what is used.
14:09:00 <karsten> feel free to do that, just don't put it right on number 1. ;)
14:09:12 <iwakeh> two and a half.
14:09:16 <iwakeh> ;-)
14:09:17 <karsten> heh
14:09:39 <karsten> okay, I mainly put it out there as a way to look at this particular data set and to make some first experiences with the tool.
14:10:02 <karsten> other questions: obtaining data.
14:10:09 <iwakeh> ok?
14:10:26 <karsten> this application comes with its own data exported from a database on my server, shipped with the application bundle.
14:10:43 <karsten> what we'd like to do is fetch data from a server somewhere. but don't fetch it every single time.
14:10:50 <karsten> or we might even want to use a database for this.
14:10:59 <karsten> which might be possible if we use our own shiny server.
14:11:04 <karsten> but I didn't look yet. prototype.
14:11:16 <iwakeh> I keep these
14:11:27 <iwakeh> questions in mind when looking at shiny.
14:11:44 <karsten> cool! otherwise, I think it's powerful enough for the things we want to do.
14:11:55 <iwakeh> yes, looks neat.
14:12:04 <karsten> and it would be really cool not to have to develop all that stuff.
14:12:10 <karsten> and instead focus on the R code.
14:12:19 <karsten> let me quickly upload that to give you an idea:
14:12:37 <iwakeh> cool.
14:13:02 <karsten> http://paste.debian.net/823844/
14:13:06 <karsten> two files.
14:13:20 <karsten> well, plus about.html, but that only contains the description part.
14:14:01 <iwakeh> less than 70 lines! nice.
14:14:15 <karsten> yep!
14:14:42 <karsten> okay, so much about shiny.
14:14:50 <karsten> moving on?
14:14:55 <iwakeh> sure.
14:15:00 <karsten> * Bridge descriptors (karsten)
14:15:24 <karsten> bridge descriptors until 2016-05 will be ready later today or tomorrow at the latest, depending on how fast this `xz -9e` finishes the rest.
14:15:33 <karsten> how do we proceed?
14:15:50 <iwakeh> with a parallel instance?
14:16:08 <karsten> yes, with just the minimal patch to sanitize tcp ports?
14:16:33 <iwakeh> did you try that?
14:17:03 <karsten> well, I ran a version with that patch plus a few more to not run out of memory.
14:17:34 <iwakeh> so, a hotfix release for these all?
14:18:05 <karsten> hmmmmm
14:18:10 <karsten> yes, we could do that.
14:18:22 <iwakeh> branch?
14:18:37 <karsten> oh, wait, those other patches are not ready to be merged yet.
14:18:44 <karsten> just the sanitize tcp ports patch is.
14:18:49 <iwakeh> ah, ok
14:19:04 <karsten> but we only need those other patches to batch-process months and years of tarballs.
14:19:14 <karsten> okay, let me prepare a branch for you to review.
14:19:21 <iwakeh> fine.
14:19:47 <karsten> then we release that, I set up a new instance and resume sanitizing descriptors from 2016-06 till today.
14:20:06 <iwakeh> sounds like a good plan.
14:20:08 <karsten> it might even be done before seattle.
14:20:13 <iwakeh> :-)
14:20:32 <karsten> here's something else: would you be able to verify the new tarballs and see if they contain the right descriptors?
14:20:37 <karsten> well, some samples.
14:20:44 <iwakeh> sure.
14:21:03 <karsten> cool. how about I encrypt and upload a few months of descriptors?
14:21:05 <iwakeh> You refer to the date issue?
14:21:11 <karsten> date issue?
14:21:35 <iwakeh> descriptors from a different month in tarball xy.
14:21:38 <karsten> with descriptors being sorted into the wrong month tarball? no, that was only an issue with relay descriptors.
14:21:55 <karsten> I'm just thinking of things I overlooked.
14:22:04 <karsten> things I left in, or something.
14:22:24 <iwakeh> so, basically I should look at the tar and find oddities?
14:22:31 <karsten> yes! :)
14:22:49 <karsten> keeping in mind that we used different secrets for sanitizing IP addresses than last time.
14:22:58 <karsten> so, the 10.x.y.z addresses are all different.
14:23:28 <iwakeh> I think, I didn't review the tars before.
14:23:41 <iwakeh> Or did I?
14:23:42 <karsten> that's perfect. no assumptions. :)
14:23:47 <karsten> not sure, maybe not.
14:24:00 <karsten> alright, let me upload something and you take a look.
14:24:08 <karsten> and I prepare the branch for the hotfix release.
14:24:09 <iwakeh> fine.
14:24:16 <karsten> great! moving on?
14:24:21 <iwakeh> yes.
14:24:24 <karsten> * wiki changes according to planning and discussions in Berlin (iwakeh)
14:24:40 <iwakeh> I updated a few pages links follow
14:24:57 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam#ReleasesandMilestones
14:25:20 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Documentation
14:26:16 <iwakeh> some formatting in faq and update to the road-map questions.
14:26:47 <iwakeh> and a draft for the new volunteers page discussed in Berlin
14:26:50 <iwakeh> https://trac.torproject.org/projects/tor/wiki/org/teams/MetricsTeam/Volunteers
14:26:55 <karsten> yes, thanks for that!
14:27:26 <iwakeh> idea: tag easy tasks with open timeframe as 'metrics-help'
14:27:46 <iwakeh> so they can be listed on the volunteering page.
14:29:07 <karsten> I think I could do something like that after seattle.
14:29:31 <karsten> but this and next week (and during the seattle week) I'll be mostly distracted by other stuff.
14:29:42 <iwakeh> And in Seattle maybe contribute this to the volunteer discussion?
14:29:57 <iwakeh> The ideas behind the page.
14:30:02 <karsten> yes, certainly!
14:30:41 <iwakeh> I'll link the volunteer page from the other documentation.
14:31:04 <karsten> sounds good!
14:31:26 <iwakeh> the other changes were just updates; so next topic?
14:31:43 <karsten> okay.
14:31:44 <karsten> * logging for operators concerns collector, onionoo, ... (iwakeh)
14:32:11 <iwakeh> I should have opened a new issue, but as the log-mailing was the reason for me to think about this I changed the
14:32:23 <iwakeh> description of #20128
14:32:59 <karsten> hmm? when did you change it?
14:33:00 <iwakeh> operators will have to think about choices like
14:33:00 <iwakeh> logging framework implementation
14:33:00 <iwakeh> log-level settings
14:33:00 <iwakeh> logging environment, e.g. path settings etc.
14:33:27 <iwakeh> i.e. no more commits, because of the log level etc.
14:33:40 <iwakeh> as in #20079
14:33:58 <karsten> well, we'll still have to give them reasonable defaults.
14:34:31 <karsten> which can simply be our choices.
14:34:39 <iwakeh> reasonable pointers. Of course, no default trace setting.
14:34:46 <karsten> yes. :)
14:34:51 <iwakeh> What I want to get to is
14:35:13 <iwakeh> to separate development and operation (even of the main instances).
14:35:27 <iwakeh> I do not use the default log for example
14:35:32 <iwakeh> that's why
14:35:47 <iwakeh> my mirror wasn't affected by the trace setting.
14:36:15 <iwakeh> operators could even prefer not to use logback, but some other slf4j implementation.
14:36:22 <karsten> and still, for development, we should make reasonable choices when and how often to use which level, for example.
14:36:35 <karsten> and be consistent between modules and products.
14:36:36 <iwakeh> yes, that's true.
14:36:46 <karsten> it was bad that the torperf module logged everything on trace.
14:36:53 <karsten> err, maybe it even still does.
14:36:56 <karsten> it is*.
14:37:06 <iwakeh> i think so.
14:37:19 <karsten> I'd be happy to change those things if we have reasonable guidelines.
14:37:26 <iwakeh> but, the ticket basically wants
14:37:46 <iwakeh> to reduce the comfort a little to enforce thinking.
14:38:00 <iwakeh> on the operators part.
14:38:12 <karsten> hehe
14:38:17 <iwakeh> to avoid blind usage of new log-settings.
14:38:20 <iwakeh> :-)
14:38:32 <iwakeh> and I
14:38:44 <iwakeh> assume good operators with a decent
14:39:09 <iwakeh> knowledge of linux/other operating system and java would prefer that.
14:39:22 <karsten> I'm all for leaving choices.
14:39:35 <iwakeh> forcing choices.
14:39:58 <iwakeh> An operator will have to choose actively
14:40:11 <iwakeh> what logging implementation and setting to use.
14:40:22 <iwakeh> and, they can be sure it won't
14:40:27 <karsten> well, what if they don't want to make choices about logging? we could just use "no logging" as default choice.
14:40:31 <iwakeh> change on a simple jar update.
14:40:40 <karsten> or "stdout logging".
14:40:59 <iwakeh> no logging is default if no implementation is supplied.
14:41:09 <iwakeh> It's slf4j setting.
14:41:25 <karsten> yes. I noticed that it puts out a warning and stays silent.
14:41:42 <iwakeh> right. and we have the documentation
14:41:43 <karsten> and we could even provide a jar to shut off that warning.
14:42:00 <iwakeh> for our choice which is logback, at the moment.
14:42:03 <karsten> I don't know whether that's the right choice.
14:42:06 <karsten> default choice*.
14:42:31 <iwakeh> I would provide the jars for logback as extra lib in the release.
14:42:48 <iwakeh> The logging framework configuration should be decoupled from CollecTor, i.e.
14:42:48 <iwakeh> remove default logback.xml from collector-<version>.jar
14:42:48 <iwakeh> add an example of logback.xml to src/main/resources
14:42:48 <iwakeh> provide the two logback-{classic,core}.jars with a release, but remove them from collector-<version>.jar
14:42:48 <iwakeh> add more logging info to the operating guide
14:43:37 <karsten> ah, you replaced the whole description?
14:43:43 <iwakeh> That also for the first Onionoo release.
14:43:54 <karsten> it doesn't say so in the diff.
14:43:57 <iwakeh> yes, I did. should have been new ticket.
14:44:06 <iwakeh> now I know.
14:45:23 <karsten> okay. I'm not sure what's the right thing to do here. I should read the ticket in more detail and think about it.
14:45:37 <iwakeh> yes :-)
14:45:48 <iwakeh> regarding the mailing ..
14:46:38 <iwakeh> it's just an example that works for me as the server's settings are right.
14:47:07 <iwakeh> But, I'll respond with more detail to your comment in the ticket.
14:47:29 <karsten> it would be good to learn something from this regarding how we should be using ERROR log messages in the code.
14:47:43 <iwakeh> definitely!
14:47:44 <karsten> it's one data point how people are using logging.
14:48:02 <karsten> if you can provide some ideas for how we should be using logging, I'd be happy to adapt some code.
14:48:04 <iwakeh> I hardly get ERRORs the only one is
14:48:32 <iwakeh> when I remove collector.properties. So, the level decision is a very important question.
14:48:39 <karsten> agreed!
14:48:49 <iwakeh> new ticket for levels?
14:48:54 <karsten> yes, please.
14:49:03 <karsten> here's something somewhat related:
14:49:10 <iwakeh> but, not much more elaboration on the mailing as it is only logback?
14:49:28 <karsten> should we think about packaging collector et al. in a more standard way, including start script?
14:49:48 <iwakeh> I could supply a start script.
14:49:51 <iwakeh> but
14:50:01 <karsten> I'm just thinking that such a package might determine how we should be doing logging and configuration and so on.
14:50:17 <karsten> and it would be sad to redo many things now and have to redo them again later.
14:50:38 <iwakeh> the configuration should be the operator's task.
14:51:08 <iwakeh> I think we might rather need a special small operating
14:51:18 <iwakeh> git repo for the main collector instance?
14:51:37 <karsten> for the config?
14:51:49 <karsten> or run from git? (not sure what you mean)
14:51:53 <iwakeh> well, the server admins use git for their configs.
14:52:06 <karsten> right.
14:52:09 <iwakeh> not run from git, but document your settings there.
14:52:14 <karsten> oh, sure.
14:52:26 <iwakeh> and that way separates operation.
14:52:31 <karsten> but still, as packagers, we'd have to make the decision where the config file lives.
14:52:39 <iwakeh> I can use a different setup on the mirror.
14:52:40 <karsten> or where log files go.
14:52:51 <karsten> or what the start script does.
14:53:10 <iwakeh> well, ideally we could use a simple java-packager for first
14:53:22 <iwakeh> install and ask questions on first setup.
14:53:55 <karsten> well, maybe? I never used one of those.
14:53:56 <iwakeh> this would write the first config according to the supplied answers.
14:54:11 <karsten> ah, or maybe I did. I don't remember what tool exactly I used.
14:54:22 <karsten> I tried packaging metrics-lib a while ago. well, years ago probably.
14:54:41 <iwakeh> that doesn't need a setup
14:54:50 <karsten> and I'm not saying we must do this now. I'm just thinking whether we should do that before changing many things regarding logging and configuration and so on.
14:54:59 <karsten> no, that's why I used that. it was really easy. ;)
14:55:40 <iwakeh> well, the ticket suggests de-coupling of operational setup from what we package.
14:56:19 <iwakeh> first state all the operational questions in the operators guide
14:56:39 <iwakeh> and later we have the option of moving the question into a packager.
14:56:50 <karsten> ok!
14:56:57 <karsten> with that in mind, this sounds like a good process.
14:57:14 <iwakeh> and, no accidental full disks anymore!
14:57:21 <karsten> hehe
14:57:33 <karsten> trace logs did not fill disks entirely. not this time!
14:57:50 <iwakeh> eventually
14:57:51 <karsten> okay, I'll re-read that ticket.
14:58:00 <iwakeh> the disk would have been filled ...
14:58:07 <iwakeh> thanks.
14:58:08 <karsten> yes, but I receive warnings from nagios now.
14:58:11 <karsten> ;)
14:58:24 <iwakeh> I have a script
14:58:39 <iwakeh> for start, status and the like.
14:58:44 <karsten> oh, nice.
14:58:59 <iwakeh> could be added to resources?
14:59:04 <karsten> sure!
14:59:59 <karsten> alright.
15:00:00 <iwakeh> that's all for this topic.
15:00:15 <karsten> let's have another meeting next week before seattle?
15:00:31 <iwakeh> is that ok timewise for you?
15:00:41 <karsten> yes, sure.
15:00:46 <karsten> flight is on sat.
15:00:52 <iwakeh> fine.
15:01:14 <karsten> and feel free to set priority of tickets to high if I should be looking sooner.
15:01:22 <iwakeh> and during the seattle meeting?
15:01:37 <iwakeh> you probably won't have time there?
15:02:07 <iwakeh> I don't know if things need to be discussed then?
15:02:27 <iwakeh> Topics that arise in Seattle?
15:02:29 <karsten> 7 am.
15:02:41 <iwakeh> 7am?
15:02:43 <karsten> is that right?
15:02:56 <karsten> 14 utc = 7 pacific time.
15:03:13 <iwakeh> ah, ok. meet later?
15:04:04 <karsten> maybe 14 utc?
15:04:16 <karsten> or, let's talk about that early in the seattle week via email.
15:04:25 <iwakeh> that's fine.
15:04:33 <iwakeh> by mail then.
15:04:34 <karsten> okay, cool! but next meeting next week.
15:04:44 <iwakeh> type to you then :-)
15:04:53 <karsten> heh, thanks for coming. bye! :)
15:04:58 <karsten> #endmeeting