13:59:57 <karsten> #startmeeting metrics team
13:59:57 <MeetBot> Meeting started Thu Jul 28 13:59:57 2016 UTC. The chair is karsten. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:57 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:07 <karsten> it's meeting time. who's here for the metrics team meeting?
14:00:32 * karsten already saw iwakeh
14:00:40 <iwakeh> right :-)
14:00:45 * qbi lurks.
14:00:48 <karsten> hi qbi!
14:01:06 * karsten finds the pad..
14:01:33 <karsten> https://pad.riseup.net/p/zUNzEIFRq5S4
14:03:16 <karsten> okay.
14:03:21 <karsten> * Bridge descriptor sanitizer (karsten)
14:03:22 <iwakeh> ok.
14:03:39 <karsten> I spent the last 20 days (well, it felt like 20) writing tests.
14:03:45 <karsten> and spotted many bugs.
14:03:49 <iwakeh> hihi
14:03:58 <iwakeh> good.
14:04:08 <iwakeh> one question
14:04:11 <karsten> I also found out that the batch process that re-processes archives broke.
14:04:16 <karsten> after 13 of 28 or so days.
14:04:20 <karsten> out of memory.
14:04:23 <karsten> sure, what's the question?
14:04:24 <iwakeh> oh no.
14:04:43 <iwakeh> these tests having a TODO; do they fail already?
14:04:44 <karsten> I have an idea what the reason could be. I don't have a good fix though.
14:04:51 <karsten> no, I changed them all to pass.
14:04:57 <karsten> and to fail once we fix things.
14:05:14 <iwakeh> maybe, fix before refactoring?
14:05:25 <karsten> sure!
14:05:40 <iwakeh> but, the batch ...
14:05:48 <karsten> should I fix them, or would you want to look into that?
14:06:00 <iwakeh> that's topic 2
14:06:04 <iwakeh> planning.
14:06:08 <karsten> ok :)
14:06:14 <karsten> yes, the batch.
14:06:26 <karsten> so, we're keeping a data structure of all file digests we're processing.
14:06:31 <karsten> to avoid processing them again.
14:06:38 <karsten> and that data structure grows and grows.
14:06:46 <karsten> apparently, after 13 days, it grew too much.
14:06:49 <iwakeh> needs bigger hw.
14:07:01 <karsten> well, maybe.
14:07:03 <iwakeh> how much RAM?
14:07:09 <karsten> 8g
14:07:31 <karsten> hmm, I wonder if that old mac mini can handle more.
14:07:42 <iwakeh> ok, that was for which amount of files processed?
14:07:59 <karsten> 600g, I think.
14:08:12 <iwakeh> which is ?
14:08:17 <iwakeh> half ?
14:08:21 <karsten> ah!
14:08:26 <karsten> well, 40% or so.
14:08:37 <karsten> 240g.
14:08:44 <karsten> 600g is the total size.
14:08:47 <iwakeh> well, I could offer 32G ram.
14:09:00 <iwakeh> but I'd have to download ...
14:09:18 <karsten> right. and I'd want to keep these archives offline.
14:09:32 <iwakeh> yes. well
14:09:41 <karsten> so, I just restarted the batch where it stopped.
14:09:44 <iwakeh> then we need to improve the processing
14:09:53 <karsten> in theory, it'll break again at 80%, and then it will run through.
14:09:58 <iwakeh> won't it reprocess?
14:10:11 <karsten> nope. I moved the old files away.
14:10:32 <iwakeh> why not let it chew on smaller chunks?
14:10:35 <karsten> improving the processing would also be my favored solution.
14:11:00 <karsten> well, I could have moved the last 20-30% away, too. true.
14:11:03 <iwakeh> such a reprocessing might come up again?
14:11:11 <karsten> right.
14:11:20 <iwakeh> new ticket?
14:11:34 <karsten> so, my plan was to use an LRU cache instead of keeping all digest.
14:11:47 <karsten> but that's also just my guess that it's this data structure. I don't know for sure.
14:12:01 <karsten> I had jvisualvm running, but that broke after 90 hours for some other reason.
14:12:07 <karsten> new ticket sounds good.
14:12:08 <iwakeh> the processed ones could be stored in a simple db, too.
14:12:49 <karsten> well, switching to a db sounds like a bigger change.
14:13:08 <karsten> which also crossed my mind: fix all the bugs now, do the reprocessing afterwards.
14:13:19 <iwakeh> yes?
14:13:35 <iwakeh> you mean the bugs
14:13:44 <iwakeh> found in the refactoring part?
14:13:47 <iwakeh> and
14:13:48 <karsten> yep.
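The LRU cache idea raised at 14:11:34 can be sketched as follows. This is a minimal illustration, not CollecTor code: the class name `RecentDigests`, its methods, and the capacity are hypothetical, and it assumes digests are handled as strings. It relies only on `java.util.LinkedHashMap` in access order with `removeEldestEntry` overridden, so memory use stays bounded no matter how many files are processed.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical bounded set of recently seen file digests (illustration only). */
class RecentDigests {

  private final Set<String> digests;

  RecentDigests(final int maxEntries) {
    // accessOrder=true keeps the least recently used entry at the head,
    // and removeEldestEntry drops it once the capacity is exceeded.
    this.digests = Collections.newSetFromMap(
        new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(
              Map.Entry<String, Boolean> eldest) {
            return size() > maxEntries;
          }
        });
  }

  /** Returns true if the digest was not cached, i.e. the file still needs processing. */
  boolean markSeen(String digest) {
    return this.digests.add(digest);
  }

  /** Returns the number of digests currently held. */
  int size() {
    return this.digests.size();
  }

  public static void main(String[] args) {
    RecentDigests recent = new RecentDigests(2);
    System.out.println(recent.markSeen("digest-a")); // new entry
    System.out.println(recent.markSeen("digest-a")); // already cached
    recent.markSeen("digest-b");
    recent.markSeen("digest-c"); // evicts the least recently used entry
    System.out.println(recent.size());
  }
}
```

The trade-off is that an evicted digest is forgotten, so a file seen again after eviction would be processed a second time; the capacity would have to be chosen with that in mind. As noted in the discussion, it is also only a guess that this data structure causes the out-of-memory error.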
14:13:53 <iwakeh> ok
14:14:04 <karsten> I don't think they were ever triggered, because tonga was always nice enough not to give us bad data.
14:14:13 <karsten> still, would be good to fix them.
14:14:19 <iwakeh> yes, if reprocessing can wait a little.
14:14:28 <karsten> yes, a week or two.
14:14:42 <iwakeh> then that should be done.
14:14:59 <karsten> alright. let me create that ticket for the out-of-memory problem later today.
14:15:06 <iwakeh> fine.
14:15:16 <karsten> ok.
14:15:33 <karsten> moving to the next topic?
14:15:42 <iwakeh> ok
14:15:47 <karsten> * CollecTor planning (iwakeh)
14:16:11 <iwakeh> well, we have milestones(ms) for
14:16:24 <iwakeh> the collector (ct) release
14:16:38 <iwakeh> I'm wondering when to put out the
14:16:43 <iwakeh> first ct release.
14:16:47 <iwakeh> I'd like
14:17:03 <iwakeh> to have that soon when all the 1.0.0 ms tickets are done.
14:17:25 <iwakeh> https://trac.torproject.org/projects/tor/query?milestone=CollecTor+1.0.0&group=status&order=priority
14:17:56 <iwakeh> #18865 will be ready for review today
14:18:15 <iwakeh> and #19169 could rather be moved to ms 110
14:18:29 <karsten> I'm not sure if we can add #19317 before we add #19755.
14:18:44 <karsten> still, having #19317 in 1.0.0 seems useful.
14:19:13 <iwakeh> move it to 110
14:19:14 <iwakeh> ?
14:19:40 <iwakeh> add release 101?
14:20:19 <karsten> so, if we assume that reprocessing bridges will take another few weeks,
14:20:25 <karsten> do you think 1.1.0 would be out by then?
14:20:54 <iwakeh> depends, what we assign to ms 1.0.x
14:20:55 <karsten> what was your idea for releasing 1.0.0?
14:21:30 <iwakeh> good question.
14:22:04 <iwakeh> just noticing that
14:22:25 <iwakeh> there is a ticket missing for the release process
14:22:37 <iwakeh> the signing uploading whatever needs to done.
14:22:42 <karsten> right.
14:23:32 <iwakeh> before 10th of Aug?
14:23:46 <karsten> so, #2966 needs more discussion before being included in 1.1.0.
14:23:59 <karsten> I'd say unassign from that milestone.
14:24:06 <iwakeh> ok
14:24:23 <karsten> and #19317 goes to 1.1.0?
14:24:42 <iwakeh> isn't done?
14:24:43 <karsten> would it make sense to move #19720 back to 1.0.0?
14:24:53 <karsten> ah, I didn't reload.
14:25:09 <karsten> not done yet, should I move it?
14:25:38 <iwakeh> ok.
14:26:43 <iwakeh> we can have a 1.0.x for the fixes.
14:26:56 <karsten> sure.
14:27:58 <karsten> removed #2966 from milestone.
14:28:15 <iwakeh> so, have priority on the ms 100 tickets?
14:28:31 <iwakeh> I think I work on these mostly.
14:29:21 <karsten> okay, so there are three tickets left?
14:29:39 <karsten> can I add a fourth? :)
14:29:40 <iwakeh> four, if we move the runtime configuration change ticket.
14:29:49 <karsten> which one?
14:29:50 <iwakeh> sure.
14:29:58 <iwakeh> you just named it
14:30:22 <iwakeh> #19720
14:30:41 <karsten> ok. should I move it?
14:30:49 <iwakeh> done.
14:30:54 <karsten> ok.
14:31:13 <iwakeh> i can also add the ant tasks
14:31:27 <iwakeh> for pmd&findbugs this week.
14:31:52 <karsten> hmm, but we wouldn't fix any of those issues before the 10th, right?
14:31:59 <karsten> well, s/any/many/
14:32:02 <iwakeh> you're right.
14:32:30 <karsten> my fourth (now fifth or sixth) ticket would be about improving the scheduler a bit.
14:32:38 <karsten> things like:
14:32:39 <iwakeh> how?
14:32:39 <karsten> undo path changes (everything under out/)
14:32:39 <karsten> make recent/ truly configurable
14:32:39 <karsten> start at 00:00.000 of configured minute, not x minutes from current time
14:32:39 <karsten> add mode to run once immediately
14:32:55 <karsten> things that came up while testing today.
14:33:26 <iwakeh> x minutes from current?
14:33:43 <karsten> here's what I did:
14:33:46 <iwakeh> well, just add tickets for these :-)
14:33:58 <karsten> I edited collector.properties to contain the next minute, like 35.
14:34:09 <karsten> then I started the process at, say, 34:15.
14:34:16 <karsten> and it would start at 35:15.
14:34:23 <karsten> when it should ideally start at 35:00.
14:34:32 <iwakeh> ah, that's interesting.
14:34:38 <karsten> but yes, I can be even more verbose than those four lines in the ticket. ;)
14:34:42 <iwakeh> period was 60 I suppose?
14:34:58 <iwakeh> good ;-)
14:35:03 <karsten> hmm, no, 10.
14:35:23 <iwakeh> oh?
14:35:42 <karsten> but minutes.
14:35:57 <iwakeh> yes, that need clarification in a ticket ...
14:36:00 <karsten> :)
14:36:11 <iwakeh> :-)
14:36:26 <karsten> are you going to create a ticket for the release?
14:36:36 <iwakeh> yes.
14:36:42 <karsten> I usually follow the instructions for releasing metrics-lib line by line.
14:36:50 <iwakeh> ok.
14:37:22 <karsten> okay, I think that's a good plan for 1.0.0 then.
14:37:29 <iwakeh> right.
14:37:34 <iwakeh> will you begin the
14:37:39 <karsten> let's make a plan for 1.0.1 or 1.1.0 after that.
14:37:44 <iwakeh> sanitizer bugfixes?
14:37:54 <iwakeh> sure.
14:38:19 <karsten> yes, happy to.
14:38:24 <iwakeh> I'd like to make a suggestion for that test class.
14:38:26 <karsten> should I also fix findbugs/pmd issues?
14:38:31 <karsten> please do!
14:38:40 <iwakeh> hmm
14:39:07 <iwakeh> the one-liners, anything else might be a real big change.
14:39:07 <karsten> I don't have to.
14:39:16 <karsten> okay.
14:39:22 <karsten> what's the suggestion?
14:39:39 <iwakeh> the things that are really small.
14:39:44 <iwakeh> and of course
14:39:55 <iwakeh> the potential null dereferences and the like.
14:40:18 <iwakeh> these can be done while working on the functional errors.
14:40:32 <iwakeh> i.e. the TODOs you identified.
14:41:00 <karsten> right.
14:41:05 <karsten> some things need more thoughts.
14:41:10 <karsten> like removing System.gc(); ...
14:41:21 <iwakeh> true, we do not
14:41:27 <karsten> I mean, in theory I agree that it shouldn't have to be there.
14:41:35 <karsten> but then it's there because we ran out of memory before.
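The scheduler start-time issue described between 14:33:58 and 14:35:42 (configured minute 35, period of 10 minutes, process started at 34:15, first run at 35:15 instead of 35:00) comes down to how the initial delay is computed. The following is a rough sketch under stated assumptions, not the CollecTor scheduler: the class `AlignedStart` and its method are hypothetical, the offset is taken to be a minute of the hour, and the period is assumed to divide 60 minutes evenly so that runs fall on the same minutes every hour.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that aligns the first run to second 00.000 (illustration only). */
class AlignedStart {

  /**
   * Returns the milliseconds to wait from nowMillis (epoch time) until the
   * next instant that lies at offsetMinutes plus a multiple of periodMinutes,
   * measured at second 00.000 of that minute.
   */
  static long initialDelayMillis(long nowMillis, int offsetMinutes,
      int periodMinutes) {
    long period = TimeUnit.MINUTES.toMillis(periodMinutes);
    long offset = TimeUnit.MINUTES.toMillis(offsetMinutes) % period;
    // Time elapsed since the most recent aligned slot.
    long sinceSlot = Math.floorMod(nowMillis - offset, period);
    return sinceSlot == 0L ? 0L : period - sinceSlot;
  }

  public static void main(String[] args) {
    // 34 minutes and 15 seconds past a full hour, as in the example above.
    long now = TimeUnit.MINUTES.toMillis(34) + 15_000L;
    System.out.println(initialDelayMillis(now, 35, 10)); // 45000, i.e. 35:00
  }
}
```

The result could then be passed as the `initialDelay` argument of `ScheduledExecutorService.scheduleAtFixedRate`, with the period in milliseconds, instead of using the period itself as the initial delay.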
14:41:46 <karsten> so maybe we should look what happened and if it still happens.
14:41:52 <iwakeh> need to change some of the rules or toss one or the other.
14:41:57 <karsten> not following findbugs suggestions blindly. ;)
14:42:00 <karsten> right.
14:42:03 <iwakeh> right.
14:42:09 <karsten> what's the suggestion about the test class?
14:42:31 <iwakeh> Configuration needs just an InputStream
14:42:34 <karsten> to be clear, I'd want to make that class better. the goal is not just to increase coverage.
14:42:37 <iwakeh> which can come from a String.
14:42:50 <karsten> the goal is also to write better test classes for other code bases.
14:42:58 <iwakeh> it's about simplifying the test class.
14:43:00 <karsten> hmm, didn't I fix that?
14:43:21 <iwakeh> I didn't have time to look at the class before this meeting.
14:43:33 <iwakeh> so maybe.
14:43:36 <iwakeh> :-)
14:43:50 <karsten> hmm, maybe I fixed it a bit but could fix it even more.
14:44:00 <karsten> so, yes, we should simplify the test class as much as possible.
14:44:16 <iwakeh> It'll also make the test a little more readable.
14:44:51 <iwakeh> maybe rename runTest to prepareTest?
14:44:53 <karsten> I'm conflicted how much code to write that's not actually tests.
14:45:07 <iwakeh> what do you mean?
14:45:08 <karsten> just to make tests more readable.
14:45:26 <iwakeh> more readable means shorter.
14:45:27 <karsten> well, right now, the first @Test annotation comes in line 515. :)
14:45:54 <iwakeh> maybe, I should look at the new version of the test and talk then?
14:46:03 <iwakeh> or write.
14:46:14 <karsten> and I can see us simplifying things even more, but at the cost of the first @Test being in line 700 or 800.
14:46:24 <iwakeh> oh no.
14:46:29 <karsten> yes, I'd very much appreciate your review here.
14:46:42 <iwakeh> ok.
14:46:57 <karsten> having clean tests seems like a good goal, too.
14:47:05 <iwakeh> yes.
14:47:10 <karsten> especially if we want to re-use concepts for other parts of the code.
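The suggestion at 14:42:31 that a configuration can be fed from an `InputStream` built from a `String` might look roughly like this in a test helper. The sketch uses plain `java.util.Properties` as a stand-in, since the actual interface of CollecTor's `Configuration` class is not shown here; the class `ConfigFromString` and the property keys are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

/** Hypothetical test helper that loads properties from a string (illustration only). */
class ConfigFromString {

  /** Parses the given text as if it were the content of a properties file. */
  static Properties load(String text) {
    // Properties.load(InputStream) expects ISO 8859-1 encoded input.
    byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
    try (InputStream in = new ByteArrayInputStream(bytes)) {
      Properties props = new Properties();
      props.load(in);
      return props;
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked for tests.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // The keys below are made up for this example.
    Properties props = load("ExampleOffsetMinutes = 35\n"
        + "ExamplePeriodMinutes = 10\n");
    System.out.println(props.getProperty("ExampleOffsetMinutes"));
  }
}
```

With a helper like this, each test can state its configuration inline in a few lines rather than writing a temporary file first, which is one way to keep the setup code in front of the first `@Test` short.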
14:47:32 <karsten> but, that's for 1.1.0.
14:47:41 <karsten> feel free to prioritize 1.0.0 stuff.
14:47:54 <iwakeh> yes or 1.2.0?
14:48:00 <karsten> or that.
14:48:07 <karsten> by the way, am I behind on any reviews?
14:48:15 <iwakeh> there are actually quite some intermodule code duplications, too.
14:48:31 <iwakeh> no, al up to-date, i think.
14:48:35 <karsten> ok.
14:48:48 <karsten> hmmmm
14:48:54 <karsten> in theory, there are only 2 real modules.
14:49:04 <iwakeh> huh?
14:49:07 <karsten> exit list stuff is tiny, torperf goes away.
14:49:14 <iwakeh> ah, ok.
14:49:40 <iwakeh> relaydescs and bridgedescs have code im common.
14:49:47 <karsten> okay, we should look at that.
14:50:05 <karsten> we could have a shared package,
14:50:10 <karsten> or we could move things to metrics-lib.
14:50:18 <iwakeh> That's why I wanted to add the tasks this week.
14:50:19 <karsten> depending on how generic the code is.
14:50:43 <iwakeh> yes, that's to be seen when refactoring.
14:50:50 <iwakeh> did you look at
14:51:12 <iwakeh> #19170
14:51:33 <iwakeh> comment:7
14:52:43 <karsten> looked, yes, but I don't know what's the right thing to do there.
14:53:23 <iwakeh> you mean, what data to store?
14:53:24 <karsten> I'll put it on my list.
14:53:29 <karsten> yes.
14:53:32 <iwakeh> fine.
14:53:43 <iwakeh> it needs thinking.
14:53:50 <karsten> yes.
14:53:54 <iwakeh> of several brains :-)
14:53:59 <karsten> ideally, yes!
14:54:16 <karsten> ah, one question about milestones:
14:54:24 <karsten> #18910.
14:54:36 <karsten> that's what we promised for the MOSS award, right?
14:54:50 <iwakeh> yes
14:54:59 <karsten> would it make sense to include that in 1.0.0, just to lower the pressure of getting out 1.1.0 soon?
14:55:11 <karsten> even if that delays 1.0.0 a bit.
14:55:12 <karsten> ?
14:55:40 <iwakeh> I'd first like to have a new instance running with a scheduler.
14:56:21 <karsten> ok. how about we subdivide the current 1.1.0 into one part with that ticket and the rest?
14:56:27 <karsten> and call the rest 1.2.0?
14:56:35 <iwakeh> sure!
14:56:43 <iwakeh> that's a good idea.
14:56:44 <karsten> just to make it more realistic to get 1.1.0 out son.
14:56:46 <karsten> soon*
14:56:52 <iwakeh> august.
14:57:02 <karsten> august would be great.
14:57:21 <karsten> should I create a 1.2.0 milestone in trac?
14:57:28 <iwakeh> please do.
14:57:43 <karsten> oh, and should I define dates for 1.0.0 and 1.1.0?
14:58:02 <iwakeh> not yet?
14:58:09 <karsten> ok.
14:58:12 <karsten> 1.2.0 created.
14:58:14 <karsten> without date.
14:58:17 <iwakeh> great.
14:58:20 <iwakeh> regarding
14:58:29 <iwakeh> the sync
14:58:55 <iwakeh> the meta-design needs to be one very soon.
14:59:13 <iwakeh> i.e. @source tags if and for what.
14:59:22 <karsten> ah ok.
14:59:27 <karsten> I thought we gave up on those.
14:59:34 <karsten> but I didn't look for a while.
14:59:39 <karsten> adding to the list.
15:00:07 <iwakeh> there is just a long discussion with no decision reached yet.
15:00:13 <karsten> ok.
15:00:32 <karsten> alright, we just crossed 15:00 UTC!
15:00:40 <iwakeh> we could first assume only benevolent collectors.
15:00:42 <karsten> and I have a looooong list of things.
15:00:46 <iwakeh> ok.
15:00:54 <iwakeh> then, back to work.
15:00:55 <karsten> yes, I think that's a good assumption.
15:00:57 <karsten> haha
15:00:58 <iwakeh> :-)
15:01:19 <iwakeh> more in tickets
15:01:20 <karsten> alright, we could talk more on monday, or next thursday.
15:01:24 <karsten> yes, and in tickets.
15:01:26 <iwakeh> sure.
15:01:44 <karsten> monday?
15:01:59 <iwakeh> 9utc
15:02:04 <karsten> sounds good.
15:02:09 <iwakeh> fine.
15:02:21 <iwakeh> are we done?
15:02:22 <karsten> great! thanks for taking the time.
15:02:26 <karsten> yes! bye. :)
15:02:29 <iwakeh> thanks
15:02:30 <karsten> #endmeeting