15:02:04 #startmeeting SponsorR 15:02:04 Meeting started Tue Mar 17 15:02:04 2015 UTC. The chair is asn. Information about MeetBot at http://wiki.debian.org/MeetBot. 15:02:04 Useful Commands: #action #agreed #help #info #idea #link #topic. 15:02:07 hello! 15:02:12 who is around for this occasion? 15:02:18 hello 15:02:23 * isabela ! 15:02:23 syverson around? 15:02:30 * dgoulet here 15:02:31 he is in the infernal channel but not here 15:02:39 asn: ahah! 15:02:50 oh well 15:02:55 i believe that he is planning to attend 15:03:04 ah hello syverson 15:03:09 so let's start with the status report as usual 15:03:10 Hi asn 15:03:22 during past week, I did some work on #8243 15:03:36 i'm quite confident now that requiring Stable flag for HSDirs is a reasonable idea. 15:03:49 not sure how to proceed. whetehr i should write a brand new proposal, patch rend-spec.txt or patch rend-spec-ng.txt 15:03:56 i will ask nickm for feedback on this 15:04:08 i also discussed encrypted serviecs a bit in #15271 15:04:30 also took a look at the metrics stuff of karsten, but he will talk more about these. 15:04:50 and I wrote some top-down research questions in the pad: https://etherpad.mozilla.org/psNv5z99Z5 15:04:55 and that's that from me. 15:05:01 should I go next? 15:05:04 who wants to go next? 15:05:04 yes 15:05:06 karsten 15:05:19 so, I put hidserv.csv on Metrics, which gets updated automatically. 15:05:30 I also put three graphs on Metrics. that's #15273. 15:05:42 karsten++ 15:05:53 super agile metrics development 15:05:54 oh, and I add filtering/ordering capabilities to Metrics to make room for "Advanced" graphs. (what a great excuse to finally re-do the website a bit...) 15:06:01 https://metrics.torproject.org/?tag=hs&type=gr&type=tb&type=ln&type=dt&level=bs&level=ad&order=type 15:06:17 it also satisfies my desire to write HTML for quite a while. 15:06:27 hehe 15:06:31 haha looks good karsten 15:06:43 I also helped clarify dir-spec by saying that hidserv-stats can be negative. 
that's #15276. 15:06:50 (are these sql queries underneath? is it sql injection proof?) 15:06:58 no sql, just R. 15:07:04 reading from local .csv files. 15:07:13 this stuff: ?tag=hs&type=gr&type=tb&type=ln&type=dt&level=bs&level=ad&order=type 15:07:23 ah. 15:07:38 all metrics are hard-coded in java. :) 15:07:41 no sql. 15:07:43 ack 15:07:48 can't break ever. 15:07:53 unbreakable. 15:07:56 much simpler++, love it 15:07:58 ok 15:08:03 and finally, I re-read our first tech report draft, 15:08:09 ah yes 15:08:11 you sent an email 15:08:13 haven't read yet 15:08:16 wanna summarize super brief? 15:08:17 and suggested 25 mins ago that we don't publish but instead re-use parts. 15:08:18 or just read email later? 15:08:38 well, I pointed out a couple of things that would need tweaking. 15:08:43 it's a draft, not ready for publication. 15:09:05 and given how long we took to write that blog post about the report we were mostly sure about is not an issue, 15:09:12 publishing this would be a mess. 15:09:26 but, parts of it are good, and we should re-use them. 15:09:30 ack 15:09:31 just for 5 stats, not 25. 15:09:39 okay, that's all from me. 15:09:41 ok thx 15:09:43 who next? 15:09:47 * dgoulet can go 15:09:49 dgoulet: please 15:09:55 Finally finished the #14847 monster, under review. Working on the HS health measurer using stem (dev repo: https://gitorious.org/hs-health/hs-health). 15:10:05 Also, I did some spelunking on the introduction point removal algorithm, work in progress. Finally, reading new TorPerf tech report to see if we could think about implementing that for SponsorR.
15:10:15 (yeah I prepared my stuff :P) 15:10:19 Oh last, I would like to bring your attention to #15254 so we can resolve that (discussion phase I guess :) 15:10:27 * dgoulet done 15:10:34 * ohmygodel can go 15:10:38 ohmygodel: yes 15:10:54 i found a flaw with the anonstats approach 15:11:01 yeah that was bad 15:11:07 i think its pretty killer, so too bad about that 15:11:26 i just added several high-level goals to the top-down plan 15:11:35 btw in your mail about the anonstats attack 15:11:41 you said "maybe true aggregation is solution" or sth 15:11:47 what is true aggregation? multiparty computation? 15:12:13 something along those lines yes 15:12:16 ok 15:12:33 i made improvement to peerflow (increase size of adv it is secure against by 2x) 15:12:53 i went to the DARPA Brandeis proposers day 15:13:09 that program wants to develop technologies to share and analyze data in a privacy-preserving way 15:13:38 we at NRL are considering a proposal to develop technologies that would be useful for Tor and similar privacy networks 15:14:05 i also had some useful conversations with security researchers 15:14:28 George Danezis at UCL and Dov Gordon at Applied Communications 15:14:43 ah george danezis likes statistics aggregation 15:14:53 i think the secure multi-party computation does have great potential for Tor 15:14:56 he thinks about the problem some times i think 15:15:36 and if Tor isn’t in a position to develop it, luckily there are other people that might help 15:15:53 ack 15:16:21 there are some Tor-specific unique challenges / opportunities that might make it interesting as well as useful 15:16:32 that is all 15:16:35 interesting 15:16:36 thanks 15:16:43 anyone else? 15:16:45 * syverson can go 15:16:47 syverson: please
15:17:25 I finished monthly reports and submitted them to Phil. 15:17:29 Everyone who needs to be here is here right? 15:17:36 yes, Yawning 15:17:44 thanks! 15:17:44 I sent Brian input for the DARPA open catalog thing they requested. 15:18:03 I had an exchange w/ Chris about the POSTNote that came out. 15:18:14 He knew about our involvement w/ that. 15:18:34 i don't know what postnote is tbh 15:18:37 Mostly I worked on revisions to my "genuine onion" paper, that is _almost_ done. 15:19:06 I also did the Brandeis day thing with Aaron and talked about it with Rob and him as he outlined. 15:19:08 Done. 15:19:19 (I kind of did hs stuff but not sure if relevant) 15:19:29 Yawning: please tell us 15:19:32 asn: UK parliament notes about Tor and privacy I think 15:19:49 I squash/rebased my add_onion control port command branch onto master 15:19:57 see #6411 and pull it in 15:20:08 if you ever have a need to create a ton of hidden services at runtime 15:20:13 dgoulet: Yes Parliamentary Office of Science and Technology 15:20:13 so there is #14847 and #6411 both control port stuff. ready for review. 15:20:25 (it's 1 line in python to create a HS with stem with the patch) 15:20:47 asn: yup and would be *really* nice to have it sooner than later 15:21:03 asn: hs health for SponsorR depends on #14847 quite heavily for it to be in stem 15:21:09 asn: and for SRI to use it 15:21:14 (controller.msg("ADD_ONION NEW:BEST Flags=DiscardPK Port=80") for example) 15:21:20 yes. i would like to review. but they are not small and i would need to reserve time for them. 15:21:40 ok ack 15:21:46 good stuff all around. 15:21:48 You can even crash your tor instance by running it in a tight loop (please don't do that) 15:21:59 is everyone done with their reports? 15:22:00 i guess so.
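(Editor's sketch, not part of the meeting log.) The one-liner quoted at 15:21:14 could be wrapped roughly as below. This is only a sketch under stated assumptions: it assumes a tor build that includes the ADD_ONION patch from #6411, a control port on 9051, and the stem library installed. The helper names build_add_onion and create_onion are hypothetical and do not come from the log; only Controller.from_port, authenticate and msg are stem calls.

```python
def build_add_onion(virt_port, target=None, key="NEW:BEST", flags=("DiscardPK",)):
    """Build the ADD_ONION command string quoted in the log (hypothetical helper)."""
    parts = ["ADD_ONION", key]
    if flags:
        # Flags are passed as one comma-separated list.
        parts.append("Flags=" + ",".join(flags))
    port = "Port={}".format(virt_port)
    if target:
        # Optional target, e.g. "127.0.0.1:8080".
        port += "," + target
    parts.append(port)
    return " ".join(parts)


def create_onion(control_port=9051, virt_port=80):
    """Send the command over the control port (assumes a patched tor is running)."""
    # Imported lazily so the string helper above works without stem installed.
    from stem.control import Controller

    with Controller.from_port(port=control_port) as controller:
        controller.authenticate()
        # msg() sends a raw control-port command and returns tor's reply.
        return controller.msg(build_add_onion(virt_port))


if __name__ == "__main__":
    print(build_add_onion(80))
```

As noted at 15:21:48, calling create_onion in a tight loop is reported to crash tor, so a sketch like this should be used sparingly.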
15:22:04 move to discussion phaes? 15:22:36 About the report that isn't being published for now. 15:22:36 We told Sponsor R about it. 15:22:36 We will need to give them something along these lines. 15:22:36 They like to have everything open and published (yeah them!) 15:22:36 How do those fit into the decision not to publish. 15:22:48 ok id like to talk about this also 15:22:54 ok that's a good thing to talk about. 15:22:54 karsten thanks for the thorough reading 15:23:00 another thing is to talk about top-down stats. 15:23:17 sure, let's talk about the report. (after agenda-making I guess.) 15:23:17 i do recall that when i submitted for publication release, i did cleanup along the lines of what you suggested just to make it readable 15:23:17 let's talk about the lost report first. 15:23:21 asn yes! 15:23:33 asn: also the anonstats / smc ideas please 15:23:34 ohmygodel: do you have a newer version somewhere? 15:23:38 but back to thie tech report 15:23:49 i do have a branch, but you bring up a good 15:23:51 ohmygodel: yes 15:23:54 point about the HS stats listing 15:23:58 can't we just "publish" that draft? 15:24:03 and by publish I mean, have it lie around. 15:24:17 and then in the future we will frankenstein it to make it a better tech report. 15:24:21 that's not exactly publishing. :) 15:24:22 with the top-down approach and whatnot. 15:24:31 but sure, happy to do that. 15:24:45 i would like to have a published something to point to 15:24:50 i see 15:24:51 by publish I mean assign a number, put on research.tp.o, etc. 15:24:55 and if we dont want to publish a scary list of stats 15:25:06 how about we remove that list 15:25:23 put it on the wiki, because we tend to actually refer to it 15:25:32 in our planning discussions 15:25:42 that would leave a discussion about the threat model 15:25:59 you mean remove section 4? 
15:26:02 the types of HS data that we definitely want to keep private and that guides our stats decisions 15:26:10 karsten: yes 15:26:20 i have not read the other sections carefully tbh 15:26:24 ohmygodel: you mean the SRI memex wiki, or some other wiki? 15:26:37 it would also leave a description of our obfuscation methodologies and why we think they work 15:26:48 I should read section 5 in more detail, too. and we need a real section 6. 15:26:54 i think its important to put both of those things out there, so everybody known and can comment on our reasoning 15:27:04 syverson: the Tor Sponsor R wiki 15:27:06 btw, that documetn is currently publicly reachable, and was linked from a sponsorr monthly report. 15:27:13 do they need more stuff for "everything to be open and published"? 15:27:18 what do they need exactly? 15:27:46 i'm also not very fond about that "tech report" that's why I'm asking 15:27:58 but I think I will leave karsten take the decision here. 15:28:03 no no 15:28:08 I'm 1/4 of the authors here. 15:28:12 asn: benefits of “publishing” include that it wont disappear or change substantially, that we have a name/version that we can all refer to, and that we can and should point outside people to it and expect that they can read and understand it 15:28:13 If the document is publicly reachable and linked, what are we talking about? The Tor Report number? 15:28:26 syverson: not sure. 15:28:49 ohmygodel: i seee. 15:28:56 syverson: being put on the Tech Report site (which entails a number) 15:29:14 i would also like to be able to cite this in academic publications 15:29:26 my goodness... 15:29:30 but it's not that good :( 15:29:37 ohmygodel: OK so having the public thing be officially published in some generally citable way I guess. 15:29:47 and i'm mainly talking about section 4. 15:29:52 asn: wrt #8243 ... 15:29:53 which I thought was the point of that report. 15:29:57 nickm: ah nice 15:30:02 yeah, asn. 15:30:05 ... I'd suggest writing a very short proposal. 
15:30:12 nickm: sounds good 15:30:13 we started brainstorming and then turned that into section 4. 15:30:14 asn: yes, and im suggesting cutting section 4 (i.e. moving it to an wiki mostly for internal use) 15:30:17 and then we wrote sections around it. 15:30:25 if we rip out section 4, it's a different report. 15:30:34 (Is this where I should bring up the fact that tor apparently explodes when you try to run lots of HSes at once?) 15:30:48 Yawning: there was that exact ticket by naif :P 15:31:04 where I'm not sure whether a different report would be better/worse. :) 15:31:04 karsten: yes, im suggesting that we create a different report that explains our reasoning and methods for publishing HS stats (and stats more generally) 15:31:11 ok. 15:31:32 asn: yeah I commented on that, y'all should fix that :P 15:31:33 i would find that very useful, and so might anybody else who wonders why “Laplace noise” makes any sense at all, for example 15:31:46 true. 15:31:48 ohmygodel: yes that's true. 15:32:01 so, it feels like there's more effort needed than commenting out a section. :) 15:32:06 we should be clear about that. 15:32:09 yes 15:32:10 karsten: yes, i would be happy to take care of that 15:32:13 but I'm not opposed to the idea. 15:32:27 ohmygodel: if what remains is a subset of the approved report modulo some cleanup edits, you might not need a new release. 15:32:33 (I just think that this report shouldn't be published as-is.) 15:32:34 so the report shifts from "all the statistics we could gather" to "how we gather statistics and why its secure" 15:32:36 syverson: that would be great 15:32:37 or something. 15:32:40 asn: yes 15:33:02 i see 15:33:30 that kind of makes sense. 15:33:34 we could and probably should include those two HS stats that we are currently collecting as running examples 15:33:42 right 15:33:44 curious to see the non-4 sections. i remember something about hiding values and distributions of data. 15:34:44 ok, so should we move to next topic? 
or do you want to assign people to do this? 15:34:49 this is not in our roadmap btw. 15:34:52 im willing to do this 15:34:56 but it's something presentable in april. 15:35:03 yes for april 15:35:04 ohmygodel: ok great. 15:35:06 finally publish and blog about existing technical reports on statistics collection: ("Hidden-service statistics reported by relays" and "Extrapolating network totals from hidden-service statistics") - [karsten, george, aaron] 15:35:13 ohmygodel: after you finish your pass, I would also like to take a look at it. 15:35:15 somewhat on the roadmap. 15:35:30 thanks, ohmygodel! I can review, too. 15:35:32 asn: yes, i would like all authors to read it if possible 15:35:44 ohmygodel: yes, ack 15:35:56 i'm mainly afraid that the rest of the secctions won't hang there on their own without section4 15:36:03 without a major refurbishment 15:36:10 sections are not as tightly connected ;) 15:36:40 asn: i can see how to write this well 15:36:41 sounds good. 15:36:42 yeah. i'd like to see some alternative direction in that document, now that its main topic is cut. 15:36:45 ok fantastic. 15:36:49 good topic. 15:36:52 let's move to the next one. 15:36:59 top-down stats? 15:37:28 sure - i listed six high-level goals that all seem like good ones to me 15:37:32 id like to hear what you all think 15:37:39 what's the link again? 15:37:42 https://etherpad.mozilla.org/psNv5z99Z5 15:37:48 thanks 15:38:00 ok let's talk a bit about the blue ones. and then we can talk about the green ones. or something. 15:38:16 nickm: poke 15:38:39 do you think that walking through the list makes sense? 15:38:41 or too many items? 15:38:49 :x 15:38:53 the list on the etherpad? 15:38:55 yes 15:39:01 tbh, the green ones don’t seem very high-level to me 15:39:03 the first order asterisks. 15:39:05 will try. 
in 3-4 meetings at once 15:39:13 maybe im not perceiving their organization 15:39:26 ohmygodel: yes, they are less high-level than the blue ones 15:39:28 (green ones are mine) 15:39:50 i mainly posed them as research questions kindof 15:39:58 asn: ok. it seems that many of them could fit under a high-level goal of “optimize HS parameters” or something like that 15:40:06 yes true 15:40:13 yes 15:40:58 some green ones are very hs health oriented, very useful to me! :D 15:41:00 ok so it seems the first 4 fall under that 15:41:22 the next 3 seem to fall under “onion service performance” 15:41:31 yes plausible 15:41:36 which is the second-to-last goal that i listed 15:41:47 I put it as “How well are onion services performing?” 15:42:06 ok 15:42:17 ok asn: ill let you do reorganization here so we dont conflict 15:42:22 sounds good 15:42:39 so some thoughts on the blue ones: 15:42:48 i'd like to discourage botnets. 15:43:03 and a graph of the botnet activity seems plausible to me; if we have a way to get it. 15:43:39 i'm a bit afraid of the ethical side of this. that if we start visualizing botnet activity, then we can also start visualizing cp activity, and then ... 15:43:47 i don't think we are going to go to that slippery slope 15:44:09 but I'd like a second opinion on this too. 15:44:19 asn: that is a good point 15:44:56 not all of these goals have to be solved by stats gathered by all relays and published on metrics 15:45:02 The nice thing about botnets is that we could treat it as patterns of flows not as anything about content. (I know there's leakage, but the potential is there.) 15:45:16 yes 15:45:23 i think the idea was to find them by the tor version they use? 15:45:37 or maybe one idea was that. 15:46:08 so next one is "churn of hidden services" 15:46:11 we have discussed this quite a bit. 15:46:49 i can see some potential PR and future funding benefits maybe. 15:46:56 not sure I see the technical benefits.
15:47:10 if someone else has feedback on this that would be great. 15:47:26 i have discussed this too much myself. 15:48:10 ok... 15:48:11 so next... 15:48:13 * How many users do onion services actually have? 15:48:14 yeah the churn one i was less sure about 15:48:21 roger asked for it a couple of times 15:48:28 i dont recall his motivation there 15:48:40 yes roger has been asking for it. he hasn't persuaded me yet. 15:48:59 so the users of onion services seems like an interesting figure to have 15:49:13 kind of useless technically, like the number of hidden services. 15:49:22 but useful information to have in general. 15:49:29 unclear how to get thsi info though. 15:49:39 the closest we can get easily is the gareth owen stat 15:49:56 of descriptor fetches. or maybe RP establishemtns which is easier to do. 15:50:11 "How many users do onion services actually have"... how can we actually know that? 15:50:25 i don't think we can have it precisely. 15:51:00 yeah its a research question 15:51:07 i have ideas… :-> 15:51:18 ok 15:52:15 * Are onion services broadly popular, or are there just a few services that most users access? 15:52:18 * A related question: are some onion services consuming way more bandwidth than others? 15:52:27 reasonable question. 15:52:53 not sure exactly how we could learn figures here. 15:53:29 i'm not even sure if walking through this list is even beneficial at this point 15:53:31 but might as well continue 15:53:49 asn: fetches can reveal popularity 15:54:15 and last two questrions are: 15:54:15 * How well are onion services performing? 15:54:18 * Are there denial of service attacks on hidden services? 15:54:22 and a new oine: 15:54:25 * Are there bi-modal or n-modal usage patterns of access in hidden services that could lead us to optimize for different cases of use patterns? 15:54:45 the last one seems a bit related to the popularity one. 15:54:59 so OK, this is a reasonable list of questions. 15:55:04 how do we proceed? 
15:55:07 asn: yeah, i agree that it can fit under the popularity distribution 15:55:30 asn: develop ideas about how to learn each of these 15:55:31 should we assign them to folks to, well, flesh them out somehow? 15:55:38 until next week? 15:55:53 plausible karsten 15:56:05 see how we can develop general methods that will help with multiple goals simultaneously 15:56:12 karsten: that seems like a good idea 15:56:39 one obvious thing that needs to be done for each question, is state ways that it could be collected or aproximated. 15:56:46 yep 15:56:55 if i recall, the goal is to have a long-term plan and a short-term plan for HS stats by april 15:56:58 is that correct ? 15:57:23 hm 15:57:35 we could do that 15:57:41 not sure if that was our original goal 15:57:49 but it seems plausible 15:57:56 ok well we shuold target something for april 15:58:20 i don't think we can do stats that require little-t-tor mods by april. 15:58:35 asn: right, which is why we are instead just trying to make plans 15:58:44 yes 15:59:30 (there were some other stats that i thought you were considering getting relating to attacks and performance) 15:59:37 the other thing that is missing from those questions are precise risk/benefits 15:59:50 but i'm not sure if the risk/benefits approach is useful anymore 15:59:52 asn: i list the benefits 16:00:23 for each goal, the lines starting “** This is [also] important” describe why the goal is desirable 16:00:41 yes 16:00:46 fair enough 16:01:25 fwiw, the top-down approach might be a bit better than our old approach 16:01:28 but it's still highly subjective 16:02:09 in the end, I will want other/more people to decide and prioritize these stats. yes. 16:02:30 i don't think i want to take decisions here either. 16:02:46 asn: this sounds like a job for roger to me 16:03:04 yes maybe. 16:03:07 i always recommend using roger, though :-/ 16:03:11 yes. 
16:03:33 anyway 16:03:37 but asn: i do not want to be blocking on you on these things 16:03:37 so wrt to this thing 16:03:48 the short-term plan to 1) Risks/Benefits and 2) State how to collect them, is a reasonable approache for april I think 16:03:49 ohmygodel: indeed 16:04:01 so lets not let this organizational issue linger 16:04:04 maybe isabella can help here ? 16:04:16 isabela 16:04:20 ah sorry 16:04:20 (she is single l'ed) 16:04:46 she stepped off 10 min ago: "11:51 < isabela> I will take a shower brb" 16:04:51 ok no worries 16:04:52 so 16:04:54 in fact, could you please take responsibility to escalate this to the appropriate person ? 16:04:56 * karsten gotta run; should I read backlog for things to work on this week? 16:05:12 karsten: ok 16:05:18 asn: ok. 16:05:23 adios karsten 16:05:25 ohmygodel: ok 16:05:37 * syverson gotta go real soon too. Any major things not brought up yet? 16:05:38 bye, ohmygodel! 16:05:40 so for next week ,let's curate the list some more. 16:05:59 and i will also try to speak with isabela or whatever, to find a reasonable process for selecting these stats 16:06:02 asn: great, thanks. that will help us all effectively plan 16:06:29 ok 16:06:38 so next week? 16:06:43 karsten: not still around eh? 16:06:49 dgoulet: karsten: 16:06:51 let's decide for next week 16:06:55 stuff in there --> https://trac.torproject.org/projects/tor/wiki/org/roadmaps/HS 16:06:57 what we are going to be doing 16:07:08 I need to do that by the end of the week "Explain how HS control events works. " 16:07:18 ok 16:07:34 and i will curate the top-down list and maybewrite a propsoal for #8243 16:07:35 and mostly work on april 2015 stuff, hs health is very important for mi-april 16:07:42 dgoulet: great 16:08:09 asn: do you need help from the rest of us in fleshing out methods for the top-down goals ? 16:08:09 asn: ah yes #8243 is a useful one 16:08:33 ohmygodel: would be nice 16:08:51 ohmygodel: you probably have ideas for the ones you added right? 
16:09:13 asn: yeah, i can take charge of a couple of them if you like 16:09:18 ok 16:09:23 hi 16:09:24 sorry 16:09:28 was away for a little bit 16:09:31 ohmygodel: any specific ones? 16:09:39 how about number of users ? 16:09:40 isabela: no worries 16:09:47 ohmygodel: ok 16:10:14 and onion service popularity ? 16:10:18 ok 16:10:20 sounds good 16:10:23 ok 16:10:30 isabela: so, you were mentioned 16:10:47 because we still have the problem of picking statistics to do 16:11:18 I was reading the backlog, do you want my help on driving a process to decide this? 16:11:22 yes 16:11:30 ok, deadline is april? 16:11:38 yeah kind of 16:11:42 ok 16:11:45 will do ;) 16:11:54 i personally feel OK about some stats in that list (mainly some of the ones I added) 16:12:00 i feel curious about other stats in that list 16:12:01 isabela: deadline is 4/13, the start of the quarterly progress report (QPR) for Sponsor R 16:12:06 and uneasy about others 16:12:35 i'd like nick or roger or someone to help us pick statistics 16:12:50 or maybe we can do a vote or something. 16:13:14 ok, things to consider (looking at the backlog) risks/benefits and 'cost' to collect them (too much work, impossible to do in a safe way etc) 16:13:30 yes 16:13:31 But this deadline: we're not talking about gathering more stats by then, yes? We already have enough to report, no? It's just to be able to coherently report about plans. I want that to be clear for isabela. 16:13:31 anything else? 16:13:40 syverson: yes correct. 16:13:46 asn: i think that prioritizing goals is a better level to be thinking right now, although obviously without effective methods, goals are unachievable 16:14:05 ohmygodel: yes prioritizing is a good way to do this. 16:14:47 isabela: my main problem with the risks/benefits approach is that they can be subjective or vague. 16:15:19 But I'm a bayesian. 
16:15:22 Sorry 16:15:32 anyway isabela 16:15:34 by next week 16:15:39 the list should be a bit more robust 16:16:03 more close to be prioritizable 16:16:11 * isabela looking at the list now 16:17:17 * syverson needs to run AFK. Bye for now. 16:17:23 ok are the rest of us needed ? 16:17:32 #endmeeting