#tor-dev log

15:04:53 <asn> #startmeeting SoP-Donncha
15:04:53 <MeetBot> Meeting started Wed Jun  3 15:04:53 2015 UTC.  The chair is asn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:04:53 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
15:04:56 <asn> ok
15:05:09 <asn> sooooo
15:05:23 <dgoulet> I guess DonnchaC can status update :) ?
15:05:23 <asn> i guess let's start with what you've been working on?
15:05:27 <asn> i guess yes
15:05:32 <DonnchaC> I got started with my SoP project last week with emails to HS operators, and tor-talk. Summarised the responses on tor-talk. https://lists.torproject.org/pipermail/tor-talk/2015-June/038058.html
15:05:41 <DonnchaC> I responded to #15991 to discuss potential risks with exposing that a HS is using onionbalance.
15:05:50 <DonnchaC> Carefully read #prop224 with focus on any issues it may cause for my SoP project. Fixed some basic typos with #16197
15:06:10 <DonnchaC> Started looking at #14846.
15:06:33 <asn> is there a problem with prop224?
15:06:38 <DonnchaC> It was a bank holiday weekend last weekend, still catching up from it :)
15:06:51 <asn> i guess we will have to give you the decrypted descriptor with HSFETCH when prop224 gets done.
15:07:32 <asn> so in what state is onionbalance now?
15:07:58 <asn> what can it do, and what can't it do?
15:09:29 <DonnchaC> I think prop224 could work okay with this combining intro points in one descriptor technique. However need to specify a blinded public key to each HS instance which can be cross-certified with the intro-point auth keys.
15:10:14 <asn> oh
15:10:29 <asn> so a bit more crypto work on the onionbalance side? or this can be done with copy-pasting stuff that tor does already?
15:11:02 <DonnchaC> asn: It currently seems to be working. I pushed some security fixes but other than that haven't been writing code for the last week while I've been trying to flesh out the design.
15:11:18 <asn> security fixes? o.o
15:11:22 <asn> ok
15:11:58 * dgoulet will ask a question
15:12:55 <dgoulet> DonnchaC: so this all seems like a fine start! I'm also interested in knowing what's your plan in the next let's say two weeks? (so we can know how we can help or what to review in what timeframe, etc..)
15:13:46 <asn> #idea (question for later) still curious on how annoying race conditions are in this space. do we have reachability data?
15:15:21 <DonnchaC> dgoulet: I'd like to write a more formal spec for onionbalance. I think it would be useful to help analyse potential issues early.
15:15:27 <asn> +1
15:15:35 <dgoulet> indeed!
15:17:02 <DonnchaC> dgoulet: From operator feedback, seems like HS redundancy/failover is a big priority. To be able to do that properly, and avoid race conditions and some potential attacks I'm now thinking about a more complex mode of operation.
15:17:43 <asn> there are attacks? are they documented somewhere?
15:18:07 <DonnchaC> 1. Basic mode - 1 management service, polling from the HSDirs, no extra software needed on HS instance, no listening services needed on the management server
15:18:52 <asn> yes that's the one i'm familiar with
15:19:06 <DonnchaC> 2. More complex mode - Two or more management services, direct descriptor upload from onion service instance to the management service. Automatic failover if a management service is down.
15:19:56 <asn> direct descriptor upload, i see.
15:20:22 <dgoulet> and those management services synchronize between each other?
15:20:50 <asn> and I guess there is the super complex mode, where not only we have direct descriptor upload, but also a metadata channel as well for signalling between instances and mgmt.
15:21:14 <dgoulet> ah well I was thinking of that for 2. ^ :P
15:23:46 <DonnchaC> I was thinking somewhere between the two. I'm not sure how much extra signaling is needed. I'm also not sure if the management services need to communicate with each other. Maybe it would be fine if two management services just competed with each other, both uploading descriptors. It might be a little wasteful on HSDir resources but shouldn't be such a strain.
15:24:27 <asn> plausible as a first plan
15:25:01 <DonnchaC> I'm not sure if the complication of the management servers keeping connections open to one another to ensure they are online is completely necessary. I think that's something that would be okay to bolt on later if limitations are experienced.
15:25:15 <asn> +1
15:25:19 <dgoulet> yup
15:25:45 <asn> this seems like a project that can be developed decently in an incremental manner
15:25:58 <asn> and you seem to have a good idea of how to do this correctly
15:26:15 <dgoulet> agreed, I'm excited to see the detail spec build up :)
15:26:20 <dgoulet> lots of good ideas
15:26:30 <DonnchaC> When I mentioned potential attacks: one potential issue is that the lack of a revision counter in the v2 rend protocol prevents the management server from knowing if the received descriptor is the latest descriptor. An attacker could try to DoS a service by serving an old descriptor with expired IPs. With OnionBalance an attacker only has to control 1 HSDir. Clients lose some redundancy as they lose the ability to retry at the other HSDirs.
15:27:50 <asn> hm
15:28:23 <dgoulet> DonnchaC: I'm not sure what "revision counter" means here?
15:28:54 <asn> so there is a 24 hour window for replays right?
15:29:01 <asn> like that if the attacker could always win the race,
15:29:33 <asn> the attacker would have the option to replace the legit descriptor with any other of at most 24 previous descriptors
15:29:52 <asn> because after that the descriptor id changes (?) and it won't get accepted
15:30:01 <asn> so it's not like the attacker can replace the current descriptor with one from 4 days ago
15:30:04 <asn> or is it?
15:30:47 <DonnchaC> asn: Actually only 2 hours I think. The descriptor includes a timestamp. OnionBalance can just check that it hasn't received an older descriptor than the last one it's seen
15:30:50 <DonnchaC> https://github.com/DonnchaC/onion-balance/commit/f90ccbc3b84569dc50d15e50dd65397937646de0
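
The timestamp check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked commit: the class name `DescriptorFreshness`, the `accept` method, and the 24 hour `MAX_DESCRIPTOR_AGE` are all assumptions (the log itself is undecided between a 2 hour and a 24 hour window).

```python
from datetime import datetime, timedelta, timezone

# Assumed replay window; the discussion mentions both 2 and 24 hours.
MAX_DESCRIPTOR_AGE = timedelta(hours=24)


class DescriptorFreshness:
    """Remember the newest descriptor timestamp seen per instance (hypothetical helper)."""

    def __init__(self):
        self._latest = {}  # onion address -> newest publication time seen

    def accept(self, onion_address, published, now=None):
        """Return True if the descriptor is neither too old nor older than one already seen."""
        now = now or datetime.now(timezone.utc)
        if now - published > MAX_DESCRIPTOR_AGE:
            return False  # outside the assumed replay window
        last = self._latest.get(onion_address)
        if last is not None and published < last:
            return False  # older than the last descriptor we accepted: possible replay
        self._latest[onion_address] = published
        return True
```

A check like this only rejects descriptors older than one already seen; it cannot tell whether a newer descriptor exists that the HSDir is withholding.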
15:31:48 <asn> hm
15:32:31 <asn> well the attacker could try to win the race all day
15:32:43 <asn> and always present the first descriptor that went to that HSDir
15:34:44 <DonnchaC> Currently OnionBalance lets Tor try to do a standard fetch with HSFETCH. Maybe it should instead keep a list of current HSDirs and iterate through them until it's found a valid descriptor or tried them all.
15:35:19 <asn> how would that help?
15:35:44 <DonnchaC> Alternatively, it could simply retry a random fetch if it thinks it's been given an old/bad descriptor from a HSDir.
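
The "iterate through the HSDirs" idea could look something like the sketch below. Everything here is hypothetical: `fetch` stands in for whatever performs a fetch against one specific HSDir, and `is_fresh` for a validity check such as a timestamp comparison. Neither is an existing OnionBalance or Tor API.

```python
import random


def fetch_fresh_descriptor(hsdirs, fetch, is_fresh, rng=random):
    """Try each HSDir in random order until one returns a descriptor that passes is_fresh.

    hsdirs   -- iterable of HSDir identifiers (assumed to be known to the caller)
    fetch    -- hypothetical callable: HSDir -> descriptor, or None on failure
    is_fresh -- hypothetical callable: descriptor -> bool
    """
    candidates = list(hsdirs)
    rng.shuffle(candidates)  # avoid always asking the same HSDir first
    for hsdir in candidates:
        descriptor = fetch(hsdir)
        if descriptor is not None and is_fresh(descriptor):
            return descriptor
    return None  # every HSDir failed or returned a stale descriptor
```

This only helps when at least one responsible HSDir still serves a current descriptor; it does nothing if all of them have been fed stale ones.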
15:36:29 <asn> theoretically, the attacker could poison all 6 HSDirs for all day
15:36:36 <asn> she just needs to win an easy race
15:36:41 <asn> every hour
15:37:09 <asn> but, if i understand the problem correctly,
15:37:14 <asn> it's not that terrible
15:37:33 <asn> because the attacker only has a 24 hour window. then if we have like 6 IPs, it's very unlikely that all of them will go bad so quickly.
15:37:49 <asn> and since it's a replay attack, those IPs were up at some point less than 24 hours ago.
15:38:58 <DonnchaC> Yep, I don't think it is that bad.  I'm just trying not to make that attack easier for an attacker.
15:39:09 <asn> yes
15:39:14 <asn> that's good
15:39:19 <asn> your defence is already good
15:39:32 <asn> i think prop224 has revision fields, right?
15:40:07 <DonnchaC> Some of the IPs from the last 2 (or 24) hours are probably still going to be valid unless the attacker is trying to kill them by making lots of INTRO requests.
15:40:18 <asn> lol
15:40:28 <asn> that could work
15:40:39 <asn> fwiw, this is also an attack on the current system
15:40:42 <DonnchaC> asn: dgoulet: Yes, prop224 has a 'revision counter' which would be an easy way of avoiding this type of problem.
15:40:45 <dgoulet> oh good old much intro
15:41:06 <dgoulet> DonnchaC: ah! from 224, cool thanks for clarification
15:41:38 <asn> of course, one could argue that pre-224 (#8244) one can HSDir DoS an HS more easily.
15:41:53 <asn> which is terrible on its own.
15:42:19 <asn> DonnchaC: that's a fun attack to document in any case
15:42:46 <asn> (anything else to discuss about this?)
15:42:57 <asn> otherwise, i wanted to ask you something about the "complex mode" above
15:43:15 <DonnchaC> asn: The expired descriptor attack works against the current system, but a client has a better chance as they can try all 6 of the HSDirs. With OnionBalance, the management service picks one copy of the descriptor from a HSDir. If it has bad luck with its choice of HSDir it will publish a descriptor with only those old IP's. Clients won't have the option of trying other HSDir's for a different descriptor.
15:44:16 <asn> hm
15:44:20 <DonnchaC> No, I think that's it. It's just a reason to favour the direct descriptor upload approach.
15:44:31 <DonnchaC> asn: shoot!
15:45:01 <asn> (well, an attacker could also race all 6 HSDirs, if it can race one. So even a normal client will get 6 bad descs)
15:45:08 <asn> DonnchaC: ehm when you say "direct descriptor upload"
15:45:13 <asn> for the complex mode
15:45:24 <asn> do you mean using the actual directory system logic
15:45:29 <asn> or you would write your own metadata channel
15:45:33 <asn> or you haven't thought about it yet
15:46:12 <asn> like are you actually talking about a channel that you just upload descriptors with, or you also say more things?
15:46:50 <DonnchaC> asn: I still have to think about it more, but I was thinking about using #14846 to fetch the local descriptor when it changes and uploading it directly to the management service over a HS connection.
15:47:43 <DonnchaC> asn: I don't have any uses beside descriptor upload in mind for that channel right now
15:50:38 <DonnchaC> I'm open to arguments about whether this is worth the extra complication. It would avoid the need for polling, ensures the IP's aren't stale, and avoids the previously described attack. But it also means more little-t-tor changes, running more code on the HS instance, and having a listening service on the management service.
15:52:11 <asn> yes it would require a little program on the HS instance
15:52:19 <asn> to learn that there is a new descriptor and push it to mgmt
15:52:29 <asn> is there a "new hs descriptor" event now?
15:52:31 <asn> guess not
15:52:38 <asn> that's the only thing that needs to be coded right?
15:52:47 <asn> the other thing you could do, which is less elegant
15:53:00 <asn> would be to hack a way for Tor to upload its descriptor to an IP, instead of an HSDir
15:53:19 <asn> and then the HS instance will do a POST query with its descriptor that you have to receive on the mgmt side
15:53:31 <asn> bleh that's shit. then you would have to understand BEGIN_DIR etc on the mgmt side
15:53:38 <dgoulet> well we have an UPLOADED event so passively monitoring that could do the trick but that requires to run stuff on each hs :S
15:54:16 <dgoulet> with that you can track the latest by comparing it to the previous ones I guess, ... meh maybe not ideal though
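
Tracking "the latest by comparing it to the previous ones" could be as simple as remembering a digest of the last descriptor seen per service. The sketch below assumes some separate piece of code on the HS instance already obtains the descriptor text (for example after an upload event); the class and method names are hypothetical.

```python
import hashlib


class LatestDescriptorTracker:
    """Detect when a service's descriptor has changed since the last one seen (hypothetical helper)."""

    def __init__(self):
        self._digests = {}  # onion address -> SHA-256 hex digest of last descriptor

    def is_new(self, onion_address, descriptor_text):
        """Return True only if this descriptor differs from the previous one for the address."""
        digest = hashlib.sha256(descriptor_text.encode("utf-8")).hexdigest()
        if self._digests.get(onion_address) == digest:
            return False  # same descriptor as before; nothing to push to mgmt
        self._digests[onion_address] = digest
        return True
```

The instance-side program would push a descriptor to the management service only when `is_new` returns True, which avoids resending the same descriptor once per HSDir upload.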
15:55:36 <asn> DonnchaC: in general, i think that a signalling channel (or at least a descriptor upload channel) will be useful in the long run
15:55:36 <dgoulet> an ~hour in the meeting ok so maybe we can wrap this up?
15:55:49 <asn> ye
15:56:32 <DonnchaC> Yep, I'd like to avoid the extra complication and running extra code on each HS but I agree asn. I think the signalling channel provides flexibility in the future and would be useful.
15:56:54 <asn> DonnchaC: but if the "basic mode" actually _works_
15:57:12 <asn> i think releasing the basic mode, and saying "ok use this. in the meanwhile, we are developing something even greater"
15:57:26 <asn> would be a reasonable way to do it
15:57:58 <dgoulet> +1
15:59:07 <DonnchaC> asn: I think that's a good idea. For regular HS's which aren't having the IP circuits DoS'd to death, I think polling would work fine. From the IP analysis ticket it seemed that some HS's were picking completely new sets of IPs at least every 10 min.
15:59:43 <asn> i'm still not sure if that's because of max-intro rotation, or just of weird HS setup.
16:00:00 <asn> but I'm OK with saying "first version of this tool, won't be able to defend against super crazy DoS attacks"
16:00:14 <DonnchaC> sounds reasonable to me too
16:00:21 <asn> also, i have some more data for #15515 which might show us why that hs was so anomalous.
16:00:38 <asn> #15513
16:00:40 <asn> or whatever it was
16:00:48 <asn> OK
16:00:55 <DonnchaC> I can work on the basic mode for now, also working on #14846. I have some comments and questions which I will ask on that ticket.
16:01:38 <asn> #14846 is like HSFETCH but without taching the network?
16:01:45 <asn> *touching
16:01:45 <DonnchaC> I'll write up a more formal spec for the design I have in mind for OnionBalance this week for you to review.
16:02:13 <dgoulet> asn: yes, you get the descriptor of a service your tor instance runs
16:02:28 <asn> what do you give as identifier? the onion address?
16:02:29 <DonnchaC> I have some questions about how this command should work which I'm going to add to the ticket later.
16:02:32 <DonnchaC> Is it better to generate the hidden service descriptor on demand when requested, or should the command return the last uploaded descriptor?
16:02:35 <DonnchaC> If PublishHidServDescriptors == 0 then HS descriptors are not generated at all in upload_service_descriptor(). I think the logic for descriptor generation should be refactored to a separate function to allow descriptors to be generated and retrieved from the control port without needing to be uploaded.
16:02:37 <dgoulet> DonnchaC: sounds good to me, if you think a NEW_HS_DESC control event would be useful also, opening a ticket on that sounds like a good idea
16:02:58 <dgoulet> asn: yeah the .onion
16:03:56 <dgoulet> DonnchaC: tor keeps a copy of the "current" descriptor so I think you should return that and not recompute one
16:04:02 <asn> DonnchaC: writing questions on the ticket sounds like a fine idea
16:04:18 <asn> DonnchaC: i'm not sure who is the audience of this new command.
16:04:19 <dgoulet> DonnchaC: you probably want what tor sees and not what it might see in 13 minutes when it's going to upload a possible new one ?
16:04:33 <asn> if it's just onionbalance and nothing else seems related, then optimizing for your use case might be ok
16:04:41 <asn> or maybe this could be a parameter of the HSFETCH command
16:04:48 <asn> like HSFETCH LOCAL or something
16:05:03 <DonnchaC> dgoulet: NEW_HS_DESC is good idea as HS_DESC_UPLOADED won't be useful if the descriptor isn't being uploaded.
16:05:14 <dgoulet> asn: not stupid, add a LOCAL=1 thingy and get it from HS_DESC_CONTENT
16:06:12 <DonnchaC> yes, seems sensible to add it under HSFETCH. I'll write it up and add my questions to the ticket today
16:06:49 <dgoulet> DonnchaC: great
16:07:00 <dgoulet> so I think you have plenty on your plate for the next weeks :)
16:07:03 <dgoulet> -s
16:07:38 <dgoulet> asn: anything more to say/ask before we wrap up this ?
16:09:20 <asn> ehm not really
16:09:29 <asn> iiuc the short term goals are
16:09:34 <asn> - design spec
16:09:39 <asn> - wrap up basic mode
16:09:54 <asn> - start thinking of complex mode (part of 'design spec')
16:10:08 <asn> which makes sense to me
16:10:18 <asn> not sure what 'wrap up basic mode' would be here
16:10:41 <asn> fix up code and README? make good instructions and website? make linux packages? send mailing list post?
16:11:20 <dgoulet> so many questions :)
16:11:38 <asn> just thinking out loud tbh
16:11:47 <dgoulet> they all make sense yes
16:11:52 <asn> i can try to do a review
16:11:57 <asn> over next 2 weeks
16:12:00 <asn> of the current code
16:12:07 <asn> or of the current code after it has been tidied up?
16:12:07 <dgoulet> I mean basic mode is ready for review and testing so that sounds like "wrap up"
16:12:12 <DonnchaC> Yep, all of the above, documentation and tests.
16:12:31 <dgoulet> ok is 2 weeks reasonable to you DonnchaC ?
16:12:35 <asn> DonnchaC: complex mode will be using the code of basic mode right? so tidying up the basic mode code is also future-proofing
16:13:04 <asn> dgoulet: well, let's have the next meeting in 2 weeks :)
16:13:18 <DonnchaC> Basic mode was written quickly, so I need to spend some time improving the code and testing.
16:13:24 <asn> ok
16:13:46 <asn> if you do this soon, send me a mail or update the ticket and i will try to give you a review before next meeting
16:13:53 <DonnchaC> dgoulet: 2 weeks sounds like a good short-term target.
16:13:56 <dgoulet> asn: right meeting in 2 weeks is fine by me, just want to know if the short term goal for that meeting makes sense
16:14:03 <dgoulet> cool
16:14:41 * asn finds the ticket
16:14:56 <dgoulet> anyway even if 2 weeks is too much, we'll simply realign at the next meeting :)
16:15:13 <DonnchaC> Perfect. Thanks very much for the feedback. Will work on #14846, the spec, and getting 'basic mode' close to release ready.
16:15:20 <asn> epic
16:15:35 <dgoulet> superb indeed!
16:15:37 <dgoulet> DonnchaC: thanks!
16:15:40 <asn> #endmeeting