15:01:52 #startmeeting
15:01:52 Meeting started Fri Feb 12 15:01:52 2016 UTC. The chair is nickm. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:52 Useful Commands: #action #agreed #help #info #idea #link #topic.
15:02:15 hi all! This is the design meeting to talk about prop#264 and prop#266.
15:02:26 https://gitweb.torproject.org/torspec.git/tree/proposals/264-subprotocol-versions.txt
15:02:33 https://gitweb.torproject.org/torspec.git/tree/proposals/266-removing-current-obsolete-clients.txt
15:02:52 hey
15:02:54 I see armadev and Grobbler. Any more folks here to chat about those? Have we read them?
15:02:57 hi isis !
15:03:01 * dgoulet lurking
15:03:06 hi nickm :)
15:03:29 <-
15:03:47 so, first question: has anybody besides me read these? :)
15:04:03 i am here, and then i am trying not to get distracted because i'm sitting with the newesthope paper authors and we're looking into the parameters recommended by the NTRU devs for prop#263
15:04:20 I read 266
15:04:27 i have read both, cursorily, twice
15:04:29 i read them this morning, and committed tiny fixes as i noticed them
15:04:58 (then i slept, and my earlier thoughts got muddled. now i'll be a second reviewer!)
15:05:15 shall we go over prop#264 first?
15:05:22 might as well
15:05:28 sure, that's the simple one. would somebody other than the author like to summarize?
15:05:53 sure
15:06:39 the simple summary is: we want to put versions on each sub-component or feature inside tor, so other people can build their own implementations and specify what features are supported. right now we use "the tor version" as a stand-in for all of these things, and it makes third-party implementations never be able to be first-class.
15:06:57 karsten: https://pastee.org/uv3t2
15:06:59 karsten: here you go
15:07:02 karsten: seems to work :]
15:07:31 one of the open design questions is where to advertise these versions -- in the consensus? in the microdescriptor?
are either of these sufficient for all use cases?
15:07:34 .
15:08:36 yeah, that's about right for the main idea.
15:08:55 I think that having them in the consensus should be okay, since they should compress very very well ...
15:09:28 ... and you might want to know them early, before you pick a server to download all your mds from.
15:09:31 oh, another angle of the proposal is that tors who don't support the recommended main versions will quietly disconnect. so we can phase out old tors.
15:10:04 re the consensus thing, i was wondering about how to handle clients using bridges. but i am getting ahead of myself.
15:10:48 * armadev lets nickm continue directing :)
15:10:51 So, any comments on this one? I think it ought to be not-too-bad to implement.
15:11:04 though doing stuff based on versions _is_ easier.
15:11:09 but worse.
15:11:34 (If somebody else would like to drive when it's my proposals we're discussing, that could make sense.)
15:12:19 my first thought when reading this is that i want to hear from the orchid people, or tvdw, or others who made third-party implementations
15:12:26 hmm, good point
15:12:27 what issues did they encounter, and does this plan address them?
15:12:46 ("generalizing from zero data points" etc etc)
15:12:49 this is perhaps a stupid question, but why do we add RecommendedClientProtocols but not add RecommendedRelayProtocols?
15:13:05 I just skimmed so far, but is there a way to make a feature become mandatory later? Or do we keep them around forever? We might segment the network a lot if features are too fine-grained
15:13:30 armadev: also see https://trac.torproject.org/projects/tor/wiki/doc/ListOfTorImplementations
15:14:14 isis: good thought. i don't see a reason to not give recommendations to relays.
15:14:15 isis: I don't remember. It could be reasonable.
15:14:23 #action add RecommendedRelayProtocols
15:14:34 Sebastian: I think section 4 does that?
15:14:49 does that do what you want?
15:14:53 ah
15:15:08 (And does anybody want to start the argument about how kill switches are eeeevil?)
15:15:15 i would also re-raise armadev's question of how this might work with bridges
15:15:17 nickm: the other thought i had was around our mess with bridges. back in the day, we implemented microdescriptors, and maybe your bridge supported them or maybe it didn't, and if your bridge didn't, then you were out of luck for using microdescs.
15:15:31 and bridge users don't have consensus entries for their bridges
15:15:47 they do get the actual bridge descriptors. which presumably list these capabilities.
15:15:56 it does. And that killswitch is interesting.
15:16:13 Does any of those implementation focus only on making an Onion Service reachable?
15:16:46 I'm not sure I understand how it works from the second-to-last paragraph in 4.
15:16:59 naif: i'm sorry, is that question related to this discussion? if so please be more clear :)
15:17:19 How does the client know if the Protocol is required to be a Tor client? Presumably when the client was written it wasn't required
15:17:40 so the client would say "oh, I don't support the feature, so I don't use it. But I can still keep going"
15:17:43 unless I'm missing something?
15:17:47 Sebastian: i also did not understand that
15:18:11 I think what Nick maybe means is a "disablefeature" thing
15:18:18 nickm: here's a third thought from just now: if the client doesn't like enough of the signatures on the consensus, it won't believe the kill switch. so this is a way for the zombie horde to accidentally return?
15:18:25 Hm. I meant that some features might be optional-for-clients. But maybe Required should just mean Required?
15:18:32 where a client knows what features it needs to connect to the network, and if all these features are disabled, the client shuts down.
15:18:38 #action nickm : make required mean required
15:19:03 armadev: For these, we should take cached consensuses very seriously.
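A minimal sketch of the "Required means Required" behavior discussed above, written in Python for illustration only. The `Name=1-3,5` list syntax, the function names, and the three outcomes (`shutdown`, `warn`, `ok`) are assumptions made for this example; they are not quoted from the proposal text or from any Tor implementation.

```python
def parse_protocols(text):
    """Parse a space-separated list such as 'Link=1-4 Relay=1,3'.

    Returns a dict mapping each protocol name to a set of version numbers.
    The list syntax is an assumption made for this sketch.
    """
    result = {}
    for entry in text.split():
        name, _, ranges = entry.partition("=")
        versions = result.setdefault(name, set())
        for part in ranges.split(","):
            if not part:
                continue
            low, _, high = part.partition("-")
            # A single number like '3' is treated as the range 3-3.
            versions.update(range(int(low), int(high or low) + 1))
    return result


def missing(needed, supported):
    """Return the versions listed in `needed` that `supported` lacks."""
    gaps = {}
    for name, versions in needed.items():
        lacking = versions - supported.get(name, set())
        if lacking:
            gaps[name] = lacking
    return gaps


def decide(supported, required, recommended):
    """Hypothetical client policy: a missing required protocol is fatal,
    a missing recommended protocol only produces a warning."""
    have = parse_protocols(supported)
    if missing(parse_protocols(required), have):
        return "shutdown"
    if missing(parse_protocols(recommended), have):
        return "warn"
    return "ok"
```

For example, `decide("Link=1-3", "Link=3", "Link=4")` yields a warning in this sketch, while a client that only speaks `Link=1-2` would shut down.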
15:19:16 If required means required and the dirauths make one consensus where they require a nonexistent feature, the Tor network is dead until everyone deletes their cached consensus. Maybe that's ok
15:19:28 #action nickm note required behavior with Required and cached consensuses
15:19:53 yeah, that's some new power for the dir auths, even honest ones
15:20:42 (we have a majority of dirauths who pull the config repo I think. One typo there and stuff goes byebye)
15:21:13 (this is an argument against that repo I think)
15:21:14 dirauths should definitely validate that they aren't about to shut themselves down.
15:21:25 that might be a good start
15:21:45 agreed
15:22:00 that means dirauths need to integrate a Tor client, but yeah, it's the obvious solution
15:22:15 Or it could be non-configurable, and just done in the source.
15:22:25 even better.
15:22:28 that might be better.
15:22:39 #action hardwire the required lists?
15:22:56 and recommended lists
15:23:48 nickm: in the proposal, you have a syntax "ANYOF(9-20)"
15:23:56 is that accidentally in there from an earlier revision?
15:24:05 or is that a thing that you meant to propose? or is it just a help for the reader?
15:24:11 accident.
15:24:19 #action nickm remove the ANYOF notation.
15:24:42 also, the proposal currently says
15:24:42 The directory authorities no longer allow versions of Tor before
15:24:43 0.2.4.8-alpha.
15:24:53 but i checked, and assuming we mean relays, it's now 0.2.4.18-rc
15:25:04 maybe we updated it since you wrote the proposal? maybe i disagree with your 0.2.4.8-alpha?
15:25:08 feel free to update that one?
15:25:11 ok
15:25:12 you're probably right
15:25:40 are there good alternatives to the 'kill it forever' behavior that has us worried?
15:26:12 a bad alternative is to discard a consensus that's way too old, so clients have a chance to recover eventually. that's not sufficiently better to be a good idea though.
15:26:25 (it *only* lets the zombies back in.
real users would give up.)
15:26:42 So, here's an issue with "disable via consensus"
15:26:54 when dirauths have to change keys, old clients cannot verify a consensus any longer
15:27:01 right. i mentioned this above. i agree.
15:27:03 so they can't trust that they should shut down
15:27:04 oh
15:27:05 sorry
15:27:11 > nickm: here's a third thought from just now: if the client doesn't like
15:27:11 +enough of the signatures on the consensus, it won't believe the kill switch.
15:27:11 +so this is a way for the zombie horde to accidentally return?
15:27:57 There could be a shutdown document
15:28:16 and my answer was "If you have a cached consensus that says shut down, you don't dl another one."
15:28:38 right. but if you slept in your crypt for long enough, and then woke up to be a zombie..
15:29:05 (or if some jerk deploys 5M obsolete tor clients...)
15:29:37 I find the jerk argument unconvincing
15:29:54 the "the only reason this client is now bad is because enough dirauths changed" more so
15:30:09 hrm.
15:30:21 and the jerk argument could be possibly fixed in #266
15:30:37 (a real jerk could always just deploy a ddos tool)
15:30:55 or disable the checks in the code :)
15:30:58 (the first few designs of the shutdown document that i thought of also rely on the client being able to validate the consensus)
15:31:12 (or some other set of "shutdown keys")
15:31:19 armadev: why?
15:31:27 We can sign the shutdown document with old keys
15:31:33 as we replace them
15:31:36 it's valid indefinitely
15:31:43 if the keys that sign it are compromised that's ok
15:32:08 if they're lost... but yes, this is probably a workable design, but i hope it is overkill.
15:33:03 ok. i think that the kill switch would address a big chunk of the problem, and having it be something that the code implements, not the flaky dir auth operators, seems safe enough to me.
15:33:32 long ago, we disabled our kill switch, because we didn't want to put that much trust in the authorities.
was that 'long ago' period back when a tor believes any authority, just one?
15:33:54 does anybody believe in the shutdown document idea strongly enough that they think somebody should specify it as another proposal?
15:34:00 armadev: I don't know; let's check!
15:34:21 No, the added complexity seems not worth it
15:34:33 We've been complicating this Tor thing a huge deal lately as it is :)
15:34:52 (i agree that shutdown documents are way overkill. let's focus on building the critical next things.)
15:35:43 but we should also note this edge case in the design (that an old enough client won't shut down)
15:35:55 36f055e7ee7975fa6982cdfef8409b7a303166c5 is where you made clients start exiting in 2003...
15:36:38 df3544422c35f85cc9990b78a3a5e3ec3c5b67a0 changed the rule in 2004...
15:38:49 and https://blog.torproject.org/blog/overhead-directory-info%3A-past,-present,-future gives an overview of the directory timeline
15:38:50 hm. while i'm investigating this, should we move on to talking about prop#266 ?
15:39:08 We introduced the v2 directory design in Tor 0.1.1.20 in May 2006.
15:39:15 i think we disabled our kill switch back in the v1 dir era.
15:39:23 which made it an even wiser idea then.
15:41:28 anybody want to summarize prop#266
15:41:33 sure
15:42:51 prop#266 describes ideas for ways to remove old clients from the network, ideally without having them turn into "fast zombies" or "slow zombies" which harm the network by consuming resources
15:43:23 so for #266, the best thing would probably be to remove support for their protocols, but is there any way to send the client a message that they are considered obsolete and need to upgrade to continue using the network?
15:43:54 grobbler_: this is what recommendclientversions does. it causes a warn log message and a warn controller event.
15:43:57 "fast zombies" are e.g.
clients who would try once per five minutes to connect, whereas "slow zombies" are those with some degradation of connection retry times
15:45:16 the three current ideas for getting rid of obsolete clients are 1) have the DirAuths change their ports 2) disabling old link protocols on new relays 3) abruptly changing the consensus format to make it unparseable to old clients
15:45:45 ----- END SUMMARY -----
15:46:31 more summary:
15:46:38 2 could eventually result in fast zombies.
15:47:04 1 would result in fast zombies that wouldn't do as much harm as they would if they were successfully connecting.
15:47:11 3 might work.
15:47:12 thought #1 is that we actually *have* had zombies in the past. we had a bug once where an old tor version had inverted logic, so it hammered the dir servers if it *did* have a new enough somethingerother. for years i would create new valid v2 networkstatuses for old-moria1 to appease them.
15:47:42 i would be reminded to do this appeasement when i got a mail from moria1's hoster saying i was being ddosed and could i please investigate.
15:48:02 3 sounds really scary, with all the parsing code elsewhere which would need to change
15:48:03 leaving the old dirport closed would make the zombies even more frantic.
15:48:10 e.g. in metrics-lib and stem
15:48:33 i also think option 3 is klunky. especially because we'd need to carry the old words in the consensus forever.
15:48:50 how about an option 4, where we serve a poison consensus on the old dirports, signed by the old keys, and valid forever?
15:50:01 i would like weasel's perspective here too since he lived through the zombie eras too.
15:51:20 (a poison consensus could be risky if somehow modern clients run across it and believe it. similar to the risky kill switch in 264.)
15:52:10 (we could rotate identity keys for enough dir auths that modern clients *won't* believe it.)
15:52:21 We could, once we implement a scheme to kill switch clients going forward, implement a new keyword that means "this is a poison consensus for old tor"
15:52:32 oh ho
15:52:44 Which I think I would want anyway
15:52:47 "if you recognize this shibboleth, do not exit"
15:52:53 The very existence of a poison consensus means that you could feed it to good clients.
15:53:02 nickm: and good clients never believe them
15:53:07 hmm.
15:53:10 nickm: what if it's signed only by identity keys that good clients don't believe?
15:53:24 that might work.
15:53:42 my v2 poison networkstatus was generated by old-old-moria1
15:53:48 no real risk to modern clients
15:54:09 i'll note that we also made this poison networkstatus for v1
15:54:23 where we continued to list a v1 networkstatus, but it listed zero relays
15:54:50 that's true, but at that point, no supported clients would even parse a v1 networkstatus
15:54:56 has anybody checked what old/current tor clients do when given a consensus with zero relays?
15:55:00 what would this poison networkstatus look like?
15:55:21 I hope they accept it and then do nothing. But probably we should check
15:55:24 it would look like a v3 consensus, signed with a bunch of obsolete keys, with no relays in it. assuming that appeases the zombies.
15:55:40 #action somebody check what happens when the target (bad) clients get an empty consensus
15:56:11 #action nickm or somebody flesh out the poison consensus idea and add it to the proposal
15:56:15 to be clear, i only just thought of this poison consensus idea now. maybe more thinking will yield an even better idea. :)
15:56:29 (but it is what we have already done, in the past, for this situation)
15:56:51 oh! weasel is quiet because he is being logged, isn't he.
15:56:58 yes.
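A minimal sketch of why a poison consensus signed only by obsolete identity keys could be believed by old clients but not by modern ones. It assumes a client accepts a consensus only when more than half of the authorities it trusts have signed it; that threshold, the function name, and the key names below are all assumptions made for illustration, and real signature verification is omitted.

```python
def believes(consensus_signers, trusted_authorities):
    """Return True if enough trusted authorities signed the consensus.

    Both arguments are collections of authority identity key names.
    The "more than half" threshold is an assumption of this sketch.
    """
    trusted = set(trusted_authorities)
    valid = set(consensus_signers) & trusted
    return len(valid) > len(trusted) / 2


# Hypothetical key sets: most identity keys were rotated between the
# old and the modern client; only "old3" is still trusted by both.
OLD_CLIENT_TRUSTS = {"old1", "old2", "old3"}
MODERN_CLIENT_TRUSTS = {"new1", "new2", "old3"}

# The poison consensus is signed with the obsolete keys only.
POISON_SIGNERS = {"old1", "old2", "old3"}

old_client_believes = believes(POISON_SIGNERS, OLD_CLIENT_TRUSTS)
modern_client_believes = believes(POISON_SIGNERS, MODERN_CLIENT_TRUSTS)
```

Under these assumptions the old client accepts the poison consensus while the modern client rejects it, which only holds if enough identity keys have been rotated.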
15:57:11 maybe he'll chime in on #tor-project :)
15:57:12 by "target (bad) clients" here, we're just going to believe the versions they report and assume, for example, that botnet clients aren't running some different code?
15:57:39 I do happen to believe those versions, since they do the corresponding nasty old handshakes
15:58:09 isis: it isn't even about versions. it's about, when clients that don't support new functionality come knocking, how we handle it.
15:58:22 (if the botnet dude upgraded all his clients to be valid, that is a different scenario.)
15:58:29 s/valid/modern/
15:59:04 prop#264 is my answer for how things _should_ work ; prop#266 is my attempt to figure out what we do in the absence of client-side support in existing tor versions.
15:59:25 wrt prop#266 -- how should we do the logistics?
15:59:30 nickm: speaking of a kill switch for old clients, this is not an acceptable option for us, but it was one that i pondered for a while: use an assert bug in their version of tor to make them go away.
15:59:58 it may be worth listing for completeness, in the 'not a viable option' section.
16:00:03 I have pondered that too, but it's the kind of thing I'd only consider in a dire emergency.
16:00:37 also, I am not aware of any assert bugs that affect only the versions I currently want to kill
16:01:11 do we remember exactly which version the bots are running?
16:01:40 not precisely. I think 0.2.2.1x but I don't remember for sure
16:02:47 0.2.3.x from https://blog.torproject.org/blog/tor-weekly-news-%E2%80%94-september-4th-2013
16:02:48 none of the proposed plans kill off 0.2.4.17-rc and earlier, surgically, right?
16:02:59 huh?
16:02:59 all of them need us to teach today's clients how to not die
16:03:02 yes.
16:03:09 0.2.3.25
16:03:18 isis: thanks.
16:03:26 and to teach them a new way to die -- specifically, via prop#264.
16:03:55 so, in the distant future, anybody who doesn't know how to die via prop264 will be killable in whatever way we choose for prop266.
16:04:22 so when we implement 264, that is the time to do the dir auth identity switch, or whatever we pick
16:04:37 Yes. Also we could add a version-based kill method if we think that prop#264 will be annoying to backport to everything we want to keep alive after the prop#266 singularity.
16:04:50 are there any proposed ideas that act faster than 'the distant future'? it's not clear we need that, but it would be good to consider.
16:05:59 I think a highly simplified series-based killswitch would be a good way to allow us to put prop#266 in place very soon if we think that prop#264 will be slow to implement.
16:07:04 can you elaborate on that?
16:07:17 sure. Suppose that we want prop#266 ready to roll ASAP.
16:07:35 So in Tor 0.2.4 and later, we add the following patches:
16:07:46 - Tolerance for poison consensuses.
16:08:19 - Recognition of keywords in the consensus like "shutdown-all-024" "shutdown-all-025" etc, and shutdown in response.
16:08:37 - Knowledge of how to get a non-poison consensus.
16:09:25 Then if we decide to 'pull the switch' on prop#266 and kill off old clients, any clients running 0.2.4 or later that have these patches won't get hit.
16:09:51 This is simpler than a full prop#264 implementation, and much much simpler than backporting a full prop#264 implementation
16:09:56 in this design, clients running today's stable version of 0.2.5, 0.2.6, 0.2.7 would all die. like, the ones in debian.
16:10:14 yup. We would need to make sure that these patches got into debian.
16:10:48 ok. i am increasingly convinced that we do not have a good way to quickly phase out current tors. let's continue not needing one. :)
16:11:02 hang on, I have to disagree.
16:11:34 I think we should design and build the thing I describe above, and hope we do not need to 'pull the switch' for a very long time.
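A minimal sketch of the series-based kill switch described above, in Python. Only the keyword form `shutdown-all-024` comes from the discussion; where the keyword would appear in a consensus (here: as the first word of a line), the mapping from a release series such as `0.2.4` to the suffix `024`, and the function name are assumptions made for illustration.

```python
def should_shut_down(consensus_text, my_series):
    """Return True if the consensus carries a shutdown keyword for our series.

    `my_series` is a release series string such as '0.2.4'; it is mapped to
    the keyword 'shutdown-all-024' by dropping the dots (an assumption).
    """
    keyword = "shutdown-all-" + my_series.replace(".", "")
    for line in consensus_text.splitlines():
        words = line.split()
        # Only match the keyword as the first token on a line, so that it
        # cannot be triggered by text embedded elsewhere in the document.
        if words and words[0] == keyword:
            return True
    return False


# Hypothetical consensus fragment used only for this example.
EXAMPLE_CONSENSUS = "network-status-version 3\nshutdown-all-024\n"
```

With this fragment, a client in the 0.2.4 series would shut down while a client in the 0.2.5 series would keep running.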
16:12:15 The sooner we pull the switch, the worse and uglier. But if we eventually need to switch in 2-5 years, building it now-ish will make it (I hope!) fairly painless to pull when it's needed.
16:13:05 ok. yes i agree.
16:13:26 #action add a "pulling the switch timeline" section summarizing the above discussion
16:13:45 ok. more discussion on these proposals ? Looks like there's a bunch of stuff pending here.
16:14:35 going once.....
16:14:41 in theory, all it takes is one ORPort that bad clients think is an authority, and that no good clients do,
16:15:07 and that's sufficient to poison the bad ones. assuming they actually believe long-lasting consensuses. do they? somebody should check.
16:15:45 so in this switch timeline we might want to intentionally move one dir auth to a new location, while still having long-term control over the old location
16:15:50 or two, for redundancy
16:16:07 #action nickm should document arma's new/old location idea above.
16:16:08 oh, but we need to also rotate enough identity keys
16:16:24 I'm not sure we need to ?
16:16:29 yes, i think we need some detail-oriented person to go through and see if our plans contradict themselves
16:16:39 stupid details
16:16:43 going once....
16:17:49 going twice...
16:19:18 #endmeeting