16:00:16 #startmeeting OONI weekly gathering 2016-09-12
16:00:16 Meeting started Mon Sep 12 16:00:16 2016 UTC. The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:34 hello there!
16:00:41 hello!
16:00:52 hello everyone
16:01:02 hellos!
16:01:13 so what do we have up for discussion this week?
16:02:14 so I guess we can dedicate some time to discussing some of the strategies for the IM blocking tests, since we cut that short last week
16:03:43 from last week's discussion it seems like we have come up with 3 possible paths to implementing these types of tests: 1. Doing reachability testing for the endpoints of the app (pros: low effort to implement, relatively easy to update when things change; cons: not accurate when the blocking is not simply hostname/ip:port based)
16:04:53 2. Doing something where we collect a PCAP of how the traffic of a given app works and replay the traffic towards our control servers (pros: medium effort to implement, once implemented it can easily be extended to an arbitrary app; cons: only accurate when blocking is protocol based)
16:05:58 would it make sense to implement a set of 2 tests per app, including both approaches?
16:06:03 3. Implementing the real protocol of the app and speaking it with the servers (pros: most accurate as it would tell us for sure when a given app is working or not working; cons: high effort to implement, needs to be implemented independently for every app, requires reversing of the app, need to update the implementation every time the app changes)
16:06:20 actually now I realise there was also approach 4
16:08:12 4. Set up a VPN on the probe end and route traffic from a real app in our censorship testbed (pros: medium effort to implement; cons: can be complex to scale this approach to many measurement clients, requires setting up dedicated serverside infrastructure to run the measurements)
16:08:24 agrabeli: afaict, approach 1 should be easy and the tough question is which of options 2-4 to choose as the best next step
16:09:08 what I plan on doing and have begun doing is implementing approach one for a selection of apps and while doing so researching how much effort it is to do 2, 3 or 4 for each of them
16:09:29 last week I spent a bit of time looking at how facebook messenger for mobile works
16:09:55 it looks like a lot of the protocol has already been reversed by the pidgin developers
16:10:08 so approach 2 sounds good, and perhaps it would be useful to implement that, in addition to approach 1 (which is relatively simple)
16:10:29 agrabeli: why approach 2 and not 3 or 4?
16:11:35 hellais: approaches 3 & 4 sound like they would lead to more accurate measurements. However:
16:12:16 in my opinion 3 and 4 are better especially if we could run them against real servers because in that case, if it works, we can be confident that a real app would also work, whereas with approach 2 we cannot be sure but we can most likely say something if we're blocked
16:12:27 approach 4: How can we set up a serverside infrastructure & could we realistically achieve this by the end of september?
16:12:28 it looks like however they have implemented a fairly small subset of the protocol and there are a lot of things that they have not really figured out, which makes me worried about how and when they could break
16:12:45 like this stuff: https://github.com/bitlbee/bitlbee-facebook/blob/609ca2d52d468863c99ff3539917f2049ea3df44/facebook/facebook-api.c#L840
16:14:05 hellais: What are the time constraints?
16:14:36 anadahz: well we should have completed some tests for some IM apps by the end of september
16:15:00 in the long-term, perhaps approach 4 is preferable to approach 3, because approach 3 requires reversing the protocols of apps, which will likely be challenging, plus constant updating.
16:15:00 I think it's realistic to assume we will have done only 1. for some selection of the apps by the end of september
16:15:47 hellais: ah ok (initially I thought you were asking about what to implement by the end of september, per otf deliverables)
16:15:58 I think that it makes sense to aim for 1, since all other implementations will require significantly more effort/time.
16:17:54 yeah I agree
16:18:12 the other thing is I guess what apps we should be prioritising
16:18:37 whatsapp, telegram, and viber?
16:18:41 Here is a list of some apps: https://gist.github.com/hellais/19ae8f69a9772ffb177d68aef9e834be
16:18:52 I am thinking of going through them in terms of userbase
16:18:57 though it would be useful to look at stats for which apps are used the most globally
16:19:02 hellais: +1
16:21:12 I would also add to the list the development effort per app.
16:21:35 anadahz: +1
16:21:42 anadahz: development effort of what?
16:22:11 To make an ooniprobe test for an IM app.
16:22:21 well to do 1. it's the same for every app
16:23:00 what I do is basically set up the app on a test phone, route all its traffic through a machine where I sniff the traffic. Do actions on the phone and collect pcaps for every action
16:23:24 hellais: Well different apps bootstrap in different ways and some may be way easier than others.
16:23:25 then I have some tshark scripts that extract the endpoints involved in DNS lookups and TCP and UDP connections
16:23:30 and map them into ooni tests
16:25:14 hellais: Usually looking at how commercial firewalls and DPI systems block these IM apps is a good source of info.
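[Editor's sketch] The endpoint-extraction step mentioned at 16:23:25 could look roughly like the following. The actual tshark scripts are not shown in this log, so this is only an illustration under assumptions: it assumes tshark was run with tab-separated field output, for example `tshark -r capture.pcap -T fields -e ip.dst -e tcp.dstport -e udp.dstport`, and the function name `parse_endpoints` and the file name `capture.pcap` are hypothetical.

```python
def parse_endpoints(lines):
    """Collect unique (protocol, ip, port) tuples from tshark field output.

    Each input line is assumed to hold three tab-separated fields:
    destination IP, TCP destination port, UDP destination port.
    Fields that do not apply to a packet are expected to be empty.
    """
    endpoints = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Pad short lines so unpacking always sees three fields.
        fields += [""] * (3 - len(fields))
        ip, tcp_port, udp_port = fields[:3]
        if not ip:
            continue  # not an IP packet, nothing to record
        try:
            if tcp_port:
                endpoints.add(("tcp", ip, int(tcp_port)))
            elif udp_port:
                endpoints.add(("udp", ip, int(udp_port)))
        except ValueError:
            # Skip values that are not a single port number.
            continue
    return sorted(endpoints)
```

The resulting list is the kind of input an approach 1 test needs: one entry per endpoint the app was seen talking to, with duplicates removed.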
16:25:30 anadahz: well we only test if the endpoints it connects to are reachable, as in I can establish a TCP session to them and see if it succeeds
16:26:07 anadahz: well that would require us to actually reproduce a stream that is similar to that of the app and that would fit within 2., which is much more effort
16:26:33 hellais: assuming we go down road 4, would this mean that each instance of lepidopter/ooni-probe/mk would be an exit point of a specific VPN we own and use for running tests through unmodified apps?
16:27:56 My understanding of how these firewalls work is that they have 2 main approaches: 1. Enumerate the endpoints the app connects to and block them all (ie iptables -A OUTPUT -d xxx -j DROP) 2. Detect the protocol based on some fingerprint, which can be arbitrarily complex, and then block it based on that. 1. and 2. can potentially be combined
16:28:16 sbs: yes, exactly
16:29:26 I don't actually even know how feasible this is, especially if we want to do it at layer 2/3
16:29:49 doing it at layer 4 may be more cross platform
16:29:57 but potentially more complex to implement
16:30:33 hellais: You don't need to reproduce a stream in most cases; iptables could help with testing.
16:31:03 anadahz: is there an easy way to know, for example, how a commercial firewall (say cisco) blocks whatsapp (i.e. is this documented somewhere)?
16:31:04 anadahz: can you elaborate on what you mean by that?
16:33:39 sbs: I haven't found complete documentation that enumerates all firewall rules (yet).
16:34:01 hellais: yes, off the top of my head I also cannot guess how this could work... I made a mental note of looking into how qemu slirp works and whether it could be helpful for us... another thing worth considering is that perhaps there are ways to make tunnels at layer 4
16:34:03 But something similar to: https://support.viber.com/customer/portal/articles/1506350-opening-ports-for-viber-desktop is available for all/most apps.
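[Editor's sketch] The approach 1 check described at 16:25:30 (try to establish a TCP session to each endpoint and see if it succeeds) can be illustrated as below. This is a minimal sketch using Python's standard `socket` module, not the ooniprobe implementation; the function names `tcp_reachable` and `check_endpoints` and the default timeout are assumptions made for illustration.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_endpoints(endpoints, timeout=5.0):
    """Map each (host, port) pair to whether a TCP session could be opened."""
    return {(host, port): tcp_reachable(host, port, timeout)
            for host, port in endpoints}


# Hypothetical usage; the endpoint below is a placeholder, not a real app endpoint:
#   results = check_endpoints([("im.example.org", 443)])
```

As noted in the discussion, a check like this only detects hostname/ip:port based blocking. A failed connection can also have causes other than censorship, so a result of False is a hint rather than proof.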
16:36:13 hellais: Creating a test network where you are blocking the TCP/UDP ports and the specific hostnames/IPs of the service in question.
16:37:53 If an ISP implements more sophisticated blocking, like packet matching, the 1st test approach will not be able to catch this blocking anyway.
16:38:40 anadahz: ah, you are talking about testing the test once it's implemented. Yeah sure that is something I do sometimes.
16:39:43 anadahz hellais: perhaps we can check whether there are bro rules for specific protocols... and also what is the signature used by tstat (http://tstat.polito.it/) to recognize stuff
16:40:47 Some IM apps are quite easy to block so I guess developing tests for these apps will not be that hard (in the 1st approach).
16:42:33 sbs: that tstat is quite interesting indeed.
16:43:17 sbs: it seems like they use some statistical analysis on certain features of packets to detect the type of a protocol.
16:44:28 http://www.clustrmaps.com/map/Tstat.polito.it
16:44:35 it seems like they get a lot of visits from south korea
16:44:43 I wonder if that means they are users of this software
16:46:37 lol, idk
16:49:20 well anyways do we have anything else to talk about here?
16:49:50 if not, are there any other topics we should be covering?
16:50:54 hellais: have we decided which approach we are going to proceed with?
16:51:25 to wrap up the previous discussion: I guess that (apart from approach 1) whether and to what extent to prioritize approaches 2-4 will depend on resources, time, and other factors. in other words, tbd down the line.
16:53:27 anadahz: I am going to implement 1. for whatsapp, facebook messenger and viber (and maybe another IM app as well). While doing so I will continue thinking of the best approach for doing a more extensive test, but defer that to a later stage
16:55:24 hellais: no telegram?
16:55:52 I guess I could also go for telegram, though it has a fairly small user share compared to the others
16:56:26 hellais: Wouldn't it make sense to add support for all web-based apps?
16:56:42 anadahz: I think the protocol is different when you go on the web
16:56:55 hellais: ack
16:57:46 sbs: if the port/host is blocked there will be no protocol negotiation
16:58:02 yeah, web based ones usually have a fairly different protocol and it's not the same as the mobile version
16:58:38 and sometimes this can even be accomplished with DNS hijacking
16:58:55 like facebook messenger uses b-api.facebook.com as an API endpoint for mobile apps, with a protocol based on something called MQTT
16:59:08 while the web version uses api.facebook.com and json
16:59:31 the workflow for enumerating the endpoints of the web based ones would also be different
16:59:47 I don't think this is as high priority as the actual IM apps on phones
17:01:23 hellais: Though a big percentage of users are using the desktop version as well, especially if they are not on a mobile connection.
17:02:43 I think even some old phones are using the web based version of the IM apps.
17:03:40 old phones: phone models that have no support for an app repository
17:04:19 in that case that is another case in and of itself
17:04:29 since it's the mobile version of the website
17:05:24 I think it makes sense to add checks for this within the facebook test
17:05:46 unsure if it's worth the trouble for any of the other web based things
17:06:04 since the basic check would already be part of running a web connectivity test
17:08:44 makes sense
17:09:44 yeah
17:09:56 any more things to talk about?
17:13:50 well in that case thanks for attending!
17:13:53 #endmeeting