16:00:16 <hellais> #startmeeting OONI weekly gathering 2016-09-12
16:00:16 <MeetBot> Meeting started Mon Sep 12 16:00:16 2016 UTC.  The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:16 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:34 <hellais> hello there!
16:00:41 <sbs> hello!
16:00:52 <anadahz> hello everyone
16:01:02 <agrabeli> hellos!
16:01:13 <hellais> so what do we have up for discussion this week?
16:02:14 <hellais> so I guess we can dedicate some time discussing some of the strategies for the IM blocking tests, since we cut short on that last week
16:03:43 <hellais> from last week's discussion it seems like we have come up with 3 possible paths to implementing these types of tests 1. Doing reachability testing for the endpoints of the app (pros: low effort to implement, relatively easy to update when things change cons: not accurate when the blocking is not simply hostname/ip:port based)
16:04:53 <hellais> 2. Doing something where we collect a PCAP of how the traffic of a given app works and replay the traffic towards our control servers (pros: medium effort to implement, once implemented it can easily be extended to an arbitrary app cons: only accurate when blocking is protocol based)
16:05:58 <agrabeli> would it make sense to implement a set of 2 tests per app, including both approaches?
16:06:03 <hellais> 3. Implementing the real protocol of the app and speaking it with the servers (pros: most accurate as it would tell us for sure when a given app is working or not working cons: high effort to implement, needs to be implemented independently for every app, requires reversing of the app, need to update the test every time the app changes)
16:06:20 <hellais> actually now I realise there was also approach 4
16:08:12 <hellais> 4. Setup a VPN on the probe end and route traffic from a real app in our censorship testbed (pros: medium effort to implement cons: can be complex to scale this approach to many measurement clients, requires setting up dedicated serverside infrastructure to run the measurements)
16:08:24 <sbs> agrabeli: afaict, approach 1 should be easy and the tough question is which of options 2-4 to choose as the best next step
16:09:08 <hellais> what I plan on doing and have begun doing is implementing approach one for a selection of apps and while doing so researching into how much effort it is to do 2,3 or 4 for each of them
16:09:29 <hellais> last week I spent a bit of time looking at how facebook messenger for mobile works
16:09:55 <hellais> it looks like a lot of the protocol has already been reversed by the pidgin developers
16:10:08 <agrabeli> so approach 2 sounds good, and perhaps it would be useful to implement that, in addition to approach 1 (which is relatively simple)
16:10:29 <hellais> agrabeli: why approach 2 and not 3 or 4?
16:11:35 <agrabeli> hellais: approaches 3 & 4 sound like they would lead to more accurate measurements. However:
16:12:16 <sbs> in my opinion 3 and 4 are better especially if we could run them against real servers because in that case, if it works, we can be confident that also a real app would work, whereas with approach 2 we cannot be sure but we can most likely say something if we're blocked
16:12:27 <agrabeli> approach 4: How can we set up a serverside infrastructure & could we realistically achieve this by the end of september?
16:12:28 <hellais> it looks like, however, they have implemented a fairly small subset of the protocol and there are a lot of things that they have not really figured out, which makes me worried about how and when it could break
16:12:45 <hellais> like this stuff: https://github.com/bitlbee/bitlbee-facebook/blob/609ca2d52d468863c99ff3539917f2049ea3df44/facebook/facebook-api.c#L840
16:14:05 <anadahz> hellais: Which are the time constraints?
16:14:36 <hellais> anadahz: well we should have completed some tests for some IM apps by the end of september
16:15:00 <agrabeli> in the long-term, perhaps approach 4 is preferable to approach 3, because approach 3 requires reversing the protocols of apps which will likely be challenging, plus constantly updating.
16:15:00 <hellais> I think it's realistic to assume we will have done only 1. for some selection of the apps by the end of september
16:15:47 <agrabeli> hellais: ah ok (initially I thought you were asking about what to implement by the end of september, per otf deliverables)
16:15:58 <anadahz> I think that it makes sense to aim for 1, since all other implementations will require significantly more effort/time.
16:17:54 <hellais> yeah I agree
16:18:12 <hellais> the other thing is I guess what apps we should be prioritising for
16:18:37 <agrabeli> whatsapp, telegram, and viber?
16:18:41 <hellais> Here is a list of some apps: https://gist.github.com/hellais/19ae8f69a9772ffb177d68aef9e834be
16:18:52 <hellais> I am thinking of going through them in terms of userbase
16:18:57 <agrabeli> though it would be useful to look at stats for which apps are used the most globally
16:19:02 <agrabeli> hellais: +1
16:21:12 <anadahz> I would also add to the list the development effort per app.
16:21:35 <agrabeli> anadahz: +1
16:21:42 <hellais> anadahz: development effort of what?
16:22:11 <anadahz> To make an ooniprobe test for an IM app.
16:22:21 <hellais> well to do 1. it's the same for every app
16:23:00 <hellais> what I do is basically set up the app on a test phone, route all its traffic through a machine where I sniff the traffic. Do actions on the phone and collect pcaps for every action
16:23:24 <anadahz> hellais: Well different apps bootstrap in a different way and some may be way easier than others.
16:23:25 <hellais> then I have some tshark scripts that extract the endpoints involved in DNS lookups and TCP and UDP connections
16:23:30 <hellais> and map them into an ooni test
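The endpoint extraction step described above could look roughly like the following. This is only a sketch: the actual tshark scripts are not shown in this log, so the tshark invocation in the comment, the choice of fields, and the name extract_endpoints are all assumptions made for illustration.

```python
# Assumed input: tab-separated lines produced by something like
#   tshark -r capture.pcap -T fields \
#       -e dns.qry.name -e ip.dst -e tcp.dstport -e udp.dstport
# (the invocation actually used for the tests is not given in the log).

def first_value(field):
    """tshark joins repeated occurrences of a field with commas; keep the first."""
    return field.split(",")[0]

def extract_endpoints(lines):
    """Collect DNS names and TCP/UDP destination endpoints from tshark output."""
    dns_names, tcp, udp = set(), set(), set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        fields += [""] * (4 - len(fields))  # pad short lines
        qname, dst, tcp_port, udp_port = (first_value(f) for f in fields[:4])
        if qname:
            dns_names.add(qname)
        if dst and tcp_port:
            tcp.add((dst, int(tcp_port)))
        elif dst and udp_port and not qname:
            # Skip the DNS packets themselves: their destination is the
            # resolver, not an endpoint of the app.
            udp.add((dst, int(udp_port)))
    return {"dns": sorted(dns_names), "tcp": sorted(tcp), "udp": sorted(udp)}
```

The resulting lists would then be the input for a reachability test: names to resolve, and host/port pairs to connect to.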
16:25:14 <anadahz> hellais: Usually looking at how commercial firewalls and DPI systems block these IM apps is a good source of info.
16:25:30 <hellais> anadahz: well we only test if the endpoints it connects to are reachable, as in I can establish a TCP session to them and see if it succeeds
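A minimal sketch of the reachability check described here, assuming a plain TCP connect per endpoint. It is not the ooniprobe test itself; the function name check_endpoints, the result strings, and the timeout value are hypothetical.

```python
import socket

def check_endpoints(endpoints, timeout=5.0):
    """Attempt a TCP connection to each (host, port) pair and record the outcome.

    Only the handshake is attempted: no application data is sent, so this
    detects hostname/ip:port blocking but not protocol based blocking.
    """
    results = {}
    for host, port in endpoints:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = "reachable"
        except OSError as exc:
            # Covers refused connections, timeouts and DNS resolution errors.
            results[(host, port)] = "failed: " + type(exc).__name__
    return results
```

It would be called with the endpoints collected from the pcaps, for example check_endpoints([("example.org", 443)]), where example.org stands in for a real app endpoint.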
16:26:07 <hellais> anadahz: well that would require us to actually reproduce a stream that is similar to that of the app and that would fit within 2., which is much more effort
16:26:33 <sbs> hellais: assuming we go down road 4, would this mean that each instance of lepidopter/ooni-probe/mk would be an exit point of a specific VPN we own and use for running tests through unmodified apps?
16:27:56 <hellais> My understanding of how these firewalls work is that they have 2 main approaches 1. Enumerate the endpoints the app connects to and block them all (ie iptables -A OUTPUT -d xxx -j DROP) 2. Detect the protocol based on some fingerprint, that can be arbitrarily complex and then block it based on that. 1. and 2. can potentially be combined
16:28:16 <hellais> sbs: yes, exactly
16:29:26 <hellais> I don't actually know even how feasible this is, especially if we want to do it at layer 2/3
16:29:49 <hellais> doing it layer 4 may be more cross platform
16:29:57 <hellais> but potentially more complex to implement
16:30:33 <anadahz> hellais: You don't need to reproduce a stream; in most cases iptables could help with testing.
16:31:03 <sbs> anadahz: is there an easy way to know, for example, how a commercial firewall (say cisco) blocks whatsapp (i.e. is this documented somewhere)?
16:31:04 <hellais> anadahz: can you elaborate on what you mean by that?
16:33:39 <anadahz> sbs: I haven't found complete documentation that enumerates all firewall rules (yet).
16:34:01 <sbs> hellais: yes, off the top of my head I also cannot guess how this could work... I took a mental note of looking into how qemu slirp works and whether it could be helpful for us... another thing worth considering is that perhaps there are ways to make tunnels at layer 4
16:34:03 <anadahz> But something similar to: https://support.viber.com/customer/portal/articles/1506350-opening-ports-for-viber-desktop is available for all/most apps.
16:36:13 <anadahz> hellais: Creating a test network where you are blocking the TCP/UDP ports and the specific hostnames/IPs of the service in question.
16:37:53 <anadahz> If an ISP implements more sophisticated blocking like packet matching, the 1st test approach will not be able to catch this blocking anyway.
16:38:40 <hellais> anadahz: ah, you are talking about testing the test once it's implemented. Yeah sure that is something I do sometimes.
16:39:43 <sbs> anadahz hellais: perhaps we can check whether there are bro rules for specific protocols... and also what is the signature used by tstat (http://tstat.polito.it/) to recognize stuff
16:40:47 <anadahz> Some IM apps are quite easy to block so I guess developing tests for these apps will not be that hard (in the 1st approach).
16:42:33 <hellais> sbs: that tstat is quite interesting indeed.
16:43:17 <hellais> sbs: it seems like they use some statistical analysis on certain features of packets to detect the type of a protocol.
16:44:28 <hellais> http://www.clustrmaps.com/map/Tstat.polito.it
16:44:35 <hellais> it seems like they get a lot of visits from south korea
16:44:43 <hellais> I wonder if that means they are users of this software
16:46:37 <sbs> lol, idk
16:49:20 <hellais> well anyways do we have anything else to talk here?
16:49:50 <hellais> if not, are there any other topics we should be covering?
16:50:54 <anadahz> hellais: have we decided which approach we are going to proceed with?
16:51:25 <agrabeli> to wrap up the previous discussion: I guess that (apart from approach 1) whether and to what extent to prioritize on approaches 2-4 will depend on resources, time, and other factors. in other words, tbd along the line.
16:53:27 <hellais> anadahz: I am going to implement 1. for whatsapp, facebook messenger and viber (and maybe another IM app as well). While doing so I will continue thinking of the best approach for doing a more extensive test, but defer that to a later stage
16:55:24 <sbs> hellais: no telegram?
16:55:52 <hellais> I guess I could also go for telegram, though it has a fairly small user share compared to the others
16:56:26 <anadahz> hellais: Wouldn't it make sense to add support for all web-based apps?
16:56:42 <sbs> anadahz: I think the protocol is different when you go on the web
16:56:55 <sbs> hellais: ack
16:57:46 <anadahz> sbs: if the port/host is blocked there will be no protocol negotiation
16:58:02 <hellais> yeah, web based ones usually have a fairly different protocol and it's not the same as the mobile version
16:58:38 <anadahz> and sometimes this can be even accomplished with DNS hijacking
16:58:55 <hellais> like facebook messenger uses b-api.facebook.com as an API endpoint for mobile apps and a protocol based on something called MQTT
16:59:08 <hellais> while the web version uses api.facebook.com and json
16:59:31 <hellais> the workflow for enumerating the endpoints of the web based ones would also be different
16:59:47 <hellais> I don't think this is as high priority as the actual IM apps on phones
17:01:23 <anadahz> hellais: Though a big percentage of users are using the desktop version as well, especially if they are not on a mobile connection.
17:02:43 <anadahz> I think even some old phones are using the web based version of the IM apps.
17:03:40 <anadahz> old phones: phone models that have no support for an app repository
17:04:19 <hellais> in that case that is also another case in and of itself
17:04:29 <hellais> since it's the mobile version of the website
17:05:24 <hellais> I think it makes sense to add checks for this within the facebook test
17:05:46 <hellais> unsure if it is worth the trouble for any of the other web based things
17:06:04 <hellais> since the basic check would already be part of running a web connectivity test
17:08:44 <anadahz> makes sense
17:09:44 <sbs> yeah
17:09:56 <hellais> any more things to talk about?
17:13:50 <hellais> well in that case thanks for attending!
17:13:53 <hellais> #endmeeting