13:59:08 <hellais> #startmeeting OONI Community Meeting 2020-02-25
13:59:08 <MeetBot> Meeting started Tue Feb 25 13:59:08 2020 UTC. The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:08 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
13:59:20 <tomcat> o/
13:59:49 <slacktopus> <agrabeli> <here> Hello everyone! Welcome to the February 2020 OONI Community Meeting. :) :ooni::party_parrot:
14:00:24 <slacktopus> <xhdix> :man-shrugging:,:):ooni::tada:
14:00:30 <slacktopus> <agrabeli> My name is Maria and I work with OONI. Please feel encouraged to introduce yourselves and/or wave. :)
14:00:57 <slacktopus> <saptak013> My name is Saptak and I started contributing to the OONI Explorer and API projects
14:01:20 <slacktopus> <sbs> My name is Simone and I maintain OONI's measurement engine
14:01:39 <slacktopus> <hellais> :wave:
14:01:59 <slacktopus> <belson> o/ I’m David Belson - I work for the Internet Society, focused on Internet measurements, and more specifically, tracking & measuring Internet shutdowns. In my spare time, I publish https://internetdisruption.report/
14:02:33 <slacktopus> <sina> :wave:
14:02:57 <slacktopus> <agrabeli> Your project is really great! :) Thanks for joining us today.
14:02:57 <slacktopus> <belson> :wave:
14:03:14 <slacktopus> <belson> Thanks!
14:03:27 <tomcat> 👋
14:04:16 <slacktopus> <agrabeli> So we have 4 agenda items for the meeting today: https://pad.riseup.net/p/ooni-community-meeting-keep
14:04:24 <slacktopus> <santiago> Hi! I am Santiago Narváez. I work as a researcher at R3D, a Mexican digital rights NGO, mainly on privacy and surveillance issues.
14:04:29 <slacktopus> <agrabeli> If there's anything additional you would like us to discuss today, please add it in the pad.
14:05:08 <slacktopus> <agrabeli> Let's get started with the first topic
14:05:26 <slacktopus> <agrabeli> #topic 1. Measuring throttling [xhdix]
14:05:46 <slacktopus> <agrabeli> @xhdix would you like to initiate the discussion?
14:07:11 <slacktopus> <xhdix> Iran's new style of Internet censorship is to increase international internet tariffs and bandwidth throttling. Which is growing more and more these days. How can it be measured?
14:08:16 <slacktopus> <tomcat> Yeah, I saw you post that screenshot on ooni-entropy.
14:10:10 <slacktopus> <hellais> I think it’s something which is quite tricky to measure broadly, because it requires, to some extent, being able to control some aspects of the target you are interested in measuring
14:10:47 <slacktopus> <hellais> For example in the above example, we would ideally have some control, or insight, into the server-side used by the github cloud
14:11:24 <slacktopus> <hellais> That said many of these services are reliant on existing cloud service providers, so maybe there is a way to get a foot in them and do some changes to a deployment to “look more like ${service_name}”
14:11:47 <slacktopus> <hellais> @xhdix do you know if this throttling is specific to things hosted on github or it’s in general all of AWS?
14:12:09 <slacktopus> <sbs> I have a broader question: what is throttling?
14:13:01 <slacktopus> <tomcat> I have the same problem accessing the Amazon S3 services in China... We call it "QoS".
14:13:08 <slacktopus> <sbs> (with this meaning: we need first to agree on what we want to measure)
14:13:47 <slacktopus> <tomcat> But I don't think it can be measured from users.
14:15:54 <slacktopus> <sbs> so, the symptom is clearly that the speed with which you access the internet is slow, but there is a bunch of other questions:
14:16:10 <slacktopus> <sbs> 1. slow compared to what? 2. slow for what reason?
14:16:44 <slacktopus> <sbs> the effect of scarce upstream connectivity and of traffic shaping is probably similar from the user's perspective
14:17:03 <slacktopus> <sbs> however, knowing what we assume we want to measure is generally better
14:17:21 <slacktopus> <sbs> also, OONI has generally an A/B methodology, so knowing better also helps in defining what is B
14:18:19 <slacktopus> <tomcat> (All traffic...)
14:18:20 <slacktopus> <sbs> another question is: 3. what is slow? is also browsing slow? is only bulk transfer slow? is streaming slow?
14:18:43 <slacktopus> <sbs> I believe we should probably collectively try to think of a reasoning framework to address these questions
14:20:03 <slacktopus> <sbs> speaking of solutions: the first thing that crosses my mind and that is the easy evolution path from where we are is that, if also browsing is slower, we can start measuring timing of requests, and we can maybe document what we see (still B is missing)
14:20:09 <slacktopus> <agrabeli> @sbs I imagine that part of the problem is that the answers to each of those questions vary from platform to platform, and from network to network
14:20:14 <slacktopus> <sbs> :zipper_mouth_face: now and listen
14:20:36 <slacktopus> <xhdix> 1. Access to a site whose server is in-country. Or just http/https. 2. everything. Or just something that @kbock was referring to.
14:21:19 <slacktopus> <sbs> This is not true in general. One thing is if there are middleboxes out there that selectively filter what you do and shape you accordingly. Another thing is if the choke point to exit the country is congested.
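The A/B idea sbs raises above (time requests to a target A, then compare against a baseline B) could be sketched roughly as follows. This is a minimal illustration, not how OONI Probe measures anything: the function names are hypothetical, and it assumes that A and B serve payloads of comparable size, that B is a resource you expect not to be throttled (for example an in-country server), and that a median over a handful of samples is a good enough summary.

```python
import http.client
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return the seconds needed to fetch url completely, or None on failure."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        # Covers URLError, timeouts, resets and truncated responses.
        return None
    return time.monotonic() - start


def slowdown_ratio(a_times, b_times):
    """Median fetch time of A (target) divided by median fetch time of B (baseline).

    Failed samples (None) are ignored; returns None if either side has no data.
    """
    a = [t for t in a_times if t is not None]
    b = [t for t in b_times if t is not None]
    if not a or not b:
        return None
    return statistics.median(a) / statistics.median(b)
```

A ratio well above 1 only says that A is slower than B from this vantage point. As discussed above, it does not by itself distinguish traffic shaping from a congested international link.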
14:21:45 <slacktopus> <sbs> In the former case you can do A/B depending on what you reach but the environment is more diverse
14:22:11 <slacktopus> <sbs> In the latter case, your B can probably be domestic and then this becomes a digital divide study rather than censorship _tout court_
14:22:29 <slacktopus> <agrabeli> right
14:22:49 <slacktopus> <sbs> an answer like "everything" does not help me/us understand :shrug:
14:23:57 <slacktopus> <sukhbir> I run a Wireguard server for some people in Iran. Wireguard is UDP-only, and I was running this over udp/443.
14:23:58 <slacktopus> <sbs> do you mean, @xhdix, that every resource outside the country looks slower than any resource inside, perhaps?
14:24:04 <slacktopus> <kbock> In our measurements certain ports seemed to be excluded from the degradation, but it seemed more-or-less universal in terms of which IPs are affected. (I can post some speedtest results we conducted from Iran to around the world in a bit)
14:24:32 <slacktopus> <sukhbir> No issues for a long time and everything was great. But now, the speeds are slow. And measurably slow. For someone who was getting 20Mbps on fast.com, they are getting a few Kbps now, just to give you an idea.
14:24:37 <slacktopus> <sbs> I also have no knowledge of what @kbock was referring to, so maybe it makes sense to either repeat it here or give pointers
14:24:44 <slacktopus> <kbock> @sbs I think this is a good way to look at it - we did speedtests to servers in Iran and servers outside of Iran, and found the ones immediately inside got speeds that were 20-100x faster
14:25:18 <slacktopus> <kbock> I’ll hop in - we’ve been trying two things, the mass traffic degradation and the protocol whitelisting
14:25:45 <slacktopus> <kbock> They seem to be separate systems, but I also want to put the whitelister on OONI’s radar as something worth measuring
14:26:11 <slacktopus> <sukhbir> Kevin, maybe you can share what you have looked at and found so far? (Assuming that's fine)
14:26:55 <slacktopus> <kbock>
14:26:58 <slacktopus> <kbock> Yeah certainly
14:27:03 <slacktopus> <xhdix> The speed of all new traffic drops sharply after about a day.
14:27:23 <slacktopus> <kbock> So we ran speedtest to a couple hundred servers located geographically around the world
14:27:57 <slacktopus> <kbock> We ran speedtest to a couple hundred servers located geographically around the world from the vantage points we had access to in Iran
14:28:30 <slacktopus> <kbock> the y-axis of this graph is the Mbps, and the x-axis is the relative distance between the Iran vantage point and the speedtest server. So, as the number increases, so does the distance to the speedtest server
14:28:54 <slacktopus> <kbock> The first three dots are locations within Iran - once the traffic left Iran, it got slowed significantly
14:28:55 <slacktopus> <sbs> what is the measurement unit of the distance? milliseconds?
14:30:18 <slacktopus> <kbock> `km` I believe
14:30:38 <slacktopus> <kbock> Another collaborator helped work on that - I’ll get him in here to talk to the graph more directly
14:30:41 <slacktopus> <xhdix> _(I had forgotten that meeting and my mind is not focused on this. )_
14:30:45 <slacktopus> <kbock> The second thing we measured was a newish behavior of protocol whitelisting
14:31:18 <slacktopus> <kbock> On ports 53, 80, and 443, Iran deployed (and this appears to still be active) a whitelister. It is not bidirectional - it only affects traffic leaving Iran.
14:31:53 <slacktopus> <kbock> It works by checking the first two packets of your connection; if one of them matches one of its protocol fingerprints, the flow is left alone; if neither does, all remaining packets in the flow from the client are dropped. (packets from the server are unaffected)
14:32:25 <slacktopus> <kbock> It appears like this was Iran’s strategy for censorship leading up to the election - apply a ton of traffic degradation to all UDP and TCP (except for those ports), and then aggressively whitelist those ports
14:32:28 <slacktopus> <hellais> ^ this is a very interesting aspect (cc @sbs)
14:32:42 <slacktopus> <kbock> We’ve identified the fingerprints for DNS-over-TCP, HTTP, and HTTPS (and can share those if people are interested)
14:33:18 <slacktopus> <kbock> The whitelister can be beaten by just injecting that fingerprint into the start of your connection, so we could run a VPN on 53, 80, or 443 without degradation just by injecting that fingerprint into the initial connection
14:34:05 <slacktopus> <kbock> I don’t want to totally derail the meeting - sorry for the wall of text :) Happy to answer any additional questions people have, and we’re writing up a report on this now that we intend to share with this group ASAP
14:34:34 <slacktopus> <lhm123> @sbs it’s km but on a relative scale, meaning that the distance between the vantage point and speedtest 2 server is greater than the distance between vantage point and speed test 3 server. The distance is also not linear, so the distance between any two consecutive speedtest servers can vary significantly
14:35:20 <slacktopus> <lhm123> We will produce another graph that scales with the distance (km) as well as the ping time
14:36:40 <slacktopus> <sbs> ok, cool
14:36:56 <slacktopus> <sbs> thanks for explaining the findings
14:37:19 <slacktopus> <kbock> The whitelister is also relatively easy to trigger, so it shouldn’t be too much effort to write a test for it so we can identify quickly if other countries roll out similar software in the future
14:38:05 <slacktopus> <sbs> do you control servers in country? there are experiments that it may be interesting to perform where we control the servers
14:38:47 <slacktopus> <sbs> in this first stage it seems to me there is a bunch of research questions for which it's best not to commit to writing a full-fledged test; rather, I see quite some opportunity for collaborating on running measurements to understand what such a test could be
14:39:07 <slacktopus> <kbock> We do still have access to servers and can perform some experiments if you’re interested - the degradation seems to have already backed down, but the whitelister is still in effect.
14:39:18 <slacktopus> <sbs> I certainly now have a better understanding of some of the dimensions of the problem, thanks @kbock, @lhm123, and @xhdix !
14:40:17 <slacktopus> <sbs> I was mainly thinking about the degradation: I am not super positive wrt using speedtest and not super happy about not having packet captures and/or TCPInfo collected at the server
14:41:10 <slacktopus> <agrabeli> With regards to measuring throttling more broadly -- and going back to the questions that @sbs raised -- there will also be the opportunity to discuss this more analytically in person at the Internet Freedom Festival (IFF) in Valencia in April.
14:41:16 <slacktopus> <sbs> I have written a test that collects very useful and detailed information wrt downloads, which is called `ndt7`, but the most important bits of information are collected on the downloader side
14:41:37 <slacktopus> <sbs> if upload "throttling" is a problem, then this is much easier to measure :-)
14:41:46 <slacktopus> <agrabeli> More specifically, we plan to host a session on throttling as part of the Internet Measurement Village that we're co-organizing during the last 2 days of the IFF (this session will be run by the folks at M-Lab).
14:42:25 <slacktopus> <agrabeli> In the interest of time, let's proceed to the 2nd topic of the agenda
14:42:41 <slacktopus> <agrabeli> #topic 2. Is there any way to measure IP blocking? Specifically, I am interested if it is possible to check if connections to particular IPs are being blocked (not domains)
14:42:53 <slacktopus> <agrabeli> ^^ Can the person who proposed this please elaborate? :)
14:43:41 <slacktopus> <sukhbir> Ha, thanks.
14:43:41 <slacktopus> <sbs> no worries, and thanks for bringing kevin into the conversation as well as for prodding us to think about this problem
14:44:31 <slacktopus> <sbs> this is a novel problem for me, in the sense that I don't know very much about what is going on, for this reason I was asking for more precise information
14:44:56 <slacktopus> <sukhbir> I am wondering if it is possible to measure if a particular IP (or IP block) is banned, without querying the domain first. I understand that if you test foobar.com, it checks if it can connect to it and measures the process along the way, but I am wondering if it is possible to have a test that says: is it possible to "connect" to some particular IP without specifying the domain?
14:45:35 <slacktopus> <sukhbir> The use-case I have in mind is I was reading the report about China (https://ooni.org/post/2019-china-wikipedia-blocking/) and noticed the screenshot: https://ooni.org/post/2019-china-wikipedia-blocking/firefox-3.png
14:46:04 <slacktopus> <sukhbir> I am interested if there is a way to check this via OONI versus checking from the browser.
14:47:09 <slacktopus> <hellais> We used to have a test which just did a simple reachability test for an IP and port combination called tcp_connect, but we are phasing that out in the latest version of OONI Probe
14:47:49 <slacktopus> <hellais> I have the impression that the thing you are trying to measure is probably still connected to the reachability of a particular site, though, and I suspect we are going to do a better job at measuring that within the scope of the web_connectivity test
14:48:14 <slacktopus> <tomcat> The problem is "IP-blocking" could be server-side blocking. For example, a server can use iptables to drop the traffic it doesn't want. Is there a way to tell them apart?
14:48:20 <slacktopus> <hellais> One thing which we are discussing is the possibility, in the case of a DNS resolution failure, of using other options for DNS resolution such as DoH
14:48:48 <slacktopus> <hellais> In that hypothesis the test would be able to measure the reachability of that IP, even in the case of DNS injection
14:48:52 <slacktopus> <sukhbir> Yeah that makes sense. This is a very specific use-case so I guess the standard tests are probably better.
14:49:26 <slacktopus> <hellais> Is it correct that the type of problem you are trying to solve is that of measuring the reachability of a particular IP associated with a website, even in the case of DNS based blocking?
14:49:45 <slacktopus> <sukhbir> Even in the case of DNS and SNI.
14:50:07 <slacktopus> <tomcat> I think blocktest.io is a good way to test DNS based blocking.
14:50:10 <slacktopus> <sukhbir> Like the report for China, where we found that Wikipedia is being blocked by DNS and SNI but the IP itself (or range) is not blocked.
14:50:16 <slacktopus> <hellais> Right
14:50:32 <slacktopus> <hellais> Maybe @sbs has some thoughts on this too
14:51:02 <slacktopus> <tomcat> eg. https://www.blocktest.io/injection/wikipedia.org
14:51:52 <slacktopus> <sbs> I was distracted by a side conversation, so let me read the backlog first
14:52:19 <tomcat> Maybe OONI should use a similar way to test DNS injection?
14:52:32 <slacktopus> <sukhbir> @tomcat: I have never looked at blocktest, but for DNS, we have sufficient data from OONI, at least for my particular requirement.
14:53:08 <slacktopus> <xhdix> Thank you for helping ;) (_bad sectors.._. :P )
14:54:15 <slacktopus> <sukhbir> We can discuss this later as well, I don't want this to take up the meeting. Apologies. Just wanted to discuss this in case it had not come up before.
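The kind of check sukhbir asks about under topic 2 (can a TCP connection to a given IP and port be established, with no domain involved) might look roughly like the sketch below. It is only an illustration in the spirit of the tcp_connect test hellais mentions, not OONI's implementation; the function name, the return shape, and the failure labels are assumptions made for this example.

```python
import socket


def tcp_connect(ip, port, timeout=5.0):
    """Attempt a TCP handshake to ip:port.

    Pass an IP literal (not a hostname) so that no DNS query is involved.
    Returns (True, None) on success, or (False, reason) on failure.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True, None
    except socket.timeout:
        # No answer at all: packets may be dropped somewhere on the path.
        return False, "timeout"
    except ConnectionRefusedError:
        # A RST came back, from the server or from something in between.
        return False, "connection_refused"
    except OSError as exc:
        return False, f"error: {exc}"
```

As tomcat points out, a failure seen from a single vantage point cannot tell network-level blocking apart from filtering on the server itself (for example with iptables), so the same check would also need to be run from a control vantage point and the two results compared.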
14:55:28 <slacktopus> <agrabeli> @sukhbir no need to apologize, thanks for bringing this up! It's useful feedback to know that a test similar to tcp_connect is needed.
14:55:41 <slacktopus> <hellais> @tomcat do you know who runs blocktest.io? Are there docs somewhere explaining how it works?
14:55:59 <slacktopus> <tomcat> He is in the channel.
14:56:01 <slacktopus> <sukhbir> I also need to run to another meeting at 10, so sorry for not following up on this but I will read it later :)
14:56:26 <slacktopus> <tomcat> Wait a minute.
14:57:03 <slacktopus> <sukhbir> I am getting a 404 for blocktest. Is it just me?
14:57:26 <slacktopus> <tomcat> @hellais It is run by @fortuna.
14:57:38 <slacktopus> <agrabeli> me too
14:58:14 <slacktopus> <tomcat> https://openobservatory.slack.com/archives/C38CT1JDD/p1572129781212500?thread_ts=1572129781.212500
14:58:39 <slacktopus> <tomcat> He mentioned it on the channel before.
14:58:51 <slacktopus> <hellais> Aah ok, it did seem familiar
14:59:37 <slacktopus> <hellais> But yeah I think there is some value in having some more lightweight way to collect data without installing anything, but I suspect it will never be a true replacement for something like OONI Probe which can collect much richer data
15:00:24 <slacktopus> <agrabeli> There are trade-offs in each case...
15:00:42 <slacktopus> <agrabeli> As we're running late, let's proceed to the 3rd topic
15:00:48 <slacktopus> <agrabeli> #topic 3. I need help with mining data from december 2019 for tests that measure tor blocking please. [sannh]
15:01:25 <slacktopus> <agrabeli> @santiago thanks for working with OONI data! :) Can you please share more information on the challenges you're encountering?
15:01:37 <slacktopus> <sbs> One way of doing that is certainly using Web Connectivity. I think we were not phasing out TCP Connect, as I planned to implement it in the new Go engine and it's trivial at this point. The main question to be asked is actually how the information should flow from the need to measure a resource to a specific test, for which using Web Connectivity may be more convenient as well as more scope creepy. cc: @hellais @sukhbir
15:02:47 <slacktopus> <santiago> Sure, my question is super basic: how can I mine data from a specific month? I tried this code without success "aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl/ ./ooni-data/ --exclude "*" --include "201912*-MX-*-meek_fronted_requests_test-*" --recursive "
15:04:00 <slacktopus> <santiago> Thanks a lot and sorry for the basic stuff, I don't really code :)
15:04:01 <slacktopus> <hellais> @santiago what question are you trying to answer with the OONI data?
15:04:27 <slacktopus> <santiago> If there is blocking of Tor in MX
15:04:45 <slacktopus> <hellais> It may be useful if we have a chat about this after the meeting
15:05:32 <slacktopus> <santiago> Sure thing
15:06:04 <slacktopus> <hellais> Generally speaking this type of batch analysis is easier to do using an instance of the OONI MetaDB: https://github.com/ooni/sysadmin/blob/master/docs/metadb-sharing.md
15:06:22 <slacktopus> <sukhbir> sbs, sorry, in a meeting now. Will respond later. Thanks for your thoughts.
15:06:37 <slacktopus> <agrabeli> It may also be possible to answer this question through OONI Explorer: http://explorer.ooni.org/
15:08:06 <slacktopus> <hellais> @santiago I have to jump into another meeting now, but we can sync up on this later privately :)
15:08:57 <slacktopus> <santiago> Great, thanks!
15:09:09 <slacktopus> <agrabeli> @sina I have moved your proposed topic to the next community meeting agenda due to lack of time today, but ofc feel free to discuss this topic on this channel asynchronously in the meanwhile
15:10:11 <slacktopus> <agrabeli> Thank you everyone for joining us today! If there are other topics you'd like to discuss, please feel encouraged to do so on this channel, or add them to the pad for next month's community meeting: https://pad.riseup.net/p/ooni-community-meeting-keep
15:10:21 <slacktopus> <sina> Sure no worries.
15:10:33 <slacktopus> <agrabeli> Apologies!
15:11:21 <slacktopus> <agrabeli> Hope you all have a great day/night! Thanks again for joining us today! :) :ooni:
15:11:24 <hellais> #endmeeting
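The whitelister behaviour kbock describes under topic 1 (inspect the first two client packets on ports 53, 80 and 443; if one matches a protocol fingerprint the flow is left alone, otherwise every later client packet is dropped) can be modelled as a toy function. Everything specific here is an assumption: the fingerprints are simplified placeholders invented for illustration and are not the ones kbock's team identified, "packet" is taken to mean a payload-carrying packet from the client, and the function name is hypothetical.

```python
# Placeholder fingerprints, invented for illustration only.
# They are NOT the fingerprints identified by kbock's team.
HYPOTHETICAL_FINGERPRINTS = {
    "http": lambda p: p.startswith((b"GET ", b"POST ", b"HEAD ")),
    "tls": lambda p: len(p) >= 3 and p[0] == 0x16 and p[1] == 0x03,
}

# Ports on which the whitelister was reported to be active.
WHITELISTED_PORTS = {53, 80, 443}


def client_packets_delivered(port, client_payloads, inspect_first=2):
    """Return the client payloads a whitelister of this kind would let through.

    Server-to-client packets are not modelled, since they were reported
    to be unaffected.
    """
    if port not in WHITELISTED_PORTS:
        # The whitelister does not apply here; the separate traffic
        # degradation system is outside the scope of this model.
        return list(client_payloads)
    head = client_payloads[:inspect_first]
    matched = any(
        fingerprint(payload)
        for payload in head
        for fingerprint in HYPOTHETICAL_FINGERPRINTS.values()
    )
    if matched:
        return list(client_payloads)
    # No fingerprint in the inspected packets: everything after them is dropped.
    return list(client_payloads[:inspect_first])
```

The model also shows why the bypass kbock mentions works: a flow whose first packets carry an accepted fingerprint is passed through in full, whatever follows.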