How does Apple Private Relay Work?

What is Apple Private Relay?

Private Relay is an attempt by Apple to change the way traffic is routed from user to internet service and back. This is designed to break the relationship between a user's IP address and information about that user, reducing the digital footprint of that user and eliminating certain avenues of advertising information.

It is a new feature in the latest versions of iOS and macOS that will be launching in "beta mode". It is available to all users who pay Apple for iCloud storage. I became interested in it after watching the WWDC session about preparing for it.

TL;DR

Private Relay provides real value to users, but also fundamentally changes the way network traffic flows across the internet for those users. Network administrators, programmers and owners of businesses which rely on IP addresses from clients for things like whitelisting, advertising and traffic analysis should be aware of this massive change. It is my belief that this change is not getting enough attention in the light of the CSAM scanning.

What happens when you turn on Private Relay?

The following traffic is impacted by Private Relay:

  • All Safari web browsing
  • All DNS queries
  • All insecure HTTP traffic

Traffic from those sources will no longer take the normal route to their destination, instead being run through servers controlled by either Apple or its partners. They will ingress at a location close to you and then egress somewhere else, with an IP address known to be from your "region". In theory websites will still know roughly where you are coming from, but won't be able to easily combine that with other information they know about your IP address to enrich targeted advertisements. Access logs and other raw sources of data will also be less detailed, with the personally identifiable information that is your IP address no longer listed on logs for every website you visit.

Why is Apple doing this?

When you go to a website, you are identified in one of a thousand ways, from cookies to device fingerprinting. However, one of the easiest ways is through your IP address. Normal consumers don't have "one" IP address: they are either given one by their ISP when their modem comes online and asks for one, or their ISP has them behind "carrier-grade NAT". So normally what happens is that you get your modem, plug it in, it receives an IP address from the ISP and that IP address identifies you to the world.

Normally how the process works is something like this:

  1. Your modem's MAC address appears on the ISP's network and requests an IP address
  2. The ISP does a lookup for the MAC address, makes sure it is in the table and then assigns an IP, ideally the same IP over and over again so whatever cached routes exist on the ISP's side are still used.
  3. All requests from your home are mapped to a specific IP address and, over time, given the combination of other information about the browsing history and advertising data, it is possible to combine the data together to know where you live and who you are within a specific range.
  4. You can see how close the geographic data is by checking out the map available here. For me it got me within a few blocks of my house, which is spooky.

CGNAT

Because of IPv4 address exhaustion, it's not always possible to assign every customer their own IP address. You know you have a setup like this because the IP address your router gets is in the "private range" of IP addresses, but when you go to IP Chicken you'll have a non-private IP address.

Private IP ranges include:

  • 10.0.0.0 – 10.255.255.255
  • 172.16.0.0 – 172.31.255.255
  • 192.168.0.0 – 192.168.255.255

For those interested you can get more information about how CGNAT works here.
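
As a rough illustration of the check described above, here is a minimal Python sketch that tests whether an address falls inside one of the three private blocks listed. The function name `in_private_range` is made up for this example, and the CIDR blocks are just the listed ranges rewritten in prefix notation.

```python
import ipaddress

# The three private IPv4 blocks listed above, in CIDR notation.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_private_range(address: str) -> bool:
    """Return True if the address falls inside one of the listed private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)
```

If the WAN address shown on your router passes this check while a site like IP Chicken reports something else, that is a strong hint you are behind CGNAT. Note that some carriers use 100.64.0.0/10, a block reserved for CGNAT in RFC 6598, which is not part of the list above and so would not match here.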

Doesn't my home router do that?

Yeah, so your home router kind of does something similar with its own IP address ranges. So next time a device warns you about "double-NAT" this might be what it is talking about, basically nested NAT. (Most often double-NAT is caused by your modem also doing NAT though.) Your home router runs something called PAT, or NAT overload. I think more often it is called NAPT in modern texts.

This process is not that different from what we see above. One public IP address is shared and the different internal targets are identified with ports. Your machine makes an outbound connection, your router receives the request and rewrites the packet with a random high port. Every outbound connection gets its own entry in this table.
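
To make the table idea concrete, here is a toy Python sketch of the translation step. It is a simplification under stated assumptions: the class name `NaptTable` and the port range are invented for illustration, and a real router also keys its entries on protocol and destination and expires them over time.

```python
import random


class NaptTable:
    """Toy NAPT table: maps (internal IP, internal port) to a port on one shared public IP."""

    def __init__(self, public_ip, rng=None):
        self.public_ip = public_ip
        self.rng = rng or random.Random()
        self.outbound = {}  # (internal_ip, internal_port) -> public_port
        self.inbound = {}   # public_port -> (internal_ip, internal_port)

    def translate_outbound(self, internal_ip, internal_port):
        """Rewrite an outgoing connection's source to the public IP and a high port."""
        key = (internal_ip, internal_port)
        if key not in self.outbound:
            # Pick a random high port that is not already in use.
            port = self.rng.randint(49152, 65535)
            while port in self.inbound:
                port = self.rng.randint(49152, 65535)
            self.outbound[key] = port
            self.inbound[port] = key
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Find which internal machine a reply belongs to, or None if there is no entry."""
        return self.inbound.get(public_port)
```

Two laptops using the same source port end up on different public ports, which is how replies find their way back to the right machine.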


IP Exposed

So during the normal course of using the internet, your IP address is exposed to the following groups:

  • Every website or web service you are connecting to
  • Your DNS server also can have a record of every website you looked up.
  • Your ISP can obviously see where every request to and from your home went to

This means there are three groups of people able to turn your request into extremely targeted advertising. The most common one I see using IP address is hyper-local advertising. If you have ever gotten an online ad for a local business or service and wondered "how did they know it was me", there is a good chance it was through your IP.

DNS is one I think is often forgotten in the conversation about leaking IPs, but since it is a reasonable assumption that if you make a DNS lookup for a destination you will go to that destination, it is as valuable as the more invasive systems without requiring nearly as much work. Let's look at one popular example, Google DNS.

Google DNS

Turkish protesters relied on Twitter and used Google DNS once it was blocked

The famous 8.8.8.8. Google DNS has become famous because of the use of DNS around the world as a cheap and fast way to block network access for whole countries or regions. A DNS lookup is just what turns domain names into IP addresses. So for this site:

➜  ~ host matduggan.com
matduggan.com has address 67.205.139.103
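
The same kind of lookup can be done from code. This is a minimal Python sketch using the standard library's `socket.getaddrinfo`, which asks whatever resolver the operating system is configured with (usually the ISP's, as discussed below). The helper name `resolve` is made up for this example.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses the system resolver gives for hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Calling `resolve("matduggan.com")` sends the question to your configured DNS server, which is exactly the point: whoever runs that server sees the name you asked for and the IP address you asked from.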

Since DNS servers are normally controlled by ISPs and subject to local law, it is trivial for your country's leadership to block users from getting to Twitter by simply blocking lookups to twitter.com. DNS is a powerful service that is normally treated as an afterthought. Alternatives came up, the most popular being Google DNS. But is it actually more secure?

Google asserts that they only store your IP address for 24-48 hours in their temporary logs. When they migrate your data to their permanent DNS logs, they remove the IP address and replace it with region data. So instead of being able to drill down to your specific house, they will only be able to tell your city. You can find more information here. I consider their explanation logical and think they are certainly more secure when compared to a normal ISP DNS server.

Most ISPs don't offer that luxury, simply prefilling their DNS servers when you get your equipment from them and add it to the network. There is very little information about what they are doing with it that I was able to find, but they are allowed now to sell that information if they so choose. This means the default setting for US users is to provide an easy to query copy of every website their household visits to their ISP.

Most users will not take the proactive step of switching their DNS servers to ones provided by Google or other parties, so that information is simply being shared with whoever has access to the ISP's DNS server.

NOTE: If you are looking to switch your DNS servers off your ISP, I recommend dns.watch. I've been using them for years and feel strongly they provide an excellent service with a minimum amount of fuss.

How does Private Relay address these concerns?

1. DNS

This is how a normal DNS lookup works.

Apple and Cloudflare engineers have proposed a new standard, which they discuss in their blog post here. Oblivious DNS over HTTPS (ODoH), which builds on the earlier ODNS or "oblivious DNS" design, is a system which allows clients to mask the originator of the request from the server making the lookup, breaking the IP chain.

This is what ODNS looks like:

Source: Princeton paper

This is why all DNS queries are getting funneled through Private Relay, removing the possibility of ISP DNS servers getting this valuable information. It is unclear to me in my testing if I am using Apple's servers or Cloudflare's 1.1.1.1 DNS service. With this system it shouldn't matter in terms of privacy.
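
The core idea of an oblivious lookup is that no single party sees both who asked and what they asked for. The toy Python model below shows only that split of knowledge. Everything in it is an assumption for illustration: the class names are invented, the shared-key XOR "sealing" is a stand-in for the real public-key encryption and is not secure, and none of it reflects how Apple or Cloudflare actually implement the protocol.

```python
from dataclasses import dataclass, field


def toy_seal(key: bytes, data: bytes) -> bytes:
    """XOR stand-in for real encryption. NOT secure, illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


toy_open = toy_seal  # XOR is its own inverse


@dataclass
class Resolver:
    key: bytes
    records: dict
    seen: list = field(default_factory=list)

    def handle(self, sender_ip, sealed):
        name = toy_open(self.key, sealed).decode()
        # The resolver learns the query name, but only the proxy's address.
        self.seen.append((sender_ip, name))
        return toy_seal(self.key, self.records[name].encode())


@dataclass
class Proxy:
    ip: str
    resolver: Resolver
    seen: list = field(default_factory=list)

    def forward(self, client_ip, sealed):
        # The proxy learns the client's address, but only an opaque blob.
        self.seen.append((client_ip, sealed))
        return self.resolver.handle(self.ip, sealed)


def lookup(client_ip, name, proxy, key):
    """Seal the query, send it via the proxy, and open the sealed answer."""
    sealed = toy_seal(key, name.encode())
    return toy_open(key, proxy.forward(client_ip, sealed)).decode()
```

After a lookup, `proxy.seen` holds the client IP next to unreadable bytes, while `resolver.seen` holds the domain name next to the proxy's IP. Linking the two would require the proxy and resolver to collude.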

2. Website IP Tracking

When on Private Relay, all traffic is funneled first through an Apple ingress service and then out through a CDN partner. Your client makes a lookup to one of these two DNS entries using our new fancy ODNS:

mask.icloud.com
mask-h2.icloud.com

This returns a long list of IP addresses for you to choose from:

mask.icloud.com is an alias for mask.apple-dns.net.
mask.apple-dns.net has address 172.224.41.7
mask.apple-dns.net has address 172.224.41.4
mask.apple-dns.net has address 172.224.42.5
mask.apple-dns.net has address 172.224.42.4
mask.apple-dns.net has address 172.224.42.9
mask.apple-dns.net has address 172.224.41.9
mask.apple-dns.net has address 172.224.42.7
mask.apple-dns.net has address 172.224.41.6
mask.apple-dns.net has IPv6 address 2a02:26f7:34:0:ace0:2909::
mask.apple-dns.net has IPv6 address 2a02:26f7:36:0:ace0:2a05::
mask.apple-dns.net has IPv6 address 2a02:26f7:36:0:ace0:2a07::
mask.apple-dns.net has IPv6 address 2a02:26f7:34:0:ace0:2904::
mask.apple-dns.net has IPv6 address 2a02:26f7:34:0:ace0:2905::
mask.apple-dns.net has IPv6 address 2a02:26f7:36:0:ace0:2a04::
mask.apple-dns.net has IPv6 address 2a02:26f7:36:0:ace0:2a08::
mask.apple-dns.net has IPv6 address 2a02:26f7:34:0:ace0:2907::

These IP addresses are owned by Akamai and are here in Denmark, meaning all Private Relay traffic first goes to a CDN endpoint. These are globally situated datacenters which allow companies to cache content close to users to improve response time and decrease load on their own servers. So then my client opens a connection to one of these endpoints using a new protocol, QUIC. Quick, get it? Aren't network engineers fun.

QUIC integrates TLS to encrypt all payload data and most control information. It's based on UDP for speed but is designed to replace TCP, the venerable protocol that requires a lot of overhead in terms of connections. By baking in encryption, Apple is ensuring a very high level of security for this traffic with a minimum amount of trust required between the partners. It also removes the loss recovery elements of TCP, instead shifting that responsibility to each QUIC stream. There are other advantages such as better shifting between different network providers as well.

So each user makes an insecure DNS lookup to mask.apple-dns.net, establishes a QUIC connection to the local ingress node and then that traffic is passed through to the egress CDN node. Apple maintains a list of those egress CDN nodes you can see here. However users can choose whether they want to reveal even city-level information to websites through the Private Relay settings panel.

If I choose to leave "Maintain General Location" checked, websites will know I'm coming from Copenhagen. If I select "Country and Time Zone", you just know I'm coming from Denmark. The traffic will appear to be coming from a variety of CDN IP addresses. You can tell Apple very deliberately did not want to offer any sort of "region hopping" functionality like users require from VPNs, letting you access things like streaming content in other countries. You will always appear to be coming from your country.
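
For anyone who relies on client IP addresses, the practical use of the egress list is checking whether a visitor's address belongs to Private Relay. Here is a minimal Python sketch of that check. The rows below are made-up sample data using documentation-only prefixes, and the column layout (prefix, country, region, city) is an assumption, so verify it against the real list before relying on it.

```python
import csv
import io
import ipaddress

# Made-up rows using documentation-only prefixes; the column layout is an assumption.
SAMPLE_EGRESS_CSV = """\
192.0.2.0/24,DK,DK-84,Copenhagen,
2001:db8:1::/48,DK,DK-84,Copenhagen,
198.51.100.0/24,US,US-CA,Los Angeles,
"""


def load_egress_ranges(text):
    """Parse CSV rows into (network, country, city) tuples."""
    ranges = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        ranges.append((ipaddress.ip_network(row[0]), row[1], row[3]))
    return ranges


def find_egress(address, ranges):
    """Return (country, city) if the address is inside a listed range, else None."""
    ip = ipaddress.ip_address(address)
    for network, country, city in ranges:
        if ip in network:  # an IPv4 address never matches an IPv6 network, and vice versa
            return country, city
    return None
```

A linear scan is fine for a sketch. A long list checked on every request would want something faster, such as a prefix trie.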

3. ISP Network Information

Similar to how the TOR protocol works, this will allow you to effectively hide most of what you are doing. To the ISP your traffic will simply be going to the CDN endpoint closest to you, with no DNS queries flowing to them. Those partner CDN nodes lack the complete information to connect your IP address to the request to the site. In short, it should make the information flowing across their wires much less valuable from an advertising perspective.

In terms of performance hit it should be minimal, unlike TOR. Since we are using a faster protocol with only one hop (CDN 1 -> CDN 2 -> Destination) as opposed to TOR, in my testing it's pretty hard to tell the difference. While there are costs for Apple to offer the service, by limiting the traffic to just Safari, DNS and HTTP traffic they are greatly limiting how much raw bandwidth will pass through these servers. Most traffic (like Zoom, Slack, Software Updates, etc) will all be coming from HTTPS servers.

Conclusion

Network operators, especially with large numbers of Apple devices, should take the time to read through the QUIC management document. Since the only way Apple is allowing people to "opt out" of Private Relay at a network level is by blocking DNS lookups to mask.icloud.com and mask-h2.icloud.com, many smaller shops or organizations that choose to not host their own DNS will see a large change in how traffic flows.
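
For operators who do decide to opt a network out, the decision logic inside a resolver is small. This Python sketch shows only the matching step. The function name `should_block` is hypothetical, and how the resolver then answers (for example with a negative response) depends on your DNS software, so check Apple's guidance before deploying anything like it.

```python
# The two hostnames named above as the network-level opt-out switch.
RELAY_HOSTNAMES = {"mask.icloud.com", "mask-h2.icloud.com"}


def should_block(query_name: str) -> bool:
    """Return True if a resolver enforcing the opt-out should refuse this lookup."""
    # DNS names are case-insensitive and may carry a trailing root dot.
    normalized = query_name.rstrip(".").lower()
    return normalized in RELAY_HOSTNAMES
```

The match is exact on purpose: blocking all of icloud.com would break far more than Private Relay.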

For those that do host their own DNS, users receive an alert that you have blocked Private Relay on the network. This is to caution you in case you think that turning it off will result in no user complaints. I won't presume to know your requirements, but nothing I've seen on the spec document for managing QUIC suggests there is anything worth blocking from a network safety perspective. If anything, it should be a marginal reduction in the number of packets flowing across the wire.

Apple is making some deliberate choices here with Private Relay and for the most part I support them. I think it will hurt the value of some advertising and I suspect that for the months following its release the list of Apple egress nodes will confuse network operators on why they are seeing so much traffic from the same IP addresses. I am also concerned that eventually Apple will want all traffic to flow through Private Relay, adding another level of complexity for teams attempting to debug user error reports of networking problems.

From a privacy standpoint I'm still unclear on how secure this process is from Apple. Since they are controlling the encryption and key exchange, along with authenticating with the service, it seems obvious that they can work backwards and determine the IP address. I would love for them to publish more whitepapers or additional clarification on the specifics of how they handle logging around the establishment of connections.

Any additional information people have been able to find out would be much appreciated. Feel free to ping me on Twitter at: @duggan_mathew.