One of our clients, the WA7VC Ham Radio Club, has gotten some
questions about how we make their networking setup work. In this post we're going to
describe the problem of running older network protocols and appliances over the modern
internet, a general purpose solution, and the details of the specific solution we built.
While trying to set this up we'd heard that it was "impossible" to get some of these things working over Starlink, so we're happy to have been able to prove that wrong!
WA7VC has radio equipment at a location that has internet access via Starlink. Several of their devices have a problem with this because Starlink uses Carrier-Grade NAT (CGNAT), which prevents them from being reached directly from the internet.
On-site they have:
- DStar repeater
- Echolink/IRLP node
- FlexRadio hardware
- APRS node
All of these present various challenges in a CGNAT environment. Let's have a look...
Most people who have an internet connection are going to have a single public-facing IP address, which is given to their router. The router then gives out IP addresses to devices on the internal network/wifi from a private LAN IP range. This special range of addresses is never reachable directly over the public internet. In the diagram you'll see that two different users will use the same LAN IP addresses.
Normally this isn't a problem because your router performs Network Address Translation (NAT), and tracks all packets leaving the network so that it can route returning packets to the correct internal device. If you want an internal device to be reachable from the public internet you simply set up a "port forward" on your router that says "if anyone tries to reach port #8534, send that packet on to this specific internal IP address and it will handle it".
CGNAT complicates this by adding a second layer of NAT at the ISP level, meaning that the IP address given to your router is not publicly available. The ISP router remembers which of its customers sent an outgoing packet and correctly routes any responses to the right customer's router, which remembers which internal device sent the packet and forwards the reply back to that device.
The problem is that you can't set up any port forwards on the ISP's router. So there is no way for the customer to accept any traffic originating in the outside world and route it to a device on their own network.
(Red IPs are private LAN range, blue IPs are private ISP range, and only the green IP is an actual publicly accessible address.)
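To make the three colors concrete, here is a minimal Python sketch that sorts an IPv4 address into one of those buckets. It assumes the "blue" ISP range is the RFC 6598 shared address space (100.64.0.0/10), which is what many ISPs use for CGNAT; the sample addresses in the usage note are illustrative only and are not taken from the diagram.

```python
import ipaddress

# RFC 1918 private LAN ranges (the "red" addresses).
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# RFC 6598 shared address space, commonly used for CGNAT (assumed to be
# the "blue" addresses; your ISP may use something else).
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def classify(addr: str) -> str:
    """Return 'lan', 'cgnat', or 'public' for an IPv4 address string.

    Simplification: anything that is not RFC 1918 or RFC 6598 is called
    'public', ignoring loopback, link-local and other special ranges.
    """
    ip = ipaddress.ip_address(addr)
    if any(ip in net for net in RFC1918):
        return "lan"
    if ip in CGNAT:
        return "cgnat"
    return "public"
```

For example, `classify("192.168.1.50")` gives `"lan"` and `classify("100.64.12.7")` gives `"cgnat"`. If the WAN address your own router reports falls in the second bucket, you are almost certainly behind CGNAT.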
In this example, both users' Flex radios think that their IP address is "188.8.131.52", and each can happily send out packets to the world via its local router. Whoever receives those packets will see a return address of 184.108.40.206. Everything works like magic, as long as you don't need to receive any packets that originate from outside of your network...
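The two layers of NAT can be modelled in a few lines of Python. This is a toy sketch, not how any real router is implemented: the `Nat` class, the addresses, and port 4992 are all hypothetical, and only port 8534 is borrowed from the port-forward example earlier.

```python
class Nat:
    """Toy NAT box: remembers outbound flows and optional port forwards."""

    def __init__(self, public_ip, allow_forwards=True):
        self.public_ip = public_ip
        self.allow_forwards = allow_forwards
        self.flows = {}     # outside port -> (inside ip, inside port)
        self.forwards = {}  # outside port -> (inside ip, inside port)
        self._next_port = 40000

    def add_forward(self, public_port, inside_ip, inside_port):
        if not self.allow_forwards:
            raise PermissionError("customers cannot configure this NAT")
        self.forwards[public_port] = (inside_ip, inside_port)

    def outbound(self, inside_ip, inside_port):
        """Record a flow and return the (ip, port) the next hop sees."""
        port = self._next_port
        self._next_port += 1
        self.flows[port] = (inside_ip, inside_port)
        return (self.public_ip, port)

    def inbound(self, public_port):
        """Return the inside (ip, port) for a packet, or None if dropped."""
        return self.flows.get(public_port) or self.forwards.get(public_port)


home = Nat("100.64.12.7")                         # your router (ISP-private WAN IP)
isp = Nat("203.0.113.10", allow_forwards=False)   # the ISP's CGNAT box

# Outbound works: radio -> home router -> ISP NAT -> internet.
home_side = home.outbound("192.168.1.50", 4992)
seen_by_world = isp.outbound(*home_side)

# Replies work too, because both boxes remember the flow.
reply_to_home = isp.inbound(seen_by_world[1])
reply_to_radio = home.inbound(reply_to_home[1])

# A port forward on the home router does not help: the unsolicited
# packet is dropped at the ISP before it ever reaches that router.
home.add_forward(8534, "192.168.1.50", 8534)
unsolicited = isp.inbound(8534)
```

Here `reply_to_radio` ends up as `("192.168.1.50", 4992)` while `unsolicited` is `None`, which is the whole CGNAT problem in one line.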
Why do we need to receive packets?
"Normal" internet users do not often need to receive messages that are sent to them un-invited from random places on the internet. When someone sends you an email they do not send it directly to the computer on your desk or the phone in your pocket, they send it to a server in a datacenter somewhere, and your phone or computer checks in with that server and picks it up.
However, there are two specific cases here that we need to solve:
- Remote-control application needs to initiate connection to on-site device.
- We need to support a protocol that doesn't initiate and hold open a connection between devices, but just blasts packets in a send-and-forget manner towards the address of the party it's trying to talk to. (This is a "UDP" protocol between two devices that are just streaming data to each other's address.)
So what do we do?
Well, much like the email example above, we're going to use a server in a datacenter. This server will have a public IP address that doesn't change, and will always be accessible.
Since we can't just go "pick up our mail" later though, we need to use that server to bounce packets from its public address to the device that needs to receive them.
In any solution we use we're going to need a server somewhere on the public internet, which is always available and doesn't change its IP address. A good candidate for this would be a VPS from any number of hosting providers. You're going to want one that has a static IPv4 address (and ideally an IPv6 address), and which gives you the ability to install your own OS and have root access to the box. At DaedalusDreams we usually prefer Linode servers, but in this specific case we ended up going with Vultr due to having a datacenter in closer physical proximity to WA7VC's location. (We'll probably migrate it when Linode opens their new Seattle DC.) In either case it costs about $5/month to get their smallest VPS, which will be more than enough for our needs.
There are two ways we can do this. One is the way we'd prefer, and the second is the method that captures all the use-cases.
The ideal solution
The ideal form of the solution is conceptually quite simple: we set up a VPN server, install a VPN client on the device that needs to be accessible from the internet, and then forward the necessary ports from the public address of the VPN server through the VPN tunnel down to the device.
You can see in this case, the dotted blue line indicating a VPN tunnel from the VPN server directly to the flex radio, with packets flowing through both the Starlink router and the local router.
Where does this go wrong?
If only wishing made it so... Unfortunately there are a couple of key problems with this approach that may or may not make it viable for a specific use case, and ruled it out for ours.
Problem 1: Appliances...
Unfortunately this method requires that you are able to install a VPN client on every device that needs to receive packets from the public internet. This may not work if any of your devices are appliances that don't give you access to install additional software. Devices such as an ICOM repeater don't exactly give you the option to install things, and even some devices which nominally do may provide a challenge due to running either ancient or incompatible operating systems.
Even if we were able to install a VPN client on all of the devices however, this method does induce some complication in the form of having more software to keep updated, more attack surface that needs to be secured, etc. Being aware of the tradeoffs is important.
Problem 2: UDP VPNs...
Most VPN technology, for reasons we won't get into here, utilizes UDP to communicate between client and server. Sound familiar? It should, UDP protocols were how we got into this mess in the first place! How is the VPN server supposed to shoot packets down to the client if the client is behind CGNAT's two-layers-of-addresses problem?
It's possible to run a TCP VPN, but in practice that isn't going to work very well for the highly timing sensitive voice-over-IP protocols we're trying to support in this case.
This could also be solved if the entire local LAN segment supported IPv6, as that would provide unique public IP addresses for each device. Unfortunately we're back to the problem of some (if not all) of our devices/protocols having no IPv6 support.
It's worth noting that this may not be a problem in some cases, depending on exactly how you configure things, and how motivated your ISP is about killing states. It's definitely a road with some potholes to watch out for though.
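The "killing states" caveat can be illustrated with a toy model. A NAT only lets the server reach the client while it still remembers a flow the client opened, so it comes down to whether the client speaks more often than the NAT forgets. The function name and the timeout values below are assumptions for illustration; real CGNAT idle timeouts vary by ISP and are usually not published. WireGuard's `PersistentKeepalive` peer setting is the usual knob for the keepalive side.

```python
def server_can_reach_client(t, keepalive_interval, idle_timeout):
    """True if a NAT state opened by the client is still alive at time t.

    t:                  seconds since the client first sent a packet
    keepalive_interval: seconds between client-originated packets,
                        or None if the client goes silent after t=0
    idle_timeout:       seconds of silence after which the NAT drops
                        the flow (assumed value, ISP-dependent)
    """
    if keepalive_interval is None:
        last_client_packet = 0
    else:
        # Most recent keepalive sent at or before time t.
        last_client_packet = (t // keepalive_interval) * keepalive_interval
    return (t - last_client_packet) < idle_timeout
```

With a hypothetical 30 second idle timeout, a client that sends a keepalive every 25 seconds stays reachable indefinitely, while one that only speaks every 60 seconds is unreachable for half of every minute.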
The router capture solution...
This method requires some capabilities that the standard Starlink router (or most other ISP
routers, for that matter) does not provide. In order to make this work not only will we need a
VPN server in the cloud, but our local router needs to have the capability to run a VPN
client, and to perform custom routing rules.
For the purposes of this example we will show screenshots from the router platform we use: pfSense, which should be quite similar on OPNsense, and the concept should apply on any similarly capable router. (We had previously implemented the VPN server side of this on a standard Debian Linux VM using the built-in firewall for example.)
So if we can't install a VPN client on the devices directly, what do we do? Instead of having the VPN client on the device itself, we create a VPN tunnel between the server in the cloud and our local router, and then we have the local router funnel all outbound packets for that device through the tunnel, and allow any packets from the VPN server to go directly to the target device!
This method allows us to utilize the VPN server in the cloud, but does not require anything at all of the target device (Flex Radio in the above diagram).
The key aspect of making this work is that we must capture all outbound packets from our device and route them through the VPN server, so that anyone that it sends packets to out in the public internet sees the return address for those packets as the address of the VPN server. Without this several of the protocols we're attempting to support will not work because when a packet arrives at the remote destination the remote station attempts to send a packet straight back to the last address it heard from. (Which if we did not capture all outbound packets, would be the public IP of the Starlink router, and we're right back where we started.)
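The "reply to whoever I last heard from" behaviour is easy to see with two UDP sockets on the loopback interface. This is only a sketch of the pattern, not of any specific ham radio protocol; the payloads are made up.

```python
import socket

# "Remote station": listens for UDP and replies to whatever source
# address the incoming datagram carried.
remote = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
remote.bind(("127.0.0.1", 0))
remote.settimeout(2)

# "Our device": sends a datagram without any connection setup.
device = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
device.bind(("127.0.0.1", 0))
device.settimeout(2)
device_addr = device.getsockname()

device.sendto(b"hello", remote.getsockname())

# The remote learns where to reply purely from the packet's source.
payload, heard_from = remote.recvfrom(1024)
remote.sendto(b"ack", heard_from)
reply, _ = device.recvfrom(1024)

remote.close()
device.close()
```

On loopback `heard_from` is simply the device's own socket address. Behind CGNAT it would be the ISP's public address and some translated port, which is why the outbound traffic has to leave via the VPN server for the reply to find its way back.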
VPN Server Setup
For the sake of simplicity, in this case we have chosen to use a virtualized installation of pfSense as our VPN server, in order to have minimum mental overhead when switching back and forth between our VPN server and the baremetal pfSense install that acts as the onsite router. Setting up a pfSense virtual machine in the cloud is beyond the scope of this article as much will depend on the chosen hosting provider.
Once we have the VPN server in the cloud set up, we need to create a VPN tunnel between the server and our on-site router. We use the Wireguard VPN protocol as it's supported on pfSense, simple to manage, and performs extremely well. For the sake of brevity we're not going to walk through the entire VPN setup here as that could be its own quite lengthy post, and there are a number of very good tutorials available.
It's worth noting that due to the usage of Starlink as an ISP our router has an IPv6 address, and we'll be running the VPN over IPv6. We won't actually be routing any IPv6 traffic through the VPN, simply using it as the transport layer to get packets from the server to our router.
VPN Server Configuration
Once you have the VPN server and the router talking to each other the next step is to configure things so that the VPN server can communicate directly with devices on your LAN. This exposes us to some risk because anyone who can gain control of our VPN server can send packets directly into our LAN. We have some mitigations in place to limit the danger there, such as running the ham radio devices in their own isolated VLAN where even if they were somehow taken over they wouldn't be able to impact anything else on the LAN. It is something to be aware of though.
On pfSense, letting the VPN server access the LAN involves several steps on the VPN server side:
- Ensure that there is a gateway set up for the wireguard interface in System/Routing/Gateways.
- Set up a static route in System/Routing/Static Routes to route all traffic for your LAN subnet through the gateway for the WG0 (wireguard) interface that we set up in step 1.
- Ensure Outbound NAT is set to "Hybrid" mode in Firewall/Nat/Outbound
- Add an Outbound NAT rule with the Source set to the Wireguard network subnet. Note that this is whatever IP block you've put your wireguard server and client IPs into, not your LAN subnet. (This should be done as part of any tutorial on setting up a wireguard link, but it's worth mentioning here.)
And a single step on the router side:
- Ensure that there is a gateway set up for the wireguard interface in System/Routing/Gateways.
Now we simply need to open ports on the local router to allow the VPN server to send traffic through to devices on the LAN, and then to have the VPN server forward the correct ports to the correct LAN IP addresses.
A best practice here is to allow only specific ports targeting specific IP addresses. (This is another of our "blast radius" mitigations in case someone gains control of the VPN server.) It's possible to set a rule that would simply allow all traffic from the VPN server to any IP on the LAN on any port, but that increases the risk quite a bit. We use specific allow rules on the wireguard interface on the router, such as the following:
These rules allow the VPN Server to send traffic to the DMR Node and FlexRadio. Note that these are not NAT rules, simply allow-traffic. (Since by default we deny any traffic coming into our LAN from the VPN server we need these. They cause us to have to duplicate work a little bit, but it makes the LAN quite a bit more secure in case the VPN server is compromised.)
Next, we need to tell the VPN server to forward (NAT) those ports to the correct LAN IPs:
So now, any unsolicited traffic coming in from the internet that hits the VPN server on port 55193 will get routed through the VPN tunnel directly to the DMR node's LAN IP of 10.49.7.19. For some protocols, such as FlexRadio's SmartLink remote control, this might be enough. The trick is that in your remote control app you need to tell it to connect to the IP address of the VPN server.
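The way the two rule sets work together can be sketched in Python. Port 55193 and 10.49.7.19 come from the example above; the second entry, the assumption that the LAN-side port matches the public port, and the `deliver` helper are hypothetical and only there to show the shape of the logic.

```python
# Port forwards (NAT) on the VPN server: public port -> (LAN ip, LAN port).
PORT_FORWARDS = {
    55193: ("10.49.7.19", 55193),  # DMR node (same LAN port is an assumption)
    4993: ("10.49.7.20", 4993),    # hypothetical FlexRadio entry
}

# Allow rules on the on-site router's WireGuard interface.
# Everything not listed here is denied by default.
ALLOW_RULES = {
    ("10.49.7.19", 55193),
    ("10.49.7.20", 4993),
}


def deliver(public_port, forwards=PORT_FORWARDS, allow=ALLOW_RULES):
    """Return the LAN (ip, port) an unsolicited packet reaches, or None."""
    target = forwards.get(public_port)
    if target is None:
        return None  # the VPN server has no forward for this port
    if target not in allow:
        return None  # the on-site router's default deny drops it
    return target
```

`deliver(55193)` returns `("10.49.7.19", 55193)`, while a port with a forward but no matching allow rule still returns `None`. That duplication is the point: a compromised VPN server can only reach what the on-site router explicitly permits.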
But if it auto-detects its own IP, or if it's a protocol that returns responses to the sending address, this still won't work. The last step is where the magic happens:
Capturing all outbound packets
In order to make this work, we need the local devices to have no idea that there is any NATing going on, they need to believe that their public IP address is the address of the VPN server.
In pfSense we will do this by creating an alias, putting all of the IPs we want to capture into that alias, and then setting up a routing rule to send all traffic from that alias through the VPN tunnel.
Step 1 is to create both of the aliases we need on the on-site router:
We need two aliases here. One contains the specific LAN IP addresses of the devices that need to be forced to go through the VPN to reach the internet. The other alias covers the entire RFC 1918 private IP address space. We'll be using this to distinguish between traffic destined for the internet and traffic destined for another device on our LAN.
Step 2 is to create a routing rule on our firewall that detects any traffic from the list of devices we just created. In this example we're putting this rule on the "VLANS" firewall rule tab in pfSense because the alias covers devices on multiple local VLAN segments, but if all of the devices are on the same interface/VLAN the rule could go there instead:
This rule captures any packets coming from the alias containing the devices that we want to force through the VPN with a destination NOT in the private IP block alias we set up (implying that those packets are destined for a public IP address), and sets their gateway to be the gateway for the VPN server that we set up.
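The decision that rule makes for each packet boils down to two membership checks. Here is a small Python sketch of that logic, not of pfSense itself: the gateway names, the `pick_gateway` helper, and the second device address are hypothetical, while 10.49.7.19 is the DMR node from earlier.

```python
import ipaddress

# Alias 2: the whole RFC 1918 private address space.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Alias 1: devices whose internet traffic must leave via the VPN.
VPN_DEVICES = {
    ipaddress.ip_address("10.49.7.19"),  # DMR node
    ipaddress.ip_address("10.49.8.30"),  # hypothetical device on another VLAN
}


def pick_gateway(src: str, dst: str) -> str:
    """Return which gateway a packet from src to dst should use."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    dst_is_private = any(d in net for net in RFC1918)
    if s in VPN_DEVICES and not dst_is_private:
        return "WG_VPN_GW"  # captured: out through the tunnel
    return "default"        # LAN traffic, or a device we don't capture
```

A packet from the DMR node to a public address gets `"WG_VPN_GW"`, while the same device talking to another LAN host, or any other device talking to the internet, stays on `"default"`.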
And as soon as we refresh the firewall rules, all traffic from our selected IP
addresses should be going out through the VPN server. You can test this from any of those
devices that runs Linux and gives you command line access by running a command that reports
your public IP address, which should show the public IP address of the VPN server. Success!
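One hedged way to run the same kind of check from Python, on any captured device that has it installed: ask an external "what is my IP" service and compare the answer with the VPN server's address. The service URL (api.ipify.org, a third-party service that returns the caller's address as plain text) and both function names are our assumptions for this sketch; any equivalent service would do.

```python
import urllib.request


def fetch_public_ip(url="https://api.ipify.org", timeout=5):
    """Ask an external service which source address it sees for us."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def egress_matches(expected_ip, fetch=fetch_public_ip):
    """True if our traffic appears to leave from expected_ip."""
    return fetch() == expected_ip
```

Calling `egress_matches("<your VPN server's public IPv4>")` from a captured device should return `True`. If it returns `False`, the routing rule is probably not matching that device's traffic.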
With this setup, someone can transmit over RF, which gets picked up and converted to digital voice over IP, routed up through a satellite moving really fast overhead in low-earth orbit, sent back down into a datacenter, sent off to another datacenter, and then sent off to the radio on the other end where it is converted back into analog (or digital!) voice being broadcast over RF.
How cool is that?
It took a little bit of working around, but using this technique we've been able to get any protocol or device needed working over Starlink. The same technique was previously used when WA7VC's site was served by a trio of DSL modems operating in a round-robin, so with some adaptation it can probably be used to solve just about any similar connectivity problem. I look forward to the eventual RF over IP over RF over IP over RF nesting just for the grand silliness of it.
Now, all this said, what's the correct solution here? Ideally hardware and software developers realize that the future is IPv6, or at least CGNAT, and start building these solutions into their products. Offering the option to connect something like a FlexRadio to a Tailscale tailnet directly would certainly be a step in the right direction. But as long as there are Ham Radio operators trying to experiment there will probably be some devices or protocols that don't work behind CGNAT or IPv6, and for them we present this solution!
If you're interested in adding VPN capabilities to your software stack, or if you'd like help getting a setup like this working and managed, feel free to reach out to us at contact@daedalusdreams. We're always happy to chat!