If you’re using Skype, you probably don’t want some antivirus update to consume all of your bandwidth and ruin your call. You’d like a policy that says “give Skype all the bandwidth it needs,” which the router then enforces.
When I worked in networking infrastructure in the mid-nineties, I spent a lot of time on network Quality of Service (QoS). This is basically deciding which kind of traffic has priority when there isn’t enough capacity to go around: different kinds of traffic go into different queues, and some queues get a bigger share of the pipe. (In my house, for example, World of Warcraft traffic takes precedence over everything else so my Dropbox updates don’t make the raid my wife is running collapse.)
Getting everyone to play nice together
Policy and QoS don’t end with your router: all the devices between your computer and the machine on the other end have to respect the policies you’ve set up, from the little wifi router in your basement to the big machines at the core of the Internet. Otherwise, traffic you prioritize will just get stuck in a bottleneck somewhere. That’s not easy, and it’s still not well solved today.
One of my jobs as product manager for policy-based networking was to convince thirteen divisions of 3Com, spread around Europe, America, and the Middle East, to agree on how to treat different kinds of traffic.
After trying for months to convince all these divisions to play nicely together, I realized something (which may first have been suggested by John Strassner or Andy Gottlieb, both of whom are far smarter than I am):
The magic number of queues you need is two.
This is easily explained. Consider a one-lane road. If you’re stuck behind a large, slow truck, the drive is interminable.
But change the number of lanes to two, and everyone can pass, eventually. What’s more, emergency traffic (such as an ambulance) can always get through, because it has the ability to clear everyone else out of the fast lane with its sirens.
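To make the two-lane idea concrete, here is a minimal sketch in Python of a strict-priority scheduler with exactly two queues. The class and packet names are hypothetical, invented for illustration; real routers implement this in hardware or kernel queueing disciplines, and usually add safeguards so the slow lane is never starved entirely.

```python
from collections import deque


class TwoQueueScheduler:
    """Illustrative strict-priority scheduler with two queues (hypothetical)."""

    def __init__(self):
        self.fast = deque()    # priority traffic: the "ambulance" lane
        self.normal = deque()  # everything else

    def enqueue(self, packet, priority=False):
        # Put the packet in the fast lane only if it is marked as priority.
        (self.fast if priority else self.normal).append(packet)

    def dequeue(self):
        # Always drain the fast lane first, then fall back to the normal lane.
        if self.fast:
            return self.fast.popleft()
        if self.normal:
            return self.normal.popleft()
        return None  # nothing waiting


sched = TwoQueueScheduler()
sched.enqueue("dropbox-1")
sched.enqueue("dropbox-2")
sched.enqueue("voice-1", priority=True)  # arrives last, leaves first
order = [sched.dequeue() for _ in range(3)]
print(order)  # ['voice-1', 'dropbox-1', 'dropbox-2']
```

The voice packet arrives last but leaves first, while the two ordinary packets keep their relative order behind it.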
(Here’s the nerdy bit; feel free to skip it.
This mattered because, at the time, the best way to signal priority across devices was the three IP ToS precedence bits in the header of every packet, which meant eight possible values. Most existing policy models were pretty complicated, trying to define eight classes of traffic. That caused a lot of disagreement between the various companies and divisions jockeying for dominance on the Internet.
By saying to them, “I only want one of the three bits—two values—and you can do whatever you want with the other two,” I was able to get everyone to agree. Also, unusually, John Strassner and I, representing rivals Cisco and 3Com, were in violent agreement about this stuff. Some breadcrumbs are still lying around on the Internet.
I wrote a lot more about this in Managing Bandwidth, back in 1999, if you care, but it’s out of print and out of date.)
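For readers who did not skip the nerdy bit, here is a rough Python sketch of what "one of the three bits" could look like. The precedence field sits in the top three bits of the IPv4 ToS byte. Which of the three bits was actually proposed is not stated above, so the choice of the most significant one is purely an assumption, and the function names are hypothetical.

```python
PRECEDENCE_SHIFT = 5   # the precedence field is the top three bits of the ToS byte
PRIORITY_BIT = 0b100   # ASSUMPTION: use the most significant precedence bit


def precedence(tos_byte):
    """Return the 3-bit precedence value (0-7) from a ToS byte."""
    return (tos_byte >> PRECEDENCE_SHIFT) & 0b111


def is_priority(tos_byte):
    """Two-queue decision: look at one bit and ignore the other two."""
    return bool(precedence(tos_byte) & PRIORITY_BIT)


def mark_priority(tos_byte, priority):
    """Set or clear the priority bit, leaving every other bit untouched."""
    mask = PRIORITY_BIT << PRECEDENCE_SHIFT
    if priority:
        return tos_byte | mask
    return tos_byte & ~mask & 0xFF


tos = 0b01100000                # precedence 3, priority bit clear
marked = mark_priority(tos, True)
print(precedence(tos), is_priority(tos))        # 3 False
print(precedence(marked), is_priority(marked))  # 7 True
```

Because only one bit is read or written, the remaining two precedence bits pass through unchanged, which is what leaves other parties free to use them however they like.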
Connectivity is the first casualty of disaster
When super-typhoon Haiyan devastated the Philippines this year, one of the first casualties was connectivity. In times of emergency, we turn to our networks, and whether it’s a bombing or an ice storm, our smartphones are our lifelines.
Closer to home, in Toronto my friend Sulemaan posted a picture of the inside of his car last week; he’d had to crawl in through the back hatch to charge his phone because the windows were encrusted with ice and it was his only power source.
Generally speaking, we rely on the mobile phone network, rather than the Internet, for emergencies. But SMS is either one-to-one or, in specific cases, one-to-many. It’s not a platform for the hive mind the way Twitter, Facebook, Google+ or reddit can be.
Sulemaan wanted to know what was going on, and his smartphone was his lifeline, not for SMS, but for access to the hive mind and connectivity to his fellow citizens. Put another way, Twitter is a message bus for the human race.
Why two channels?
Most networking systems have two channels built into them. In a mobile phone, there’s one part that sends signalling information (that a call is coming in; what the caller’s phone number is; etc.) and one part that carries the call itself. In ISDN, these were called the D channel (for data, or delta) and the B channel (for bearer).
There are two big reasons for this.
- It makes them harder to hack. Some of you may be old enough to remember a time when you put money in a payphone and it made a specific sound. Hackers took advantage of this by making small boxes that mimicked the sound, getting free calls. That doesn’t work if the control channel (with which the phone tells the network it has been paid) is separate from the carrier channel (which transmits your voice).
- The network is more robust. If you need to tell a router it’s under attack, but the attack involves overloading its circuits, your management traffic will be lost in the flood of the attack. You need a way to make sure the management traffic gets through no matter what, and often the times you need to manage things are precisely the times when they’re congested.
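The robustness argument can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular router works: the queue class, its capacities, and the message strings are all hypothetical, and the only behavior modeled is tail drop when a buffer is full.

```python
class BoundedQueue:
    """Toy fixed-size buffer that drops new arrivals when full (tail drop)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def offer(self, item):
        if len(self.items) >= self.capacity:
            return False  # buffer full: the item is lost
        self.items.append(item)
        return True


# One shared channel: the flood fills the buffer before management arrives.
shared = BoundedQueue(capacity=100)
for i in range(1000):
    shared.offer(f"attack-{i}")
in_band_ok = shared.offer("mgmt: enable filter")

# Two channels: the flood can only ever fill the data buffer.
data = BoundedQueue(capacity=100)
control = BoundedQueue(capacity=10)
for i in range(1000):
    data.offer(f"attack-{i}")
out_of_band_ok = control.offer("mgmt: enable filter")

print(in_band_ok, out_of_band_ok)  # False True
```

With a single buffer the management message is dropped along with the overflow; with a small reserved control buffer it gets through regardless of how congested the data path is.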
Remember the Emergency Broadcast System?
From 1963 to 1997, the Emergency Broadcast System was enough. In 1997, however, the Emergency Alert System replaced it, with a stated goal of enabling “the President of the United States to speak to the United States within 10 minutes.” And in early 2013, the Commercial Mobile Alert System, which is basically broadcast SMS, launched.
In parallel, the US government has a National Infrastructure Assurance Plan that’s designed to prioritize essential systems like dams and railways; it also prioritizes, for example, the delivery of gasoline to data centers in a crisis.
For decades, we’ve regulated 911 services, and emergency warning broadcasts on television. At a time when everyone is discussing the government listening to what happens online, few people are thinking about the government’s ability to publish what’s online.
Publishing isn’t enough, though. There are three patterns of communication we need to consider:
- One-to-one. First responders rely on reserved frequencies and radios to communicate in an outage, although such technologies are limited in range and in the number of participants they support. SMS and phone systems are a good backup, in part because the towers and networks benefit from battery backups and robust installations. This stuff is covered by the NIAP, as well as the FCC’s restrictions on radio spectrum and the cooperation of manufacturers like Motorola.
- One-to-many. This is where the EBS, EAS, and CMAS apply, sending messages to the population across broadcast media. They’re good for things like tornado and flash flood warnings.
- Many-to-many. This is a relatively new form of communication. What was once the town square is now the social network. It’s the source of emergent news (much of which is false, and must be parsed; just look at the false information that spread during the Boston Marathon bombings).
While one-to-one and one-to-many are somewhat regulated and reinforced, the many-to-many model (also known as the Internet) isn’t. As society settles on a few big platforms (Skype, FaceTime, Twitter, reddit, Facebook, Google+, and so on), those platforms face inevitable regulation. And the only way to regulate and manage these things in a time of crisis is a parallel management network.
Some concrete first steps are pretty clear: governments, under existing legislation, could probably compel Facebook, Twitter, Google and Bing search, and ad delivery networks like Overture and Brightroll, to replace advertisements with emergency messages.
Those would get seen pretty quickly. It’s not a far stretch from there to telling every Facebook user they’ve been tagged in an emergency message, or telling every Twitter user they were mentioned in an emergency tweet.
Beyond this, there could be the reservation of messaging platforms within a geofence. Could mobile data providers be required to set up a private virtual network atop the public one, letting first responders and community leaders switch to a second data network which had more available capacity? That would certainly help ease overload.
I’m not sure what this will look like in ten years. But I suspect that it will cause a lot of regulatory headaches and implementation challenges for providers and operators, and be used as the justification for further control of public Internet resources.
We’re entering a period of more weather extremes, more asymmetric threats, and a faster rate of societal change. At the same time, we’ve developed a protocol for the hive mind. Now that the Internet, and the applications that run atop it, are the default platform for communal, many-to-many modern disaster management, we’re going to wind up with two queues.