HTTPS, short for secure HTTP, which shields your browsing from both snooping and tampering, sounds like something that every cybersecurity-conscious company would have jumped to provide for its website visitors as soon as it possibly could.
But in an early example of a ‘Stop The Insanity’ story, HTTPS needed at least two decades to take hold, for a bunch of curious and sometimes contradictory reasons.
Join Paul Ducklin for Part 1 of this peculiar but educational tale…
Since the advent of the World Wide Web in 1990, we’ve gone through a number of stages in our cybersecurity expectations while we browse.
For the first few years of the web’s existence, we were happy with (indeed, we had no choice but to accept) unencrypted connections.
Everything going back and forth between your browser and the web servers you visited was sent in what cryptographic jargon calls plaintext.
Loosely speaking, anyone else on your local network, anyone at your ISP, anyone connected anywhere along your network path, could keep a complete record of which pages you asked for, and what data came back.
From a privacy point of view, there wasn’t any.
For the most part, however, the issue of web encryption was largely ignored.
Even back then, snooping on your web traffic would let eavesdroppers keep track of your evolving research interests, and could therefore be used for basic industrial espionage, but the internet was still largely academic, and individuals’ research interests were often a matter of public record anyway.
By the mid-1990s, however, business and consumer internet access were widespread enough that online companies were keen to start selling products via the web.
This meant accepting confidential data uploaded by customers that both ends were supposed to keep out of public view, and which therefore needed some sort of cryptographic protection to keep snoops, stalkers and spies at bay.
For any sort of online commerce with home delivery, for example, this confidential data, an example of what we now know by the name PII, short for personally identifiable information, would of necessity include the purchaser’s address and payment card details.
The long-defunct company Netscape, whose eponymous browser was the precursor of Mozilla’s open-source browser Firefox (originally known as Phoenix because it rose from the ashes of Netscape), was a multi-billion dollar business at the time, so the company’s developers hastily added numerous web browser features that made site personalisation, online shopping, and online payments viable:
These days, SSL doesn’t really exist: the protocol has been revised many times since 1995 to fix cryptographic flaws and to improve reliability.
Along the way, for a mixture of technical, political and marketing reasons, SSL was officially renamed TLS, short for transport layer security.
As a result, SSL version 3 was superseded in 1999 not by SSL 3.1, but by TLS 1.0, which has itself been upgraded three times since then, to TLS 1.1 (2006), TLS 1.2 (2008), and TLS 1.3 (2018).
Confusingly, perhaps, many well-known open-source programming libraries still include the letters SSL in their name, even though by default none of them build in support for the old and insecure SSL-style versions any more.
OpenSSL, wolfSSL, LibreSSL and BoringSSL, for example, have all kept the outdated nickname SSL alive, even though there aren’t (or ought not to be) any websites left online that accept SSL-style connections.
SSL 1 was sufficiently flawed that it was never actually released, SSL 2 was officially ‘de-recommended’ back in 2011, and SSL 3 was thrown out as unsafe in 2015. TLS 1.0 and TLS 1.1 were deprecated, as the jargon puts it, in 2021, and should no longer be used.
All modern web servers now support TLS, and most browsers activate it whenever the other end will support it, so that almost all data exchanged between browsers and web servers in 2024 is encrypted.
That includes web content written in HTML, short for hypertext markup language, the text-based page description system that tells your browser what to say, with special text tags to denote headings, paragraphs, clickable links, and so on:
TLS encryption also covers data such as cookies, which remember details between two visits to the same site, including personalisation settings, authentication tokens that prove you’ve already logged in, and tracking codes that help website owners follow you round their sites so they can hit you up with custom ads, special offers, and more:
Stolen cookies can be just as dangerous as stolen passwords if the cookie data includes security tokens or authentication codes that are accepted by the server as sufficient evidence that you recently entered your full password plus any multi-factor authentication codes needed.
TLS also protects any content fetched for use inside web pages, including JavaScript code, stylesheets and images, as well as files such as documents or programs that the website offers for download.
Just as importantly, of course, TLS shrouds any data your browser uploads to the server, including the names of all the pages you choose to view, the search terms you use to find them, any passwords you type in along the way, any PII you enter into forms on the site, including payment card details, and any files you decided to upload.
Up to the point that the TLS encryption is applied to the web data, everything generated by and sent out by your browser is exactly the same as it would be over an unencrypted connection.
Similarly, once the TLS encryption layer is stripped off at the web server you’re talking to, the data that the web server deals with is just good old HTTP – the same hypertext transfer protocol that the server would see if you connected to it without any encryption.
You don’t need two completely different browsers, or two sorts of web server, or two different databases of web content.
By convention, HTTP traffic travels over network port 80 when there is no encryption added on, while HTTPS traffic consists of exactly the same content, wrapped in a protective cryptographic shroud and sent instead over network port 443.
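That scheme-to-port convention is easy to demonstrate with Python’s standard urllib.parse module (the URLs below are hypothetical examples):

```python
from urllib.parse import urlsplit

# Default ports per scheme, as registered by convention
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url: str) -> int:
    """Return the port a browser would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise fall back to the scheme's default
    return parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]

print(effective_port("http://example.com/"))        # 80
print(effective_port("https://example.com/"))       # 443
print(effective_port("https://example.com:8443/"))  # 8443
```

Note that an explicit port in the URL always overrides the scheme’s default, which is why the last example connects to 8443 even though it uses https.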
In other words, what’s not to like about the add-on encryption that SSL introduced in 1995, and that TLS continues to deliver today?
The SSL/TLS layer that puts the S in HTTPS protects your web traffic from snoops, censors and cybercriminals.
Even if it isn’t perfect, the alternative is to exchange HTTP data entirely unencrypted, so that anyone who feels like it can snoop on you directly anyway.
So, given that SSL first arrived on the scene almost 30 years ago, and has been supported by almost all browsers and web servers ever since…
…you’re probably expecting this article to end abruptly right now, with web encryption having been what researchers refer to as a ‘solved problem’ for decades.
After all, we needed web encryption for important commercial reasons way back in the mid-1990s, or Netscape wouldn’t have scrambled to come up with SSL and HTTPS to keep the credit card companies happy.
Once we had SSL, surely there would be little reason not to try it out right away?
And once we started using it, surely there would be little reason not simply to use it all the time, if only to avoid the hassle of working out which were the right pages of a website to encrypt and which ones didn’t contain any data or cookies that needed shielding from attackers?
In fairness, given the inertia that often slows down progress in IT, it’s reasonable to assume that SSL wouldn’t have taken over instantly, and that HTTPS might have taken a few years to replace HTTP in general use.
But a reasonable observer would almost certainly think that by 1999, say, or perhaps by 2000, HTTPS would have taken over almost completely from plain old HTTP.
Unfortunately, that’s not how the story of web encryption played out.
Progress towards an HTTPS-only web world, or even to a mostly-HTTPS one, took decades to unfold.
The benefits of web encryption were easily explained right from the start, namely that with HTTPS:
The last point above, namely that strong encryption provides a method for preserving the integrity of data, is always as important as, and sometimes even more vital than, keeping the data secret from prying eyes.
Converting your browsing to HTTPS, where it was available, was easy: just replace the starting text http:// in any URL with its secure counterpart https://, thus telling your browser to wrap a layer of TLS encryption around its HTTP exchanges.
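That switch can be sketched as a one-line string rewrite (a simplified illustration with a hypothetical hostname; real browsers apply more nuanced upgrade logic):

```python
def upgrade_to_https(url: str) -> str:
    """Rewrite an http:// URL to its https:// counterpart; leave other URLs alone."""
    prefix = "http://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url

print(upgrade_to_https("http://shop.example/cart"))   # https://shop.example/cart
print(upgrade_to_https("https://shop.example/cart"))  # already secure, unchanged
```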
In theory, converting web servers from HTTP to HTTPS was easy too, but in practice it threw up some tricky problems.
Performance was perhaps the thorniest problem when introducing server-side HTTPS.
Before cloud computing became ubiquitous, most companies ran their own web servers, either on their own premises or at a so-called co-location site, where their servers were in rented-out racks that were right next to their ISP’s points of connection to the internet. (Co-location was more expensive, but eliminated the often much slower ‘branch-line’ connection between the ISP’s site and the company’s own offices.)
Servers couldn’t be added to or removed from the available pool of computers at will, because they had to be budgeted for, ordered, delivered, installed and commissioned, which could take weeks or months.
Most businesses had to guess how much computing power they would need for web services, typically annually: too little, and customers and prospects would be unimpressed by an underperforming website; too much, and the idle cycles on any underused web servers would be a serious waste of money.
Anything that added extra computing time and bandwidth to web requests was regarded with great suspicion, especially if it affected every incoming request from every visitor.
Unfortunately, activating SSL or TLS for all web traffic did just that.
There was not only the computational overhead of encrypting and decrypting all the HTTP data going back and forth, but also the extra network overhead at the start of every connection as the visitor’s browser and the web server negotiated the cryptographic settings they would use from then on.
Encryption overhead tends to be much lower these days, because many computer processors now include dedicated machine instructions to speed up commonly-used cryptographic algorithms, notably the AES cipher, very commonly used in the data-scrambling part of TLS connections.
The ‘cryptographic dance’ carried out by browser and server at the start of a connection, during which each end agrees on a common set of cryptographic algorithms and settings, and decides if it’s willing to trust the other, involves extra data and additional requests-and-replies.
For example, here’s a simple website, visited on the left using plain old HTTP, and on the right using HTTPS.
To the user, the data that’s requested and received, and the HTML of the page that’s ultimately displayed in the browser, is identical, given that the browser processes the data after the encryption has been stripped off:
The data exchanged over the network in the HTTP session consists of a short, text-based HTTP request using the command GET /, which means ‘fetch me the main page’, followed immediately by a reply containing the HTML content of that page for the browser to display:
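As a sketch (with a hypothetical hostname and page content), the sort of plaintext a network snooper would see in that exchange looks like this:

```python
# What an eavesdropper on the wire sees in an unencrypted HTTP session
request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: any.domain.example\r\n"  # hypothetical hostname
    b"\r\n"                          # blank line ends the request headers
)

reply = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body><h1>Hello</h1></body></html>"  # the HTML the browser renders
)

# Every byte above travels unencrypted when plain HTTP is used
print(request.decode())
print(reply.decode())
```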
But with HTTPS involved, the browser and the server need to exchange a pair of network packets first just to establish the connection, which takes up what’s known as the round-trip time, or RTT; followed by another round trip to handle the embedded HTTP request and reply:
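To put a rough number on that cost, assume a hypothetical round-trip time of 50 milliseconds: a classic full TLS 1.2 handshake adds two round trips on top of the TCP setup, while TLS 1.3 trims that to one:

```python
rtt_ms = 50  # hypothetical round-trip time between browser and server

tcp_setup = 1 * rtt_ms        # TCP three-way handshake: one round trip
tls12_handshake = 2 * rtt_ms  # full TLS 1.2 handshake: two extra round trips
tls13_handshake = 1 * rtt_ms  # full TLS 1.3 handshake: one extra round trip

http_wait = tcp_setup + rtt_ms  # plain HTTP: connect, then request/reply
https12_wait = tcp_setup + tls12_handshake + rtt_ms
https13_wait = tcp_setup + tls13_handshake + rtt_ms

print(f"HTTP:            {http_wait} ms before the first reply")    # 100 ms
print(f"HTTPS (TLS 1.2): {https12_wait} ms before the first reply") # 200 ms
print(f"HTTPS (TLS 1.3): {https13_wait} ms before the first reply") # 150 ms
```

In other words, under these assumptions a TLS 1.2 connection doubles the wait before the first byte of content arrives, which explains why the overhead loomed so large in the dial-up and early broadband eras.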
This extra overhead meant that the adoption of SSL/TLS, especially by companies that weren’t themselves processing payments or handling credit cards, was slow and unenthusiastic.
Added to the performance and bandwidth overheads of encrypting everything was the two-fold infrastructural hassle created by the use of digitally signed cryptographic certificates in TLS, required so that servers can affirm to their visitors that they really are the servers they claim to be.
(TLS servers can require clients to produce signed digital certificates too, so that each end can reliably identify the other, but in regular web browsing, where the aim is to attract as many visitors as frictionlessly as possible, this is almost never done.)
Generally speaking, there are two parts to certificate-based verification.
Firstly, you need some way of signing a ‘cryptographic statement’ in every connection to prove that the certificate really is yours, and not just someone else’s public certificate you copied when you last visited their site.
Secondly, you need what is effectively a certificate for your certificate, signed by someone else that the other person already trusts, to reassure them that you didn’t just create the certificate for yourself in someone else’s name.
After all, making your own certificate for any web domain you like, and signing it yourself, is trivial, with any number of free tools and programming libraries available to help you.
Here, we create an elliptic-curve public-private keypair using the NIST P-256 curve prime256v1, generate a certificate based on that public key for the website any.domain.example, and sign the certificate ourselves with our new private key. We’ve used the programming language Lua and a popular TLS library based on OpenSSL called luaossl:
As the code suggests at the line cert:sign(key,'SHA256'), vouching for your own certificate, whether at signing time or whenever you send it out to identify your website, requires that you have your own private key available.
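If Lua isn’t to hand, the same three steps can be sketched in Python using the widely-used third-party cryptography package (this is our own illustrative equivalent, not the article’s Lua code; the package must be installed separately, and the domain name is the same placeholder as above):

```python
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# Step 1: create an elliptic-curve keypair (SECP256R1 is NIST P-256, aka prime256v1)
key = ec.generate_private_key(ec.SECP256R1())

# Step 2: build a certificate for the placeholder domain any.domain.example
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "any.domain.example")])
now = datetime.datetime.now(datetime.timezone.utc)
cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # self-signed, so issuer and subject are the same
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    # Step 3: sign it ourselves with our new private key (the cert:sign step)
    .sign(key, hashes.SHA256())
)

print(cert.subject.rfc4514_string())  # CN=any.domain.example
```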
When you’re signing something offline, as we did when creating the certificate above, it’s comparatively easy to take care of the private key – for example, you could store it on an encrypted USB drive and only ever plug it in when needed.
But when you’re running a TLS-encrypted web server, you need some way of making that private key available to the server to ‘prove’ its own certificates every time a new connection comes in.
And if you have multiple web servers, perhaps at multiple co-location sites around the world, you need to be able to distribute that private key safely to all of them, to update the key when it expires, and to keep it secure on each and every internet-facing server that needs it.
In today’s cloud-based world, generating, signing, distributing and securing private keys is something that service providers can do for you automatically, so it has become a cybersecurity task that many website operators no longer have to worry about, or perhaps don’t even think about at all.
But in the late 1990s and 2000s, SSL/TLS key management was a task that many companies didn’t really have the time or skills to deal with well.
Even worse, if you were looking after your own TLS keys and certificates, was the chore of needing someone other than yourself to verify and sign your certificates before you could use them.
So-called CAs, or certificate authorities, often charged hundreds of dollars per certificate per year for verification and signing, knowing that unless you were a CA yourself (a business in its own right), you had little choice but to use someone else’s services.
Unless you used a CA that all the world’s major browsers and operating systems had already loaded as ‘trusted signers’ into their built-in lists of known CAs, visitors arriving at your site would be confronted by an unappealing and suspicious notice implying that your site itself was untrustworthy:
Acquiring and renewing certificates signed by one of the commercial CAs in the elite global club of accepted authorities could be a fussy, time-consuming, expensive and often manual process, even for medium and large businesses (the bigger your business, the more domains and certificates you were likely to be juggling).
If you left it too late, you might end up with a few days of expired certificates, leading to perfunctory and unimpressive browser warnings to your visitors, followed by a scramble to get the renewal done.
If you renewed too early, you risked spending extra money to have certificates with overlapping validity periods.
Given that today’s cloud providers typically generate TLS certificates for your new website automatically as part of your hosting charge, and given that numerous free CAs now exist through which you can auto-renew your certificates using simple scripting tools…
…it’s worth remembering that, until the mid-2010s, the renewal process for TLS certificates, and the ‘trusted club’ of CAs themselves, were regarded by many CIOs, IT managers and sysadmins with a mixture of dismay and suspicion.
The benefits of wrapping every web connection in a three-way cocoon were obvious to anyone with an interest in, or with responsibility for, cybersecurity:
Nevertheless, for every cybersecurity expert who urged the world to make faster progress towards ‘HTTPS everywhere’, you could, until quite recently, find another expert who argued strongly against the universal use of HTTPS.
The excuse was the same sort of paradoxical reasoning pitched on a regular basis by governments around the world, even in free and democratic countries, arguing that ‘too much’ encryption will benefit the criminals more than the rest of us.
“If everyone switches to HTTPS,” warned the naysayers, “then the crooks will all start using it too, and we won’t be able to scan incoming and outgoing web traffic for malware, which is easy to do if everyone sticks to HTTP.”
For many years, problems – both real and imagined – with deploying HTTPS meant that it was a security feature that a web-savvy Hamlet, Prince of Denmark might have described with the words, “Though I am an online native, and to the manner born, it is a custom more honour’d in the breach than the observance.”
For most of the 2000s, many small and medium businesses simply ignored HTTPS, given that few people were refusing to use their websites for the lack of it.
HTTPS could end up being complex and costly to implement, and many companies found that their own IT security teams had dug themselves into a hole: the tools that provided security for one aspect of their network, such as scanning new files for malware, led to them suppressing security elsewhere, by not shielding their own and their customers’ web traffic from snoops and cybercriminals.
With many companies relying on everyone else not using HTTPS as part of their own cybersecurity regimen, the unspoken agreement seemed to be not to foist HTTPS on anyone else, if that could possibly be helped.
Many larger brands, such as social media sites that relied on active logins, and online shopping sites that relied on web-based purchases, went for a partial adoption of HTTPS, for example by enabling it only on login pages (where plaintext passwords could otherwise be sniffed out) and on payment pages (where credit card details and other PII was entered).
Sadly, this half-hearted approach produced a false sense of security, and it took some well-publicised cybersecurity scares to prove the dangers of this part-time use of HTTPS.
Fortunately, a spirited effort by cybersecurity activists, backed by a practical and effective non-profit project, led to a set of human-friendly tools and protocols that made it easy, and free, to activate HTTPS on almost any website.
Now read Part 2, where we dig into the second half of this fascinating story to find out how HTTPS became the choice for almost every website today…
Why not ask how SolCyber can help you do cybersecurity in the most human-friendly way? Don’t get stuck behind an ever-expanding convoy of security tools that leave you at the whim of policies and procedures that are dictated by the tools, even though they don’t suit your IT team, your colleagues, or your customers!
Paul Ducklin is a respected expert with more than 30 years of experience as a programmer, reverser, researcher and educator in the cybersecurity industry. Duck, as he is known, is also a globally respected writer, presenter and podcaster with an unmatched knack for explaining even the most complex technical issues in plain English. Read, learn, enjoy!
Featured image of padlocked door by Jornada Produtora via Unsplash.