Surveillance versus security: web filtering in the spotlight

Paul Ducklin

08/21/2024

If HTTPS provides true end-to-end encryption, how do web firewalls crack into your network traffic to do their filtering?

And if a legitimate firewall can do this when you want, what’s stopping random cybercriminals somewhere further down the line from doing it when you don’t?

To serve and protect

In a recent article, we investigated HTTPS, the technology that puts the padlock in your browser’s address bar in order to protect you from unwanted surveillance and interference while you’re online.

HTTPS, short for HTTP with security, provides what is known in the jargon as end-to-end encryption using a protocol called TLS (cybersecurity sure loves its acronyms and initialisms!), itself short for transport layer security.

Simply put, TLS takes a network connection that would normally go out in what’s known as plaintext form, raw and unscrambled, and packages it into an encrypted ‘communications tunnel’ until it reaches the server at the other end.

At this point, the encryption is stripped off and the plaintext of the network connection is delivered to the service you’re talking to.

With HTTPS, your browser still generates text-based commands such as GET /sample.html HTTP/1.1, which is plain old unencrypted HTTP, and the website at the other end sends back text-based HTTP replies such as HTTP/1.1 200 OK followed by the page content, or HTTP/1.1 404 NOT FOUND followed by an error message, so that the underlying protocol is much the same as it has been for decades.
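To make the text-based nature of HTTP concrete, here is a minimal Python sketch that builds a request like the one above and picks apart the first line of a reply. The helper names build_get_request and parse_status_line are invented for this illustration, and the reply bytes are a hand-written sample, not real server output.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Return the plaintext bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line, exactly as in the article
        f"Host: {host}",          # tells the server which site is wanted
        "Connection: close",
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(reply: bytes) -> tuple[str, int, str]:
    """Split the first line of an HTTP reply into version, code and reason."""
    status_line = reply.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hand-written sample reply (an assumption for illustration only).
sample_reply = b"HTTP/1.1 404 NOT FOUND\r\nContent-Length: 0\r\n\r\n"

print(build_get_request("example.com", "/sample.html").decode("ascii"))
print(parse_status_line(sample_reply))
```

With HTTPS, these same bytes are what TLS wraps up before they leave your computer, and what it unwraps at the other end.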

But this raw data is only visible on the computer at each end of the link, which is why we say that the part between the network connection endpoints enjoys end-to-end encryption.

HTTPS is important because plain old HTTP, if used all the way from browser to server, sends your data across the internet in a way that is trivial to sniff out and snoop on.

The unencrypted HTTP connection below was captured from the network, after the data had left my computer but before it reached the server, using the popular open-source network analyzer Wireshark:

[Image: Wireshark capture of the unencrypted HTTP connection]

But introducing TLS for the across-the-internet sector of the journey, thus creating an HTTPS connection, means that snoops, spies, scammers and any other inquisitive intermediaries along the way see what is mostly just shredded digital cabbage.

There’s a modest cost in network performance terms, as you can see (4283 bytes of network data with TLS versus 537 bytes without), but the web data exchanged is now shrouded from view:

Just to reiterate what we mentioned last time: the TLS-encrypted part of an HTTPS interaction secures the transportation of the data, providing not only confidentiality against surveillance but also protection against malevolent manipulation of the content along the way.

However (and this is vital to remember), TLS isn’t there to vet or validate the original content it is sending and receiving.

In other words, if a web server sends you a legitimate app, then TLS will prevent an intermediary from casually replacing it with malware before it reaches you.

But if a web server sends you malware, then TLS will not (and isn’t supposed to, given that its job is to shield what is sent so it can’t be modified) detect and warn you about that malware, or block it, or clean it up in transit.

Likewise, if you visit a fake news story or a phishing site, then HTTPS will precisely and protectively deliver that dangerous content directly into your browser, because the S in HTTPS exists to preserve data that was already generated and sent out, not to analyze it and pass judgement on it before accepting it for transmission.

What about security?

At this point, it’s worth asking the question, “How do web filters work if most of the browsing traffic passing through them is just pseudo-random scrambled data that gives little or nothing away?”

When HERE BE DRAGONS shows up as #YhBmh009~Ib and THIS IS MALWARE is intractably disguised as lF5B_h56bj=I@s1 until it reaches your computer, how can security software that isn’t running directly on your endpoint help to block unwanted content proactively?

After all, end-to-end encryption is meant to provide an impenetrable security tunnel that shields your data from inspection by anyone, without fear or favor, including cybercriminals, the government, your ISP, your VPN provider if you have one, and even your own IT department.

Indeed, TLS traffic protection isn’t just a matter of encryption, where each end agrees on a one-time encryption key for the current session.

TLS aims to reassure you not only that your traffic was encrypted and unmodified in transit, but also that it came from the site you expected in the first place. (As we mentioned earlier, this is not the same as vouching for the safety of the actual content, merely an attestation as to its origin.)

Without this sort of verification of origin, anyone could serve up encrypted data under a banner such as “Genuinely from Wellknown Corp”, and convincingly distribute modified files as the real deal.

Even worse is that any operator anywhere along your network path could mount what’s known as a manipulator-in-the-middle attack, or MitM for short, and easily masquerade as any site that you choose to visit, like this:

1. Divert all your HTTPS traffic to an interception server that accepts and terminates your connection as though it were the real site.
2. Wait for you to reveal which site you want to visit, and keep your connection open without replying yet.
3. Open up a secure outbound TLS connection of its own to your chosen site, ready to fetch the actual content you wanted.
4. Reply to you, pretend to be the site you really wanted, and finish setting up a secure inbound TLS connection with you, ready to send back the content you want.
5. Wait for you to request the content you want from the genuine site, thinking you’re secure, and fetch that very content via the secondary outbound connection, unknown to you.
6. Spy on, log, and optionally modify the data that comes back from the real site while you’re waiting for the reply.
7. Fraudulently relay the optionally modified data back to you as though it had arrived directly from the real site.

If web servers had no way to convince you that you really had reached the site you wanted, then encryption alone would not be enough to stop cybercriminals from setting up fake servers to impersonate them.

Explainer. Step 2 above, where you reveal which website name you want to visit in plaintext form before the encryption starts, is part of setting up a TLS connection. This text is known as SNI, short for server name indication, and is inserted because most contemporary cloud servers provide content for thousands or even millions of different customers and need to know which customer’s site will be answering the connection request. This is a bit like writing a private letter that you seal into an envelope instead of sending it openly by postcard: it won’t reach the right destination unless you put the recipient’s full address on the outside.
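You can see the SNI string going out unencrypted without touching the network at all. The sketch below uses Python’s standard ssl module to start a TLS handshake against in-memory buffers and then looks for the server name in the very first message the client would send. The function name client_hello_bytes is our own invention, and the sketch assumes a TLS library that doesn’t use newer extensions designed to hide the name.

```python
import ssl


def client_hello_bytes(server_name: str) -> bytes:
    """Start a TLS handshake in memory and return the first outgoing message."""
    context = ssl.create_default_context()
    incoming = ssl.MemoryBIO()  # data "from the network" (stays empty here)
    outgoing = ssl.MemoryBIO()  # data the client would put on the wire
    tls = context.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its opening message (the
        # ClientHello) and is now waiting for a reply that never comes.
        pass
    return outgoing.read()


hello = client_hello_bytes("example.com")
# The requested name appears as readable bytes inside the opening message.
print(b"example.com" in hello)
```

No encryption keys have been agreed at this point, which is why the name is readable by anyone on the network path.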

When you send an SNI string denoting that you want to download content from, say, example.com, the server at the other end sends what’s known as a TLS certificate that identifies itself as the example.com site.

If the SNI string and the name in the returned certificate don’t match, which could be the result of a genuine mistake, a server that has gone offline and been replaced with a placeholder, or a deliberate misdirection, your browser will helpfully warn you and refuse to go any further.
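The name check can be sketched in a few lines of Python. This is a deliberately simplified model, not the full matching algorithm that browsers implement: it handles exact names plus a wildcard that covers a single leftmost label, and the function names name_matches and certificate_covers are hypothetical.

```python
def name_matches(requested: str, cert_name: str) -> bool:
    """Compare a requested site name with one name listed in a certificate."""
    requested = requested.lower().rstrip(".")
    cert_name = cert_name.lower().rstrip(".")
    if cert_name.startswith("*."):
        # A wildcard such as *.example.com covers exactly one extra label,
        # so www.example.com matches but a.b.example.com does not.
        suffix = cert_name[1:]                      # ".example.com"
        head, sep, tail = requested.partition(".")
        return bool(head) and sep == "." and "." + tail == suffix
    return requested == cert_name


def certificate_covers(requested: str, cert_names: list[str]) -> bool:
    """True if any name in the certificate matches the requested site."""
    return any(name_matches(requested, name) for name in cert_names)


print(certificate_covers("example.com", ["example.com", "*.example.com"]))
print(certificate_covers("example.com", ["placeholder.example.net"]))
```

A mismatch in the second case is what triggers the browser warning described above.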

But anyone can create a certificate in any name they like, as we show here using the programming language Lua and a popular TLS library based on OpenSSL called luaossl:

All browsers, and any well-written apps, therefore require that you haven’t merely signed the certificate yourself, but that you have had it signed in turn by a so-called certificate authority, or CA, that the app or browser itself already considers trustworthy.

This chain of digital signatures often involves three stages, not just two: you generate a certificate claiming to represent example.com; the CA signs this with what’s known as an intermediate certificate; and the intermediate is signed by a trusted, top-level certificate from the same CA, or from another CA.

Operating systems and browsers typically maintain their own lists of ‘assumed-good’ top-level CAs, known as root CA certificates, and will automatically approve any website-level certificate that is vouched for by a chain of signatures that ends in a trusted root:
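The chain-following logic can be modelled with a toy example in Python. Be aware that this sketch checks names only: a real validator verifies a cryptographic signature at every link, along with expiry dates and other constraints. The ToyCert class, the chain_is_trusted function and all the CA names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCert:
    subject: str  # who the certificate claims to represent
    issuer: str   # who signed it


def chain_is_trusted(chain: list[ToyCert], trusted_roots: set[str]) -> bool:
    """Check a chain ordered from the website certificate upwards.

    Toy model: each certificate must name the next one as its issuer, and
    the last one must be issued by a root that is already trusted.
    """
    if not chain:
        return False
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    return chain[-1].issuer in trusted_roots


TRUSTED_ROOTS = {"Example Root CA"}  # stand-in for the built-in root list

good_chain = [
    ToyCert("example.com", "Example Intermediate CA"),
    ToyCert("Example Intermediate CA", "Example Root CA"),
]
self_signed = [ToyCert("example.com", "example.com")]

print(chain_is_trusted(good_chain, TRUSTED_ROOTS))
print(chain_is_trusted(self_signed, TRUSTED_ROOTS))
```

The self-signed certificate fails because nothing in the trusted list vouches for it, no matter what name it claims.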

Simulation and interception

Even though the end-to-end encryption offered by HTTPS provides additional online safety and security for all of us, there are a few groups that actively seek to work around it.

Some have legitimate and ethical goals; others have evil intent.

IT departments, for example, often want to inspect HTTPS traffic with good intentions, such as blocking phishing scams and other known-bad web pages, detecting and stopping rogue downloads, and preventing the unexpected upload of data that isn’t supposed to leave the company.

Many governments are also keen on having at least some way of keeping tabs on their citizens’ browsing, some under the banner of investigating and preventing online crime when duly warranted; others with the authoritarian goal of spying on, censoring or otherwise controlling the lives of the populace.

And, as you can imagine, cybercrooks who can masquerade as legitimate sites that pass the HTTPS ‘encryption and certification test’ can lure their victims into a thoroughly false sense of security, and much more easily convince them to download and install malware, to enter personal information into phishing sites, or to read and believe fake news.

Just as importantly, cybercriminals who can crack open end-to-end TLS encryption tunnels without raising any alarms or popping up any warnings may be able to keep track of your business operations in considerable detail.

They could be reading your emails, stealing and selling on data belonging to your staff and customers, running off with your intellectual property, misdirecting payments to or from the company, and more, without implanting any active malware on your computers.

But how can they do so, given that this is exactly what TLS is supposed to prevent?

The certificate chain diagrams above suggest three annoyingly obvious ways:

1. Hack, bribe or otherwise co-opt existing root CAs to sign bogus certificates. Many widely-accepted root CAs are officially operated by governments or public sector bodies, and in some countries, even private companies may have little choice but to comply with official government orders to sign certificates that wouldn’t naturally pass the ‘ownership test’. (Fortunately, compelling CAs to sign obviously bogus certificates in huge numbers is not quite as trivial as it sounds, given that the root CA lists used by the most popular browsers and operating systems are subject to scrutiny from a wide range of global sources, public and private, that are unlikely to collude.)
2. Steal or co-opt legitimate organizations’ private keys. Stealing a company’s private keys is often no easier, and sometimes more difficult, than simply compromising its website in the first place and implanting bogus content in what’s known as a supply chain attack, thus getting the company itself to serve all the rogue data ‘for free’. Nevertheless, a stolen private key is enough for an imposter to create a cloned website on a new server, redirect unsuspecting users to it, and then present ‘proof’ that the site is genuine.
3. Hack, bribe or compel users to add trusted root certificates to their own computers. This requires someone, either users themselves, legitimate sysadmins, or attackers, to perform a potentially unwanted operation on every device they want to affect. However, once you install a rogue root CA, no active malware is required: any certificate signed by the unofficial root CA will subsequently be trusted by your computer or your browser.

MitM attacks that aren’t

Most legitimate web filtering tools rely on some version of the third trick above.

A web filtering firewall, for example, will typically generate its own clearly-labelled corporate interception certificate when it is first activated, and then rely on the company’s sysadmins to install this custom root certificate on all their users’ computers, or require remote users to download and install the certificate, before the firewall will allow them to browse beyond the company network.

This is an effective way of scanning encrypted web traffic to detect obvious violations of company policy or to head off unexpected risks, and when done sensitively with each user’s knowledge and consent, is generally considered both ethical and legal.

With an interception CA available, legitimate web filters can pull off what is effectively a MitM (manipulator-in-the-middle) attack as described above, although the description “MitM” is usually reserved for unscrupulous uses of this technique, with cybersecurity vendors preferring marketing-friendly euphemisms including keybridging, decrypt-recrypt, and middleboxing.

When performing this sort of keybridging operation, a web filter can get away with Step 4 in the MitM list above (replying to you and pretending to be the site you thought you had visited directly) because it can automatically generate a fake certificate that pretends to have been issued by the site at the other end, and sign the fake server certificate using its own interception CA.

The middlebox can’t get away with submitting the real certificate that it received from the real website at the far end of the split-in-the-middle connection, because the middlebox doesn’t have the private key for the real certificate, which is why the ‘fake certificate’ ruse is needed.

But the artificial certificate from the middlebox will work fine, without errors or warnings, because your browser trusts the interception CA, and because the middlebox has the right private key to vouch for the fake certificate to your browser.
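The effect of installing an interception CA can be shown with another toy model in Python. As before, this compares names only and performs no real cryptography, so treat it as a sketch of the trust decision and not of how a middlebox is implemented. The names SiteCert, middlebox_forge, browser_accepts and the CA labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCert:
    subject: str  # site name the certificate claims to represent
    issuer: str   # name of the CA that signed it


def middlebox_forge(real_cert: SiteCert, interception_ca: str) -> SiteCert:
    """Copy the site name from the real certificate, but sign the copy with
    the middlebox's own CA, because the site's private key is unavailable."""
    return SiteCert(subject=real_cert.subject, issuer=interception_ca)


def browser_accepts(cert: SiteCert, requested_site: str, trusted_cas: set[str]) -> bool:
    """Toy trust decision: right name, and signed by a CA the browser trusts."""
    return cert.subject == requested_site and cert.issuer in trusted_cas


PUBLIC_CAS = {"Public Root CA"}         # stand-in for the built-in list
INTERCEPTION_CA = "Corp Web Filter CA"  # installed by the sysadmins

real = SiteCert("example.com", "Public Root CA")
forged = middlebox_forge(real, INTERCEPTION_CA)

# Before the interception CA is installed, the forged certificate is rejected.
print(browser_accepts(forged, "example.com", PUBLIC_CAS))
# Once the interception CA is in the trusted list, the same certificate passes.
print(browser_accepts(forged, "example.com", PUBLIC_CAS | {INTERCEPTION_CA}))
```

The only thing that changed between the two checks is the trusted list on the user’s own computer.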

For bad instead of good

Unfortunately, traffic interception CAs aren’t always obvious once they’re installed: unlike security software, surveillance tools or malware, there are no background processes, hidden apps, or suspicious behavioral triggers that might give away their presence automatically.

Cybercriminals who can trick or coerce you or your sysadmins into installing a fraudulent interception CA can thereafter carry out exactly the same sort of MitM shenanigans, but for bad instead of for good, and theoretically trick you into thinking that their fraudulent servers are the real deal.

Similarly, a criminal who can break into or exploit a security vulnerability on a web filtering firewall that you already trust may be able to steal the private key for that middlebox’s interception certificate, which would allow them to impersonate the web filter, and therefore carry out an equally devastating attack.

Simply put, getting hold of a trusted middlebox’s private key is essentially equivalent to getting hold of the private key of every other server on the internet.

What to do?

Never accept add-on trusted CA certificates, whether they’re for verifying apps, scanning web traffic or securing email, unless you are certain that they are ethically-created interception certificates authorized by your own IT team. Avoid apps or ‘free’ online services that say they need to perform HTTPS interception under the guise of improving your searching, your internet speed or some other explanatory ruse.
Learn how to review the lists of trusted certificate authorities on your own computer, and learn what sort of prompts will pop up to warn you if you are tricked into installing a rogue certificate by mistake (ask your IT team for advice). If you see something you aren’t sure about, say something. If your company uses web filtering firewalls, familiarise yourself with how its own middlebox interception certificates identify themselves.
Learn how to review the certificate signing chain in your browser when you visit critically important servers. Even if cybercriminals trick you into installing a fraudulent CA certificate with a likely-sounding name, they can’t clone all the cryptographic identifiers that would show up in a legitimate certificate unless they have the original CA’s private key. Some browsers automatically warn you if the CA certificate used to vouch for a site you visit doesn’t come from their own trusted list, so learn how to check for that.
If you’re a sysadmin looking after web filtering on a network, be sympathetic and humane to your users. Make sure they know how to recognise the company’s official middlebox certificate, so they can tell when their traffic is being legitimately intercepted. Consider exempting some of their low-risk browsing from interception altogether, for example if they visit a well-known bank or communicate with a trusted healthcare provider for personal reasons. That way, they will have true end-to-end encryption on selected connections, and you won’t end up inadvertently collecting and logging personal data you neither need nor want.
Be open and honest with your users about exactly what sort of web traffic you intend to monitor, what might happen to any suspicious data you detect (for example, will unknown files be shared automatically with security vendors?), how detailed your logs will be, and how long you plan to keep them.
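As a starting point for reviewing trusted CA lists, the Python sketch below prints a SHA-256 fingerprint for each root certificate that Python’s default TLS settings have loaded. Treat it as a rough aid only: the list Python sees may differ from the one your browser uses, and on some systems it can come back empty because certificates are only loaded on demand. The function names are our own.

```python
import hashlib
import ssl


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 hash of a certificate's raw (DER) bytes as hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def loaded_root_fingerprints() -> list[str]:
    """Fingerprints of the CA certificates loaded into a default context."""
    context = ssl.create_default_context()
    certs = context.get_ca_certs(binary_form=True)  # list of DER byte strings
    return sorted(sha256_fingerprint(der) for der in certs)


fingerprints = loaded_root_fingerprints()
print(f"{len(fingerprints)} CA certificates loaded")
for fp in fingerprints:
    print(fp)
```

A fingerprint is one of the cryptographic identifiers that an imposter certificate with a likely-sounding name cannot copy, so comparing it against a value published by your IT team or the CA itself is more reliable than trusting the name alone.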

Why not ask how SolCyber can help you do cybersecurity in the most human-friendly way? Don’t get stuck behind an ever-expanding convoy of security tools that leave you at the whim of policies and procedures that are dictated by the tools, even though they don’t suit your IT team, your colleagues, or your customers!

More About Duck

Paul Ducklin is a respected expert with more than 30 years of experience as a programmer, reverser, researcher and educator in the cybersecurity industry. Duck, as he is known, is also a globally respected writer, presenter and podcaster with an unmatched knack for explaining even the most complex technical issues in plain English. Read, learn, enjoy!

Featured image by Tobias Tullius via Unsplash.
