No one measures the value of their website by how many visitors it keeps out, or how many sales leads it blocks, or how many promotional videos it stops prospects from watching. Web pages are supposed to invite network connections, read in untrusted data, and generate cool content in reply.
Learn how cybercrooks turn that workflow on its head to get your server to do their dirty work for them through the use of webshells…
Cybersecurity is full of intriguing initialisms, acronyms and jargon.
Unfortunately, lots of them turn up in reports, advisories and warnings under the assumption that you not only already know what they mean, but also understand all the complexities and nuances of how to deal with them.
For example, you might read that “CISA has just released an advisory about the TTPs being used in a current attack, warning that bad actors are deploying webshells to take control of your network.”
But what does this mean in plain English?
CISA, of course, is shorthand that’s a lot easier and quicker to say than Cybersecurity and Infrastructure Security Agency, part of the US Department of Homeland Security; TTPs are tools, techniques and procedures, or more simply put, ‘how the crooks did it’; bad actors are cybercriminals (not thespians with little ability); and as for webshells…
…well, that’s what we’re here to discuss, because they’re a surprisingly simple but hard-to-spot trick by which attackers can open themselves a sneaky backdoor into your network.
Loosely speaking, if cybercriminals can upload just one harmless-looking file onto your web server, or make just one tiny change to a file that’s already there, they might later be able to wander into your network at will by using nothing more suspicious than a regular web browser.
The obvious cybersecurity problem with web servers is that, unlike most of your business network (and probably unlike your entire home network), they’re quite deliberately configured to listen for and accept network connections from anyone who wants to pay a visit.
No one measures the success of their website by how many prospects and customers it keeps out or judges its value by how good it is at preventing visitors from reading articles, because websites are almost always a “Come one! Come all!” proposition.
In contrast, your home network is probably hooked up to the internet via a single cable plugged into a router that shares that single connection between all the users and devices in your household.
Because all outbound traffic goes through that single point, most home routers, by default at least, automatically suppress all inbound connection requests simply because they have no idea which computer inside the network would be the right one to deal with the connection.
If it’s someone trying to send you an email or browse to a web page, for example, the router doesn’t know whether you even have a mail server or a website, let alone which computer it would be running on.
(This traffic management trick is known as NAT, short for network address translation, because all your outbound requests are rewritten as if they originated from your router. Although this just happens to have some security benefits, it’s security by accident rather than security by design, whatever anyone tries to tell you.)
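As a rough illustration of why unsolicited inbound traffic goes nowhere, here is a minimal Python sketch of a NAT-style translation table. The class name ToyNat, the addresses and the port numbers are all made up for this example, and real routers track far more state than this.

```python
class ToyNat:
    """Toy model of a home router's translation table (illustrative only)."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000   # arbitrary starting port for this sketch
        self.table = {}          # public port -> (inside ip, inside port)

    def outbound(self, inside_ip, inside_port):
        """Rewrite an outbound request so it appears to come from the router."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (inside_ip, inside_port)
        return (self.public_ip, public_port)

    def inbound(self, public_port):
        """Return the inside destination for a reply, or None to drop it."""
        return self.table.get(public_port)


nat = ToyNat("203.0.113.7")
source = nat.outbound("192.168.1.20", 51515)   # a laptop connects out
reply_target = nat.inbound(source[1])          # the reply finds its way back
unsolicited = nat.inbound(25)                  # an outsider tries a mail port
```

The reply is delivered because the router remembers the outbound request that created the mapping. The unsolicited connection has no matching entry, so the lookup returns None and the router has nowhere to send it.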
As we mentioned in a previous article [*], one solution that criminals found to this inadvertent roadblock, which turned up when home routers became common, was to switch to so-called bots, also known as zombie malware.
[*] https://solcyber.com/why-malware-descriptions-arent-what-they-used-to-be/
Bots simply turned the whole client-server model inside-out.
Instead of sitting inside your network and waiting for rogue connections to arrive from the outside to tell them what to do, bots regularly and quietly connect out by themselves to fetch the latest list of malicious instructions.
In contrast, when you decide to expose a computer inside your network to anyone and everyone from the outside, you’re creating a possible jumping-off point that cybercriminals can target without using the sort of call-home-style programming they’d need in zombie-based malware.
As a result, online services such as websites, whether they’re running in the cloud, in your own server room, or merely in the cupboard under the stairs, are usually protected in a different way to a typical laptop or phone.
Your laptop is probably shielded from most if not all inbound probes, but very liberally enabled and equipped for all sorts of outbound connections, from browsing, email and instant messaging to file transfer, gaming and video conferencing.
But your web server, despite being open to connections from outside, is unlikely to be used in as many different ways as your laptop.
On your laptop, you might be gaming, emailing, browsing, streaming, video chatting, reading newly-downloaded documents, and (for all you know) providing a hidey-hole for zombie malware, all at the same time.
That makes it easier for crooks to hide in plain sight, as it were, under cover of the wide range of programmatic and network activity that is going on all the time.
On a well-managed web server, however, you can usually predict what sort of activities will show up in your logs, so you can probably back yourself, in theory at least, to spot any anomalous or unexpected software behaviour more easily than you would on a general-purpose laptop.
The bad news is that there’s a surprisingly easy way for crooks to get zombie-like powers over your website in an innocent-looking way.
They don’t need to use the inside-out-malware trick of calling home for instructions, or of installing and firing up additional networking software on your server, or of generating unusual network traffic.
The problem (though, like many things in IT, it is also regarded as a feature) is something called server-side scripting, which most web servers support and many websites make use of.
Server-side scripts are used to adapt or adjust your site’s appearance for each visitor: miniature programs, embedded in the web page files on your server, are triggered every time the page is viewed, and used to decide what appears in the data that gets sent back.
If you think that sounds like a recipe for trouble, you’re quite right.
For example, if a visitor requests a file called, say, bookreview.html from a typical Linux web server, the server will usually just read in a file of that name and send its contents back to the other end, using the extension .html as a marker to denote that the file is pure HTML, ready for unmodified use.
If the user requests bookreview.php instead, the server won’t send back the text exactly as it appears in the file. Instead, it first passes the file to the PHP subsystem (PHP is a powerful programming language not entirely unlike Perl or Python), which runs the file as a script that generates the content to send back, and that content will often be different every time.
The visitor never actually gets to see the PHP program that was stored in the file bookreview.php; they just see the output of the server-side script instead.
In practice, most server-side scripts look like regular HTML, but with snippets of PHP code embedded inside that get stripped out and processed as scripts.
The HTML parts are sent back unmodified, while the PHP parts are replaced with the content that each embedded script produced.
An HTML file like this would be sent out as-is, where the text in angle brackets is HTML code representing instructions to the browser at the other end on how to display it:
But with slightly different text markers in it, a server-side script can include embedded PHP commands, tagged not with <…> but instead with <?php...?>:
The PHP mini-program echo date("l"); inside the script markers simply prints out the current day as a text string, so the HTML that actually gets sent back will have everything between the <?php...?> markers replaced, like this:
The pure HTML tags and the regular text get reproduced verbatim, but visitors won’t know, and indeed can’t tell, that the HTML they are seeing didn’t come directly from a static file on the server.
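To make the substitution concrete, here is a toy Python sketch that mimics what the server-side engine does to a page. It is only a stand-in: the page text in TEMPLATE is a made-up example, the render function is hypothetical, and the toy recognises just the single echo date("l"); statement rather than running real PHP.

```python
import re
from datetime import date

# Hypothetical page: plain HTML with one embedded PHP snippet.
TEMPLATE = """<html>
<body>
<p>Today is <?php echo date("l"); ?>.</p>
</body>
</html>"""

# Matches anything between the PHP-style script markers.
SCRIPT_BLOCK = re.compile(r"<\?php(.*?)\?>", re.DOTALL)

def render(template, today):
    """Replace each script block with its output; pass the rest through."""
    def run_block(match):
        code = match.group(1).strip()
        # This toy understands exactly one statement and nothing else.
        if code == 'echo date("l");':
            return today.strftime("%A")   # full weekday name
        return ""
    return SCRIPT_BLOCK.sub(run_block, template)

page = render(TEMPLATE, date(2024, 1, 1))   # 1 January 2024 was a Monday
print(page)
```

The output contains the line <p>Today is Monday.</p> and no trace of the script markers, which is all that a visitor's browser would ever receive.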
Windows Internet Information Services, better-known as IIS, calls server-side scripts by the equally self-explanatory name Active Server Pages (or ASP.NET in their modern form), which are usually denoted with .asp or .aspx at the end of the filename instead of .php.
The magic script tags are slightly different, with markers such as @{…} and <%…%> instead, and the programming languages used for scripting are typically Visual Basic Script, JavaScript or C# rather than PHP, but the underlying principles are the same.
There are two worrying security problems here.
Firstly, if the file served up to a visitor contains a rogue script that shouldn’t be there, the HTML that actually gets sent out won’t contain any trace of the server-side script at all, so there is nothing even for well-informed visitors to your site (or for your own web testers) to notice in their browsers.
Secondly, unlike HTML, which gets processed and rendered at the other end of the network connection in the visitor’s browser, and therefore puts them at risk if something goes wrong, the server-side scripts run right inside your network, on the web server itself, with the same privileges and network access as the web server software.
You can probably guess where this is going, and you might be surprised to find out just how simple and hard to spot a rogue server-side script can be.
Here’s a minimalist example, based on two features built into PHP. (Windows server scripts offer similar power combined with similar danger, albeit that they usually use languages other than PHP.)
The first feature that attackers can abuse is that when a web request arrives with what’s known as a query string at the end (text following a question mark in the URL), PHP automatically extracts the query values into a variable called $_GET, with each query item extracted individually.
Requesting a URL like this:
…would automatically make the following PHP variables available to the server-side script ourscript.php:
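As a hedged analogue in Python, the breakdown of a query string into named items can be sketched with the standard urllib.parse module. The hostname example.com is an assumption made for illustration; the query string ?name=duck&blog=solcyber is the one used in this article, and the printed lines simply imitate how the values would be addressed in PHP.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL: only the query string comes from the article.
url = "https://example.com/ourscript.php?name=duck&blog=solcyber"

# Split off the part after the question mark, then break it into named items.
query_string = urlsplit(url).query
get_items = dict(parse_qsl(query_string))

for key, value in get_items.items():
    print(f"$_GET['{key}'] = '{value}'")
```

Each name=value pair in the URL becomes one named entry, so the script can pick out name and blog individually without doing any parsing of its own.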
The next feature that is handy for attackers is the dangerously powerful PHP function eval(), short for ‘evaluate’, which takes any text you send into it, compiles it as PHP code, and runs it as a brand new PHP program.
This means that attackers can build new commands at run-time from data that came in from outside your network.
In the script example above, where we printed out the day in text form, we used the hard-coded PHP code snippet:
But if we had written this in our server-side script instead:
…and then we visited the page using a URL such as:
…then the server-side scripting engine would process the request like this:
Put plainly, we’ve just come up with a really simple way of tricking the server into running any code we like, just by visiting a URL that has ?code=our_rogue_PHP; on the end of it.
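The difference between hard-coded script code and code built from request data can be sketched in Python, whose own eval() function serves here as a loose stand-in for PHP's. The function names and the example.com URLs are hypothetical, and Python's eval() only handles expressions, so this is an illustration of the risk rather than a faithful model of PHP.

```python
from urllib.parse import urlsplit, parse_qs

def run_hard_coded():
    """The code is fixed by the page author; visitors cannot change it."""
    return eval("6 * 7")

def run_from_request(url):
    """DANGEROUS pattern, shown only to illustrate the risk.

    The text to evaluate comes straight from the URL's query string,
    so whoever writes the URL decides what gets run.
    """
    query = parse_qs(urlsplit(url).query)
    code = query.get("code", [""])[0]
    return eval(code) if code else None

fixed = run_hard_coded()
chosen_by_visitor = run_from_request("https://example.com/page.php?code=6*7")
no_code_given = run_from_request("https://example.com/page.php")
```

Both calls that evaluate something produce the same answer, but in the second case the expression was chosen by the person who typed the URL, not by the person who wrote the page. That is why data from outside should never be handed to eval().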
We could get a bit trickier, of course, and give the $_GET[…] item that triggers the server-side code execution a name that we know, but that other people aren’t likely to guess, so that it acts as a sort of ‘attack password’, like this:
This way, even if we sneak our rogue script into a file that already relies on query strings, as in the ?name=duck&blog=solcyber example above, the page will still behave normally when visitors use the sort of URLs that are expected, but we can subvert the script whenever we like by adding the secret incantation ?A367E8F4=rogue_code instead.
In fact, there are lots of other ways to disguise the trigger for a simple webshell, for example by using the built-in PHP variable $_COOKIE, which extracts named cookies from the web request in the same way that $_GET extracts items from the query string.
What we have shown here, without being overly technical, is how easy it is to create what is essentially a miniature bot or zombie file that sits innocently on a web server, waiting for secret instructions to arrive in web requests that look harmless.
We’ve also seen how easy it is to disguise these instructions, in a way that makes it hard or impossible to know what to look for in advance, or even in retrospect, if all you have are web logs and not the program code of the webshell itself.
And this sort of rogue webshell script can easily be engineered not to interfere with the usual behaviour of the site, for example by hiding its remote commands in unused query items in the URL or sending them in the HTTP headers as innocent and apparently irrelevant cookies that your server would usually just ignore.
Worse still, unlike the sort of bots or zombie malware that might infect a laptop computer, webshells don’t need to run in the background and keep calling home to see if their handlers have any malevolent instructions for them to follow.
Webshells are typically activated on demand simply by visiting the right URL on an infected site, with the right sort of secret data squirrelled away in the web request, so that cybercriminals can even use regular browsers to trigger their attacks.
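One modest check that follows from this is to compare the query items in logged requests against the ones each page is known to use. The Python sketch below assumes a hand-maintained allowlist (EXPECTED_PARAMS) and a hypothetical helper called unexpected_params; it can only see what is in the URL, so it would miss triggers hidden in cookies or request bodies, which ordinary access logs often don't record.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical allowlist: the query items each page is expected to receive.
EXPECTED_PARAMS = {
    "/ourscript.php": {"name", "blog"},
}

def unexpected_params(request_target):
    """Return query item names that the requested page is not known to use."""
    parts = urlsplit(request_target)
    allowed = EXPECTED_PARAMS.get(parts.path, set())
    seen = {key for key, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return sorted(seen - allowed)

normal = unexpected_params("/ourscript.php?name=duck&blog=solcyber")
odd = unexpected_params("/ourscript.php?name=duck&A367E8F4=rogue_code")
print(normal, odd)
```

An unexpected item is not proof of an attack, because plenty of harmless tracking parameters turn up in URLs, but it is the sort of anomaly that is easier to spot on a server whose normal behaviour is well understood.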
Clearly, implanting rogue files onto a web server (or modifying existing files, for that matter) never ends well, even if what’s implanted is not executable code in the form of a webshell or other malware.
After all, an attacker who could unlawfully change even a single line of text in a single HTML file could cause you to start churning out fake news, making obviously incorrect claims, spewing unacceptable insults, or foisting rogue script code onto your site visitors, which would not reflect well on your organisation at all.
But webshells are much worse, because they make ideal beachheads for criminals to initiate attacks in the heart of your network; they are often trivial to modify, even by technically unsophisticated crooks, so they are tricky to search for reliably; and they typically hide in the midst of a sea of server files that are littered with similar-looking but legitimate server-side scripts of their own.
As we have seen, a basic but general-purpose webshell can even be stripped down to a few tens of text characters, small enough for a rogue insider to tap in at an inattentive developer’s unattended keyboard in a few unobtrusive moments.
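To show both the idea and its limits, here is a deliberately simple Python sketch that flags lines of PHP source where eval() is fed from request-supplied variables. The pattern, the function name flag_lines and the sample text are all assumptions made for illustration, and a trivially reworded webshell would slip straight past it, which is exactly why searching for these files reliably is so hard.

```python
import re

# Heuristic: eval( ... ) whose argument mentions a request-supplied PHP variable.
SUSPICIOUS = re.compile(
    r"\beval\s*\(\s*[^;]*\$_(GET|POST|COOKIE|REQUEST)\b",
    re.IGNORECASE,
)

def flag_lines(php_source):
    """Return (line number, text) pairs that match the heuristic."""
    hits = []
    for number, line in enumerate(php_source.splitlines(), start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.strip()))
    return hits

# Made-up sample: two legitimate snippets around one rogue line.
sample = """<p>Today is <?php echo date("l"); ?>.</p>
<?php eval($_GET['A367E8F4']); ?>
<?php echo htmlspecialchars($_GET['name']); ?>"""

hits = flag_lines(sample)
print(hits)
```

Only the middle line is reported. The legitimate lines that merely read a query item, or that run hard-coded PHP, are left alone, but so would be any rogue line that builds its eval() call in a less obvious way.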
Here are some tips to consider:
Your hosting company may do a good job of protecting your site from incidents triggered by other customers on the same infrastructure, because that’s part of generally protecting their own business and network.
But that’s not the same as the human touch of a dedicated managed security service provider who understands your business, helps you manage any specific risks that apply to you and your own customers, knows what to look out for on your behalf, and has the experience to jump in and act promptly if they spot it.
PS. If there are any knotty topics you’re keen to see us cover, from malware analysis and exploit explanation all the way to cryptographic correctness and secure coding, please let us know. DM us on social media, or email the writing team directly at amos@solcyber.com.
More About Duck
Paul Ducklin is a respected expert with more than 30 years of experience as a programmer, reverser, researcher and educator in the cybersecurity industry. Duck, as he is known, is also a globally respected writer, presenter and podcaster with an unmatched knack for explaining even the most complex technical issues in plain English. Read, learn, enjoy!