XSS - What are Cross-Site Scripting Attacks?
Cross-Site Scripting Attacks (XSS Attacks) are among the most dangerous attacks in web development. Here's how they work and how to defend against them.
Understanding Cross-Site Scripting (XSS) Attacks
Cross-Site Scripting (XSS) attacks are all about running JavaScript code on another user's machine.
This is achieved by "injecting" some malicious JavaScript code into content that's going to be rendered for visitors of a website. Every visitor is then going to execute that malicious code and that's where the bad things start.
But first things first: How could such malicious code be injected?
Injecting Malicious Code
In the example shown in the above video and code snippet, you see that the user is able to enter a message and an image URL, both of which are then output on the page:
<section id="user-input">
  <form>
    <div class="form-control">
      <label for="user-message">Your Message</label>
      <textarea id="user-message" name="user-message"></textarea>
    </div>
    <div class="form-control">
      <label for="message-image">Message Image</label>
      <input type="text" id="message-image" name="message-image" />
    </div>
    <button type="submit">Send Message</button>
  </form>
</section>
<section id="user-messages">
  <ul></ul>
</section>
// ...
function renderMessages() {
  let messageItems = ''
  for (const message of userMessages) {
    messageItems = `
      ${messageItems}
      <li class="message-item">
        <div class="message-image">
          <img src="${message.image}" alt="${message.text}">
        </div>
        <p>${message.text}</p>
      </li>
    `
  }
  userMessagesList.innerHTML = messageItems
}
// ...
(also check out the full code example)
The messages added by the user are in the end output by using innerHTML.
innerHTML takes a string and interprets it as HTML that's then being rendered to the screen.
So the above code example leads to a <li> with an image and some text inside of it being rendered.
But what if the user now uses the form to enter the following message?
<script>
  alert('Hacked!');
  // ... do more bad things
  // e.g. send a fetch() request to steal data
</script>
This would be output as part of the message via innerHTML and therefore, the <script> element would indeed be rendered by the browser.
But if you use the above example, you'll notice that no alert is shown. So it looks like the injected script code didn't actually execute.
And that's indeed the case.
Modern browsers protect you against this very basic form of XSS attack. <script> elements "injected" via innerHTML are not executed by browsers!
So this won't work.
But here's an approach that will work: Abuse the fact that the <img> src is set to some user input.
Keep in mind that we set the image like this in the JavaScript code:
// ...
messageItems = `
  ${messageItems}
  <li class="message-item">
    <div class="message-image">
      <img src="${message.image}" alt="${message.text}">
    </div>
    <p>${message.text}</p>
  </li>
`
// ...
In the above snippet, a simple string is built by using template literal syntax.
This string is then later handed off to innerHTML.
What if we manipulated message.image such that it actually changes the to-be-rendered element entirely, and not just its src?
Here's what a user could enter in the form (for the image URL) to achieve this:
invalid-page.com/no-image!jpg" onerror="alert('Hacked!')
This might look weird, but in the end it leads to this string being set via innerHTML:
<li class="message-item">
  <div class="message-image">
    <img src="invalid-page.com/no-image!jpg" onerror="alert('Hacked!')" alt="Test" />
  </div>
  <p>Test</p>
</li>
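To see how the injected value ends up rewriting the element, here is a minimal sketch that runs the same template outside the browser. The helper name buildMessageItem and the exact input string are assumptions for illustration; the input is simply the value that, given the template above, produces the markup shown.

```javascript
// Hypothetical helper: the same template literal that renderMessages() uses,
// written on one line and extracted so it can run without a DOM.
function buildMessageItem(message) {
  return `<li class="message-item"><div class="message-image"><img src="${message.image}" alt="${message.text}"></div><p>${message.text}</p></li>`;
}

const maliciousMessage = {
  text: 'Test',
  // The first double quote closes the src attribute early.
  // Everything after it is parsed as a brand-new attribute (onerror).
  image: `invalid-page.com/no-image!jpg" onerror="alert('Hacked!')`,
};

const html = buildMessageItem(maliciousMessage);

// Log the string that would be handed to innerHTML.
console.log(html);
```

The template never distinguishes between "markup written by the developer" and "text typed by the user". Both end up in one string, and the browser only sees the final result.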
Do you see the problem?
The whole <img> was manipulated!
The attacker set the image src to an invalid URL which will fail to load! And by setting onerror (a valid attribute of <img>!) we can define JavaScript code that should execute when the image fails to load.
So we force the image to fail loading and we provide the "remedy" by setting onerror to our malicious code. Pretty clever...
In this case, we'll see the "Hacked!" alert but of course we could do worse things with our injected JavaScript code.
Hacking Yourself vs Others
Thus far, with the above example, we're only hacking ourselves though. There's no server or database involved, all the code only executes locally.
But of course that's just the case because it's a basic example, focusing on the frontend.
In reality, the user-generated content (i.e. message + image URL) would be sent to a server and stored in some database.
Other users would then fetch this content, it would be rendered on the page in their browser and boom ... we hacked them!
The injected JavaScript code could do anything, for example steal authentication tokens (also see my dedicated article + video on that topic).
This is a huge problem!
And as you can see: It's not too difficult to add this malicious code.
How Can You Protect Your App?
Here's a simple yet important rule: Always sanitize user-generated content before storing and serving it!
"Sanitizing content" means that you want to remove all malicious parts that could be inside of user-generated content.
There are libraries that help you with that - e.g. this one for JavaScript/Node. Similar libraries exist for other programming languages - you always want to check the package description to find out whether it fits your project.
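Besides a dedicated library, a narrow allow-list check for individual fields can already reject a lot of bad input. The following is only a sketch under assumptions: the helper name isAllowedImageUrl is made up, and it assumes that the image field should only ever contain an absolute http(s) URL. It validates the input rather than cleaning it, so it complements a sanitizing library and output escaping instead of replacing them.

```javascript
// Hypothetical helper: accept only absolute http(s) URLs for the image field.
function isAllowedImageUrl(value) {
  let url;
  try {
    // The URL constructor throws a TypeError if the string is not an absolute URL.
    url = new URL(value);
  } catch {
    return false;
  }
  // Allow-list of protocols; everything else (javascript:, data:, ...) is rejected.
  return url.protocol === 'http:' || url.protocol === 'https:';
}

console.log(isAllowedImageUrl('https://example.com/cat.jpg'));
console.log(isAllowedImageUrl('javascript:alert(1)'));
console.log(isAllowedImageUrl(`invalid-page.com/no-image!jpg" onerror="alert('Hacked!')`));
```

A check like this would run on the server before the message is stored. Keep in mind that a syntactically valid URL can still contain characters that are dangerous inside HTML, so the value should still be escaped when it's output.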
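Sanitizing and validating input does not just help with XSS, it also reduces the risk of SQL and NoSQL injection (though for those, parameterized queries remain the primary defense).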
You should only store cleaned (= sanitized) content in your databases. By doing that, you'll ensure that you'll only serve secure content to your users.
In addition, you can look into escaping content in your client-side JavaScript code.
That means that you also have a sanitization step on the frontend - in addition to the one on the backend.
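As a rough illustration of what escaping means, here is a minimal sketch. The names escapeHtml and buildMessageItem are assumptions for this example, not part of the original code. The idea is to replace the characters that have a special meaning in HTML with their entities before the value is placed into the template.

```javascript
// Characters with special meaning in HTML and their entity replacements.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: turn arbitrary input into text that is safe to place
// into HTML content or into a quoted attribute value.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (char) => HTML_ESCAPES[char]);
}

// Same template as before, but every user-provided value is escaped first.
function buildMessageItem(message) {
  const image = escapeHtml(message.image);
  const text = escapeHtml(message.text);
  return `<li class="message-item"><div class="message-image"><img src="${image}" alt="${text}"></div><p>${text}</p></li>`;
}

const escapedHtml = buildMessageItem({
  text: 'Test',
  image: `invalid-page.com/no-image!jpg" onerror="alert('Hacked!')`,
});

// The double quote is now &quot; and can no longer close the src attribute.
console.log(escapedHtml);
```

With the quotes escaped, the whole input stays inside the src attribute. The image still fails to load, but no onerror attribute is created, so no injected code runs.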
Modern frameworks like Angular, React.js or Vue have this kind of escaping built in.
Client-side escaping is just a bonus though - you should really only store secure content in your database!
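If you're not using such a framework, another option on the client is to avoid building HTML strings at all and to create the elements through DOM methods instead. The sketch below makes a few assumptions: createMessageItem is a made-up helper, and the document object is passed in as a parameter (doc) so that the function doesn't depend on a global.

```javascript
// Hypothetical helper: build one message <li> without innerHTML.
// "doc" is expected to be the browser's document object.
function createMessageItem(doc, message) {
  const item = doc.createElement('li');
  item.className = 'message-item';

  const imageWrapper = doc.createElement('div');
  imageWrapper.className = 'message-image';

  const image = doc.createElement('img');
  // setAttribute() stores the value as plain data. Quotes inside the value
  // can't end the attribute or create new attributes such as onerror.
  image.setAttribute('src', message.image);
  image.setAttribute('alt', message.text);

  const text = doc.createElement('p');
  // textContent never parses its value as HTML.
  text.textContent = message.text;

  imageWrapper.appendChild(image);
  item.appendChild(imageWrapper);
  item.appendChild(text);
  return item;
}
```

In the browser, the list from the earlier example could then be filled with something like userMessagesList.replaceChildren(...userMessages.map((message) => createMessageItem(document, message))). The user input is only ever treated as attribute values and text, never as markup.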
A Hidden Danger
But unfortunately, there is also another source of XSS attacks - besides unsanitized user-generated content.
Third-party JavaScript libraries that are included in your frontend project code!
In modern client-side applications, we typically use a lot of third-party libraries. From frameworks like Angular to utility libraries like lodash.
And the code that's included in those libraries also runs as part of your client-side code.
What if one of those libraries was compromised? What if it contained malicious code?
You wouldn't notice - and you would be in deep sh...
A compromised library that's included in your code is a huge problem.
Protecting Against Compromised Libraries
You can reduce the danger of including a dangerous third-party library.
For one, you of course may want to stick to the bigger, more popular and well-maintained libraries. Companies like Google (which maintains Angular, for example) probably have no interest in breaking into your app or attacking your users.
But even such popular libraries could theoretically contain security holes.
Angular might be a bad example because they have a full team working exclusively on Angular, but if you think about other popular libraries, they often have just a small team (sometimes just one person) working on the library. And those teams might not even be doing that full-time.
It could happen that a malicious pull request is sneaked into the code repository and all of a sudden a trusted library gets compromised.
Of course you can check the source code of the library you're using. This can be cumbersome but it's the only way of reaching 100% certainty that nothing's wrong with the library. Though, of course, you'll need to repeat this with every update and new version of the library...
Commands like npm audit also help you. At least they surface known vulnerabilities. Unknown vulnerabilities will of course still be a danger.
Realistically, you probably won't be able to achieve 100% protection against XSS unless you write all code on your own.
But by adhering to the above-mentioned practices, you can reduce the risk to a minimum. And you absolutely should because XSS attacks can be really bad!