Synopsis

Let’s examine, understand, and learn how to prevent one of the most common attacks people use to ‘hack’ websites.

Setup

We’ll start by setting up an ultra-lightweight PHP page (server side) connected to a MySQL database, simulating a web application.

We need mysql and php. On macOS:

 $ brew install mysql && brew install php

On Linux:

Install mysql via the instructions at MySQL’s instalation docs.
Install php via the following command:

apt-get install php5-cli

Now we need to create two files, our server, and a SQL database setup script.

The server side PHP script:

 <?php 
 $link = mysqli_connect("127.0.0.1", "root", "", "main"); 
 $sql = "SELECT * FROM users;"; 
 $result = mysqli_query($link, $sql); 
 while($row = $result->fetch_assoc()) { 
     echo $row["first_name"] . ", " . $row["last_name"] . ", " . $row["email"] . "\<br\>"; 
 }
 
 mysqli_close($link); 
 ?>

Save as users.php. And the DB setup script:

 CREATE DATABASE main; 
 USE main;
 
 CREATE TABLE `users` ( 
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT, 
 `first_name` varchar(30) NOT NULL DEFAULT '', 
 `last_name` varchar(30) NOT NULL DEFAULT '', 
 `email` varchar(30) NOT NULL DEFAULT '', 
 `password` varchar(60) NOT NULL DEFAULT '', 
 PRIMARY KEY (`id`), 
 UNIQUE KEY `email` (`email`) 
 ) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8;
 
 INSERT INTO users (first_name,last_name,email,password) VALUES ('Jon','Doe','jon@gmail.com','pass');

To run all of this, we do the following commands, from the directory where the PHP script is saved:

 $ mysqld 
 $ mysql -h localhost -u root 
 > source UserTable.sql 
 > exit 
 $ php -S 0.0.0.0:3000

And we’re good to go. Verify by visiting http://localhost:3000/users.php in your browser. You should see our only result:

Jon, Doe, jon@gmail.com

Now we can begin.

Accepting User Input

The anatomy of our server, so far, is as follows: The database holds data (added by our SQL script), retrieved via SQL queries in a PHP script, which in turn sends the results of those queries off to clients that request data from our web server.

So far so good. This is roughly how most web applications function; often the flow of data is the same, only the technologies acting as a conduit for that data change. An injection attack is impossible at this point, as nowhere in this flow do we accept user input, so there’s no way for a user to dictate what happens on our server. The only “input” a user has is whether or not to request data at all.

In the above server, we have a table of users. Right now, the whole table is returned at once. If we add the ability to request only users with a client specified name, we might make the application more useful, but we also introduce an attack vector, via user input that we accept as a criteria for returning results from the users table.

Let’s do two things to our server:

  1. Add more users to our table
  2. Add the ability to query users by name.

Add the following at the end of our SQL script:

 INSERT INTO users (first_name,last_name,email,password) VALUES ('Jane','Doe','jane@gmail.com','pa$$'); 
 INSERT INTO users (first_name,last_name,email,password) VALUES ('Jim','Doe','jim@gmail.com','Pa$5'); 
 INSERT INTO users (first_name,last_name,email,password) VALUES ('Janice','Doe','janice@gmail.com','P@7$');

We’ll also make the server read a GET parameter when the client makes a request and use that to determine the first name of the person we should retrieve. This is accomplished by changing our SQL statement to the following:

 $sql = "SELECT * FROM users WHERE first_name='" . $_GET['first_name'] . "';" ;

Notice that we 1) Parse the requested URL for it’s GET parameters with the $_GET variable and 2) Use a specific value from the query parameters inside our SQL statement to retrieve only certain rows from our database.

To make these things take effect, we’ll execute a similar sequence of commands as before, but this time wipe out the database before re-running our SQL script.

 $ mysql -h localhost -u root 
 > DROP DATABASE main; 
 > source UserTable.sql 
 > exit 
 $ php -S 0.0.0.0:3000

Now, if we visit localhost:3000/users.php?first_name=janice, we see

Janice, Doe, janice@gmail.com

if we visit localhost:3000/users.php?first_name=bob, we see nothing.

SQL Injection

The problem here is that you could put practically anything in the GET parameter for first_name, including special characters, special SQL keywords and other values like TRUE and FALSE.

What if we request localhost:3000/users.php?first_name=jan'ice?

Fatal error: Call to a member function fetch_assoc() on a non-object in users.php on line 7

Interesting. Looks like when we make our request, PHP throws an error. This is a dead giveaway that some user input field on a website is vulnerable to an injection attack. So what’s going on here? We take whatever follows the equals sign after the first_name parameter and drop it into our SQL query. So the actual SQL query generated for the above request is as follows:

SELECT * FROM users WHERE first_name=' + FIRST NAME PARAMETER +  ';

So if first_name=jan'ice, then the final query will be

SELECT * FROM users WHERE first_name='jan'ice';

As we can see, there’s an unterminated quote sequence. Hence when we run the SQL query, we get nothing back, and turning the result of the SQL query into an associative array fails.

How do we use this to our advantage? Well, the WHERE clause in our SQL statement is what we control. What if we set it to something that’s always true? That would allow us to return all results from the database, possibly disclosing user records we’re not supposed to see.

Try again with first_name=' OR 1=1 OR '.

Why is this value special? Let’s have a look at the generated SQL query:

SELECT * FROM users WHERE first_name='' OR 1=1 OR '';

When SQL goes through the rows of our DB to decide whether the row matches our WHERE clause criteria, it’s evaluating the following, say, for Janice’s record:

Return this record? in DB in Query ----------------- -------------------- first_name=janice first_name= '' or 1=1 or '' first_name = '' FALSE 1=1 TRUE '' FALSE

Breaking down the SQL logic, we see that 1=1 will always evaluate to true, no matter what row we’re looking at. So by setting our GET parameter to ' OR 1=1 OR ' we successfully return everything in the table.

This is a contrived example given the above PHP script, but many real-world websites do the following to authenticate users:

 $sql = "SELECT * FROM users WHERE username='" . $_POST['username'] . "' AND password='" . $_POST['password'] . "';" ; 
 $result = mysqli_query($link, $sql);
 
 if (mysql_num_rows($result) > 0) { 
     // User is authenticated... 
 } else { 
     // User not found 
 }

In this case, the script looks for a row in the user table with a username equal to an unauthenticated client’s username, and then verifies that the row’s and the client’s passwords match. If we pass in our username as

' OR 1=1 OR '

then, as we established above, all rows will be returned, authenticating the user, bypassing the intended security mechanism.

Other exploits like this abuse the comment syntax in SQL queries, flow control statements like IF or WHEN, or character encodings.

Fixing the Hole

This example of a SQL injection attack is really simple to fix; and in general most SQL injection attacks are easily resolved by escaping user input, as we will see. So why do we read article headlines mentioning millions of websites being vulnerable to SQL injection attacks? Laziness, ignorance, and reliance on outdated technologies, mostly.

For defeating SQL injection attacks, there a couple of options, but essentially both come down to preventing the abuse of SQL syntax in a user formulated input. One approach is as follows, changing our SQL statement to:

 $sql = "SELECT * FROM users WHERE first_name='" . mysqli_real_escape_string($link,$_GET['first_name']) . "';" ;

This approach simply takes the user input string and replaces the special characters with special characters that SQL recognizes as specifically user input characters, instead of valid SQL symbols. Specifically, it takes the ' character to \' so that SQL isn’t tricked into thinking the user input represents multiple statements.

It should be noted that if a node backend was used in place of a PHP script, this ‘escaping’ protection strategy would be done automatically.

The even better approach, however, is to use a construct called a prepared statement, which, at a high level, programmatically separates SQL queries from the arguments passed to them. Put another way, the value that you insert into the SQL query from user input is taken as an object evaluated separately from the SQL query, removing the possibility of muddling user input data with SQL query syntax. These prepared statements require slightly more invasive code changes, but are even more robust to attack than merely escaping user input correctly.

Second Order SQL Injection Attacks

The esoteric cousin to normal SQL injection, second order injection attacks are much harder to overcome, but thankfully are hard to implement and rely on some SQL injection security being implemented, but not complete security, so the window of opportunity for such an attack is rare.

In a second order injection attack, malicious user input is saved in a database, escaped, entered as a prepared statement or otherwise. While not used outright for an exploit, if a cleverly crafted value is saved in a database and reused later in an unsafe manner, it could still be used to exploit a vulnerability in SQL query security. For instance, if one created an account on a website with the username

' OR 1=1 OR '

it might be correctly entered into the database without problem, but be called upon in another part of the website, say, concatenated into a different SQL query string allowing it to be used maliciously, as intended by the attacker.

The way to prevent second order injection attacks is simple: escape or use prepared statements everywhere you perform SQL queries. Don’t assume that because information made it into your database, it can be trusted.

More Reading & Other Resources