How to Write Secure Software for the Web

Web application security is critical. Security holes in the web applications you write can result in your site being defaced, which is merely very embarrassing. They could result in your users’ information being stolen, which is annoying and might land you in legal hot water in your jurisdiction. Or they could result in your host being used for spamming, distributed denial of service attacks, or spreading malicious software, all of which makes the Internet a worse place for everyone else. Some simple steps, and the correct mindset, can reduce the risk of this happening because of the applications you have written.

Steps

1. Avoid SQL injections. SQL injections are possibly the most common way in which web software is compromised today. The example from Wikipedia follows. Consider the following line of pseudocode.

statement := "SELECT * FROM userinfo WHERE id = " + a_variable + ";"

This works fine if “a_variable” can be trusted. However, if it’s a user-supplied variable, an attacker could supply this:

1;DROP TABLE users

Which would result in this query being executed:

SELECT * FROM userinfo WHERE id=1;DROP TABLE users;

Which is obviously a terrible thing to have. The solution is to sanitise any user-supplied variables, religiously. PHP’s original MySQL extension offered the mysql_real_escape_string function for this, but that extension was removed in PHP 7.0; its successors, mysqli and PDO, support prepared statements, which keep user-supplied values separate from the SQL text altogether. Database libraries for other programming languages have equivalents. Use them, and you’ll be much safer. There are also object-relational mappers (like the one in Django for Python) that save you from having to write SQL code at all, and take care of escaping strings before passing them to your database; consider using one of these.
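
A parameterised query could look like the sketch below. It uses PDO with an in-memory SQLite database purely so that it is self-contained; the table layout and the helper name find_user are assumptions made for illustration, not part of the original example.

```php
<?php
// Hypothetical helper: look up rows of userinfo by id.
// The id is bound as a parameter, so it is never parsed as SQL.
function find_user(PDO $db, string $id): array
{
    $stmt = $db->prepare('SELECT * FROM userinfo WHERE id = :id');
    $stmt->execute([':id' => $id]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo database (assumption: the pdo_sqlite driver is available).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE userinfo (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO userinfo (id, name) VALUES (1, 'Bob')");

$rows = find_user($db, '1');                   // finds the row for Bob
$evil = find_user($db, '1;DROP TABLE users');  // treated as a plain value, matches nothing
```

Because the attacker’s string is only ever compared against the id column, the second statement it tries to smuggle in is never executed.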

2. Never open any filename based on a parameter which the user supplies. For example, if you were displaying a file to a user, you might be tempted to have a URL like display.php?file=something.txt, and in your PHP code use this:

require("/www/awesomeapp/text-files/" . $_GET["file"]);

This is extremely dangerous, as it allows an attacker to use sequences of “../” (each of which moves up to the parent directory) to include any file on the file system, for example by going to display.php?file=../../../../../etc/passwd.

At best, this will allow them to gather information about your server’s configuration, which could provide information useful for a different attack. At worst, if PHP’s require or include functions (or an equivalent in your language) are used to include the contents of the file, it can allow code execution (by manipulating the User-Agent header then including the file /proc/self/environ).

The solution to this is either to find some other way to include the file that does not depend on user input, or to sanitise the supplied file name very carefully. Check each character for validity on the principle that anything not explicitly allowed is forbidden. In other words, don’t try to find and replace bad characters in a string. Make a list of the characters that are explicitly allowed, and remove anything that is not in that list. That 1) is simpler, and 2) defends against attack vectors you hadn’t thought of.
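
A minimal sketch of such an allowlist filter follows. The helper name clean_filename and the particular set of allowed characters are assumptions; adjust the set to whatever your file names actually need.

```php
<?php
// Hypothetical helper: keep only explicitly allowed characters in a file name.
// Returns null when nothing usable is left.
function clean_filename(string $name): ?string
{
    // Allowlist: ASCII letters, digits, underscore, hyphen and period.
    // Everything else (slashes, backslashes, NUL bytes, ...) is dropped.
    $clean = preg_replace('/[^A-Za-z0-9_.-]/', '', $name);

    // Reject empty results and names starting with a period,
    // which covers "." and ".." as well as hidden files.
    if ($clean === null || $clean === '' || $clean[0] === '.') {
        return null;
    }
    return $clean;
}
```

With this filter, something.txt passes through unchanged, while ../../../../../etc/passwd loses its slashes, is left starting with a period, and is rejected. Note that backslashes are dropped too, simply because they were never on the list.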

3. Sanitise any data that you write to a page. Your code might accept a parameter called “person” (e.g. index.php?person=Bob), and your code would contain something like this:

<h1>Hello, <?php echo $_GET["person"];?></h1>

This allows an attacker to write arbitrary JavaScript to the page, meaning that if they convinced somebody to simply follow a link to the site, they could make that person’s browser carry out an action with a side-effect (for example, posting a comment, or submitting any other form). Replace any characters that have a special meaning in HTML (quote marks, &, <, and >) with their equivalent HTML entities before writing the data to a page.

Many templating engines will do this for you automatically, and will only allow HTML characters in an outputted variable if they are explicitly declared safe. You may want to consider using one.
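
If you are escaping by hand in PHP, the built-in htmlspecialchars function does the entity replacement. The sketch below wraps it; the helper names escape_html and greeting are assumptions for illustration.

```php
<?php
// Hypothetical helper: escape a value for use in HTML text or a quoted attribute.
function escape_html(string $value): string
{
    // ENT_QUOTES converts both single and double quotes;
    // ENT_SUBSTITUTE replaces invalid UTF-8 instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical renderer standing in for the index.php snippet.
function greeting(string $person): string
{
    return '<h1>Hello, ' . escape_html($person) . '</h1>';
}
```

Calling greeting($_GET["person"] ?? "") then writes any injected script tag to the page as visible text rather than as markup.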

If you need to allow some limited HTML (for example, for allowing text formatting in comments), sanitise such HTML to only allow tags known to be safe and to have only minor side-effects (such as <em>), discard all others, and discard any unwanted HTML attributes. Once again, the principle should be to allow only what is explicitly allowed, and strip everything else. (Following this principle could have prevented one notorious WordPress vulnerability, where it sanitised CSS within a “style” attribute, but left it untouched if the attribute name was in upper case.)

4. Never pass a user-supplied variable to a shell. For example, you might have written a script to send a ping to a given host. It would take a parameter called host (for example, ping.php?host=127.0.0.1), and so you might be tempted to do this:

system("ping " . $_GET["host"]);

This is extremely dangerous, because by supplying a “host” parameter containing a pipe symbol or a semicolon, an attacker can execute any command on your system (for example, by giving a host of “127.0.0.1 | wget http://www.wikihow.com/”).

The solution to this is to avoid using the system function where possible; invoke the external program some other way. In PHP, pcntl_exec runs a program directly without going through a shell, but it replaces the current process and requires the pcntl extension, so it is mainly suited to command-line scripts. If you must use system, carefully sanitise any user-supplied data before passing it to the shell. (An example for our “ping” script would be to check each character to ensure that it is either a number, letter, or period, and discard any character which is anything else.)
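
The character check for the “ping” script could be sketched as follows. The helper names are assumptions, the “-c 1” flag assumes a Unix-like ping, and the command is only built here, not run. PHP’s escapeshellarg is added as a second layer of defence.

```php
<?php
// Hypothetical helper: keep only letters, digits and periods.
function clean_host(string $host): string
{
    return preg_replace('/[^A-Za-z0-9.]/', '', $host) ?? '';
}

// Build (but do not run) the ping command.
// escapeshellarg() quotes the value so the shell sees a single argument.
function ping_command(string $host): ?string
{
    $clean = clean_host($host);
    if ($clean === '') {
        return null; // nothing left after filtering: refuse to run anything
    }
    return 'ping -c 1 ' . escapeshellarg($clean);
}
```

The malicious host from the example above is reduced to 127.0.0.1wgethttpwww.wikihow.com, which is a useless host name but no longer a second command. Rejecting the request outright whenever a character had to be removed would be stricter still.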

5. Use functions like PHP’s eval very, very carefully. If the string passed to it includes a user-supplied variable, the user can execute arbitrary code. In fact, there is almost never a reason to use this; if you find yourself using it, your code is probably terrible and you should re-think it.
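
As one illustration of re-thinking such code (the calculator scenario and the function name calculate are invented for this sketch), a fixed map of named operations can often stand in for building a string and evaluating it:

```php
<?php
// Hypothetical example: pick an operation by name from a fixed map,
// instead of something like eval("return $a $op $b;").
function calculate(string $op, float $a, float $b): ?float
{
    $operations = [
        'add' => fn (float $x, float $y): float => $x + $y,
        'sub' => fn (float $x, float $y): float => $x - $y,
        'mul' => fn (float $x, float $y): float => $x * $y,
    ];
    // Unknown names are rejected rather than executed.
    if (!array_key_exists($op, $operations)) {
        return null;
    }
    return $operations[$op]($a, $b);
}
```

Whatever the user sends as the operation name is only ever used as an array key, so it cannot turn into code.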

6. Check the referrer on all HTTP POST requests that have side-effects. Make sure that the referring site is your site; otherwise, the request could simply have come from a user who clicked a button on a form on another website. (You may or may not want to ensure that the referring site is non-blank. The referrer could be blank if the attacker-controlled web page uses secure HTTP and yours does not.) Another technique when using a form is to generate a unique, unpredictable code and store it both in a session variable and in a hidden field in the form. When the form is submitted, check that the value of the hidden field is equal to the one in the session variable.
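
The hidden-field technique could be sketched like this. The function names and the csrf_token key are assumptions, and the $session parameter stands in for $_SESSION so that the sketch can run outside a web request.

```php
<?php
// Return the form token for this session, creating it on first use.
function csrf_token(array &$session): string
{
    if (empty($session['csrf_token'])) {
        // 32 random bytes from the system's secure generator, hex-encoded.
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Check the hidden field submitted with the form against the session copy.
function csrf_token_matches(array $session, string $submitted): bool
{
    $expected = $session['csrf_token'] ?? '';
    // hash_equals() compares in constant time; a missing token never matches.
    return $expected !== '' && hash_equals($expected, $submitted);
}
```

When rendering the form, write csrf_token($_SESSION) into a hidden field; in the script that handles the POST, refuse to act unless csrf_token_matches($_SESSION, $_POST["token"] ?? "") returns true (the field name “token” is likewise an assumption).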

7. Never allow an HTTP GET request to have a side-effect. For example, never write to a database using parameters supplied in a GET request. Remember that a GET request could come from a user of your site who was tricked into clicking a link.
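
One simple way to enforce this is to check the request method at the top of any script that changes state. In this sketch the function name is an assumption and $server stands in for $_SERVER.

```php
<?php
// Hypothetical guard: only POST requests may change state.
function may_change_state(array $server): bool
{
    return ($server['REQUEST_METHOD'] ?? '') === 'POST';
}

// Typical use at the top of a state-changing script:
// if (!may_change_state($_SERVER)) {
//     http_response_code(405);   // Method Not Allowed
//     header('Allow: POST');
//     exit;
// }
```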

8. Verify credentials at every step. Here’s an example:

1. The user requests to carry out a certain action.
2. You check to see if they have permission to carry out that action, displaying a “permission denied” page if they do not.
3. You display an “Are you sure you want to do this?” page with the action expressed in hidden form fields.
4. That page then posts to a script which carries out the action.

The catch is that if you verify their permission only in step 2, and not in the script being posted to in step 4, you are counting on them being unable to alter the form given to them in step 3 (which is trust greatly misplaced), or to skip straight to step 4.
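
In code, this means the script from step 4 repeats the check from step 2. The sketch below assumes a made-up permission rule (administrators may delete anything, authors only their own posts) and hypothetical function names.

```php
<?php
// Hypothetical permission rule shared by step 2 and step 4.
function can_delete(array $user, array $post): bool
{
    return ($user['role'] ?? '') === 'admin'
        || ($user['id'] ?? null) === ($post['author_id'] ?? false);
}

// Step 4: the script the confirmation form posts to.
// It repeats the check instead of trusting the hidden form fields.
function handle_delete(array $user, array $post): string
{
    if (!can_delete($user, $post)) {
        return 'permission denied';
    }
    // ... the actual deletion would happen here ...
    return 'deleted';
}
```

The important detail is that $user comes from the server-side session and $post is looked up again on the server, so nothing the browser sends decides whether the action is permitted.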

9. Don’t store passwords as plain text. This is a mitigating measure in case your site is compromised. Since many users re-use passwords, a leak could mean that all their other online accounts get compromised. The standard counter-measure is to store a cryptographic hash of a few bytes of random data (a salt) combined with the password, keeping the salt alongside the hash. The hash could come from the SHA-2 family, though a deliberately slow password-hashing algorithm such as bcrypt makes guessing attacks far more expensive. Either way, this makes it possible to verify whether a supplied password is correct, without storing the password itself.

(For future-proofing, it might be a good idea to store the type of algorithm used as well, in case you need to migrate to a different hash algorithm, should severe weaknesses be found in current ones. This happened to MD5 and SHA-1, and may yet happen to today’s most trusted hash algorithms.)
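
In PHP, the built-in password functions cover both points. Note that this sketch swaps the hand-rolled salted SHA-2 scheme described above for password_hash, which at the time of writing uses bcrypt by default; the wrapper function names are assumptions.

```php
<?php
// Hash a new password. password_hash() generates a random salt and returns
// one string recording the algorithm, its cost settings, the salt and the hash.
function hash_new_password(string $password): string
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// Verify a login attempt against the stored string.
function check_password(string $password, string $stored): bool
{
    return password_verify($password, $stored);
}

// True when the stored hash was made with older settings and should be
// replaced the next time the user logs in successfully.
function should_rehash(string $stored): bool
{
    return password_needs_rehash($stored, PASSWORD_DEFAULT);
}
```

Because the algorithm identifier is part of the stored string, migrating to a different algorithm later only requires re-hashing each password at the user’s next successful login.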

10. When using a database, do not store database login information directly in the root folder of your application. Instead, store it in a separate file and either 1) put that file in a folder to which you deny access using your web server’s settings or .htaccess, or 2) store the file outside of the document root. Then include the file from your code; your application reads it through the file system, so it is not affected by the web server’s access rules. This will stop people from going to http://your-awesome-site.example.com/config.ini to find out your database login information (but it won’t stop an attacker who has found a file inclusion vulnerability as in step 2, or a shell code execution exploit as in step 4).
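
Reading such a file could look like the sketch below. The function name and the example path are assumptions; the file is assumed to be in INI format, like the config.ini mentioned above.

```php
<?php
// Hypothetical loader: read database settings from an INI file
// stored outside the document root.
function load_db_config(string $path): array
{
    $config = parse_ini_file($path);
    if ($config === false) {
        // Deliberately vague message: do not leak the path to visitors.
        throw new RuntimeException('Could not read database configuration');
    }
    return $config;
}

// Example call, assuming the site is served from /www/awesomeapp/public:
// $config = load_db_config('/www/awesomeapp/private/config.ini');
```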

11. Don’t depend on robots.txt to hide sensitive areas of your site. There’s sometimes good reason to use it (for example, if your entire site is sensitive and you don’t want it indexed by search engines). But remember that one of the first things an attacker hell-bent on compromising your site will do is to look at your robots.txt to see what you are hiding. A better way of doing this is to use a meta tag on the pages you don’t want to be indexed (for example, your administrative back-end pages). For example:

<meta name="robots" content="noindex, nofollow" />

12. Get into the correct mindset. This article isn’t intended to be a comprehensive listing of every security problem that could affect your site, but it does give some real-world examples of applying some important principles:

  • Never trust any information that can be supplied by the user. Indeed, assume that the user is an attacker, and treat any user-supplied data accordingly.
  • When sanitising data, only allow that which is safe, and strip anything else. This defends against bad input that you hadn’t thought of when you were writing the code. An example of this would be if you developed your software on and for a Unix-like system, and defended against file-inclusion vulnerabilities by removing any forward slashes from a provided file name. This would work OK, until someone deployed your code on a Windows box (which would allow a backslash as a path separator).
  • Assume that you’ll make mistakes, and apply the principle of defense in depth to mitigate the effects. (The above “only allow that which is safe” is a special case of this.)
  • Assume a potential attacker knows everything that you do. Which is to say, write your code as if a potential attacker were sitting next to you reading it. This both encourages you to use good practice every step of the way, and mitigates the effects of your application getting compromised. Don’t write bad code thinking that nobody will ever see it, or figure out how bad it is. And while security by obscurity has its place, don’t count on secrets staying secret; think like an attacker.

Warnings

  • Never assume that your application is 100% secure. New exploits could be found, and attackers will use them. Make sure you stay up to date with the latest best practices so that you can secure your applications to the best of your ability.

This entry was posted on Tuesday, January 24th, 2012 at 2:48 pm and is filed under Gadgets.