In this article, we’ll look at why it’s so important to filter anything that’s incorporated into our applications. In particular, we’ll look at how to validate and sanitize foreign data in PHP.
Never (ever!) trust foreign input in your application. That’s one of the most important lessons to learn for anyone developing a web application.
Foreign input can be anything: $_GET and $_POST form input data, some elements of the HTTP request body, or even some values in the $_SERVER superglobal. Cookies, session values, and uploaded and downloaded document files are also considered foreign input.
Every time we process, output, include or concatenate foreign data into our code, there’s a potential vector for attackers to inject code into our application (the so-called injection attacks). Because of this, we need to make sure every piece of foreign data is properly filtered so it can be safely incorporated into the application.
When it comes to filtering, there are two main types: validation and sanitization.
Validation
Validation ensures that foreign input is what we expect it to be. For example, we might be expecting an email address, so we are expecting something in the user@example.com format. For that, we can use the FILTER_VALIDATE_EMAIL filter. Or, if we’re expecting a Boolean, we can use PHP’s FILTER_VALIDATE_BOOL filter (an alias of FILTER_VALIDATE_BOOLEAN, available as of PHP 8.0).
Among the most useful filters are FILTER_VALIDATE_BOOL, FILTER_VALIDATE_INT, and FILTER_VALIDATE_FLOAT to filter for basic types, and FILTER_VALIDATE_EMAIL and FILTER_VALIDATE_DOMAIN to filter for emails and domain names respectively.
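As a rough illustration of these validation filters, here is a minimal sketch. The $raw array and its values are hypothetical stand-ins for form data, which always arrives as strings. Each filter returns the converted value on success and false on failure.

```php
<?php
// Hypothetical raw values, as they might arrive from a form (always strings).
$raw = [
    'age'        => '42',
    'price'      => '19.99',
    'newsletter' => 'yes',
    'email'      => 'jane@example.com',
    'domain'     => 'example.com',
];

$age   = filter_var($raw['age'], FILTER_VALIDATE_INT);      // int(42)
$price = filter_var($raw['price'], FILTER_VALIDATE_FLOAT);  // a float
// FILTER_VALIDATE_BOOL needs PHP 8.0+; older versions use FILTER_VALIDATE_BOOLEAN.
$newsletter = filter_var($raw['newsletter'], FILTER_VALIDATE_BOOL); // bool(true)
$email      = filter_var($raw['email'], FILTER_VALIDATE_EMAIL);     // the string, unchanged
// FILTER_FLAG_HOSTNAME additionally requires a valid hostname.
$domain = filter_var($raw['domain'], FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME);

// A value that does not match the expected type yields false.
$notAnInt = filter_var('42abc', FILTER_VALIDATE_INT); // bool(false)
```

Comparing results with === false (rather than a loose truthiness check) avoids mistaking a legitimate 0 or false for a failed validation.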
Another very important filter is FILTER_VALIDATE_REGEXP, which allows us to filter against a regular expression. With this filter, we can create our own custom filters by changing the regular expression we’re filtering against.
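A custom filter built on FILTER_VALIDATE_REGEXP might look like the sketch below. The username rule and the sample values are assumptions made up for illustration; the pattern is passed through the "regexp" key of the options array.

```php
<?php
// Hypothetical rule: usernames are 3-16 lowercase letters, digits or underscores.
// The D modifier makes $ match only at the very end of the string.
$options = ['options' => ['regexp' => '/^[a-z0-9_]{3,16}$/D']];

$valid   = filter_var('jane_doe42', FILTER_VALIDATE_REGEXP, $options); // "jane_doe42"
$invalid = filter_var('Jane Doe!', FILTER_VALIDATE_REGEXP, $options);  // false
```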
All the available validation filters are listed on the “Validate filters” page of the PHP manual.
Sanitization
Sanitization is the process of removing illegal or unsafe characters from foreign input.
The best example of this is when we sanitize database inputs before inserting them into a raw SQL query.
Again, some of the most useful sanitization filters include the ones for basic types, like FILTER_SANITIZE_SPECIAL_CHARS and FILTER_SANITIZE_NUMBER_INT, but also FILTER_SANITIZE_URL and FILTER_SANITIZE_EMAIL to sanitize URLs and emails. FILTER_SANITIZE_STRING, which strips tags, is deprecated as of PHP 8.1.
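To make the difference from validation concrete, here is a small sketch of a few sanitization filters. The input strings are hypothetical; note that sanitizing only removes or encodes characters and does not guarantee the result is valid.

```php
<?php
// Keeps only digits, plus and minus signs.
$number = filter_var('42 apples', FILTER_SANITIZE_NUMBER_INT); // "42"

// Removes characters that are not allowed in an email address.
$email = filter_var('jane (doe)@example.com', FILTER_SANITIZE_EMAIL); // "janedoe@example.com"

// Removes characters that are not allowed in a URL (here, the space).
$url = filter_var('https://example.com/a b', FILTER_SANITIZE_URL); // "https://example.com/ab"

// HTML-encodes quotes, angle brackets and ampersands.
$html = filter_var('<b>Hi</b>', FILTER_SANITIZE_SPECIAL_CHARS);

// Sanitized data can still be invalid, so validate afterwards if it matters.
$isValidEmail = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
```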
All PHP sanitization filters are listed on the “Sanitize filters” page of the PHP manual.
filter_var() and filter_input()
Now that we know PHP has an entire selection of filters available, we need to know how to use them.
Filters are applied via the filter_var() and filter_input() functions.
The filter_var() function applies a specified filter to a variable. It takes the value to filter, the filter to apply, and an optional array of options. For example, if we’re trying to validate an email address, we can use this:
<?php
$email = "your.email@sitepoint.com";

// filter_var() returns the email on success and false on failure.
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo "This email is valid";
}
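The optional third argument of filter_var() can carry filter-specific options or flags. The sketch below assumes a made-up rule (a quantity between 1 and 10) to show the "options" array and the FILTER_NULL_ON_FAILURE flag.

```php
<?php
// Hypothetical rule: a quantity must be an integer between 1 and 10.
$options = ['options' => ['min_range' => 1, 'max_range' => 10]];

$ok       = filter_var('7', FILTER_VALIDATE_INT, $options);  // int(7)
$tooLarge = filter_var('25', FILTER_VALIDATE_INT, $options); // false

// With a 'default' option, a failed validation returns the default instead of false.
$withDefault = filter_var('25', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 10, 'default' => 1],
]); // int(1)

// FILTER_NULL_ON_FAILURE separates "not a boolean" (null) from a real false.
$unknown = filter_var('maybe', FILTER_VALIDATE_BOOL, FILTER_NULL_ON_FAILURE); // null
$no      = filter_var('no', FILTER_VALIDATE_BOOL, FILTER_NULL_ON_FAILURE);    // false
```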
If the goal was to sanitize a string, we could use this:
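<?php
// Note: FILTER_SANITIZE_STRING strips tags, but it is deprecated as of PHP 8.1.
// htmlspecialchars() or FILTER_SANITIZE_FULL_SPECIAL_CHARS are common replacements.
$string = "<h1>Hello World</h1>";
$sanitized_string = filter_var($string, FILTER_SANITIZE_STRING);
echo $sanitized_string;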
The filter_input() function gets an external variable by name, such as a form field, and filters it.
It works just like the filter_var() function, but it takes a type of input (one of the INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV constants), the name of the variable to filter, and the filter. Optionally, it can also take an array of options.
Once again, if we want to check whether the external input variable “email” is being sent via GET to our application and is valid, we can use this:
<?php
// filter_input() returns null if "email" is missing and false if it is invalid.
if (filter_input(INPUT_GET, "email", FILTER_VALIDATE_EMAIL)) {
    echo "The email is being sent and is valid.";
}
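Because filter_input() returns null for a missing variable and false for a failed filter, the two cases can be reported separately. The describe_email_input() helper below is a hypothetical name used only for this sketch.

```php
<?php
/**
 * Hypothetical helper: turns a filter_input() result into a message.
 * null  => the variable was not sent at all
 * false => the variable was sent but failed the filter
 */
function describe_email_input($result): string
{
    if ($result === null) {
        return 'No email was sent.';
    }
    if ($result === false) {
        return 'An email was sent, but it is not valid.';
    }
    // Escape before output, since valid emails may contain characters such as & or '.
    return 'Valid email: ' . htmlspecialchars($result);
}

echo describe_email_input(filter_input(INPUT_GET, 'email', FILTER_VALIDATE_EMAIL));
```

Run from the command line, where there is no GET data, this should report that no email was sent.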
Conclusion
And these are the basics of data filtering in PHP. Other techniques might be used to filter foreign data, like applying regex, but the techniques we’ve seen in this article are more than enough for most use cases.
Make sure you understand the difference between validation and sanitization and how to use the filter functions. With this knowledge, your PHP applications will be more reliable and secure!