instead of htmlentities(). The latter is a waste of time and resources, because it converts every character that has an HTML entity. That's not what you want. You only want to convert the "dangerous" characters, which is what htmlspecialchars() does.
It's also crucial to set the right character encoding. Without that, the whole function can be useless (which even Google had to learn).
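A minimal sketch of that advice, assuming the page is served as UTF-8. The helper name html_escape() is made up for this example, and ENT_QUOTES is my choice of flag, not something stated in the post:

```php
<?php
// Hypothetical helper (not from the original post): escape a value for HTML output.
function html_escape($value, $encoding = 'UTF-8')
{
    // ENT_QUOTES also converts single quotes, which matters inside attributes.
    // Passing the encoding explicitly avoids depending on the default of the PHP version.
    return htmlspecialchars($value, ENT_QUOTES, $encoding);
}

// "\xC3\xA9" is the UTF-8 byte sequence for "é".
$input = "<b>Caf\xC3\xA9 & 'bar'</b>";

// Only the characters with a special meaning in HTML are converted; "é" stays as it is.
echo html_escape($input), "\n";

// htmlentities() additionally converts every character that has a named entity.
echo htmlentities($input, ENT_QUOTES, 'UTF-8'), "\n";
```

The two lines of output differ only in how the "é" is written, which is the extra work htmlentities() does for no security benefit.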
Check the 10 worst sins of security. A lot of the things you're discussing here I already wrote down in that thread.
Originally Posted by Northie
versions of php prior to PHP 5.3.6 did not allow the charset attribute in the DSN, so set names had to be used. Set names can be buggy.
No, using SET NAMES to change the character encoding for a database interface is conceptually wrong and can break all escaping functions. In other words, this can make the code vulnerable to SQL injections despite the use of escaping functions. That's why I put a big warning into the code above:
// important! specify the character encoding in the DSN string, don't use SET NAMES
This does not affect prepared statements (the real, server-side ones, as opposed to emulated ones). But obviously it's still not a good idea to run around with broken security functions that don't do what they're supposed to do.
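As a rough sketch of what the warning means in practice: the encoding goes into the DSN, and no SET NAMES query is ever sent. The helper names, the database name "shop" and the charset utf8mb4 are assumptions for illustration (utf8mb4 needs a reasonably recent MySQL server; use whatever charset your tables actually have). Turning off PDO::ATTR_EMULATE_PREPARES is how you ask PDO for real prepared statements.

```php
<?php
// Hypothetical helper: build a MySQL DSN that carries the character encoding.
function build_mysql_dsn($host, $dbname, $charset)
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: open a connection without ever sending SET NAMES.
function connect_database($host, $dbname, $user, $password, $charset = 'utf8mb4')
{
    $options = array(
        // Report database errors as exceptions.
        PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
        // Ask the driver for real (server-side) prepared statements.
        PDO::ATTR_EMULATE_PREPARES => false,
    );

    return new PDO(build_mysql_dsn($host, $dbname, $charset), $user, $password, $options);
}

// Show the DSN that would be used; connect_database() needs a running server.
echo build_mysql_dsn('localhost', 'shop', 'utf8mb4'), "\n";
```

Because the charset is part of the DSN, the client library knows the encoding from the moment the connection is opened, so its escaping functions and the server agree.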
The problem with SET NAMES is that you "silently" change the underlying character encoding without notifying the application of it. The application doesn't know that you've just made a query to change the encoding from, say, ANSI to UTF-8. It still thinks you're using ANSI, so all escaping functions will assume the wrong encoding. This can make them completely blind, because they simply no longer recognize the "dangerous" characters.
This problem applies to all database libraries, not just PDO. You'll get the same effect with the old MySQL library or MySQLi or whatever.
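To make the "blind escaping" effect concrete, here is a small byte-level sketch. It assumes the server has been switched to GBK (my pick of a multi-byte encoding, not one named in this thread) while the escaping code still believes it is dealing with a single-byte encoding. addslashes() stands in for such an escaper because it works purely on bytes; the function name is made up.

```php
<?php
// Stand-in for an escaping function that wrongly assumes a single-byte encoding.
// addslashes() works on bytes, so it plays that role here.
function escape_assuming_single_byte($value)
{
    return addslashes($value);
}

// 0xBF followed by 0x27 (the ASCII single quote), then some injected SQL.
$input = "\xBF\x27 OR 1=1";

$escaped = escape_assuming_single_byte($input);

// The escaper saw a quote and put a backslash (0x5C) in front of it.
// Print the first three bytes of the result as hex.
echo bin2hex(substr($escaped, 0, 3)), "\n";

// A server that decodes this as GBK reads 0xBF 0x5C as ONE two-byte character.
// The backslash is swallowed and 0x27 arrives as an unescaped quote.
```

The escaping function did exactly what it was told, just for the wrong encoding. That is the situation SET NAMES creates behind the application's back.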
Fortunately, the mainstream encodings used today (ASCII, ANSI and UTF-8) all share the same byte encoding for the "dangerous" characters. So while this mistake is made again and again, it rarely leads to actual vulnerabilities. But don't rely on this piece of luck to keep your broken code secure! As soon as you deal with more exotic encodings, your whole security will fall apart. On a side note: If you're willing to rely on all "dangerous" characters being encoded as ASCII, it's safer to use the good ol' addslashes() rather than the new escaping functions (mysql_real_escape_string() etc.), because addslashes() is guaranteed to recognize ASCII characters.
Long story short: Never use SET NAMES. If that's the only way you can change the encoding due to some outdated PHP version or library, then you cannot change the encoding in PHP at all. You have to do it in MySQL.
But since your PHP version 5.3.18 came after 5.3.6, I'm not even sure why we're having this discussion. You can simply use the DSN string.