rfc:is_literal
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revisionNext revisionBoth sides next revision | ||
rfc:is_literal [2020/03/23 20:23] – Add link to GitHub craigfrancis | rfc:is_literal [2021/02/12 18:42] – Add notes about the implementation from Danack, GC Flags, and Future Scope craigfrancis | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Is Literal Check ====== | ====== PHP RFC: Is Literal Check ====== | ||
- | * Version: 0.1 | + | * Version: 0.2 |
* Date: 2020-03-21 | * Date: 2020-03-21 | ||
+ | * Updated: 2020-12-22 | ||
* Author: Craig Francis, craig# | * Author: Craig Francis, craig# | ||
* Status: Draft | * Status: Draft | ||
Line 10: | Line 11: | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | Add an // | + | Add an // |
- | This function would allows developers/ | + | As in, at runtime, |
- | Commands | + | This simple check can be used to warn or completely block SQL Injection, Command Line Injection, and many cases of HTML Injection |
- | This will also allow systems/ | + | ===== The Problem ===== |
- | Literals are values defined within the PHP scripts, | + | Escaping strings |
- | | + | The vast majority of programmers should never do this (mistakes will be made). |
- | $b = 'Example | + | |
- | | + | Unsafe values (often user supplied) **must** be kept separate (e.g. parameterised SQL), or be processed by something that understands the context (e.g. a HTML Templating Engine). |
- | + | ||
- | $c = 'Example | + | This is primarily for security reasons, but it can also cause data to be damaged (e.g. ASCII/UTF-8 issues). |
- | | + | |
+ | Take these mistakes, where the value has come from the user: | ||
+ | |||
+ | <code php> | ||
+ | echo "< | ||
+ | </ | ||
+ | |||
+ | Flawed, and unfortunately very common, | ||
+ | |||
+ | <code php> | ||
+ | echo "< | ||
+ | </ | ||
+ | |||
+ | Flawed because the attribute value is not quoted, e.g. //$url = '/ onerror=alert(1)'// | ||
+ | |||
+ | <code php> | ||
+ | echo "< | ||
+ | </ | ||
+ | |||
+ | Flawed because // | ||
+ | |||
+ | <code php> | ||
+ | echo '<a href="' | ||
+ | </ | ||
+ | |||
+ | Flawed because a link can include JavaScript, e.g. //$url = ' | ||
+ | |||
+ | <code html> | ||
+ | < | ||
+ | var url = "<? | ||
+ | </script> | ||
+ | </code> | ||
+ | |||
+ | Flawed because // | ||
+ | |||
+ | <code php> | ||
+ | echo '<a href="/ | ||
+ | </ | ||
+ | |||
+ | Flawed because // | ||
+ | |||
+ | <code html> | ||
+ | < | ||
+ | </ | ||
+ | |||
+ | Flawed because the encoding is not guaranteed to be UTF-8 (or ISO-8859-1 before PHP 5.4), so the value could be corrupted. | ||
+ | |||
+ | Also flawed because some browsers (e.g. IE 11), if the charset isn't defined (header or meta tag), could guess the output as UTF-7, e.g. //$url = ' | ||
+ | |||
+ | <code php> | ||
+ | example.html: | ||
+ | <img src={{ url }} alt='' | ||
+ | |||
+ | $loader = new \Twig\Loader\FilesystemLoader(' | ||
+ | $twig = new \Twig\Environment($loader, | ||
+ | |||
+ | echo $twig-> | ||
+ | </ | ||
+ | |||
+ | Flawed because Twig is not context aware (in this case, an unquoted HTML attribute), e.g. //$url = '/ onerror=alert(1)'// | ||
+ | |||
+ | <code php> | ||
+ | $sql = ' | ||
+ | </ | ||
+ | |||
+ | Flawed because the value has not been quoted, e.g. //$id = ' | ||
+ | |||
+ | <code php> | ||
+ | $sql = ' | ||
+ | </ | ||
+ | |||
+ | Flawed if ' | ||
+ | |||
+ | <code php> | ||
+ | $sql = ' | ||
+ | </ | ||
+ | |||
+ | Flawed if 'SET NAMES latin1' | ||
+ | |||
+ | <code php> | ||
+ | $parameters = " | ||
+ | |||
+ | // $parameters = ' | ||
+ | |||
+ | mail(' | ||
+ | </ | ||
+ | |||
+ | Flawed because it's not possible to safely escape values in // | ||
+ | |||
+ | ===== Previous Solutions ===== | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | * " | ||
+ | * this amount of work isn't ideal for "just for one use case" | ||
+ | * It would have effected every SQL function, such as // | ||
+ | * Each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara [[https://news-web.php.net/ | ||
+ | |||
+ | I also agree that "SQL injection is almost a solved problem [by using] prepared statements" | ||
+ | |||
+ | ===== Related Go Implementation ===== | ||
+ | |||
+ | As discussed by [[https:// | ||
+ | |||
+ | The Go language can do this by checking for " | ||
+ | |||
+ | Google currently avoids most issues by using [[https:// | ||
===== Related JavaScript Implementation ===== | ===== Related JavaScript Implementation ===== | ||
- | This proposal | + | This RFC is taking some ideas from TC39, where a similar idea is being discussed for JavaScript, to support the introduction of Trusted Types. |
https:// | https:// | ||
Line 36: | Line 145: | ||
They are looking at " | They are looking at " | ||
- | ===== Taint Checking | + | ===== Solution |
- | Xinchen Hui has done some amazing work with the Taint extension: | + | Literals are safe values, defined within |
- | https://github.com/ | + | <code php> |
+ | $a = ' | ||
+ | is_literal($a); | ||
- | Unfortunately this approach does not address all issues, mainly because it still allows string escaping, which is only " | + | $a = ' |
+ | is_literal($a); // true | ||
- | | + | $a = 'Example |
- | + | is_literal($a); // false | |
- | | + | |
- | + | ||
- | // DELETE FROM table WHERE id = id | + | |
- | | + | $a = 'Example |
- | + | is_literal($a); // false | |
- | // example.php? | + | |
- | + | ||
- | | + | |
- | The Taint extension also [[https://github.com/ | + | $a = sprintf(' |
+ | is_literal($a); | ||
- | ===== Previous RFC ===== | + | $c = count($ids); |
+ | $a = 'WHERE id IN (' . implode(',', | ||
+ | is_literal($a); | ||
- | Matt Tait suggested [[https://wiki.php.net/rfc/ | + | $limit = 10; |
+ | $a = 'LIMIT ' . ($limit + 1); | ||
+ | is_literal($a); | ||
+ | </code> | ||
- | It was noted that " | + | This uses a similar definition of [[https://wiki.php.net/rfc/sql_injection_protection# |
- | Where it would have effected every SQL function, such as // | + | Thanks to [[https://chat.stackoverflow.com/transcript/message/51565346# |
- | And each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara | + | As an aside, [[https:// |
- | I also agree that "SQL injection is almost | + | Commands can be checked to ensure they are a "programmer supplied constant/ |
- | ===== Proposal ===== | + | This approach allows all systems/ |
- | Add an //is_literal()// function | + | Unlike the Taint extension, there must **not** be an equivalent |
- | This uses a similar definition as the [[https:// | + | ==== Solution: SQL Injection ==== |
- | Thanks to [[https:// | + | Database abstractions (e.g. ORMs) will be able to ensure they are provided |
- | And thanks to [[https://chat.stackoverflow.com/transcript/message/48927813#48927813|Mark R]], it might be possible to use the fact that " | + | [[https://www.doctrine-project.org/projects/doctrine-orm/en/ |
- | Unlike the Taint extension, there is no need to provide an equivalent //untaint()// function. | + | <code php> |
+ | $users = $queryBuilder | ||
+ | ->select(' | ||
+ | -> | ||
+ | -> | ||
+ | -> | ||
+ | -> | ||
- | ===== Examples ===== | + | // example.php? |
+ | </ | ||
- | ==== SQL Injection, Basic ==== | + | This mistake can be easily identified by: |
- | A simple example: | + | <code php> |
+ | public function where($predicates) | ||
+ | { | ||
+ | if (function_exists(' | ||
+ | throw new Exception(' | ||
+ | } | ||
+ | ... | ||
+ | } | ||
+ | </ | ||
- | $sql = ' | + | [[https:// |
- | + | ||
- | $result | + | |
- | Checked in the framework by: | + | <code php> |
+ | $users = R:: | ||
+ | </ | ||
- | class db { | + | [[http:// |
- | + | ||
- | public function exec($sql, $parameters = []) { | + | |
- | + | ||
- | if (!is_literal($sql)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | $statement = $this-> | + | |
- | $statement-> | + | |
- | return $statement-> | + | |
- | + | ||
- | } | + | |
- | + | ||
- | } | + | |
- | It will also work with string concatenation: | + | <code php> |
+ | $users = UserQuery:: | ||
+ | </ | ||
- | define(' | + | The //is_literal()// |
- | + | ||
- | $sql = ' | + | |
- | + | ||
- | | + | |
- | + | ||
- | $sql .= ' AND id = ' . mysqli_real_escape_string($db, $_GET[' | + | |
- | + | ||
- | is_literal($sql); | + | |
- | ==== SQL Injection, | + | ==== Solution: |
- | To ensure //ORDER BY// can be set via the user, but only use acceptable values: | + | A simple example: |
- | | + | <code php> |
- | | + | $sql = 'SELECT * FROM table WHERE id = ?'; |
- | ' | + | |
- | ' | + | |
- | ]; | + | |
- | + | ||
- | $order_id | + | |
- | + | ||
- | $sql = ' | + | |
- | ==== SQL Injection, WHERE IN ==== | + | $result |
+ | </ | ||
- | Most SQL strings can be a concatenations of literal values, but //WHERE x IN (?,?,?)// need to use a variable number of literal placeholders. | + | Checked in the framework by: |
- | So there //might// need to be a special case for // | + | <code php> |
+ | class db { | ||
- | | + | |
- | + | ||
- | // or | + | |
- | + | ||
- | | + | |
- | To be used with: | + | if (!is_literal($sql)) { |
+ | throw new Exception(' | ||
+ | } | ||
- | | + | |
+ | $statement-> | ||
+ | return $statement-> | ||
- | ==== SQL Injection, ORM Usage ==== | + | } |
- | [[https:// | + | } |
+ | </code> | ||
- | $users = $queryBuilder | + | This also works with string concatenation: |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | + | ||
- | // example.php? | + | |
- | Where this mistake could be identified by: | + | <code php> |
+ | define(' | ||
- | public function where($predicates) | + | $sql = 'SELECT * FROM ' . TABLE . ' WHERE id = ?'; |
- | { | + | |
- | if (!is_literal($predicates)) { | + | |
- | throw new Exception('Can only accept a literal'); | + | |
- | } | + | |
- | | + | |
- | } | + | |
- | [[https:// | + | is_literal($sql); // Returns true |
- | | + | $sql .= ' |
- | [[http://propelorm.org/Propel/reference/model-criteria.html# | + | is_literal($sql); |
+ | </code> | ||
+ | |||
+ | ==== Solution: SQL Injection, ORDER BY ==== | ||
+ | |||
+ | To ensure | ||
+ | |||
+ | <code php> | ||
+ | $order_fields = [ | ||
+ | ' | ||
+ | ' | ||
+ | ' | ||
+ | ]; | ||
+ | |||
+ | $order_id = array_search(($_GET[' | ||
+ | |||
+ | $sql = ' ORDER BY ' . $order_fields[$order_id]; | ||
+ | </code> | ||
+ | |||
+ | ==== Solution: SQL Injection, WHERE IN ==== | ||
- | $users = UserQuery:: | + | Most SQL strings can be a simple concatenations of literal values, but //WHERE x IN (?,?,?)// needs to use a variable number of literal placeholders. |
- | ==== SQL Injection, ORM Internal ==== | + | There needs to be a special case for // |
- | The // | + | <code php> |
+ | $in_sql = implode(',', | ||
- | This would avoid mistakes such as the ORDER BY issues in the Zend framework [[https:// | + | $sql = ' |
+ | </code> | ||
- | ==== CLI Injection ==== | + | ==== Solution: |
Rather than using functions such as: | Rather than using functions such as: | ||
Line 200: | Line 312: | ||
Frameworks (or PHP) could introduce something similar to // | Frameworks (or PHP) could introduce something similar to // | ||
- | Or, take a verified | + | Or, take a safe literal for the command, and use parameters for the arguments (like SQL does): |
- | | + | <code php> |
- | ' | + | $output = parameterised_exec(' |
- | ]); | + | ' |
+ | ]); | ||
+ | </ | ||
Rough implementation: | Rough implementation: | ||
- | | + | <code php> |
- | + | function parameterised_exec($cmd, | |
- | if (!is_literal($cmd)) { | + | |
- | throw new Exception(' | + | if (!is_literal($cmd)) { |
- | } | + | throw new Exception(' |
- | + | } | |
- | $offset = 0; | + | |
- | $k = 0; | + | $offset = 0; |
- | while (($pos = strpos($cmd, | + | $k = 0; |
- | if (!isset($args[$k])) { | + | while (($pos = strpos($cmd, |
- | throw new Exception(' | + | if (!isset($args[$k])) { |
- | exit(); | + | throw new Exception(' |
- | } | + | |
- | $arg = escapeshellarg($args[$k]); | + | |
- | $cmd = substr($cmd, | + | |
- | $offset = ($pos + strlen($arg)); | + | |
- | $k++; | + | |
- | } | + | |
- | if (isset($args[$k])) { | + | |
- | throw new Exception(' | + | |
exit(); | exit(); | ||
} | } | ||
- | | + | $arg = escapeshellarg($args[$k]); |
- | | + | |
- | | + | $offset = ($pos + strlen($arg)); |
+ | $k++; | ||
+ | | ||
+ | if (isset($args[$k])) { | ||
+ | throw new Exception(' | ||
+ | exit(); | ||
} | } | ||
- | ==== HTML Injection ==== | + | return exec($cmd); |
+ | |||
+ | } | ||
+ | </ | ||
+ | |||
+ | ==== Solution: | ||
Template engines should receive variables separately from the raw HTML. | Template engines should receive variables separately from the raw HTML. | ||
- | Often the engine will get the HTML from static files: | + | Often the engine will get the HTML from static files (safe): |
- | | + | <code php> |
+ | $html = file_get_contents('/ | ||
+ | </ | ||
But small snippets of HTML are often easier to define as a literal within the PHP script: | But small snippets of HTML are often easier to define as a literal within the PHP script: | ||
- | | + | <code php> |
- | < | + | $template_html = ' |
- | < | + | < |
+ | < | ||
+ | </ | ||
+ | |||
+ | Where the variables are supplied separately, in this example I'm using XPath: | ||
+ | |||
+ | <code php> | ||
+ | $values = [ | ||
+ | '// | ||
+ | NULL => ' | ||
+ | ' | ||
+ | ' | ||
+ | ], | ||
+ | '// | ||
+ | ' | ||
+ | ], | ||
+ | ]; | ||
+ | |||
+ | echo template_parse($template_html, | ||
+ | </ | ||
+ | |||
+ | The templating engine can then accept and apply the supplied variables for the relevant context. | ||
+ | |||
+ | As a simple example, this can be done with: | ||
+ | |||
+ | <code php> | ||
+ | function template_parse($html, | ||
+ | |||
+ | if (!is_literal($html)) { | ||
+ | throw new Exception(' | ||
+ | } | ||
- | Where the variables are supplied separately, in this example I'm using XPaths: | + | $dom = new DomDocument(); |
+ | $dom-> | ||
- | $values | + | $xpath = new DOMXPath($dom); |
- | '// | + | |
- | NULL => ' | + | |
- | ' | + | |
- | ' | + | |
- | ], | + | |
- | '// | + | |
- | ' | + | |
- | ], | + | |
- | ]; | + | |
- | + | ||
- | echo template_parse($template_html, | + | |
- | Being sure the HTML does not contain unsafe variables, the templating engine can accept and apply the supplied variables for the relevant context, for example: | + | foreach ($values as $query => $attributes) { |
- | function template_parse($html, | + | |
- | + | throw new Exception(' | |
- | | + | |
- | throw new Exception(' | + | |
} | } | ||
- | | + | |
- | | + | foreach ($xpath-> |
- | $dom-> | + | foreach ($attributes as $attribute => $value) { |
- | + | ||
- | $xpath = new DOMXPath($dom); | + | if (!is_literal($attribute)) { |
- | + | throw new Exception(' | |
- | foreach ($values as $query => $attributes) { | + | } |
- | + | ||
- | if (!is_literal($query)) { | + | if ($attribute) { |
- | throw new Exception(' | + | $safe = false; |
- | } | + | if ($attribute == ' |
- | + | if (preg_match('/ | |
- | | + | $safe = true; // Not " |
- | foreach ($attributes as $attribute => $value) { | + | |
- | + | ||
- | if (!is_literal($attribute)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | if ($attribute) { | + | |
- | $safe = false; | + | |
- | if ($attribute == ' | + | |
- | if (preg_match('/ | + | |
- | $safe = true; // Not " | + | |
- | } | + | |
- | } else if ($attribute == ' | + | |
- | if (in_array($value, | + | |
- | $safe = true; // Only allow specific classes? | + | |
- | } | + | |
- | } else if (preg_match('/ | + | |
- | if (preg_match('/ | + | |
- | $safe = true; | + | |
- | } | + | |
} | } | ||
- | | + | } else if ($attribute == ' |
- | | + | if (in_array($value, [' |
+ | $safe = true; // Only allow specific classes? | ||
} | } | ||
- | } else { | + | } else if (preg_match('/ |
- | $element-> | + | if (preg_match('/ |
+ | $safe = true; | ||
+ | } | ||
+ | } | ||
+ | if ($safe) | ||
+ | $element-> | ||
} | } | ||
- | | + | } else { |
+ | $element-> | ||
} | } | ||
+ | |||
} | } | ||
- | | ||
} | } | ||
- | | + | |
- | $html = ''; | + | } |
- | $body = $dom-> | + | |
- | if ($body-> | + | |
- | foreach ($body-> | + | $body = $dom-> |
- | $html .= $dom-> | + | if ($body-> |
- | } | + | foreach ($body-> |
+ | $html .= $dom-> | ||
} | } | ||
- | | ||
- | return $html; | ||
- | | ||
} | } | ||
+ | |||
+ | return $html; | ||
+ | |||
+ | } | ||
+ | </ | ||
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | Not sure | + | None |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
- | PHP 8? | + | PHP 8.1? |
===== RFC Impact ===== | ===== RFC Impact ===== | ||
Line 353: | Line 479: | ||
===== Open Issues ===== | ===== Open Issues ===== | ||
- | | + | On [[https:// |
- | - Can // | + | |
+ | | ||
+ | - Can // | ||
- Should the function be named // | - Should the function be named // | ||
- | - Systems/ | + | - Systems/ |
===== Alternatives ===== | ===== Alternatives ===== | ||
- The current Taint Extension (notes above) | - The current Taint Extension (notes above) | ||
- | - Using static analysis (not runtime), for example [[https:// | + | - Using static analysis (not at runtime), for example [[https:// |
===== Unaffected PHP Functionality ===== | ===== Unaffected PHP Functionality ===== | ||
Line 369: | Line 497: | ||
===== Future Scope ===== | ===== Future Scope ===== | ||
- | Certain | + | As noted by [[https:// |
- | ===== Proposed Voting Choices ===== | + | This check could be used to throw an exception, or generate an error/ |
- | Not sure | + | PHP could also have a mode where output (e.g. //echo '< |
- | ===== Patches and Tests ===== | + | And maybe there could be a [[https:// |
- | A volunteer is needed to help with implementation. | + | ===== Proposed Voting Choices ===== |
- | ===== Implementation | + | N/A |
+ | |||
+ | ===== Patches and Tests ===== | ||
N/A | N/A | ||
- | ===== References | + | ===== Implementation |
- | - https://wiki.php.net/rfc/sql_injection_protection | + | [[https://github.com/ |
===== Rejected Features ===== | ===== Rejected Features ===== |
rfc/is_literal.txt · Last modified: 2022/02/14 00:36 by craigfrancis