rfc:is_literal
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revisionNext revisionBoth sides next revision | ||
rfc:is_literal [2020/03/22 22:26] – Update the literal definition, and move to the end of the intro craigfrancis | rfc:is_literal [2021/03/03 14:49] – Minor tweaks craigfrancis | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Is Literal Check ====== | ====== PHP RFC: Is Literal Check ====== | ||
- | * Version: 0.1 | + | * Version: 0.3 |
* Date: 2020-03-21 | * Date: 2020-03-21 | ||
+ | * Updated: 2021-02-19 | ||
* Author: Craig Francis, craig# | * Author: Craig Francis, craig# | ||
* Status: Draft | * Status: Draft | ||
* First Published at: https:// | * First Published at: https:// | ||
+ | * GitHub Repo: https:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | Add an // | + | Add an // |
- | This function would allows developers/ | + | As in, at runtime, |
- | Commands | + | This simple check can be used to warn or completely block SQL Injection, Command Line Injection, and many cases of HTML Injection |
- | This will also allow systems/ | + | See the [[https://github.com/ |
- | Literals are values defined within the PHP scripts, for example: | + | In short, abstractions like Doctrine could protect against [[https:// |
- | | + | <code php> |
- | $b = 'Example | + | $users = $queryBuilder |
- | | + | -> |
- | + | | |
- | $c = ' | + | |
- | | + | |
+ | -> | ||
+ | </code> | ||
- | ===== Related JavaScript Implementation ===== | + | Or this Twig [[https:// |
- | This proposal is taking some ideas from TC39, where a similar idea is being discussed for JavaScript, to support the introduction of Trusted Types. | + | < |
- | + | echo $twig-> | |
- | https:// | + | </code> |
- | https:// | + | |
- | + | ||
- | They are looking at " | + | |
- | + | ||
- | ===== Taint Checking ===== | + | |
- | + | ||
- | Xinchen Hui has done some amazing work with the Taint extension: | + | |
- | + | ||
- | https:// | + | |
- | + | ||
- | Unfortunately this approach does not address all issues, mainly because it still allows string escaping, which is only " | + | |
- | + | ||
- | $sql = ' | + | |
- | + | ||
- | // delete.php? | + | |
- | + | ||
- | // DELETE FROM table WHERE id = id | + | |
- | + | ||
- | $html = '<img src=' . htmlentities($_GET[' | + | |
- | + | ||
- | // example.php? | + | |
- | + | ||
- | // <img src=x onerror=alert(cookie) | + | |
- | + | ||
- | The Taint extension also [[https:// | + | |
- | + | ||
- | ===== Previous RFC ===== | + | |
- | + | ||
- | Matt Tait suggested [[https:// | + | |
- | + | ||
- | It was noted that " | + | |
- | + | ||
- | Where it would have effected every SQL function, such as // | + | |
- | + | ||
- | And each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara [[https:// | + | |
- | + | ||
- | I also agree that "SQL injection is almost a solved problem [by using] prepared statements" | + | |
===== Proposal ===== | ===== Proposal ===== | ||
- | Add an // | + | Literals are safe values, defined within the PHP script, for example: |
- | This uses a similar definition as the [[https:// | + | < |
+ | is_literal(' | ||
- | Thanks to [[https://news-web.php.net/ | + | $a = ' |
+ | is_literal($a); | ||
- | Unlike the Taint extension, there is no need to provide an equivalent | + | is_literal(4); |
+ | is_literal(0.3); // true | ||
+ | is_literal(' | ||
- | ===== Examples ===== | + | $a = ' |
+ | $b = $a . ' B ' . 3; | ||
+ | is_literal($b); | ||
- | ==== SQL Injection, Basic ==== | + | is_literal($_GET[' |
- | A simple example: | + | is_literal(rand(0, |
- | $sql = 'SELECT * FROM table WHERE id = ?'; | + | is_literal(sprintf('LIMIT %d', |
- | + | ||
- | $result = $db-> | + | |
- | Checked in the framework by: | + | $c = count($ids); |
+ | $a = 'WHERE id IN (' . implode(',', | ||
+ | is_literal($a); | ||
+ | </ | ||
- | class db { | + | Ideally string concatenation would be allowed, but [[https:// |
- | + | ||
- | public function exec($sql, $parameters = []) { | + | |
- | + | ||
- | if (!is_literal($sql)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | $statement = $this-> | + | |
- | $statement-> | + | |
- | return $statement-> | + | |
- | + | ||
- | } | + | |
- | + | ||
- | } | + | |
- | It will also work with string concatenation: | + | Thanks to [[https:// |
- | define(' | + | As an aside, [[https:// |
- | + | ||
- | $sql = ' | + | |
- | + | ||
- | is_literal($sql); | + | |
- | + | ||
- | $sql .= ' AND id = ' . mysqli_real_escape_string($db, $_GET[' | + | |
- | + | ||
- | is_literal($sql); | + | |
- | ==== SQL Injection, ORDER BY ==== | + | Unlike the Taint extension, there must **not** be an equivalent // |
- | To ensure //ORDER BY// can be set via the user, but only use acceptable values: | + | ===== Previous Work ===== |
- | $order_fields = [ | + | There is the [[https:// |
- | ' | + | |
- | ' | + | |
- | ' | + | |
- | ]; | + | |
- | + | ||
- | $order_id = array_search(($_GET[' | + | |
- | + | ||
- | $sql = ' ORDER BY ' | + | |
- | ==== SQL Injection, WHERE IN ==== | + | Google currently uses a [[https:// |
- | Most SQL strings can be a concatenations of literal values, but //WHERE x IN (?,?,?)// need to use a variable number of literal | + | And there are discussions about [[https://github.com/craigfrancis/php-is-literal-rfc/ |
- | So there //might// need to be a special case for //array_fill()// | + | As noted be [[https://news-web.php.net/php.internals/109192|Tyson Andre]], it might be possible |
- | $in_sql = implode(' | + | And there is the [[https:// |
- | + | ||
- | | + | |
- | + | ||
- | $in_sql = substr(str_repeat('?,', count($ids)), | + | |
- | To be used with: | + | * " |
+ | * this amount of work isn't ideal for "just for one use case" ([[https:// | ||
+ | * It would have effected every SQL function, such as // | ||
+ | * Each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara [[https:// | ||
- | $sql = ' | + | I also agree that "SQL injection |
- | + | ||
- | ==== SQL Injection, ORM Usage ==== | + | |
- | + | ||
- | [[https:// | + | |
- | + | ||
- | $users = $queryBuilder | + | |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | -> | + | |
- | + | ||
- | // example.php? | + | |
- | + | ||
- | Where this mistake could be identified by: | + | |
- | + | ||
- | public function where($predicates) | + | |
- | { | + | |
- | if (!is_literal($predicates)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | ... | + | |
- | } | + | |
- | + | ||
- | [[https://redbeanphp.com/ | + | |
- | + | ||
- | $users = R:: | + | |
- | + | ||
- | [[http://propelorm.org/Propel/ | + | |
- | + | ||
- | $users = UserQuery:: | + | |
- | + | ||
- | ==== SQL Injection, ORM Internal ==== | + | |
- | + | ||
- | The // | + | |
- | + | ||
- | This would avoid mistakes such as the ORDER BY issues in the Zend framework [[https://framework.zend.com/ | + | |
- | + | ||
- | ==== CLI Injection ==== | + | |
- | + | ||
- | Rather than using functions such as: | + | |
- | + | ||
- | * // | + | |
- | * // | + | |
- | * // | + | |
- | * // | + | |
- | + | ||
- | Frameworks (or PHP) could introduce something similar to // | + | |
- | + | ||
- | Or, take a verified literal for the command, and use parameters for the arguments (like SQL): | + | |
- | + | ||
- | $output = parameterised_exec(' | + | |
- | ' | + | |
- | ]); | + | |
- | + | ||
- | Rough implementation: | + | |
- | + | ||
- | function parameterised_exec($cmd, | + | |
- | + | ||
- | if (!is_literal($cmd)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | $offset = 0; | + | |
- | $k = 0; | + | |
- | while (($pos = strpos($cmd, | + | |
- | if (!isset($args[$k])) { | + | |
- | throw new Exception(' | + | |
- | exit(); | + | |
- | } | + | |
- | $arg = escapeshellarg($args[$k]); | + | |
- | $cmd = substr($cmd, | + | |
- | $offset = ($pos + strlen($arg)); | + | |
- | $k++; | + | |
- | } | + | |
- | if (isset($args[$k])) { | + | |
- | throw new Exception(' | + | |
- | exit(); | + | |
- | } | + | |
- | + | ||
- | return exec($cmd); | + | |
- | + | ||
- | } | + | |
- | + | ||
- | ==== HTML Injection ==== | + | |
- | + | ||
- | Template engines should receive variables separately from the raw HTML. | + | |
- | + | ||
- | Often the engine will get the HTML from static files: | + | |
- | + | ||
- | $html = file_get_contents(' | + | |
- | + | ||
- | But small snippets of HTML are often easier to define as a literal within the PHP script: | + | |
- | + | ||
- | $template_html = ' | + | |
- | < | + | |
- | < | + | |
- | + | ||
- | Where the variables are supplied separately, in this example I'm using XPaths: | + | |
- | + | ||
- | $values = [ | + | |
- | '// | + | |
- | NULL => ' | + | |
- | ' | + | |
- | ' | + | |
- | ], | + | |
- | '// | + | |
- | ' | + | |
- | ], | + | |
- | ]; | + | |
- | + | ||
- | echo template_parse($template_html, | + | |
- | + | ||
- | Being sure the HTML does not contain unsafe variables, the templating engine can accept and apply the supplied variables for the relevant context, for example: | + | |
- | + | ||
- | function template_parse($html, | + | |
- | + | ||
- | if (!is_literal($html)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | $dom = new DomDocument(); | + | |
- | $dom-> | + | |
- | + | ||
- | $xpath = new DOMXPath($dom); | + | |
- | + | ||
- | foreach ($values as $query => $attributes) { | + | |
- | + | ||
- | if (!is_literal($query)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | foreach ($xpath-> | + | |
- | foreach ($attributes as $attribute => $value) { | + | |
- | + | ||
- | if (!is_literal($attribute)) { | + | |
- | throw new Exception(' | + | |
- | } | + | |
- | + | ||
- | if ($attribute) { | + | |
- | $safe = false; | + | |
- | if ($attribute == ' | + | |
- | if (preg_match('/ | + | |
- | $safe = true; // Not " | + | |
- | } | + | |
- | } else if ($attribute == ' | + | |
- | if (in_array($value, | + | |
- | $safe = true; // Only allow specific classes? | + | |
- | } | + | |
- | } else if (preg_match('/ | + | |
- | if (preg_match('/ | + | |
- | $safe = true; | + | |
- | } | + | |
- | } | + | |
- | if ($safe) { | + | |
- | $element-> | + | |
- | } | + | |
- | } else { | + | |
- | $element-> | + | |
- | } | + | |
- | + | ||
- | } | + | |
- | } | + | |
- | + | ||
- | } | + | |
- | + | ||
- | $html = ''; | + | |
- | $body = $dom-> | + | |
- | if ($body-> | + | |
- | foreach ($body-> | + | |
- | $html .= $dom-> | + | |
- | } | + | |
- | } | + | |
- | + | ||
- | return $html; | + | |
- | + | ||
- | } | + | |
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | Not sure | + | None |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
- | PHP 8? | + | PHP 8.1? |
===== RFC Impact ===== | ===== RFC Impact ===== | ||
Line 350: | Line 116: | ||
===== Open Issues ===== | ===== Open Issues ===== | ||
+ | On [[https:// | ||
+ | |||
+ | - Should this be named something else? ([[https:// | ||
- Would this cause performance issues? | - Would this cause performance issues? | ||
- | - Can // | + | - Can // |
- | - Should the function be named // | + | - Systems/ |
- | - Systems/ | + | |
- | + | ||
- | ===== Alternatives ===== | + | |
- | + | ||
- | - The current Taint Extension (notes above) | + | |
- | - Using static analysis (not runtime), for example [[https:// | + | |
===== Unaffected PHP Functionality ===== | ===== Unaffected PHP Functionality ===== | ||
Line 366: | Line 129: | ||
===== Future Scope ===== | ===== Future Scope ===== | ||
- | Certain | + | As noted by [[https:// |
+ | |||
+ | This check could be used to throw an exception, or generate | ||
+ | |||
+ | PHP could have a mode where output (e.g. //echo '< | ||
+ | |||
+ | And, for a bit of silliness (Spaß ist verboten), there could be a // | ||
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
- | Not sure | + | N/A |
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
- | A volunteer is needed to help with implementation. | + | N/A |
===== Implementation ===== | ===== Implementation ===== | ||
- | N/A | + | [[https:// |
===== References ===== | ===== References ===== | ||
- | - https:// | + | N/A |
===== Rejected Features ===== | ===== Rejected Features ===== |
rfc/is_literal.txt · Last modified: 2022/02/14 00:36 by craigfrancis