rfc:is_literal

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Next revisionBoth sides next revision
rfc:is_literal [2021/02/19 19:36] – Moving justification and examples to GitHub craigfrancisrfc:is_literal [2021/04/11 11:37] – Example updates craigfrancis
Line 15: Line 15:
 As in, at runtime, being able to check if a variable has been created by literals, defined within a PHP script, by a trusted developer. As in, at runtime, being able to check if a variable has been created by literals, defined within a PHP script, by a trusted developer.
  
-This simple check can be used to warn or completely block SQL Injection, Command Line Injection, and many cases of HTML Injection (aka XSS).+This check can be used to warn or completely block (by defaultmany, if not all, injection based vulnerabilities.
  
 See the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md|justification for why this is important]]. See the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md|justification for why this is important]].
  
-But, in short, abstractions like [[https://www.doctrine-project.org/projects/doctrine-orm/en/2.7/reference/security.html|Doctrine could protect itself against common mistakes]] like this:+In short, abstractions like Doctrine could identify [[https://www.doctrine-project.org/projects/doctrine-orm/en/2.7/reference/security.html|common mistakes]]like this [[https://www.doctrine-project.org/projects/doctrine-orm/en/2.7/reference/query-builder.html#high-level-api-methods|Query Builder]] SQL Injection vulnerability:
  
 <code php> <code php>
-$query = $em->createQuery('SELECT FROM User u WHERE u.id = ' . $_GET['id']);+$users = $queryBuilder 
 +  ->select('u') 
 +  ->from('User', 'u') 
 +  ->where('u.id = ' . $_GET['id']
 +  ->getQuery() 
 +  ->getResult(); 
 +</code> 
 + 
 +Or Twig could project against HTML Injection vulnerabilities (aka XSS), like this flawed [[https://twig.symfony.com/doc/2.x/recipes.html#loading-a-template-from-a-string|HTML Template]]: 
 + 
 +<code php> 
 +echo $twig->createTemplate('<p>Hi ' . $_GET['name'] . '</p>')->render();
 </code> </code>
  
Line 30: Line 41:
  
 <code php> <code php>
 +is_literal('Example'); // true
 +
 $a = 'Example'; $a = 'Example';
 is_literal($a); // true is_literal($a); // true
  
-$a = 'Example ' $a . ', ' . 5+is_literal(4); // true 
-is_literal($a); // true+is_literal(0.3)// true 
 +is_literal('a' . 'b'); // true, compiler can concatenate
  
-$a = 'Example $_GET['id']+$a = 'A'
-is_literal($a); // false+$b = $a . . 3
 +is_literal($b); // true, ideally (more details below)
  
-$a = 'Example . time(); +is_literal($_GET['id']); // false
-is_literal($a); // false+
  
-$a = sprintf('LIMIT %d', 3)+is_literal(rand(0, 10)); // false 
-is_literal($a); // false+ 
 +is_literal(sprintf('LIMIT %d', 3)); // false
  
 $c = count($ids); $c = count($ids);
 $a = 'WHERE id IN (' . implode(',', array_fill(0, $c, '?')) . ')'; $a = 'WHERE id IN (' . implode(',', array_fill(0, $c, '?')) . ')';
-is_literal($a); // true, the odd one that involves functions+is_literal($a); // true, the one exception that involves functions.
- +
-$limit = 10; +
-$a = 'LIMIT ' . ($limit + 1); +
-is_literal($a); // false, but might need some discussion.+
 </code> </code>
  
-This uses a similar definition of [[https://wiki.php.net/rfc/sql_injection_protection#safeconst|SafeConst]] from Matt Tait's RFCbut it doesn't need to accept Integer or FloatingPoint variables as safe (unless it makes the implementation easier), nor should this proposal effect any existing functions.+Ideally string concatenation would be allowed, but [[https://github.com/Danack/RfcLiteralString/issues/5|Danack]] suggested this might raise performance concernsand an array implode like function could be used instead (or a query builder).
  
 Thanks to [[https://chat.stackoverflow.com/transcript/message/51565346#51565346|NikiC]], it looks like we can reuse the GC_PROTECTED flag for strings. Thanks to [[https://chat.stackoverflow.com/transcript/message/51565346#51565346|NikiC]], it looks like we can reuse the GC_PROTECTED flag for strings.
Line 64: Line 75:
 ===== Previous Work ===== ===== Previous Work =====
  
-There is the [[https://github.com/laruence/taint|Taint extension]] by Xinchen Hui, but this approach explicitly allows escapingwhich doesn't address all issues.+There is the [[https://github.com/laruence/taint|Taint extension]] by Xinchen Hui, but in contrast to this, there must **not** be an equivalent //untaint()// function, or support any kind of escaping (see the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md|justification page]]).
  
-Google currently uses a [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md#go-implementation|similar approach in Go]] with the use of "compile time constants"and there are [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md#javascript-implementation|discussions with it happening in JavaScript]].+Google currently uses a [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md#go-implementation|similar approach in Go]] which uses "compile time constants", [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md#perl-implementation|Perl has a Taint Mode]] (but uses regular expressions to un-taint data), and there are discussions about [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification.md#javascript-implementation|adding it to JavaScript]] to support Trusted Types.
  
-It might be possible to use static analysis, for example [[https://psalm.dev/|psalm]] (thanks [[https://news-web.php.net/php.internals/109192|Tyson Andre]]). But I can't find any which do these checks by default, they are likely to miss things that happen at runtime, and we can't expect all programmers to use static analysis (especially those who have just stated, who need this more than developers who know the concepts and just make the odd mistake).+As noted be [[https://news-web.php.net/php.internals/109192|Tyson Andre]], it might be possible to use static analysis, for example [[https://psalm.dev/|psalm]]. But I can't find any which do these checks by default, [[https://github.com/vimeo/psalm/commit/2122e4a1756dac68a83ec3f5abfbc60331630781|can be incomplete]], they are likely to miss things (especially at runtime), and we can't expect all programmers to use static analysis (especially those who are new to programming, who need this more than developers who know the concepts and just make the odd mistake).
  
-And there is the [[https://wiki.php.net/rfc/sql_injection_protection|Automatic SQL Injection Protection]] RFC by Matt Tait, where it was noted:+And there is the [[https://wiki.php.net/rfc/sql_injection_protection|Automatic SQL Injection Protection]] RFC by Matt Tait, where this RFC uses a similar concept of the [[https://wiki.php.net/rfc/sql_injection_protection#safeconst|SafeConst]]. When Matt's RFC was being discussed, it was noted:
  
   * "unfiltered input can affect way more than only SQL" ([[https://news-web.php.net/php.internals/87355|Pierre Joye]]);   * "unfiltered input can affect way more than only SQL" ([[https://news-web.php.net/php.internals/87355|Pierre Joye]]);
Line 77: Line 88:
   * Each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara [[https://news-web.php.net/php.internals/87406|1]]/[[https://news-web.php.net/php.internals/87446|2]]).   * Each of those functions would need a bypass for cases where unsafe SQL was intentionally being used (e.g. phpMyAdmin taking SQL from POST data) because some applications intentionally "pass raw, user submitted, SQL" (Ronald Chmara [[https://news-web.php.net/php.internals/87406|1]]/[[https://news-web.php.net/php.internals/87446|2]]).
  
-I also agree that "SQL injection is almost a solved problem [by using] prepared statements" ([[https://news-web.php.net/php.internals/87400|Scott Arciszewski]]), but we still need something to identify mistakes.+I also agree that "SQL injection is almost a solved problem [by using] prepared statements" ([[https://news-web.php.net/php.internals/87400|Scott Arciszewski]]), but we still //is_literal()// to identify mistakes.
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
Line 105: Line 116:
 On [[https://github.com/craigfrancis/php-is-literal-rfc/issues|GitHub]]: On [[https://github.com/craigfrancis/php-is-literal-rfc/issues|GitHub]]:
  
 +  - Name it something else? [[https://news-web.php.net/php.internals/109197|Jakob Givoni]] suggested //is_from_literal()//; or maybe //is_safe()//.
   - Would this cause performance issues?   - Would this cause performance issues?
   - Can //array_fill()//+//implode()// pass though the "is_literal" flag for the "WHERE IN" case?   - Can //array_fill()//+//implode()// pass though the "is_literal" flag for the "WHERE IN" case?
-  - Should the function be named something else? ([[https://news-web.php.net/php.internals/109197|Jakob Givoni]] suggested //is_from_literal//). +  - Systems/Frameworks that define certain variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml files), they might need to make some changes to use this check, as originally noted by [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]].
-  - Systems/Frameworks that define certain variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml files), might need to make some changes to use this check, as originally noted by [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]].+
  
 ===== Unaffected PHP Functionality ===== ===== Unaffected PHP Functionality =====
Line 116: Line 127:
 ===== Future Scope ===== ===== Future Scope =====
  
-As noted by [[https://chat.stackoverflow.com/transcript/message/51573226#51573226|MarkR]], the benefit will come when it can be used by PDO and similar functions (//mysqli_query//, //preg_match//, etc).+As noted by [[https://chat.stackoverflow.com/transcript/message/51573226#51573226|MarkR]], the biggest benefit will come when it can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc). But the basic idea can be used immediately by frameworks and general abstraction libraries, and they can give feedback for future work. 
 + 
 +**Phase 2** could introduce a way for programmers to specify that certain function arguments only accept safe literals, and/or specific value-objects their project trusts (this idea comes from [[https://web.dev/trusted-types/|Trusted Types]] in JavaScript).
  
-This check could be used to throw an exception, or generate an error/warning/notice, providing way for PHP to teach new programmers, and/or completely block unsafe values in SQLHTMLCLI, etc.+For examplea project could require the second argument for //pg_query()// only accept literals or their //query_builder// object (which provides a //__toString// method); and that any output (printechoreadfile, etc) must use the //html_output// object that's returned by their trusted HTML Templating system (using //ob_start()// might be useful here).
  
-PHP could also have mode where output (e.g. //echo '<html>'//) is blocked, and this can be bypassed (maybe via //ini_set//) when the HTML Templating Engine has created the correctly encoded output.+**Phase 3** could set default of 'only literalsfor all of the relevant PHP function arguments, so developers are given a warning, and later prevented (via an exception)when they provide an unsafe value to those functions (they could still specify that unsafe values are allowed, e.g. phpMyAdmin).
  
-And, for a bit of silliness, there could be a //is_figurative()// function, which MarkR seems to [[https://chat.stackoverflow.com/transcript/message/48927770#48927770|really]], [[https://chat.stackoverflow.com/transcript/message/51573091#51573091|want]] :-)+And, for a bit of silliness (Spaß ist verboten), there could be a //is_figurative()// function, which MarkR seems to [[https://chat.stackoverflow.com/transcript/message/48927770#48927770|really]], [[https://chat.stackoverflow.com/transcript/message/51573091#51573091|want]] :-)
  
 ===== Proposed Voting Choices ===== ===== Proposed Voting Choices =====
rfc/is_literal.txt · Last modified: 2022/02/14 00:36 by craigfrancis