rfc:is_trusted

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:is_trusted [2021/06/21 12:00] craigfrancisrfc:is_trusted [2021/06/21 19:36] (current) craigfrancis
Line 1: Line 1:
-====== PHP RFC: Is Trusted ======+====== PHP RFC: Is_Trusted ======
  
   * Version: 0.9   * Version: 0.9
Line 5: Line 5:
   * Updated: 2021-06-06   * Updated: 2021-06-06
   * Author: Craig Francis, craig#at#craigfrancis.co.uk   * Author: Craig Francis, craig#at#craigfrancis.co.uk
-  * Contributors: Joe Watkins, Dan Ackroyd, Máté Kocsis+  * Contributors: Joe Watkins, Máté Kocsis, Dan Ackroyd
   * Status: Under Discussion   * Status: Under Discussion
-  * First Published at: https://wiki.php.net/rfc/is_trusted+  * First Published at: https://wiki.php.net/rfc/is_literal
   * GitHub Repo: https://github.com/craigfrancis/php-is-literal-rfc   * GitHub Repo: https://github.com/craigfrancis/php-is-literal-rfc
  
Line 14: Line 14:
 Add the function //is_trusted()//, so developers can check if a variable can be trusted to not contain an Injection Vulnerability. Add the function //is_trusted()//, so developers can check if a variable can be trusted to not contain an Injection Vulnerability.
  
-By trusting integers and literals (strings defined by the programmer), we can provide a lightweight, simple, and very effective way to identify common Injection Vulnerabilities.+By trusting integers and literals (strings defined by the programmer), we can provide a lightweight, simple, and very effective way to prevent Injection Vulnerabilities.
  
-It avoids the "false sense of security" that comes with the flawed "Taint Checking" approach, [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/escaping.php?ts=4|because escaping is very difficult to get right]]. +It avoids the "false sense of security" that comes with the flawed "Taint Checking" approach, [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/escaping.php?ts=4|because escaping is very difficult to get right]]. It's much safer for developers to use parameterised queries, and well-tested libraries.
- +
-Developers should not escape anything themselves (or use anything like "magic quotes"); they should simply use parameterised queries, and/or well-tested libraries.+
  
 These libraries require certain sensitive values to only come from the developer; but because it's [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/mistakes.php?ts=4|easy to incorrectly include user values]], Injection Vulnerabilities are still introduced by the thousands of developers using these libraries incorrectly. You will notice the linked examples are based on examples found in the Libraries' official documentation, they still "work", and are typically shorter/easier than doing it correctly (I've found many of them on live websites, and it's why I'm here). A simple Query Builder example being: These libraries require certain sensitive values to only come from the developer; but because it's [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/mistakes.php?ts=4|easy to incorrectly include user values]], Injection Vulnerabilities are still introduced by the thousands of developers using these libraries incorrectly. You will notice the linked examples are based on examples found in the Libraries' official documentation, they still "work", and are typically shorter/easier than doing it correctly (I've found many of them on live websites, and it's why I'm here). A simple Query Builder example being:
Line 58: Line 56:
 ==== Usage in PHP ==== ==== Usage in PHP ====
  
-Libraries would be able to use //is_trusted()// immediately, allowing them to warn developers about Injection Issues as soon as they receive any non-trusted values, for example:+Libraries would be able to use //is_trusted()// immediately, allowing them to warn developers about Injection Issues as soon as they receive any non-trusted values. Some already plan to implement this, for example:
  
 **Propel** (Mark Scherer): "given that this would help to more safely work with user input, I think this syntax would really help in Propel." **Propel** (Mark Scherer): "given that this would help to more safely work with user input, I think this syntax would really help in Propel."
Line 70: Line 68:
 Add the function //is_trusted()//, where "trusted" is defined as: Add the function //is_trusted()//, where "trusted" is defined as:
  
-  - Any integer, 
   - Strings defined by the programmer (literals),   - Strings defined by the programmer (literals),
-  - Strings defined by the engine (interned strings).+  - Strings defined by the engine (interned strings)
 +  - Any integer.
  
 No input values are interned; instead an interned string includes: No input values are interned; instead an interned string includes:
Line 112: Line 110:
 ===== Try it ===== ===== Try it =====
  
-[[https://3v4l.org/#focus=rfc.literals|Have a play with it on 3v4l.org]] - Note, the function is still named //is_literal()//.+[[https://3v4l.org/#focus=rfc.literals|Have a play with it on 3v4l.org]]
  
 [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4|How it can be used by libraries]] - Notice how this example library just raises a warning, to simply let the developer know about the issue, **without breaking anything**. And it provides an //"unsafe_value"// value-object to bypass the //is_trusted()// check, but none of the examples need to use it (can be useful as a temporary thing, but there are much safer/better solutions, which developers are/should already be using). [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4|How it can be used by libraries]] - Notice how this example library just raises a warning, to simply let the developer know about the issue, **without breaking anything**. And it provides an //"unsafe_value"// value-object to bypass the //is_trusted()// check, but none of the examples need to use it (can be useful as a temporary thing, but there are much safer/better solutions, which developers are/should already be using).
Line 120: Line 118:
 ==== Taint Checking ==== ==== Taint Checking ====
  
-**Taint checking is flawed, isn't this the same?** It is not the same.+**Taint checking is flawed, isn't this the same?**
  
-Taint Checking incorrectly assumes the output of an escaping function is "safe" for a particular context. While it sounds reasonable in theory, the operation of escaping functions, and the context for which their output is safe, is very hard to define. This leads to a feature that is both complex and unreliable.+It is not the same. Taint Checking incorrectly assumes the output of an escaping function is "safe" for a particular context. While it sounds reasonable in theory, the operation of escaping functions, and the context for which their output is safe, is very hard to define and led to a feature that is both complex and unreliable.
  
 <code php> <code php>
Line 130: Line 128:
 </code> </code>
  
-All three examples would be incorrectly considered "untainted". The first two need the values to be quoted. The third example, //htmlentities()// does not escape single quotes by default before PHP 8.1 ([[https://github.com/php/php-src/commit/50eca61f68815005f3b0f808578cc1ce3b4297f0|fixed]]), and it does not consider the issue of 'javascript:' URLs.+All three examples would be incorrectly considered "safe(untainted). The first two need the values to be quoted. The third example, //htmlentities()// does not escape single quotes by default before PHP 8.1 ([[https://github.com/php/php-src/commit/50eca61f68815005f3b0f808578cc1ce3b4297f0|fixed]]), and it does not consider the issue of 'javascript:' URLs.
  
 In comparison, //is_trusted()// doesn't have an equivalent of //untaint()//, or support escaping. Instead PHP will set the trusted flag, and as soon as the value has been manipulated or includes anything that is not trusted (e.g. user data), the trusted flag is lost. In comparison, //is_trusted()// doesn't have an equivalent of //untaint()//, or support escaping. Instead PHP will set the trusted flag, and as soon as the value has been manipulated or includes anything that is not trusted (e.g. user data), the trusted flag is lost.
Line 138: Line 136:
 ==== Education ==== ==== Education ====
  
-**Why not educate everyone?** You can't - developer training simply does not scale, and mistakes still happen.+**Why not educate everyone?** 
 + 
 +You can't - developer training simply does not scale, and mistakes still happen.
  
 We cannot expect everyone to have formal training, know everything from day 1, and consider programming a full time job. We want new programmers, with a variety of experiences, ages, and backgrounds. Everyone should be guided to do the right thing, and notified as soon as they make a mistake (we all make mistakes). We also need to acknowledge that many programmers are busy, do copy/paste code, don't necessarily understand what it does, edit it for their needs, then simply move on to their next task. We cannot expect everyone to have formal training, know everything from day 1, and consider programming a full time job. We want new programmers, with a variety of experiences, ages, and backgrounds. Everyone should be guided to do the right thing, and notified as soon as they make a mistake (we all make mistakes). We also need to acknowledge that many programmers are busy, do copy/paste code, don't necessarily understand what it does, edit it for their needs, then simply move on to their next task.
Line 144: Line 144:
 ==== Static Analysis ==== ==== Static Analysis ====
  
-**Why not use static analysis?** It will never be used by most developers.+**Why not use static analysis?** 
 + 
 +It will never be used by most developers.
  
 I still agree with [[https://news-web.php.net/php.internals/109192|Tyson Andre]], you should use Static Analysis, but it's an extra step that most programmers cannot be bothered to do, especially those who are new to programming (its usage tends to be higher among those writing well-tested libraries). I still agree with [[https://news-web.php.net/php.internals/109192|Tyson Andre]], you should use Static Analysis, but it's an extra step that most programmers cannot be bothered to do, especially those who are new to programming (its usage tends to be higher among those writing well-tested libraries).
  
-Also, these tools currently focus on other issues (type checking, basic logic flaws, code formatting, etc), rarely attempting to address injection vulnerabilities. Those that do are [[https://github.com/vimeo/psalm/commit/2122e4a1756dac68a83ec3f5abfbc60331630781|often incomplete]], need sinks specified on all library methods (unlikely to happen), and are not enabled by default. For example, Psalm, even in its strictest errorLevel (1), and running //--taint-analysis// (rarely used), will not notice the missing quote marks in this SQL, and incorrectly assume it's safe:+Also, these tools currently focus on other issues (type checking, basic logic flaws, code formatting, etc), rarely attempting to address Injection Vulnerabilities. Those that do are [[https://github.com/vimeo/psalm/commit/2122e4a1756dac68a83ec3f5abfbc60331630781|often incomplete]], need sinks specified on all library methods (unlikely to happen), and are not enabled by default. For example, Psalm, even in its strictest errorLevel (1), and running //--taint-analysis// (rarely used), will not notice the missing quote marks in this SQL, and incorrectly assume it's safe:
  
 <code php> <code php>
Line 160: Line 162:
 ==== Performance ==== ==== Performance ====
  
-**What about the performance impact?** Máté Kocsis has created a [[https://github.com/kocsismate/php-version-benchmarks/|php benchmark]] to replicate the old [[https://01.org/node/3774|Intel Tests]], and the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/kocsismate.pdf|preliminary testing on this implementation]] has found a 0.124% performance hit for the Laravel Demo app, and 0.161% for Symfony (rounds 4-6, which involved 5000 requests). These tests do not connect to a database, as the variability introduced makes it impossible to measure the difference.+**What about the performance impact?** 
 + 
 +These stats from an early version of the implementation (new tests will be completed soon). 
 + 
 +Máté Kocsis has created a [[https://github.com/kocsismate/php-version-benchmarks/|php benchmark]] to replicate the old [[https://01.org/node/3774|Intel Tests]], and the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/kocsismate.pdf|preliminary testing on this implementation]] has found a 0.124% performance hit for the Laravel Demo app, and 0.161% for Symfony (rounds 4-6, which involved 5000 requests). These tests do not connect to a database, as the variability introduced makes it impossible to measure that low of a difference.
  
-There is a more severe 3.719% when running this [[https://github.com/kocsismate/php-version-benchmarks/blob/main/app/zend/concat.php#L25|concat test]], but this is not representative of a typical PHP script (it's not normal to concatenate 4 strings, 5 million times, with no other actions).+The full stress-test is 3.719% when running this [[https://github.com/kocsismate/php-version-benchmarks/blob/main/app/zend/concat.php#L25|concat test]], but this is not representative of a typical PHP script (it's not normal to concatenate 4 strings, 5 million times, with no other actions).
  
 Joe Watkins has also noted that further optimisations are possible (the implementation has focused on making it work). Joe Watkins has also noted that further optimisations are possible (the implementation has focused on making it work).
Line 168: Line 174:
 ==== String Concatenation ==== ==== String Concatenation ====
  
-**Is string concatenation supported?** Yes.+**Is string concatenation supported?**
  
-The trusted flag is preserved when two trusted values are concatenated; this makes it easier to use //is_trusted()//, especially by developers that use concatenation for their SQL/HTML/CLI/etc.+Yes. The trusted flag is preserved when two trusted values are concatenated; this makes it easier to use //is_trusted()//, especially by developers that use concatenation for their SQL/HTML/CLI/etc.
  
-Previously we tried a version that only supported concatenation at compile-time (not run-time), to see if it would reduce the performance impact even further, and doing so might help developers with debugging. The idea was to require everyone to use special //trusted_concat()// and //trusted_implode()// functions, which would raise exceptions to highlight where mistakes were made. These two functions can still be implemented by developers themselves (see [[#support_functions|Support Functions]] below), as they can be useful; but requiring everyone to use them would have required big changes to existing projects, and exceptions are not a graceful way of handling mistakes.+Previously we tried a version that only supported concatenation at compile-time (not run-time), to see if it would reduce the performance impact even further. The idea was to require everyone to use special //trusted_concat()// and //trusted_implode()// functions, which would raise exceptions to highlight where mistakes were made. These two functions can still be implemented by developers themselves (see [[#support_functions|Support Functions]] below), as they can be useful; but requiring everyone to use them would have required big changes to existing projects, and exceptions are not a graceful way of handling mistakes.
  
 Performance wise, my [[https://github.com/craigfrancis/php-is-literal-rfc/tree/main/tests|simplistic testing]] found there was still [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/local.pdf|a small impact without run-time concat]]: Performance wise, my [[https://github.com/craigfrancis/php-is-literal-rfc/tree/main/tests|simplistic testing]] found there was still [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/local.pdf|a small impact without run-time concat]]:
Line 188: Line 194:
 ==== String Splitting ==== ==== String Splitting ====
  
-**Why don't you support string splitting?** In short, we can't find any use cases (security features should try to keep the implementation as simple as possible).+**Why don't you support string splitting?**
  
-Also, the security considerations are different. Concatenation joins known/fixed units together, whereas if you're starting with a trusted string, and the program allows the evil user to split the string (e.g. setting the length in substr), then they get considerable control over the result (it creates an untrusted modification).+In short, we can't find any real use cases (security features should try to keep the implementation as simple as possible). 
 + 
 +Also, the security considerations are different. Concatenation joins known/fixed units together, whereas if you're starting with a trusted string, and the program allows the Evil-User to split the string (e.g. setting the length in substr), then they get considerable control over the result (it creates an untrusted modification).
  
 These are unlikely to be written by a programmer, but consider these: These are unlikely to be written by a programmer, but consider these:
Line 212: Line 220:
 If the values are explicitly cast to integers, there is no need to make a change. If the values are explicitly cast to integers, there is no need to make a change.
  
-That said, ideally you would still follow the advice from [[https://stackoverflow.com/a/23641033/538216|Levi Morrison]], [[https://www.php.net/manual/en/pdostatement.execute.php#example-1012|PDO Execute]], and [[https://www.drupal.org/docs/7/security/writing-secure-code/database-access#s-multiple-arguments|Drupal Multiple Arguments]]:+That said, ideally you would still follow the advice from [[https://stackoverflow.com/a/23641033/538216|Levi Morrison]], [[https://www.php.net/manual/en/pdostatement.execute.php#example-1012|PDO Execute]], and [[https://www.drupal.org/docs/7/security/writing-secure-code/database-access#s-multiple-arguments|Drupal Multiple Arguments]], and implement as such:
  
 <code php> <code php>
Line 229: Line 237:
 ==== Non-Parameterised Values ==== ==== Non-Parameterised Values ====
  
-**How can this work with Table and Field names in SQL, which cannot use parameters?** They are often in variables written as literal strings anyway (so no changes needed); and if they are dependent on user input, in most cases you can (and should) use literals:+**How can this work with Table and Field names in SQL, which cannot use parameters?** 
 + 
 +They are often in variables written as literal strings anyway (so no changes needed); and if they are dependent on user input, in most cases you can (and should) use literals:
  
 <code php> <code php>
Line 249: Line 259:
 **How does this work in cases where you can't use trusted values?** **How does this work in cases where you can't use trusted values?**
  
-For example [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]] noted that some Systems/Frameworks currently define some variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml).+For example [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]] noted that some Systems/Frameworks currently define some variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml). And Larry Garfield noted that in Drupal's ORM "the table name itself is user-defined" (not in the PHP script).
  
-And Larry Garfield noted that in Drupal's ORM "the table name itself is user-defined" (not in the PHP script). +While most systems can use trusted values entirely, these special non-trusted values should still be handled separately (and carefully). This approach allows the library to ensure the majority of the input (SQL) is trusted, and then it can consistently check/escape those special values (e.g. does it match a valid table/field name, which can be included safely).
- +
-While most systems can use trusted values entirely, these special non-trusted values should still be handled separately (and carefully). This approach allows the library to ensure the majority of the input (SQL) is trusted, andthen it can consistently check/escape those special values (e.g. does it match a valid table/field name, which can be included safely).+
  
 [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L194|How this can be done with aliases]], or the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]]. [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L194|How this can be done with aliases]], or the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]].
Line 259: Line 267:
 ==== Usage by Libraries ==== ==== Usage by Libraries ====
  
-**Could libraries use is_trusted() internally?** Yes, they could.+**Could libraries use is_trusted() internally?** 
 + 
 +Yes, they could.
  
 It would be fantastic if they did use additional //is_trusted()// checks after receiving the values from developers (it ensures the library hasn't introduced a vulnerability either); but this isn't a priority, simply because libraries are rarely the source of Injection Vulnerabilities. It would be fantastic if they did use additional //is_trusted()// checks after receiving the values from developers (it ensures the library hasn't introduced a vulnerability either); but this isn't a priority, simply because libraries are rarely the source of Injection Vulnerabilities.
Line 269: Line 279:
 ==== Integer Values ==== ==== Integer Values ====
  
-**Can you support Integer values?** Yes, support for integers is now included.+**Can you support Integer values?** 
 + 
 +Yes, support for integers is now included.
  
 It was noted by Matthew Brown (and others) that a lot of existing code and tutorials uses integers directly, and they do not cause a security issue. It was noted by Matthew Brown (and others) that a lot of existing code and tutorials uses integers directly, and they do not cause a security issue.
  
-We did consider only trusting integers from developersunfortunately [[https://news-web.php.net/php.internals/114964|it would require a big change to add a trusted flag on integers]].+We tried to flag integers defined in the source codein the same way we are doing with strings. Unfortunately [[https://news-web.php.net/php.internals/114964|it would require a big change to add a trusted flag on integers]]. Changing how integers work internally would have made a big performance impact, and potentially affected every part of PHP (including extensions).
  
-That said, while it's not as ideologically pure, we can still trust all integers (in regards to Injection Vulnerabilities), no matter where they came from.+That said, while it's not as philosophically pure, we can still trust all integers in regards to Injection Vulnerabilities, no matter where they came from.
  
-We have not found any issues with allowing integers in SQL, HTML, CLI; and other contexts as well (like preg, mail additional_params, XPath query, and even eval).+We have not found any issues with allowing integers in SQL, HTML, CLI; and other contexts as well (e.g. preg, mail additional_params, XPath query, and even eval).
  
-We could not find any character encoding issues (The closest we could find was EBCDIC, an old IBM character encoding, which encodes the 0-9 characters differently; which anyone using it would need to re-encode either way, also [[https://www.php.net/manual/en/migration80.other-changes.php#migration80.other-changes.ebcdic|EBCDIC is not supported by PHP]]).+We could not find any character encoding issues (The closest we could find was EBCDIC, an old IBM character encoding, which encodes the 0-9 characters differently; which anyone using it would need to re-encode either way, and [[https://www.php.net/manual/en/migration80.other-changes.php#migration80.other-changes.ebcdic|EBCDIC is not supported by PHP]]).
  
 We also checked what would happen if a 64bit PHP server sent a large number to a 32bit database, but that's not an issue either, because the number is being encoded as characters in a string, so that's also fine. We also checked what would happen if a 64bit PHP server sent a large number to a 32bit database, but that's not an issue either, because the number is being encoded as characters in a string, so that's also fine.
Line 285: Line 297:
 ==== Other Values ==== ==== Other Values ====
  
-**Why don't you support Boolean/Float values?** It's a very low value feature, and we cannot be sure of the security implications.+**Why don't you support Boolean/Float values?** 
 + 
 +It's a very low value feature, and we cannot be sure of the security implications.
  
 For example, the value you put in often is not always the same as what you get out: For example, the value you put in often is not always the same as what you get out:
Line 306: Line 320:
 First, there is no perfect name. First, there is no perfect name.
  
-We did start with //is_literal()// as a placeholder name (at a time we only trusted literals). This name wasn't perfect, but it would have allowed developers to search and get an idea of what a literal was. When [[#integer_values|integer values]] were deemed necessary to help adoption, the name became more of a problem. We also need to keep to a single word name (to support a dedicated type in the future). This is where //is_trusted()// and //is_known()// was proposed. While we preferred //is_known()//, it turned out to be too vague, and a [[https://strawpoll.com/bd2qed2xs/r|vote on the name]] gave use a 18 to 3 result in favour of //is_trusted()//.+We did start with //is_literal()// as a placeholder name (at a time we only trusted literals). This name wasn't perfect, but it would have allowed developers to search and get an idea of what a literal was. When [[#integer_values|integer values]] were deemed necessary to help adoption, the name became more of a problem. We also need to keep to a single word name (to support a dedicated type in the future). This is where //is_trusted()// and //is_known()// was proposed. We had a [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/name/2021-07-20.png|vote on the name]], which gave us a 18 to 3 result in favour of //is_trusted()//.
  
-Alternative suggestions have included //is_from_literal()// from [[https://news-web.php.net/php.internals/109197|Jakob Givoni]], //is_hard_coded()// from [[https://news-web.php.net/php.internals/114910|Dik Takken]], and references to alternative implementations like "compile time constants" and "code string".+==== Support Functions ====
  
-We cannot call it //is_safe()//, because we cannot say anything is safe: +**What about other support functions?**
- +
-<code php> +
-$cli = 'rm -rf ?'; +
-$sql = 'DELETE FROM my_table WHERE my_date >= ?'; +
-eval('$name = "' . $_GET['name'] . '";'); // INSECURE +
-</code> +
- +
-While the first two examples cannot include Injection Vulnerabilities, the parameters could be set to "/" or "0000-00-00" (providing a nice vanishing magic trick); and the last one, well, they have much bigger issues to worry about (it's clearly irresponsible, and intentionally dangerous). +
- +
-==== Support Functions ====+
  
-**What about other support functions?** We did consider //trusted_concat()// and //trusted_implode()// functions (see [[#string_concatenation|String Concatenation]] above), but these can be userland functions:+We did consider //trusted_concat()// and //trusted_implode()// functions (see [[#string_concatenation|String Concatenation]] above), but these can be userland functions:
  
 <code php> <code php>
Line 362: Line 366:
 ==== Other Functions ==== ==== Other Functions ====
  
-**Why not support other string functions?** Like [[#string_splitting|String Splitting]], we can't find any use cases, and don't want to make this complicated. For example //strtoupper()// might be reasonable, but we will need to consider how it would be used (good and bad), and check for any oddities (e.g. output varying based on the current locale). Also, functions like //str_shuffle()// create unpredictable results.+**Why not support other string functions?** 
 + 
 +Like [[#string_splitting|String Splitting]], we can't find any real use cases, and don't want to make this complicated. For example //strtoupper()// might be reasonable, but we will need to consider how it would be used (good and bad), and check for any oddities (e.g. output varying based on the current locale). Also, functions like //str_shuffle()// create unpredictable results
 + 
 +==== Limitations ==== 
 + 
 +**Does this mean the value is completely safe?** 
 + 
 +While these values can be trusted to not contain an Injection Vulnerability, they cannot be completely safe from every kind of issue, For example: 
 + 
 +<code php> 
 +$cli = 'rm -rf ?'; // RISKY 
 +$sql = 'DELETE FROM my_table WHERE my_date >= ?'; // RISKY 
 +</code> 
 + 
 +The parameters could be set to "/" or "0000-00-00", which can result in deleting a lot more data than expected. 
 + 
 +There's no single RFC that can completely solve all developer errors, but this takes one of the biggest ones off the table.
  
 ==== Faking it ==== ==== Faking it ====
Line 370: Line 391:
 This implementation does not provide a way for a developer to mark anything they want as trusted. This is on purpose. We do not want to recreate the biggest flaw of Taint Checking. It would be very easy for a naive developer to mark all escaped values as trusted ([[#taint_checking|wrong]]). This implementation does not provide a way for a developer to mark anything they want as trusted. This is on purpose. We do not want to recreate the biggest flaw of Taint Checking. It would be very easy for a naive developer to mark all escaped values as trusted ([[#taint_checking|wrong]]).
  
-That said, we do not pretend there aren't ways around this (e.g. using var_export), but doing so is clearly the developer doing something wrong. We want to provide safety rails, there is nothing stopping the developer from jumping over them.+That said, we do not pretend there aren't ways around this (e.g. using [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/is-trusted-bypass.php|var_export]]), but doing so is clearly the developer doing something wrong. We want to provide safety rails, but there is nothing stopping the developer from jumping over them if that's their choice.
  
 ==== Extensions ==== ==== Extensions ====
  
-**Extensions create and manipulate strings, won't this break the flag on strings?** Strings have multiple flags already, and are off by defaultthis is the correct behaviour when extensions create their own strings (should not be considered trusted). If an extension is found to be changing the trusted flag incorrectly (unlikely), that's the same as any new flag being introduced, and will need to be fixed in the same way.+**Extensions create and manipulate strings, won't this break the flag on strings?** 
 + 
 +Strings have multiple flags already that are off by default this is the correct behaviour when extensions create their own strings (should not be flagged as trusted). If an extension is found to be already using the flag we're using for is_trusted (unlikely), that's the same as any new flag being introduced into PHP, and will need to be updated in the same way.
  
 ==== Reflection API ==== ==== Reflection API ====
  
-**Why don't you use the reflection API?** This allows you to "introspect classes, interfaces, functions, methods and extensions"; it's not currently set up for object methods to inspect the code calling it. Even if that was to be added (unlikely), it could only check if the trusted value was defined there, it couldn't handle variables (tracking back to their source), nor could it provide any future scope for a dedicated type, nor could native functions work with this (see "Future Scope").+**Why don't you use the Reflection API?** 
 + 
 +This allows you to "introspect classes, interfaces, functions, methods and extensions"; it's not currently set up for object methods to inspect the code calling it. Even if that was to be added (unlikely), it could only check if the trusted value was defined there, it couldn't handle variables (tracking back to their source), nor could it provide any future scope for a dedicated type, nor could native functions work with this (see "Future Scope").
  
 ==== Interned Strings ==== ==== Interned Strings ====
  
-**The output from //chr()// appears as trusted?** This was noticed by Claude Pache, and on a technical level is due to the [[https://news-web.php.net/php.internals/114877|use of Interned Strings]], an optimisation used by //RETURN_CHAR// that re-uses single character values. It's effectively the same as calling //sprintf('%c', $i)//, which is also not an issue, as the developer is choosing to do this.+**Why does the output from //chr()// appear as trusted?** 
 + 
 +This was noticed by Claude Pache, and on a technical level is due to the [[https://news-web.php.net/php.internals/114877|use of Interned Strings]], an optimisation used by //RETURN_CHAR// that re-uses single character values. It's effectively the same as calling //sprintf('%c', $i)//, which is also not an issue, as the developer is choosing to do this.
  
 ===== Previous Work ===== ===== Previous Work =====
Line 431: Line 458:
 ===== Open Issues ===== ===== Open Issues =====
  
-  - Supporting Integers/Interned values. +None
-  - The name. +
- +
-===== Unaffected PHP Functionality ===== +
- +
-None known+
  
 ===== Future Scope ===== ===== Future Scope =====
Line 444: Line 466:
 2) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc). 2) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc).
  
-First we need libraries to start using //is_trusted()// to check their inputs. The library can then do their thing, and apply the appropriate escaping, which can result in a value that no longer has the trusted flag set, but should still be trusted.+However, first we need libraries to start using //is_trusted()// to check their inputs. The library can then do their thing, and apply the appropriate escaping, which can result in a value that no longer has the trusted flag set, but should still be trusted.
  
-With a future RFC, we **could** introduce checks for the native functions. For example, if we use the [[https://web.dev/trusted-types/|Trusted Types]] concept from JavaScript (which protects [[https://www.youtube.com/watch?v=po6GumtHRmU&t=92s|60+ Injection Sinks]], like innerHTML), the libraries create a stringable object as their output. These objects can be added to a list of trusted objects for the relevant native functions. The native functions can then **warn** developers when they do not receive a value with the trusted flag, or one of the trusted objects. These warnings would not **break anything**, they just make developers aware of the mistakes they have made, and we will always need a way of switching them off entirely (e.g. phpMyAdmin).+With a future RFC, we could potentially introduce checks for the native functions. For example, if we use the [[https://web.dev/trusted-types/|Trusted Types]] concept from JavaScript (which protects [[https://www.youtube.com/watch?v=po6GumtHRmU&t=92s|60+ Injection Sinks]], like innerHTML), the libraries create a stringable object as their output. These objects can be added to a list of trusted objects for the relevant native functions. The native functions could then **warn** developers when they do not receive a value with the trusted flag, or one of the trusted objects. These warnings would **not break anything**, they just make developers aware of the mistakes they have made, and we will always need a way of switching them off entirely (e.g. phpMyAdmin).
  
 ===== Proposed Voting Choices ===== ===== Proposed Voting Choices =====
rfc/is_trusted.1624276803.txt.gz · Last modified: 2021/06/21 12:00 by craigfrancis