rfc:is_trusted

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:is_trusted [2021/06/20 20:52] craigfrancisrfc:is_trusted [2021/06/21 19:36] (current) craigfrancis
Line 1: Line 1:
-====== PHP RFC: Is Trusted ======+====== PHP RFC: Is_Trusted ======
  
   * Version: 0.9   * Version: 0.9
Line 5: Line 5:
   * Updated: 2021-06-06   * Updated: 2021-06-06
   * Author: Craig Francis, craig#at#craigfrancis.co.uk   * Author: Craig Francis, craig#at#craigfrancis.co.uk
-  * Contributors: Joe Watkins, Dan Ackroyd, Máté Kocsis+  * Contributors: Joe Watkins, Máté Kocsis, Dan Ackroyd
   * Status: Under Discussion   * Status: Under Discussion
-  * First Published at: https://wiki.php.net/rfc/is_trusted+  * First Published at: https://wiki.php.net/rfc/is_literal
   * GitHub Repo: https://github.com/craigfrancis/php-is-literal-rfc   * GitHub Repo: https://github.com/craigfrancis/php-is-literal-rfc
  
Line 14: Line 14:
 Add the function //is_trusted()//, so developers can check if a variable can be trusted to not contain an Injection Vulnerability. Add the function //is_trusted()//, so developers can check if a variable can be trusted to not contain an Injection Vulnerability.
  
-By trusting integers and literals (strings defined by the programmer), we can provide a lightweight, simple, and very effective way to identify common Injection Vulnerabilities.+By trusting integers and literals (strings defined by the programmer), we can provide a lightweight, simple, and very effective way to prevent Injection Vulnerabilities.
  
-It avoids the "false sense of security" that comes with the flawed "Taint Checking" approach, [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/escaping.php?ts=4|because escaping is very difficult to get right]]. +It avoids the "false sense of security" that comes with the flawed "Taint Checking" approach, [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/escaping.php?ts=4|because escaping is very difficult to get right]]. It's much safer for developers to use parameterised queries, and well-tested libraries.
- +
-Developers should not escape anything themselves (or use anything like "magic quotes"); they should simply use parameterised queries, and/or well-tested libraries.+
  
 These libraries require certain sensitive values to only come from the developer; but because it's [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/mistakes.php?ts=4|easy to incorrectly include user values]], Injection Vulnerabilities are still introduced by the thousands of developers using these libraries incorrectly. You will notice the linked examples are based on examples found in the Libraries' official documentation, they still "work", and are typically shorter/easier than doing it correctly (I've found many of them on live websites, and it's why I'm here). A simple Query Builder example being: These libraries require certain sensitive values to only come from the developer; but because it's [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/mistakes.php?ts=4|easy to incorrectly include user values]], Injection Vulnerabilities are still introduced by the thousands of developers using these libraries incorrectly. You will notice the linked examples are based on examples found in the Libraries' official documentation, they still "work", and are typically shorter/easier than doing it correctly (I've found many of them on live websites, and it's why I'm here). A simple Query Builder example being:
Line 28: Line 26:
 </code> </code>
  
-The "Future Scope" section explains why a dedicated type should come later, and how native functions will be able to use the //trusted// flag as well.+The "Future Scope" section explains why a dedicated type should come later, and how native functions could use the //trusted// flag as well.
  
 ===== Background ===== ===== Background =====
Line 58: Line 56:
 ==== Usage in PHP ==== ==== Usage in PHP ====
  
-Libraries would be able to use //is_trusted()// immediately, allowing them to warn developers about Injection Issues as soon as they receive any non-trusted values, for example:+Libraries would be able to use //is_trusted()// immediately, allowing them to warn developers about Injection Issues as soon as they receive any non-trusted values. Some already plan to implement this, for example:
  
 **Propel** (Mark Scherer): "given that this would help to more safely work with user input, I think this syntax would really help in Propel." **Propel** (Mark Scherer): "given that this would help to more safely work with user input, I think this syntax would really help in Propel."
Line 68: Line 66:
 ===== Proposal ===== ===== Proposal =====
  
-Add the function //is_trusted()//, to check variables have only been created by integers and literals (strings defined by the programmer).+Add the function //is_trusted()//, where "trusted" is defined as: 
 + 
 +  - Strings defined by the programmer (literals), 
 +  - Strings defined by the engine (interned strings), 
 +  - Any integer. 
 + 
 +No input values are interned; instead an interned string includes: 
 + 
 +  - Strings defined by the programmer
 +  - Strings defined in the source code of php, 
 +  - Strings defined by the engine (either at compile or runtime), with known values. 
 + 
 +Any function or instruction that is aware of trusted variables shall produce a trusted string if all input would pass //is_trusted()//. This includes //sprintf()//, //str_repeat()//, //str_pad()//, //implode()//, //join()//, //array_pad()//, and //array_fill()//.
  
 <code php> <code php>
Line 97: Line 107:
 example(strtoupper($a)); // Exception thrown. example(strtoupper($a)); // Exception thrown.
 </code> </code>
- 
-Because so much existing code uses concatenation, and because it does not modify what the programmer has written, concatenated trusted values will keep the trusted flag. This includes the use of //str_repeat()//, //str_pad()//, //implode()//, //join()//, //array_pad()//, and //array_fill()//. 
- 
-And //sprintf()// will set the trusted flag, if provided with trusted values. 
  
 ===== Try it ===== ===== Try it =====
  
-[[https://3v4l.org/#focus=rfc.literals|Have a play with it on 3v4l.org]] - Note, the function is still named //is_literal()//.+[[https://3v4l.org/#focus=rfc.literals|Have a play with it on 3v4l.org]]
  
 [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4|How it can be used by libraries]] - Notice how this example library just raises a warning, to simply let the developer know about the issue, **without breaking anything**. And it provides an //"unsafe_value"// value-object to bypass the //is_trusted()// check, but none of the examples need to use it (can be useful as a temporary thing, but there are much safer/better solutions, which developers are/should already be using). [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4|How it can be used by libraries]] - Notice how this example library just raises a warning, to simply let the developer know about the issue, **without breaking anything**. And it provides an //"unsafe_value"// value-object to bypass the //is_trusted()// check, but none of the examples need to use it (can be useful as a temporary thing, but there are much safer/better solutions, which developers are/should already be using).
Line 112: Line 118:
 ==== Taint Checking ==== ==== Taint Checking ====
  
-**Taint checking is flawed, isn't this the same?** It is not the same.+**Taint checking is flawed, isn't this the same?**
  
-Taint Checking incorrectly assumes the output of an escaping function is "safe" for a particular context. While it sounds reasonable in theory, the operation of escaping functions, and the context for which their output is safe, is very hard to define. This leads to a feature that is both complex and unreliable.+It is not the same. Taint Checking incorrectly assumes the output of an escaping function is "safe" for a particular context. While it sounds reasonable in theory, the operation of escaping functions, and the context for which their output is safe, is very hard to define and led to a feature that is both complex and unreliable.
  
 <code php> <code php>
Line 122: Line 128:
 </code> </code>
  
-All three examples would be incorrectly considered "untainted". The first two need the values to be quoted. The third example, //htmlentities()// does not escape single quotes by default before PHP 8.1 ([[https://github.com/php/php-src/commit/50eca61f68815005f3b0f808578cc1ce3b4297f0|fixed]]), and it does not consider the issue of 'javascript:' URLs.+All three examples would be incorrectly considered "safe(untainted). The first two need the values to be quoted. The third example, //htmlentities()// does not escape single quotes by default before PHP 8.1 ([[https://github.com/php/php-src/commit/50eca61f68815005f3b0f808578cc1ce3b4297f0|fixed]]), and it does not consider the issue of 'javascript:' URLs.
  
 In comparison, //is_trusted()// doesn't have an equivalent of //untaint()//, or support escaping. Instead PHP will set the trusted flag, and as soon as the value has been manipulated or includes anything that is not trusted (e.g. user data), the trusted flag is lost. In comparison, //is_trusted()// doesn't have an equivalent of //untaint()//, or support escaping. Instead PHP will set the trusted flag, and as soon as the value has been manipulated or includes anything that is not trusted (e.g. user data), the trusted flag is lost.
  
-This allows libraries to use //is_trusted()// to identify when they are provided a sensitive value that should not include user input. Then it's up to the library to handle the escaping (if it's even needed). The "Future Scope" section notes how native functions will be able to use the trusted flag as well.+This allows libraries to use //is_trusted()// to check the sensitive values they receive from the developer. Then it's up to the library to handle the escaping (if it's even needed). The "Future Scope" section notes how native functions will be able to use the trusted flag as well.
  
 ==== Education ==== ==== Education ====
  
-**Why not educate everyone?** You can't - developer training simply does not scale, and mistakes still happen.+**Why not educate everyone?** 
 + 
 +You can't - developer training simply does not scale, and mistakes still happen.
  
 We cannot expect everyone to have formal training, know everything from day 1, and consider programming a full time job. We want new programmers, with a variety of experiences, ages, and backgrounds. Everyone should be guided to do the right thing, and notified as soon as they make a mistake (we all make mistakes). We also need to acknowledge that many programmers are busy, do copy/paste code, don't necessarily understand what it does, edit it for their needs, then simply move on to their next task. We cannot expect everyone to have formal training, know everything from day 1, and consider programming a full time job. We want new programmers, with a variety of experiences, ages, and backgrounds. Everyone should be guided to do the right thing, and notified as soon as they make a mistake (we all make mistakes). We also need to acknowledge that many programmers are busy, do copy/paste code, don't necessarily understand what it does, edit it for their needs, then simply move on to their next task.
Line 136: Line 144:
 ==== Static Analysis ==== ==== Static Analysis ====
  
-**Why not use static analysis?** It will never be used by most developers.+**Why not use static analysis?** 
 + 
 +It will never be used by most developers.
  
 I still agree with [[https://news-web.php.net/php.internals/109192|Tyson Andre]], you should use Static Analysis, but it's an extra step that most programmers cannot be bothered to do, especially those who are new to programming (its usage tends to be higher among those writing well-tested libraries). I still agree with [[https://news-web.php.net/php.internals/109192|Tyson Andre]], you should use Static Analysis, but it's an extra step that most programmers cannot be bothered to do, especially those who are new to programming (its usage tends to be higher among those writing well-tested libraries).
  
-Also, these tools currently focus on other issues (type checking, basic logic flaws, code formatting, etc), rarely attempting to address injection vulnerabilities. Those that do are [[https://github.com/vimeo/psalm/commit/2122e4a1756dac68a83ec3f5abfbc60331630781|often incomplete]], need sinks specified on all library methods (unlikely to happen), and are not enabled by default. For example, Psalm, even in its strictest errorLevel (1), and running //--taint-analysis// (I bet you don't use this), will not notice the missing quote marks in this SQL, and incorrectly assume it's safe:+Also, these tools currently focus on other issues (type checking, basic logic flaws, code formatting, etc), rarely attempting to address Injection Vulnerabilities. Those that do are [[https://github.com/vimeo/psalm/commit/2122e4a1756dac68a83ec3f5abfbc60331630781|often incomplete]], need sinks specified on all library methods (unlikely to happen), and are not enabled by default. For example, Psalm, even in its strictest errorLevel (1), and running //--taint-analysis// (rarely used), will not notice the missing quote marks in this SQL, and incorrectly assume it's safe:
  
 <code php> <code php>
Line 152: Line 162:
 ==== Performance ==== ==== Performance ====
  
-**What about the performance impact?** Máté Kocsis has created a [[https://github.com/kocsismate/php-version-benchmarks/|php benchmark]] to replicate the old [[https://01.org/node/3774|Intel Tests]], and the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/kocsismate.pdf|preliminary testing on this implementation]] has found a 0.124% performance hit for the Laravel Demo app, and 0.161% for Symfony (rounds 4-6, which involved 5000 requests). These tests do not connect to a database, as the variability introduced makes it impossible to measure the difference.+**What about the performance impact?** 
 + 
 +These stats from an early version of the implementation (new tests will be completed soon). 
 + 
 +Máté Kocsis has created a [[https://github.com/kocsismate/php-version-benchmarks/|php benchmark]] to replicate the old [[https://01.org/node/3774|Intel Tests]], and the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/kocsismate.pdf|preliminary testing on this implementation]] has found a 0.124% performance hit for the Laravel Demo app, and 0.161% for Symfony (rounds 4-6, which involved 5000 requests). These tests do not connect to a database, as the variability introduced makes it impossible to measure that low of a difference.
  
-There is a more severe 3.719% when running this [[https://github.com/kocsismate/php-version-benchmarks/blob/main/app/zend/concat.php#L25|concat test]], but this is not representative of a typical PHP script (it's not normal to concatenate 4 strings, 5 million times, with no other actions).+The full stress-test is 3.719% when running this [[https://github.com/kocsismate/php-version-benchmarks/blob/main/app/zend/concat.php#L25|concat test]], but this is not representative of a typical PHP script (it's not normal to concatenate 4 strings, 5 million times, with no other actions).
  
 Joe Watkins has also noted that further optimisations are possible (the implementation has focused on making it work). Joe Watkins has also noted that further optimisations are possible (the implementation has focused on making it work).
Line 164: Line 178:
 Yes. The trusted flag is preserved when two trusted values are concatenated; this makes it easier to use //is_trusted()//, especially by developers that use concatenation for their SQL/HTML/CLI/etc. Yes. The trusted flag is preserved when two trusted values are concatenated; this makes it easier to use //is_trusted()//, especially by developers that use concatenation for their SQL/HTML/CLI/etc.
  
-Previously we tried a version that only supported concatenation at compile-time (not run-time), to see if it would reduce the performance impact even further, and doing so might help developers with debugging. The idea was to require everyone to use special //trusted_concat()// and //trusted_implode()// functions, which would raise exceptions to highlight where mistakes were made. These two functions can still be implemented by developers themselves (see [[#support_functions|Support Functions]] below), as they can be useful; but requiring everyone to use them would have required big changes to existing projects, and exceptions are not a graceful way of handling mistakes.+Previously we tried a version that only supported concatenation at compile-time (not run-time), to see if it would reduce the performance impact even further. The idea was to require everyone to use special //trusted_concat()// and //trusted_implode()// functions, which would raise exceptions to highlight where mistakes were made. These two functions can still be implemented by developers themselves (see [[#support_functions|Support Functions]] below), as they can be useful; but requiring everyone to use them would have required big changes to existing projects, and exceptions are not a graceful way of handling mistakes.
  
 Performance wise, my [[https://github.com/craigfrancis/php-is-literal-rfc/tree/main/tests|simplistic testing]] found there was still [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/local.pdf|a small impact without run-time concat]]: Performance wise, my [[https://github.com/craigfrancis/php-is-literal-rfc/tree/main/tests|simplistic testing]] found there was still [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/tests/results/with-concat/local.pdf|a small impact without run-time concat]]:
Line 180: Line 194:
 ==== String Splitting ==== ==== String Splitting ====
  
-**Why don't you support string splitting?** In short, we can't find any use cases (security features should try to keep the implementation as simple as possible).+**Why don't you support string splitting?**
  
-Also, the security considerations are different. Concatenation joins known/fixed units together, whereas if you're starting with a trusted string, and the program allows the evil user to split the string (e.g. setting the length in substr), then they get considerable control over the result (it creates an untrusted modification).+In short, we can't find any real use cases (security features should try to keep the implementation as simple as possible). 
 + 
 +Also, the security considerations are different. Concatenation joins known/fixed units together, whereas if you're starting with a trusted string, and the program allows the Evil-User to split the string (e.g. setting the length in substr), then they get considerable control over the result (it creates an untrusted modification).
  
 These are unlikely to be written by a programmer, but consider these: These are unlikely to be written by a programmer, but consider these:
Line 200: Line 216:
 ==== WHERE IN ==== ==== WHERE IN ====
  
-**What about an undefined number of parameters, e.g. //WHERE id IN (?, ?, ?)//?** You should already be following the advice from [[https://stackoverflow.com/a/23641033/538216|Levi Morrison]], [[https://www.php.net/manual/en/pdostatement.execute.php#example-1012|PDO Execute]], and [[https://www.drupal.org/docs/7/security/writing-secure-code/database-access#s-multiple-arguments|Drupal Multiple Arguments]]:+**What about an undefined number of parameters, e.g. //WHERE id IN (?, ?, ?)//?** 
 + 
 +If the values are explicitly cast to integers, there is no need to make a change. 
 + 
 +That said, ideally you would still follow the advice from [[https://stackoverflow.com/a/23641033/538216|Levi Morrison]], [[https://www.php.net/manual/en/pdostatement.execute.php#example-1012|PDO Execute]], and [[https://www.drupal.org/docs/7/security/writing-secure-code/database-access#s-multiple-arguments|Drupal Multiple Arguments]], and implement as such:
  
 <code php> <code php>
Line 214: Line 234:
 } }
 </code> </code>
- 
-This pushes everyone to use parameters properly; rather than using implode() on user values, and including them directly in the SQL (which is easy to get wrong). That said, if the //$ids// are integers (a common practice) these will still be trusted without the developer needing to make any changes. 
  
 ==== Non-Parameterised Values ==== ==== Non-Parameterised Values ====
  
-**How can this work with Table and Field names in SQL, which cannot use parameters?** They are often in variables written as literal strings anyway (so no changes needed); and if they are dependent on user input, in most cases you can (and should) use literals:+**How can this work with Table and Field names in SQL, which cannot use parameters?** 
 + 
 +They are often in variables written as literal strings anyway (so no changes needed); and if they are dependent on user input, in most cases you can (and should) use literals:
  
 <code php> <code php>
Line 239: Line 259:
 **How does this work in cases where you can't use trusted values?** **How does this work in cases where you can't use trusted values?**
  
-For example [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]] noted that some Systems/Frameworks currently define some variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml).+For example [[https://news-web.php.net/php.internals/87667|Dennis Birkholz]] noted that some Systems/Frameworks currently define some variables (e.g. table name prefixes) without the use of a literal (e.g. ini/json/yaml). And Larry Garfield noted that in Drupal's ORM "the table name itself is user-defined" (not in the PHP script).
  
-And Larry Garfield noted that in Drupal's ORM "the table name itself is user-defined" (not in the PHP script). +While most systems can use trusted values entirely, these special non-trusted values should still be handled separately (and carefully). This approach allows the library to ensure the majority of the input (SQL) is trusted, and then it can consistently check/escape those special values (e.g. does it match a valid table/field name, which can be included safely).
- +
-While most systems can use trusted values entirely, these special non-trusted values should still be handled separately (and carefully). This approach allows the library to ensure the majority of the input (SQL) is trusted, then it can consistently check/escape those special values (e.g. does it match a valid table/field name, which can be included safely).+
  
 [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L194|How this can be done with aliases]], or the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]]. [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L194|How this can be done with aliases]], or the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]].
Line 249: Line 267:
 ==== Usage by Libraries ==== ==== Usage by Libraries ====
  
-**Could libraries use is_trusted() internally?** Yes, they could.+**Could libraries use is_trusted() internally?**
  
-It would be fantastic if they did use additional //is_trusted()// checks after receiving the values from developers (it ensures they haven't introduced a vulnerability either); but this isn't a priority, simply because libraries are rarely the source of Injection Vulnerabilities.+Yes, they could.
  
-That said, consider the Drupalgeddon vulnerability; where //$db->expandArguments()// allowed unsafe/non-trusted values to be used as placeholders with //IN (:arg_0, :arg_1)//. By using something like the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]], //is_trusted()// would have been used to check the raw SQL, and the field/parameter names (which are not trusted in this case) get checked and appended separately.+It would be fantastic if they did use additional //is_trusted()// checks after receiving the values from developers (it ensures the library hasn't introduced a vulnerability either); but this isn't a priority, simply because libraries are rarely the source of Injection Vulnerabilities. 
 + 
 +That said, consider the Drupalgeddon vulnerability; where //$db->expandArguments()// allowed unsafe/non-trusted values to be used as placeholders with //IN (:arg_0, :arg_1)//. By using something like the [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/example.php?ts=4#L229|example Query Builder]], //is_trusted()// could have been used to check the raw SQL, and then the non-trusted field/parameter names would be checked and applied separately.
  
 Zend also had a couple of issues with ORDER BY, where it didn't check the inputs either ([[https://framework.zend.com/security/advisory/ZF2014-04|1]]/[[https://framework.zend.com/security/advisory/ZF2016-03|2]]). Zend also had a couple of issues with ORDER BY, where it didn't check the inputs either ([[https://framework.zend.com/security/advisory/ZF2014-04|1]]/[[https://framework.zend.com/security/advisory/ZF2016-03|2]]).
Line 259: Line 279:
 ==== Integer Values ==== ==== Integer Values ====
  
-**Can you support Integer values?** Yes, support for integers is now included.+**Can you support Integer values?**
  
-It was noted by Matthew Brown (and others) that a lot of existing code and tutorials uses integers directlyand doing so does not cause a security issue.+Yessupport for integers is now included.
  
-While we would like to only trust integers from developersit's [[https://news-web.php.net/php.internals/114964|not possible to add trusted flag on integers]] (well, not without some big changes).+It was noted by Matthew Brown (and others) that a lot of existing code and tutorials uses integers directlyand they do not cause security issue.
  
-That said, we can still trust all integers (in regards to Injection Vulnerabilities)no matter where they came fromWe have not found any issues with allowing integers in SQLHTML, CLI; and other contexts as well (like preg, mail additional_params, XPath query, and even eval).+We tried to flag integers defined in the source codein the same way we are doing with stringsUnfortunately [[https://news-web.php.net/php.internals/114964|it would require a big change to add a trusted flag on integers]]. Changing how integers work internally would have made a big performance impact, and potentially affected every part of PHP (including extensions).
  
-We could not find any character encoding issues; the closest we could find was EBCDICan old IBM character encoding, which encoded the 0-9 characters differently, but anyone using it would be dealing with that re-encoding issue separately, and [[https://www.php.net/manual/en/migration80.other-changes.php#migration80.other-changes.ebcdic|EBCDIC is not supported by PHP]].+That said, while it'not as philosophically pure, we can still trust all integers in regards to Injection Vulnerabilitiesno matter where they came from.
  
-We also checked what would happen if a 64bit PHP server sent a large number to a 32bit database, but that'not an issue eitherbecause the number is being encoded as characters in a stringso that's also fine.+We have not found any issues with allowing integers in SQLHTML, CLI; and other contexts as well (e.g. preg, mail additional_params, XPath queryand even eval).
  
-While it would be ideologically pure to only accept integers from the developer, allowing all integers helps adoption, and we cannot find single way that integers can cause an Injection Vulnerability.+We could not find any character encoding issues (The closest we could find was EBCDIC, an old IBM character encoding, which encodes the 0-9 characters differently; which anyone using it would need to re-encode either way, and [[https://www.php.net/manual/en/migration80.other-changes.php#migration80.other-changes.ebcdic|EBCDIC is not supported by PHP]]). 
 + 
 +We also checked what would happen if 64bit PHP server sent a large number to a 32bit database, but that's not an issue either, because the number is being encoded as characters in a string, so that's also fine.
  
 ==== Other Values ==== ==== Other Values ====
  
-**Why don't you support Boolean/Float values?** It's a very low value feature, and we cannot be sure of the security implications.+**Why don't you support Boolean/Float values?** 
 + 
 +It's a very low value feature, and we cannot be sure of the security implications.
  
 For example, the value you put in often is not always the same as what you get out: For example, the value you put in often is not always the same as what you get out:
Line 296: Line 320:
 First, there is no perfect name. First, there is no perfect name.
  
-We did start with //is_literal()// as a placeholder name (at a time we only trusted literals). This name wasn't perfect, but it would have allowed developers to search and get an idea of what a literal was. When [[#integer_values|integer values]] were deemed necessary to help adoption, the name became more of a problem. We also need to keep to a single word name (to support a dedicated type in the future). This is where //is_trusted()// and //is_known()// was proposed. While we preferred //is_known()//, it turned out to be too vague, and a [[https://strawpoll.com/bd2qed2xs/r|vote on the name]] gave use a 18 to 3 result in favour of //is_trusted()//.+We did start with //is_literal()// as a placeholder name (at a time we only trusted literals). This name wasn't perfect, but it would have allowed developers to search and get an idea of what a literal was. When [[#integer_values|integer values]] were deemed necessary to help adoption, the name became more of a problem. We also need to keep to a single word name (to support a dedicated type in the future). This is where //is_trusted()// and //is_known()// was proposed. We had a [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/name/2021-07-20.png|vote on the name]], which gave us a 18 to 3 result in favour of //is_trusted()//.
  
-Alternative suggestions have included //is_from_literal()// from [[https://news-web.php.net/php.internals/109197|Jakob Givoni]], //is_hard_coded()// from [[https://news-web.php.net/php.internals/114910|Dik Takken]], and references to alternative implementations like "compile time constants" and "code string".+==== Support Functions ====
  
-We cannot call it //is_safe()//, because we cannot say anything is safe: +**What about other support functions?**
- +
-<code php> +
-$cli = 'rm -rf ?'; +
-$sql = 'DELETE FROM my_table WHERE my_date >= ?'; +
-eval('$name = "' . $_GET['name'] . '";'); // INSECURE +
-</code> +
- +
-While the first two examples cannot include Injection Vulnerabilities, the parameters could be set to "/" or "0000-00-00" (providing a nice vanishing magic trick); and the last one, well, they have much bigger issues to worry about (it's clearly irresponsible, and intentionally dangerous). +
- +
-==== Support Functions ====+
  
-**What about other support functions?** We did consider //trusted_concat()// and //trusted_implode()// functions (see [[#string_concatenation|String Concatenation]] above), but these can be userland functions:+We did consider //trusted_concat()// and //trusted_implode()// functions (see [[#string_concatenation|String Concatenation]] above), but these can be userland functions:
  
 <code php> <code php>
Line 352: Line 366:
 ==== Other Functions ==== ==== Other Functions ====
  
-**Why not support other string functions?** Like [[#string_splitting|String Splitting]], we can't find any use cases, and don't want to make this complicated. For example //strtoupper()// might be reasonable, but we will need to consider how it would be used (good and bad), and check for any oddities (e.g. output varying based on the current locale). Also, functions like //str_shuffle()// create unpredictable results.+**Why not support other string functions?** 
 + 
 +Like [[#string_splitting|String Splitting]], we can't find any real use cases, and don't want to make this complicated. For example //strtoupper()// might be reasonable, but we will need to consider how it would be used (good and bad), and check for any oddities (e.g. output varying based on the current locale). Also, functions like //str_shuffle()// create unpredictable results
 + 
 +==== Limitations ==== 
 + 
 +**Does this mean the value is completely safe?** 
 + 
 +While these values can be trusted to not contain an Injection Vulnerability, they cannot be completely safe from every kind of issue, For example: 
 + 
 +<code php> 
 +$cli = 'rm -rf ?'; // RISKY 
 +$sql = 'DELETE FROM my_table WHERE my_date >= ?'; // RISKY 
 +</code> 
 + 
 +The parameters could be set to "/" or "0000-00-00", which can result in deleting a lot more data than expected. 
 + 
 +There's no single RFC that can completely solve all developer errors, but this takes one of the biggest ones off the table.
  
 ==== Faking it ==== ==== Faking it ====
Line 360: Line 391:
 This implementation does not provide a way for a developer to mark anything they want as trusted. This is on purpose. We do not want to recreate the biggest flaw of Taint Checking. It would be very easy for a naive developer to mark all escaped values as trusted ([[#taint_checking|wrong]]). This implementation does not provide a way for a developer to mark anything they want as trusted. This is on purpose. We do not want to recreate the biggest flaw of Taint Checking. It would be very easy for a naive developer to mark all escaped values as trusted ([[#taint_checking|wrong]]).
  
-That said, we do not pretend there aren't ways around this (e.g. using var_export), but doing so is clearly the developer doing something wrong. We want to provide safety rails, there is nothing stopping the developer from jumping over them.+That said, we do not pretend there aren't ways around this (e.g. using [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/justification/is-trusted-bypass.php|var_export]]), but doing so is clearly the developer doing something wrong. We want to provide safety rails, but there is nothing stopping the developer from jumping over them if that's their choice.
  
 ==== Extensions ==== ==== Extensions ====
  
-**Extensions create and manipulate strings, won't this break the flag on strings?** Strings have multiple flags already, and are off by defaultthis is the correct behaviour when extensions create their own strings (should not be considered trusted). If an extension is found to be changing the trusted flag incorrectly (unlikely), that's the same as any new flag being introduced, and will need to be fixed in the same way.+**Extensions create and manipulate strings, won't this break the flag on strings?** 
 + 
 +Strings have multiple flags already that are off by default this is the correct behaviour when extensions create their own strings (should not be flagged as trusted). If an extension is found to be already using the flag we're using for is_trusted (unlikely), that's the same as any new flag being introduced into PHP, and will need to be updated in the same way.
  
 ==== Reflection API ==== ==== Reflection API ====
  
-**Why don't you use the reflection API?** It currently allows you to "introspect classes, interfaces, functions, methods and extensions"; it's not currently set up for object methods to inspect the code calling it. Even if that was to be added (unlikely), it could only check if the trusted value was defined there, it couldn't handle variables (tracking back to their source), nor could it provide any future scope for a dedicated type, or for native functions to check (see "Future Scope").+**Why don't you use the Reflection API?** 
 + 
 +This allows you to "introspect classes, interfaces, functions, methods and extensions"; it's not currently set up for object methods to inspect the code calling it. Even if that was to be added (unlikely), it could only check if the trusted value was defined there, it couldn't handle variables (tracking back to their source), nor could it provide any future scope for a dedicated type, nor could native functions work with this (see "Future Scope").
  
 ==== Interned Strings ==== ==== Interned Strings ====
  
-**The output from //chr()// can sometimes appear as trusted?** This was noticed by Claude Pache, and is due to the [[https://news-web.php.net/php.internals/114877|use of Interned Strings]], an optimisation used by //RETURN_CHAR// that can re-use single character values. While we could change this, doing so will have a small performance impact, which isn't really worth it (it doesn't really cause an issue).+**Why does the output from //chr()// appear as trusted?** 
 + 
 +This was noticed by Claude Pache, and on a technical level is due to the [[https://news-web.php.net/php.internals/114877|use of Interned Strings]], an optimisation used by //RETURN_CHAR// that re-uses single character values. It's effectively the same as calling //sprintf('%c', $i)//, which is also not an issue, as the developer is choosing to do this.
  
 ===== Previous Work ===== ===== Previous Work =====
Line 421: Line 458:
 ===== Open Issues ===== ===== Open Issues =====
  
-  - Supporting Integers/Interned values. +None
-  - The name. +
- +
-===== Unaffected PHP Functionality ===== +
- +
-None known+
  
 ===== Future Scope ===== ===== Future Scope =====
Line 434: Line 466:
 2) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc). 2) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc).
  
-First we need libraries to start using //is_trusted()// to check their inputs. The library then do their thing, and apply the appropriate escaping, which can result in a value that no longer has the trusted flag set, but should still be trusted.+However, first we need libraries to start using //is_trusted()// to check their inputs. The library can then do their thing, and apply the appropriate escaping, which can result in a value that no longer has the trusted flag set, but should still be trusted.
  
-With a future RFC, we can introduce checks for the native functions. By using the [[https://web.dev/trusted-types/|Trusted Types]] concept from JavaScript (which protects [[https://www.youtube.com/watch?v=po6GumtHRmU&t=92s|60+ Injection Sinks]], like innerHTML), the libraries create a stringable object as their output. These objects can then be added to a list of trusted objects for the relevant native functions. The native functions can then **warn** developers when they do not receive a value with the trusted flag, or one of these trusted objects. These warnings would not **break anything**, they just make developers aware of the mistakes they have made, and will always need a way of switching them off entirely (e.g. phpMyAdmin).+With a future RFC, we could potentially introduce checks for the native functions. For example, if we use the [[https://web.dev/trusted-types/|Trusted Types]] concept from JavaScript (which protects [[https://www.youtube.com/watch?v=po6GumtHRmU&t=92s|60+ Injection Sinks]], like innerHTML), the libraries create a stringable object as their output. These objects can be added to a list of trusted objects for the relevant native functions. The native functions could then **warn** developers when they do not receive a value with the trusted flag, or one of the trusted objects. These warnings would **not break anything**, they just make developers aware of the mistakes they have made, and we will always need a way of switching them off entirely (e.g. phpMyAdmin).
  
 ===== Proposed Voting Choices ===== ===== Proposed Voting Choices =====
rfc/is_trusted.1624222357.txt.gz · Last modified: 2021/06/20 20:52 by craigfrancis