rfc:literal_string

rfc:literal_string [2023/04/01 09:05] – Text tweaks, and note how JS Templates create XSS issues craigfrancis
rfc:literal_string [2023/04/20 12:18] (current) – Notes on integer/string-concat in Python/Go | Note XHP for templating | Note use of eval | Note future scope for LiteralInteger craigfrancis
Line 143: Line 143:
 }
 </code>
 +
 +[[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/others/python/integers.py|Python]] and [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/others/go/integers.go|Go]] support string concatenation as well.
  
 (On a technical note, we did test an implementation that didn't support concatenation, primarily to see if this would help reduce the performance impact even further. However, the PHP engine can sometimes still concatenate values automatically at compile-time (so concatenation appears to work in some contexts), and it didn't make much (if any) difference in regards to performance, because //concat_function()// in "zend_operators.c" uses //zend_string_extend()// (which needs to remove the //LiteralString// flag) and "zend_vm_def.h" does the same; by supporting a quick concat with an empty string (x2), which would need its flag removed as well).
Line 303: Line 305:
  
 Due to this limitation, we did consider an approach to trust all integers, where Scott Arciszewski suggested the name //is_noble()//. While this is not as philosophically pure, we continued to explore this possibility because we could not find any way an Injection Vulnerability could be introduced with integers in SQL, HTML, CLI; and other contexts as well (e.g. preg, mail additional_params, XPath query, and even eval). We could not find any character encoding issues either (The closest we could find was EBCDIC, an old IBM character encoding, which encodes the 0-9 characters differently; which anyone using it would need to re-encode either way, and [[https://www.php.net/manual/en/migration80.other-changes.php#migration80.other-changes.ebcdic|EBCDIC is not supported by PHP]]). And we could not find any issue with a 64bit PHP server sending a large number to a 32bit database, because the number is being encoded as characters in a string (so that's also fine). However, the feedback received was that while safe from Injection Vulnerabilities, it becomes a more complex concept, one that might cause programmers to assume it is also safe from programmer/logic errors. Ultimately the preference was the simpler approach, that did not allow any integers (which is reinforced with the name LiteralString).
 +
 +[[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/others/python/integers.py|Python]] and [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/others/go/integers.go|Go]] do not support integers either.
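
The integer reasoning above can be illustrated with a small sketch: when PHP converts an integer to a string, the result only ever contains an optional minus sign followed by the digits 0-9, so there is no character that could act as SQL, HTML or CLI syntax. The helper name //sketch_int_is_plain_digits()// is hypothetical and exists only for this illustration; it is not part of the RFC or its implementation.

<code php>
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the RFC): checks that the string form
// of an integer is nothing but an optional minus sign and ASCII digits.
function sketch_int_is_plain_digits(int $value): bool
{
    return preg_match('/^-?[0-9]+\z/', (string) $value) === 1;
}

// The extremes of the platform's integer range, plus some typical values.
foreach ([PHP_INT_MIN, -1, 0, 42, PHP_INT_MAX] as $n) {
    var_dump(sketch_int_is_plain_digits($n)); // bool(true) for each value
}

// The number is sent as characters, so concatenating it into SQL cannot
// introduce quotes, spaces, or any other syntax.
$sql = 'SELECT * FROM user WHERE id = ' . 123;
var_dump($sql === 'SELECT * FROM user WHERE id = 123'); // bool(true)
</code>

This only demonstrates the absence of injection characters. It says nothing about logic errors (for example, using the wrong ID), which is the concern that led to integers being excluded.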
  
 ==== FAQ: Other Values ====
Line 455: Line 459:
  
 <code javascript>
-p.innerHTML = `Hi ${name}`;
+p.innerHTML = `Hi ${name}`; // INSECURE
 </code>
  
Line 465: Line 469:
  
 [[https://github.com/craigfrancis/php-is-literal-rfc/blob/main/alternatives/tagged-templates.php|Example]] / [[https://github.com/craigfrancis/php-is-literal-rfc/commit/1dc5f4fb425009d03a640036a1022f88c4a0533d?diff=unified|Diff]]
 +
 +[[https://docs.hhvm.com/hack/XHP/introduction|XHP]] in Hack / HHVM is similar, where it introduces an XML-like syntax that can be used for HTML templating.
  
 ==== Macros ====
Line 581: Line 587:
 ===== Open Issues =====
  
-None
+Additional testing of the final implementation; including extensions like [[https://www.swoole.com/|Swoole]] or [[https://openswoole.com/|OpenSwoole]].
 + 
 +Should //eval()// be unable to create a LiteralString, or is this too similar to:
 + 
 +<code php> 
 +$id = ($_GET['id'] ?? NULL); 
 +$file = tempnam(sys_get_temp_dir(), 'literal-string'); 
 +file_put_contents($file, '<'.'?php return '.var_export(strval($id),true).';'); 
 +$id = require($file); 
 +unlink($file); 
 +</code>
  
 ===== Future Scope =====
Line 587: Line 603:
 1) We might re-look at //sprintf()// being able to return a LiteralString.
  
-2) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc).
+2) We might re-look at //LiteralInteger//. While this is unlikely, as it would change the zval structure, it might be possible if there is enough demand. It would also need a discussion on what happens with other operations, e.g. integer addition.
 + 
 +3) As noted by MarkR, the biggest benefit will come when this flag can be used by PDO and similar functions (//mysqli_query//, //preg_match//, //exec//, etc).
  
 However, first we need libraries to start checking that the relevant inputs are a LiteralString. The library can then do its work and apply the appropriate escaping, which can result in a value that no longer has the LiteralString flag set, but is perfectly safe for the native functions.
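
A minimal sketch of that first step is shown here, under stated assumptions. It assumes the proposed check is exposed as a function named //is_literal_string()//; that name is an assumption, and because no such function exists in released PHP, the sketch falls back to //is_string()// (which gives no protection and only keeps the example runnable). The names //sketch_is_literal()// and //sketch_order_by()// are hypothetical.

<code php>
<?php
declare(strict_types=1);

// Assumption: the proposed check is exposed as is_literal_string(). It is
// not in released PHP, so fall back to is_string(), which only keeps this
// sketch runnable and does not detect user-supplied strings.
function sketch_is_literal(mixed $value): bool
{
    return function_exists('is_literal_string')
        ? is_literal_string($value)
        : is_string($value);
}

// Hypothetical library function: the SQL must be written by the programmer,
// while the field name is a value the library escapes itself.
function sketch_order_by(string $sql, string $field): string
{
    if (!sketch_is_literal($sql)) {
        throw new InvalidArgumentException('SQL must be a literal string');
    }

    // Escape as a MySQL-style identifier: double any backtick, then wrap.
    $identifier = '`' . str_replace('`', '``', $field) . '`';

    // The result would no longer carry the LiteralString flag, but it is
    // safe for the native function because the library did the escaping.
    return $sql . ' ORDER BY ' . $identifier;
}

echo sketch_order_by('SELECT * FROM user', 'na`me'), "\n";
// SELECT * FROM user ORDER BY `na``me`
</code>

The design point is that the check happens once, at the library boundary, on the part the programmer writes; the native function then receives the combined string without needing to know about the flag.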
rfc/literal_string.1680339913.txt.gz · Last modified: 2023/04/01 09:05 by craigfrancis