rfc:chaining_comparison

Differences

This shows you the differences between two versions of the page.

rfc:chaining_comparison [2016/12/15 17:15] bp1222rfc:chaining_comparison [2021/03/27 14:58] (current) – Move to inactive ilutov
Line 4: Line 4:
   * Author: David Walker (dave@mudsite.com)   * Author: David Walker (dave@mudsite.com)
   * Author: Richard Fussenegger (php@fleshgrinder.com)   * Author: Richard Fussenegger (php@fleshgrinder.com)
-  * Status: Draft+  * Status: Inactive
   * First Published at: http://wiki.php.net/rfc/chaining_comparison   * First Published at: http://wiki.php.net/rfc/chaining_comparison
  
 ===== Introduction ===== ===== Introduction =====
-This RFC proposes a syntax change to allow the chaining together of comparison and equality operations ''[==, !=, !==, ===, <, <=, >, >=]'' to allow arbitrary comparisons.  The initial request that spawned this RFC[1] was initially only for interval checking.  Discussion on the thread expanded the scope of the request to go from strictly interval checking to allowing more arbitrary number of comparisons.  It evolved from there to expand to a majority of the comparison operations.+This RFC proposes a syntax change to allow arbitrary chaining together of comparison and equality operations ''[==, !=, !==, ===, <, <=, >, >=]''.  The initial request that spawned this RFC[1] was only for interval checking.  Discussion on the thread expanded the scope of the request from strictly interval checking to allowing an arbitrary number of comparisons.  It evolved from there to cover a majority of the comparison operations.  The primary benefit of this proposal would be more readable code when doing numerous comparisons against a single variable.
  
 <file php> <file php>
Line 22: Line 22:
  
 /* /*
- * To allow this to be functionally the same+ * To be functionally equivalent to this syntax
  */  */
 if (0 < $a < 100) { if (0 < $a < 100) {
Line 31: Line 31:
 ===== Proposal ===== ===== Proposal =====
 Proposals herein will contain a dump of relevant AST (php-ast) nodes and OPCodes (vld) to better visualize the compilation, and execution. Proposals herein will contain a dump of relevant AST (php-ast) nodes and OPCodes (vld) to better visualize the compilation, and execution.
 +
 ==== Comparison Chaining ==== ==== Comparison Chaining ====
-The proposal creates a new AST operation type ''ZEND_AST_COMPARE_OP'' which will be compiled in a left-precedence required (see: [[https://wiki.php.net/rfc/chaining_comparison#should_we_allow_user-defined_right_recursion|open issues]]) manner.  In doing this compilation we ensure shortcutting of righter operations if the left sides have evaluated to false.  To accomplish this we introduce a new means of emitting an operation, by noting where a ''JMPZ_EX'' may need to exist (see implementations for ''zend_emit_op_at'').  This will shift operations that may have been emitted by compiling the right side of this AST compare to allow jumping over them if the left side of the operation is evaluated to false.  I believe this means is necessary because we can't just shortcut if the left operation is false, ''false < $a++'' should still evaluate the right part of the expression.  We should only inject the JMPZ_EX ops, IF, the left child is a chained ''ZEND_AST_COMPARE_OP''.  The proposal also changes the associativity of the equality, and comparison, operations to being left associative.+The proposal creates a new AST operation type ''ZEND_AST_COMPARE_OP'' which will be compiled in a left-recursive manner.  
  
 <file php> <file php>
Line 66: Line 67:
     2        INIT_FCALL                                               'var_dump'     2        INIT_FCALL                                               'var_dump'
     3        IS_SMALLER                                       ~4      !0, 5     3        IS_SMALLER                                       ~4      !0, 5
-    4      > JMPZ_EX                                          ~     ~4, ->7+    4      > JMPZ_EX                                          ~     ~4, ->7
     5    >   POST_INC                                         ~5      !1     5    >   POST_INC                                         ~5      !1
-    6    >   IS_SMALLER                                       ~     ~4, ~5 +    6    >   IS_SMALLER                                       ~     ~4, ~5 
-    7    >   SEND_VAL                                                 ~6+    7    >   SEND_VAL                                                 ~4
     8        DO_ICALL                                                      8        DO_ICALL                                                 
  */  */
Line 111: Line 112:
  *  *
           INIT_FCALL                                               'var_dump'           INIT_FCALL                                               'var_dump'
-          IS_EQUAL                                         ~     !0, 1 +          IS_EQUAL                                         ~     !0, 1 
-        > JMPZ_EX                                          ~     ~4, ->6 +        > JMPZ_EX                                          ~     ~2, ->6 
-      >   IS_IDENTICAL                                     ~     ~4, <true> +      >   IS_IDENTICAL                                     ~     ~2, <true> 
-      >   SEND_VAL                                                 ~5+      >   SEND_VAL                                                 ~2
  */  */
 </file> </file>
 +
 +==== False Short Circuiting ====
 +In doing this compilation we can ensure short-circuiting of operations further to the right if the left sides have evaluated to false.  To accomplish this we introduce a new means of emitting an operation, by noting where a ''JMPZ_EX'' may need to exist (see implementations for ''zend_emit_op_at'').  This will shift operations that may have been emitted by compiling the right side of this AST compare to allow jumping over them if the left side of the operation is evaluated to false.  I believe this is necessary because we can't just short-circuit whenever the left operand is false: ''false < $a++'' should still evaluate the right part of the expression.  We should only inject the ''JMPZ_EX'' ops if the left child is a chained ''ZEND_AST_COMPARE_OP''.  The proposal also changes the associativity of the equality and comparison operations to being left associative.
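
A rough userland sketch of the evaluation order described above, not the engine implementation.  The helper ''chained_less_than()'' and its closure-based operands are hypothetical and exist only to illustrate the intent: each operand is evaluated once, left to right, and operands to the right of a failed comparison are skipped.

<file php>
<?php
/**
 * Hypothetical helper (not part of the RFC or of PHP): emulates how a chain
 * such as `a < b < c` would be evaluated under the proposal. Operands are
 * passed as closures so they are only evaluated when the chain reaches them.
 */
function chained_less_than(callable ...$operands): bool
{
    if (count($operands) < 2) {
        throw new InvalidArgumentException('A chain needs at least two operands.');
    }

    // The first comparison always evaluates both of its operands,
    // mirroring the `false < $a++` remark above.
    $left = array_shift($operands)();

    foreach ($operands as $operand) {
        $right = $operand();        // evaluated exactly once
        if (!($left < $right)) {
            return false;           // later operands are never evaluated
        }
        $left = $right;             // the right value becomes the next left value
    }

    return true;
}

// Emulates 0 < $a < 100
$a = 50;
var_dump(chained_less_than(
    function () { return 0; },
    function () use ($a) { return $a; },
    function () { return 100; }
)); // bool(true)

// Emulates 1 < 1 < (side effect): the third operand must not run
$calls = 0;
var_dump(chained_less_than(
    function () { return 1; },
    function () { return 1; },
    function () use (&$calls) { $calls++; return 2; }
)); // bool(false)
var_dump($calls); // int(0)
</file>

In the actual patch this skipping is done with a ''JMPZ_EX'' op rather than closures; the closures here only stand in for "code that may or may not be executed".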
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
-No BC Breaking changes expected (see: Open Issues)+BC Breaking changes expected depending on open-issue answers
  
 ===== Proposed PHP Version(s) ===== ===== Proposed PHP Version(s) =====
Line 126: Line 130:
 ===== RFC Impact ===== ===== RFC Impact =====
 ==== To Opcache ==== ==== To Opcache ====
-Yes, we're adding new JMPZ_EX codes when chaining to ensure false values correctly jump over any pre/post inc/dec ops from eval.+I'm unsure; we're emitting additional opcodes and/or changing the order of opcodes, but are not introducing any new opcode types
  
 ===== Open Issues ===== ===== Open Issues =====
Line 133: Line 137:
 ''1 < 2 == 3 < 4'' ''1 < 2 == 3 < 4''
  
-Why is this even a question, much less a challenging one?  Well, a seemingly majority of languages ''[C, C++, Java, Ruby, Perl]'' all would tell you that the expression would evaluate to true.  However some, ''[WolframAlpha, Python]'' would evaluate that expression to false.  Some, like ''[Numbers, LibreOffice]'' will raise a syntax error, or give awkward answers.  The question we have is which way should PHP go with the evaluation of this expression?  Clearly we can ascertain that the true-evaluating languages have the precedence of the less-than operator more important than that of the equality, so they check if true == true.  Whereas the false-evaluating languages treat comparisons and equality with the same precedence.  As such they compare 1 less than 2, 2 is-equal 3.  The latter group are apparently more strictly typed and won't compare bools to numbers, but even there we can see the precedence is equal, as it's comparing the result of the first expression into the next ''(1 < 2) == 3''+Why is this even a question, much less a challenging one?  Well, seemingly a majority of languages ''[C[2], C++[3], Java[4], Ruby[5], Perl[6]]'' would all tell you that the expression evaluates to true.  However some, like ''Python[7]'', would evaluate that expression to false.  Some, like ''[Numbers, LibreOffice]'', will raise a syntax error, or give awkward answers.  The question we have is which way should PHP go with the evaluation of this expression?  Clearly we can ascertain that the true-evaluating languages give the less-than operator a higher precedence than equality, so they check if true == true.  Whereas the false-evaluating languages treat comparisons and equality with the same precedence.  As such they compare 1 less than 2, 2 is-equal 3.  
The latter group are apparently more strictly typed and won't compare bools to numbers, but even there we can see the precedence is equal, as it's comparing the result of the first expression into the next ''(1 < 2) == 3''
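
To make the two readings concrete, here is a small sketch in current PHP with the grouping written out by hand.  It only illustrates the precedence question; the variable names are made up for the example, and the second reading is spelled out with ''&&'' because PHP has no chaining today.

<file php>
<?php
// Reading 1 (C-like): relational operators bind tighter than equality,
// so both sides are reduced to booleans first.
$c_like = (1 < 2) == (3 < 4);               // true == true

// Reading 2 (equal precedence, chained): every adjacent pair must hold.
$chained = (1 < 2) && (2 == 3) && (3 < 4);  // true && false && true

var_dump($c_like);          // bool(true)
var_dump($chained);         // bool(false)

// Unparenthesised, PHP currently follows reading 1.
var_dump(1 < 2 == 3 < 4);   // bool(true)
</file>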
  
-It is important to point out that the example syntax is currently valid in PHP 7.1.  PHP 7.1 currently has a C-like precedence where ''[<, <=, >, >=]'' are a higher precedence than ''[==, !=, ===, !==]'' Below are expressions and their return values in PHP 7.1, and with the two potential methods of evaluating that expression.+It is important to point out that the example syntax is currently valid in PHP 7.1.  PHP 7.1 currently has a C-like precedence where ''[<, <=, >, >=]'' are a higher precedence than ''[==, !=, ===, !==]''[8].  Below are expressions and their return values in PHP 7.1, and with the two potential methods of evaluating that expression.
  
 <file php> <file php>
Line 160: Line 164:
 </file> </file>
  
-====Should we allow user-defined right recursion?==== +==== Right Recursion ==== 
-Both proposed implementations currently, for non-equality, operations require a left-recursive chain.  In doing this, the right node of the left comparison, if evaluated to true, is returned up the tree for comparison.+Another syntax difference that could be BC problematic is with right-recursion of the chained expression.  Currently PHP will evaluate right recursive single expression comparisons.  The proposed feature would raise a compile time error doing this.  The question is: should it, or should we permit right-recursive chaining?  The test case we can look at:
  
-''1 < 2 < 3''+<file php> 
 +<?php 
 +var_dump(1 < (2 < 3)); 
 +var_dump(1 < 2 == 3); 
 +var_dump(1 < 2 == 3 == 4); 
 +var_dump(1 < 2 == (3 == 4)); 
 +</file>
  
-What I mean by this, for this example, the compiler would have the first compiled AST with a left side of another comparison-op, and the right side of 3.  So it'd recurse and evaluate the left child, being 1 < 2.  If the node can evaluate, and evaluate to true rather than the return result being ''true'' it would be the result of the right node, in this case 2.  So when the parent node compiles the left-node, rather than the bool true being there, it's that child's right node of 2.  And would then compare 2 < 3.  Being a non-child node it'll here set the result to true.+We will go over how PHP 7.1 currently would evaluate each, and then how a right-recursive chain would pan out. 
 +<file php> 
 +<?php 
 +var_dump(1 < (2 < 3)); 
 +/* 
 + * 1 < (2 < 3) := 1 < true := false 
 + */
  
-The question is should we allow users to define right recursion in the manner of+var_dump(1 < 2 == 3); 
 +/* 
 + * (1 < 2) == 3 := true == 3 := true 
 + */
  
-''1 < (3)''+var_dump(1 < 2 == 3 == 4);
 +/* 
 + * Parse Error, unexpected == 
 + */
  
-This would then instruct the compiler to have the 'top' node have a left side of 1, and a right side of a comparison-op.  Should we be evaluating this as 1 < true, or, allow right-side defined recursion and return the left node for comparison with a 1 < (result of expr) := 2?+var_dump(1 < 2 == (3 == 4)); 
 +/* 
 + * (1 < 2) == (3 == 4) := true == false := false 
 + */ 
 +</file>
  
-This is a question when it comes to personal preference, and the short circuiting of expressions.  For example:+Under the current (implemented) proposal these evaluate as follows.  You'll notice that we do permit right-recursion for equality operations.  This is due to the fact that equality operations will evaluate against boolean, or boolean-converted, values.  You don't really care what the left-node of the right-recursive side is; you only care whether the right side evaluates to true or not. 
 +<file php> 
 +<?php 
 +var_dump(1 < (2 < 3)); 
 +/* 
 + * Parse Error, No right recursion 
 + */
  
-''1 < 1 < $a++''+var_dump(1 < 2 == 3); 
 +/* 
 + * (1 < 2) == 3 := true == 3 := true 
 + */
  
-With the above expression, the ''$a++'' would never run, so after the line ''$a'' would not be altered.  However, we could allow right recursion with+var_dump(1 < 2 == 3 == 4); 
 +/* 
 + * ((1 < 2) == 3) == 4 := (true == 3) == 4 := true == 4 := true 
 + */
  
-''1 < (1 < $a++)''+var_dump(1 < 2 == (3 == 4)); 
 +/* 
 + * (1 < 2) == (3 == 4) := true == false := false 
 + */ 
 +</file>
  
-This writing would ensure ''$a++'' is evaluated in the chain of less than expressions.  However the above could easily be written with greater than expressions to prevent right-recursion+If however we permitted right recursive comparison operations we would evaluate as such: 
 +<file php> 
 +<?php 
 +var_dump(1 < (2 < 3)); 
 +/* 
 + * 1 < (2 < 3) := 1 && (2 < 3) && (1 < 2) := true && true && true := true 
 + */
  
-''$a++ > 1''+var_dump(1 < 2 == 3);
 +/* 
 + * (1 < 2) == 3 := true == 3 := true 
 + */
  
-The potentially weirder part of allowing right-recursion could be a syntax like this:+var_dump(1 < 2 == 3 == 4); 
 +/* 
 + * ((1 < 2) == 3) == 4 := (true == 3) == 4 := true == 4 := true 
 + */
  
-''1 < ($a++ < (3 < 2))''+var_dump(1 < 2 == (3 == 4)); 
 +/* 
 + * (1 < 2) == (3 == 4) := true == false := false 
 + */ 
 +</file>
  
-Since, the right-most node (3 < 2) would evaluate to false, should this in turn jump over the post-inc of $a++, since we know the expression in its entirety will be evaluated to false.  Currently both implementations I've worked on don't do anything with right-recursions, however we could employ the injection of an op before the compilation of the left node as well, in the event the right node ends up evaluating to false.+If the first example in this last one looks a little odd, it's because it is.  We do design for short-cutting of a long expression when a fault is found to prevent further execution, much like you have in ''if()'' statements.  However, we do process in a left-to-right manner.  So the first thing would require us to ensure the left most side evaluates to true, and if it wasn't ''1'' but rather ''$a++'', we'd want to ensure that the left node's potential opcodes execute before comparing the right hand side.  Since we are chaining, we'd want to evaluate the right, then return the left node of it to be evaluated against the top's left node.  This odd syntax is why we didn't implement a right-recursive chaining of comparison operations. 
 + 
 +Although allowing right-recursion of equality operations does itself introduce some slightly odd syntax like: 
 +<file php> 
 +<?php 
 +/* 
 + * Right chained comparison syntax 
 + */ 
 +var_dump(1 < (2 == 2)); // bool(false) 
 + 
 +/* 
 + * Is Functionally identical to PHP 7.1's allowed syntax 
 + */ 
 +var_dump(1 < (2 <= 2)); // bool(false) 
 +</file> 
 +Since we don't chain together the right/left node of an equality operator, this is functionally identical to PHP 7.1's allowed syntax.  We could, for equality operations, denote if they were in fact a right-node continuation of a chain, which would allow them to evaluate to either the left node, or false.
  
  
 +As we can see right-recursive comparison operations do have numerous caveats and oddities.  For these reasons we didn't implement it, and generally are on the side of forbidding right-recursive comparison operations.
  
 ===== Unaffected PHP Functionality ===== ===== Unaffected PHP Functionality =====
Line 200: Line 274:
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
-Implementation #1: comparisons evaluated before equality: https://github.com/php/php-src/compare/master...bp1222:multi-compare +Working Implementation: comparisons evaluated before equality: https://github.com/php/php-src/compare/master...bp1222:multi-compare
-Implementation #2: comparisons and equality evaluated together: https://github.com/php/php-src/compare/master...bp1222:multi-compare-equal-prec+
  
 Will need eyes of those more familiar with AST/VM to review. Will need eyes of those more familiar with AST/VM to review.
- 
-For changes affecting the core language, you should also provide a patch for the language specification. 
  
 ===== Implementation ===== ===== Implementation =====
Line 211: Line 282:
  
 ===== References ===== ===== References =====
-[1] - Initial idea on Internals: http://marc.info/?l=php-internals&m=147846422102802&w=2+  * [1] - [[http://marc.info/?l=php-internals&m=147846422102802&w=2|Initial idea on Internals]] 
 +  * [2] - [[https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html#Operator-Precedence|Precedence in C]] 
 +  * [3] - [[https://msdn.microsoft.com/en-us/library/126fe14k.aspx|Precedence in C++]] 
 +  * [4] - [[http://introcs.cs.princeton.edu/java/11precedence/|Precedence in Java]] 
 +  * [5] - [[https://ruby-doc.org/core-2.2.0/doc/syntax/precedence_rdoc.html|Precedence in Ruby]] 
 +  * [6] - [[http://perldoc.perl.org/perlop.html#Operator-Precedence-and-Associativity|Precedence in Perl]] 
 +  * [7] - [[https://docs.python.org/2/reference/expressions.html#operator-precedence|Precedence in Python]] 
 +  * [8] - [[http://php.net/manual/fa/language.operators.precedence.php|Precedence in PHP]] 
  
 ===== Rejected Features ===== ===== Rejected Features =====
 Keep this updated with features that were discussed on the mail lists. Keep this updated with features that were discussed on the mail lists.
rfc/chaining_comparison.1481822132.txt.gz · Last modified: 2017/09/22 13:28 (external edit)