This is an old revision of the document!


PHP RFC: Chaining Comparison

Introduction

This RFC proposes allowing comparison and equality operators (==, !=, !==, ===, <, <=, >, >=) to be chained together to express arbitrary comparisons. The request that prompted this RFC was for interval checking.

Today, such comparisons must be written like this:

<?php
$a = 10;
 
if (0 < $a && $a < 100) {
    echo "Value is between 0 and 100\n";
}

The proposal of this RFC is to allow new syntax as follows:

<?php
$a = 10;
 
if (0 < $a < 100) {
    echo "Value is between 0 and 100\n";
}
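For comparison, the interval check can be emulated in current PHP with a small helper. The sketch below is only an illustration: chainLessThan() is a hypothetical name, not part of this RFC or of PHP, and it evaluates all of its operands eagerly, unlike the proposed operator.

```php
<?php
// Hypothetical helper, not part of the RFC: returns true when the values
// are strictly increasing, i.e. $values[0] < $values[1] < ... < $values[n].
function chainLessThan(...$values): bool
{
    for ($i = 0; $i + 1 < count($values); $i++) {
        if (!($values[$i] < $values[$i + 1])) {
            return false; // one failed link makes the whole chain false
        }
    }
    return true;
}

$a = 10;
var_dump(chainLessThan(0, $a, 100));  // bool(true)
var_dump(chainLessThan(0, 150, 100)); // bool(false)
```

The proposed syntax would express the same check inline as 0 < $a < 100, without a function call.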

Proposal

Comparison Chaining

The proposal creates a new AST operation type, ZEND_AST_COMPARE_OP, which is compiled in a manner that requires left precedence. During this compilation we introduce a new way of emitting an operation: we note where a JMPZ_EX may need to exist, depending on whether we are continuing the compare chain. This shifts any operations emitted while compiling the right side of the AST compare node so that they can be jumped over if the left side of the operation evaluates to false.

The proposal also changes the equality and comparison operators to be left-recursive (in current PHP they are non-associative). This is required because, if the left node of the operation is itself a comparison operation and evaluates to true, it should return its right node (for less than) to be used in the next comparison op. Example:

<?php
$a = 1;
$b = 10;
 
var_dump($a < 5 < $b++); // bool(true)
 
/*
 * AST Dump
 *
 * 2:  AST_CALL
 *      expr: AST_NAME
 *          flags: NAME_NOT_FQ (1)
 *          name: "var_dump"
 *      args: AST_ARG_LIST
 *         0: AST_COMPARE_OP
 *             flags: COMPARE_IS_SMALLER (19)
 *             left: AST_COMPARE_OP
 *                 flags: COMPARE_IS_SMALLER (19)
 *                 left: AST_VAR
 *                     name: "a"
 *                 right: 5
 *             right: AST_POST_INC
 *                 var: AST_VAR
 *                     name: "b"
 */
 
/*
 * OPCodes
 *
 *    2        INIT_FCALL                                               'var_dump'
 *    3        IS_SMALLER                                       ~4      !0, 5
 *    4      > JMPZ_EX                                          ~6      ~4, ->7
 *    5    >   POST_INC                                         ~5      !1
 *    6    >   IS_SMALLER                                       ~6      ~4, ~5
 *    7    >   SEND_VAL                                                 ~6
 *    8        DO_ICALL                                                 
 */

This shows what the feature does internally. Looking at the opcodes, we can see how the JMPZ_EX injection works. Since the first evaluated IS_SMALLER op is the left-recursive side of the expression, we determine that expression's result first. If the result evaluates to true (the code checks for an extended_value flag), we continue to the POST_INC; otherwise we skip to sending the value, which would be false.
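The short-circuit behavior described above can be approximated in current PHP by wrapping each later operand in a closure. This is a minimal sketch, not the patch's implementation: lazyChainLessThan() is a hypothetical name, and the closures merely stand in for the operations that JMPZ_EX jumps over.

```php
<?php
// Hypothetical emulation of `$a < 5 < $b++`. Every operand after the first
// is a closure, so it is evaluated only if all earlier comparisons succeeded.
function lazyChainLessThan($first, callable ...$rest): bool
{
    $left = $first;
    foreach ($rest as $operand) {
        $right = $operand();   // evaluate the next operand only now
        if (!($left < $right)) {
            return false;      // plays the role of JMPZ_EX: skip the rest
        }
        $left = $right;        // the right node feeds the next comparison
    }
    return true;
}

$five = function () { return 5; };

$a = 1;
$b = 10;
$chainTrue = lazyChainLessThan($a, $five, function () use (&$b) { return $b++; });
$bAfterTrue = $b;
var_dump($chainTrue);   // bool(true)
var_dump($bAfterTrue);  // int(11): the post-increment ran

$a = 7;
$b = 10;
$chainFalse = lazyChainLessThan($a, $five, function () use (&$b) { return $b++; });
$bAfterFalse = $b;
var_dump($chainFalse);  // bool(false)
var_dump($bAfterFalse); // int(10): the post-increment was skipped
```

In the second call the first comparison fails, so $b++ never runs, which mirrors the jump from JMPZ_EX straight to SEND_VAL in the opcode listing.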

Equality Chaining

The proposal is also extended to allow chaining of equality operators. However, equality operators have lower precedence than the comparison operators (they bind less tightly). As a result, when the two are mixed, equality operators operate either on booleans or on values that will be compared to a boolean value. Example:

<?php
$a = 1;
$b = 10;
 
var_dump($a == 1 === true); // bool(true)
 
/*
 * AST Dump
 *
 *   2: AST_CALL
 *       expr: AST_NAME
 *           flags: NAME_NOT_FQ (1)
 *           name: "var_dump"
 *       args: AST_ARG_LIST
 *           0: AST_COMPARE_OP
 *               flags: COMPARE_IS_IDENTICAL (15)
 *               left: AST_COMPARE_OP
 *                   flags: COMPARE_IS_EQUAL (17)
 *                   left: AST_VAR
 *                       name: "a"
 *                   right: 1
 *               right: AST_CONST
 *                   name: AST_NAME
 *                       flags: NAME_NOT_FQ (1)
 *                       name: "true"
 */
 
/*
 * OPCodes
 *
 *   2        INIT_FCALL                                               'var_dump'
 *   3        IS_EQUAL                                         ~4      !0, 1
 *   4      > JMPZ_EX                                          ~5      ~4, ->6
 *   5    >   IS_IDENTICAL                                     ~5      ~4, <true>
 *   6    >   SEND_VAL                                                 ~5
 */

Combined Uses (Open Discussion Topic)

One concern raised during the implementation of this feature is how to handle expressions such as 1 < 2 == 3 < 4. This is valid syntax in PHP 7.1 and earlier, and it essentially checks whether true == true. The current implementation respects the existing syntax by evaluating the comparison operators before the equality operators. Combining the two chaining methods, we can write a new expression like this example:

<?php
$a = 1;
$b = 4;
$c = 10;
 
var_dump($a < 2 == 3 < $b < 5 == 20 > $c); // bool(true)
 
/*
 * AST Dump
 *
 *  3: AST_CALL
 *       expr: AST_NAME
 *           flags: NAME_NOT_FQ (1)
 *           name: "var_dump"
 *       args: AST_ARG_LIST
 *           0: AST_COMPARE_OP
 *               flags: COMPARE_IS_EQUAL (17)
 *               left: AST_COMPARE_OP
 *                   flags: COMPARE_IS_EQUAL (17)
 *                   left: AST_COMPARE_OP
 *                       flags: COMPARE_IS_SMALLER (19)
 *                       left: AST_VAR
 *                           name: "a"
 *                       right: 2
 *                   right: AST_COMPARE_OP
 *                       flags: COMPARE_IS_SMALLER (19)
 *                       left: AST_COMPARE_OP
 *                           flags: COMPARE_IS_SMALLER (19)
 *                           left: 3
 *                           right: AST_VAR
 *                               name: "b"
 *                       right: 5
 *               right: AST_COMPARE_OP
 *                   flags: COMPARE_IS_SMALLER (19)
 *                   left: 20
 *                   right: AST_VAR
 *                       name: "c"
 */
 
/*
 * OPCodes
 *
 *   3        INIT_FCALL                                               'var_dump'
 *   4        IS_SMALLER                                       ~6      !0, 2
 *   5      > JMPZ_EX                                          ~9      ~6, ->10
 *   6    >   IS_SMALLER                                       ~7      3, !1
 *   7      > JMPZ_EX                                          ~8      ~7, ->9
 *   8    >   IS_SMALLER                                       ~8      ~7, 5
 *   9    >   IS_EQUAL                                         ~9      ~6, ~8
 *  10    > > JMPZ_EX                                          ~11     ~9, ->13
 *  11    >   IS_SMALLER                                       ~10     !2, 20
 *  12        IS_EQUAL                                         ~11     ~9, ~10
 *  13    >   SEND_VAL                                                 ~11
 *  14        DO_ICALL                                                 
 */
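The grouping shown in the AST dump can be spelled out in syntax that PHP accepts today. This sketch models only the grouping for the example values; it does not reproduce the JMPZ_EX short-circuiting in the opcode listing, and the variable names are illustrative.

```php
<?php
$a = 1;
$b = 4;
$c = 10;

// Each variable corresponds to one subtree of the AST dump.
$leftComparison  = $a < 2;            // first IS_SMALLER
$middleChain     = 3 < $b && $b < 5;  // chained 3 < $b < 5, written out
$rightComparison = 20 > $c;           // compiled as IS_SMALLER !2, 20

// Equality operators then combine the boolean results left to right.
$combined = ($leftComparison == $middleChain) == $rightComparison;
var_dump($combined); // bool(true)
```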

Backward Incompatible Changes

No BC-breaking changes are expected (see: Open Issues).

Proposed PHP Version(s)

Next PHP (currently 7.2)

RFC Impact

To Opcache

Yes. We are adding new JMPZ_EX opcodes when chaining to ensure that false values correctly jump over any pre/post increment/decrement ops, so that those ops are not evaluated.

Open Issues

Should equality and comparison expressions be treated as same precedence?

This is a tough question. It essentially asks whether an expression like 1 < 2 == 3 < 4 should be evaluated as (1 < 2) == (3 < 4) or as (1 < 2) && (2 == 3) && (3 < 4).

Why is this even a question? A majority of languages would evaluate the expression with the former interpretation. However, Python and WolframAlpha appear to use the latter, which enforces more of a “chaining” of these comparisons. This question should be addressed during discussion, as the code change needed to support either interpretation is very minor.

If we decide to proceed with the Python/Wolfram interpretation of strict left-to-right chaining, this would introduce a BC break: PHP 7.1 and earlier accept expr1 < expr2 == expr3 < expr4 and interpret it as (expr1 < expr2) == (expr3 < expr4), that is, boolean == boolean. Under strict left-to-right chaining, the new interpretation would be (expr1 < expr2) && (expr2 == expr3) && (expr3 < expr4), that is, boolean && boolean && boolean. With strict chaining, you can add explicit parentheses to get the same result as before.

<?php
 
/*
 * PHP <= 7.1
 */
var_dump(1 < 2 == 3 < 4); // bool(true)
var_dump(1 < 2 == 3 < 4 == 5 < 6); // Syntax Error
 
/*
 * Proposed Chaining, comparators evaluated first; equality second [See: Implementation #1]
 */
var_dump(1 < 2 == 3 < 4); // bool(true)
var_dump(1 < 2 == 3 < 4 == 5 < 6); // bool(true)
 
/*
 * Proposed Strict Chaining [See: Implementation #2]
 */
var_dump(1 < 2 == 3 < 4); // bool(false)
var_dump(1 < 2 == 3 < 4 == 5 < 6); // bool(false)
var_dump((1 < 2) == (3 < 4) == (5 < 6)); // bool(true)
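The two candidate readings can be compared side by side in current PHP by writing each one out explicitly. This is a sketch under the assumption that the readings are exactly as described in this section; groupedReading() and strictChainReading() are hypothetical names, not part of either patch.

```php
<?php
// Hypothetical helpers modelling the two readings of `$w < $x == $y < $z`.

// Implementation #1 reading: comparisons first, then equality of the booleans.
function groupedReading($w, $x, $y, $z): bool
{
    return ($w < $x) == ($y < $z);
}

// Implementation #2 reading: one strict left-to-right chain.
function strictChainReading($w, $x, $y, $z): bool
{
    return ($w < $x) && ($x == $y) && ($y < $z);
}

var_dump(groupedReading(1, 2, 3, 4));     // bool(true)
var_dump(strictChainReading(1, 2, 3, 4)); // bool(false): 2 == 3 fails
var_dump(strictChainReading(1, 2, 2, 4)); // bool(true)
```

The first two calls differ only in interpretation, which is the BC break under discussion: the same source expression changes from true to false.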

Unaffected PHP Functionality

This RFC does not alter the behavior of the spaceship operator (<=>).

Future Scope

Proposed Voting Choices

Requires 2/3 vote

Patches and Tests

Implementation #1: comparisons evaluated before equality: https://github.com/php/php-src/compare/master...bp1222:multi-compare

Implementation #2: comparisons and equality evaluated together: https://github.com/php/php-src/compare/master...bp1222:multi-compare-equal-prec

These patches need review from people more familiar with the AST and VM.

For changes affecting the core language, you should also provide a patch for the language specification.

Implementation

References

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/chaining_comparison.1481585737.txt.gz · Last modified: 2017/09/22 13:28 (external edit)