PHP RFC: Permit trailing whitespace in numeric strings

Introduction

PHP currently ignores whitespace at the start of a numeric string: " 123" and "123" are considered equivalent. However, it considers whitespace at the end of a numeric string to be “non-well-formed”: +"123 " produces an E_NOTICE-level error, and \is_numeric("123 ") returns false.

This can be unhelpful. One reason is that trailing whitespace occurs in similar situations to leading whitespace. For example, a user might copy and paste a number into a form field. In this case, unintentionally pasting trailing whitespace is about as likely as pasting leading whitespace, and both are equally meaningless. Yet PHP complains only about trailing whitespace: whether you want to reject unneeded whitespace or ignore it, PHP's behaviour does only half the job.

Additionally, there are some scenarios specific to trailing whitespace. For instance, when reading a number out of multi-line text, a string may contain line-ending characters. Currently, PHP would complain when this number is used.
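As an illustration of the multi-line case, here is a minimal sketch of the workaround that is needed at present. The helper name sum_lines and the sample input are hypothetical, and the whitespace character list is an assumption chosen to match the characters PHP skips at the start of a numeric string.

```php
<?php
// Hypothetical helper: sums one integer per line of text.
function sum_lines(string $text): int
{
    $total = 0;
    // Split on "\n" only, so lines from a CRLF file keep a trailing "\r".
    foreach (explode("\n", $text) as $line) {
        // Without this rtrim(), "20\r" is a numeric string with trailing
        // whitespace, which is the case this RFC describes as rejected.
        $line = rtrim($line, " \t\n\r\v\f");
        if (is_numeric($line)) {
            $total += (int) $line;
        }
    }
    return $total;
}

echo sum_lines("10\r\n20\r\n30\r\n"), "\n"; // prints 60
```

If trailing whitespace were accepted, the rtrim() call would no longer be needed for the is_numeric() check to succeed.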

Moreover, accepting leading whitespace yet rejecting trailing whitespace is inconsistent and surprising.

Proposal

This RFC proposes to change PHP's behaviour such that trailing whitespace is accepted in a numeric string, much like leading whitespace. This would make PHP more consistent and less surprising, and would save time by avoiding the need to trim trailing whitespace from numeric strings.

For the PHP interpreter, this would be accomplished by modifying the is_numeric_string C function (and its variants) in the Zend Engine. This would therefore affect PHP features which make use of this function, including:

  • Arithmetic operators will no longer produce an E_NOTICE-level error when used with a numeric string with trailing whitespace
  • The int and float type declarations will, in weak typing mode, no longer produce an E_NOTICE-level error when passed a numeric string with trailing whitespace
  • Type checks for built-in/extension (“internal”) PHP functions will, in weak typing mode, no longer produce an E_NOTICE-level error when passed a numeric string with trailing whitespace
  • The comparison operators will now consider numeric strings with trailing whitespace to be numeric, meaning that, for example, "123 " == 123 produces true, much like " 123" == 123 does at present
  • The \is_numeric function will now return true for numeric strings with trailing whitespace
  • The ++ and -- operators will now convert numeric strings with trailing whitespace to integers or floats, as appropriate, rather than applying the alphanumeric increment rules
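The acceptance rule behind these changes can be approximated in userland. The following is a sketch only, not the patch itself: the function name is_numeric_lenient and the constant NUMERIC_WHITESPACE are hypothetical, and the whitespace set is assumed to be the same one PHP already skips at the start of a numeric string.

```php
<?php
// Assumption: space, \t, \n, \r, \v and \f count as whitespace here.
const NUMERIC_WHITESPACE = " \t\n\r\v\f";

// Hypothetical helper approximating the proposed \is_numeric() result:
// strip trailing whitespace first, then apply the existing check.
function is_numeric_lenient(string $value): bool
{
    return is_numeric(rtrim($value, NUMERIC_WHITESPACE));
}

var_dump(is_numeric_lenient("123 "));    // bool(true)
var_dump(is_numeric_lenient(" 123\n"));  // bool(true)
var_dump(is_numeric_lenient("123 abc")); // bool(false)
var_dump(is_numeric_lenient("   "));     // bool(false)
```

A string consisting only of whitespace stays non-numeric, since nothing remains once the trailing whitespace is removed.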

The PHP language specification's definition of str-numeric would be modified by the addition of str-whitespaceopt after str-number.

Backward Incompatible Changes

\is_numeric() now returns true rather than false for numeric strings with trailing whitespace. The author does not expect this to cause significant backwards-compatibility issues, because it is uncommon for code to depend on trailing whitespace being invalid while leading whitespace is accepted. Additionally, the new behaviour may be the one intended.
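Code that does depend on trailing whitespace being rejected could make that requirement explicit instead of relying on \is_numeric() alone. The following is one possible sketch; the function name is_numeric_no_trailing_ws is hypothetical, and the whitespace set is again an assumption.

```php
<?php
// Hypothetical helper: numeric, and with no trailing whitespace.
// It gives the same answer whether or not is_numeric() itself
// accepts trailing whitespace.
function is_numeric_no_trailing_ws(string $value): bool
{
    return is_numeric($value)
        && $value === rtrim($value, " \t\n\r\v\f");
}

var_dump(is_numeric_no_trailing_ws("123"));   // bool(true)
var_dump(is_numeric_no_trailing_ws(" 123"));  // bool(true)
var_dump(is_numeric_no_trailing_ws("123 "));  // bool(false)
var_dump(is_numeric_no_trailing_ws("123\n")); // bool(false)
```

Leading whitespace is still accepted by this helper, which mirrors the existing behaviour described in the Introduction.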

Proposed PHP Version(s)

This is proposed for the next PHP 7.x release. At the time of writing, that would be PHP 7.2.

RFC Impact

To Existing Extensions

Any extension using is_numeric_string, its variants, or other functions which themselves use it will be affected.

To Opcache

In the patch, all tests pass with Opcache enabled. I am not aware of any issues arising here.

Unaffected PHP Functionality

This does not affect the filter extension, which handles numeric strings itself in a different fashion.

Future Scope

None conceivable.

Proposed Voting Choices

This is a language change, and requires a 2/3 majority. The vote is a two-choice Yes/No vote on whether to accept the RFC and apply its changes to the next applicable version of PHP.

Patches and Tests

A pull request for a complete PHP interpreter patch, including a test file, can be found here: https://github.com/php/php-src/pull/2317

FIXME: There is not yet a language specification patch.

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged to
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)
rfc/trailing_whitespace_numerics.txt · Last modified: 2017/09/22 13:28 (external edit)