rfc:base_convert_improvements

This is an old revision of the document!


PHP RFC: base_convert changes

Introduction

The base_convert family of functions(base_convert, binhex, hexbin etc) are very accepting with their input arguments, you can pass any string to to them and they give a best effort of converting it.

For example base_convert(“hello world”, 16, 10); will return 237 with no warnings. What this does internally is base_convert(“ed”, 16, 10);

Also negative numbers simply do not work, eg base_convert(“-ff”, 16, 10); will return 255. (similar to above the “-” gets silently ignored).

Experienced developers get caught out by this for example https://gist.github.com/iansltx/4820b02ab276c3306314daaa41573445#file-getlines-php-L9

In this case literal binary data was the input and the result was 0 (which is expected but not clear)

Other functions effected by this are:

  • decbin() - Decimal to binary
  • bindec() - Binary to decimal
  • decoct() - Decimal to octal
  • octdec() - Octal to decimal
  • dechex() - Decimal to hexadecimal
  • hexdec() - Hexadecimal to decimal

Other programming languages behave in a a similar way to this proposal * Javascript - https://jsfiddle.net/c16b4usp/ * Python - https://repl.it/repls/OliveBlushingModem * Go - https://play.golang.org/p/aLWg15c00Fy

Javascript does work with larger numbers correctly, Python and Go work in a way which would be similar to PHP if this change was made.

Proposal

I propose two changes to the base_convert family of functions.

Error on ignored characters

When passed arguments that base_convert will ignore, also give a warning to the user informing them their input was incorrect. There is an exception in place for the second char, if base is 16 'x' is allowed eg “0xff”, base 2 allows b and base 8 allows o. this fits in with common formats for these numbers

This can be raised to a full exception for a later PHP version (eg PHP 8)

Allow negative arguments

Negative numbers should be allowed to be passed to the base_convert family of functions. for example base_convert('-ff', 16', 10); should return -255. Currently users need to make an exception in this case

Backward Incompatible Changes

Error on ignored characters

No BC breaks for a warning, in the case of an exception there will be a BC break - Would suggest PHP 8 for this.

Allow negative arguments

base_convert currently in the backend has a zend_ulong as its representation. We would need a zend_long. This will change the behavior of numbers in the range 9223372036854775807 - 18446744073709551615 but will allow negative numbers.

This currently breaks a fair amount of unit tests that I will need to update if this change is accepted. The unit tests are around extreme values.

A secondary BC change would be people who have worked around this in userland code for example

https://stackoverflow.com/a/7051224/1281385

This change will break these work arounds

Would suggest PHP8 for this change

Proposed PHP Version(s)

PHP 7.4 for warnings PHP 8.0 for negative arguments and exceptions

RFC Impact

To SAPIs

None

To Existing Extensions

None

To Opcache

None

New Constants

None

php.ini Defaults

Open Issues

Make sure there are no open issues when the vote starts!

Unaffected PHP Functionality

Only the base_convert family of functions will be changed

Future Scope

Allow base_convert to allow any length input and not be restricted to a 64 bit int type

Proposed Voting Choices

2 votes

Error on ignored characters

This vote will be to raise a E_DEPRECATED warning for PHP 7.4. Raising to be an InvalidArgumentException in PHP 8

Allow negative arguments

This vote will allow negative arguments in PHP 8

Patches and Tests

Vote

Started 19th June 2019. Ends 3rd July 2019

Raise deprecated error in 7.4 and raise to exception in PHP 8 for ignored characters
Real name Yes No
Count: 0 0
Allow negative numbers to be converted in PHP 8
Real name Yes No
ashnazg (ashnazg)  
carusogabriel (carusogabriel)  
daverandom (daverandom)  
derick (derick)  
emir (emir)  
galvao (galvao)  
girgias (girgias)  
jasny (jasny)  
kalle (kalle)  
kguest (kguest)  
mariano (mariano)  
mbeccati (mbeccati)  
mike (mike)  
petk (petk)  
pollita (pollita)  
ramsey (ramsey)  
salathe (salathe)  
sergey (sergey)  
stas (stas)  
Count: 2 17

References

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/base_convert_improvements.1560928604.txt.gz · Last modified: 2019/06/19 07:16 by exussum