PHP RFC: base_convert changes
- Version: 0.1
- Date: 2019-05-15
- Author: Scott Dutton, php@exussum.co.uk
- Status: Implemented (in PHP 7.4)
- First Published at: https://wiki.php.net/rfc/base_convert_improvements
Introduction
The base_convert family of functions (base_convert, bindec, hexdec, etc.) is very accepting of its input arguments: you can pass any string to them and they make a best-effort attempt to convert it.
For example, base_convert("hello world", 16, 10); returns 237 with no warnings. Internally this is equivalent to base_convert("ed", 16, 10); because "e" and "d" are the only valid hexadecimal digits in the input.
Negative numbers also do not work: base_convert("-ff", 16, 10); returns 255. As above, the "-" is silently ignored.
Experienced developers get caught out by this, for example: https://gist.github.com/iansltx/4820b02ab276c3306314daaa41573445#file-getlines-php-L9
In that case the input was literal binary data and the result was 0, which is the expected behavior but not an obvious one.
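The behavior described above can be reproduced with a short script. This is a minimal sketch: the error handler is only there so that the script prints the same values on every version, since PHP 7.4 and later also emit a deprecation notice for these inputs; the variable names are illustrative.

```php
<?php
// Collect diagnostics instead of printing them, so the output below is
// the same whether or not the running PHP version reports ignored characters.
$notices = [];
set_error_handler(function (int $severity, string $message) use (&$notices): bool {
    $notices[] = $message;
    return true; // handled; skip the default error handler
});

$garbage  = base_convert("hello world", 16, 10); // only "e" and "d" are hex digits
$clean    = base_convert("ed", 16, 10);
$negative = base_convert("-ff", 16, 10);         // the "-" is dropped

restore_error_handler();

var_dump($garbage);  // string(3) "237"
var_dump($clean);    // string(3) "237"
var_dump($negative); // string(3) "255"

// Empty on versions that ignore the characters silently.
print_r($notices);
```

Note that base_convert returns a string, so the results are "237" and "255" rather than integers.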
Other functions affected by this are:
- decbin() - Decimal to binary
- bindec() - Binary to decimal
- decoct() - Decimal to octal
- octdec() - Octal to decimal
- dechex() - Decimal to hexadecimal
- hexdec() - Hexadecimal to decimal
Other programming languages behave in a similar way to this proposal:
- JavaScript - https://jsfiddle.net/c16b4usp/
- Python - https://repl.it/repls/OliveBlushingModem
- Go - https://play.golang.org/p/aLWg15c00Fy
JavaScript handles larger numbers correctly; Python and Go behave the way PHP would if this change were made.
Proposal
I propose two changes to the base_convert family of functions.
Error on ignored characters
When base_convert is passed characters that it will ignore, it should also raise a warning informing the user that their input was incorrect. There is one exception, for the second character: if the base is 16, 'x' is allowed (e.g. "0xff"); base 2 allows 'b' and base 8 allows 'o'. This fits in with the common formats for these numbers.
This can be raised to a full exception in a later PHP version (e.g. PHP 8).
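As a rough illustration of the rule above, the following userland sketch lists the characters that would trigger the warning. The function ignored_characters is hypothetical and is not part of PHP or of the patch; it assumes a base between 2 and 36, and it assumes the prefix letter is matched case-insensitively, which the text above does not specify.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): returns the characters that would
 * be ignored for the given base, allowing the 0x / 0b / 0o prefixes.
 */
function ignored_characters(string $input, int $base): array
{
    // Prefix letter allowed as the second character, keyed by base.
    $prefixes = [16 => 'x', 2 => 'b', 8 => 'o'];

    // Strip "0x", "0b" or "0o" only when it matches the requested base.
    // Assumption: the prefix letter may be upper or lower case.
    if (isset($prefixes[$base])
        && strlen($input) >= 2
        && $input[0] === '0'
        && strtolower($input[1]) === $prefixes[$base]) {
        $input = substr($input, 2);
    }

    // The valid digits for this base, e.g. "0123456789abcdef" for base 16.
    $digits = substr('0123456789abcdefghijklmnopqrstuvwxyz', 0, $base);

    $ignored = [];
    for ($i = 0; $i < strlen($input); $i++) {
        if (strpos($digits, strtolower($input[$i])) === false) {
            $ignored[] = $input[$i];
        }
    }
    return $ignored;
}

var_dump(ignored_characters('0xff', 16) === []);         // bool(true)
echo implode(',', ignored_characters('-ff', 16)), "\n";  // -
echo implode(',', ignored_characters('0b102', 2)), "\n"; // 2
```

A non-empty result corresponds to the case where the proposal would raise a warning.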
Allow negative arguments
Negative numbers should be accepted by the base_convert family of functions. For example, base_convert('-ff', 16, 10); should return -255. Currently users need to special-case negative input themselves.
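The kind of special-casing users currently need might look like the sketch below. The wrapper signed_base_convert is a hypothetical userland function, not the proposed implementation: it strips a leading "-", converts the magnitude with the existing base_convert, and puts the sign back.

```php
<?php
/**
 * Hypothetical userland wrapper (not part of PHP): converts a possibly
 * negative number between bases by handling the sign separately.
 */
function signed_base_convert(string $number, int $fromBase, int $toBase): string
{
    $negative  = strlen($number) > 0 && $number[0] === '-';
    $magnitude = $negative ? substr($number, 1) : $number;

    // base_convert only ever sees the unsigned part.
    $converted = base_convert($magnitude, $fromBase, $toBase);

    // Avoid producing "-0" for inputs such as "-0".
    return ($negative && $converted !== '0') ? '-' . $converted : $converted;
}

echo signed_base_convert('-ff', 16, 10), "\n"; // -255
echo signed_base_convert('ff', 16, 10), "\n";  // 255
echo signed_base_convert('-0', 16, 10), "\n";  // 0
```

With the proposed change, the first call would be equivalent to calling base_convert('-ff', 16, 10) directly.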
Backward Incompatible Changes
Error on ignored characters
A warning causes no BC break. An exception would be a BC break, so PHP 8 is suggested for that step.
Allow negative arguments
base_convert currently uses a zend_ulong as its internal representation; this change would need a zend_long. That will change the behavior for numbers in the range 9223372036854775807 to 18446744073709551615, but will allow negative numbers.
This currently breaks a fair number of unit tests, which I will need to update if this change is accepted. The affected tests cover extreme values.
A secondary BC change affects people who have worked around this in userland code, for example:
https://stackoverflow.com/a/7051224/1281385
This change will break these workarounds.
PHP 8 is suggested for this change.
Proposed PHP Version(s)
PHP 7.4 for warnings; PHP 8.0 for negative arguments and exceptions.
RFC Impact
To SAPIs
None
To Existing Extensions
None
To Opcache
None
New Constants
None
php.ini Defaults
Open Issues
Unaffected PHP Functionality
Only the base_convert family of functions will be changed.
Future Scope
Allow base_convert to accept input of any length, rather than being restricted to a 64-bit integer type.
Proposed Voting Choices
Two votes:
Error on ignored characters
This vote is to raise an E_DEPRECATED warning in PHP 7.4, escalating to an InvalidArgumentException in PHP 8.
Allow negative arguments
This vote is to allow negative arguments in PHP 8.
Patches and Tests
Vote
Started 19th June 2019. Ends 3rd July 2019
References
Rejected Features