rfc:safe_cast

Differences

This shows you the differences between two versions of the page.

rfc:safe_cast [2014/11/12 20:15]
ajf Crated? Created.
rfc:safe_cast [2014/11/29 21:20]
ajf end(ed)
Line 1: Line 1:
 ====== PHP RFC: Safe Casting Functions ======
-  * Version: 0.1.4
+  * Version: 0.1.8
-  * Date: 2014-10-20, Last Updated 2014-11-12
+  * Date: 2014-10-20, Last Updated 2014-11-14
   * Author: Andrea Faulds, ajf@ajf.me
-  * Status: Under Discussion
+  * Status: Declined
   * First Published at: http://wiki.php.net/rfc/safe_cast
  
Line 9: Line 9:
  
 Currently, PHP only provides one means of type conversion: explicit casts. These casts never fail or emit errors, making them dangerous to use, as when passed garbage input, they will simply return garbage instead of indicating that something went wrong. This makes it difficult to write robust applications which handle user data. They also prevent any suggestion of strict type hinting for scalar types, because if that were to be added, users would simply use dangerous explicit casts to get around errors and the result would be code that is buggier than it would have been without type hinting at all.
+
+For int and float conversion specifically, ''ext/filter'' provides ''FILTER_VALIDATE_INT'' and ''FILTER_VALIDATE_FLOAT'', used as ''filter_var($foo, FILTER_VALIDATE_INT)'' and ''filter_var($foo, FILTER_VALIDATE_FLOAT)''. However, these are rather unwieldy, which encourages people to use the shorter explicit casts, and they suffer in both performance and safety because they convert values to strings before validating (allowing, for example, booleans, or objects with ''__toString''). Furthermore, their use requires explicit error handling by checking for a FALSE return value. If the programmer forgets to check it, they are no safer than explicit casts.
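The behaviour described in the paragraph above can be illustrated with a short script. This is only a sketch using the existing ''filter_var()'' function and explicit casts; the ''Wrapper'' class is hypothetical and exists only to show the ''%%__toString%%'' case.

```php
<?php
// An explicit cast silently turns garbage into a value.
var_dump((int) "foobar");                                 // int(0)

// filter_var() signals failure, but only through a FALSE return value.
var_dump(filter_var("foobar", FILTER_VALIDATE_INT));      // bool(false)

// Input is converted to a string first, so a boolean passes as "1".
var_dump(filter_var(true, FILTER_VALIDATE_INT));          // int(1)

// Hypothetical class, used only to show the __toString case.
class Wrapper
{
    public function __toString()
    {
        return "42";
    }
}
var_dump(filter_var(new Wrapper(), FILTER_VALIDATE_INT)); // int(42)

// If the "=== false" check is forgotten, FALSE flows onwards
// and behaves like 0 in arithmetic.
$age = filter_var("foobar", FILTER_VALIDATE_INT);
var_dump($age + 1);                                       // int(1)
```

The last two lines show why a forgotten check leaves the caller no better off than with ''(int)''.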
  
 ===== Proposal =====
  
-A set of three "safe casting" functions for converting to types is added to ''ext/standard'': ''to_int()'', ''to_float()'' and ''to_string()''. These functions validate their input to ensure data is not lost with the cast (thus the cast can be considered safe), instead of casting blindly. If the input fails to validate, they return NULL. (Returning ''NULL'' is controversial, however, so while this is my preferred option and is the one implemented, it will be put to a vote.)
+Two families of "safe casting" functions are added to ''ext/standard'', ''try_''* and ''to_''*, for ''int'', ''float'' and ''string''. These functions validate their input to ensure data is not lost with the cast (thus the cast can be considered safe), instead of casting blindly. If the input fails to validate, the ''to_''* functions throw a ''CastException'', while the ''try_''* functions return NULL. If validation succeeds, the converted result is returned.
+
+''to_int()'' and ''try_int()'' accept only ints, non-NaN integral floats within the range of an integer (''PHP_INT_MIN'' to ''PHP_INT_MAX''), and strings containing decimal integer sequences within the range of an integer. Leading and trailing whitespace is not permitted, nor are leading zeros.
 
-''to_int()'' accepts only ints, non-NaN integral floats within the range of an integer (''PHP_INT_MIN'' to ''PHP_INT_MAX''), and strings containing decimal integer sequences within the range of an integer. Leading and trailing whitespace is not permitted, nor are leading zeros or a positive sign.
+''to_float()'' and ''try_float()'' accept only ints, floats, and strings representing floats. Leading and trailing whitespace is not permitted, nor are leading zeros.
 
-''to_float()'' accepts only ints, floats, and strings representing floats. Leading and trailing whitespace is not permitted, nor are leading zeros or a positive sign.
+''to_string()'' and ''try_string()'' accept only strings, objects which cast to strings, ints and floats.
 
-''to_string()'' accepts only strings, objects which cast to strings, ints and floats.
+The new class ''CastException'' is added to ''ext/standard'', which extends SPL's ''RuntimeException''.
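To make the v0.1.8 integer rules concrete, here is a minimal userland sketch of ''try_int()'' and ''to_int()''. It is an approximation written for illustration, not the RFC's patch (which is implemented in C), and it assumes PHP 7.0 or later for ''PHP_INT_MIN''. The regular expression, the round-trip range check, the handling of ''"-0"'' and the exception message are all assumptions of this sketch.

```php
<?php
// Userland approximation of the proposed class; the RFC adds it to ext/standard.
class CastException extends RuntimeException
{
}

// Returns the converted int, or NULL if the input fails validation.
function try_int($value)
{
    if (is_int($value)) {
        return $value;
    }

    if (is_float($value)) {
        // Reject NaN, infinities and non-integral values.
        if (!is_finite($value) || floor($value) !== $value) {
            return null;
        }
        // -(float) PHP_INT_MIN is exactly one past PHP_INT_MAX as a float.
        $limit = -(float) PHP_INT_MIN;
        if ($value < -$limit || $value >= $limit) {
            return null;
        }
        return (int) $value;
    }

    if (is_string($value)) {
        // Optional sign, no whitespace, no leading zeros, decimal digits only.
        if (!preg_match('/\A[+-]?(?:0|[1-9][0-9]*)\z/', $value)) {
            return null;
        }
        // An out-of-range string cannot survive a round trip through (int).
        $int = (int) $value;
        $canonical = ltrim($value, '+');
        if ($canonical === '-0') {
            $canonical = '0'; // assumption: treat "-0" as zero
        }
        return ((string) $int === $canonical) ? $int : null;
    }

    return null; // booleans, NULL, arrays, objects, resources
}

// Same validation, but failure is reported with an exception.
function to_int($value)
{
    $result = try_int($value);
    if ($result === null) {
        throw new CastException('Value could not be converted to int');
    }
    return $result;
}
```

With these definitions, ''try_int("+10")'' yields ''10'' while ''try_int("010")'' and ''try_int(" 10")'' yield NULL, matching the examples table for this revision; ''to_int("foobar")'' throws a ''CastException''.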
  
 ==== Rationale ====
Line 24: Line 28:
 The concept was developed in my thoughts, and in discussions with Anthony Ferrara (both in-person at PHPNW14 and online) and others in [[http://chat.stackoverflow.com/rooms/11/php|StackOverflow's PHP chatroom]]. Here, I list some of my or our rationale for particular decisions.
  
-  * The functions keep to the principle of zero data-loss, i.e. where ''X'' is the type of ''$A'' and ''Y'' is the type being converted to, ''to_Y($A) !== NULL'' [[https://en.wikipedia.org/wiki/If_and_only_if|IFF]] ''(X)(Y)$A === $A''. However, there are some limited exceptions:
-      * Floats are allowed to lose accuracy from their decimal representations. This is because zero data-loss is impossible for floats, as PHP does not output them in full precision. Furthermore, some loss of accuracy is expected when dealing with floats.
-      * Objects with ''__toString'' are accepted for ''to_string'', despite a cast back from a string to the original type not necessarily being allowed
-      * ''to_int'' and ''to_float'' only accept integers, floats and strings
-      * ''to_string'' only accepts integers, floats, strings and objects
-  * An error return value was chosen instead of an exception:
-      * To make chaining easier, and because checking for NULL is easier than checking if an exception was thrown (''if ($value === NULL) { ... } else { ... }'' vs ''try { ... } catch (Exception $e) { $errored = true; ... } if (!$errored) { ... }'')
-      * Exceptions are currently not used in core functions
-      * The cast failing is not an exceptional case, as it is a validation function, so it should use a return value, not an exception
-      * Catching exceptions is slower than checking error values
+  * The functions don't accept anything which wouldn't be converted properly by normal unsafe explicit casts (i.e. ''(int)'' and ''intval()''), such that their accepted inputs should be a strict subset of inputs converted by unsafe casts. This is why hexadecimal and exponents are not permitted for ''to_int()'' and ''try_int()''.
+  * Whitespace is not allowed because it is expected that most input will lack it, and it can be easily stripped before conversion using ''trim()''.
+  * Generally, no data is allowed to be lost, with the exception of floats, which are allowed to lose accuracy from their decimal representations. This is because zero data-loss is impossible (or at least highly impractical) for floats, as PHP does not output them in full precision, and many decimal values don't have exact binary representations, and vice-versa. Furthermore, some loss of accuracy is expected when dealing with floats.
+  * Leading zeros are not accepted for ''to_int()'' and ''try_int()'', because strings containing them are not consistently interpreted (sometimes they are considered octal, sometimes decimal), nor is user intent consistent.
+  * There are two sets of functions, one that returns NULL on failure and the other that throws an exception, for the following reasons:
+      * In some cases, invalid input is an exceptional case, so an exception is desirable, but in other cases, it is not exceptional, so an exception shouldn't be used. Having two functions allows both these scenarios to be covered.
+          * It is often repeated that exceptions shouldn't be used for flow control.
+      * It is PHP tradition to allow both object-oriented (exceptions) and procedural (error return value) approaches to problems.
+      * This placates both people who would like an error return value, and people who would like exceptions.
+      * Return values are better for chaining in some circumstances.
+      * Checking exceptions is generally slower than checking return values (especially in HHVM).
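The inconsistent treatment of leading zeros and hexadecimal notation mentioned in the rationale can be seen with existing PHP features alone. The snippet below is an illustrative sketch using only the ''(int)'' cast, ''intval()'' and an integer literal.

```php
<?php
// The same digits are read differently depending on how they are parsed.
var_dump((int) "010");        // int(10): the cast reads the string as decimal
var_dump(intval("010", 0));   // int(8): base 0 treats a leading zero as octal
var_dump(010);                // int(8): a literal with a leading zero is octal

// Hexadecimal notation is not understood by the plain cast.
var_dump((int) "0x1A");       // int(0): parsing stops at the "x"
var_dump(intval("0x1A", 16)); // int(26)
```

Because ''(int) "0x1A"'' does not produce 26, accepting hexadecimal strings would break the strict-subset property described in the first bullet.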
  
 ==== Examples Table ====
Line 39: Line 44:
 A sample table of whether values pass or fail generated by [[https://gist.github.com/TazeTSchnitzel/19c91f800e47d53cc28c|this script]], on a 64-bit machine:
 
-^ value              ^ to_int  ^ to_float  ^ to_string  ^
+^ value              ^ try_int ^ try_float ^ try_string ^
 ^ string(6) "foobar" | fail    | fail      | pass       |
 ^ string(0) ""       | fail    | fail      | pass       |
Line 47: Line 52:
 ^ string(2) "10"     | pass    | pass      | pass       |
 ^ string(3) "010"    | fail    | fail      | pass       |
-^ string(3) "+10"    | fail    | fail      | pass       |
+^ string(3) "+10"    | pass    | pass      | pass       |
 ^ string(3) "-10"    | pass    | pass      | pass       |
 ^ int(10)            | pass    | pass      | pass       |
Line 89: Line 94:
 ===== Open Issues =====
 
-While I'd prefer to return NULL on error, it would also be possible to return FALSE. As this seems to be relatively controversial, it will be put to a vote.
+None.
 
 ===== Unaffected PHP Functionality =====
Line 97: Line 102:
 ===== Future Scope =====
 
-This might be extended to other types. However, support for the other scalar types has deliberately not been included. For booleans, there is no clear single format to accept, and it is very simple to do so manually. NULL is a type with only one possible value, so there is no point in casting. Resources are special and don't really count as scalars.
+This might be extended to other types. However, support for the other scalar types has deliberately not been included. For booleans, there is no clear single format to accept, nor a consistent interpretation of particular values depending on the format used. Furthermore, strict boolean conversion is very simple to do manually. NULL is a type with only one possible value, so there is no point in casting. Resources are special and don't really count as scalars.
 
 ===== Proposed Voting Choices =====
Line 103: Line 108:
 As this is not a language change and only introduces new functions, only a 50%+1 majority will be required. The vote will be a straight Yes/No vote on accepting the RFC and merging the patch into master.
 
-Because the behaviour on the error case is controversial, a second two-way vote will be held at the same time, with the options being return ''NULL'' and return ''FALSE''.
+==== Vote ====
+
+Voting opened 2014-11-19 and ended 2014-11-29.
+
+<doodle title="Should the Safe Casting Functions RFC be accepted, and the patch merged into master?" auth="ajf" voteType="single" closed="true">
+   * Yes
+   * No
+</doodle>
  
 ===== Patches and Tests =====
Line 129: Line 141:
 ===== Changelog =====
 
+  * v0.1.8 - ext/filter note in Introduction
+  * v0.1.7 - Allow positive signs
+  * v0.1.6 - Dropped zero round trip data loss principle, added octal and whitespace rationale
+  * v0.1.5 - Renamed ''to_'' functions to ''try_'', added ''to_'' functions which throw exception
   * v0.1.4 - Reject leading '+' and '0' for int/float, ''to_Y($A) !== NULL IFF (X)(Y)$A === $A'' principle in rationale
   * v0.1.3 - Return NULL, don't include exceptions in vote
rfc/safe_cast.txt · Last modified: 2017/09/22 13:28 (external edit)