rfc:numeric_literal_separator

rfc:numeric_literal_separator [2019/04/30 23:01] – created theodorejb
rfc:numeric_literal_separator [2019/08/19 19:58] (current) – Separators in C# aren't just a proposal theodorejb
Line 1: Line 1:
 ====== PHP RFC: Numeric Literal Separator ====== ====== PHP RFC: Numeric Literal Separator ======
-  * Date: 2019-04-30 +  * Date: 2019-05-15 
-  * Author: Theodore Brown <theodorejb@outlook.com>+  * Author: Theodore Brown <theodorejb@outlook.com>, Bishop Bettini <bishop@php.net>
   * Based on [[https://wiki.php.net/rfc/number_format_separator|previous RFC]] by: Thomas Punt <tpunt@php.net>   * Based on [[https://wiki.php.net/rfc/number_format_separator|previous RFC]] by: Thomas Punt <tpunt@php.net>
-  * Proposed PHP version: PHP 7.4 +  * Status: Implemented (in PHP 7.4) 
-  * Discussion: https://externals.io/message/105450 +  * Discussion: https://externals.io/message/105714 
-  * Status: Draft +  * Target version: PHP 7.4 
 +  * Implementation: https://github.com/php/php-src/pull/4165
  
 ===== Introduction ===== ===== Introduction =====
  
 The human eye is not optimized for quickly parsing long sequences of The human eye is not optimized for quickly parsing long sequences of
-digits. Thus, a lack of visual separators increases the time it takes +digits. Thus, a lack of visual separators makes it take longer to 
-to read and debug code, and can lead to unintended mistakes.+read and debug code, and can lead to unintended mistakes.
  
 <code php> <code php>
Line 24: Line 24:
  
 <code php> <code php>
-$discount = 12300; // Is this 12,300? Or 123, because it's in cents?+$discount = 13500; // Is this 13,500? Or 135, because it's in cents?
 </code> </code>
  
Line 33: Line 33:
  
 <code php> <code php>
-$threshold = 1_000_000_000; // a billion! +$threshold = 1_000_000_000;  // a billion! 
-$testAmt = 107_925_284.88;  // scale is hundreds of millions +$testValue = 107_925_284.88; // scale is hundreds of millions 
-$discount = 123_00;        // $123, stored as cents +$discount = 135_00;         // $135, stored as cents
 </code> </code>
  
Line 45: Line 45:
 299_792_458;   // decimal 299_792_458;   // decimal
 0xCAFE_F00D;   // hexadecimal 0xCAFE_F00D;   // hexadecimal
-0b0010_1101;   // binary +0b0101_1111;   // binary 
-026_73_43;    // octal +0137_041;     // octal
 </code> </code>
  
Line 58: Line 58:
 _100; // already a valid constant name _100; // already a valid constant name
  
-// these all produce "Parse error: syntax error"+// these all produce "Parse error: syntax error":
 100_;       // trailing 100_;       // trailing
 1__1;       // next to underscore 1__1;       // next to underscore
Line 67: Line 67:
 </code> </code>
  
-===== Use cases =====+===== Unaffected PHP Functionality =====
  
-Business logic thresholds, scientific constants, and unit test values +Adding an underscore between digits in a numeric literal will not 
-are common situations where large numeric literals are necessary.+change its value. The underscores are stripped out during the lexing 
 +stage, so the runtime is not affected.
  
-==== Use cases to avoid ====+<code php> 
 +var_dump(1_000_000); // int(1000000) 
 +</code>
  
-It may be tempting to use integers for storing data such as phone, +This RFC does not change the behavior of string to number 
-credit card, and social security numbers since these values appear +conversion. Numeric separators are intended to improve code 
-numeric. However, this is almost always a bad idea, since these +readability, not alter how input is processed.
-values often have prefixes and leading digits that are significant.+
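
The distinction between literals and string conversion can be checked with a minimal sketch like the one below. It assumes PHP 7.4 or later; the string values are illustrative and do not come from the RFC.

<code php>
<?php
// In source code, the underscore is removed by the lexer.
var_dump(1_000 === 1000);      // bool(true)

// In a string, the underscore is ordinary input: conversion stops at it.
var_dump((int) '1_000');       // int(1)
var_dump(is_numeric('1_000')); // bool(false)
</code>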
  
-A good rule of thumb is that if it doesn't make sense to use +===== Backward Incompatible Changes ===== 
-mathematical operators on a value (e.g. adding it, multiplying it, + 
-dividing it, etc.), then an integer probably isn't the best way to +None. 
-store it.+ 
 +===== Discussion ===== 
 + 
 +==== Use cases ==== 
 + 
 +Digit separators make possible the cognitive process of 
 +[[https://en.wikipedia.org/wiki/Subitizing|subitizing]]. That is,
 +accurately and confidently "telling at a glance" the number of digits, 
 +rather than having to count them. This measurably lessens the time 
 +to correctly read numbers longer than four digits. 
 + 
 +Large numeric literals are commonly used for business logic 
 +constants, unit test values, and performing data conversions. 
 +For example: 
 + 
 +Composer's retry delay when removing a file:
  
 <code php> <code php>
-// never do this: +usleep(350000); // without separator 
-$phoneNumber = 345_6789; + 
-$creditCard = 378_2822_4631_0005; +usleep(350_000); // with separator
-$socialSecurity = 111_11_1111;+
 </code> </code>
  
-===== Backward Incompatible Changes =====+Conversion of an Active Directory timestamp (the number of 
 +100-nanosecond intervals since January 1, 1601) to a Unix timestamp:
  
-None.+<code php> 
 +$time = (int) ($adTime / 10000000 - 11644473600); // without separator
  
-===== Unaffected PHP Functionality ===== +$time = (int) ($adTime / 10_000_000 - 11_644_473_600); // with separator 
 +</code>
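
As a worked instance of the conversion above, here is a minimal sketch wrapped in a function. The name ''adTimeToUnix'' and the sample input are hypothetical (not part of the RFC), ''intdiv()'' is used in place of ''/'' so the result is always an integer, and a 64-bit build of PHP 7.4 or later is assumed.

<code php>
<?php
// Hypothetical helper, for illustration only.
function adTimeToUnix(int $adTime): int
{
    $intervalsPerSecond = 10_000_000;  // 100-nanosecond intervals in one second
    $epochOffset = 11_644_473_600;     // seconds from 1601-01-01 to 1970-01-01

    return intdiv($adTime, $intervalsPerSecond) - $epochOffset;
}

var_dump(adTimeToUnix(132_000_000_000_000_000)); // int(1555526400)
</code>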
  
-Underscores in numeric literals will be stripped out during the +Working with scientific constants:
-lexing stage, so the runtime will not be affected.+
  
 <code php> <code php>
-var_dump(1_000_000); // int(1000000)+const ASTRONOMICAL_UNIT = 149597870700; // without separator 
 + 
 +const ASTRONOMICAL_UNIT = 149_597_870_700; // with separator
 </code> </code>
  
-This RFC does not change the behavior of string to number +Separating bytes in a binary or hex literal: 
-conversion. Numeric separators are intended to improve code + 
-readability, not alter how input is processed. +<code php> 
 +0b01010100011010000110010101101111; // without separator 
 + 
 +0b01010100_01101000_01100101_01101111; // with separator 
 + 
 +0x42726F776E; // without separator 
 + 
 +0x42_72_6F_77_6E; // with separator 
 +</code> 
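
A quick check, assuming PHP 7.4 or later, suggests that the separated and unseparated forms above are the same values, and that the byte grouping makes their contents easier to recognize. Using ''pack()'' and ''hex2bin()'' here is only one way to inspect the bytes.

<code php>
<?php
// The separated and unseparated literals are identical integers.
var_dump(0b01010100_01101000_01100101_01101111 === 0b01010100011010000110010101101111); // bool(true)
var_dump(0x42_72_6F_77_6E === 0x42726F776E); // bool(true)

// Each group is one byte, so the literals decode to readable ASCII.
echo pack('N', 0b01010100_01101000_01100101_01101111), "\n"; // Theo
echo hex2bin(dechex(0x42_72_6F_77_6E)), "\n";                // Brown
</code>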
 + 
 +==== Use cases to avoid ==== 
 + 
 +It may be tempting to use integers for storing data such as phone, 
 +credit card, and social security numbers since these values appear 
 +numeric. However, this is almost always a bad idea, since such
 +numbers often have prefixes and leading digits that are significant. 
 + 
 +A good rule of thumb is that if it doesn't make sense to use 
 +mathematical operators on a value (e.g. adding it, multiplying it,
 +etc.), then an integer probably isn't the best way to store it. 
 + 
 +<code php> 
 +// don't do this: 
 +$phoneNumber = 345_6789; 
 +$creditCard = 231_6547_9081_2543; 
 +$socialSecurity = 111_11_1111; 
 +</code>
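
To illustrate why leading digits matter, the sketch below uses a hypothetical five-digit postal code that starts with zero (the value is an assumption for illustration, not from the RFC). Stored as an integer, the leading zero is either reinterpreted or lost; stored as a string, it survives.

<code php>
<?php
// A leading 0 turns an integer literal into octal, so the value changes.
var_dump(02134);         // int(1116)

// Converting from a string drops the leading zero instead.
var_dump((int) '02134'); // int(2134)

// A string keeps every digit as written.
var_dump('02134');       // string(5) "02134"
</code>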
  
-===== Will it be harder to search for numbers? =====+==== Will it be harder to search for numbers? ====
  
 A concern that has been raised is whether numeric literal separators A concern that has been raised is whether numeric literal separators
Line 118: Line 164:
 practice, this isn't problematic as long as a codebase is consistent. practice, this isn't problematic as long as a codebase is consistent.
  
-===== Prior art =====+Furthermore, separators can sometimes make it easier to find numbers. 
 +To use an earlier example, 13_500 and 135_00 could be differentiated 
 +in a find/replace. Another example would be separated bytes in a hex 
 +literal, which allows searching for a value like "_6F_" to find only 
 +the numbers containing that specific byte.
  
-Numeric literal separators are widely supported in other programming languages.+==== Should it be the role of an IDE to group digits? ====
  
-  * Ada: single, between digits [[http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4|1]] +It has been suggested that numeric literal separators aren't needed 
-  * C# (proposal for 7.0): multiple, between digits [[https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.0/digit-separators.md|2]] +for better readability, since IDEs could be updated to automatically 
-  * C++: single, between digits (single quote used as separator) [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3781.html|3]] +display large numbers in groups of three digits.
-  * Java: multiple, between digits [[https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html|4]] +
-  * JavaScript and TypeScript: single, between digits [[http://2ality.com/2018/02/numeric-separators.html|5]] +
-  * Julia: single, between digits [[https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/|6]] +
-  * Perl: single, between digits [[https://perldoc.perl.org/perldata.html#Scalar-value-constructors|7]] +
-  * Python: single, between digits [[https://www.python.org/dev/peps/pep-0515/|8]] +
-  * Ruby: single, between digits [[http://ruby-doc.org/core-2.6.3/doc/syntax/literals_rdoc.html#label-Numbers|9]] +
-  * Rust: multiple, anywhere [[https://doc.rust-lang.org/reference/tokens.html#number-literals|10]] +
-  * Swift: multiple, between digits [[https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html#ID415|11]]+
  
-===== Proposed Voting Choices =====+However, it isn't always desirable to group numbers the same way. 
 +For example, a programmer may write ''10050000'' differently 
 +depending on whether or not it represents a financial quantity stored 
 +as cents:
  
-Add numeric literal separators in PHP 7.4? Yes/No.+<code php> 
 +$total = 100_500_00; // represents $100,500.00 stored as cents
 
-===== Patches and Tests ===== +$total = 10_050_000; // represents $10,050,000 
 +</code>
  
-Pending...+Binary and hex literals may also be grouped by a varying number of 
 +digits to reflect how they are used (e.g. bits may be separated into
 +nibbles, bytes, or words). An IDE cannot do this automatically 
 +without knowing the programmer's intent for each numeric literal.
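
The point can be seen in a small sketch, assuming PHP 7.4 or later; the variable names are illustrative. The differently grouped literals are the same integers, so the grouping carries only the programmer's intent, which an IDE has no way to infer from the value.

<code php>
<?php
$totalInCents   = 100_500_00; // grouped as dollars and cents
$totalInDollars = 10_050_000; // grouped in thousands
var_dump($totalInCents === $totalInDollars); // bool(true)

// The intended meaning only appears once each value is formatted.
echo number_format($totalInCents / 100, 2), "\n"; // 100,500.00
echo number_format($totalInDollars), "\n";        // 10,050,000

// The same bits grouped by nibble and by byte.
var_dump(0b0101_0100_0110_1000 === 0b01010100_01101000); // bool(true)
</code>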
  
-===== Why resurrect this proposal? =====+==== Why resurrect this proposal? ====
  
 The [[https://wiki.php.net/rfc/number_format_separator|previous RFC]] The [[https://wiki.php.net/rfc/number_format_separator|previous RFC]]
Line 157: Line 207:
 JavaScript, and TypeScript), and a stronger case can be made for the JavaScript, and TypeScript), and a stronger case can be made for the
 feature than was made before. feature than was made before.
 +
 +==== Should I vote for this feature? ====
 +
 +Andrea Faulds summarized the considerations [[https://externals.io/email/90673/source|as follows]]:
 +
 +<blockquote>
 +This feature offers some benefit in some cases. It doesn't introduce
 +much new complexity. There's no new syntax or tokens, it just modifies
 +the form of the existing number tokens. It fits in well [with] what's
 +already there, consistently applying to all number literals. It follows
 +established convention in other languages. Its appearance at least hints
 +that values with these separators are not constants or identifiers, but
 +numbers, reducing potential for confusion. It limits its own application
 +to prevent abuse (no leading, trailing, or repeated separators). And
 +it's relatively intuitive.
 +</blockquote>
 +
 +==== Comparison to other languages ====
 +
 +Numeric literal separators are widely supported in other programming languages.
 +
 +  * Ada: single, between digits [[http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4|1]]
 +  * C#: multiple, between digits [[https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.0/digit-separators.md|2]]
 +  * C++: single, between digits (single quote used as separator) [[http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3781.html|3]]
 +  * Java: multiple, between digits [[https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html|4]]
 +  * JavaScript and TypeScript: single, between digits [[https://github.com/tc39/proposal-numeric-separator|5]]
 +  * Julia: single, between digits [[https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/|6]]
 +  * Kotlin: multiple, between digits [[https://github.com/Kotlin/KEEP/blob/master/proposals/underscores-in-numeric-literals.md|7]]
 +  * Perl: single, between digits [[https://perldoc.perl.org/perldata.html#Scalar-value-constructors|8]]
 +  * Python: single, between digits [[https://www.python.org/dev/peps/pep-0515/|9]]
 +  * Ruby: single, between digits [[http://ruby-doc.org/core-2.6.3/doc/syntax/literals_rdoc.html#label-Numbers|10]]
 +  * Rust: multiple, anywhere [[https://doc.rust-lang.org/reference/tokens.html#number-literals|11]]
 +  * Swift: multiple, between digits [[https://docs.swift.org/swift-book/ReferenceManual/LexicalStructure.html#ID415|12]]
 +
 +===== Vote =====
 +
 +Voting started 2019-05-30 and ended 2019-06-13.
 +
 +<doodle title="Support numeric literal separator in PHP 7.4?" auth="theodorejb" voteType="single" closed="true">
 +   * Yes
 +   * No
 +</doodle>
  
 ===== References ===== ===== References =====
 +
 +Request to revive RFC: https://externals.io/message/105450
  
 Discussion from previous RFC: https://externals.io/message/89925, https://externals.io/message/90626, https://marc.info/?l=php-internals&m=145320709922246&w=2. Discussion from previous RFC: https://externals.io/message/89925, https://externals.io/message/90626, https://marc.info/?l=php-internals&m=145320709922246&w=2.
  
-Previous implementation: https://github.com/php/php-src/pull/1699 and https://phpinternals.net/articles/implementing_a_digit_separator.+Blog post about original implementation: https://phpinternals.net/articles/implementing_a_digit_separator.
rfc/numeric_literal_separator.1556665290.txt.gz · Last modified: 2019/04/30 23:01 by theodorejb