rfc:uuid

Differences

This shows you the differences between two versions of the page.

rfc:uuid [2017/05/25 19:05] – Updated class links to something more persistent fleshgrinder
rfc:uuid [2017/09/22 13:28] (current) – external edit 127.0.0.1
Line 3: Line 3:
   * Date: 2017-05-25
   * Author: Richard Fussenegger, php@fleshgrinder.com
-  * Status: Under Discussion
+  * Status: Declined
   * First Published at: http://wiki.php.net/rfc/uuid
  
Line 9: Line 9:
 Universally Unique Identifiers (UUIDs, also known as Globally Unique Identifiers [GUIDs]) are 128-bit integers that guarantee uniqueness across space and time. PHP currently provides only the ''[[https://php.net/function.uniqid|uniqid]]'' function, which has many flaws, as is apparent from the many warnings on its manual page. UUIDs are the natural answer to that problem. UUIDs are also gaining traction due to emerging technologies like streaming platforms (e.g. Kafka) and event sourcing applications, where uniqueness per record is of paramount importance. Depending on a central (locking) authority increases complexity and decreases throughput of such systems.
  
-UUIDs are defined and standardized in [[https://tools.ietf.org/html/rfc4122|RFC 4122]], but where effectively used long before in many systems. The algorithms that are involved are well understood and battle tested through ubiquitous software, like Microsoft’s Windows operating system, since almost 30 years. UUIDs are mainly used to assign identifiers to entities without requiring a central authority. They are thus particularly useful in distributed systems. They also allow very high allocation rates; up to 10 million per second per machine, if necessary. Please refer to [[https://en.wikipedia.org/wiki/Universally_unique_identifier|the Wikipedia article]] for more details about UUIDs, their flaws, as well as collision probabilities.
+UUIDs are defined and standardized in [[https://tools.ietf.org/html/rfc4122|RFC 4122]], but were effectively used long before in many systems. The algorithms that are involved are well understood and battle tested through ubiquitous software, like Microsoft’s Windows operating system, since almost 30 years. UUIDs are mainly used to assign identifiers to entities without requiring a central authority. They are thus particularly useful in distributed systems. They also allow very high allocation rates; up to 10 million per second per machine, if necessary. Please refer to [[https://en.wikipedia.org/wiki/Universally_unique_identifier|the Wikipedia article]] for more details about UUIDs, their flaws, as well as collision probabilities.
  
 Most high-level programming languages provide support for UUIDs out-of-the-box. The following is a list of widely used languages and other software that provides support for UUIDs out-of-the-box:
Line 40: Line 40:
  
 ==== Why C? ====
-There is no reason why this should be implemented in C. One could argue that it is faster, which it probably is, but this is a weak argument. This RFC would propose the inclusion of UUIDs implemented in PHP if shipping of PHP code as part of the standard module of PHP would be possible.
+There is no reason why this should be implemented in C. One could argue that it is faster, which it probably is, but this is a weak argument. This RFC would propose the inclusion of UUIDs implemented in PHP if shipping of PHP code as part of the standard module of PHP would be possible. However, there is a C API included that allows other PHP modules to utilize UUIDs.
  
 === Why not PECL UUID? ===
Line 46: Line 46:
  
 ==== Why a Class? ====
-UUIDs are basically random data. There is no way for an application to distinguish between a string of 16 bytes (since strings in PHP are random bytes too) and a UUID. This problem can be minimized through the implementation of UUIDs as a class. The code that constraints a type to a UUID has the guarantee that the string is of exactly 16 bytes. The developer that constraints a type to a UUID has the guarantee that the other developer passing a value at least had to have a look at the UUID class. Of course, there is nothing that prevent the other developer from creating UUIDs that are highly predictable, as it is impossible to ensure that.
+UUIDs are basically random data. There is no way for an application to distinguish between a string of 16 bytes (since strings in PHP are random bytes too) and a UUID. This problem can be minimized through the implementation of UUIDs as a class. The code that constraints a type to a UUID has the guarantee that the string is of exactly 16 bytes. The developer that constraints a type to a UUID has the guarantee that the other developer passing a value at least had to have a look at the UUID class. Of course, there is nothing that prevents the other developer from creating UUIDs that are highly predictable, as it is impossible to ensure that she does not do that.
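
To illustrate the type-constraint argument, here is a minimal userland sketch. The class ''SketchUuid'', its methods, and the function ''storeRecord'' are hypothetical names chosen for this example; they are not the class or API proposed by the RFC. The sketch only assumes that a value object checks the length once, at construction, so that every consumer type-hinting against it can rely on exactly 16 bytes.

```php
<?php
declare(strict_types=1);

// Hypothetical illustration only; not the UUID class proposed by the RFC.
final class SketchUuid
{
    /** @var string exactly 16 raw bytes */
    private $bytes;

    // Private constructor: instances can only come from the checked factory.
    private function __construct(string $bytes)
    {
        $this->bytes = $bytes;
    }

    public static function fromBinary(string $bytes): self
    {
        // The single place where the 16-byte invariant is enforced.
        if (strlen($bytes) !== 16) {
            throw new InvalidArgumentException(
                'Expected exactly 16 bytes, got ' . strlen($bytes)
            );
        }

        return new self($bytes);
    }

    public function toBinary(): string
    {
        return $this->bytes;
    }
}

// A consumer that constrains its parameter to the class never has to
// re-check the length; a plain string cannot be passed by accident.
function storeRecord(SketchUuid $id): string
{
    return bin2hex($id->toBinary());
}
```

As the paragraph above notes, such a class cannot prevent a caller from feeding it predictable bytes; it only guarantees the shape of the value.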
  
 ==== Implementation ====
Line 139: Line 139:
 Leading whitespace (spaces '' '' and tabs ''\t'') and opening braces (''{'') are ignored, as are trailing whitespace (spaces '' '' and tabs ''\t'') and closing braces (''}''). Hyphens (''-''), regardless of position, are always ignored. The method follows the [[https://en.wikipedia.org/wiki/Robustness_principle|robustness principle]] and is not meant for validation. The hexadecimal digits ''a'' through ''f'' are parsed case-insensitively.
  
-A ''UUIDParseException'' is thrown is parsing of the input string fails.
+A ''UUIDParseException'' is thrown if parsing of the input string fails.
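
The lenient parsing rules described above can be approximated in userland as follows. This is a sketch under stated assumptions, not the RFC's C implementation: the function ''sketch_parse_uuid'' and the exception ''SketchParseException'' are hypothetical stand-ins, and edge cases such as a hyphen placed before an opening brace may be handled differently by the real parser.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the RFC's UUIDParseException.
final class SketchParseException extends Exception
{
}

/**
 * Leniently parses a textual UUID and returns its 16 raw bytes.
 */
function sketch_parse_uuid(string $input): string
{
    // Ignore leading spaces, tabs, and opening braces.
    $trimmed = ltrim($input, " \t{");
    // Ignore trailing spaces, tabs, and closing braces.
    $trimmed = rtrim($trimmed, " \t}");
    // Hyphens are ignored regardless of their position.
    $hex = str_replace('-', '', $trimmed);

    // Exactly 32 hexadecimal digits must remain; a-f match in either case.
    if (preg_match('/\A[0-9a-fA-F]{32}\z/', $hex) !== 1) {
        throw new SketchParseException('Cannot parse UUID from: ' . $input);
    }

    return hex2bin($hex);
}
```

For example, ''bin2hex(sketch_parse_uuid(" {6BA7B810-9DAD-11D1-80B4-00C04FD430C8}\t"))'' yields ''6ba7b8109dad11d180b400c04fd430c8'', while an input such as ''not-a-uuid'' raises the exception.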
  
 The named ''NamespaceDNS'', ''NamespaceOID'', ''NamespaceURL'', ''NamespaceX500'', and ''Nil'' constructors provide shortcuts for the predefined special UUIDs from RFC 4122.
Line 169: Line 169:
  
 ===== Backward Incompatible Changes =====
-Both ''UUID'' and ''UUIDParsException'' are now globally defined classes, which might collide with user defined classes of the same name in the global namespace. However, the risk of the introduction of them is considered to be very low, since the global namespace should not be used by PHP users.
+Both ''UUID'' and ''UUIDParseException'' are now globally defined classes, which might collide with user defined classes of the same name in the global namespace. However, the risk of the introduction of them is considered to be very low, since the global namespace should not be used by PHP users.
  
 ===== Proposed PHP Version(s) =====
-Next PHP 7.x
+7.3
  
 ===== RFC Impact =====
Line 191: Line 191:
  
 ===== Open Issues =====
-==== Argument Parsing ====
-The provided implementation makes use of the latest features that were added to PHP 7 and PHP 7.1, namely return type constraints. It is customary for internal PHP routines to verify that the number of parameters that were passed match exactly the signature of the routine, and to emit a warning if that count mismatches followed by returning null. The problem with this approach in combination with return type constraints is that a ''TypeError'' directly follows the emitted warning, since null is not part of that constraint.
-
-Making all return type constraints nullable (as would be possible since PHP 7.1) is a very bad idea, since it would break the promise of the routine, and every caller is required to account for null. Even if they pass the correct amount of arguments, since it becomes unclear just from looking at the signature in which situations null is actually returned. This completely defeats the purpose of a return type constraint.
-
-Another possible approach would be to simply ignore the parameters that were passed, not validating them at all. The problem with this approach is that it is different to how all the other internal PHP routines work. Nikita Popov said that doing this would require a global decision for all of PHP’s internal code.
-
-The last possible approach, and the one that is currently implemented, is it to validate the arguments and log a warning, but still return the desired correct value. This is probably the most sensible thing to do, as it ensures that the implementation does not diverge too far away from existing internal PHP routines, it notifies users of the incorrect usage via a warning, and it upholds the promise of returning e.g. a ''UUID'' instance under all circumstances.
-
-==== Namespace ====
-The discussion about namespace in PHP core is on ongoing dispute. This highly self-contained component could easily be provided from an internal namespace and thus lay the grounds for other things to come. This would require additional discussion, but possible namespaces in alphabetical order are (''UUID'' is the class):
-
-  * ''PHP\UUID''
-  * ''PHP\Core\UUID''
-  * ''PHP\Lang\UUID''
-  * ''PHP\Standard\UUID''
-  * ''PHP\STD\UUID''
-  * ''PHP\UUIDs\UUID''
-
-Other variations, based on the premise that the reserved ''PHP'' namespace should be used only for things that are directly tied to the language itself (e.g. AST):
-
-  * ''Core\UUID''
-  * ''Lang\UUID''
-  * ''Standard\UUID''
-  * ''STD\UUID''
-  * ''UUIDs\UUID''
-
-==== Final Class ====
-The ''UUID'' class is currently final, as it provides the best forward compatibility. However, this might not be the best choice for creating custom domain identifiers that are reusable across PHP libraries, e.g. a ''UserID'' that is provided by the ''FOSUserBundle'' and a ''UUID'' type constraint in Doctrine that can accept that one. However, it is possible to solve this kind of problem by simply adding a ''UUIDConvertible'' interface with a ''toUUID'' method to a library, and ensure type safe conversion through that while keeping the ''UUID'' class final. That being said, there are no real reasons other than forward compatibility to why the class should be final.
-
-==== Doxygen Documentation ====
-The provided implementation uses [[http://www.doxygen.org/|Doxygen]] documentation for the C APIs. There were concerns about this, because PHP is currently undocumented and no standard way of providing documentation is in use so far. [[http://news.php.net/php.internals/99140|A discussion on php-internals]] was started to address this issue separately.
+None
  
 ===== Unaffected PHP Functionality =====
Line 228: Line 197:
  
 ===== Future Scope =====
-  * Deprecate and then remove ''[[https://php.net/function.uniqid|uniqid]]'' and recommend the usage of UUIDs instead.
-  * Addition of a ''toURI'' method, if we have a proper URL value object.
+  * [[https://wiki.php.net/rfc/deprecate-uniqid|Deprecate and then remove uniqid and recommend the usage of UUIDs instead.]]
+  * Addition of a ''toURN'' method, if we have a proper URL value object.
  
 ===== Proposed Voting Choices =====
-Simple 50%+1 majority vote.
+Simple 50%+1 majority vote that ends on September 20, 2017.
+
+<doodle title="Add UUID value object to PHP standard module?" auth="fleshgrinder" voteType="single" closed="true">
+   * Yes
+   * No
+</doodle>
  
 ===== Patches and Tests =====
Line 245: Line 219:
  
 ===== References =====
+  * [[https://tools.ietf.org/html/rfc4122|RFC 4122]]
+  * [[https://en.wikipedia.org/wiki/Universally_unique_identifier|Wikipedia]]
   * [[http://news.php.net/php.internals/99136|php-internals discussion]]
   * [[https://www.reddit.com/r/PHP/comments/6cyqtd/rfc_uuid/|Reddit discussion]]
-  * [[https://tools.ietf.org/html/rfc4122|RFC 4122]]
-  * [[https://en.wikipedia.org/wiki/Universally_unique_identifier|Wikipedia]]
+  * [[https://twitter.com/AmbassadorAwsum/status/868097123627171842|Twitter discussion]]
+  * [[https://wiki.php.net/rfc/class-naming|Class Naming RFC]]
  
 ===== Rejected Features =====
-None so far.
+  * Doxygen Documentation ([[https://wiki.php.net/rfc/doxygen|corresponding declined RFC]])
+  * Namespaces ([[https://wiki.php.net/rfc/namespaces-in-core|corresponding withdrawn RFC]])
rfc/uuid.1495739141.txt.gz · Last modified: 2017/09/22 13:28 (external edit)