rfc:xmlreader_writer_streams

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:xmlreader_writer_streams [2024/05/17 22:52] – Static again nielsdosrfc:xmlreader_writer_streams [2024/05/27 19:42] (current) nielsdos
Line 1: Line 1:
 ====== PHP RFC: Add stream open functions to XML{Reader,Writer} ====== ====== PHP RFC: Add stream open functions to XML{Reader,Writer} ======
-  * Version: 0.10.0+  * Version: 0.11.0
   * Date: 2024-04-21   * Date: 2024-04-21
   * Author: Niels Dossche <nielsdos@php.net>   * Author: Niels Dossche <nielsdos@php.net>
Line 13: Line 13:
  
 ===== Proposal ===== ===== Proposal =====
 +
 +==== Main Proposal ====
  
 I propose to add 2 new functions, one to <php>XMLReader</php> and one to <php>XMLWriter</php>, to create an instance from a stream. Here is how they would look like: I propose to add 2 new functions, one to <php>XMLReader</php> and one to <php>XMLWriter</php>, to create an instance from a stream. Here is how they would look like:
Line 33: Line 35:
 The <php>$documentUri</php> parameter is used mostly for when libxml outputs error messages, such that you can put an origin name in there. The <php>$documentUri</php> parameter is used mostly for when libxml outputs error messages, such that you can put an origin name in there.
  
-The signature for <php>XMLWriter::toStream()</php> should be self-explanatory. It is also modeled like the other open functions, but they are considerably simpler. You'll also notice the lack of an encoding argument, and that's because this is already handled by the <php>XMLWriter::startDocument()</php> function.+The signature for <php>XMLWriter::toStream()</php> should be self-explanatory. It is also modeled like the other open functions, but it is considerably simpler. You'll also notice the lack of an encoding argument, and that's because this is already handled by the <php>XMLWriter::startDocument()</php> function.
  
 While implementing this, I found some strange behaviour regarding the <php>?string $encoding</php> parameter of the existing functions <php>XMLReader::open()</php> and <php>XMLReader::XML()</php>. The first oddity is that they emit a warning instead of throwing a ValueError when the encoding contains NULL bytes. This is inconsistent with how other functions handle it. I propose to promote this warning to a ValueError instead. The second oddity is that invalid encoding names are ignored entirely. This means that it won't emit a warning or anything, but just silently not set the encoding. This can hide bugs. I propose to also throw a ValueError in this case stating "Argument #X ($encoding) must be a valid character encoding". While implementing this, I found some strange behaviour regarding the <php>?string $encoding</php> parameter of the existing functions <php>XMLReader::open()</php> and <php>XMLReader::XML()</php>. The first oddity is that they emit a warning instead of throwing a ValueError when the encoding contains NULL bytes. This is inconsistent with how other functions handle it. I propose to promote this warning to a ValueError instead. The second oddity is that invalid encoding names are ignored entirely. This means that it won't emit a warning or anything, but just silently not set the encoding. This can hide bugs. I propose to also throw a ValueError in this case stating "Argument #X ($encoding) must be a valid character encoding".
  
 An earlier version of this RFC proposed adding <php>openStream()</php> methods to both classes, but the naming was not ideal and the behaviour of being an instance method was not liked. Therefore, this was changed to static-only methods that return an instance of the respective class. An earlier version of this RFC proposed adding <php>openStream()</php> methods to both classes, but the naming was not ideal and the behaviour of being an instance method was not liked. Therefore, this was changed to static-only methods that return an instance of the respective class.
 +
 +==== Consistency ====
 +
 +We're adding new static named constructors to the <php>XMLWriter</php> and <php>XMLReader</php> classes. However, <php>XMLWriter</php> doesn't have static constructors yet and <php>XMLReader</php> has this hybrid static methods/instance methods we talked about earlier. Those existing methods also can't be used in subclasses because they return <php>XMLWriter</php> or <php>XMLReader</php> instead of <php>static</php>.
 +
 +The idea is to add the following static named constructors for consistency with the newly proposed methods, with the same arguments as their existing counterpart:
 +  - <php>XMLReader::fromUrl(string $url, ?string $encoding = null, int $flags = 0): static</php> as a new version of <php>XMLReader::open(...)</php>
 +  - <php>XMLReader::fromString(string $source, ?string $encoding = null, int $flags = 0): static</php> as a new version of <php>XMLReader::XML(...)</php>
 +  - <php>XMLWriter::toMemory(): static</php> as a new version of <php>XMLWriter::openMemory(...)</php>
 +  - <php>XMLWriter::toUrl(string $url): static</php> as a new version of <php>XMLWriter::openUri(...)</php>
 +
 +Note I used Url here instead of Uri because that's the more accurate term: it actually locates the resource instead of just identifying it.
 +
 +This does not aim to deprecate any existing methods. We will merely update the documentation to point users towards the new constructors.
  
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
Line 43: Line 59:
 There are three minor BC breaks. There are three minor BC breaks.
  
-The first one is the fact that we're adding the <php>fromStream()</php> or <php>toStream</php> method. If a user extends the <php>XMLReader</php> or <php>XMLWriter</php> class, and their class implements a method with the same name but an incompatible signature, a compile error will occur. I analyzed the top 2500 Composer packages, and none used the toStream or fromStream names in subclasses of the XML classes.. This means that the top 2500 packages don't suffer a BC break because of this. That doesn't mean there will be none, but it gives a good indication.+The first one is the fact that we're adding new methods. If a user extends the <php>XMLReader</php> or <php>XMLWriter</php> class, and their class implements a method with the same name but an incompatible signature, a compile error will occur. I analyzed the top 2500 Composer packages, and none used any of the proposed function names in subclasses of the XML classes. This means that the top 2500 packages don't suffer a BC break because of this. That doesn't mean there will be none, but it gives a good indication.
  
 The second BC break is caused by throwing a <php>ValueError</php> on invalid encodings instead of silently ignoring invalid encodings. If we don't signal the invalid encoding in any way to the user, this can subtly hide bugs. For example, this could hide typos or silently pass invalid user input to the respective functions. Forcing developers to handle this error explicitly will result in more robust code in the end. The second BC break is caused by throwing a <php>ValueError</php> on invalid encodings instead of silently ignoring invalid encodings. If we don't signal the invalid encoding in any way to the user, this can subtly hide bugs. For example, this could hide typos or silently pass invalid user input to the respective functions. Forcing developers to handle this error explicitly will result in more robust code in the end.
  
-The third BC break is the promotion of the NULL-byte warning to a <php>ValueError</php>. This makes the <php>XMLReader</php> and <php>XMLWriter</php> class more consistent with other extensions that throw instead of issuing a warning. The migration for developers should be quite simple: instead of silencing the warning and/or checking the return value of the function, they should use a try-catch construct to handle the error.+The third BC break is the promotion of the NUL-byte warning to a <php>ValueError</php>. This makes the <php>XMLReader</php> and <php>XMLWriter</php> class more consistent with other extensions that throw instead of issuing a warning. The migration for developers should be quite simple: instead of silencing the warning and/or checking the return value of the function, they should use a try-catch construct to handle the error.
  
 ===== Example usages ===== ===== Example usages =====
Line 114: Line 130:
 ===== Proposed Voting Choices ===== ===== Proposed Voting Choices =====
  
-One primary vote requiring 2/3rd majority to accept the RFC as a whole.+Two primary votes each requiring 2/3rd majority: one for the main proposal and one for the consistency proposal.
  
 ===== Patches and Tests ===== ===== Patches and Tests =====
Line 138: Line 154:
 ===== Changelog ===== ===== Changelog =====
  
 +  - 0.11.0: Incorporate feedback about static methods
 +  - 0.10.1: Language fixes
   - 0.10.0: Static again   - 0.10.0: Static again
   - 0.9.2: Add example usages of the new APIs.   - 0.9.2: Add example usages of the new APIs.
   - 0.9.1: Made XMLReader::openStream() non-static instead such that it works with overridden classes.   - 0.9.1: Made XMLReader::openStream() non-static instead such that it works with overridden classes.
   - 0.9: Initial version under discussion   - 0.9: Initial version under discussion
rfc/xmlreader_writer_streams.1715986370.txt.gz · Last modified: 2024/05/17 22:52 by nielsdos