PHP RFC: Implement Current DOM Living Standard API
- Version: 0.2
- Date: 2019-04-06
- Author: Benjamin Eberlei, Thomas Weinert
- Status: Under Discussion
- First Published at:
Working with XML and HTML documents is a necessary task for many web applications. The dom extension implements a standardized API that was previously specified by the W3C in three DOM Levels. Since then the standard has evolved into a Living Standard, similar to HTML 5, that is continuously maintained by the Web Hypertext Application Technology Working Group (WHATWG).
Because the new standard provides much better traversal and manipulation methods than the old API, we propose adding the new methods to the existing ext/dom API.
Specifically, we think this is a better solution than providing them in userland, because
- ext/dom + DOMDocument represents the DOM Standard, so we should continue to support the evolving versions.
- the added methods are of great value to users and replace more complicated approaches that were previously required. The cost of searching for or re-implementing this functionality is high for users.
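To illustrate the kind of boilerplate meant here, the following sketch uses only the existing ext/dom API to prepend a node, remove a node, and count element children. The markup and variable names are hypothetical and chosen only for illustration.

```php
<?php
// Illustration only: the markup and variable names are hypothetical.
$doc = new DOMDocument();
$doc->loadXML('<list><item>b</item><item>c</item></list>');
$list = $doc->documentElement;

// Prepending requires looking up the reference child by hand.
$first = $doc->createElement('item', 'a');
$list->insertBefore($first, $list->firstChild);

// Removing a node requires going through its parent.
$last = $list->lastChild;
$last->parentNode->removeChild($last);

// Counting element children requires a manual loop over all child nodes.
$count = 0;
foreach ($list->childNodes as $child) {
    if ($child instanceof DOMElement) {
        $count++;
    }
}

echo $doc->saveXML($list), "\n";
echo $count, "\n";
```

Each of these three steps maps to a single call or property in the API proposed in this RFC.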
Follow the DOM Living Standard with ext/dom
This RFC proposes to adapt the current DOM standard changes to the PHP language by introducing new interfaces and public properties that simplify traversal and manipulation of DOM elements.
<?php

interface DOMParentNode
{
    /** Returns the previous node in the same hierarchy that is a DOMElement or NULL if there is none */
    public readonly DOMNode? $previousElementNode;

    /** Returns the next node in the same hierarchy that is a DOMElement or NULL if there is none */
    public readonly DOMNode? $nextElementNode;

    /** counts all child nodes that are DOMElements */
    public readonly int $childElementCount;

    /** appends one or many nodes to the list of children behind the last child node */
    public function append(...DOMNode|string|null $nodes) : void;

    /** prepends one or many nodes to the list of children before the first child node */
    public function prepend(...DOMNode|string|null $nodes) : void;
}

class DOMDocument implements DOMParentNode {}
class DOMElement implements DOMParentNode {}
class DOMDocumentFragment implements DOMParentNode {}

interface DOMChildNode
{
    public readonly DOMNode? $previousElementSibling;
    public readonly DOMNode? $nextElementSibling;

    /** acts as a simpler version of $element->parentNode->removeChild($element); */
    public function remove() : void;

    /** add passed node(s) before the current node */
    public function before(...DOMNode|string|null $nodes) : void;

    /** add passed node(s) after the current node */
    public function after(...DOMNode|string|null $nodes) : void;

    /** replace current node with new node(s), a combination of remove() + append() */
    public function replaceWith(...DOMNode|string|null $nodes) : void;
}

class DOMElement implements DOMChildNode {}
class DOMCharacterData implements DOMChildNode {}
class DOMDocumentType implements DOMChildNode {}
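As a usage sketch, the proposed methods could be combined as follows. This assumes an ext/dom build that includes this proposal and that the methods behave as described above and in the DOM Living Standard; the markup and variable names are hypothetical.

```php
<?php
// Sketch: requires an ext/dom build that includes the proposed methods.
$doc = new DOMDocument();
$doc->loadXML('<list><item>b</item></list>');
$list = $doc->documentElement;

// Strings are converted to text nodes; nodes are inserted as given.
$list->prepend($doc->createElement('item', 'a'));
$list->append($doc->createElement('item', 'c'), 'tail');

// Insert around, then replace, the original <item>b</item>.
$b = $list->childNodes->item(1);
$b->before($doc->createElement('item', 'a2'));
$b->after('text');
$b->replaceWith($doc->createElement('item', 'B'));

// Remove the trailing text node without going through its parent.
$list->lastChild->remove();

echo $doc->saveXML($list), "\n";
echo $list->childElementCount, "\n";
```

Text nodes created from strings are not counted by childElementCount, which only considers DOMElement children.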
Implementation choices
We deviate from the DOM Living Standard in some details, because it is written for browser/JavaScript implementations and not all of its concepts can be transferred 1:1 to PHP ext/dom.
The living standard defines DOMParentNode and DOMChildNode as “traits” or mixins and doesn't provide interfaces for them, as JavaScript has no interfaces. This may make sense for the primary target language, but for PHP it makes more sense to expose the functionality through interfaces, so that code can test for $node instanceof DOMParentNode, for example.
The living standard contains an intermediate trait (interface) DOMNonDocumentTypeChildNode that defines the previousElementSibling and nextElementSibling properties. It was introduced in the living standard to provide backwards compatibility with browser/web implementations, which are not our concern. In addition, PHP interfaces cannot declare properties, so it would be an empty interface. For these reasons the interface is not introduced, and the properties are instead declared on DOMChildNode directly.
The querySelector and querySelectorAll methods defined on the DOMParentNode interface are omitted because of their underlying complexity: they require a CSS selector parser. We recommend leaving comparable functionality to userland libraries such as PhpCss or Symfony CSS Selector.
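A minimal sketch of such a type check, assuming the DOMParentNode interface proposed in this RFC is available. countElementChildren() is a hypothetical helper and not part of the proposal.

```php
<?php
// Sketch: assumes the proposed DOMParentNode interface exists.
// countElementChildren() is a hypothetical helper, not part of the proposal.
function countElementChildren(DOMNode $node): int
{
    // Only documents, elements and fragments can have element children.
    if ($node instanceof DOMParentNode) {
        return $node->childElementCount;
    }
    return 0;
}

$doc = new DOMDocument();
$doc->loadXML('<root><a/><b/>text</root>');

$forElement = countElementChildren($doc->documentElement);          // DOMElement
$forText = countElementChildren($doc->documentElement->lastChild);  // DOMText

echo $forElement, "\n";
echo $forText, "\n";
```

With mixins alone, callers would have to check for concrete classes such as DOMElement or DOMDocumentFragment one by one.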
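Without a selector library, selector-style lookups can still be expressed with the existing DOMXPath class. The sketch below hand-translates the CSS selector "ul.menu > li" into XPath; the markup is hypothetical, and the XPath expression is a hand-written approximation, not the output of PhpCss or Symfony CSS Selector.

```php
<?php
// Illustration only: hypothetical markup.
$doc = new DOMDocument();
$doc->loadXML(
    '<div><ul class="menu"><li>Home</li><li>About</li></ul><ul><li>Other</li></ul></div>'
);

$xpath = new DOMXPath($doc);

// Approximate equivalent of the CSS selector "ul.menu > li":
// match the class token "menu" and then select direct <li> children.
$query = "//ul[contains(concat(' ', normalize-space(@class), ' '), ' menu ')]/li";

$labels = [];
foreach ($xpath->query($query) as $li) {
    $labels[] = $li->textContent;
}

echo implode(', ', $labels), "\n";
```

The length of the XPath needed for a single class selector hints at why a full CSS selector parser is considered out of scope for this RFC.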
Backward Incompatible Changes
Code using registerNodeClass to override DOM classes can be affected if those classes already implement the new functionality in a way that doesn't satisfy the behavior or signatures of the proposed changes.
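The following sketch shows a subclass that stays compatible, assuming an ext/dom build that includes the proposed remove() method. MyElement is a hypothetical userland class; the conflicting signature mentioned in the comment is likewise only an example.

```php
<?php
// Sketch: MyElement is a hypothetical userland subclass.
class MyElement extends DOMElement
{
    // Compatible with the proposed signature: no parameters, returns void.
    // A pre-existing declaration with a required parameter, for example
    // remove(bool $deep), would conflict with the proposed method.
    public function remove(): void
    {
        parent::remove();
    }
}

$doc = new DOMDocument();
$doc->registerNodeClass(DOMElement::class, MyElement::class);
$doc->loadXML('<root><child/></root>');

$child = $doc->documentElement->firstChild;
$child->remove();

echo get_class($child), "\n";
echo $doc->saveXML($doc->documentElement), "\n";
```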
Proposed PHP Version(s)
PHP 8.0
RFC Impact
No effect on SAPIs.
To Existing Extensions
The dom extension's API is changed in a mostly backwards-compatible way (only new properties and methods are added). Code breaks if it uses registerNodeClass to register child classes that already declare the new methods with a different signature.
The new functionality can be implemented entirely with the already available libxml2 data structures, so no changes to the libxml2 dependency are necessary.
To Opcache
No effect on Opcache.
Patches and Tests
This pull request is still a work in progress.
- DOM Living Standard Document