====== PHP RFC: ext/uri follow-up ======

  * Version: 0.1
  * Date: 2025-10-17
  * Author: Máté Kocsis, kocsismate@php.net
  * Status: Draft
  * Target version: next minor version (PHP 8.6)
  * Implementation: https://github.com/kocsismate/php-src/pull/9

===== Introduction =====

This RFC proposes various follow-up improvements to the [[rfc:url_parsing_api|URL Parsing API RFC]], extending the ''Uri\Rfc3986\Uri'' and ''Uri\WhatWg\Url'' classes with additional capabilities that came up during the discussion of the original RFC. These capabilities were not deemed essential from the get-go, so they were postponed in order not to increase the scope even further.

===== Proposal =====

The following new functionality is introduced in this proposal:

  - [[#uri_building|URI Building]]
  - [[#query_parameter_manipulation|Query Parameter Manipulation]]
  - [[#accessing_path_segments_as_an_array|Accessing Path Segments as an Array]]
  - [[#host_type_detection|Host Type Detection]]
  - [[#uri_type_detection|URI Type Detection]]
  - [[#percent-encoding_and_decoding_support|Percent-Encoding and Decoding Support]]

Each proposed feature is voted on separately and requires a 2/3 majority.

==== URI Building ====

Currently, only **already existing (and validated)** URIs can be manipulated via [[https://wiki.php.net/rfc/url_parsing_api#component_modification|wither methods]]. These calls always create a new instance so that the immutability of URIs is preserved. Even though this behavior has plenty of advantages, it has at least one disadvantage: instance creation has a performance overhead which is unnecessary in some cases. This is especially problematic if many URI components have to be modified at the same time, because a lot of objects are "wasted" through intermediate instantiations.

<code php>
$uri1 = Uri\Rfc3986\Uri::parse("http://example.com");
$uri2 = $uri1
    ->withScheme("https")
    ->withHost("example.net")
    ->withPath("/foo/bar");      // This creates 3 objects altogether!
</code>
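The cost of the intermediate instances can be made visible with a minimal sketch. It assumes a hypothetical ''SketchUri'' class (a stand-in written only for illustration, not the ext/uri implementation) whose withers clone the object, and it simply counts the clones:

```php
<?php
declare(strict_types=1);

// Hypothetical immutable value object; a stand-in for illustration, not the ext/uri class.
final class SketchUri
{
    /** Number of instances created through cloning, i.e. by wither calls. */
    public static int $clones = 0;

    public function __construct(
        private string $scheme,
        private string $host,
        private string $path = "",
    ) {}

    public function __clone()
    {
        self::$clones++;
    }

    public function withScheme(string $scheme): static
    {
        $copy = clone $this;      // the original instance stays untouched
        $copy->scheme = $scheme;
        return $copy;
    }

    public function withHost(string $host): static
    {
        $copy = clone $this;
        $copy->host = $host;
        return $copy;
    }

    public function withPath(string $path): static
    {
        $copy = clone $this;
        $copy->path = $path;
        return $copy;
    }

    public function toString(): string
    {
        return "{$this->scheme}://{$this->host}{$this->path}";
    }
}

$uri1 = new SketchUri("http", "example.com");
$uri2 = $uri1
    ->withScheme("https")
    ->withHost("example.net")
    ->withPath("/foo/bar");

echo $uri2->toString(), "\n";    // https://example.net/foo/bar
echo SketchUri::$clones, "\n";   // 3
```

Two of the three clones are discarded immediately, which is the waste the paragraph above refers to.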
Besides its suboptimal performance, another drawback of the current wither-based solution is that URI creation from scratch is currently not possible: one always has to have a valid URI first. The empty string is a valid RFC 3986 URI, which is why it may seem a good candidate for an initial URI for URI building, but unfortunately, it's not valid for WHATWG URL. And in any case, the success of some transformations depends on the current state (which is a form of temporal coupling):

<code php>
$uri1 = Uri\Rfc3986\Uri::parse("");
$uri2 = $uri1
    ->withScheme("https")
    ->withUserInfo("user:pass")     // throws Uri\InvalidUriException: Cannot set a userinfo without having a host
    ->withHost("example.com");

$uri2 = $uri1
    ->withScheme("https")
    ->withHost("example.com")
    ->withUserInfo("user:pass");    // No exception is thrown
</code>

In order to provide a more ergonomic and efficient solution for URI building, a fluent API is introduced that implements the [[https://refactoring.guru/design-patterns/builder|Builder pattern]].

<code php>
$uriBuilder = new Uri\Rfc3986\UriBuilder();
$uriBuilder
    ->setScheme("https")
    ->setUserInfo("user:pass")
    ->setHost("example.com")
    ->setPort(8080)
    ->setPath("/foo/bar")
    ->setQuery("a=1&b=2")
    ->setQueryParams(["a" => 1, "b" => 2])    // Has the same effect as the setQuery() call above
    ->setFragment("section1");

$uri = $uriBuilder->build();                  // Validation and instance creation is only done at this point

echo $uri->toRawString();                     // https://user:pass@example.com:8080/foo/bar?a=1&b=2#section1
</code>

The same works for WHATWG URL:

<code php>
$urlBuilder = new Uri\WhatWg\UrlBuilder();
$urlBuilder
    ->setScheme("https")
    ->setUserInfo("user:pass")
    ->setHost("example.com")
    ->setPort(8080)
    ->setPath("/foo/bar")
    ->setQuery("a=1&b=2")
    ->setQueryParams(["a" => 1, "b" => 2])    // Has the same effect as the setQuery() call above
    ->setFragment("section1");

$url = $urlBuilder->build();                  // Validation and instance creation is only done at this point

echo $url->toAsciiString();                   // https://user:pass@example.com:8080/foo/bar?a=1&b=2#section1
</code>

The complete class signatures to be added are the following:

<code php>
namespace Uri\Rfc3986 {
    final class UriBuilder
    {
        public function __construct() {}

        public function setScheme(?string $scheme): static {}

        public function setUsername(?string $username): static {}

        public function setPassword(?string $password): static {}

        public function setUserInfo(?string $userInfo): static {}

        public function setHost(?string $host): static {}

        public function setPath(string $path): static {}

        public function setQuery(?string $query): static {}

        public function setQueryParams(mixed $queryParams): static {}

        public function setFragment(?string $fragment): static {}

        public function build(?\Uri\Rfc3986\Uri $baseUrl = null): \Uri\Rfc3986\Uri {}
    }
}

namespace Uri\WhatWg {
    final class UrlBuilder
    {
        public function __construct() {}

        public function setScheme(?string $scheme): static {}

        public function setUsername(?string $username): static {}

        public function setPassword(?string $password): static {}

        public function setUserInfo(?string $userInfo): static {}

        public function setHost(?string $host): static {}

        public function setPath(string $path): static {}

        public function setQuery(?string $query): static {}

        public function setQueryParams(mixed $queryParams): static {}

        public function setFragment(?string $fragment): static {}

        /** @param array $errors */
        public function build(?\Uri\WhatWg\Url $baseUrl = null, &$errors = null): \Uri\WhatWg\Url {}
    }
}
</code>

=== Design considerations ===

== Builder pattern vs static factory method ==

Why is a complex Builder pattern based approach proposed instead of a much simpler [[https://refactoring.guru/design-patterns/factory-method|Factory Method]] based one? The factory method could be as simple as the following:

<code php>
namespace Uri\Rfc3986 {
    final readonly class Uri
    {
        ...

        public static function fromComponents(
            ?string $scheme = null,
            ?string $host = null,
            string $path = "",
            ?string $userInfo = null,
            ?string $queryString = null,
            ?string $fragment = null
        ) {}

        ...
    }
}

namespace Uri\WhatWg {
    final readonly class Url
    {
        ...

        public static function fromComponents(
            string $scheme,
            ?string $host = "",
            string $path = "",
            ?string $username = null,
            ?string $password = null,
            ?string $queryString = null,
            ?string $fragment = null
        ) {}

        ...
    }
}
</code>

The current RFC proposes the Builder pattern based approach because of its flexibility: it makes it possible to add more convenience methods in the future. In fact, the ''setQueryParams()'' method, which expects an array of query params instead of the query string representation, is already one such method.

  * Yes
  * No
  * Abstain

==== Query Parameter Manipulation ====

Query parameter manipulation is an integral part of URI handling. WHATWG URL even dedicates a separate section to the [[https://url.spec.whatwg.org/#interface-urlsearchparams|URLSearchParams]] class that implements advanced query parameter handling. Unfortunately, RFC 3986 doesn't have any such capability, so ultimately, both proposed classes closely follow the design of the WHATWG URL specification.
Therefore, the following classes and methods are proposed for addition:

<code php>
namespace Uri\Rfc3986 {
    final class UriQueryParams implements Countable, IteratorAggregate
    {
        public static function parse(string $queryString): \Uri\Rfc3986\UriQueryParams {}

        public static function fromArray(array $queryParams): \Uri\Rfc3986\UriQueryParams {}

        private function __construct() {}

        public function append(string $name, mixed $value): void {}

        public function delete(string $name): void {}

        public function deleteWithValue(string $name, mixed $value): bool {}

        public function has(string $name): bool {}

        public function hasWithValue(string $name, mixed $value): bool {}

        public function getFirst(string $name): mixed {}

        public function getLast(string $name): mixed {}

        public function getAll(?string $name = null): array {}

        public function getCount(): int {}

        public function set(string $name, mixed $value): void {}

        public function sort(): void {}

        public function toString(): string {}

        public function __serialize(): array {}

        public function __unserialize(array $data): void {}

        public function __debugInfo(): array {}
    }

    final readonly class Uri
    {
        ...

        public function getRawQueryParams(): ?\Uri\Rfc3986\UriQueryParams {}

        public function getQueryParams(): ?\Uri\Rfc3986\UriQueryParams {}

        #[\NoDiscard(message: "as Uri\Rfc3986\Uri::withQueryParams() does not modify the object itself")]
        public function withQueryParams(?\Uri\Rfc3986\UriQueryParams $queryParams): static {}

        ...
    }
}

namespace Uri\WhatWg {
    final class UrlQueryParams implements Countable, IteratorAggregate
    {
        public static function parse(string $queryString): \Uri\WhatWg\UrlQueryParams {}

        public static function fromArray(array $queryParams): \Uri\WhatWg\UrlQueryParams {}

        private function __construct() {}

        public function append(string $name, mixed $value): void {}

        public function delete(string $name): void {}

        public function deleteWithValue(string $name, mixed $value): void {}

        public function has(string $name): bool {}

        public function hasWithValue(string $name, string $value): bool {}

        public function getFirst(string $name): mixed {}

        public function getLast(string $name): mixed {}

        public function getAll(?string $name = null): array {}

        public function getCount(): int {}

        public function set(string $name, mixed $value): void {}

        public function sort(): void {}

        public function toString(): string {}

        public function __serialize(): array {}

        public function __unserialize(array $data): void {}

        public function __debugInfo(): array {}
    }

    final readonly class Url
    {
        ...

        public function getQueryParams(): ?\Uri\WhatWg\UrlQueryParams {}

        #[\NoDiscard(message: "as Uri\WhatWg\Url::withQueryParams() does not modify the object itself")]
        public function withQueryParams(?\Uri\WhatWg\UrlQueryParams $queryParams): static {}

        ...
    }
}
</code>

=== Construction ===

Both ''UriQueryParams'' and ''UrlQueryParams'' support two factory methods for instantiation:

  * **''parse()'' method**: It parses a query string into a list of query parameters.
  * **''fromArray()'' method**: It takes an array of query parameters and directly composes the query parameter list object based on it. It may be counter-intuitive, but a multi-dimensional array is expected (''%%[["key1" => "value1"], ["key2" => "value2"]]%%'') instead of a single array of key-value pairs (''["key1" => "value1", "key2" => "value2"]''). This is needed to support repeated query parameter names.
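The reason for the list-of-pairs shape can be checked with plain PHP arrays. The following is a minimal sketch; ''sketchJoinPairs()'' is a hypothetical helper written only for this illustration and is not part of the proposal:

```php
<?php
declare(strict_types=1);

// In an associative array a repeated key silently overwrites the earlier entry.
$flat = ["abc" => "foo", "abc" => "bar"];

// A list of single-pair arrays keeps every occurrence, in order.
$pairs = [["abc" => "foo"], ["abc" => "bar"]];

/**
 * Hypothetical helper: joins the pairs into a query string.
 * No percent-encoding is applied; this only shows that both entries survive.
 */
function sketchJoinPairs(array $pairs): string
{
    $parts = [];
    foreach ($pairs as $pair) {
        foreach ($pair as $name => $value) {
            $parts[] = $name . "=" . $value;
        }
    }

    return implode("&", $parts);
}

var_dump(count($flat));                // int(1)
var_dump($flat["abc"]);                // string(3) "bar"
echo sketchJoinPairs($pairs), "\n";    // abc=foo&abc=bar
```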
The constructor of both classes is private, and it even throws upon invocation in order to enforce the usage of the above-mentioned factory methods. Some examples of instantiation:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("abc=foo&abc=bar");    // Successful instantiation

$params = Uri\Rfc3986\UriQueryParams::fromArray(
    [
        ["abc" => "foo"],
        ["abc" => "bar"],
    ]
);                                                                 // Successful instantiation - same result as above

$params = new Uri\Rfc3986\UriQueryParams();                        // Throws an exception

$params = Uri\WhatWg\UrlQueryParams::parse("abc=foo&abc=bar");     // Successful instantiation

$params = Uri\WhatWg\UrlQueryParams::fromArray(
    [
        ["abc" => "foo"],
        ["abc" => "bar"],
    ]
);                                                                 // Successful instantiation - same result as above

$params = new Uri\WhatWg\UrlQueryParams();                         // Throws an exception
</code>

It is also possible to create a ''UriQueryParams'' or ''UrlQueryParams'' instance from a ''Uri\Rfc3986\Uri'' or a ''Uri\WhatWg\Url'' object, respectively:

<code php>
$uri = new Uri\Rfc3986\Uri("https://example.com/?foo=bar");
$params = $uri->getRawQueryParams();       // Creates a Uri\Rfc3986\UriQueryParams instance
$params = $uri->getQueryParams();          // Creates a Uri\Rfc3986\UriQueryParams instance

$url = new Uri\WhatWg\Url("https://example.com/?foo=bar");
$params = $url->getQueryParams();          // Creates a Uri\WhatWg\UrlQueryParams instance
</code>

The difference between ''Uri\Rfc3986\Uri::getRawQueryParams()'' and ''Uri\Rfc3986\Uri::getQueryParams()'' is that the former passes the "raw" (non-normalized) query string as input when instantiating ''Uri\Rfc3986\UriQueryParams''.

The ''Uri\Rfc3986\Uri::getRawQueryParams()'', ''Uri\Rfc3986\Uri::getQueryParams()'', and ''Uri\WhatWg\Url::getQueryParams()'' methods return ''null'' if the query string is missing (e.g. https://example.com/), and an empty query parameter list if the query string is empty (e.g. https://example.com/?).
<code php>
$uri = new Uri\Rfc3986\Uri("https://example.com/");
var_dump($uri->getRawQueryParams());    // null
var_dump($uri->getQueryParams());       // null

$uri = new Uri\Rfc3986\Uri("https://example.com/?");
var_dump($uri->getRawQueryParams());    // A new Uri\Rfc3986\UriQueryParams containing zero items
var_dump($uri->getQueryParams());       // A new Uri\Rfc3986\UriQueryParams containing zero items
</code>

The same example with ''Uri\WhatWg\UrlQueryParams'':

<code php>
$url = new Uri\WhatWg\Url("https://example.com/");
var_dump($url->getQueryParams());       // null

$url = new Uri\WhatWg\Url("https://example.com/?");
var_dump($url->getQueryParams());       // A new Uri\WhatWg\UrlQueryParams containing zero items
</code>

It's important to note that neither ''UriQueryParams'' nor ''UrlQueryParams'' validates the query parameters during construction. This behavior is by design, because the idea of WHATWG URL's ''URLSearchParams'' class is that it's tolerant when reading, and ''UriQueryParams'' and ''UrlQueryParams'' follow the same principle. Validation happens anyway when the serialized query parameters are written to a URI (via ''Uri\Rfc3986\Uri::withQueryParams()'' and ''Uri\WhatWg\Url::withQueryParams()'').
<code php>
$uri = new Uri\Rfc3986\Uri("https://example.com/");
$params = Uri\Rfc3986\UriQueryParams::parse("#foo=bar");    // Parses an invalid parameter name "#foo"

$uri = $uri->withQueryParams($params);                      // Throws Uri\InvalidUriException
</code>

The same example with ''Uri\WhatWg\UrlQueryParams'' works a bit differently though, due to the [[https://wiki.php.net/rfc/url_parsing_api#percent-encoding_decoding|automatic percent-encoding]] behavior of WHATWG URL:

<code php>
$url = new Uri\WhatWg\Url("https://example.com/");
$params = Uri\WhatWg\UrlQueryParams::parse("#foo=bar");     // Parses an invalid parameter name "#foo"

$url = $url->withQueryParams($params);                      // Success: the query is automatically percent-encoded to "%23foo=bar"
</code>

The factory methods cannot fail in practice: they only have memory-related failure cases which are handled by the PHP engine as a fatal error.

According to the WHATWG URL algorithm, the leading "?" character is removed during parsing. This is not the case for RFC 3986 - the leading "?" becomes part of the first query parameter name.

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("?abc=foo");
// $params internally contains the ["?abc" => "foo"] key-value pair

$params = Uri\WhatWg\UrlQueryParams::parse("?abc=foo");
// $params internally contains the ["abc" => "foo"] key-value pair
</code>

Another difference between the two classes is how they parse percent-encoded characters.
While ''UriQueryParams'' doesn't transform any of the input, ''UrlQueryParams'' percent-decodes it automatically as per the WHATWG URL specification:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo%5B%5D=b%61r");    // Percent-encoded form of "foo[]=bar"
// $params internally contains the ["foo%5B%5D" => "b%61r"] key-value pair

$params = Uri\WhatWg\UrlQueryParams::parse("foo%5B%5D=b%61r");     // Percent-encoded form of "foo[]=bar"
// $params internally contains the ["foo[]" => "bar"] key-value pair
</code>

=== Parameter Retrieval ===

To find out if a parameter exists, the ''has()'' and ''hasWithValue()'' methods can be used:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&baz=qux&baz=baz");

var_dump($params->has("baz"));                    // true
var_dump($params->has("non-existent"));           // false

var_dump($params->hasWithValue("foo", "bar"));    // true
var_dump($params->hasWithValue("foo", "baz"));    // false
</code>

The ''has()'' method returns ''true'' if there is at least one parameter in the parameter list with the given name, ''false'' otherwise. On the other hand, ''hasWithValue()'' returns ''true'' if the given name and value both match at least one parameter, otherwise it returns ''false''.

The number of query parameters can be retrieved by calling the ''getCount()'' method:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&baz=qux&baz=baz");

echo $params->getCount();                         // 3
</code>

There are also a number of methods that can return a query parameter or an array of query parameters:

  * ''getFirst()'': Retrieves the first parameter with the given name. This actually implements the [[https://url.spec.whatwg.org/#dom-urlsearchparams-get|get() method]] in the WHATWG URL specification.
  * ''getLast()'': Retrieves the last parameter with the given name. It's a custom addition to the WHATWG URL specification.
  * ''getAll()'': Retrieves either all parameters if the ''$name'' parameter is ''null'', or all parameters with the given name if the ''$name'' parameter is a ''string''.
<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz&qux=quux");

var_dump($params->getFirst("foo"));             // bar
var_dump($params->getFirst("non-existent"));    // null

var_dump($params->getLast("foo"));              // baz
var_dump($params->getLast("non-existent"));     // null

var_dump($params->getAll("foo"));               // [["foo", "bar"], ["foo", "baz"]]
var_dump($params->getAll("non-existent"));      // []
var_dump($params->getAll(null));                // [["foo", "bar"], ["foo", "baz"], ["qux", "quux"]]
var_dump($params->getAll());                    // [["foo", "bar"], ["foo", "baz"], ["qux", "quux"]]
</code>

All these methods return the natively stored values without applying any transformations. That is, neither percent-encoding nor percent-decoding happens on the input or on the output.

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo%5B%5D=b%61e");

var_dump($params->getFirst("foo%5B%5D"));       // b%61e
var_dump($params->getFirst("foo[]"));           // null

var_dump($params->getLast("foo%5B%5D"));        // b%61e
var_dump($params->getLast("foo[]"));            // null

var_dump($params->getAll("foo%5B%5D"));         // [["foo%5B%5D", "b%61e"]]
var_dump($params->getAll("foo[]"));             // []
</code>

=== Percent-Encoding and Decoding ===

''UriQueryParams'' and ''UrlQueryParams'' each have their own way of percent-encoding and decoding, which is mostly similar to the behavior of RFC 3986 URIs and WHATWG URLs, but doesn't work quite the same way. This section discusses the specific details.

''UriQueryParams'' builds upon the [[https://wiki.php.net/rfc/url_parsing_api#parser_library_choice|uriparser library]] just like RFC 3986 URIs do. Uriparser has its custom query parameter list implementation that follows [[https://datatracker.ietf.org/doc/html/rfc1866#section-8.2.1|RFC 1866]] in the absence of any clarification in RFC 3986 about how this component should be processed. According to RFC 1866, space characters are replaced by the plus character (''+'') during percent-encoding, and the rest of the reserved characters are percent-encoded as usual. Percent-decoding inverts these operations.
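PHP's existing ''urlencode()'' and ''urldecode()'' functions already follow the same form-style convention for the space character, so they can illustrate the rule. This is only an analogy: the sketch does not claim that uriparser produces byte-for-byte identical output for every input.

```php
<?php
declare(strict_types=1);

$value = "bar baz";

// Form-style encoding: the space becomes "+".
$formEncoded = urlencode($value);         // "bar+baz"

// RFC 3986-style encoding: the space becomes "%20".
$rawEncoded = rawurlencode($value);       // "bar%20baz"

// Form-style decoding turns "+" back into a space ...
$formDecoded = urldecode("bar+baz");      // "bar baz"

// ... while RFC 3986-style decoding leaves "+" untouched.
$rawDecoded = rawurldecode("bar+baz");    // "bar+baz"

// Other reserved characters are percent-encoded in the usual way.
$reserved = urlencode("a&b=c");           // "a%26b%3Dc"

var_dump($formEncoded, $rawEncoded, $formDecoded, $rawDecoded, $reserved);
```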
This behavior clearly deviates from the percent-encoding rules of the query component of RFC 3986, which allows many more characters to be present without percent-encoding (a few examples: ":", "@", "?", "/"), not to mention the difference in how the space character is handled.

On the other hand, ''UrlQueryParams'' relies on the ''URLSearchParams'' class specified by WHATWG URL, which in turn builds upon the ''application/x-www-form-urlencoded'' media type for historical reasons, albeit slightly differently from how RFC 1866 specifies it. As usual, WHATWG URL defines a [[https://url.spec.whatwg.org/#application-x-www-form-urlencoded-percent-encode-set|dedicated percent-encoding set]]:
The application/x-www-form-urlencoded percent-encode set contains all code points, except the ASCII alphanumeric, U+002A (*), U+002D (-), U+002E (.), and U+005F (_).
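The set quoted above can be expressed as a single character class. The following is a minimal sketch, assuming the input is a UTF-8 byte string; ''sketchFormSetEncode()'' is a hypothetical helper, not an ext/uri function, and it gives the space byte no special treatment:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: percent-encodes every byte that belongs to the
 * application/x-www-form-urlencoded percent-encode set, i.e. everything
 * except ASCII alphanumerics, "*", "-", "." and "_".
 */
function sketchFormSetEncode(string $bytes): string
{
    return preg_replace_callback(
        '/[^A-Za-z0-9*\-._]/',    // no "u" flag: the pattern matches single bytes
        static fn (array $m): string => sprintf("%%%02X", ord($m[0])),
        $bytes
    );
}

echo sketchFormSetEncode("a*b-c.d_e"), "\n";    // a*b-c.d_e
echo sketchFormSetEncode("foo[]"), "\n";        // foo%5B%5D
echo sketchFormSetEncode("a~b"), "\n";          // a%7Eb
```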
Also, a [[https://url.spec.whatwg.org/#urlencoded-serializing|dedicated algorithm]] for "serialization" is defined (in this context, serialization means recomposition - converting the list to a string): the space code point is percent-encoded as the plus code point (''+''), and the rest of the code points in the percent-encoding set are encoded the way WHATWG URL normally does. This behavior deviates from the percent-encoding rules of the query component of WHATWG URL, as the [[https://url.spec.whatwg.org/#query-percent-encode-set|query percent-encode set]] contains far fewer characters, and the space code point is handled differently again.

It's also important to compare how the percent-encoding rules of ''UriQueryParams'' and ''UrlQueryParams'' differ: they handle the asterisk (''*'') and the tilde (''~'') symbols differently. ''UriQueryParams'' percent-encodes the former, but ''UrlQueryParams'' doesn't; conversely, ''UriQueryParams'' doesn't percent-encode the latter, but ''UrlQueryParams'' does.

=== Recomposition ===

In order to be consistent with the design of the ''Uri\Rfc3986\Uri'' and ''Uri\WhatWg\Url'' classes, neither ''UriQueryParams'' nor ''UrlQueryParams'' has a ''%%__toString()%%'' magic method. Instead, they contain a custom ''toString()'' method that recomposes the query string from the parameters.

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz");
echo $params->toString();           // foo=bar&foo=baz

$params = Uri\WhatWg\UrlQueryParams::parse("foo=bar&foo=baz");
echo $params->toString();           // foo=bar&foo=baz
</code>

Both ''Uri\Rfc3986\UriQueryParams::toString()'' and ''Uri\WhatWg\UrlQueryParams::toString()'' automatically percent-encode the output according to the rules outlined in the [[https://wiki.php.net/rfc/uri_followup#percent-encoding_and_decoding|previous section]].
<code php>
$params = Uri\Rfc3986\UriQueryParams::fromArray([["foo[]" => "bar baz"]]);
echo $params->toString();           // foo%5B%5D=bar+baz

$params = Uri\WhatWg\UrlQueryParams::fromArray([["foo[]" => "bar baz"]]);
echo $params->toString();           // foo%5B%5D=bar+baz
</code>

Unlike ''Uri\Rfc3986\Uri'', the ''Uri\Rfc3986\UriQueryParams'' class doesn't have a ''toRawString()'' method because it would be misleading as to what exactly it does: ''toRawString()'' couldn't really provide a "raw" representation of the query string, since automatic percent-encoding must happen anyway to make the produced query string valid.

If normalization of the recomposed query string is needed, ''Uri\Rfc3986\Uri'' comes to the rescue:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=b%61r");    // Percent-encoded form of "foo=bar"

$uri = new Uri\Rfc3986\Uri("https://example.com");
$uri = $uri->withQueryParams($params);

echo $uri->getQuery();                                       // foo=bar
</code>

The above example demonstrates that query parameter normalization - which involves percent-decoding of the unnecessarily percent-encoded "a" - can still be achieved even though ''UriQueryParams'' does not have a dedicated ''toString()'' variant for it. Please keep in mind that the [[https://wiki.php.net/rfc/uri_followup#percent-encoding_and_decoding_support|last section of the proposal]] will introduce another possibility to achieve the same result.

=== Relation to the query component ===

After learning about the details of the percent-encoding and decoding behavior of ''UriQueryParams'' and ''UrlQueryParams'', it should be clarified how the new classes can interoperate with the existing ''Uri\Rfc3986\Uri'' and ''Uri\WhatWg\Url'' classes. The short answer is that they won't have 100% compatibility.
But let's see an example where things can go wrong:

<code php>
$uri = new Uri\Rfc3986\Uri("https://example.com?foo=a b");
$params = $uri->getQueryParams();

$uri = $uri->withQueryParams($params);
echo $uri->getQuery();                   // foo=a+b
</code>

The above example illustrates how the different percent-encoding mechanisms of ''Uri\Rfc3986\Uri'' and ''Uri\Rfc3986\UriQueryParams'' affect the results: the original "foo=a b" query component is percent-encoded to "foo=a+b" during the ''$uri->withQueryParams($params)'' call. That's why the workflow is not round-trippable.

''Uri\WhatWg\UrlQueryParams'' and ''Uri\WhatWg\Url'' have the very same problem, and it's even encoded in the WHATWG URL specification itself.

=== Modification ===

The ''append()'' method can be used to append a parameter to the end of the list. As usual, the same query parameter can be added multiple times:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar");

$params->append("baz", "qux");
$params->append("baz", "qaz");           // Appends "baz" twice

echo $params->toString();                // foo=bar&baz=qux&baz=qaz
</code>

Updating a parameter is possible via the ''set()'' method:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz");

$params->set("foo", "baz");              // Overwrites the first item "foo", and removes the second one
$params->set("qux", "qaz");              // Appends a new item "qux"

echo $params->toString();                // foo=baz&qux=qaz
</code>

Actually, the ''set()'' method has a hybrid behavior: if a parameter is not present in the list, then it adds it just like ''append()'' does. Otherwise, it overwrites the first item, and removes the rest of the occurrences.
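The hybrid behavior of ''set()'' described above can be sketched on a plain list of ''[name, value]'' pairs. ''sketchSet()'' is a hypothetical free function used only for illustration; the proposed method mutates the object instead of returning a new list:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring the described set() semantics:
 * overwrite the first occurrence, drop later ones, append when absent.
 *
 * @param list<array{0: string, 1: string}> $pairs
 * @return list<array{0: string, 1: string}>
 */
function sketchSet(array $pairs, string $name, string $value): array
{
    $result = [];
    $replaced = false;

    foreach ($pairs as [$currentName, $currentValue]) {
        if ($currentName !== $name) {
            $result[] = [$currentName, $currentValue];    // unrelated parameter: keep as is
            continue;
        }

        if (!$replaced) {
            $result[] = [$name, $value];                  // first occurrence: overwrite in place
            $replaced = true;
        }
        // Any later occurrence of the same name is dropped.
    }

    if (!$replaced) {
        $result[] = [$name, $value];                      // not present: behave like append()
    }

    return $result;
}

$pairs = [["foo", "bar"], ["foo", "baz"]];
$pairs = sketchSet($pairs, "foo", "baz");
$pairs = sketchSet($pairs, "qux", "qaz");

var_dump($pairs === [["foo", "baz"], ["qux", "qaz"]]);    // bool(true)
```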
Removing parameters is possible via either the ''delete()'' or the ''deleteWithValue()'' method: the former removes all occurrences of the given parameter name, while the latter removes all occurrences of a parameter if the given name and value both match it, as demonstrated below:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz&foo=qux");

$params->deleteWithValue("foo", "baz");  // Deletes the "foo=baz" parameter
$params->delete("foo");                  // Deletes the rest of the occurrences: "foo=bar" and "foo=qux"
$params->delete("non-existent");         // The parameter is not present: nothing happens
</code>

The last method that can modify the list is ''sort()'', which sorts the parameters alphabetically:

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&baz=qux&baz=baz");

$params->sort();

echo $params->toString();                // baz=baz&baz=qux&foo=bar
</code>

None of these methods do any percent-encoding or decoding. This was never in question for RFC 3986, but WHATWG URL usually does some kind of automatic post-processing.

<code php>
$params = Uri\WhatWg\UrlQueryParams::parse("");

$params->append("foo%5B%5D", "ab%63");   // Percent-encoded form of "foo[]=abc"
$params->set("bar%5B%5D", "de%66");      // Percent-encoded form of "bar[]=def"

echo $params->toString();                // foo%5B%5D=ab%63&bar%5B%5D=de%66
</code>

As can be seen, the percent-encoded octets received from the input remain the same in the output.

=== Type support ===

What's also important to clarify is how non-string values are mapped. PHP's [[https://www.php.net/manual/en/function.http-build-query.php|http_build_query()]] function can map basically any type to query params; however, the exact behavior is not specified by either RFC 3986 or WHATWG URL: RFC 3986 completely omits any information about how query parameters should be built, while WHATWG URL's ''URLSearchParams'' only accepts and returns string data.
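For reference, this is how the existing ''http_build_query()'' function treats a few scalar types today. The parameter names are arbitrary, and the separator is passed explicitly so that the result does not depend on the ''arg_separator.output'' INI setting:

```php
<?php
declare(strict_types=1);

$query = http_build_query(
    [
        "param_null"  => null,     // omitted from the output entirely
        "param_true"  => true,     // becomes "1"
        "param_false" => false,    // becomes "0"
        "param_int"   => 123,      // becomes "123"
    ],
    "",
    "&"
);

echo $query, "\n";    // param_true=1&param_false=0&param_int=123
```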
The position of this RFC is that it's important to follow the road that ''http_build_query()'' has already paved, because it offers a better developer experience and better interoperability with the existing ecosystem. That's why the following type mapping behavior is proposed **when a query parameter is added/updated**:

  * **bool:** becomes string "0" (in case of ''false'') or string "1" (in case of ''true'')
  * **int:** becomes a numeric string (123 -> "123")
  * **float:** becomes a decimal string (3.14 -> "3.14")
  * **resource:** invalid mapping, throws a ''TypeError''
  * **array:** TBD
  * **object:** TBD

The above conversion rules work for both ''UriQueryParams'' and ''UrlQueryParams''. However, ''Uri\Rfc3986\UriQueryParams'' can additionally handle ''null'' values properly: a ''null'' input is mapped to a query component so that only the parameter name is present - the "=" and the parameter value are omitted. On the other hand, ''Uri\WhatWg\UrlQueryParams'' converts ''null'' values to an empty string. Alternatively, it could omit parameters with ''null'' values completely, the same way as ''http_build_query()'' does.
<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("");

$params->append("param_null", null);
$params->append("param_bool", true);
$params->append("param_int", 123);
$params->append("param_float", 3.14);

var_dump($params->getFirst("param_null"));     // NULL
var_dump($params->getFirst("param_bool"));     // string(1) "1"
var_dump($params->getFirst("param_int"));      // string(3) "123"
var_dump($params->getFirst("param_float"));    // string(4) "3.14"

echo $params->toString();                      // param_null&param_bool=1&param_int=123&param_float=3.14
</code>

Note how ''UrlQueryParams'' works differently with regard to ''null'' values:

<code php>
$params = Uri\WhatWg\UrlQueryParams::parse("");

$params->append("param_null", null);
$params->append("param_bool", true);
$params->append("param_int", 123);
$params->append("param_float", 3.14);

var_dump($params->getFirst("param_null"));     // string(0) ""
var_dump($params->getFirst("param_bool"));     // string(1) "1"
var_dump($params->getFirst("param_int"));      // string(3) "123"
var_dump($params->getFirst("param_float"));    // string(4) "3.14"

echo $params->toString();                      // param_null=&param_bool=1&param_int=123&param_float=3.14
</code>

Exact array and object casting rules are still to be decided.

=== Implemented Interfaces ===

The ''UriQueryParams'' and ''UrlQueryParams'' classes could implement the ''IteratorAggregate'' interface in theory. However, it's not possible to do so due to query components that share the same name, e.g.: ''param=foo&param=bar&param=baz''. In this case, the same key (''param'') would be repeated 3 times - and this is actually not possible to support with iterators.

=== Cloning ===

Cloning of ''UriQueryParams'' and ''UrlQueryParams'' is supported.
<code php>
$params1 = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz");
$params2 = clone $params1;

echo $params1->toString();          // foo=bar&foo=baz
echo $params2->toString();          // foo=bar&foo=baz
</code>

''UrlQueryParams'' works the same way:

<code php>
$params1 = Uri\WhatWg\UrlQueryParams::parse("foo=bar&foo=baz");
$params2 = clone $params1;

echo $params1->toString();          // foo=bar&foo=baz
echo $params2->toString();          // foo=bar&foo=baz
</code>

=== Serialization ===

Both classes are serializable and deserializable. The only implementation gotcha is that the serialized format is slightly unexpected: instead of recomposing the query params into a query string, the individual key-value pairs are serialized as an array. This is necessary because both ''toString()'' implementations automatically percent-encode the input, so using these algorithms would skew the original data, not to mention the fact that ''Uri\WhatWg\UrlQueryParams::parse()'' performs automatic percent-decoding too.

=== Debugging ===

Both classes contain a ''%%__debugInfo()%%'' method that returns all items in the query parameter list in order to make debugging easier.

<code php>
$params = Uri\Rfc3986\UriQueryParams::parse("foo=bar&foo=baz&foo=qux");

var_dump($params);

/*
object(Uri\Rfc3986\UriQueryParams)#1 (1) {
  ["params"]=>
  array(3) {
    [0]=>
    array(1) {
      ["foo"]=>
      string(3) "bar"
    }
    [1]=>
    array(1) {
      ["foo"]=>
      string(3) "baz"
    }
    [2]=>
    array(1) {
      ["foo"]=>
      string(3) "qux"
    }
  }
}
*/

$params = Uri\WhatWg\UrlQueryParams::parse("foo=bar&foo=baz&foo=qux");

var_dump($params);

/*
object(Uri\WhatWg\UrlQueryParams)#1 (1) {
  ["params"]=>
  array(3) {
    [0]=>
    array(1) {
      ["foo"]=>
      string(3) "bar"
    }
    [1]=>
    array(1) {
      ["foo"]=>
      string(3) "baz"
    }
    [2]=>
    array(1) {
      ["foo"]=>
      string(3) "qux"
    }
  }
}
*/
</code>

=== Vote ===

  * Yes
  * No
  * Abstain

==== Accessing Path Segments as an Array ====

Sometimes, accessing path segments rather than the whole path as a string is needed.
When this is the case, splitting the path into segments manually after retrieval is both inconvenient and disadvantageous performance-wise, especially considering the fact that ''Uri\Rfc3986\Uri'' internally stores the path as a list of segments. In order to better support the related use-cases, the following methods are proposed to be added:

<code php>
namespace Uri\Rfc3986 {
    final readonly class Uri
    {
        ...

        public function getRawPathSegments(): ?array {}

        public function getPathSegments(): ?array {}

        #[\NoDiscard(message: "as Uri\Rfc3986\Uri::withPathSegments() does not modify the object itself")]
        public function withPathSegments(array $segments): static {}

        ...
    }
}

namespace Uri\WhatWg {
    final readonly class Url
    {
        ...

        public function getPathSegments(): array {}

        #[\NoDiscard(message: "as Uri\WhatWg\Url::withPathSegments() does not modify the object itself")]
        public function withPathSegments(array $segments): static {}

        ...
    }
}
</code>

This way, it is possible to write the following code:

<code php>
$uri = new Uri\Rfc3986\Uri("https://example.com/foo/bar/baz");
$segments = $uri->getPathSegments();          // ["foo", "bar", "baz"]

$uri = $uri->withPathSegments(["a", "b"]);
echo $uri->getPath();                         // /a/b
</code>

The same for WHATWG URL:

<code php>
$url = new Uri\WhatWg\Url("https://example.com/foo/bar/baz");
$segments = $url->getPathSegments();          // ["foo", "bar", "baz"]

$url = $url->withPathSegments(["a", "b"]);
echo $url->getPath();                         // /a/b
</code>

The getter methods return ''null'' if the path is empty (https://example.com), an empty array when the path consists of a single slash (https://example.com/), and a non-empty array otherwise.

''Uri\Rfc3986\Uri::withPathSegments()'' and ''Uri\WhatWg\Url::withPathSegments()'' internally concatenate the input segments separated by a ''/'' character, and then trigger ''Uri\Rfc3986\Uri::withPath()'' and ''Uri\WhatWg\Url::withPath()'', respectively.
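The getter and wither semantics described above can be approximated with plain string functions. This is a minimal sketch with hypothetical helper names; in particular, adding a leading slash when joining is an assumption made so that the result matches the ''/a/b'' output of the examples:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: null for an empty path, an empty array for "/",
 * and the list of segments otherwise.
 */
function sketchPathToSegments(string $path): ?array
{
    if ($path === "") {
        return null;
    }

    if ($path === "/") {
        return [];
    }

    // Drop a single leading slash so that it does not produce an empty first segment.
    $withoutLeadingSlash = str_starts_with($path, "/") ? substr($path, 1) : $path;

    return explode("/", $withoutLeadingSlash);
}

/**
 * Hypothetical helper: concatenates the segments with "/" (leading slash assumed).
 */
function sketchSegmentsToPath(array $segments): string
{
    return "/" . implode("/", $segments);
}

var_dump(sketchPathToSegments("/foo/bar/baz") === ["foo", "bar", "baz"]);    // bool(true)
var_dump(sketchPathToSegments("/"));                                         // array(0) {}
var_dump(sketchPathToSegments(""));                                          // NULL
echo sketchSegmentsToPath(["a", "b"]), "\n";                                 // /a/b
```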
* Yes * No * Abstain ==== Host Type Detection ==== Both the RFC 3986 and WHATWG URL specifications distinguish different types of the host component because each of them has different parsing and formatting rules. Probably the most notable example is the IPv6 host type that requires the IPv6 address to be written between a ''['' and '']'' pair. In order to support returning information about the host type, the following enums and methods are proposed to be added: namespace Uri\Rfc3986 { enum UriHostType { case IPv4; case IPv6; case IPvFuture; case RegisteredName; } final readonly class Uri { ... public function getHostType(): ?\Uri\Rfc3986\UriHostType {} ... } } namespace Uri\WhatWg { enum UrlHostType { case IPv4; case IPv6; case Domain; case Opaque; case Empty; } final readonly class Url { ... public function getHostType(): ?\Uri\WhatWg\UrlHostType {} ... } } The new ''getHostType()'' methods return the type of the host component for both specifications: $uri = new Uri\Rfc3986\Uri("https://192.168.0.1/"); var_dump($uri->getHostType()); // UriHostType::IPv4 $uri = new Uri\Rfc3986\Uri("https://[2001:db8::1]/"); var_dump($uri->getHostType()); // UriHostType::IPv6 $uri = new Uri\Rfc3986\Uri("https://[v1.1.2.3]/"); var_dump($uri->getHostType()); // UriHostType::IPvFuture $uri = new Uri\Rfc3986\Uri("https://example.com/"); var_dump($uri->getHostType()); // UriHostType::RegisteredName The same for WHATWG URL: $url = new Uri\WhatWg\Url("https://192.168.0.1/"); var_dump($url->getHostType()); // UrlHostType::IPv4 $url = new Uri\WhatWg\Url("https://[2001:db8::1]/"); var_dump($url->getHostType()); // UrlHostType::IPv6 $url = new Uri\WhatWg\Url("https://example.com/"); var_dump($url->getHostType()); // UrlHostType::Domain $url = new Uri\WhatWg\Url("scheme://example.com/"); var_dump($url->getHostType()); // UrlHostType::Opaque $url = new Uri\WhatWg\Url("mailto://john.doe@example.com"); var_dump($url->getHostType()); // UrlHostType::Empty * Yes * No * Abstain ==== URI Type Detection ==== RFC 3986 distinguishes different URI "types" based on what they begin with.
* **Relative-reference:** Starts with a path, and the scheme is therefore omitted. Relative-references can be further grouped into the following types: * **Absolute-path reference:** Starts with a single slash ("**/**"), e.g.: "/foo" * **Relative-path reference:** Starts without a slash ("**/**"), e.g.: "foo" * **Network-path reference:** Starts with a double slash ("**%%//%%**") followed by an authority, e.g.: ''%%//host/foo%%'' * **URI:** Starts with the scheme component, and then continues with either the authority, or the path. In order to better support granular RFC 3986 URI type detection, the following enums and methods are proposed to be added: namespace Uri\Rfc3986 { enum UriType { case AbsolutePathReference; case RelativePathReference; case NetworkPathReference; case Uri; } final readonly class Uri { ... public function getUriType(): \Uri\Rfc3986\UriType {} ... } } This way, it becomes easier to detect the URI type: $uri = new Uri\Rfc3986\Uri("https://example.com"); var_dump($uri->getUriType()); // Uri\Rfc3986\UriType::Uri $uri = new Uri\Rfc3986\Uri("/foo"); var_dump($uri->getUriType()); // Uri\Rfc3986\UriType::AbsolutePathReference $uri = new Uri\Rfc3986\Uri("foo"); var_dump($uri->getUriType()); // Uri\Rfc3986\UriType::RelativePathReference $uri = new Uri\Rfc3986\Uri("//host.com/foo"); var_dump($uri->getUriType()); // Uri\Rfc3986\UriType::NetworkPathReference The WHATWG URL specification defines some special schemes (''http'', ''https'', ''ftp'', ''file'', ''ws'', ''wss''), which have distinct parsing and serialization rules. In order to make checks for special URLs easier to perform, a new ''Uri\WhatWg\Url::isSpecial()'' method is added: namespace Uri\WhatWg { final readonly class Url { ... public function isSpecial(): bool {} ... } } This enables low-level control for applications that need to mirror WHATWG behaviors in parsing or normalization.
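For readers who want to experiment before the proposed methods exist, both classification rules from this section can be approximated in userland. This is a rough sketch rather than the implementation: ''SketchUriType'', ''classify_reference()'' and ''is_special_scheme()'' are hypothetical names, and the code only inspects the beginning of the input without validating the rest of it.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for the proposed Uri\Rfc3986\UriType enum.
enum SketchUriType
{
    case AbsolutePathReference;
    case RelativePathReference;
    case NetworkPathReference;
    case Uri;
}

/** Classifies an RFC 3986 URI reference by what it begins with (no validation). */
function classify_reference(string $input): SketchUriType
{
    // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
    if (preg_match('/^[A-Za-z][A-Za-z0-9+.\-]*:/', $input) === 1) {
        return SketchUriType::Uri;
    }
    if (str_starts_with($input, '//')) {
        return SketchUriType::NetworkPathReference;
    }
    if (str_starts_with($input, '/')) {
        return SketchUriType::AbsolutePathReference;
    }

    return SketchUriType::RelativePathReference;
}

/** Checks a scheme against the WHATWG special schemes listed in the text. */
function is_special_scheme(string $scheme): bool
{
    return in_array(strtolower($scheme), ['http', 'https', 'ftp', 'file', 'ws', 'wss'], true);
}

var_dump(classify_reference('//host.com/foo') === SketchUriType::NetworkPathReference); // bool(true)
var_dump(is_special_scheme('HTTPS'));  // bool(true)
var_dump(is_special_scheme('custom')); // bool(false)
```

Checking for a scheme first matters: an input such as ''foo:bar'' is a URI with the scheme ''foo'', not a relative-path reference.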
$url = new Uri\WhatWg\Url("https://example.com"); var_dump($url->isSpecial()); // true $url = new Uri\WhatWg\Url("custom:example"); var_dump($url->isSpecial()); // false * Yes * No * Abstain ==== Percent-Encoding and Decoding Support ==== Contrary to the common belief, which is probably reinforced by the ''urlencode()'' and ''urldecode()'' functions, percent-encoding and decoding are [[https://wiki.php.net/rfc/url_parsing_api#percent-encoding_decoding|both context-sensitive processes]]. Context-sensitivity means that different characters need to be percent-encoded/percent-decoded depending on which URI component is being processed.
It should also be mentioned that ''urlencode()'' and ''urldecode()'' are in fact meant for the ''application/x-www-form-urlencoded'' media type, while ''rawurlencode()'' and ''rawurldecode()'' more closely implement RFC 3986.
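The difference between the two existing function pairs is easiest to see on a space and a plus sign. The snippet below only uses functions that already exist in PHP; the input string is an arbitrary illustration.

```php
<?php
declare(strict_types=1);

$input = "a b+c";

// Form encoding: the space becomes "+", so a literal "+" has to be escaped.
echo urlencode($input), "\n";    // a+b%2Bc

// RFC 3986 style: the space becomes "%20".
echo rawurlencode($input), "\n"; // a%20b%2Bc

// Decoding mirrors this: only urldecode() turns "+" back into a space.
echo urldecode("a+b"), "\n";     // a b
echo rawurldecode("a+b"), "\n";  // a+b
```

Neither pair knows which URI component it is working on, which is the gap the rest of this section addresses.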
For example, the path component assigns special meaning to the ''/'' character. Therefore, this character doesn't necessarily have to be percent-encoded in the path component. There are some cases though when it makes sense to percent-encode it, as highlighted by the [[https://wiki.php.net/rfc/url_parsing_api#advanced_examples|first example]] within the "Advanced examples" section of the original URI RFC. Unfortunately, ''rawurlencode()'' doesn't take the component into account, and replaces the "/" with "%2F" unconditionally. echo rawurlencode("/foo/bar/baz"); // %2Ffoo%2Fbar%2Fbaz In order to correctly handle percent-encoding and decoding based on the rules of RFC 3986 and WHATWG URL, the following methods and enums are proposed to be added: namespace Uri\Rfc3986 { enum UriPercentEncodingMode { case UserInfo; case Host; case RelativeReferencePath; case RelativeReferenceFirstPathSegment; case Path; case PathSegment; case Query; case FormQuery; case Fragment; case AllReservedCharacters; case All; } final readonly class Uri { ... public static function percentEncode(string $input, \Uri\Rfc3986\UriPercentEncodingMode $mode): string {} public static function percentDecode(string $input, \Uri\Rfc3986\UriPercentEncodingMode $mode): string {} ... } } namespace Uri\WhatWg { enum UrlPercentEncodingMode { case UserInfo; case Host; case OpaqueHost; case Path; case PathSegment; case OpaquePath; case OpaquePathSegment; case Query; case SpecialQuery; case FormQuery; case Fragment; } final readonly class Url { ... public static function percentEncode(string $input, \Uri\WhatWg\UrlPercentEncodingMode $mode): string {} public static function percentDecode(string $input, \Uri\WhatWg\UrlPercentEncodingMode $mode): string {} ... } } The ''percentEncode()'' and ''percentDecode()'' methods both require an input string and a ''PercentEncodingMode'' enum to be passed. The enums make the context of the encoding/decoding processes fully explicit and clear.
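To make the idea of a mode-driven encoder more tangible, here is a minimal userland sketch for three RFC 3986 contexts. It is not the proposed implementation: ''SketchMode'', ''allowed_extra()'' and ''sketch_percent_encode()'' are hypothetical names, and the character sets are the ones RFC 3986 defines for paths, path segments and queries (unreserved characters, sub-delimiters, plus a few component-specific characters).

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a few cases of the proposed UriPercentEncodingMode enum.
enum SketchMode
{
    case Path;
    case PathSegment;
    case Query;
}

/** Characters allowed in addition to unreserved characters and sub-delimiters. */
function allowed_extra(SketchMode $mode): string
{
    return match ($mode) {
        SketchMode::Path        => '/:@',
        SketchMode::PathSegment => ':@',
        SketchMode::Query       => ':@/?',
    };
}

function sketch_percent_encode(string $input, SketchMode $mode): string
{
    $allowed = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
        . '-._~'             // remaining unreserved characters
        . '!$&\'()*+,;='     // sub-delimiters
        . allowed_extra($mode);

    $out = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $byte = $input[$i];
        $hex = substr($input, $i + 1, 2);
        // Leave an already percent-encoded octet ("%" + two hex digits) untouched.
        if ($byte === '%' && strlen($hex) === 2 && ctype_xdigit($hex)) {
            $out .= '%' . $hex;
            $i += 2;
            continue;
        }
        // Copy allowed bytes as-is, percent-encode everything else.
        $out .= str_contains($allowed, $byte) ? $byte : sprintf('%%%02X', ord($byte));
    }

    return $out;
}

echo rawurlencode('/foo/bar baz'), "\n";                                 // %2Ffoo%2Fbar%20baz
echo sketch_percent_encode('/foo/bar baz', SketchMode::Path), "\n";      // /foo/bar%20baz
echo sketch_percent_encode('bar/baz', SketchMode::PathSegment), "\n";    // bar%2Fbaz
```

The same input byte is treated differently depending on the mode: ''/'' survives in ''Path'' but is encoded in ''PathSegment'', which is exactly the distinction ''rawurlencode()'' cannot express.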
The following modes are supported: * **Uri\Rfc3986\UriPercentEncodingMode** * **UserInfo:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**:**". Any other characters are percent-encoded. * **Host:** If the input string is a valid IPv4, an IPv6 or an IPvFuture address, no percent-encoding is performed, since these host types do not support the process. Otherwise (for registered names), [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]] are allowed to be present. Any other characters are percent-encoded. * **AbsolutePathReferenceFirstSegment:** The first segment of absolute-path references cannot start with "**%%//%%**" characters (e.g. ''%%//foo%%''), otherwise the path [[https://datatracker.ietf.org/doc/html/rfc3986#section-4.2|would be confusable]] with a network-path reference. Therefore, besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**:**", "**@**". Any other characters are percent-encoded. * **RelativePathReferenceFirstSegment:** The first segment of relative-path references cannot contain a "**:**" character (e.g. ''this:that''), otherwise the path [[https://datatracker.ietf.org/doc/html/rfc3986#section-4.2|would be confusable]] with a scheme name. 
Therefore, besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**@**". Any other characters are percent-encoded. * **RelativeReferencePath:** * **Path:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**/**", "**:**", "**@**". Any other characters are percent-encoded. * **PathSegment:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**:**", "**@**". Any other characters are percent-encoded. * **Query:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**:**", "**@**", "**/**", and "**?**". Any other characters are percent-encoded. * **FormQuery:** It is mostly the same as ''Uri\Rfc3986\UriPercentEncodingMode::Query'', but it behaves according to the ''application/x-www-form-urlencoded'' media type rather than RFC 3986. The only difference between the two is that " " is encoded as "**+**".
* **Fragment:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]], [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], as well as [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|sub-delimiters]], it also allows the following characters to be present: "**:**", "**@**", "**/**", and "**?**". Any other characters are percent-encoded. * **AllReservedCharacters:** All [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.2|reserved characters]] are percent-encoded. The rest of the characters are left as-is. * **AllButUnreservedCharacters:** Besides [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.3|unreserved characters]] and [[https://datatracker.ietf.org/doc/html/rfc3986#section-2.1|percent-encoded octets]], all other characters are percent-encoded. For the complete ABNF syntax of each component, consult [[https://datatracker.ietf.org/doc/html/rfc3986#appendix-A|Appendix A]] of RFC 3986. * **Uri\WhatWg\UrlPercentEncodingMode** * **UserInfo:** Besides the code points percent-encoded by ''Uri\WhatWg\UrlPercentEncodingMode::Path'', the following code points are percent-encoded: U+002F (**/**), U+003A (**:**), U+003B (**;**), U+003D (**=**), U+0040 (**@**), U+005B (**[**) to U+005D (**]**), inclusive, and U+007C (**|**). * **OpaqueHost:** [[https://infra.spec.whatwg.org/#c0-control|Control characters]], and all [[https://url.spec.whatwg.org/#c0-control-percent-encode-set|code points greater than ~]] are percent-encoded. * **Path:** Besides the code points percent-encoded by ''Uri\WhatWg\UrlPercentEncodingMode::Query'', the following code points are percent-encoded: U+003F (**?**), U+005E (**^**), U+0060 (**`**), U+007B (**{**), and U+007D (**}**).
* **PathSegment:** Besides the code points percent-encoded by ''Uri\WhatWg\UrlPercentEncodingMode::Query'', the following code points are percent-encoded: U+003F (**?**), U+005E (**^**), U+0060 (**`**), U+007B (**{**), U+007D (**}**), and U+002F (**/**). * **OpaquePathSegment:** * **Query:** Besides [[https://infra.spec.whatwg.org/#c0-control|Control characters]], and all [[https://url.spec.whatwg.org/#c0-control-percent-encode-set|code points greater than ~]], the following code points are percent-encoded: U+0020 SPACE, U+0022 (**"**), U+0023 (**#**), U+003C (**<**), and U+003E (**>**). * **SpecialQuery:** Besides the code points percent-encoded by ''Uri\WhatWg\UrlPercentEncodingMode::Query'', the following code points are percent-encoded: U+0027 (**'**) * **FormQuery:** Besides the code points percent-encoded by ''Uri\WhatWg\UrlPercentEncodingMode::UserInfo'', the following code points are percent-encoded: U+0024 (**$**) to U+0026 (**&**), inclusive, U+002B (**+**), U+002C (**,**), U+0021 (**!**), U+0027 (**'**) to U+0029 RIGHT PARENTHESIS, inclusive, and U+007E (**~**). * **Fragment:** Besides [[https://infra.spec.whatwg.org/#c0-control|Control characters]], and all [[https://url.spec.whatwg.org/#c0-control-percent-encode-set|code points greater than ~]], the following code points are percent-encoded: U+0020 SPACE, U+0022 (**"**), U+003C (**<**), U+003E (**>**), and U+0060 (**`**). Since neither RFC 3986 nor WHATWG URL supports percent-encoded characters inside the scheme component, none of the enums contain a ''Scheme'' case. WHATWG URL automatically percent-decodes the host when [[https://wiki.php.net/rfc/uri_followup#determining_if_the_whatwg_url_is_special|it's special]], so ''Uri\WhatWg\UrlPercentEncodingMode'' doesn't contain a ''Host'' case. The ''percentDecode()'' methods perform the inverse operation of ''percentEncode()'': they decode every percent-encoded octet whose character is otherwise allowed by the current percent-encoding mode.
$uri = new Uri\Rfc3986\Uri("https://example.com#_%40%2F"); // The fragment is the percent-encoded form of "_@/" echo Uri\Rfc3986\Uri::percentDecode( $uri->getFragment(), Uri\Rfc3986\UriPercentEncodingMode::Fragment ); // _%40/ The "/" character is allowed in the fragment, so it's needlessly percent-encoded in the URI - that's why it can be percent-decoded by ''percentDecode()''. On the other hand, "@" is not supported in the context of the fragment, so it's kept in the percent-encoded octet form. RFC 3986 has a sentence that apparently contradicts the behavior of ''Uri\Rfc3986\Uri::percentDecode()'': > Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI. According to this rule, reserved characters - even if they are allowed in the context of a component - should not be percent-decoded during normalization. Even though the ''Uri\Rfc3986\Uri'' getters respect this rule, the ''percentDecode()'' method intentionally disregards it so that it can serve in use-cases where those getters cannot. Let's see an example: $uri = new Uri\Rfc3986\Uri("https://example.com/?q=%3A%29"); // The query is the percent-encoded form of ":)" echo $uri->getQuery(); // %3A%29 echo Uri\Rfc3986\Uri::percentDecode( $uri->getQuery(), Uri\Rfc3986\UriPercentEncodingMode::Query ); // :) As can be seen above, the ''getQuery()'' getter only normalizes the "%20" percent-encoded octet, and it leaves the two reserved characters - ":" and ")" - as-is, even though ")" is allowed in the context of the query (so it shouldn't be percent-encoded at all). By using ''percentDecode()'' one can make the input consumable directly, and scheme-specific or producer-specific algorithms should continue to use the getters should they need to do any kind of custom processing.
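The decoding rule described above (decode a percent-encoded octet only when the resulting character is allowed in the given context) can be sketched in userland as follows. ''sketch_percent_decode()'' is a hypothetical helper, not the proposed API; instead of an enum it takes the component-specific extra characters as a plain string, which is an assumption made to keep the example short.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decodes only those percent-encoded octets whose
 * character is allowed in the target context. $extraAllowed lists the
 * characters allowed besides unreserved characters and sub-delimiters.
 */
function sketch_percent_decode(string $input, string $extraAllowed): string
{
    $allowed = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
        . '-._~'             // remaining unreserved characters
        . '!$&\'()*+,;='     // sub-delimiters
        . $extraAllowed;

    $result = preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',
        static function (array $match) use ($allowed): string {
            $char = chr((int) hexdec($match[1]));
            // Keep the octet encoded when the character is not allowed here.
            return str_contains($allowed, $char) ? $char : $match[0];
        },
        $input
    );

    return $result ?? $input;
}

// Query context: ":" and ")" are both allowed, so both octets are decoded.
echo sketch_percent_decode('%3A%29', ':@/?'), "\n";   // :)

// Path segment context: neither the space nor "/" is allowed, so nothing changes.
echo sketch_percent_decode('a%20b%2Fc', ':@'), "\n";  // a%20b%2Fc
```

Because disallowed characters stay encoded, the output can be placed back into the same component without changing how the URI is parsed.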
By using the proposed percent-encoding and decoding capabilities, many use-cases that were difficult to handle before can be implemented in a specification-compliant way. For example, path segments can be properly percent-encoded when they contain the ''/'' character: $uri = new Uri\Rfc3986\Uri("https://example.com"); $uri = $uri->withPathSegments( [ "foo", Uri\Rfc3986\Uri::percentEncode("bar/baz", Uri\Rfc3986\UriPercentEncodingMode::PathSegment) ] ); echo $uri->toRawString(); // https://example.com/foo/bar%2Fbaz * Yes * No * Abstain ===== Backward Incompatible Changes ===== All the proposed changes are completely backward compatible because the affected classes are all [[https://wiki.php.net/rfc/url_parsing_api#why_should_the_uri_rfc3986_uri_and_the_uri_whatwg_url_classes_be_final|final]]. ===== Proposed PHP Version(s) ===== Next minor version (PHP 8.6 most likely) ===== RFC Impact ===== ==== To the Ecosystem ==== What effect will the RFC have on IDEs, Language Servers (LSPs), Static Analyzers, Auto-Formatters, Linters and commonly used userland PHP libraries? ==== To Existing Extensions ==== Existing extensions can continue to use the existing URI API without any changes. Some of the features are exposed as ''PHPAPI'' functions through public headers. ==== To SAPIs ==== None. ===== Open Issues ===== None. ===== Future Scope ===== None. ===== Patches and Tests ===== https://github.com/kocsismate/php-src/pull/9 ===== Implementation ===== After the RFC is implemented, this section should contain: - the version(s) it was merged into - a link to the git commit(s) - a link to the PHP manual entry for the feature ===== References ===== * [[https://wiki.php.net/rfc/url_parsing_api|Add RFC 3986 and WHATWG URL compliant API]] ===== Rejected Features ===== None. ===== Changelog =====