PHP RFC: Create RFC Compliant URL Parser
- Version: 0.3
- Date: 2016-10-04
- Author: David Walker (dave@mudsite.com)
- Proposed version: PHP 7.2+
- Status: Inactive
- First Published at: http://wiki.php.net/rfc/replace_parse_url
Introduction
This RFC came about for an attempt to resolve Bug #72811. In the attempt, discussion shifted from trying to patch the current implementation of parse_url()
to more generally replacing the current one. The discussion then shifted to the inability to remove parse_url()
due to BC issues. Ideas formed on creating an immutable class that will take a URL and parse it, exposing the pieces by getters.
The current implementation of parse_url()
makes a bunch of exceptions to RFC 3986. I do not know if these are conscious exceptions, or, if parse_url()
was never based off of the RFC. After raising this RFC, I was alerted that the RFC, is complimented by WHATWG spec on URLs. The aim of WHATWG is to combine RFC 3986 and RFC 3987. However, WHATWG is a “Living Standard” which makes it subject to change, however frequent. Although it does some good combining the two RFC's, the complexities to have a single PHP parser that would require constant maintaining to adhere to the evolving standard is not exactly practical.
So, this RFC proposes creating a new parser that adheres to the two RFC's. In doing so, if PHP is compiled with mbstring support, would be able to properly support multibyte characters in a URL.
Proposal
<?php class URL { public function __construct(string $url, string|URL $base); /** * $input - The string to be parsed * $base - (optional) If $url is relative, this is what it is relative to * $encoding_override - (optional) we assume $url is a UTF-8 encoded string, you may change it here * $url - (optional) A URL object that should be modified by the parsing of $input. The return value will be this variable as well * $state_override - (optional) begin parting the $input from a specific state. */ static public function parse(string $input[, URL $base[, int $encoding_override[, URL $url[, int $state_override]]]]) : URL; public function getScheme() : ?string; public function getUsername() : ?string; public function getPassword() : ?string; public function getHostname() : ?string; public function getPort() : ?int; public function getPath() : ?string; public function getQuery() : ?string; public function getFragment() : ?string; public function getAll() : array; }
Backward Incompatible Changes
None
RFC Impact
To Existing Extensions
standard
Open Issues
- Deprecate
parse_url()
? Try and push people into using the new URLParser class. - Should
parse_url()
have a sunset date of PHP8, or PHP9?
Proposed Voting Choices
Requires 2/3
Implementation
After the project is implemented, this section should contain
- the version(s) it was merged to
- a link to the git commit(s)
- a link to the PHP manual entry for the feature