rfc:xml_option_parse_huge
Differences
This shows you the differences between two versions of the page.
rfc:xml_option_parse_huge [2023/09/21 20:47] – nielsdos | rfc:xml_option_parse_huge [2023/10/22 15:53] (current) – implemented nielsdos | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== PHP RFC: PHP_XML_OPTION_PARSE_HUGE | + | ====== PHP RFC: XML_OPTION_PARSE_HUGE |
- | * Version: 0.9 | + | * Version: 0.9.1 |
* Date: 2023-09-21 | * Date: 2023-09-21 | ||
* Author: Niels Dossche, nielsdos@php.net | * Author: Niels Dossche, nielsdos@php.net | ||
- | * Status: | + | * Status: |
+ | * Implementation: | ||
* First Published at: https:// | * First Published at: https:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | ext/xml allows the user to parse XML in an event-driven way (SAX). The user can register callbacks to be called when certain nodes are encountered while parsing. In a sense, this is a streaming parsing model. | + | ext/xml allows the user to parse XML in an event-driven way (SAX). The user can register callbacks to be called when certain nodes are encountered while parsing. In a sense, this is a streaming parsing model: the user's callbacks are invoked while parsing is still happening. |
This RFC attempts to address a feature request on the old bugtracker: https:// | This RFC attempts to address a feature request on the old bugtracker: https:// | ||
Line 15: | Line 16: | ||
First, it's important to note that the ext/xml extension can work with two different XML parsers: either libexpat or libxml2, with libxml2 being the more commonly used of the two. | First, it's important to note that the ext/xml extension can work with two different XML parsers: either libexpat or libxml2, with libxml2 being the more commonly used of the two. | ||
- | Now, let's get to the heart of the issue: | + | Now, let's get to the actual |
- | Starting with libxml2 version 2.7.6, parsing large input data is no longer allowed by default; it must be explicitly enabled. This change was made to prevent potential denial-of-service attacks. However, this modification unintentionally disrupted a legitimate use-case involving the < | + | Starting with libxml2 version 2.7.0 (https:// |
There is a workaround for < | There is a workaround for < | ||
- | This proposal aims to solve this issue by introducing a new option. | + | This proposal aims to solve this issue by introducing a new parser |
+ | |||
+ | (*) What counts as "large" is defined in [[https:// | ||
===== Proposal ===== | ===== Proposal ===== | ||
- | It's possible to set parser options via <php>xml_set_parser_option</ | + | It's possible to set parser options via <php>xml_parser_set_option</ |
Internally, this option will pass XML_PARSE_HUGE to libxml2, allowing large documents to be parsed without resulting in a parse error. | Internally, this option will pass XML_PARSE_HUGE to libxml2, allowing large documents to be parsed without resulting in a parse error. | ||
If libexpat is used, this option will do nothing as libexpat does not block loading large documents anyway. | If libexpat is used, this option will do nothing as libexpat does not block loading large documents anyway. | ||
+ | |||
+ | It's worth noting that you can run into the same problem with extensions like SimpleXML and DOM. However, there you //do// have the option < | ||
+ | |||
+ | ==== Example Usage ==== | ||
+ | |||
+ | <PHP> | ||
+ | function startElement($parser, | ||
+ | // Do something interesting | ||
+ | } | ||
+ | function endElement($parser, | ||
+ | // Do something interesting | ||
+ | } | ||
+ | $parser = xml_parser_create(); | ||
+ | xml_parser_set_option($parser, | ||
+ | xml_set_element_handler($parser, | ||
+ | // Add more handlers | ||
+ | $success = xml_parse($parser, | ||
+ | </ | ||
+ | |||
+ | If you try to change the huge parsing option while parsing is in progress, e.g. in one of the callback handlers, and < | ||
+ | Example: | ||
+ | |||
+ | <PHP> | ||
+ | <?php | ||
+ | function startElement($parser, | ||
+ | xml_parser_set_option($parser, | ||
+ | } | ||
+ | function endElement($parser, | ||
+ | // Do something interesting | ||
+ | } | ||
+ | $parser = xml_parser_create(); | ||
+ | xml_parser_set_option($parser, | ||
+ | xml_set_element_handler($parser, | ||
+ | // Add more handlers | ||
+ | $success = xml_parse($parser, | ||
+ | </ | ||
+ | |||
+ | Results in: | ||
+ | Fatal error: Uncaught Error: Cannot change option XML_OPTION_PARSE_HUGE while parsing in example.php: | ||
+ | |||
===== Backward Incompatible Changes ===== | ===== Backward Incompatible Changes ===== | ||
- | No BC breaks unless the user defined a global constant XML_OPTION_PARSE_HUGE themselves. | + | No BC breaks unless the user defined a global constant |
===== Proposed PHP Version(s) ===== | ===== Proposed PHP Version(s) ===== | ||
Line 72: | Line 115: | ||
===== Proposed Voting Choices ===== | ===== Proposed Voting Choices ===== | ||
- | One primary vote (requires 2/3 majority): add PHP_XML_OPTION_PARSE_HUGE? | + | One primary vote (requires 2/3 majority): add XML_OPTION_PARSE_HUGE parsing option? |
+ | |||
+ | <doodle title=" | ||
+ | * Yes | ||
+ | * No | ||
+ | </ | ||
===== Patches and Tests ===== | ===== Patches and Tests ===== | ||
Line 79: | Line 127: | ||
===== Implementation ===== | ===== Implementation ===== | ||
- | After the project is implemented, | + | |
- | - the version(s) it was merged | + | Merged |
- | | + | |
- | - a link to the PHP manual entry for the feature | + | ===== Changelog ===== |
- | - a link to the language specification section (if any) | + | |
+ | * 0.9.1: Fixed libxml2 version, clarified limit, added code sample, linked | ||
+ | * 0.9.0: First version under discussion | ||
===== References ===== | ===== References ===== |
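The usage described in the Proposal section can be sketched as a small, self-contained script. This is only an illustration under stated assumptions, not the RFC's own sample: the helper names `createHugeParser` and `countElements` are hypothetical, and the `defined()` guard is an assumption added so the sketch also runs on PHP versions that do not ship `XML_OPTION_PARSE_HUGE`. The option is set before parsing starts, since the RFC states that changing it while parsing raises an Error.

```php
<?php
// Hypothetical helper (not part of the RFC): create a SAX parser and enable
// huge-document parsing when the constant is available.
function createHugeParser(): XMLParser
{
    $parser = xml_parser_create();

    // Assumption: guard with defined() so the sketch still runs on PHP
    // versions that predate XML_OPTION_PARSE_HUGE.
    if (defined('XML_OPTION_PARSE_HUGE')) {
        // Must happen before parsing begins; the RFC says changing the
        // option during parsing throws an Error.
        xml_parser_set_option($parser, XML_OPTION_PARSE_HUGE, true);
    }

    return $parser;
}

// Hypothetical helper: count the start tags in an XML string via SAX callbacks.
function countElements(string $xml): int
{
    $parser = createHugeParser();
    $count = 0;

    xml_set_element_handler(
        $parser,
        // Start-element handler: invoked while parsing is still in progress.
        function ($parser, string $name, array $attributes) use (&$count): void {
            $count++;
        },
        // End-element handler: nothing to do in this sketch.
        function ($parser, string $name): void {
        }
    );

    // The third argument marks this chunk as the final piece of input.
    if (xml_parse($parser, $xml, true) !== 1) {
        $message = xml_error_string(xml_get_error_code($parser)) ?? 'unknown XML error';
        throw new RuntimeException($message);
    }

    return $count;
}
```

With these helpers, `countElements('<root><item/><item/></root>')` counts three start tags. When libexpat is the backing parser, the RFC notes that the option does nothing, so the same code should behave identically there.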
rfc/xml_option_parse_huge.1695329261.txt.gz · Last modified: 2023/09/21 20:47 by nielsdos