rfc:pecl_http
Differences
This shows you the differences between two versions of the page.
rfc:pecl_http [2015/01/28 11:13] – add link to merged tree; expand to three-way vote mike | rfc:pecl_http [2015/02/27 21:02] – closed the other vote, too mike | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: Add pecl_http to core ====== | ====== PHP RFC: Add pecl_http to core ====== | ||
- | * Version: | + | * Version: |
- | * Date: 2015-01-28 | + | * Date: 2014-08-19 |
+ | * Last-Modified: 2015-02-20 | ||
* Author: Michael Wallner, < | * Author: Michael Wallner, < | ||
- | * Status: | + | * Status: |
* First Published at: http:// | * First Published at: http:// | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | A discussion whether it is feasible to add [[http:// | ||
- | ===== Proposal | + | ==== About ==== |
- | Providing the functionality of pecl_http with the core distribution. | + | === What is pecl_http? |
+ | pecl_http is an extension that aims to provide a convenient and powerful set of functionality | ||
- | See [[https:// | + | It eases handling of HTTP urls, headers |
- | The PHP7 port can be found here: \\ | + | It provides powerful request functionality with support for parallel requests and an event loop library. |
- | https:// | + | |
- | A fully merged (http and dependencies) tree can be found here: \\ | + | === Why pecl_http? === |
- | https:// | + | pecl_http was first created more than ten years ago. At that time I was a PEAR guy and maintained a set of HTTP-related packages. I always wondered why PHP did not support HTTP in a more sophisticated manner and thought I had to change that. | ||
- | ===== Proposed PHP Version(s) ===== | + | Back then there wasn’t much more than a few PEAR libraries like HTTP_Request, HTTP_Download and HTTP_Cache. What followed was a journey through PHP4 and 5, with APIs compatible with both OOP and procedural programming, well received by users but harshly criticized by architects and evangelists. v2 tried to settle on a more concise and modern set of OOP APIs. I still get threats from users because of this. Development was slow and it took about three years until 2.0 stable was released in late 2013. | ||
- | PHP7, resp. git:master | + | |
- | ===== RFC Impact ===== | + | Every so often there were requests to bundle pecl_http with the source distribution, but until now I never felt that I wanted to do that, nor that the code was ready for such a move. That time has come, though. The only real concern I still have is the burden it creates for all core developers when such a large amount of code moves from the hands of one into the hands of a few. | ||
- | Feature corner stones: | + | === Code base size === |
- | * Modular client (currently | + | I have to admit that the amount of code that comes with pecl_http can be called big, but that is mostly | ||
- | * Message parser \\ http:// | + | |
- | * Header params parser \\ http:// | + | |
- | * Server side environment request and response entities \\ http:// | + | |
- | * Negotiation \\ http:// | + | |
- | * Encodings: chunked, deflate, gzip \\ http:// | + | |
- | * Swiss army knife for URLs \\ http:// | + | |
- | A total of 28 classes and 2 functions will be added. | + | ==== Miscellaneous ==== |
+ | === Usage numbers === | ||
+ | Frankly, I don’t know. PECL stats show about 50k source package downloads per month on average, whatever that is worth. It ranks in the top 10, where good old APC still has the lead. | ||
- | ==== To SAPIs ==== | + | === Test coverage |
+ | The current test suite provides a code coverage of about 90% and is subject to improvement. \\ Coverage report of v2: http:// | ||
- | WEB: | + | One test currently fails for me after porting the extension to ZE3; the cause is a reference mismatch and a leak in zend_assign_to_variable_ref(). | ||
- | * pecl_http adds processing | + | |
- | ==== To Opcache ==== | + | === Licensing |
+ | All of the affected code was licensed under 2-clause BSD and will be re-licensed under the PHP-3.01 license. | ||
- | None. | + | === FIG and PSR-7 === |
+ | There are no plans to follow or adhere to the Framework Interoperability Group’s “PHP Standard Recommendation” #7. | ||
- | ==== New Constants | + | ===== The Guts ===== |
+ | A fully merged tree can be inspected here (based on v2.2, so slightly out of date): \\ https:// | ||
- | No new global constants. | + | An up-to-date (based on v2.3) pecl_http tree for PHP7 can be found here: \\ https:// |
- | ==== php.ini Defaults | + | ==== Documentation |
+ | The current docs are available here: \\ http:// | ||
- | * http.etag.mode => crc32b | + | The documentation presumably has to be converted to DocBook to be included in php.net; however, php.net/http currently hosts the docs for pecl_http v1. As far as I know, the docs team has not reached a conclusion on how to handle that situation. | ||
- | ===== Open Issues ===== | + | I had hoped that in the meantime the php-docs approach/ |
- | The PHP manual still hosts the docs for pecl_http-v1 and it's not been decided how to handle | + | Markdown sources of the documentation |
+ | |||
+ | ==== Dependencies ==== | ||
+ | === libz AKA zlib === | ||
+ | * Type of dep: hard build dep | ||
+ | * Minimum version: 1.2.0.4 | ||
+ | * Provided functionality: | ||
+ | * Current state: essential | ||
- | ===== Unaffected PHP Functionality | + | === libidn === |
+ | * Type of dep: soft build dep | ||
+ | * Provided functionality: | ||
+ | * Current state: feature completive, might look into ICU as alternative | ||
+ | |||
+ | === libcurl === | ||
+ | * Type of dep: soft build dep | ||
+ | * Minimum version: 7.18.2 | ||
+ | * Provided functionality: | ||
+ | * Current state: feature completive; might look into additional alternatives, | ||
+ | |||
+ | === libevent(2) === | ||
+ | * Type of dep: soft build dep | ||
+ | * Provided functionality: | ||
+ | * Current state: nice to have - must have for more parallel requests than select() can reasonably handle; might look into additional alternatives like libuv (libev already has libevent compatibility) | ||
+ | |||
+ | === pecl/propro === | ||
+ | * Type of dep: hard build dep | ||
+ | * Provided functionality: | ||
+ | * Current state: feature completive, suggested to be merged to ext/ | ||
+ | |||
+ | == Property Proxy API == | ||
+ | |||
+ | ZE2 doxygen reference can be found here: \\ http:// | ||
+ | |||
+ | There’s been similar functionality in ZE2, but according to my findings back then it was dysfunctional, and it has obviously been removed in ZE3. | ||
+ | |||
+ | Internals thread from 2010: \\ http:// | ||
+ | |||
+ | When a property is requested byref (BP_VAR_RW) from an object that stores state in a member of the object's C struct, and that object implementation uses its own property handlers, it can use a property proxy to enable that kind of access to that state. | ||
+ | |||
+ | This is accomplished by returning an instance of the property proxy object instead of the property directly. The proxy has its set and get handlers overridden and does deferred and cascaded fetch/push of the original member of the container (object) to update the state. | ||
+ | |||
+ | Every extension maintaining state outside of real object properties (e.g. dom) can make use of this functionality and does not have to emit the error “Properties of XYClass can not be accessed by ref or array key/ | ||
+ | |||
+ | Actual implementation for http\Message properties: \\ https:// | ||
+ | |||
+ | <code c> | ||
+ | static zval *php_http_message_object_read_prop(zval *object, zval *member, int type, void **cache_slot, | ||
+ | { | ||
+ | zval *return_value; | ||
+ | zend_string *member_name = zval_get_string(member); | ||
+ | php_http_message_object_prophandler_t *handler = php_http_message_object_get_prophandler(member_name); | ||
+ | |||
+ | if (!handler || type == BP_VAR_R || type == BP_VAR_IS) { | ||
+ | return_value = zend_get_std_object_handlers()-> | ||
+ | |||
+ | if (handler) { | ||
+ | php_http_message_object_t *obj = PHP_HTTP_OBJ(NULL, | ||
+ | |||
+ | PHP_HTTP_MESSAGE_OBJECT_INIT(obj); | ||
+ | handler-> | ||
+ | |||
+ | zval_ptr_dtor(return_value); | ||
+ | ZVAL_COPY_VALUE(return_value, | ||
+ | } | ||
+ | } else { | ||
+ | return_value = php_property_proxy_zval(object, | ||
+ | } | ||
+ | |||
+ | zend_string_release(member_name); | ||
+ | |||
+ | return return_value; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | === pecl/raphf === | ||
+ | * Type of dep: hard build dep | ||
+ | * Provided functionality: | ||
+ | * Current state: feature completive, suggested to be merged to main, please discuss | ||
+ | |||
+ | == Resource And Persistent Handle Factory API == | ||
+ | |||
+ | ZE2 doxygen reference can be found here: \\ http:// | ||
+ | |||
+ | I once said that raphf provides a set of functionality similar to zend_list, but unfortunately this is a bit misleading, so let’s look at the differences first: | ||
+ | |||
+ | zend_list manages refcounted zend_resources with opaque handles and custom destructors for persistent and non-persistent handles. Persistent handles are returned to the caller as is. | ||
+ | |||
+ | raphf does not need refcount support, because we’re working with objects instead of resources, and you probably should be, too. Instead, a kind of copy constructor is optionally supported for object cloning. | ||
+ | |||
+ | Here’s the actual implementation of curl_easy and curl_multi ctor/ | ||
+ | |||
+ | <code c> | ||
+ | typedef struct php_http_curle_storage { | ||
+ | char *url; | ||
+ | char *cookiestore; | ||
+ | CURLcode errorcode; | ||
+ | char errorbuffer[0x100]; | ||
+ | } php_http_curle_storage_t; | ||
+ | |||
+ | static inline php_http_curle_storage_t *php_http_curle_get_storage(CURL *ch) { | ||
+ | php_http_curle_storage_t *st = NULL; | ||
+ | |||
+ | curl_easy_getinfo(ch, | ||
+ | |||
+ | if (!st) { | ||
+ | st = pecalloc(1, sizeof(*st), | ||
+ | curl_easy_setopt(ch, | ||
+ | curl_easy_setopt(ch, | ||
+ | } | ||
+ | |||
+ | return st; | ||
+ | } | ||
+ | |||
+ | static void *php_http_curle_ctor(void *opaque, void *init_arg) | ||
+ | { | ||
+ | void *ch; | ||
+ | |||
+ | if ((ch = curl_easy_init())) { | ||
+ | php_http_curle_get_storage(ch); | ||
+ | return ch; | ||
+ | } | ||
+ | return NULL; | ||
+ | } | ||
+ | |||
+ | static void *php_http_curle_copy(void *opaque, void *handle) | ||
+ | { | ||
+ | void *ch; | ||
+ | |||
+ | if ((ch = curl_easy_duphandle(handle))) { | ||
+ | curl_easy_reset(ch); | ||
+ | php_http_curle_get_storage(ch); | ||
+ | return ch; | ||
+ | } | ||
+ | return NULL; | ||
+ | } | ||
+ | |||
+ | static void php_http_curle_dtor(void *opaque, void *handle) | ||
+ | { | ||
+ | php_http_curle_storage_t *st = php_http_curle_get_storage(handle); | ||
+ | |||
+ | curl_easy_cleanup(handle); | ||
+ | |||
+ | if (st) { | ||
+ | if (st-> | ||
+ | pefree(st-> | ||
+ | } | ||
+ | if (st-> | ||
+ | pefree(st-> | ||
+ | } | ||
+ | pefree(st, | ||
+ | } | ||
+ | } | ||
+ | |||
+ | static php_resource_factory_ops_t php_http_curle_resource_factory_ops = { | ||
+ | php_http_curle_ctor, | ||
+ | php_http_curle_copy, | ||
+ | php_http_curle_dtor | ||
+ | }; | ||
+ | |||
+ | static void *php_http_curlm_ctor(void *opaque, void *init_arg) | ||
+ | { | ||
+ | return curl_multi_init(); | ||
+ | } | ||
+ | |||
+ | static void php_http_curlm_dtor(void *opaque, void *handle) | ||
+ | { | ||
+ | curl_multi_cleanup(handle); | ||
+ | } | ||
+ | |||
+ | static php_resource_factory_ops_t php_http_curlm_resource_factory_ops = { | ||
+ | php_http_curlm_ctor, | ||
+ | NULL, | ||
+ | php_http_curlm_dtor | ||
+ | }; | ||
+ | </ | ||
+ | |||
+ | A resource factory can be created from these ops and be used directly, or can be transparently wrapped by the persistent handle ops to support process/ | ||
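To make the ctor/copy/dtor triple above easier to follow outside of the PHP source tree, here is a minimal, self-contained sketch of the same pattern in plain C. It is not the raphf API: the names `rf_ops_t`, `rf_t`, `rf_handle_ctor()`, `rf_handle_copy()`, `rf_handle_dtor()` and the `buf_*` callbacks are hypothetical and exist only for illustration. As in the curl_multi ops above, the copy constructor is optional and may be NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical ops table: constructor, optional copy constructor, destructor. */
typedef struct rf_ops {
    void *(*ctor)(void *opaque, void *init_arg);
    void *(*copy)(void *opaque, void *handle);   /* may be NULL */
    void (*dtor)(void *opaque, void *handle);
} rf_ops_t;

/* Hypothetical factory: the ops plus opaque data passed to every callback. */
typedef struct rf {
    rf_ops_t ops;
    void *opaque;
} rf_t;

static void *rf_handle_ctor(rf_t *f, void *init_arg)
{
    return f->ops.ctor ? f->ops.ctor(f->opaque, init_arg) : NULL;
}

/* Returns NULL when the resource type does not support copying. */
static void *rf_handle_copy(rf_t *f, void *handle)
{
    return f->ops.copy ? f->ops.copy(f->opaque, handle) : NULL;
}

static void rf_handle_dtor(rf_t *f, void *handle)
{
    if (f->ops.dtor) {
        f->ops.dtor(f->opaque, handle);
    }
}

/* Example resource for the sketch: a heap-allocated copy of a C string. */
static void *buf_ctor(void *opaque, void *init_arg)
{
    const char *src = init_arg ? (const char *) init_arg : "";
    char *dup = malloc(strlen(src) + 1);

    (void) opaque;
    if (dup) {
        strcpy(dup, src);
    }
    return dup;
}

static void *buf_copy(void *opaque, void *handle)
{
    /* Copying a buffer is just constructing a new one from its contents. */
    return buf_ctor(opaque, handle);
}

static void buf_dtor(void *opaque, void *handle)
{
    (void) opaque;
    free(handle);
}
```

A caller would fill an `rf_t` with `{ buf_ctor, buf_copy, buf_dtor }`, obtain a handle through `rf_handle_ctor()`, optionally clone it with `rf_handle_copy()`, and release each handle with `rf_handle_dtor()`. The caller never needs to know the concrete resource type.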
+ | |||
+ | Actual implementation of creating the resource/ | ||
+ | |||
+ | <code c> | ||
+ | static php_resource_factory_t *create_rf(php_http_url_t *url) | ||
+ | { | ||
+ | php_persistent_handle_factory_t *pf; | ||
+ | php_resource_factory_t *rf = NULL; | ||
+ | zend_string *id; | ||
+ | char *id_str = NULL; | ||
+ | size_t id_len; | ||
+ | |||
+ | if (!url || (!url-> | ||
+ | php_error_docref(NULL, | ||
+ | return NULL; | ||
+ | } | ||
+ | |||
+ | id_len = spprintf(& | ||
+ | id = php_http_cs2zs(id_str, | ||
+ | |||
+ | pf = php_persistent_handle_concede(NULL, | ||
+ | zend_string_release(id); | ||
+ | |||
+ | if (pf) { | ||
+ | rf = php_resource_factory_init(NULL, | ||
+ | } else { | ||
+ | rf = php_resource_factory_init(NULL, | ||
+ | } | ||
+ | |||
+ | return rf; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | For php_persistent_handle_concede() to succeed, a provider has to be registered at MINIT: | ||
+ | <code c> | ||
+ | if (SUCCESS != php_persistent_handle_provide(PHP_HTTP_G-> | ||
+ | return FAILURE; | ||
+ | } | ||
+ | if (SUCCESS != php_persistent_handle_provide(PHP_HTTP_G-> | ||
+ | return FAILURE; | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | php_persistent_handle_concede() would also take pointers to a wakeup and a retire function for persistent handles, so that e.g. network sockets or database handles can be prepared for the idle time or be checked that they are still valid when requested. | ||
+ | |||
+ | This is the last notable difference from zend_list and is not needed by curl, but here’s an example of wakeup and retire functions located in pecl/pq, a PostgreSQL Client: \\ https:// | ||
+ | |||
+ | <code c> | ||
+ | static void php_pqconn_wakeup(php_persistent_handle_factory_t *f, void **handle TSRMLS_DC) | ||
+ | { | ||
+ | PGresult *res = PQexec(*handle, | ||
+ | PHP_PQclear(res); | ||
+ | |||
+ | if (CONNECTION_OK != PQstatus(*handle)) { | ||
+ | PQreset(*handle); | ||
+ | } | ||
+ | } | ||
+ | |||
+ | static void php_pqconn_retire(php_persistent_handle_factory_t *f, void **handle TSRMLS_DC) | ||
+ | { | ||
+ | php_pqconn_event_data_t *evdata = PQinstanceData(*handle, | ||
+ | PGcancel *cancel; | ||
+ | PGresult *res; | ||
+ | |||
+ | /* go away */ | ||
+ | PQsetInstanceData(*handle, | ||
+ | |||
+ | /* ignore notices */ | ||
+ | PQsetNoticeReceiver(*handle, | ||
+ | |||
+ | /* cancel async queries */ | ||
+ | if (PQisBusy(*handle) && (cancel = PQgetCancel(*handle))) { | ||
+ | char err[256] = {0}; | ||
+ | |||
+ | PQcancel(cancel, | ||
+ | PQfreeCancel(cancel); | ||
+ | } | ||
+ | /* clean up async results */ | ||
+ | while ((res = PQgetResult(*handle))) { | ||
+ | PHP_PQclear(res); | ||
+ | } | ||
+ | |||
+ | /* clean up transaction & session */ | ||
+ | switch (PQtransactionStatus(*handle)) { | ||
+ | case PQTRANS_IDLE: | ||
+ | res = PQexec(*handle, | ||
+ | break; | ||
+ | default: | ||
+ | res = PQexec(*handle, | ||
+ | break; | ||
+ | } | ||
+ | |||
+ | if (res) { | ||
+ | PHP_PQclear(res); | ||
+ | } | ||
+ | |||
+ | if (evdata) { | ||
+ | /* clean up notify listeners */ | ||
+ | zend_hash_apply_with_arguments(& | ||
+ | |||
+ | /* release instance data */ | ||
+ | efree(evdata); | ||
+ | } | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | Any extension providing (networked) services should be able to take advantage of raphf unless it needs to expose resources to userland. | ||
+ | |||
+ | == raphf INI setting == | ||
+ | There’s a global INI setting (SYSTEM), persistent_handle_limit (defaults to -1, unlimited) of debatable usefulness. | ||
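The interplay of wakeup, retire and a handle limit can be sketched with a tiny idle-handle pool in plain C. This is only an illustration under stated assumptions, not raphf's implementation: `pool_t`, `pool_acquire()`, `pool_release()`, `POOL_MAX` and the `conn_*` callbacks are hypothetical names, and the sketch interprets the limit as a cap on idle handles, with a negative value meaning "no limit beyond the fixed array size".

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_MAX 8  /* fixed capacity of the idle list in this sketch */

typedef struct pool {
    void *(*ctor)(void);
    void (*dtor)(void *handle);
    void (*wakeup)(void *handle);  /* optional: revalidate before reuse */
    void (*retire)(void *handle);  /* optional: clean up before idling */
    void *idle[POOL_MAX];
    size_t idle_count;
    long limit;                    /* negative: capped only by POOL_MAX */
} pool_t;

/* Reuse an idle handle (after waking it up) or construct a fresh one. */
static void *pool_acquire(pool_t *p)
{
    if (p->idle_count > 0) {
        void *h = p->idle[--p->idle_count];

        if (p->wakeup) {
            p->wakeup(h);
        }
        return h;
    }
    return p->ctor();
}

/* Retire the handle and keep it idle, or destroy it if the limit is reached. */
static void pool_release(pool_t *p, void *h)
{
    size_t cap = (p->limit < 0 || (size_t) p->limit > POOL_MAX)
        ? POOL_MAX : (size_t) p->limit;

    if (p->idle_count < cap) {
        if (p->retire) {
            p->retire(h);
        }
        p->idle[p->idle_count++] = h;
    } else {
        p->dtor(h);
    }
}

/* Demo resource that merely counts how often the hooks were called. */
typedef struct conn {
    int wakeups;
    int retires;
} conn_t;

static void *conn_ctor(void) { return calloc(1, sizeof(conn_t)); }
static void conn_dtor(void *h) { free(h); }
static void conn_wakeup(void *h) { ((conn_t *) h)->wakeups++; }
static void conn_retire(void *h) { ((conn_t *) h)->retires++; }
```

In this sketch a handle is retired exactly once each time it goes idle and woken up exactly once each time it is handed out again, which mirrors the roles the pecl/pq callbacks play above.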
+ | |||
+ | ==== Features ==== | ||
+ | === C API === | ||
+ | Most of the features are directly accessible through pecl_http' | ||
+ | |||
+ | === Globals === | ||
+ | Nothing in the global namespace, except the namespace '' | ||
+ | |||
+ | == Client == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: HTTP stream wrapper, ext/curl | ||
+ | |||
+ | The HTTP stream wrapper is of limited functionality, | ||
+ | |||
+ | Better support for more complicated applications like different authentication schemes, proxy types, encodings, SSL/TLS layers and what not is a desirable out of the box functionality. | ||
+ | |||
+ | All of that would actually be available by ext/curl but the existing libcurl binding is in a subpar maintenance state and suffers from its own quirks. Also, to my great surprise, there are only about five people enjoying the libcurl API, or what is available from it in PHP. | ||
+ | |||
+ | Currently only libcurl is implemented as a provider for http\Client, | ||
+ | |||
+ | http\Client supports sending parallel requests, optionally driven by an event loop library like libev{, | ||
+ | |||
+ | == Encoding == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: ext/zlib | ||
+ | |||
+ | Actually ext/zlib supports all the same three encodings since I fixed it a few years ago, so there could be the occasion for few shared code lines. What it definitely lacks, though, are incremental encoders/ | ||
+ | |||
+ | AFAIK there is no accessible implementation of chunked encoding in core. | ||
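For readers unfamiliar with the wire format, the following plain C sketch shows what a chunked encoder has to produce: each chunk is the payload size in hexadecimal, CRLF, the payload, and another CRLF, and a zero-sized chunk terminates the body. The function name `chunk_encode()` is hypothetical and this is not pecl_http's encoder; trailers and chunk extensions are left out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write one HTTP/1.1 chunk (hex size, CRLF, payload, CRLF) into out.
 * A zero-length payload yields the terminating chunk "0\r\n\r\n".
 * Returns the number of bytes written, or 0 if out is too small.
 */
static size_t chunk_encode(const char *data, size_t len, char *out, size_t out_size)
{
    char head[32];
    int head_len = snprintf(head, sizeof(head), "%zx\r\n", len);
    size_t total;

    if (head_len < 0) {
        return 0;
    }
    total = (size_t) head_len + len + 2;
    if (total > out_size) {
        return 0;
    }

    memcpy(out, head, (size_t) head_len);
    if (len) {
        memcpy(out + head_len, data, len);
    }
    memcpy(out + head_len + len, "\r\n", 2);
    return total;
}
```

An incremental encoder would call such a function once per buffer as data becomes available and finish the stream with a zero-length call.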
+ | |||
+ | == Env == | ||
+ | * Docs: http:// | ||
+ | * Current status: feature completive | ||
+ | * Related functionality in core: superglobals, | ||
+ | |||
+ | http\Env provides negotiation of content type, character set and language, a feature set often asked for; nothing comparable exists in core. | ||
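As a rough illustration of what such negotiation involves, the plain C sketch below picks the best match from a list of supported values, given an Accept-style header with optional q parameters. It is not http\Env's algorithm: `negotiate()` and `token_equals()` are hypothetical names, wildcards and prefix matching are ignored, a missing q counts as 1.0, and q=0 is treated as "not acceptable".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive comparison of a length-delimited token with a C string. */
static int token_equals(const char *tok, size_t len, const char *s)
{
    size_t i;

    if (strlen(s) != len) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        if (tolower((unsigned char) tok[i]) != tolower((unsigned char) s[i])) {
            return 0;
        }
    }
    return 1;
}

/* Returns the supported entry with the highest q value, or NULL if none fits. */
static const char *negotiate(const char *header, const char *const *supported, size_t count)
{
    const char *best = NULL;
    double best_q = 0.0;
    const char *p = header;

    while (*p) {
        const char *end = strchr(p, ',');
        size_t item_len = end ? (size_t) (end - p) : strlen(p);
        const char *semi = memchr(p, ';', item_len);
        size_t tok_len = semi ? (size_t) (semi - p) : item_len;
        const char *tok = p;
        double q = 1.0;
        size_t i;

        /* Trim whitespace around the value itself. */
        while (tok_len && isspace((unsigned char) *tok)) {
            tok++;
            tok_len--;
        }
        while (tok_len && isspace((unsigned char) tok[tok_len - 1])) {
            tok_len--;
        }

        /* Scan the parameters of this item for "q=". */
        if (semi) {
            const char *params_end = p + item_len;
            const char *s = semi;

            while (s && s < params_end) {
                s++; /* skip ';' */
                while (s < params_end && isspace((unsigned char) *s)) {
                    s++;
                }
                if (params_end - s >= 2 && (s[0] == 'q' || s[0] == 'Q') && s[1] == '=') {
                    q = strtod(s + 2, NULL);
                    break;
                }
                s = memchr(s, ';', (size_t) (params_end - s));
            }
        }

        /* Keep the first supported value with a strictly higher quality. */
        for (i = 0; i < count; i++) {
            if (q > best_q && token_equals(tok, tok_len, supported[i])) {
                best = supported[i];
                best_q = q;
            }
        }

        if (!end) {
            break;
        }
        p = end + 1;
    }
    return best;
}
```

With the header value `de;q=0.8, en;q=0.9, fr;q=0` and the supported list `fr`, `de`, `en`, this sketch selects `en`; if only `fr` is supported, it returns NULL because q=0 marks that language as unacceptable.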
+ | |||
+ | Most of what the environmental/ | ||
+ | |||
+ | http\Env\Request provides a central access point for all of that data and makes it safe to change/mock it without actually changing the original environment. | ||
+ | |||
+ | http\Env\Response provides features for sending responses beyond header(), ob_start() and readfile() with support for ranges/ | ||
+ | |||
+ | == Message == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: rfc1867.c | ||
+ | |||
+ | Message parser and tools. http\Message is the base class of all request and response classes. Note that a " | ||
+ | |||
+ | Also message bodies with support for building and (basic) parsing of multipart bodies, utilizing a (temporary) stream for memory efficiency. | ||
+ | |||
+ | Splitting a multipart body creates a chain of http\Message objects. | ||
+ | |||
+ | == Header == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: non-existent | ||
+ | |||
+ | Header parser and tools. | ||
+ | |||
+ | I'm really not sure what case I should make about a header and message parser implementation in an HTTP package. | ||
+ | |||
+ | == Cookie == | ||
+ | * Docs: http:// | ||
+ | * Current status: feature completive, maybe the odd cousin | ||
+ | * Related functionality in core: non-existent | ||
+ | |||
+ | Cookie and Set-Cookie headers come in a special format and are ubiquitous, thus they deserve a discrete parser. | ||
+ | |||
+ | One could argue that there is related functionality in core, namely php_default_treat_data(), | ||
+ | |||
+ | == Params == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: non-existent | ||
+ | |||
+ | Header params parser; think of a content-type or an accept header. Negotiation, | ||
+ | |||
+ | == QueryString == | ||
+ | * Docs: http:// | ||
+ | * Current status: feature completive | ||
+ | * Related functionality in core: parse_str() (php_default_treat_data()) | ||
+ | |||
+ | Query string parser and tools. Actually builds on http\Params. | ||
+ | |||
+ | parse_str() suffers from its legacy/ | ||
+ | |||
+ | == Url == | ||
+ | * Docs: http:// | ||
+ | * Current status: essential | ||
+ | * Related functionality in core: parse_url() | ||
+ | |||
+ | URL parser and tools with UTF-8, locale multibyte and IDNA support (need to check if, and how much it diverges from IRIs). See RFC3986 and RFC3987. | ||
+ | |||
+ | I'm not sure what recommendation parse_url() follows, if any. | ||
+ | |||
+ | ==== Unaffected PHP Functionality ==== | ||
The http: stream wrapper is unaffected by pecl_http. | The http: stream wrapper is unaffected by pecl_http. | ||
- | ===== Proposed Voting Choices | + | ===== Vote ===== |
+ | |||
+ | Three way " | ||
+ | |||
+ | <doodle title=" | ||
+ | * Yes, enabled by default | ||
+ | * Yes, disabled by default | ||
+ | * No | ||
+ | </ | ||
+ | |||
+ | \\ | ||
+ | \\ | ||
+ | Additional simple vote on the namespace prefix (" | ||
+ | |||
+ | <doodle title=" | ||
+ | * http | ||
+ | * php\http | ||
+ | </ | ||
+ | |||
+ | ===== Discussed and changed items ===== | ||
+ | |||
+ | == Parsing multipart/ | ||
+ | |||
+ | This functionality was removed from the proposal. | ||
+ | |||
+ | == Parsing a/json into $_POST == | ||
+ | |||
+ | This functionality was removed from the proposal, which removed the ext/json dependency. | ||
+ | |||
+ | == Translating charsets of http\QueryString == | ||
+ | |||
+ | This functionality was removed from the proposal, which removed the ext/iconv dependency. | ||
+ | |||
+ | == Extended hashing methods for ETags of dynamic content == | ||
+ | |||
+ | This functionality was removed from the proposal, which removed the ext/hash dependency. | ||
+ | |||
+ | == Splitting up into smaller RFCs == | ||
+ | |||
+ | It was requested to split this RFC into several smaller ones but, as far as I could observe, mainly in order to *not* bring an HTTP client implementation into the default distribution. I did not consider these requests further, because I think the client adds substantial value to the overall package. | ||
+ | |||
+ | == Upgrade path for existing pecl_http users == | ||
+ | |||
+ | A pecl_http integrated into the default distribution would be considered v3. Upcoming v2 releases could take measures to prepare any transition to the PHP7 API. | ||
+ | |||
+ | == Namespace choice, or the case of the case == | ||
+ | |||
+ | I consider this issue non-important, | ||
- | A three-way yes (enabled by default) / yes (disabled by default) / no vote with a 50%+1 majority for both yes choices combined needed for acceptance. | + | There will be an extra vote on whether to prefix the '' |
===== Changelog ===== | ===== Changelog ===== | ||
Line 76: | Line 509: | ||
* Added link to fully merged tree | * Added link to fully merged tree | ||
* Expand voting options | * Expand voting options | ||
+ | * 2.0 | ||
+ | * Complete rewrite | ||
+ | * 2.1 | ||
+ | * Expanded feature section | ||
+ | * 2.2 | ||
+ | * Removed optional dependencies on all three extensions (json, iconv, hash), and the one INI entry related to it | ||
+ | * 2.3 | ||
+ | * Removed http\Env RINIT section | ||
+ | * Changed namespace from '' | ||
+ | * Fixed some wordings and list formattings | ||
+ | * 2.4 | ||
+ | * Added " |
rfc/pecl_http.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1