RFC: First-Class Object and Array Literals
- Version: 1.0.3
- Date: 2011-06-04 (Updated: 2011-06-06)
- Author: Sean Coates sean@php.net
- Status: Inactive
Introduction
In modern web applications, JSON has become ubiquitous. Whether through serialization, such as when passing data to the end user's browser and when consuming JSON-powered web services, or as a query format, such as when interfacing with systems like CouchDB, MongoDB and ElasticSearch, many current (and future) web deployments (will) need to generate and consume JSON in some way.
JSON (in the not-necessarily-strict sense) is also the de facto way to communicate between a web back-end and mobile applications.
This proposal calls for first-class JSON-like object and array literals (a.k.a. primitives, constructs) in PHP.
This RFC supersedes the existing Short Syntax for Arrays RFC.
Goal
The goal of this proposal is to simplify interaction with third-party services and applications that speak JSON (or a JSON-like dialect), by allowing first-class (inline) array and object literals.
Syntax
The proposed syntax is an inline, non-strict implementation of the syntax defined in the JSON Spec. Using a strict syntax would not conform with the PHP Way, but E_NOTICE could be raised in the same way that array(one => 1) currently emits a notice.
Furthermore, JSON-proper is a serialization format, but this proposal concerns a declarative, inline, interpreted object literal format.
In JSON-proper, keys must be quoted with " and not '. This same behaviour should not be expected in PHP. Similarly, PHP arrays allow a trailing comma after the last element, but JSON does not allow this. The existing PHP behaviour should be allowed in these respects.
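For reference, both JSON restrictions can be observed with PHP's existing json_decode(), while array() already tolerates single-quoted keys and a trailing comma. This is a minimal sketch using only current PHP; the variable names are illustrative and not part of the proposal:

```php
<?php
// Strict JSON: double-quoted keys, no trailing comma.
$strict = json_decode('{"one": 1, "two": 2}');
var_dump($strict->two); // int(2)

// Single-quoted keys are not valid JSON, so decoding fails and returns NULL.
$singleQuoted = json_decode("{'one': 1}");
var_dump($singleQuoted); // NULL

// A trailing comma is not valid JSON either.
$trailingComma = json_decode('{"one": 1,}');
var_dump($trailingComma); // NULL

// PHP's existing array() syntax accepts both.
$a = array('one' => 1, 'two' => 2,);
var_dump(count($a)); // int(2)
```

The proposed literals would follow the array() behaviour rather than the json_decode() behaviour.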
Point of discussion (see below): the key-value separator will be both PHP's conventional => and the foreign-language convention, :.
Syntax Examples
<?php
// new syntax for simple arrays:
$a = [1,2,'three'];

// equivalent to current:
$a = array(1,2,'three');

// associative arrays:
// (examples are equivalent; see discussion)
$a = ['one' => 1, 'two' => 2, 'three' => 'three'];
$a = ['one': 1, 'two': 2, 'three': 'three'];

// equivalent to current:
$a = array('one' => 1, 'two' => 2, 'three' => 'three');

// anonymous object:
// (examples are equivalent; see discussion)
$a = {'one': 1, 'two': 2, 'three': 'three'};
$a = {'one' => 1, 'two' => 2, 'three' => 'three'};

// equivalent to:
$a = new \StdClass;
$a->one = 1; $a->two = 2; $a->three = 'three';
// or:
$a = (object)array('one' => 1, 'two' => 2, 'three' => 'three');

// PHP conventions (dynamic keys/values)
$val = 'apple';
$record = {"favourite_fruit": $val};
// true expression:
$record->favourite_fruit == "apple";

$key = "colour";
$record = {$key: "red"};
echo $record->colour; // outputs "red"

$colour = "green";
$vehicle = "truck";
$record = {'notes': "Drives a {$colour} $vehicle."};
echo $record->notes; // outputs "Drives a green truck."

// inline functions:
$creditCard = '5105105105105100';
$doc = {"credit_card_reminder": substr($creditCard, -4)};
echo $doc->credit_card_reminder; // outputs "5100"

// 'invalid' keys:
$obj = {'key with spaces': 'still works'};
echo $obj->{'key with spaces'}; // outputs 'still works'

$doc = {'$set': {"has_logged_in": 'yes'}};
echo $doc->{'$set'}->has_logged_in; // outputs "yes"
?>
Interaction with third-party services that speak JS-literals
<?php
// Here's how an ElasticSearch query currently looks in PHP:
$esQuery = new \StdClass;
$esQuery->query = new \StdClass;
$esQuery->query->term = new \StdClass;
$esQuery->query->term->name = 'beer';
$esQuery->size = 1;

// OR

$esQuery = (object)array(
    "query" => (object)array(
        "term" => (object)array(
            "name" => "beer"
        )
    ),
    "size" => 1
);

// …and here's how it could look with the proposed syntax:
$esQuery = {
    "query" : {
        "term" : {
            "name": "beer"
        }
    },
    "size" : 1
};

/*
…and here's how I'd use curl to ensure that the query I'm issuing does in fact
work with ElasticSearch:

$ curl http://localhost:9200/gimmebar/assets/_search?pretty=1 -d'{
    "query" : {
        "term" : {
            "name": "beer"
        }
    },
    "size" : 1
}'
*/
Even considering the (object)array( syntax, it's much easier to work with an external query (as shown with curl) if we have a (nearly) JSON-compatible syntax in PHP.
Note that the PHP definition of $esQuery could have used the proposed but non-JSON-compatible syntax (single quotes, for example); it was written with double quotes because that makes it easier to pass off to curl.
Realistically, "beer" would be in a variable (maybe {"term": {"name": $term}}), but replacing just the variables is certainly much easier than translating the new \StdClass syntax.
The argument for allowing right-hand-side expressions in the proposed syntax (such as in {'time': time()}) is still valid, because this syntax is expected to be used not only for interacting with third-party services (as with ElasticSearch above) but also for general object and array creation with no third party involved.
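Whichever syntax builds the structure, the result would typically be serialized with the existing json_encode() before being sent to the service. The following is a minimal sketch in today's syntax, assuming the search term lives in a hypothetical $term variable; it shows that the (object)array( form yields the same JSON body that was passed to curl above (minus whitespace):

```php
<?php
$term = 'beer'; // hypothetical variable holding the search term

// Today's syntax: nested (object) casts to get JSON objects, not arrays.
$esQuery = (object)array(
    'query' => (object)array(
        'term' => (object)array('name' => $term),
    ),
    'size' => 1,
);

// Serialize for the HTTP request body.
$body = json_encode($esQuery);
echo $body, "\n";
// {"query":{"term":{"name":"beer"}},"size":1}
```

Under the proposal, only the construction of $esQuery would change; the serialization step would stay the same.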
Benefits
The main benefit of this syntax directly relates to the stated goal: simplified interoperability with third-party systems, such as browsers, databases, and web services.
This goal is accomplished through the following benefits:
- A more terse syntax that is still understood and easily readable. This is especially the case for object literals.
- The ability to build code that uses existing examples, such as those from the ElasticSearch documentation, without first translating the example from JSON to PHP's array() syntax.
- Improved ability to debug directly against third party APIs (such as by pasting compatible object/array literal syntax from PHP into curl, which helps identify the source of the bug: your code or your data).
- Improved communication with third party vendors (read: authors, for small open source projects), who often understand JSON and thus other literals, but not necessarily PHP.
Backward Compatibility Breaks
None known at the time of writing.
The special characters proposed in this RFC ({, }, [, ], :, and ,) /are/ currently in use, but never in the context of the proposed syntax.
Existing Implementations
- JavaScript (with which nearly all modern web apps must interface in some form) supports a very similar syntax. JavaScript (largely) supports a non-strict JSON implementation. The “JS” in JSON stands for JavaScript.
- Python supports a very similar syntax for defining lists and sets (which are somewhat analogous to PHP's non-associative arrays), and dictionaries, which are similar to PHP's associative arrays and anonymous (StdClass) objects.
- Ruby 1.9.1 added syntax improvements for its hashes to simplify the syntax and improve readability, much like this RFC proposes.
Differences from Other Implementations
- PHP's associative arrays are more malleable than other implementations.
- This proposal calls for a syntax that is much less strict than proper, on-spec JSON.
Why not just use json_decode($literalString)?
Converting literal notation to PHP strings and then into objects/arrays is problematic for several reasons:
- Inline escaping is necessary for special characters; there is a significant amount of cognitive overhead required in keeping a mental stack of escape depth.
- Inline injection of variables or functions requires confusing and verbose string concatenation.
- Tools (editors, static analysis tools) see strings as strings; literals are more explicit.
- create_function() was replaced by first-class closures/lambdas for similar reasons.
- In-line SQL has similar problems, and requires workarounds like placeholders.
- Performance concerns: encoding/re-encoding, memory usage.
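The escaping and concatenation problems can be illustrated with existing PHP. This is a minimal sketch, assuming a hypothetical $term value that happens to contain double quotes; it is not taken from the patch:

```php
<?php
$term = 'say "beer"'; // hypothetical input containing double quotes

// Naive concatenation: the embedded quotes break the JSON, so decoding fails.
$naive = '{"query": {"term": {"name": "' . $term . '"}}}';
var_dump(json_decode($naive)); // NULL

// Working version: every injected value has to be encoded separately.
$escaped = '{"query": {"term": {"name": ' . json_encode($term) . '}}}';
$query = json_decode($escaped);
echo $query->query->term->name, "\n"; // say "beer"
```

With a first-class literal, $term would simply appear as a value and no string-level escaping would be involved.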
Patch
- A partial (arrays only, colons only) patch is available.
Discussions
Further Discussion Required
- Strictness of unquoted keys.
- Support => in addition to : as a key:value separator.
- Possibility of simply not supporting the \u### syntax for Unicode characters in literal strings (just like the rest of PHP).
- Should mixed-format (numeric and associative arrays) be allowed? (e.g. [1,'two':2, 3])
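As context for the mixed-format question, the existing array() syntax already permits mixing positional and keyed elements, and positional elements keep counting from the last integer key. A minimal sketch of the current behaviour (whether the proposed literal should match it is the open question):

```php
<?php
// Existing syntax: positional and keyed elements in one array.
$mixed = array(1, 'two' => 2, 3);

// Positional elements receive consecutive integer keys; 'two' is skipped over.
var_dump(array_keys($mixed)); // keys are 0, "two", 1
```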