
PHP RFC: Switch from json extension to jsonc

Introduction

Since PHP 5.2.0, JSON format support is provided by ext/json.

During a code license review, the Debian project discovered an issue with the code used, reported as Bug #63520.

Effectively, the code from json.org is not free, because its license includes a restriction on freedom 0 (run the program for any purpose): “The Software shall be used for Good, not Evil”. Discussion of this position is out of scope for this RFC.

The encoder code has been free since PHP 5.4.10 (Bug #63588).

Proposal

The jsonc extension, currently available from the PECL site, is designed to be a drop-in alternative (same user API and same internal ABI for other extensions).

  • Same encoder as in PHP 5.5
  • Parser provided by the json-c library (MIT license). The build can use the system library (--with-libjson) or the bundled copy (currently version 0.11 plus some patches awaiting upstream review)

While the main purpose of this RFC is to fix the licensing issue, it also introduces some new features.

As the new parser is an incremental one, the new JsonIncrementalParser class exposes this feature:

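$parser = new JsonIncrementalParser();
$fic = fopen("somefile.json", "r");
// Feed the parser line by line until the file ends or the parser stops asking for more data.
while (($buf = fgets($fic)) !== false) {
    $ret = $parser->parse($buf);
    if ($ret != JsonIncrementalParser::JSON_PARSER_CONTINUE) {
        break;
    }
}
fclose($fic);
$result = $parser->get();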

It also allows parsing a file without loading it into memory:

$parser = new JsonIncrementalParser();
$ret = $parser->parseFile("somefile.json");
$result = $parser->get();

The json-c parser provides two strictness modes:

  • strict mode, used by default (for compatibility with the previous implementation)
  • standard mode, available using the JSON_PARSER_NOTSTRICT option

Having a non-strict parser could be useful for reading configuration from a manually edited file, such as:

/*
Foo configuration file
*/
{
    "temp": "/tmp", // directory
    "debug": true,  // boolean
}
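
As a rough sketch of how such a file might be read, the snippet below decodes the configuration text shown above. It assumes that JSON_PARSER_NOTSTRICT is passed through the options argument of json_decode(); the $config and $settings names and the fallback to an empty array are illustrative only. The constant is looked up with defined() so that the script also runs where jsonc is not loaded.

```php
<?php
// Illustrative config text, mirroring the hand-edited file shown above.
$config = <<<'JSON'
/*
Foo configuration file
*/
{
    "temp": "/tmp", // directory
    "debug": true,  // boolean
}
JSON;

// JSON_PARSER_NOTSTRICT only exists when the jsonc extension is loaded;
// fall back to 0 (no extra options) elsewhere so the script still runs.
$options = defined('JSON_PARSER_NOTSTRICT') ? constant('JSON_PARSER_NOTSTRICT') : 0;

// Assumption: the flag is given as json_decode()'s options argument.
$settings = json_decode($config, true, 512, $options);

if ($settings === null && json_last_error() !== JSON_ERROR_NONE) {
    // A strict parser rejects the comments and the trailing comma,
    // so use an empty configuration instead of failing.
    $settings = array();
}
```

With a strict parser the decode fails and $settings ends up as an empty array, which makes the difference between the two modes easy to observe.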

Backward Incompatible Changes

Big integer parsing (and the JSON_BIGINT_AS_STRING option) is only partially implemented. It only works in a 32-bit build, for values fitting in a 64-bit integer (not managed by PHP, so returned as string or float). Note: no natural encoder will generate such data.

As the new parser implements different error codes, a parser error is always reported by json_last_error() as JSON_ERROR_SYNTAX (JSON_ERROR_STATE_MISMATCH and JSON_ERROR_CTRL_CHAR are kept for compatibility but never used).

json_last_error_msg() returns the error string from the json-c library.
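
Code that depends on large identifiers can be written so that it does not care which type the parser returns. The following is a minimal sketch, not part of the extension: decode_id() and the "id" key are hypothetical names, and the value used fits in a 64-bit integer, which is the case described above.

```php
<?php
// Hypothetical helper: return the "id" member as a decimal string,
// or null when it is missing or its precision has already been lost.
function decode_id($json)
{
    $data = json_decode($json, true, 512, JSON_BIGINT_AS_STRING);
    if (!is_array($data) || !isset($data['id'])) {
        return null;
    }
    $id = $data['id'];
    // int: the value fits the native integer of this build.
    // string: the parser honoured JSON_BIGINT_AS_STRING.
    if (is_int($id) || is_string($id)) {
        return (string) $id;
    }
    // float (or anything else): the exact digits cannot be trusted.
    return null;
}

$id = decode_id('{"id": 9007199254740993}');
```

Treating a float result as a failure is a design choice of this sketch: it avoids silently using a rounded identifier.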
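
Because individual error codes and message texts may differ between parsers, callers are probably safest comparing json_last_error() against JSON_ERROR_NONE only and treating the message as opaque text. A small sketch of that pattern follows; decode_or_report() is a hypothetical helper name, and only standard json functions are used.

```php
<?php
// Hypothetical helper: decode JSON and report failure without
// depending on a specific error code or message wording.
function decode_or_report($json)
{
    $value = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        return array(
            'ok'    => false,
            'value' => null,
            // Message text comes from the parser in use; log it, do not match on it.
            'error' => json_last_error_msg(),
        );
    }
    return array('ok' => true, 'value' => $value, 'error' => null);
}

$good = decode_or_report('{"a": 1}');
$bad  = decode_or_report('{"a": ');
```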

Proposed PHP Version(s)

PHP 5.6

Note that the pecl/jsonc extension is already adopted by various Linux distributions which cannot provide non-free code:

  • Debian since PHP 5.5 / Jessie
  • Fedora since PHP 5.5 / Fedora 19
  • Mageia since PHP 5.5
  • Ubuntu since PHP 5.5 / Saucy

SAPIs Impacted

All

Impact to Existing Extensions

  • Replace the json extension with jsonc (renamed to json, of course)

New Constants

JSON_PARSER_NOTSTRICT, which reduces parser strictness:

  • comments are allowed
  • trailing characters after the data are ignored
  • trailing commas in lists are ignored
  • etc.

JSON_C_BUNDLED: boolean, true if the bundled json-c library is used, false if the system one is used

JSON_C_VERSION: version of the json-c library
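
Since these constants only exist when the json-c based parser is in use, scripts can detect the backend with defined(). The sketch below is illustrative; describe_json_backend() and the strings it returns are hypothetical.

```php
<?php
// Hypothetical helper: describe which JSON backend this build uses,
// based only on the constants introduced by this RFC.
function describe_json_backend()
{
    if (!defined('JSON_C_VERSION')) {
        return 'json (no json-c constants defined)';
    }
    $bundled = defined('JSON_C_BUNDLED') && constant('JSON_C_BUNDLED');
    return sprintf(
        'json-c %s (%s library)',
        constant('JSON_C_VERSION'),
        $bundled ? 'bundled' : 'system'
    );
}

$backend = describe_json_backend();
```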

Open Issues

None

Unaffected PHP Functionality

No change in the PHP engine. No change for other extensions.

Future Scope

Speed improvement.

As the original author (omar) no longer seems to be involved, I could maintain this extension in the future.

Implement some RFEs, such as Bug #65082 (new option for replacing ill-formed byte sequences with a substitute character).

Proposed Voting Choices

  • Yes (switch from json to jsonc in php 5.6)
  • No (keep using non-free stuff in php)

Patches and Tests

Current sources: https://github.com/remicollet/pecl-json-c

Note: the test suite from the original json extension is kept. New features have new tests.

Implementation

References

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/free-json-parser.txt · Last modified: 2013/08/29 07:50 by remi