
Language Feature Usage Survey

Lead: Bishop Bettini, Working Group Members: TBD

The engine development team (“internals”, internals@lists.php.net) continually evolves the PHP language, doing so with a healthy respect for preserving backwards compatibility (BC). When the need - or option - for a compatibility break arises, internals asks the question: “is the risk of breaking this behavior for existing code worth the gain of doing so?”

It's a tough - and subjective - question to answer. Given the sheer volume of PHP installations, it's impossible to know how many sites would be affected by a BC break. So, without robust information, the internals team rushes into the debate using a variety of appeals:

  • The appeal to statistics. The reasoning goes like this: “The feature has been around a long time, and there are millions of sites, so it's likely that a high percentage of those sites would be affected.”
  • The appeal to personal experience. This reads like: “I have never used this feature. I've asked all my colleagues: they've never used this feature. I've been doing this for decades, and I've seen it all, so this feature must not be used at all.”
  • The appeal to Packagist. “I scanned the N most popular open-source packages and counted how many times the feature's used. It was used X times, and 80% of those were in tests, so it's not used much in production code.”

Those involved in the debate pick a statistic that resonates, then argue using that statistic as if it were a fact. Unfortunately, these are not facts. They're cherry-picked statistics, and that is no way to make a crucial decision.

Idea

Let's ask users to send us information about how they're using PHP. Let's build an official static analyzer, which writes its results into a published interchange format, and offers to securely send analysis results to an official repository. Let's work with existing static analyzers and code editors to integrate this into their products. Let's work with continuous integration companies to get the tool included in their pipeline offerings. Let's promote it on Reddit, Packagist, GitHub, Stack Overflow, and anywhere else developers visit.
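To make the "analyzer writes an interchange format" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two tracked features, the regular expressions, and the JSON field names (format_version, files_scanned, features) are hypothetical, since no format has been defined yet. It is written in Python for brevity; the real tool would have to be written in PHP to meet the requirements below, and would inspect the token stream or AST rather than match text.

```python
import json
import re

# Hypothetical feature patterns, for illustration only. Text matching is crude:
# it also counts occurrences inside strings, comments, and method calls.
FEATURE_PATTERNS = {
    "create_function": re.compile(r"\bcreate_function\s*\("),
    "each": re.compile(r"\beach\s*\("),
}


def count_features(source):
    """Count how often each tracked feature appears in one PHP source string."""
    return {
        name: len(pattern.findall(source))
        for name, pattern in FEATURE_PATTERNS.items()
    }


def to_interchange(counts, files_scanned):
    """Serialize counts into a hypothetical JSON interchange record."""
    record = {
        "format_version": 1,             # assumed field: lets the format evolve
        "files_scanned": files_scanned,  # assumed field: gives counts a denominator
        "features": counts,
    }
    return json.dumps(record, sort_keys=True)


sample = (
    "<?php\n"
    "$f = create_function('$x', 'return $x;');\n"
    "while (list($k, $v) = each($arr)) {}\n"
)
report = to_interchange(count_features(sample), files_scanned=1)
print(report)
```

The point of the sketch is the shape of the output: aggregate counts per feature, not source code, which other analyzers and editors could also emit if the format were published.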

Benefits

To PHP users:

  • Representation. When they participate, they know internals will consider their specific usage in evolving design.
  • Early warning. If internals makes a breaking change, be it big or small, the analyzer will flag it.
  • Quality guidance. If internals has deprecated a feature, or has detected a usage that may be vulnerable, the analyzer will flag it.

To PHP internals:

  • Data, data, data. With real-world metrics at our fingertips, we can make informed decisions.
  • Encourage change. If a feature has little real-world usage, we have the option to proactively remove it.
  • Discourage disruption. If a feature has substantial real-world usage, RFC authors can save their time and work on something that can be changed.

To existing static analysis tools:

  • Offload work. Fewer rules to maintain, as they can be handled upstream.
  • An open interchange format.

To code editors and continuous integration service vendors:

  • Distinguish themselves from the competition.
  • Help their customers produce better code, meaning more opportunity to gather new customers.

Requirements

The tool must be:

  • Capable of running correctly on PHP 5.0+
  • Quick to run / Efficient in time (measured in RPS, rules per second)
  • Small footprint / Efficient in space (measured in bytes)
  • Available as a standalone program, from an out-of-band download
  • Available to download, in-band, from the PHP CLI executable
  • Analysis results:
    • Sending results to official PHP repository is _optional_
    • Transmission is over secure channels only, and visibly so
    • Anonymous by default
    • Resistant to spamming
    • Timestamped
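
The requirements on analysis results could be sketched as follows. This is an illustration under stated assumptions, not a design: the function name build_submission, the consent flag, and the payload fields are hypothetical. It covers only the "optional", "anonymous by default", and "timestamped" items; secure transmission and spam resistance are out of scope here. Python is used for brevity only.

```python
import json
from datetime import datetime, timezone


def build_submission(features, consent, now=None):
    """Return a JSON payload for upload, or None when the user has not opted in.

    `now` is an optional timezone-aware datetime, accepted so callers can
    supply a fixed time; it defaults to the current time.
    """
    if not consent:
        # Sending is optional: without consent, nothing is prepared for upload.
        return None

    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    payload = {
        # Timestamped, normalized to UTC.
        "submitted_at": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Aggregate counts only. Deliberately no hostname, file path, user,
        # or project name, so the payload is anonymous by default.
        "features": dict(features),
    }
    return json.dumps(payload, sort_keys=True)


# Without consent, there is nothing to send.
print(build_submission({"each": 3}, consent=False))

# With consent, the payload carries counts and a timestamp, nothing else.
print(build_submission({"each": 3}, consent=True))
```

Making consent an explicit argument, rather than a default, means a caller cannot produce an upload by accident.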

Progress

Idea definition only.

Status

Idea only.

ideas/language-feature-usage-survey.txt · Last modified: 2019/10/11 01:50 by bishop