Table of Contents

PHP RFC: include cleanup

Introduction

The PHP code base has grown over more than 2 decades, which leaves some room for code cleanups. While attempting to fix several crash bugs in the JIT, I found that PHP's C headers (*.h) and the existing #include directives are inconsistent, incomplete and bloated; many things just worked by chance, not by design, because there were a few headers which just included everything.

I wanted to help clean up this mess that had accumulated over two decades. After four PRs doing exactly that were widely welcomed and accepted, there was resistance from two individuals who demanded my changes to be reverted. This resistance surprised me, because I previously thought those code cleanups were common sense, and just nobody had so far taken the time to do it.

Proposal

I propose an addition to PHP's coding standard describing how C headers and #includes should be managed:

  1. Include only the headers (`*.h`) that are really needed.
  2. Use struct/union forward declarations to eliminate `#include` directives if possible.
  3. If some headers are needed only in some build configurations, enclose them in the same `#if` that also guards the use of its definitions.
  4. Each source file (`*.c`) should include its own header file first to ensure that the header's `#includes` are complete.
  5. Header inclusions are ordered this way: its own header first, then PHP headers, then third-party library headers, then system headers (e.g. libc, POSIX).

(GitHub PR with the same contents: https://github.com/php/php-src/pull/10338)

Then, the reverted cleanups should be reconsidered:

I have already posted a new draft PR which contains all these, plus some more: https://github.com/php/php-src/pull/10345

Advantages of this proposal:

Disadvantages of this proposal:

Criticism issued on this proposal:

Backward Incompatible Changes

This proposal has no runtime effect. The PHP language is not affected, and the PHP executable will not change.

At the C source code level, reducing #includes may break the build of third-party extensions which forgot to include a header that was previously implied by PHP, e.g. “errno.h” is included for no reason, and extensions may rely on that.

If staying compatible with defective extensions is deemed important, these includes may be re-added to a header such as “php_compat.h”, possibly with a way to opt-out. That way, broken extensions still build, but PHP itself still benefits from a smaller and more correct set of #includes.

Proposed PHP Version(s)

“master” branch (which may eventually become 8.3 or 9.0?).

RFC Impact

To SAPIs

No runtime impact.

The build of out-of-tree SAPIs may break for the same reasons as described above.

To Existing Extensions

No runtime impact.

The build of out-of-tree extensiosn may break for the same reasons as described above.

To Opcache

No impact.

Being an in-tree extension, opcache benefits from those code cleanups. My PR contains cleanup commits for opcache.

New Constants

Not applicable.

php.ini Defaults

Not applicable.

Open Issues

Unaffected PHP Functionality

The PHP language is completely unaffected.

Future Scope

Not applicable. This is an incremental improvement to PHP's source code.

Proposed Voting Choices

Vote

Voting started on 2023-02-01 (two weeks after the RFC was posted to php-internals) and will end on 2023-02-15 at 15:00 UTC (another two weeks).

Primary vote:

Should #include directives be cleaned up?
Real name Yes No
alcaeus (alcaeus)  
alec (alec)  
beberlei (beberlei)  
bukka (bukka)  
bwoebi (bwoebi)  
derick (derick)  
dmitry (dmitry)  
gasolwu (gasolwu)  
girgias (girgias)  
kalle (kalle)  
kguest (kguest)  
levim (levim)  
nicolasgrekas (nicolasgrekas)  
ocramius (ocramius)  
petk (petk)  
pollita (pollita)  
stas (stas)  
svpernova09 (svpernova09)  
theodorejb (theodorejb)  
timwolla (timwolla)  
twosee (twosee)  
Final result: 11 10
This poll has been closed.

Secondary (clarifications on how to clean up):

Is it allowed to document an #include line with a code comment?
Real name Yes No
bukka (bukka)  
bwoebi (bwoebi)  
derick (derick)  
dharman (dharman)  
kguest (kguest)  
levim (levim)  
mcmic (mcmic)  
ocramius (ocramius)  
pollita (pollita)  
svpernova09 (svpernova09)  
Final result: 1 9
This poll has been closed.
Is it allowed to forward-declare structs/unions/typedefs?
Real name Yes No
bwoebi (bwoebi)  
derick (derick)  
dharman (dharman)  
kalle (kalle)  
kguest (kguest)  
levim (levim)  
pollita (pollita)  
svpernova09 (svpernova09)  
Final result: 1 7
This poll has been closed.
Is it allowed to split a large header to reduce dependencies?
Real name Yes No
alcaeus (alcaeus)  
alec (alec)  
bwoebi (bwoebi)  
derick (derick)  
dmitry (dmitry)  
gasolwu (gasolwu)  
girgias (girgias)  
kalle (kalle)  
kguest (kguest)  
levim (levim)  
mcmic (mcmic)  
nicolasgrekas (nicolasgrekas)  
ocramius (ocramius)  
pollita (pollita)  
svpernova09 (svpernova09)  
theodorejb (theodorejb)  
timwolla (timwolla)  
twosee (twosee)  
Final result: 12 6
This poll has been closed.

Patches and Tests

Implementation

https://github.com/php/php-src/pull/10338 tracks the specification. https://github.com/php/php-src/pull/10345 tracks my proposed implementation.

https://github.com/php/php-src/pull/10410 is a PR splitted from https://github.com/php/php-src/pull/10345 with some minimal cleanup, as suggested on php-internals by Jakub Zelenka; it contains compatibility tweaks so (bad) third-party extensions do not break, such as always including errno.h from php.h

https://github.com/php/php-src/pull/10404 is a PR that adds a few third-party extensions to the CI; more extensions could easily be added. This is slighly out of this RFC's scope, but may help those who worry about extension breakages.

If voters decide that #include directives should not have code comments, then existing comments should be removed to reduce “clutter”; see https://github.com/php/php-src/pull/10472

References