P++ idea: FAQ
- Date: 2019-08-09
- Author: Zeev Suraski, zeev@php.net
This is a clarifying FAQ for the idea presented on internals@. It attempts to address many points that were raised repeatedly in the discussion that ensued.
A list of concerns about this idea has been compiled by Arnold Daniels. Some of them are addressed here.
Note: P++ is a temporary code name and is subject to change.
What is this all about?
Trying to shorten the lengthy email into a couple of points:
- There are two big, substantial schools of thought in the PHP world. The first likes PHP roughly the way it is - dynamic, with a strong BC bias and an emphasis on simplicity. The other prefers a stricter language, with reduced baggage and more advanced/complex features.
- There's no 'right' or 'wrong' here. Both schools of thought are valid, and have a very substantial following. However, it's challenging to create a language that caters to both of these crowds at the same time - which is a constant source of contention on internals@.
- The proposal is to create a new dialect of PHP (code named P++) that will live alongside PHP, but won't be bound by the historical philosophy behind the language. In other words, this new dialect could be inherently more strict, it could be more daring with BC and remove elements that are considered 'baggage' (such as short tags), and it could add more complex features - especially ones that are a good fit for strictly typed languages - without having to introduce the same complexity to the PHP dialect.
- This is not a fork. There will be a single codebase, worked on by the same developers. The vast majority of the code will be shared; only the specific points of difference between the two dialects will have different implementations. It is somewhat similar to what was done with strict_types in PHP 7 - only on a larger scale.
Do we really need to do all that just because some folks can't give up on short tags?
This is not related to short tags, and the short tags deprecation RFC was not the main motivator for this idea. The goal of this proposal is a lot more ambitious - it's to provide a clear vision for PHP - and to hopefully finally put to rest the tension between the two schools of thought on internals@ - by providing both of them with what they want.
Why fork PHP?
This is not a fork. There will be a single codebase, versioned together and developed by the same people. The binaries will be identical - if you install PHP, you are also installing P++, and vice versa. The same binary will run your PHP, P++ or combined PHP/P++ apps.
While it's not yet clear how one would 'mark' a file as a P++ file, it would probably be some sort of a special header at the top of the file, such as:
<?p++?> <?php 'Hello, world!'; ?>
In addition, we may find ways to mark entire namespaces as P++, so that frameworks don't have to explicitly mark every individual file as P++.
This means doubling our dev efforts, while internals@ is already low on contributors. How will we deal with that?
Thankfully, it doesn't mean that at all. The vast majority of code will be shared between the PHP mode and the P++ mode - both in source and at runtime.
Data structures, key subsystems, extensions, web server interfaces, OPcache - and almost everything else - will be the exact same code running regardless of whether the file being executed is a PHP one or a P++ one. The only additional development overhead will be the specific areas of difference between PHP and P++.
It's true that it means we'd have to maintain two versions of certain pieces of code, and that we'll have some if() statements in various places - as P++ is likely to have additional checks compared to PHP. However, these are elements that would have to be introduced anyway if we're ever to move towards a stricter version of PHP. Moreover, since even folks in the strict crowd don't suggest that we move towards a stricter future without providing a migration path - effectively, the efforts involved with this approach and virtually any other approach are similar.
Why not just make a perpetual PHP 7.4 LTS and be done with it, as we move to a stricter PHP 8/9?
There are many issues with this approach, but these are probably the most important ones:
- For the dynamic crowd - more strictness is not equivalent to progress, and as such - they don't want to see future versions of PHP forcing them in that direction. They still want to get other types of new features (non-strictness related), better performance, bug fixes, new extensions and such. Making PHP stricter with newer versions means that for many users - upgrading would mean going backwards as far as their development preferences are concerned.
- Equally important - it's remarkably difficult - arguably impractical - from a development effort point of view. Unlike this proposal - which aims to continue supporting both dialects in the same codebase - having a version that would no longer be actively developed, but would still have to be maintained for security and critical bugfixes over the course of over a decade - requires resources that we simply don't have (it is arguably a kind of a fork).
Will I need to choose between PHP and P++?
Yes and no. As mentioned above, when you install one - you'd also have the other - so as far as apps go - you'd be able to run both dialects on a single server, even within the same app. However, practically speaking, projects and individuals are likely to typically pick and standardize on one or the other - similarly to how things went down with strict_types.
Will I be able to mix and match PHP and P++ in the same app?
Yes. While we need to work out the exact mechanics, the designation of whether code is PHP or P++ will be at the file level - not at the request level. A single execution (request) may load many different files, and these files could be from both dialects. Code from PHP files will behave with the PHP semantics - while code from P++ files will behave with P++ semantics. Here too - this would be similar to strict_types.
While this may sound awkward at first - there could be very practical use cases for this. For instance - a P++ only framework that is being used by a PHP application - or vice versa. For those of you familiar with C and C++ - this is somewhat similar.
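Since the file-level switch is compared to strict_types, it may help to see how that existing mechanism behaves today. The following is a minimal sketch using current PHP only; P++ has no implementation, so nothing here shows P++ semantics. The file names, the temp directory and the add_ints() helper are made up for illustration.

```php
<?php
// Illustration only: strict_types is decided per file, by the file that
// makes the call. All names below (lib.php, loose.php, strict.php,
// add_ints) are hypothetical.
$dir = sys_get_temp_dir() . '/dialect_demo_' . uniqid();
mkdir($dir);

// Shared library file: declares a typed function.
file_put_contents("$dir/lib.php", '<?php
function add_ints(int $a, int $b): int { return $a + $b; }
');

// Caller without strict_types: string arguments are coerced to int.
file_put_contents("$dir/loose.php", '<?php
return add_ints("2", "3");
');

// Caller with strict_types: the same call raises a TypeError.
file_put_contents("$dir/strict.php", '<?php
declare(strict_types=1);
try {
    return add_ints("2", "3");
} catch (TypeError $e) {
    return "TypeError";
}
');

require "$dir/lib.php";
$loose  = require "$dir/loose.php";
$strict = require "$dir/strict.php";

var_dump($loose);  // int(5)
var_dump($strict); // string(9) "TypeError"

// Clean up the temporary files.
array_map('unlink', glob("$dir/*.php"));
rmdir($dir);
```

Both callers run in the same request and call the same function, yet each gets the semantics chosen by its own file. A per-file PHP/P++ designation would presumably follow the same pattern, though the exact mechanics are still open.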
Does it mean PHP will no longer evolve? Will all new features go into P++?
No, it just means it'll evolve differently. Strictness and type related features are likely to go just to P++, and only be available in P++ files. Bias for BC will remain in PHP (which doesn't mean BC would never be broken - just that there'd have to be a good return-on-investment case for each such break).
However - unrelated features - such as performance improvements in the engine (e.g. JIT), developments in extensions, or new async-related features - will be available for both PHP and P++.
What are the benefits of this approach?
There are numerous benefits to this approach. First, it gives both camps on internals@ - and beyond - a good solution to their aspirations. Those who prefer the dynamic nature of PHP get to keep it, while those who prefer a more strictly typed language - get to obtain it without being bound by any limitations of PHP. The alternative to that is a zero sum game - where the win of one group is the loss of the other, and vice versa.
Beyond being a good technological solution - that enables us to support our entire audience with the least amount of effort - this could also bring an end to a key source of contention on internals@ in recent years.
Finally - although most of the readers of this document are likely to be technology people - it should be noted that launching P++ with a clean slate could have substantial positioning/branding advantages. Companies, development managers and individual developers who have ruled out PHP are more likely to take note of a P++ launch than of a launch of PHP 8.0 or PHP 9.0.
Aren't we risking fragmenting the userbase?
To a degree, we are. But this isn't a flaw of this idea - but a representation of reality as it already exists on the ground.
As mentioned above, there's a huge crowd out there that likes the dynamic nature of PHP, and looks warily at the attempts to make it more and more type-oriented.
At the same time - there's another huge crowd out there that looks at PHP and thinks to themselves “why is it evolving so slowly towards finally getting rid of this dynamic nonsense?”
There's no right or wrong here. Both points of view are valid. When we look at potential solutions to bridge between these two contradictory points of view, there aren't too many of them available:
- Stick with dynamic PHP. This will not be accepted by the proponents of a stricter language.
- Evolve towards a stricter PHP. This will not be accepted by the proponents of a more dynamic language.
- Fork the codebase. This is a net loss option for everyone involved, regardless of how it's done. There's no technological advantage to doing that, and we don't have enough contributors to do that even if we wanted to (which we don't).
- Come up with some creative solution to cater to both audiences. This is what this proposal attempts to do. It does that while keeping the project itself unified, and while ensuring perpetual interoperability between the two dialects - so that while there'll be some level of fragmentation, it will be the minimal one possible that still addresses everyone's primary needs.
How does this differ from Nikita's Editions idea?
There are many similarities between these two ideas, but also a couple of substantial differences. Note that this is based on a limited understanding of the Editions approach, so parts may be lacking, inaccurate or incorrect.
- In this proposal - there is a stated goal to keep the current, dynamically typed PHP - as a long term, fully supported, equal-among-equals dialect. The Editions approach views the current behaviors as 'legacy'. That means it may be discouraged, and then, at some point, deprecated and removed.
- The rollout strategy is quite different. The P++ proposal aims to first focus on the compatibility-breaking elements - things like strict ops, changes to type conversion logic, array index handling, requiring variable declarations, etc. - and aims to deliver them in the first installment of P++. That's with the goal of allowing new projects/frameworks to start fresh, without having to expect a major rewrite a year or two down the line when more compatibility-breaking changes are introduced. The Editions proposal appears to have no such goal - but instead, aims to gradually add/change elements in PHP.
- Related to rollout - the Editions approach isn't limited to two dialects, but allows for any number of them. We could have a PHP2020 dialect, along with a PHP2022 dialect and a PHP2027 dialect. If we keep them all - this may actually increase our maintenance complexity.
- This proposal also mentions different BC-break strategies for PHP vs P++ (conservative vs. aggressive), while Editions will likely not touch that topic at all.
- The Editions proposal does not have quite the same positioning/marketing aspect that this proposal does.
It's important to note that the two ideas aren't necessarily mutually exclusive. We could introduce P++ and evolve it with editions, especially if it proves too difficult to get all the compat-breaking changes into the first installment of P++.
What are the challenges?
There's no shortage of challenges before we can run our first P++ app.
- We need to get buy-in. That means that folks from both schools of thought need to give up on a dream of having PHP be entirely dynamic or entirely statically typed while disregarding those who think differently from them. This appears to be a very substantial challenge.
- In order to be successful, the first version of P++ should deal with all, or at least most, of the compatibility-breaking changes from PHP - so that developers who make the (probably fairly painful) switch won't have to reaudit/radically refactor their code once more in the future. Some have voiced concern that this may be too optimistic to do in one installment with the limited developer-power we have. We'd have to evaluate that once we have a better idea of what that list is. Note that it does not mean we need to implement any and all ideas we may have for P++ in this first version - just that we should prioritize elements that would trigger substantial end-user code rewrites - and try to handle them before our first release.
- Of course, the most challenging of all - we need to find a reasonable name for this new dialect.
This is Hack all over again, isn't it? Why would it fare any better?
While conceptually the motivations for both P++ and Hack are similar - there are at least two critical differences between the two - each of which is likely sufficiently big to change the expected outcome.
- Hack was/is developed by a single company, and not as an open process by volunteers. Even if the vendor that's behind it is gigantic - companies and individuals were often reluctant to standardize on such a platform.
- Perhaps more importantly - Hack (and HHVM) did not have PHP's gigantic distribution vehicle.
  - For Hack, it was an uphill battle for users to even give it a try:
    - They had to learn about its existence, and be sufficiently interested to learn more about it.
    - Assuming they were interested enough to give it a try - they had to go through the trouble of setting it up - using entirely different methods from the ones they were used to from the PHP days (different layout, different configuration, different everything).
  - With P++ - this is a radically different story from the ground up:
    - Every user of PHP (starting with 8.0, or whenever we make it available) - will have it available on their servers. You will not have to install anything, or set anything up - it will simply be there.
    - This in turn means that virtually anybody running a Linux distro, a recent version of WAMP, a recent version of MAMP - millions of servers and development workstations will have access to P++ without having to do anything proactively.
    - In terms of awareness - since P++ will be a big part of the “What's new in PHP 8” - it will enjoy free marketing like Hack could only dream of - similar to the PHP 7 performance splash (few in the PHP world are unaware of it).
    - Of course, it doesn't mean that everyone will want to start using it - but the barrier to entry with P++ is many orders of magnitude lower than what Hack had to face.
What are the general concerns?
Arnold Daniels compiled a list of concerns about this proposal.
Some of them are addressed here:
Converting PHP code to P++ code is not trivial
That may be true - but it ultimately depends on what we decide to put into P++. This proposal assumes that the contents of what we'd want to do would be similar, regardless of whether we deliver it using declare()s, Editions or a unified P++ dialect. The premise of this proposal is that there's a large group of people in the PHP space that want to change PHP to be substantially different from the way it is today - making it increasingly more strong- and statically-typed. It also assumes that this isn't a bad thing - as long as we don't treat it as a zero sum game with those who want to keep PHP more dynamic and loosely-typed, as it presently is.
PHP tooling will not support P++
It's important to understand that technologically - it'll actually be slightly simpler for vendors to support P++ vs. having to support granular declare()s or an unlimited number of editions. There's no reason to assume that it'll be treated any differently than if similar features/changes are introduced and delivered using a different mechanism.
It's not possible to do a cleanup without breaking PHP compatibility
That is true - but that is actually a good reason to consider introducing this new dialect, and not vice versa. Many proponents of strict also want to make bigger leaps in terms of breaking backward compatibility. Today - there's no other option except for a zero sum game with folks who may not be so fond of breaking BC (especially if it's in order to make PHP stricter). There have been numerous instances of that in recent times, and it seems many more are in store for the future.
Regarding the specific examples brought up by Andi:
- Removing array() will have no impact on compatibility of P++/PHP - it's just an older, equivalent spelling of the more modern [] syntax.
- Removing the global namespace for functions (if we do it) will only affect P++ code (i.e., access to it would be removed) - it will still be there in PHP code.
It's important to stress that neither of these ideas has been discussed to date, and they may or may not be proposed for future inclusion in P++.
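As a point of reference for the array() example, the two spellings already produce identical values in current PHP. That is why dropping one spelling from a hypothetical P++ would not change the data exchanged between the two dialects. A small check in today's PHP (the variable names are illustrative only):

```php
<?php
// array() and [] are two spellings of the same array literal.
$long  = array('a' => 1, 'b' => array(2, 3));
$short = ['a' => 1, 'b' => [2, 3]];

// Identity comparison: same keys, same order, same value types.
var_dump($long === $short); // bool(true)
```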
The popularity of Python doesn't have to do with typing
This document - and the proposal in general - does not claim that strong/static typing is a good or a bad idea. It purposely doesn't take a position on which side is “right”. What it does do is acknowledge that there are two substantially opposed schools of thought among PHP users - and provide a proposal on how the project can evolve to address both of them in an efficient and productive manner. That said - clearly, there are a lot of people who think a strongly-typed PHP would be a better choice, so having that option may indeed increase its popularity.
Is there really a need for a different dialect?
One of the axioms that many in the 'strict' camp appear to believe in is that a more strongly-typed and more statically-typed language means progress - and that the main question is how we can deliver on it. Can we do it in PHP 8 - while keeping the dynamic crowd on a legacy 7.4 version? Should we perhaps do it more gradually by releasing changes every few years, until we get to where we want to be? For that crowd - it needs to be clear that for people who prefer dynamic, loosely-typed languages - strong & static typing aren't progress - and it doesn't matter if it happens overnight or over the course of a decade.
At the same time - many other pro-strict folks are more pragmatic, and want to simply add optional strictness - along the lines of strict_types. This, arguably - can be called progress - it's not regressive for anybody, and it does provide progress to folks who prefer a more strongly-typed / statically typed language. This will likely be the direction we go for - which means that we'll already have different dialects available anyway. It's really a matter of whether we'd have 2^N dialects (granular declare()s), N dialects (Editions), or 2 (PHP/P++).
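To make the 2^N / N / 2 comparison concrete, here is a small illustrative calculation. It assumes N independent on/off declare() switches, where every combination counts as a distinct dialect, and N editions that are all kept around. The function name dialect_counts() is hypothetical.

```php
<?php
// Illustration only: number of dialects to reason about under each approach,
// assuming N independent switches (declare()s) or N retained editions.
function dialect_counts(int $n): array
{
    return [
        'declares' => 2 ** $n, // every on/off combination of N switches
        'editions' => $n,      // one dialect per retained edition
        'ppp'      => 2,       // PHP and P++, regardless of N
    ];
}

foreach ([1, 3, 5, 10] as $n) {
    $c = dialect_counts($n);
    printf(
        "N=%2d: declare()s=%4d, editions=%2d, PHP/P++=%d\n",
        $n,
        $c['declares'],
        $c['editions'],
        $c['ppp']
    );
}
```

With as few as ten independent switches, the granular approach already yields 1024 possible combinations, while the PHP/P++ split stays at two.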