Ideas for the Google Summer of Code 2008
Here you'll find a couple of ideas for Google Summer of Code projects. This list is not exhaustive and you may propose any “crazy” idea of your own.
Before you submit your proposal, you are encouraged to contact the possible mentors for the project you are applying for. If the project has no mentor assigned or if you are submitting an off-list project, please contact one of our mailing lists to discuss the proposal before submitting it.
If your project is to be written in PHP, please make sure you read the PEAR Coding Standards when applying.
If you are applying for a project in the PHP code itself (in C), you may find the PHP hackers guide useful; it also includes our C coding standards (TBD).
Your proposal should include the following:
- Name and e-mail
- Detailed description of what you intend to do including, if possible, a list of quantifiable results
- Project Schedule: How long will the project take? When can you begin working?
- Availability: How many hours per week can you spend working on this? What other obligations do you have this summer?
- Bio: Who are you? What makes you the best person to work on this project?
PECL, Website Improvements, Windows Build support improvements
Possible mentor: Pierre A. Joye, Elizabeth Marie Smith, Helgi Þormar Þorbjörnsson
This project is really two separate but related ideas.
The first portion involves reorganizing the current PECL website so that it is, first of all, usable, solid, and easy to maintain (using the pearweb structure and release strategies). This requires nothing more than strong PHP skills, optionally some PEAR knowledge, and a desire to make life easier for PECL developers and for people using PECL packages. Beyond the reorganization, there are additional features that would be nice to have on the site.
The second, related issue is binary releases for PECL extensions. PHP currently provides only Windows binaries, and only as CVS snapshots; expanding this would improve the distribution of PECL extensions. It would also be useful to support on-demand or automatic builds from a given set of CVS tags or branches (snapshot builds already exist). The DLLs (and perhaps binaries for other architectures) can then be distributed via the release packages and installed via the pecl channel.
See the PECL
Automatic Code Checker
Possible mentor: Nuno Lopes
The PHP API has a couple of varargs functions that are error prone and may easily cause segfaults in PHP, especially on less used platforms. The list of such functions includes zend_parse_parameters*(), zend_error() and a few others. Our current check script is written in PHP and is regex based. It is available at http://cvs.php.net/viewvc.cgi/php-src/scripts/dev/check_parameters.php?view=markup. This script is difficult to maintain and generates far too many false positives. The work would involve creating an LLVM clang analysis tool that performs some (data-flow) static analysis and outputs error messages for the problems found. A sample output of the script mentioned is available at: http://gcov.php.net/viewer.php?version=PHP_HEAD&func=params.
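As a rough illustration of the kind of mismatch such a checker looks for, the sketch below counts how many output arguments a zend_parse_parameters() type specifier string expects and compares that with the number actually passed. It is written in Python for brevity and is not the proposed clang tool. The table of specifier widths is an assumption based on the PHP 5 parameter parsing API and covers only a few common specifiers; the function names are hypothetical.

```python
# Assumed number of pointer arguments consumed by each type specifier.
# Only a handful of common specifiers are listed; this is not exhaustive.
SPEC_ARGS = {
    "l": 1,  # long
    "d": 1,  # double
    "b": 1,  # zend_bool
    "s": 2,  # string pointer plus its length
    "a": 1,  # array
    "z": 1,  # raw zval
    "r": 1,  # resource
    "O": 2,  # object plus the class entry it is checked against
}

# Modifiers that change parsing behaviour but consume no argument.
MODIFIERS = {"|", "/", "!"}


def expected_arg_count(spec):
    """Return how many varargs the specifier string should be followed by."""
    count = 0
    for ch in spec:
        if ch in MODIFIERS:
            continue
        if ch not in SPEC_ARGS:
            raise ValueError("unknown specifier: %r" % ch)
        count += SPEC_ARGS[ch]
    return count


def check_call(spec, num_args_passed):
    """Return an error message for a mismatched call, or None if it is fine."""
    expected = expected_arg_count(spec)
    if expected != num_args_passed:
        return "spec %r expects %d argument(s), %d passed" % (
            spec, expected, num_args_passed)
    return None
```

A real tool would also have to check the type of each argument against its specifier, which is where data-flow analysis on the C source becomes necessary.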
Zend bytecode to LLVM bitcode converter
Possible mentor: Nuno Lopes
Make a tool to convert Zend bytecode into LLVM bitcode. In a first phase this could replace the Zend dispatcher/executor, and in a second phase it could perform some simple operations inline (while still relying on the Zend engine for non-trivial opcodes). Benchmarking and exploring optimization opportunities are a plus.
Possible mentor: Nuno Lopes, ...
PHP has some algorithms that aren't asymptotically optimal. E.g. some string handling functions are O(nk), although they could be implemented in O(n). The job is to analyze the current situation and implement better algorithms (along with many tests).
[NOTE: we have had MANY applications for this idea; it would be smart to submit another idea as well]
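To make the O(nk) versus O(n) distinction concrete, here is a minimal sketch of substring search done both ways, written in Python rather than C for brevity. Knuth-Morris-Pratt is used purely as a well-known linear-time example; the text above does not say which PHP functions or which algorithms are meant, so treat the choice as an assumption.

```python
def naive_find(haystack, needle):
    """O(n*k) worst case: the needle is re-compared at every offset."""
    n, k = len(haystack), len(needle)
    for i in range(n - k + 1):
        if haystack[i:i + k] == needle:
            return i
    return -1


def kmp_find(haystack, needle):
    """O(n + k): Knuth-Morris-Pratt never moves backwards in the haystack."""
    if not needle:
        return 0
    # failure[i] is the length of the longest proper prefix of
    # needle[:i + 1] that is also a suffix of it.
    failure = [0] * len(needle)
    j = 0
    for i in range(1, len(needle)):
        while j > 0 and needle[i] != needle[j]:
            j = failure[j - 1]
        if needle[i] == needle[j]:
            j += 1
        failure[i] = j
    # Scan the haystack once, falling back through the failure table
    # instead of restarting the comparison from scratch.
    j = 0
    for i, ch in enumerate(haystack):
        while j > 0 and ch != needle[j]:
            j = failure[j - 1]
        if ch == needle[j]:
            j += 1
        if j == len(needle):
            return i - j + 1
    return -1
```

Both functions return the same index for any input; the difference only shows on adversarial inputs such as a long run of "a" characters searched for "aaa...ab", which is why the project also calls for many tests.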
Implement Unicode into PHP 6
Possible mentor: David Coallier
The main PHP 6 feature is native Unicode support. Implementing this includes updating every function in php-src, a task that is roughly 60% complete.
See Also: Unicode coverage data, README.UNICODE-UPGRADES and the PHP 6 TODO
Replace auto* with CMake
Possible mentor: (it would rock to have another phpize/auto* master do the conversion) and Pierre A. Joye can help with the CMake part. Elizabeth Smith can help with the Windows/MSVC stuff. Kitware will also participate.
Currently there are two configure scripts that need to be maintained, one for Windows and one for everything else, each written in a different language. The Unix build system uses autoconf; however, there are several bugs in older versions, and m4 confuses new and old developers alike. This proposal is to convert all of our autoconf scripts to CMake.
CMake is a cross-platform make system that would generate native makefiles for developers and has a much simpler syntax than m4. Kitware, the company behind CMake, is ready to help us migrate and to improve CMake to fit our needs. They are also ready to port CMake to any unsupported platforms required by PHP. A detailed post will be made on internals (added by Pierre, 03/12).
This move would also let us benefit from CMake's dashboard (a test reporting server), which we could use for our test suite as well (see www.cdash.org for details about the dashboard).
See the CMake Wiki for more information about CMake.
Build Infrastructure and macro bindings for writing extensions in D
Possible mentor: Alan Knowles (can help a bit on the D stuff) - probably needs someone who knows the build tools as well.
D (see Digital Mars D Website) is a compiled language like C/C++ with considerably clearer syntax. Assuming it is feasible, using clean class-based API wrappers written in D could remove a lot of the hurdles that currently exist for writing PHP extensions (memory management, rather cryptic macro knowledge, etc.).
This could make it possible to translate a set of PHP classes into a compiled extension with very small effort.
The implementation of this would probably consist of
- use of GDC (the gcc backend for compiling D)
- changes to the build tools to make building D code (*.d) part of the build process
- Converting standard macros into C functions so they can be made available as extern(C) to D
- Converting the standard structs/unions for which property access is useful (for the rest, just void* would be good enough for the bindings)
This probably needs some discussion on internals.
Make Ilia's Optimizer Production Ready
Possible mentor: Derick, Ilia.
APC features an optimizer interface, but very few optimizations are actually done. Ilia has been working on an optimizer which needs to be tested, cleaned up, and analyzed before it can be part of APC. There is quite a bit of work to do here, including porting to PHP 5.3 (and HEAD). There are also other possible optimizations to be done. After working on this, you'll have a very good understanding of PHP's internals.
Rewrite the run-tests.php script
Possible mentor: Sebastian, Travis.
Note from Travis Swicegood: Why rewrite run-tests.php when it has already been done by the PHPT project? See http://phpt.info for the project's Trac. It still has room for improvement. Parallel test processing is on the list for two major releases from now (see http://phpt.info/ticket/12 for stubbed out info on how to do that in PHP), but if someone wants to implement it before then I'd be happy to help. Multiple output support is already built in, but new Reporters could definitely be added, as could micro-benchmark support.
The run-tests.php script that is used to run PHP's suite of PHPT tests has grown over the years and needs to be refactored. Two areas of improvement include leveraging multi-core systems to run tests in parallel and support for micro-benchmarks in addition to tests.
The implementation of this would probably consist of
- Refactor the existing run-tests.php using PHP 5 language features such as SPL's filesystem iterators
- Leverage multi-core systems to run tests in parallel, for instance by first crawling for .phpt files, creating a Makefile, and then using “make -jN” to distribute the workload among cores
- Add support for micro-benchmarks
- Add support for different logfile formats such as Ant/JUnit's XML format and the Test Anything Protocol (TAP)
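The “crawl, generate a Makefile, run make -jN” idea from the list above can be sketched as follows. This is only an illustration in Python under stated assumptions: the per-test runner command (php run-one-test.php) is hypothetical, and paths containing quotes or other shell-special characters are not handled.

```python
import os


def find_phpt_files(root):
    """Return all .phpt files under root, sorted for stable output."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".phpt"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def build_makefile(phpt_files, runner="php run-one-test.php"):
    """Emit one phony target per test so that make -jN can schedule them.

    The runner command is a placeholder, not an existing script.
    """
    targets = ["test_%d" % i for i in range(len(phpt_files))]
    lines = [
        ".PHONY: all " + " ".join(targets),
        "all: " + " ".join(targets),
        "",
    ]
    for target, path in zip(targets, phpt_files):
        lines.append("%s:" % target)
        # Recipe lines must start with a tab; the leading '-' tells make
        # to keep going when a single test fails.
        lines.append("\t-%s '%s'" % (runner, path))
    return "\n".join(lines) + "\n"
```

Writing the result of build_makefile(find_phpt_files("tests")) to a file and running make -j4 -f on it would then spread the tests over four jobs. Because the targets have no dependencies on each other, make is free to run them in any order.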
PhD: The PHP based Docbook renderer
Possible mentor: Hannes Magnusson.
PhD is the application that is used to render both the online manual on www.php.net and the downloadable manual pages. This summer we hope to get a student to implement:
- CHM Output format
- PDF Output format
- Unix man-page Output format
- Themes for other php.net manuals (i.e. PEAR & PHP-GTK).
- Support for dbhtml processing instructions
Possible mentor: Sebastian.
Just like in the Google Summer of Code 2007, proposals for PHPUnit (which participates under the umbrella of the PHP Project) projects can be submitted. As before, proposals of this kind are considered with a lower priority. Ideas can be found here.
Possible mentors: Anant Narayanan, Elizabeth Marie Smith
PHP-GTK will participate under the umbrella of the PHP project. If you need help or more details on any of the project ideas below, please contact any of the above mentors, use the php-gtk-dev mailing list, or join the #php-gtk IRC channel on irc.freenode.net. Asking for help on #php.pecl or the php-internals mailing list will get you redirected to these resources anyway. Strong C skills are mandatory; knowledge of the Zend API and PHP extension writing is a plus, but not necessary.
This project involves writing bindings to the GNOME libraries for PHP-GTK. This would allow users to write GNOME applications entirely in PHP. GNOME language bindings have fixed requirements, see these pages for details. At the end of the project, at least the basic GNOME libs and GConf must be wrapped.
PHP-GTK needs an object-oriented binding over the Cairo library for completing code coverage for GTK+ versions above 2.8. The deliverable for this project would be a complete Object Oriented binding over the Cairo library implemented as a PHP-GTK extension.
Complete Code Coverage
PHP-GTK does not implement all functions for the latest versions of GTK+. This project primarily involves writing overrides for functions whose implementation is not automatically created by the generator. At the end of the project, all functions defined up to version 2.12 of GTK+ will be expected to work.
Pave the way for One-Bugtracker-Rules-Them-All
Possible mentors: Helgi Þormar Þorbjörnsson
At the moment we have two bug systems: the shared one which pear and pecl use, and the php one (also known as bugsweb). Ideally the php.net project would have a single bug system, which would allow moving bugs around more freely (think pecl <--> php), sharing features, and avoiding duplicated work.
This project would involve fleshing out the pear bugtracker to make it suitable to serve all these sites without losing its current integration into pear/pecl, adding cool (but useful) new features, and making sure we're not missing anything the bugsweb tracker gained after the pearweb tracker was forked away from it a couple of years back.
A couple of good feature ideas can be found at http://news.php.net/php.internals/36065 and http://pear.php.net/package/pearweb/bugs
Note: This means all the bug systems will share a single database, or else the whole idea will fail. A big part of the merger will therefore be finding a suitable way to get all the bugs into one database and to handle redirecting old bug ids to the new ones. Jani has already done some work on that (http://cvs.php.net/viewvc.cgi/pear/Bugtracker/), which can be used as a base.
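One possible shape for the id redirection is a lookup table keyed by the original tracker and the old id. The sketch below, in Python, is only meant to illustrate the idea; the function names and the renumbering scheme are assumptions and do not describe Jani's existing work.

```python
def merge_bug_ids(trackers):
    """Assign new ids that are unique across all trackers.

    trackers maps a tracker name to an iterable of its old bug ids.
    Returns a dict mapping (tracker, old_id) to the new id.
    """
    redirects = {}
    next_id = 1
    for name in sorted(trackers):
        for old_id in sorted(trackers[name]):
            redirects[(name, old_id)] = next_id
            next_id += 1
    return redirects


def resolve(redirects, tracker, old_id):
    """Look up where an old bug URL should redirect to, or None if unknown."""
    return redirects.get((tracker, old_id))
```

For example, with merge_bug_ids({"pear": [1, 2], "php": [1, 5]}) the two bugs numbered 1 end up with different new ids, and old links keep working as long as the tracker they came from is known. A real migration might prefer to keep one tracker's ids unchanged and renumber only the others.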
Anonymous functions and closures and other missing object oriented features
Possible mentor: Marcus Boerger
In the past we often discussed anonymous functions and closures but could never come to an agreement on how they should work in detail. For closures the main issue is probably how to pass variables from the outer scope to the inner scope. So far we have two nice proposals and patches; check the mail archives:
Before anyone takes on this idea, I suggest you contact the internals mailing list or correspond directly with the two authors to get more information about the current state.
Other features missing from PHP's object orientation include delegates, return type hinting and property type hinting - as well as discrete property accessors, as opposed to __get/__set family of magic handlers.
SimpleTest : web browser and web testing in PHP
Possible mentor: Marcus Baker, Travis Swicegood, Perrick Penet.
XDebug Profiling Web Frontend
Possible mentor: David Coallier, Derick Rethans
Written by Derick Rethans, XDebug is a powerful tool that helps developers debug and profile their algorithms, applications, classes, methods, and anything else related to PHP.
XDebug can generate profiler output files that are easy to load with KCacheGrind and WinCacheGrind. However, many people do not have those installed, and OSX users have no tool to open the profile file (unless they install KDE on X and then install KCacheGrind).
A very elegant and simple way to resolve this “cross platform” issue is to create a web interface that would behave just like KCacheGrind but would not require any system installation (other than the PHP/JS/HTML/etc. code itself).
This project involves creating graphs, mathematical calculations, Ajax, and of course PHP algorithms in order to display the data accurately.
You can get started by getting XDebug and trying it out. (See the profiler section)
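The first step for such a frontend is reading the profiler file. The sketch below, in Python, sums the self cost per function from cachegrind-style lines. It assumes an uncompressed file with a single cost column, where a line after fn= holds a line number followed by a cost, and the line following calls= holds the inclusive cost of the callee rather than self cost. Check these assumptions against the XDebug documentation before relying on them.

```python
from collections import defaultdict


def self_cost_per_function(lines):
    """Sum the self cost of every function in cachegrind-style input."""
    totals = defaultdict(int)
    current_fn = None
    skip_next_cost = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("fn="):
            current_fn = line[3:]
        elif line.startswith("calls="):
            # The next numeric line is the callee's inclusive cost,
            # which must not be counted as the caller's own time.
            skip_next_cost = True
        elif line and line[0].isdigit() and current_fn is not None:
            if skip_next_cost:
                skip_next_cost = False
                continue
            parts = line.split()
            if len(parts) >= 2:
                totals[current_fn] += int(parts[1])
    return dict(totals)


# A tiny hand-written sample, not real profiler output.
SAMPLE = """\
events: Time

fl=php:internal
fn=php::strlen
12 5

fl=/tmp/example.php
fn=foo
3 20
cfn=php::strlen
calls=1 0 0
5 5
"""
```

Calling self_cost_per_function(SAMPLE.splitlines()) attributes 5 units to php::strlen and 20 to foo. A frontend would build the call graph and inclusive totals on top of the same pass.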
There's a summary of the GSoC mid-term evaluation survey results on http://schlueters.de/~johannes/GSoC-Survey/summary.ods