PHP RFC: StreamWrapper Support for glob()

Introduction

glob() is a PHP function that lists and filters files and/or folders using a globbing pattern. The current implementation of glob() is a thin wrapper around POSIX glob(3). While the other Filesystem Functions in PHP generally support StreamWrappers, glob() does not.

Workarounds can be a struggle. Some projects rely on regular expressions, which differ from globbing patterns. Others attempt to use fnmatch(), which does not support all the features of glob(), such as brace expansion.
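To illustrate the fnmatch() limitation, the following is a minimal sketch of a userland brace expansion layered on top of fnmatch(). The helper names expand_braces() and fnmatch_brace() are hypothetical and not part of PHP; the sketch ignores escaping and is not meant as a complete GLOB_BRACE replacement.

```php
<?php
// Hypothetical helper: expands {a,b,c} groups into a list of plain patterns.
// It always expands the innermost group first and recurses on the result.
function expand_braces(string $pattern): array {
    if (!preg_match('/\{([^{}]*)\}/', $pattern, $m, PREG_OFFSET_CAPTURE)) {
        return [$pattern]; // no brace group left
    }
    $start  = $m[0][1];          // byte offset of the "{"
    $length = strlen($m[0][0]);  // length of the whole "{...}" group
    $results = [];
    foreach (explode(',', $m[1][0]) as $alternative) {
        $expanded = substr_replace($pattern, $alternative, $start, $length);
        $results  = array_merge($results, expand_braces($expanded));
    }
    return $results;
}

// Hypothetical helper: fnmatch() against every expanded pattern.
function fnmatch_brace(string $pattern, string $name): bool {
    foreach (expand_braces($pattern) as $candidate) {
        if (fnmatch($candidate, $name)) {
            return true;
        }
    }
    return false;
}

// Plain fnmatch() treats the braces as literal characters.
var_dump(fnmatch('*.{jpg,png}', 'photo.png'));
// The sketch expands the group before matching.
var_dump(fnmatch_brace('*.{jpg,png}', 'photo.png'));
```

Even with such a helper, the directory traversal still has to be written by hand, which is what the longer workaround in this section does.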

Example of a workaround using opendir() and a regular expression:

  function glob_streamwrapper($glob, $flags=0) {

  // Unixify paths
    $glob = str_replace('\\', '/', $glob);

  // Set basedir and remains
    $basedir = '';
    $remains = $glob;
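    for ($i=0; $i<strlen($glob); $i++) {
      // Stop at the first wildcard character
      if (in_array($glob[$i], ['*', '?', '[', ']', '{', '}'])) break;
      if ($glob[$i] == '/') {
        // Split right after this slash (str_split() would cut into equal-sized chunks)
        $basedir = substr($glob, 0, $i+1);
        $remains = substr($glob, $i+1);
      }
    }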

  // Halt if basedir does not exist
    if ($basedir && !is_dir($basedir)) {
      return [];
    }

  // If there are no pattern remains, return base directory if valid
    if (!$remains) {
      if (is_dir($basedir)) {
        return [$basedir];
      } else {
        return [];
      }
    }

  // Extract pattern for current directory
    if (($pos = strpos($remains, '/')) !== false) {
      list($pattern, $remains) = [substr($remains, 0, $pos+1), substr($remains, $pos+1)];
    } else {
      list($pattern, $remains) = [$remains, ''];
    }

  // fnmatch() doesn't support GLOB_BRACE. Let's create a regex pattern instead.
    $regex = strtr($pattern, [
      '[!' => '[^',
      '\\' => '\\\\',
      '.'  => '\\.',
      '('  => '\\(',
      ')'  => '\\)',
      '|'  => '\\|',
      '+'  => '\\+',
      '^'  => '\\^',
      '$'  => '\\$',
      '*'  => '[^/]*',
      '?'  => '.',
    ]);

    if ($flags & GLOB_BRACE) {

      $regex = preg_replace_callback('#\{[^\}]+\}#', function($matches) {
        return strtr($matches[0], ['{' => '(', '}' => ')', ',' => '|']);
      }, $regex);

    } else {
      $regex = strtr($regex, ['{' => '\\{', '}' => '\\}']);
    }

    $regex = '#^'.$regex.'$#';

    $folders = [];
    $files = [];

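  // Open directory, halt if it cannot be opened
    $dh = @opendir($basedir ? $basedir : './');
    if ($dh === false) {
      return [];
    }

  // Step through each file in directory (strict check so an entry named "0" does not end the loop)
    while (($file = readdir($dh)) !== false) {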
      if (in_array($file, ['.', '..'])) continue;

    // Prepend path
      $file = $basedir . $file;
      $filetype = filetype($file);

      if ($filetype == 'dir') {

      // Collect a matching folder
        if (preg_match($regex, basename($file)) || preg_match($regex, basename($file).'/')) {
          if ($remains) {
            $folders = array_merge($folders, glob_streamwrapper($file .'/'. $remains, $flags));
          } else {
            $folders[] = $file .'/';
          }
        }

      } else if ($filetype == 'file') {

      // Skip if not a directory during GLOB_ONLYDIR
        if ($flags & GLOB_ONLYDIR) continue;

      // Collect a matching file
        if (preg_match($regex, basename($file))) {
          $files[] = $file;
        }
      }
    }

  // Release the directory handle
    closedir($dh);

  // Merge folders and files into one and same result
    $results = array_merge($folders, $files);

    return $results;
  }

Proposal

Consistently implement StreamWrapper support for glob(). Example:

glob('vfs://*.ext')
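
As a rough illustration of what such a call would traverse, the sketch below registers a tiny userland StreamWrapper and lists its entries with functions that already support wrappers. The class name MemDirWrapper, the "mem" scheme and the file names are assumptions made up for this example; only the directory methods are implemented.

```php
<?php
// Hypothetical in-memory wrapper exposing a fixed list of directory entries.
class MemDirWrapper {
    public $context; // populated by PHP when a stream context is passed

    private static array $files = ['a.ext', 'b.txt', 'c.ext'];
    private int $position = 0;

    public function dir_opendir(string $path, int $options): bool {
        $this->position = 0;
        return true;
    }

    public function dir_readdir() {
        // Return the next entry, or false when the list is exhausted
        return self::$files[$this->position++] ?? false;
    }

    public function dir_rewinddir(): bool {
        $this->position = 0;
        return true;
    }

    public function dir_closedir(): bool {
        return true;
    }
}

stream_wrapper_register('mem', MemDirWrapper::class);

// scandir() already reaches the wrapper through dir_opendir()/dir_readdir().
$entries = scandir('mem://');

// Filtering still has to be done by hand today.
$matches = array_values(array_filter($entries, function ($name) {
    return fnmatch('*.ext', $name);
}));

print_r($matches);
```

With the proposal in place, the manual scandir() and fnmatch() steps would presumably collapse into a single glob('mem://*.ext') call.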

Backward Incompatible Changes

There are no backward incompatible changes.

Proposed PHP Version(s)

Next PHP 8.x

RFC Impact

The glob opendir implementation would be replaced with a wrapper supporting streams. It would be possible to add a fallback condition that leaves local file system operations as they are. However, testing so far has not shown any performance anomalies or incompatibilities, so the intention is to use the new wrapper for local filesystem operations as well.
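
The fallback condition mentioned above could look roughly like the following userland sketch. It is not taken from the actual patch, which is written in C; the helper names pattern_scheme() and uses_native_glob() are hypothetical, and the sketch only assumes that a pattern addresses a StreamWrapper when it starts with a scheme followed by "://".

```php
<?php
// Hypothetical helper: returns the scheme a pattern addresses,
// or null when the pattern is a plain local path.
function pattern_scheme(string $pattern): ?string {
    if (preg_match('#^([a-zA-Z][a-zA-Z0-9+.\-]*)://#', $pattern, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Hypothetical fallback condition: plain paths and file:// URLs would
// keep using the existing local implementation.
function uses_native_glob(string $pattern): bool {
    $scheme = pattern_scheme($pattern);
    return $scheme === null || $scheme === 'file';
}

var_dump(uses_native_glob('/tmp/*.txt'));   // plain local path
var_dump(uses_native_glob('vfs://*.ext'));  // addresses a StreamWrapper
```

Since the RFC intends to route local paths through the new wrapper too, a check like this would only be needed if a regression appeared later.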

The open_basedir check will be removed from result filtering in favour of the check already implemented in the wrapper's opendir.

Unaffected PHP Functionality

glob(), GlobIterator and glob:// will return the same results for the local filesystem as before.
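
The three entry points can be compared with a small script like the one below. It is only a sketch for checking that they agree on the local filesystem; the temporary directory and file names are made up for the example, and entries are reduced to their base names before comparing.

```php
<?php
// Create a throwaway directory with a few files
$dir = sys_get_temp_dir() . '/glob_rfc_' . uniqid();
mkdir($dir);
foreach (['a.txt', 'b.txt', 'c.log'] as $name) {
    touch("$dir/$name");
}

$pattern = "$dir/*.txt";

// 1. glob()
$viaGlob = array_map('basename', glob($pattern));

// 2. GlobIterator
$viaIterator = [];
foreach (new GlobIterator($pattern) as $fileInfo) {
    $viaIterator[] = $fileInfo->getFilename();
}

// 3. The glob:// stream wrapper
$viaWrapper = [];
$dh = opendir("glob://$pattern");
while (($entry = readdir($dh)) !== false) {
    $viaWrapper[] = basename($entry);
}
closedir($dh);

sort($viaGlob);
sort($viaIterator);
sort($viaWrapper);

var_dump($viaGlob === $viaIterator && $viaIterator === $viaWrapper);

// Clean up
foreach (['a.txt', 'b.txt', 'c.log'] as $name) {
    unlink("$dir/$name");
}
rmdir($dir);
```

Running the same script before and after the change should print the same result.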

Future Scope

There have been some ideas about the future of PHP glob() and the separately maintained win32 implementation. One question is whether both could be replaced by a unified standalone implementation, rather than keeping a POSIX layer for the local file system alongside a separate win32 implementation.

Proposed Voting Choices

Should we implement StreamWrapper support for PHP glob()?

Patches and Tests

Patches and tests are being produced in GitHub Feature Request #9224. The final patch will be produced by GitHub user @KapitanOczywisty.

Implementation

After the project is implemented, this section will contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

GitHub Feature Request #9224

rfc/glob_streamwrapper_support.1663174114.txt.gz · Last modified: 2022/09/14 16:48 by timint