3 ways to recursively list all files in a directory
  1. opendir & Dir
  2. glob
    2.1. Glob wrapper and DirectoryIterator class
  3. scandir

In my long experience with PHP, finding all files in a directory has been one of the few problems I have solved differently each time. This short note summarizes the approaches I have used and presents my latest version.

opendir & Dir

This is probably the most traditional way: the opendir family of functions and their slightly more recent (PHP 4+) OOP variant, the Dir class (the Directory object returned by dir()). Let's see how this is done:

PHP
function ListIn($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  $h = opendir($dir);
  while (($f = readdir($h)) !== false) {
    if ($f !== '.' and $f !== '..') {
      if (is_dir("$dir/$f")) {
        $result = array_merge($result, ListIn("$dir/$f", "$prefix$f/"));
      } else {
        $result[] = $prefix.$f;
      }
    }
  }
  closedir($h);

  return $result;
}

Better coding practice would need a try..catch block like this one:

PHP
$h = opendir($dir);
try {
  ...
} catch (Exception $e) { $ex = $e; }
closedir($h);
if (isset($ex)) { throw $ex; }

…but thanks to PHP's nature (short-lived scripts) and the absence of a try..finally construct in versions before 5.5 (an omission with an interesting discussion in PHP's bugtracker), exception handling (as well as throwing exceptions) isn't too popular among the web engines I know.
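On PHP 5.5 or later the same cleanup can be written with finally instead of the catch-and-rethrow trick. The following is only a minimal sketch under that version assumption; listEntries() is a hypothetical helper name, and it lists a single directory level to keep the focus on the handle cleanup.

```php
<?php
// Sketch: list the immediate entries of $dir and always close the handle.
// Assumes PHP 5.5+ (for "finally"); listEntries() is a hypothetical name.
function listEntries($dir) {
  $h = opendir($dir);
  if ($h === false) {
    throw new RuntimeException("Cannot open directory: $dir");
  }

  $result = array();
  try {
    while (($f = readdir($h)) !== false) {
      if ($f !== '.' and $f !== '..') {
        $result[] = $f;
      }
    }
  } finally {
    // Runs whether the loop finished normally or an exception was thrown.
    closedir($h);
  }

  return $result;
}
```

The order of the returned names is whatever readdir() yields, so sort the result if a stable order matters.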

This approach has some implications:

But it has at least one advantage as well:

glob

When one day I discovered this nice function, glob, I thought I had finally found the ultimate solution to my recursive path scanning problems. I used it intensively until I found an (undocumented) feature that prevents glob from matching files and folders starting with a period (.) unless the period is explicitly part of the pattern.

But before that let's look at the function that can be used to scan a path for files/folders:

PHP
function ListIn($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  // GLOB_MARK appends a slash to every directory that is returned.
  foreach (glob("$dir/*", GLOB_MARK) as $f) {
    if (substr($f, -1) === '/') {
      $result = array_merge($result, ListIn($f, $prefix.basename($f).'/'));
    } else {
      $result[] = $prefix.basename($f);
    }
  }

  return $result;
}

Just when I thought that the above function was a perfect solution I discovered that it had a bug. People dealing with *nix systems know the unwritten convention of starting system or hidden directory and file names with a period (examples: .svn, .cshrc, .login and so on). Some programs use this to hide such files from their output. For example, the ls utility by default lists the contents of a directory without the items starting with a period unless the -a switch is passed.

And it seems this behaviour has touched PHP too, because for glob (even on Windows systems) * doesn't match .htaccess. It's a sad story, and I couldn't find a place in its documentation where this behaviour is mentioned.

So in other words this is how things are:

shell
$ ls -a
./              ../             .dot            file            file.ext
$ php -r 'foreach (glob("*") as $f) echo $f, "\n";'
file
file.ext
$ php -r 'foreach (glob(".*") as $f) echo $f, "\n";'
.
..
.dot

So we need to fix the above function:

PHP
function ListIn($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  $files = array_merge(glob("$dir/*", GLOB_MARK), glob("$dir/.*", GLOB_MARK));
  foreach ($files as $f) {
    // the following code is as before, except that the "." and ".." entries
    // matched by ".*" must be skipped to avoid endless recursion.
    ...
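The snippet above stops at an ellipsis, and one detail matters when filling it in: the .* pattern also matches . and .. (as the shell session shows), so those two entries have to be skipped or the recursion never ends. Here is a minimal sketch of a complete version. ListInGlob() is a hypothetical name, and the trailing-slash check assumes GLOB_MARK appends a forward slash, as it does on *nix systems.

```php
<?php
// Sketch of a glob-based recursive listing that also finds dot-files.
// ListInGlob() is a hypothetical name, chosen so it does not clash with ListIn().
function ListInGlob($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  // glob() may return false on error, so fall back to empty arrays.
  $visible = glob("$dir/*", GLOB_MARK);
  $hidden  = glob("$dir/.*", GLOB_MARK);
  $files = array_merge(
    $visible ? $visible : array(),
    $hidden ? $hidden : array()
  );

  foreach ($files as $f) {
    $name = basename($f);
    if ($name === '.' or $name === '..') {
      continue; // ".*" matches these too; recursing into them would never end
    }
    if (substr($f, -1) === '/') {
      // Assumption: GLOB_MARK marks directories with a forward slash.
      $result = array_merge($result, ListInGlob($f, "$prefix$name/"));
    } else {
      $result[] = $prefix.$name;
    }
  }

  return $result;
}
```

Note that $dir is used directly as part of the pattern, so a directory name containing glob metacharacters such as [ or * would need escaping first.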

The implications are:

Advantages exist too:

Glob wrapper and DirectoryIterator class

I don't consider them separate methods for walking a directory because they're wrappers for glob() and Dir, and only the way results are accessed differs. Also note that the glob:// wrapper was introduced in PHP 5.3.

A short example of Glob usage:

PHP
$dir = new DirectoryIterator('glob://*.txt');
foreach ($dir as $file) {
  readfile($file);
}

And of the dir function:

PHP
$dir = dir('.');
while (($file = $dir->read()) !== false) {
  echo $file;
}
$dir->close();

As you can see, the first method would be convenient if it weren't limited to PHP 5.3 and later and if it weren't affected by the leading-dot issue described above.
The second method feels the same as plain opendir does; it even has a call to close.
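For comparison, DirectoryIterator can also be pointed at a plain path instead of a glob:// URL. In that mode it behaves like Dir: it returns dot-files, and its isDot() method identifies the . and .. entries. The following is a rough sketch of a recursive listing built that way; ListInIterator() is a hypothetical name, not something from the code above.

```php
<?php
// Sketch: recursive listing with DirectoryIterator on a plain path.
// ListInIterator() is a hypothetical name used for illustration only.
function ListInIterator($dir, $prefix = '') {
  $result = array();

  foreach (new DirectoryIterator($dir) as $entry) {
    if ($entry->isDot()) {
      continue; // skip "." and ".."
    }
    $name = $entry->getFilename();
    if ($entry->isDir()) {
      // getPathname() returns the iterated path joined with the entry name.
      $result = array_merge($result, ListInIterator($entry->getPathname(), "$prefix$name/"));
    } else {
      $result[] = $prefix.$name;
    }
  }

  return $result;
}
```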

scandir

This is my latest discovery (not to say recent). PHP's scandir() function reads an entire directory (without subdirectories) into an array, including the '.' and '..' entries. It's like calling opendir, readdir and closedir but with just one function call.
It also accepts a stream context, which is useful in some cases.

It's fairly simple to use:

PHP
function ListIn($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  foreach (scandir($dir) as $f) {
    if ($f !== '.' and $f !== '..') {
      if (is_dir("$dir/$f")) {
        $result = array_merge($result, ListIn("$dir/$f", "$prefix$f/"));
      } else {
        $result[] = $prefix.$f;
      }
    }
  }

  return $result;
}

You might notice that it's exactly the same as using opendir – just no calls to readdir and others.

I consider this method better and in most cases use it over all previously described ways to scan a directory. It's not as convenient as glob, but it has no trouble finding file names whose first character is a period.
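To see the dot-file handling in action, here is a small self-contained sketch that builds a throwaway tree and lists it. ListInScan() is a hypothetical variant of the scandir function above that drops '.' and '..' with array_diff() up front, and the temporary directory layout is made up for the demonstration.

```php
<?php
// ListInScan() is a hypothetical name; the logic matches the scandir version,
// with array_diff() removing the "." and ".." entries before the loop.
function ListInScan($dir, $prefix = '') {
  $dir = rtrim($dir, '\\/');
  $result = array();

  foreach (array_diff(scandir($dir), array('.', '..')) as $f) {
    if (is_dir("$dir/$f")) {
      $result = array_merge($result, ListInScan("$dir/$f", "$prefix$f/"));
    } else {
      $result[] = $prefix.$f;
    }
  }

  return $result;
}

// Build a small temporary tree containing a dot-file, then list it.
$root = sys_get_temp_dir().'/listin_'.uniqid();
mkdir("$root/sub", 0777, true);
touch("$root/.htaccess");
touch("$root/sub/file.ext");

print_r(ListInScan($root));
```

The listing should include both .htaccess and sub/file.ext, which is exactly the case the plain glob("*") version misses.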

Its pros and cons are the same as opendir's except these: