During my already long experience with PHP, finding all files in a directory has been one of the few problems that I have solved differently each time. This short note is meant to summarize everything I have done and present my latest version.
This is probably the most traditional way: using the opendir functions and their slightly more recent (PHP 4+) OOP variant, the Dir class. Let's see how this is done:
function ListIn($dir, $prefix = '') {
$dir = rtrim($dir, '\\/');
$result = array();
$h = opendir($dir);
while (($f = readdir($h)) !== false) {
if ($f !== '.' and $f !== '..') {
if (is_dir("$dir/$f")) {
$result = array_merge($result, ListIn("$dir/$f", "$prefix$f/"));
} else {
$result[] = $prefix.$f;
}
}
}
closedir($h);
return $result;
}
Better coding practice would need a try..catch block like this one:
$h = opendir($dir);
try {
...
} catch (Exception $e) { $ex = $e; }
closedir($h);
if (isset($ex)) { throw $ex; }
…but thanks to PHP's nature (short-lived scripts) and its lack of a try..finally construct (which has an interesting discussion in PHP's bug tracker; finally was eventually added in PHP 5.5), exception handling (as well as using exceptions at all) isn't too popular among the web engines I know.
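For readers on newer versions, the same cleanup can be sketched with finally. This is only an illustration, assuming PHP 5.5 or later; the function name listDirFinally is hypothetical, and the sketch lists a single directory level to keep it short.

```php
<?php
// Assumes PHP 5.5+, where "finally" is available.
// listDirFinally() is a hypothetical name used only for this sketch.
function listDirFinally($dir) {
    $h = opendir($dir);
    if ($h === false) {
        throw new RuntimeException("Cannot open directory: $dir");
    }
    $result = array();
    try {
        while (($f = readdir($h)) !== false) {
            if ($f !== '.' and $f !== '..') {
                $result[] = $f;
            }
        }
    } finally {
        // Runs whether or not the loop above threw an exception.
        closedir($h);
    }
    return $result;
}
```

The handle is closed in one place, so the manual "catch, close, rethrow" dance from the snippet above is no longer needed.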
This approach has some implications:
But it has at least one advantage as well:
When one day I discovered this nice function, glob, I thought I had finally found the ultimate solution to my recursive path-scanning problems. I used it intensively until the day I found an (undocumented) feature that prevents glob from matching files and folders starting with a period (.) unless explicitly specified.
But before that let's look at the function that can be used to scan a path for files/folders:
function ListIn($dir, $prefix = '') {
$dir = rtrim($dir, '\\/');
$result = array();
foreach (glob("$dir/*", GLOB_MARK) as $f) {
if (substr($f, -1) === '/') {
$result = array_merge($result, ListIn($f, $prefix.basename($f).'/'));
} else {
$result[] = $prefix.basename($f);
}
}
return $result;
}
Just when I thought that the above function was a perfect solution, I discovered that it had a bug. People dealing with *nix systems know the unwritten convention of starting system or hidden directory and file names with a period (examples: .svn, .cshrc, .login and so on). Some programs use this to hide such files from their output. For example, the ls utility by default lists the contents of a directory excluding items starting with a period unless the -a switch is passed.
And it seems this behaviour has touched PHP too, because for glob (even on Windows systems) * doesn't match .htaccess. A sad story it is, and I couldn't find a place in its documentation where this behaviour is mentioned.
So in other words this is how things are:
$ ls -a
./ ../ .dot file file.ext
$ php -r 'foreach (glob("*") as $f) echo $f, "\n";'
file
file.ext
$ php -r 'foreach (glob(".*") as $f) echo $f, "\n";'
.
..
.dot
So we need to fix the above function:
function ListIn($dir, $prefix = '') {
$dir = rtrim($dir, '\\/');
$result = array();
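$files = array_merge( glob("$dir/*", GLOB_MARK), glob("$dir/.*", GLOB_MARK) );
foreach ($files as $f) {
// ".*" also matches "." and ".."; skip them to avoid endless recursion.
if (in_array(basename($f), array('.', '..'), true)) { continue; }
// the following code is exactly as before.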
...
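As an alternative sketch, the two glob calls can be folded into one with a brace pattern. This assumes GLOB_BRACE is defined on the platform (it is missing on some non-GNU systems), so the code falls back to two calls otherwise. The helper name globAll is hypothetical and it lists one directory level only.

```php
<?php
// globAll() is a hypothetical helper, not part of PHP.
function globAll($dir) {
    $dir = rtrim($dir, '\\/');
    if (defined('GLOB_BRACE')) {
        // "{,.}*" expands to "*" and ".*", so hidden entries are included.
        $found = glob("$dir/{,.}*", GLOB_BRACE) ?: array();
    } else {
        // Fallback: the two-call approach from the function above.
        $found = array_merge(glob("$dir/*") ?: array(), glob("$dir/.*") ?: array());
    }
    $names = array();
    foreach ($found as $f) {
        $name = basename($f);
        // Drop the "." and ".." entries that ".*" drags in.
        if ($name !== '.' and $name !== '..') {
            $names[] = $name;
        }
    }
    return $names;
}
```

The order of the returned names is not guaranteed to be alphabetical across the two halves of the pattern, so sort the result if order matters.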
glob('*/*') matches files in all directories that are inside the current path (but not in the current path itself). However, unlike Ruby's glob it only matches one tree level, so you can't match this and all nested directories in one call.
I don't consider the glob:// wrapper and the dir function separate methods for walking a directory, because they are wrappers around glob() and the opendir functions and only the way results are accessed differs. Also note that the glob:// wrapper was introduced in PHP 5.3.
A short example of Glob usage:
$dir = new DirectoryIterator('glob://*.txt');
foreach ($dir as $file) {
readfile($file);
}
And of the dir function:
$dir = dir('.');
while (($file = $dir->read()) !== false) {
echo $file;
}
$dir->close();
As you can see, the first method would be convenient if only it weren't limited to PHP 5.3 and later, and if it weren't affected by the first-dot issue described above.
The second method feels the same as plain opendir does; it even has a call to close.
This is my latest discovery (not to say a recent one). PHP's scandir() function reads an entire directory (without descending into subdirectories) into an array, including the '.' and '..' entries. It's like calling opendir, readdir and closedir, but with just one function call.
It also accepts a stream context, which is useful in some cases.
function ListIn($dir, $prefix = '') {
$dir = rtrim($dir, '\\/');
$result = array();
foreach (scandir($dir) as $f) {
if ($f !== '.' and $f !== '..') {
if (is_dir("$dir/$f")) {
$result = array_merge($result, ListIn("$dir/$f", "$prefix$f/"));
} else {
$result[] = $prefix.$f;
}
}
}
return $result;
}
You might notice that it's exactly the same as using opendir – just no calls to readdir and others.
I consider this method better and, in most cases, use it over all the previously described ways to scan a directory. It's not as convenient as glob, but it has no trouble finding file names whose first character is a period.
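To illustrate the point about dot-files, here is a small self-check. It is only a sketch: the throwaway directory under sys_get_temp_dir() and the file names in it are made up for the demonstration, and the function is repeated so the snippet runs on its own.

```php
<?php
// Same scandir-based function as above, repeated to keep the sketch self-contained.
function ListIn($dir, $prefix = '') {
    $dir = rtrim($dir, '\\/');
    $result = array();
    foreach (scandir($dir) as $f) {
        if ($f !== '.' and $f !== '..') {
            if (is_dir("$dir/$f")) {
                $result = array_merge($result, ListIn("$dir/$f", "$prefix$f/"));
            } else {
                $result[] = $prefix.$f;
            }
        }
    }
    return $result;
}

// Hypothetical throwaway tree; all names are invented for this example.
$root = sys_get_temp_dir() . '/listin_demo_' . uniqid();
mkdir("$root/sub", 0777, true);
touch("$root/.htaccess");
touch("$root/file.ext");
touch("$root/sub/.dot");

$found = ListIn($root);
// Expected entries: .htaccess, file.ext, sub/.dot
print_r($found);
```

Both .htaccess and sub/.dot show up without any special handling, which is exactly what the glob version needed an extra pattern for.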
Its pros and cons are the same as opendir's except these:
Pass true as scandir's second argument to sort in descending order: scandir('.', true).
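A minimal sketch of the sort-order argument follows, assuming a scratch directory with two made-up file names. It passes 1, which is the value of the SCANDIR_SORT_DESCENDING constant available since PHP 5.4, and uses array_diff to strip the '.' and '..' entries.

```php
<?php
// Hypothetical scratch directory with two invented file names.
$tmp = sys_get_temp_dir() . '/scandir_demo_' . uniqid();
mkdir($tmp);
touch("$tmp/a");
touch("$tmp/b");

$dots = array('.', '..');
// Default order is ascending.
$asc = array_values(array_diff(scandir($tmp), $dots));
// 1 equals SCANDIR_SORT_DESCENDING (PHP 5.4+).
$desc = array_values(array_diff(scandir($tmp, 1), $dots));

echo implode(' ', $asc), "\n";   // prints "a b"
echo implode(' ', $desc), "\n";  // prints "b a"
```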