Getting a Recursive FTP File List in .Net
Even though it’s 2009, there are still some dark areas of the internet that haven’t been upgraded to modern standards. FTP is one of them.
FTP is closer to HTTP than you think – results of FTP commands are sent back as plain text. There is no field delimiter, no standard field order, and not even a standard of what data gets returned. FTP was written with the idea that the user is on a text console, and would be reading the messages from the server directly – clients shouldn’t parse results. Still, FTP is in wide use and available on everything from servers to cell phone microchips because it does a good job at moving files.
The problem: You need to get a list of all files on a server using FTP.
The issue: FTP doesn’t provide a built-in way to get a recursive list of all files. It provides two basic commands for listing the current directory. LIST (WebRequestMethods.Ftp.ListDirectoryDetails) returns the entries along with their details (formatted however the server is configured), and NLST (WebRequestMethods.Ftp.ListDirectory) returns a “name list”, which contains the same entries as LIST but only their names and no details.
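To make the mapping between the two .Net constants and the FTP commands concrete, here is a minimal sketch. The helper name CreateListRequest and the example.com address are placeholders of mine, not part of the framework; the helper only builds a request and never contacts a server.

```csharp
using System;
using System.Net;

public static class FtpRequestSketch {
    // Hypothetical helper: builds (but does not send) a directory listing request.
    public static FtpWebRequest CreateListRequest(String uri, String user, String pass, bool details) {
        FtpWebRequest ftp = (FtpWebRequest)WebRequest.Create(uri);
        ftp.Credentials = new NetworkCredential(user, pass);
        ftp.Method = details
            ? WebRequestMethods.Ftp.ListDirectoryDetails  // the detailed listing command
            : WebRequestMethods.Ftp.ListDirectory;        // the name-only listing command
        return ftp;
    }
}
```

Calling FtpRequestSketch.CreateListRequest("ftp://example.com/pub/", "user", "pass", false).Method gives back the command string that will go over the wire, which is a quick way to check which listing you are about to ask for.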
The result of LIST might look like:
09-18-08 02:11PM 18918524 readme.txt
09-18-08 02:13PM 18918676 Try 2 Parse Me!
05-04-09 02:16PM <DIR> I’m a folder
… or it might look like this:
-rwxrwxrwx 1 owner group 18918524 Sep 18 2008 readme.txt
-rwxrwxrwx 1 owner group 18918676 Sep 18 2008 Try 2 Parse Me!
drwxrwxrwx 1 owner group 0 May 4 14:16 I’m a folder
… or something else entirely. That’s the F in FTP – Fun! (Or it could mean F**k’d).
The solution: First, I’m vetoing the use of regular expressions. Experience has taught me there be dragons in that namespace, and any time you can avoid a regex, do. Second, avoid recursive functions unless you’re in a functional programming language. Here we go:
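using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;

// Place this method inside a class. FtpUri must end with "/".
public static String[] FTPListTree(String FtpUri, String User, String Pass) {
    List<String> files = new List<String>();
    Queue<String> folders = new Queue<String>();
    folders.Enqueue(FtpUri);
    while (folders.Count > 0) {
        String fld = folders.Dequeue();
        List<String> newFiles = new List<String>();
        // Pass 1 (name list): collect the names of everything in this folder.
        FtpWebRequest ftp = (FtpWebRequest)WebRequest.Create(fld);
        ftp.Credentials = new NetworkCredential(User, Pass);
        ftp.UsePassive = false;
        ftp.Method = WebRequestMethods.Ftp.ListDirectory;
        using (WebResponse response = ftp.GetResponse())
        using (StreamReader resp = new StreamReader(response.GetResponseStream())) {
            String line = resp.ReadLine();
            while (line != null) {
                String name = line.Trim();
                // Skip blanks and the "." / ".." entries some servers return;
                // they would make the loop revisit the same folders forever.
                if (name != "" && name != "." && name != "..") newFiles.Add(name);
                line = resp.ReadLine();
            }
        }
        // Pass 2 (details): only used to find out which names are folders.
        ftp = (FtpWebRequest)WebRequest.Create(fld);
        ftp.Credentials = new NetworkCredential(User, Pass);
        ftp.UsePassive = false;
        ftp.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
        using (WebResponse response = ftp.GetResponse())
        using (StreamReader resp = new StreamReader(response.GetResponseStream())) {
            String line = resp.ReadLine();
            while (line != null) {
                if (line.Trim().ToLower().StartsWith("d") || line.Contains(" <DIR> ")) {
                    // Take the longest name that ends the line, so that
                    // "folder" cannot be mistaken for "my folder".
                    String dir = newFiles.Where(x => line.EndsWith(x))
                                         .OrderByDescending(x => x.Length)
                                         .FirstOrDefault();
                    if (dir != null) {
                        newFiles.Remove(dir);
                        folders.Enqueue(fld + dir + "/");
                    }
                }
                line = resp.ReadLine();
            }
        }
        // Whatever is left in newFiles is a plain file.
        files.AddRange(from f in newFiles select fld + f);
    }
    return files.ToArray();
}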
This function uses a two-step process to parse a directory. First a list of file and directory names is retrieved, then a second call is made to get the details of those entries. Yes, that means two calls to the server per directory, but it gives a safe way to determine each directory’s name without heavy parsing of the details string: a details line only has to be recognized as a directory (a leading “d” or a “<DIR>” marker) and then matched against the names already in hand. The use of a Queue avoids the need for recursion.
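The matching step can be tried without a server by feeding it the sample listings from earlier in this post. This is a rough sketch, not a drop-in replacement: ExtractFolders is a hypothetical helper name, and the sample strings use a plain apostrophe. It picks the longest matching name so that a short name cannot shadow a longer one that ends the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListingSketch {
    // Hypothetical helper: removes folder names from 'names' (the name list)
    // and returns them, using 'details' (the detailed listing) only to decide
    // which entries are folders.
    public static List<String> ExtractFolders(List<String> names, IEnumerable<String> details) {
        List<String> folders = new List<String>();
        foreach (String line in details) {
            // Unix-style folder lines start with 'd'; DOS-style ones carry <DIR>.
            bool isDir = line.Trim().StartsWith("d", StringComparison.OrdinalIgnoreCase)
                         || line.Contains(" <DIR> ");
            if (!isDir) continue;
            // Longest name that ends the line wins; empty names are ignored.
            String dir = names.Where(n => n.Length > 0 && line.EndsWith(n, StringComparison.Ordinal))
                              .OrderByDescending(n => n.Length)
                              .FirstOrDefault();
            if (dir == null) continue; // no known name matches this line
            names.Remove(dir);
            folders.Add(dir);
        }
        return folders;
    }
}
```

With the names readme.txt, Try 2 Parse Me! and I'm a folder, either sample listing should leave the first two in the name list and return the third as the only folder. No field of the details line is ever split or interpreted beyond the directory check.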
Notes: This function doesn’t perform error checking and will throw an exception on any error – in my case this is the desired behavior, but YMMV. Also, this method isn’t designed for speed – it’s fast enough for my solution (syncing folders across FTP with some custom logic tossed in), so I’m sure there is some room for improvement.
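If throwing isn’t what you want, one option is to wrap the call. The sketch below is only an illustration under my own assumptions: TryList is a hypothetical name, it takes the listing call as a delegate so it can be tried without a server, and it only handles WebException, which is what the FTP classes in System.Net raise for protocol and connection failures.

```csharp
using System;
using System.Net;

public static class FtpErrorSketch {
    // Hypothetical wrapper: runs a listing delegate and reports a WebException
    // as a message instead of letting it propagate.
    public static bool TryList(Func<String[]> list, out String[] files, out String error) {
        try {
            files = list();
            error = null;
            return true;
        } catch (WebException ex) {
            String message = ex.Message;
            // When the server answered, its reply text is usually more useful.
            using (FtpWebResponse resp = ex.Response as FtpWebResponse) {
                if (resp != null && resp.StatusDescription != null)
                    message = resp.StatusDescription.Trim();
            }
            files = new String[0];
            error = message;
            return false;
        }
    }
}
```

It would be used along the lines of FtpErrorSketch.TryList(() => FTPListTree(uri, user, pass), out files, out error). Note that a failure part-way through the tree still discards everything gathered so far, so this changes how the error is reported, not how much work is lost.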
I posted this because I didn’t find anything in the .Net framework that did this already, and when I searched I found an overwhelming number of samples using regular expressions. Regular expressions are tricky to get right, hard to read, a pain to test, and in my view a weapon of last resort for cases where a degree of false positives is acceptable.




