Problem: A content-heavy web site undergoes a massive reorganization. A dozen URL rewrite rules are added to issue permanent (301) redirects and preserve permalinks across the two versions. The question is: will the rules really work?
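A quick way to sanity-check a single rule before replaying traffic in bulk is to request one of the old URLs without following redirects and look for the 301 status and its Location header. A minimal sketch in C# (the localhost base address and the /old-section/page.html path are placeholders for illustration, not real rules from the site):

    using System;
    using System.Net;

    class RedirectCheck
    {
        static void Main()
        {
            // Placeholder URL: any address an old permalink used to live at.
            var request = (HttpWebRequest)WebRequest.Create("http://localhost/old-section/page.html");
            request.AllowAutoRedirect = false; // inspect the redirect itself instead of the final page

            using (var response = (HttpWebResponse)request.GetResponse())
            {
                // A permanent redirect should answer 301 with the new permalink in Location.
                Console.WriteLine("Status:   {0}", (int)response.StatusCode);
                Console.WriteLine("Location: {0}", response.Headers["Location"]);
            }
        }
    }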
Brute Force Approach: Take the last 7 days of web server logs, and write about 40 lines of C# code:
    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Threading.Tasks;

    class Program
    {
        static void Main()
        {
            var baseUrl = new Uri("http://localhost/");
            var files = Directory.GetFiles(@"..\..\Data", "*.log");

            Parallel.ForEach(files, file =>
            {
                // Parse each log line: skip comment lines, split on spaces, and keep
                // only GET requests that did not previously return a 4xx or 5xx status.
                var requests = File.ReadAllLines(file)
                    .Where(line => !line.StartsWith("#"))
                    .Select(line => line.Split(' '))
                    .Select(parts => new
                    {
                        Url = parts[6],
                        Status = parts[16],
                        Verb = parts[5]
                    })
                    .Where(request => request.Verb == "GET" &&
                                      request.Status[0] != '4' &&
                                      request.Status[0] != '5');

                // Replay each request against the reorganized site.
                foreach (var request in requests)
                {
                    try
                    {
                        using (var client = new WebClient())
                        {
                            client.DownloadData(new Uri(baseUrl, request.Url));
                        }
                    }
                    catch (Exception ex)
                    {
                        // .. log it (in a thread safe way)
                    }
                }
            });
        }
    }
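The elided catch block needs somewhere thread-safe to record failures, because the Parallel.ForEach body runs on multiple threads at once. One possible shape, shown standalone with a simulated failure (the ConcurrentBag, the failures.txt file name, and the fake URLs are assumptions for illustration, not part of the original program):

    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading.Tasks;

    class ThreadSafeLoggingSketch
    {
        static void Main()
        {
            // ConcurrentBag.Add is safe to call from many threads at the same time.
            var failures = new ConcurrentBag<string>();

            Parallel.ForEach(new[] { "/a", "/b", "/c" }, url =>
            {
                try
                {
                    throw new InvalidOperationException("simulated failure");
                }
                catch (Exception ex)
                {
                    failures.Add(url + " -> " + ex.Message);
                }
            });

            // Back on a single thread once the parallel loop has finished.
            File.WriteAllLines("failures.txt", failures);
            Console.WriteLine("{0} failures recorded", failures.Count);
        }
    }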
The little console-mode application isn't foolproof, but it did uncover a number of problems and edge cases. As a bonus, it also turned into a good workload generator for SQL Server index re-tuning.