Brute Force Might Work

Thursday, November 29, 2012

Problem: A content-heavy web site undergoes a massive reorganization. A dozen URL rewrite rules are added to issue permanent redirects and preserve permalinks across the two versions. The question is: will the rules really work?

Brute Force Approach: Take the last 7 days of web server logs, and write about 40 lines of C# code:

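using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var baseUrl = new Uri("http://localhost/");
        var files = Directory.GetFiles(@"..\..\Data", "*.log");

        // Replay each log file on its own worker.
        Parallel.ForEach(files, file =>
            {
                // Lines starting with '#' are log directives, not requests.
                // The field positions below depend on the log's field layout.
                var requests = File.ReadAllLines(file)
                    .Where(line => !line.StartsWith("#"))
                    .Select(line => line.Split(' '))
                    .Select(parts => new
                                         {
                                             Url = parts[6],
                                             Status = parts[16],
                                             Verb = parts[5]
                                         })
                    // Keep only GET requests that did not fail originally.
                    .Where(request => request.Verb == "GET" &&
                                      request.Status[0] != '4' &&
                                      request.Status[0] != '5');

                foreach (var request in requests)
                {
                    try
                    {
                        // DownloadData throws a WebException on a 4xx or 5xx response.
                        using (var client = new WebClient())
                        {
                            client.DownloadData(new Uri(baseUrl, request.Url));
                        }
                    }
                    catch (Exception ex)
                    {
                        // .. log it (in a thread safe way)
                    }
                }
            });
    }
}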
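
The catch block above leaves the logging step open. One possible way to fill it, as a minimal sketch: collect failures in a ConcurrentQueue, which many Parallel.ForEach workers can write to at once, and dump them to a file when the run finishes. FailureLog, Record, WriteTo, and the failures.txt file name are hypothetical names chosen for illustration, not part of the original program.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper: collects failures from many threads without locks.
static class FailureLog
{
    private static readonly ConcurrentQueue<string> Failures =
        new ConcurrentQueue<string>();

    // Safe to call concurrently from any worker thread.
    public static void Record(string url, Exception ex)
    {
        Failures.Enqueue(url + "\t" + ex.Message);
    }

    public static int Count
    {
        get { return Failures.Count; }
    }

    // Call once, after the parallel loop has completed.
    public static void WriteTo(string path)
    {
        File.WriteAllLines(path, Failures.ToArray());
    }
}

static class FailureLogDemo
{
    static void Main()
    {
        // Simulate 100 failing requests arriving from parallel workers.
        Parallel.For(0, 100, i =>
            FailureLog.Record("/post/" + i, new Exception("simulated failure")));

        Console.WriteLine(FailureLog.Count);
        FailureLog.WriteTo("failures.txt"); // assumed output file name
    }
}
```

In the replay loop, the catch block would then call something like FailureLog.Record(request.Url, ex), with a single FailureLog.WriteTo call after Parallel.ForEach returns.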

The little console mode application isn't foolproof, but it did uncover a number of problems and edge cases. As a bonus, it also turned into a good workload generator for SQL Server index re-tuning.
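
One reason the replay isn't foolproof: WebClient follows redirects automatically, so a request succeeds whenever the final page loads, whether or not the rule issued a permanent (301) redirect to the intended address. A hedged sketch of a stricter check follows. It assumes you can supply the expected new URL for a given old one. RedirectCheck, IsExpectedRedirect, Probe, and the sample URLs are hypothetical and not part of the original program.

```csharp
using System;
using System.Net;

// Hypothetical helper for checking a single redirect rule.
static class RedirectCheck
{
    // True only for a 301 whose Location header resolves to the expected URL.
    public static bool IsExpectedRedirect(
        Uri oldUrl, HttpStatusCode status, string location, Uri expected)
    {
        if (status != HttpStatusCode.MovedPermanently ||
            string.IsNullOrEmpty(location))
        {
            return false;
        }

        // Location may be absolute or relative; resolve it against the old URL.
        Uri target;
        if (!Uri.TryCreate(oldUrl, location, out target))
        {
            return false;
        }

        return target == expected;
    }

    // Requests the old URL without following the redirect.
    public static bool Probe(Uri oldUrl, Uri expected)
    {
        var request = (HttpWebRequest)WebRequest.Create(oldUrl);
        request.AllowAutoRedirect = false;

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return IsExpectedRedirect(
                    oldUrl, response.StatusCode,
                    response.Headers["Location"], expected);
            }
        }
        catch (WebException)
        {
            // 4xx/5xx responses and connection failures count as a miss.
            return false;
        }
    }
}

static class RedirectCheckDemo
{
    static void Main()
    {
        // Sample values only; no server is contacted here.
        var oldUrl = new Uri("http://localhost/old/post.aspx");
        var expected = new Uri("http://localhost/new/post");

        Console.WriteLine(RedirectCheck.IsExpectedRedirect(
            oldUrl, HttpStatusCode.MovedPermanently, "/new/post", expected));
    }
}
```

Probe needs a running site and a known old-to-new mapping, which the log files alone do not provide, so a check like this complements the brute force replay rather than replacing it.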


Comments
Matt, Friday, November 30, 2012
Great idea. Love it.

OdeToCode by K. Scott Allen