
Your Developer Horoscope

Friday, December 19, 2008 by scott

Aries (March 21-April 19): Avoid committing yourself to the next project at work. It’s going to become a death march, and you know it. Save your skills and energy for some open source hacking.

Taurus (April 20-May 20): You’ve been flirting with functional programming and now it’s time to take the plunge. Free your soul of side-effects and embrace a monad. It will make you feel good.

Gemini (May 21-June 21): You are entering a period of introspection. For pair programming, it’s best to hook up with Cancer. Avoid Pisces, because you know you’ll bicker over inheritance versus composition until someone gets hurt by a fast-moving keyboard.

Cancer (June 22-July 22): Your creative juices are flowing. Color coordinated themes will jump from your mind to fill the soft, supple curves of the rounded rectangles your customers visually crave. Who said developers can’t design a UI? Not you!

Leo (July 23-Aug. 22): You might start feeling detached from the rest of the team. Now is the time to randomly refactor code that someone else wrote in the name of collective code ownership. You might spark a new relationship!

Virgo (Aug. 23-Sept. 22): You might need to apply some fundamental design patterns to find the elegant solution you seek. Repository. Abstract Factory. Visitor. Pick 2.

Libra (Sept. 23-Oct. 23): There are lots of meetings in your future. Some of those future meetings will be meetings to discuss future meetings (the meta-meeting meeting). Good for you, the gaming market for cell phones is hitting its stride.

Scorpio (Oct. 24-Nov. 21): Don’t get frustrated with what your future holds. Broken builds, failing tests, and bug reports like “I think it broke when I click Submit, or something”. Be positive and live stress free. Consider taking up Yoga.

Sagittarius (Nov. 22-Dec. 21): You should start taking security seriously. SQL injections, code injections, script injections, header injections. Everyone is out to get you, and it’s only a matter of time. Only the paranoid survive. If you can trust a Scorpio, you might find a mentor.

Capricorn (Dec. 22-Jan. 19): Time for a change. If you’ve been using a statically typed language, then try something dynamic. If you’ve been using a dynamic language, then try something static. If you do both, then try Malbolge.

Aquarius (Jan. 20-Feb. 18): To you, software is a craft. You always have your eye on an impossible star, but you reach anyway. The vague predictions of horoscopes drive you crazy, but no one ever accused them of being a science, eh? Pisces will be your friend.

Pisces (Feb. 19-March 20): Monkeys with keyboards – that’s what you may be thinking about your team when the next project starts. When life gives you monkeys – make bananas. Or something like that. Watch out for Sagittarius.

Units of Work

Friday, December 12, 2008 by scott

The Unit of Work (let’s call it UoW) is another common design pattern found in persistence frameworks, but it’s new to many .NET developers who are just starting to use LINQ to SQL or the Entity Framework as the “new ADO.NET”. First, the obligatory P of EAA definition:

A Unit of Work keeps track of everything you do during a business transaction that can affect the database. When you're done, it figures out everything that needs to be done to alter the database as a result of your work.

Please think of a “business conversation” if the phrase “business transaction” makes you think of database transactions with BEGIN and COMMIT keywords. Although a UoW may use a real database transaction, the idea is to think at a slightly higher level. When your software needs to fulfill a specific business need, like processing an expense report, the unit of work will record all the changes you make to the objects involved in processing the expense report and save those changes to the database. The unit of work is like a black box recorder.
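To make that concrete, here is a minimal sketch of a unit of work built on the Entity Framework. The ExpenseContext and ExpenseReport types (and their Status and ApprovedOn properties) are hypothetical names invented just for the example.

public void Approve_Expense_Report(int reportId)
{
    // ExpenseContext and ExpenseReport are hypothetical types for illustration
    using (var context = new ExpenseContext())
    {
        var report = context.ExpenseReports
                            .Where(r => r.ID == reportId)
                            .First();

        // the context records these changes, it doesn't send them yet
        report.Status = "Approved";
        report.ApprovedOn = DateTime.Now;

        // one call works out the UPDATEs needed to persist everything
        // that changed during this business conversation
        context.SaveChanges();
    }
}

No updates reach the database until SaveChanges – until then, the context is just recording.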

DataSets and the UoW

If you’ve been using ADO.NET, then it turns out you’ve already been using a UoW. The ADO.NET DataSet offers a good UoW implementation. You can take some records from the database and shove them into a DataSet, then make changes to the data during the course of a business conversation. When the conversation is complete you can use a table adapter thingy to save the entire batch of changes into the database. The DataSet and adapter combination gives you concurrency checks and the ability to undo changes. You can even take a DataSet and send it on an epic journey outside of your app domain, outside your process, or even outside your machine, and wait for the DataSet to return at some future point in time. During its journey, the DataSet will faithfully continue to record changes the adapter can save. 
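In code, the DataSet flavor of the pattern looks roughly like this (a sketch only – the connection string, table, and column names are made up for the example):

var adapter = new SqlDataAdapter(
    "SELECT ID, Title, ReleaseDate FROM Movies", connectionString);
var builder = new SqlCommandBuilder(adapter); // generates UPDATE/INSERT/DELETE commands

var dataSet = new DataSet();
adapter.Fill(dataSet, "Movies");

// ... the business conversation makes changes over time ...
dataSet.Tables["Movies"].Rows[0]["Title"] = "A Better Title";

// push the entire batch of recorded changes back to the database
adapter.Update(dataSet, "Movies");

The DataSet tracks row states (Added, Modified, Deleted), and the adapter turns those states into the appropriate SQL commands.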

DataSets have drawbacks, however, and I’m not going to resurrect that tired conversation. Suffice to say there is a reason that many developers have, for many years, been using persistence frameworks that offer a higher level of abstraction.

Persistence Frameworks and the UoW

The UoW implementation varies widely among the different persistence frameworks. LLBLGen has a serializable class explicitly named UnitOfWork that supports field level versioning, while NHibernate hides a UoW implementation behind an ISession interface. In LINQ to SQL the UoW centers around the DataContext, while in the Entity Framework the UoW centers around the ObjectContext (let’s just refer to both as the Context objects). I’m not going to compare and contrast the different implementations – I’m just pointing out that most frameworks provide a UoW, but the features can vary.

The rest of this post will focus on the L2S and EF implementations, because they often trip up ADO.NET developers who make assumptions about how the Context will behave. With ADO.NET DataSets, every query sent to the database will retrieve the latest data, but this isn’t always true with the Context objects.

[TestMethod]
public void Context_Is_Isolated_From_Outside_Changes()
{           
    using (var ctx = new MovieReviewsContext())
    {
        var movie = ctx.Movies.Where(m => m.ID == 100)
                              .First();
        // make a change
        movie.ReleaseDate = new DateTime(2007, 1, 1);

        SimulateOtherUsersWork();

        // decide we want to refresh from database
        movie = ctx.Movies.Where(m => m.ID == 100)
                               .First();

        // but we don’t see the new data!!!!!!!!!
        Assert.AreEqual(2007, movie.ReleaseDate.Year); 
    }
}

public void SimulateOtherUsersWork()
{
    using (var ctx = new MovieReviewsContext())
    {
        var movie = ctx.Movies.Where(m => m.ID == 100)
                              .First();
        movie.ReleaseDate = new DateTime(2008, 1, 1);
        ctx.SaveChanges();
    }
}

Remember that the context objects implement an identity map. When the second LINQ query goes to the database and fetches the Movie record with an ID of 100, the context will consult its identity map and see it has already created an object to represent the Movie with an ID of 100. The context simply returns a reference to the existing object instead of creating a new Movie entity – meaning the context effectively ignores any changes that have been made in the database.

Some developers will think this is a bug (you mean I can’t get fresh data?), but the context objects are designed to live for a single unit of work - a single business conversation. During this time they also isolate your entities from changes that were persisted in different business conversations. Imagine passing an ExpenseReport for 1000 paperclips through business validation logic and submitting changes – only to have the ExpenseReport contents morph into 1000 voodoo dolls immediately after validation because of some query that was executed implicitly. Practically speaking, many applications will never run into the problem of picking up unintended changes, but this isolated behavior is on by default and the consequences catch more than a few people off guard.
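If you decide you really do want the latest database values inside the same context, you have to ask for them explicitly. Picking up inside the earlier test, a sketch using the Entity Framework’s Refresh method (the RefreshMode enum lives in System.Data.Objects, and this assumes MovieReviewsContext derives from ObjectContext; LINQ to SQL’s DataContext offers a similar Refresh method) would look like this:

// overwrite the tracked entity with the current database values
// (StoreWins means "the database is right, discard my in-memory changes")
ctx.Refresh(RefreshMode.StoreWins, movie);

// now the other user's save should be visible
Assert.AreEqual(2008, movie.ReleaseDate.Year);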

Why Is This Important?

Using LINQ to SQL’s DataContext or the Entity Framework’s ObjectContext for more than one business conversation is usually the wrong thing to do. These classes are designed to support a single unit of work because of the way they track changes and use an identity map. If you keep them around too long and do too much work, then the identity map will grow too big, and the data will become too stale.

For desktop applications it is common to start a unit of work when a dialog opens, and complete the unit of work when the dialog closes (context per view). On the server side it is common to start a unit of work when an HTTP request begins processing, and close the unit of work when the HTTP request ends (context per request).
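A rough sketch of context-per-request for ASP.NET (the CurrentContext helper and the dictionary key are my own invention for the example) is to create the context lazily in HttpContext.Items and dispose of it when the request ends:

public static class CurrentContext
{
    private const string Key = "MovieReviewsContext";

    public static MovieReviewsContext Current
    {
        get
        {
            var items = HttpContext.Current.Items;
            if (items[Key] == null)
            {
                items[Key] = new MovieReviewsContext();
            }
            return (MovieReviewsContext)items[Key];
        }
    }

    // wire this up to Application_EndRequest in Global.asax
    public static void DisposeCurrent()
    {
        var context = HttpContext.Current.Items[Key] as MovieReviewsContext;
        if (context != null)
        {
            context.Dispose();
        }
    }
}

Every query during the request shares one unit of work, and the identity map never lives longer than the request.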

Of course, not every business conversation can complete in a single request or a single dialog box. Managing “long-winded” conversations with EF or L2S is tricky, unfortunately, and it’s part of the debate you’ll see in blogs these days. We’ll have to address this issue in a different post.

XM Radio Player Part III : Choices

Wednesday, December 10, 2008 by scott

WPF, AJAX, or Silverlight RIA?
I gots more options than a block of Velveeta.
XBAP? XCOPY? XAP files are just ZIPs!
Let me run my bits on your silicon chips.

When I started to think of what my XM Radio Player would look like, I could only picture two things:

  • Rectangles with rounded corners.
  • Gradient color fills.

Given this detailed initial vision I knew I had to pick WPF.

What about Silverlight?

I considered Silverlight, but three factors kept me away. The first issue would be the cross-domain calls to the XM Radio domain. Although XM does put a crossdomain.xml file on their server, it only allows calls from *.xmradio.com, which meant I’d have to deploy a web service just to thunk calls over to XM – not hard, but one additional piece of complexity. Secondly, I’ve been up to my ears in Silverlight a few times this year and wanted to try something new. Thirdly, it’s been some time since I’ve written a desktop app, and I wanted to have some fun and also take a look at Prism.

Identity Maps

Monday, December 8, 2008 by scott

There are a couple of important patterns in play when you use a persistence framework. These patterns have been around for quite some time but are relatively new to .NET developers who are jumping into LINQ to SQL and the ADO.NET Entity Framework.

The first of these patterns is the Identity Map pattern. There are various aliases for this pattern: some frameworks call it the “identity cache”, while others have an “entity uniqueing” implementation. Using examples as tests, we can describe its behavior with a passing test. The code is written with the Entity Framework, but you could substitute a LINQ to SQL DataContext, an LLBLGen Context, or an NHibernate Session and observe the same behavior – the point being that an identity map can be found in almost every framework for persisting objects.

[TestMethod]
public void Context_Implements_An_Identity_Map()
{
    var context = new MovieReviewsContext();

    var movie1 = context.Movies.Where(m => m.ID == 100).First();
    var movie2 = context.Movies.Where(m => m.ID == 100).First();

    var movie3 = context.Movies.Where(m => m.ID > 98 && m.ID < 105)
                               .ToList()  // bring them all into memory
                               .ElementAt(1);

    Assert.IsTrue(Object.ReferenceEquals(movie1, movie2));
    Assert.IsTrue(Object.ReferenceEquals(movie1, movie3));
}

The test will execute three queries against the database. All three queries will retrieve the movie record with an ID of 100, with the last query also pulling some additional records. No matter how many times we query the database, even using different queries, exactly ONE object represents the movie with an ID of 100. The context uses an identity map to track the entities it creates. The mapping information we provide to the framework allows the context to know which property uniquely identifies each entity (the integer ID property in the case of a Movie).

You can think of the identity map as a Dictionary<int, Movie>. Each time the context sees a resultset from the database, it checks the dictionary to see if it has already created an entity for the data. If so, it just returns a reference to the entity it already created. This is different from the behavior of ADO.NET DataSets, where you can issue three queries for Movie.ID == 100 and get back three distinct DataRows. Also, it’s important to know that each DataContext/ObjectContext in LINQ to SQL/Entity Framework maintains its own identity map.
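A drastically simplified sketch of that bookkeeping (nothing like the real EF or LINQ to SQL internals, just the idea) could look like this:

public class MovieIdentityMap
{
    private readonly Dictionary<int, Movie> _map = new Dictionary<int, Movie>();

    public Movie GetOrAdd(int id, Func<Movie> materializeFromDatabaseRow)
    {
        Movie movie;
        if (!_map.TryGetValue(id, out movie))
        {
            // first time we've seen this key - create and remember the entity
            movie = materializeFromDatabaseRow();
            _map[id] = movie;
        }

        // every later query for the same key gets the same object reference
        return movie;
    }
}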

[TestMethod]
public void Each_Context_Has_Distinct_Identity_Map()
{
    var context1 = new MovieReviewsContext();
    var context2 = new MovieReviewsContext();

    var movie1 = context1.Movies.Where(m => m.ID == 100).First();
    var movie2 = context2.Movies.Where(m => m.ID == 100).First();

    Assert.IsFalse(Object.ReferenceEquals(movie1, movie2));
}

Why Is This Important?

Some people I’ve talked to don’t like the behavior of the identity map, or are surprised by the behavior. But, imagine if we could create two objects from the same context and both objects represent the movie with an ID of 100 – and then we start making changes to those objects.  Which object should “win” when we save changes to the database? This is the scenario Fowler analogizes in his definition of the identity map pattern.

An old proverb says that a man with two watches never knows what time it is. If two watches are confusing, you can get in an even bigger mess with loading objects from a database. If you aren't careful you can load the data from the same database record into two different objects. Then, when you update them both you'll have an interesting time writing the changes out to the database correctly.

Identity maps enforce object identity in memory in much the same way that a database enforces row identity in a database table using a primary key value.

Not every application needs the safety net afforded by an identity map implementation, but when one is in place you need to be aware of its ramifications. We’ll cover these in a future post on another important pattern – the Unit of Work.

XM Radio Player Part II : Scraping

Friday, December 5, 2008 by scott

Just to make sure everything was as easy as it looked in Fiddler, I wrote a quick and dirty piece of throwaway code to see if I could programmatically log in to XM and play a stream of music with Windows Media Player. It was ugly, but …

public void Can_Start_Media_Player_With_Xm_Stream()
{
    var cookies = new CookieContainer();

    // step 1: get auth cookie
    HttpWebRequest request =
        WebRequest.Create("http://xmro.xmradio.com" +
                          "/xstream/login_servlet.jsp") as HttpWebRequest;
    request.CookieContainer = cookies;
    request.ContentType = "application/x-www-form-urlencoded";
    request.Method = "POST";

    var requestStream = request.GetRequestStream();
    var loginData = Encoding.Default.GetBytes(
        "user_id=*******&pword=******");
    requestStream.Write(loginData, 0, loginData.Length);
    requestStream.Close();

    HttpWebResponse response = request.GetResponse() as HttpWebResponse;
    var data = ReadResponse(response);

    // ...

    // step 4: get player URL for channel 26
    request = WebRequest.Create("http://player.xmradio.com" +
                                "/player/2ft/playMedia.jsp?ch=26&speed=high")
              as HttpWebRequest;
    request.Method = "GET";
    request.CookieContainer = cookies;
    response = request.GetResponse() as HttpWebResponse;
    data = ReadResponse(response);

    Regex regex = new Regex(
        "<param\\s*name=\"FileName\"\\s*value=\"(.*)\"",
        RegexOptions.IgnoreCase);
    string url = regex.Match(data).Groups[1].Value;

    Process.Start("wmplayer.exe", url);
}

… it worked!

The most difficult piece was digging out the FileName parameter from the playMedia servlet response. The HTML response was almost, but not quite, XHTML compliant. LINQ to XML wasn’t an option, unfortunately, so I used a regular expression. I don’t write regular expressions often enough to build them without a lot of help, which is why I keep a copy of Roy Osherove’s Regulator around. I could paste a copy of the servlet response into Regulator and then iteratively hack at a regular expression until something worked and I could stop cursing. In a couple weeks I’ll have no idea what stuff like \”(.*)\” means anymore, which is both good and bad at the same time.

I’ve always liked the idea of looking at the high risk areas of a project and writing some throwaway code to see if a solution is technically feasible. Some people call this a “spike”, and as I’ve said before – you never want to let spike code into production (but you never want to really throw the code away, either).

Screen Scraping

The regex in the above code is “scraping” data from the HTML. Screen scraping is nearly as old as computers themselves – in fact it’s one of the earliest forms of interoperability. One way to get data from a mainframe is to capture the data from the mainframe’s display output on a terminal – thus the term “screen scraping”. Not that I’ve ever written a terminal scraper, but I’ve seen them in action and I wouldn’t be surprised if 90% of all Fortune 500 companies that have been in business for more than 30 years are still using screen scraping somewhere for enterprise application integration.

Something I have done quite a bit is screen scraping for the web. “Web scraping” (as some call it) has a slight negative connotation these days as it can be used for nefarious purposes – like the spammers who harvest email addresses from web pages. There have also been some legal battles when people screen scrape to repurpose someone else’s content. But generally speaking - screen scraping isn’t inherently bad. In this case I’m only using scraped data to build a custom player and listen to music with my paid subscription.

With the .NET libraries there are really just three key areas to focus on:

  • Properly formulating the request
  • Managing cookies
  • Parsing the response

Properly formulating the request was relatively easy in this scenario, even with the credentials in the POST data. Comparing what your program sends to the server with what the web browser sends, using an HTTP debugging tool like Fiddler, is the best way to troubleshoot problems. Some scenarios aren’t as easy. For instance, if you need to programmatically post to an ASP.NET web form, you need the proper hidden field values for stuff like viewstate and event validation, which means you need to parse these hidden fields from a previous GET request for the same web form.
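For the curious, a sketch of that web form scenario might look like the following. The page URL, the txtSearch field, and the regular expression are all assumptions for illustration, and the code leans on System.Net, System.IO, System.Text, System.Text.RegularExpressions, and System.Web (for HttpUtility).

var cookies = new CookieContainer();
var url = "http://example.com/Search.aspx";   // hypothetical web form

// GET the form first to capture the hidden fields
var get = (HttpWebRequest)WebRequest.Create(url);
get.CookieContainer = cookies;
string html;
using (var reader = new StreamReader(get.GetResponse().GetResponseStream()))
{
    html = reader.ReadToEnd();
}

// pull the hidden fields out of the markup
Func<string, string> hidden = name =>
    Regex.Match(html, "name=\"" + name + "\"[^>]*value=\"([^\"]*)\"",
                RegexOptions.IgnoreCase).Groups[1].Value;

// echo the hidden fields back along with our own form values
var body = "__VIEWSTATE=" + HttpUtility.UrlEncode(hidden("__VIEWSTATE")) +
           "&__EVENTVALIDATION=" + HttpUtility.UrlEncode(hidden("__EVENTVALIDATION")) +
           "&txtSearch=fiddler";               // hypothetical form field

var post = (HttpWebRequest)WebRequest.Create(url);
post.CookieContainer = cookies;
post.Method = "POST";
post.ContentType = "application/x-www-form-urlencoded";
var bytes = Encoding.UTF8.GetBytes(body);
using (var stream = post.GetRequestStream())
{
    stream.Write(bytes, 0, bytes.Length);
}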

If a web site requires cookies to work properly (as XM does), then you’ll find that managing cookies with the .NET libraries is tricky – but only because it doesn’t happen by default. All you need to do is create one instance of the CookieContainer class and attach that same instance to all of your outgoing web requests. .NET will then take care of storing cookies it finds embedded in a server’s response and transmitting those cookies back to the server on subsequent requests. The framework knows all about cookie domains and secure cookies, so you don’t need to manually finagle HTTP headers at all – just use the container.

Parsing the response is generally where you invest most of your time and feel the most pain. There are managed libraries like the Html Agility Pack that can help, and 1000 other solutions ranging from String.IndexOf, to regular expressions, to XPath queries over XHTML compliant markup. The basic problem is that sites generally don’t design their web pages for machine-to-machine interaction, and they can change their markup at any time and break your scraping logic. No matter how robust your scraping logic is – nothing short of a perfect AI can keep your logic from breaking. It’s always better to stick with official interoperability endpoints that use SOAP, REST, or some other form of structured response that is intended for machines. It’s strange that we are almost in the year 2009 and we will still be using strings and regular expressions to interop with most of the web for the foreseeable future.
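As a point of comparison, digging the FileName parameter out of the playMedia response with the Html Agility Pack might look something like this. It’s only a sketch – the XPath assumes a <param name="FileName" value="..."> element exists somewhere in the scraped markup.

// a sketch using the Html Agility Pack instead of a regular expression
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(data);

var param = doc.DocumentNode.SelectSingleNode("//param[@name='FileName']");
if (param != null)
{
    string url = param.GetAttributeValue("value", "");
    Process.Start("wmplayer.exe", url);
}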

Next Steps

Knowing that a custom player was possible, the next step was to pick a platform to build on.

Stay tuned…

Leeroy Jenkins in Software Development

Thursday, December 4, 2008 by scott

The Jenkins video always makes me laugh. It’s the short and sad tale of a man who doesn’t have the patience for planning, nor the stamina for statistical analysis. He’s a man who leaps into action without thought of consequence. To Leeroy, there are no 50,000 foot views, no discussions, and … no tomorrow. In the video you can watch as Leeroy’s guild is decimated by dragons who take advantage of the chaos provided by Leeroy’s thoughtless actions.

You might be thinking I’m going to liken Leeroy to the developer who writes 200 lines of untested code inside a method named “SubmitOrder” before his team has even left the conference room where they are discussing what the software is supposed to do – but I’m not going to make that analogy.

Well, not that exact analogy.

I think there is a time and a place to get some Leeroy into your life. I’m always amazed at email threads where participants debate the possible outcomes of pseudocode when it would be easier to just jump in and let the computer decide the outcome.

When in doubt – just write the code (preferably with a Leeroyesque shout). Thought experiments are for physicists!

XM Radio Player Part I : Fiddling

Tuesday, December 2, 2008 by scott

I’ve had an XM radio subscription for a couple years now but I’ve never been entirely happy with any of the options for XM Radio Online. XM’s online player has changed very little over the years, and it is missing many simple features you can find in other third party players and real radio hardware – like artist and song notifications. A couple days ago I started to think about just writing my own XM radio player for the heck of it.

The first step was to see what goes on between the browser and the server when streaming music. Note: this is an attempt to build a legitimate custom player to use with an XM subscription – not an attempt to get XM programming for free.

Fiddler

Fiddler is my tool of choice for web traffic sniffing. If you are just looking for tools to monitor the JSON payload sizes for an AJAX-ified web page, then there are a million tools and plug-ins with this functionality. Fiddler goes well beyond the entry level sniffing tool and touts itself as an HTTP debugger. You can set breakpoints, tamper with requests, build requests (from scratch, or by dragging and dropping a request from the list of previous requests), force requests to trickle (to simulate slow networks – great for testing Silverlight splash screens), and extend Fiddler with managed code. There is also a CustomRules.js file that one can use to quickly extend and customize Fiddler using JScript.NET. As an example, the following code will place a new “Hide CSS requests” option in the Fiddler Rules menu and declare a variable to store the state of the menu item:

public static RulesOption("Hide CSS requests")
var m_HideCSS: boolean = false;

To hide CSS requests when the menu item is checked, add the following to the OnBeforeRequest function:

if(m_HideCSS) {
  if(oSession.uriContains(".css")) {
    oSession["ui-hide"] = "hiding css";
  }
}

There are dozens of other useful rules in the Fiddlerscript cookbook.

Perhaps the least known feature in Fiddler is the QuickExec box (or at least I didn’t know about QuickExec until recently). QuickExec is the little black box at the bottom of the recorded session list. The box serves as a command line interface to Fiddler, and you can use QuickExec to filter the session list, perform string matching, set breakpoints, launch a web browser, and more. There is a video demonstration for QuickExec available, and a list of default commands. If you use Fiddler regularly it’s worth your time to learn QuickExec and find the shortcuts that make your job easier.

Fiddler and XM

It only takes a few minutes with Fiddler to figure out the four important endpoints for XM radio online.

  • A “login servlet” that requires a POST with user credentials in the payload. A successful login will respond with an auth cookie that needs to be sent along on subsequent requests. In ASP.NET we’d call this forms authentication.
  • A “channel data” servlet that returns the available XM channels in JSON.
  • A “now playing” servlet that returns the artist and song currently playing on each XM channel (again in JSON).
  • A “play media” servlet that accepts a channel number in the query string and returns HTML. Buried in the HTML is an object tag that instantiates Windows Media Player and includes the URL to the music as a parameter (actually the URL parameter points to an endpoint that returns an ASX playlist which Media Player understands – the actual mms:// endpoint is in the ASX). The URL includes what looks like an encrypted auth token – the auth cookie isn’t required anymore, but the streaming server can still determine if the user is authorized to stream music.

This was enough information to decide that building a custom player isn’t terribly difficult. The next step was to write some .NET code just to prove there weren’t any technical hurdles in walking through these four endpoints. We’ll look at that in Part II.
