
Dates and Times - Software's Bane

Wednesday, January 7, 2009 by scott
I’ve always been amused by date and time problems in software, and in my mind I often juxtapose “real” time against “software” time. Real time inexorably marches forward, while software time can go forwards, backwards, left, right, and sometimes belly up. Inside this theater of my brain, real world time is played by a deadpanning Steve Martin, while software’s interpretation of time is a buffoonish John Candy. You can’t help but laugh at the contrast – it’s a classic comedic recipe.

Naturally, I chuckled when last week’s Zunicide turned out to be leap year related. I also giggled at the Pontiac gaffe, chortled when I read about crashed DVRs, tittered over the stalled Norwegian trains, and cackled at Cisco’s Kerberos slipup. For nonstop laughs I can read the Leap Zine - a birthday club for anyone born on February 29th. The stories the leap babies can tell are endless: problems with driver licenses, insurance policies, and rental car companies. The list goes on and on. Some leap babies alter their birth certificates just to avoid the entire mess.

Do leap years represent an edge case? That was one of the topics for debate today on Twitter. I don’t believe leap years fall under the strict definition of an edge case, but they are obviously tricky enough that we’ve consistently screwed them up - and it’s not always a laughing matter. In 1996, a leap year bug shut down 660 computers at a New Zealand aluminum smelter and cost the plant over $1 million in repairs. I’m sure they fixed the bug, but I think there is still a problem with their pronunciation of aluminum. It’s 4 syllables, everybody – ah-LOO-meh-num.

There are also numerous time bugs in software that aren’t leap year bugs. Remember the Win98 date rollover bug? Ever used software with an inadvertent time bomb? How many glitches exist because of the two-digit year? Developers are cursed when it comes to date and time handling. And just when you have it figured out, some government or Pope goes and changes the rules.

Leaps and Boundaries

Q: Why shouldn’t the following test pass?

[TestMethod]
public void A_Name_Would_Give_It_Away()
{
    DateTime almost2009 =
        new DateTime(2008, 12, 31, 23, 59, 59,
                     DateTimeKind.Utc);
    DateTime newDate = almost2009.AddSeconds(1);

    Assert.AreEqual(2009, newDate.Year);
}

A: Because 2008 included a leap second. Technically, the last second for 2008 on the UTC clock was 23:59:60 (not that any runtime I’ve ever used correctly accounts for leap seconds).

If leap years are a minefield for developers – at least they are a predictable minefield. We know when leap years will occur in the future. All we need is the diligence and thought to write some proper algorithms. Leap seconds are a different matter.
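For the predictable half of the problem, the Gregorian rule fits in one expression. What follows is a minimal sketch, not a recommendation to roll your own: the class and method names are made up for illustration, and the framework's own DateTime.IsLeapYear already does this job.

```csharp
using System;

static class LeapYearSketch
{
    // Gregorian rule: every 4th year is a leap year, except century years,
    // unless the century year is also divisible by 400.
    public static bool IsLeapYear(int year)
    {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Derive the length of a year from the rule instead of assuming 365.
    public static int DaysInYear(int year)
    {
        return IsLeapYear(year) ? 366 : 365;
    }

    static void Main()
    {
        Console.WriteLine(IsLeapYear(2008));  // True
        Console.WriteLine(IsLeapYear(1900));  // False
        Console.WriteLine(IsLeapYear(2000));  // True
        Console.WriteLine(DaysInYear(2008));  // 366
    }
}
```

Day 366 is where loops that assume a 365 day year tend to go wrong, so December 31st of a leap year deserves its own test case.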

Q: How many UTC seconds will elapse between now and January 1st, 2012?

A: Nobody knows

Leap seconds are an irregular adjustment made to UTC time. The IERS announces whether a leap second is needed in a bulletin it publishes every 6 months. Maybe we will have a leap second this year, maybe not. Millions of seconds elapse in a year, though, so does anyone care if we miss just one?

Ask the makers of a Motorola GPS receiver that is rumored to be utilized in some fast moving, explosive munitions.
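If you did want to count real UTC seconds, the best you can do is carry a table of announced leap seconds and correct the DateTime arithmetic yourself. The sketch below assumes a hand-maintained table holding only the two insertions at the ends of 2005 and 2008; the class name, the table, and the method are all hypothetical, and nothing in the table can cover a leap second that hasn't been announced yet.

```csharp
using System;

static class LeapSecondSketch
{
    // Assumption: a hand-maintained list of the UTC midnights that immediately
    // follow an inserted leap second. Only 2005 and 2008 are listed here; a
    // real table needs every insertion and must be updated after each bulletin.
    static readonly DateTime[] MidnightsAfterLeapSecond =
    {
        new DateTime(2006, 1, 1, 0, 0, 0, DateTimeKind.Utc),
        new DateTime(2009, 1, 1, 0, 0, 0, DateTimeKind.Utc),
    };

    // DateTime subtraction ignores leap seconds, so add one second for every
    // listed insertion that falls inside the interval (fromUtc, toUtc].
    public static long ElapsedSeconds(DateTime fromUtc, DateTime toUtc)
    {
        long seconds = (long)(toUtc - fromUtc).TotalSeconds;
        foreach (DateTime boundary in MidnightsAfterLeapSecond)
        {
            if (fromUtc < boundary && boundary <= toUtc)
            {
                seconds++;
            }
        }
        return seconds;
    }

    static void Main()
    {
        var almost2009 = new DateTime(2008, 12, 31, 23, 59, 59, DateTimeKind.Utc);
        var start2009 = almost2009.AddSeconds(1); // DateTime skips 23:59:60

        // 23:59:59 -> 23:59:60 -> 00:00:00 is two seconds on the UTC clock.
        Console.WriteLine(ElapsedSeconds(almost2009, start2009)); // 2
    }
}
```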

If time wasn’t so important to civilization, I think we’d try to circumvent all these problems by just not mixing clocks with computers.

A Message For You

Thursday, January 1, 2009 by scott

This year’s silly, obfuscated, console mode C# program is brought to you by LINQ.

using System.Linq;
using Electricity = System.Collections.Generic.IEnumerable<char>;
using here = System.Collections.Generic.Dictionary<int, char>;
namespace Message{static class __{

static Electricity Transform(this here collection)
{  return new
    int[] { 
     16, 22, 28, 
      28, 7, 0, 20, 
       9, 11, 0, 7, 
        9,22,13,5 }
         .Select(i =>
          collection[i]);} 
static void Main() {
    System.Console.WriteLine(
    @"                       ._
                             !~
                             yuuue
                             |_w-|
                             | _r|
                             |_ -|
        ________ .$$. ______ | - | _____________
                .#$H$. __    |-  | ....__
          _.--' $$$$$$    ` -[__n]        `--a:f-
                $$$$$$    -.
           -.    `:/'    _.))        .--.
                  ||   .'.-'     _..-.. _.-.
           ._.-.  ''  /  (     .'      `.
         -'     `.   .    `. -'p
                  `. .      `--..
                      `.
        "
  .Distinct().Select((c, i) => new { c, i })
  .ToDictionary(ci => ci.i, ci => ci.c)
  .Transform().Aggregate("", (s, c) => s += c));  
}}} // ascii_based_on_art_By_Andreas_Freise www.ascii-art.de //

 

Peace!

Your Developer Horoscope

Friday, December 19, 2008 by scott

Aries (March 21-April 19): Avoid committing yourself to the next project at work. It’s going to become a death march, and you know it. Save your skills and energy for some open source hacking.

Taurus (April 20-May 20): You’ve been flirting with functional programming and now it’s time to take the plunge. Free your soul of side-effects and embrace a monad. It will make you feel good.

Gemini (May 21-June 21): You are entering a period of introspection. For pair programming, it’s best to hook up with Cancer. Avoid Pisces, because you know you’ll bicker over inheritance versus composition until someone gets hurt by a fast-moving keyboard.

Cancer (June 22-July 22): Your creative juices are flowing. Color coordinated themes will jump from your mind to fill the soft, supple curves of the rounded rectangles your customers visually crave. Who said developers can’t design a UI? Not you!

Leo (July 23-Aug. 22): You might start feeling detached from the rest of the team. Now is the time to randomly refactor code that someone else wrote in the name of collective code ownership. You might spark a new relationship!

Virgo (Aug. 23-Sept. 22): You might need to apply some fundamental design patterns to find the elegant solution you seek. Repository. Abstract Factory. Visitor. Pick 2.

Libra (Sept. 23-Oct. 23): There are lots of meetings in your future. Some of those future meetings will be meetings to discuss future meetings (the meta-meeting meeting). Good for you that the gaming market for cell phones is hitting its stride.

Scorpio (Oct. 24-Nov. 21): Don’t get frustrated with what your future holds. Broken builds, failing tests, and bug reports like “I think it broke when I click Submit, or something”. Be positive and live stress free. Consider taking up Yoga.

Sagittarius (Nov. 22-Dec. 21): You should start taking security seriously. SQL injections, code injections, script injections, header injections. Everyone is out to get you, and it’s only a matter of time. Only the paranoid survive. If you can trust a Scorpio, you might find a mentor.

Capricorn (Dec. 22-Jan. 19): Time for a change. If you’ve been using a statically typed language, then try something dynamic. If you’ve been using a dynamic language, then try something static. If you do both, then try Malbolge.

Aquarius (Jan. 20-Feb. 18): To you, software is a craft. You always have your eye on an impossible star, but you reach anyway. The vague predictions of horoscopes drive you crazy, but no one ever accused them of being a science, eh? Pisces will be your friend.

Pisces (Feb. 19-March 20): Monkeys with keyboards – that’s what you may be thinking about your team when the next project starts. When life gives you monkeys – make bananas. Or something like that. Watch out for Sagittarius.

Units of Work

Friday, December 12, 2008 by scott

The Unit of Work (let’s call it UoW) is another common design pattern found in persistence frameworks, but it’s new to many .NET developers who are just starting to use LINQ to SQL or the Entity Framework as the “new ADO.NET”. First, the obligatory P of EAA definition:

A Unit of Work keeps track of everything you do during a business transaction that can affect the database. When you're done, it figures out everything that needs to be done to alter the database as a result of your work.

Please think of a “business conversation” if the phrase “business transaction” makes you think of database transactions with BEGIN and COMMIT keywords. Although a UoW may use a real database transaction, the idea is to think at a slightly higher level. When your software needs to fulfill a specific business need, like processing an expense report, the unit of work will record all the changes you make to the objects involved in processing the expense report and save those changes to the database. The unit of work is like a black box recorder.
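To make the black box recorder idea concrete, here is a toy sketch. It assumes the caller registers changes explicitly, and every name in it (ExpenseReport, ToyUnitOfWork, the Register methods) is invented for illustration. Real frameworks usually detect changes for you, and Commit here only returns a log of what a real implementation would send to the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
class ExpenseReport
{
    public int Id;
    public decimal Total;
}

// A toy unit of work: it records what happened during the conversation
// and defers every write until Commit is called.
class ToyUnitOfWork
{
    private readonly List<ExpenseReport> _added = new List<ExpenseReport>();
    private readonly List<ExpenseReport> _modified = new List<ExpenseReport>();
    private readonly List<ExpenseReport> _deleted = new List<ExpenseReport>();

    public void RegisterNew(ExpenseReport report)
    {
        _added.Add(report);
    }

    public void RegisterDirty(ExpenseReport report)
    {
        // A new object is inserted with its latest values, so no UPDATE is needed.
        if (!_added.Contains(report) && !_modified.Contains(report))
        {
            _modified.Add(report);
        }
    }

    public void RegisterDeleted(ExpenseReport report)
    {
        // An object that was never saved needs no DELETE at all.
        if (_added.Remove(report)) return;
        _modified.Remove(report);
        if (!_deleted.Contains(report)) _deleted.Add(report);
    }

    // Returns a description of the work instead of touching a database.
    public List<string> Commit()
    {
        var log = new List<string>();
        foreach (var r in _added) log.Add("INSERT " + r.Id);
        foreach (var r in _modified) log.Add("UPDATE " + r.Id);
        foreach (var r in _deleted) log.Add("DELETE " + r.Id);
        _added.Clear();
        _modified.Clear();
        _deleted.Clear();
        return log;
    }
}

static class UnitOfWorkDemo
{
    static void Main()
    {
        var uow = new ToyUnitOfWork();
        var report = new ExpenseReport { Id = 7, Total = 10m };

        uow.RegisterDirty(report);
        report.Total = 12m;
        uow.RegisterDirty(report); // recorded once, however often it changes

        foreach (string line in uow.Commit())
        {
            Console.WriteLine(line); // UPDATE 7
        }
    }
}
```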

DataSets and the UoW

If you’ve been using ADO.NET, then it turns out you’ve already been using a UoW. The ADO.NET DataSet offers a good UoW implementation. You can take some records from the database and shove them into a DataSet, then make changes to the data during the course of a business conversation. When the conversation is complete you can use a table adapter thingy to save the entire batch of changes into the database. The DataSet and adapter combination gives you concurrency checks and the ability to undo changes. You can even take a DataSet and send it on an epic journey outside of your app domain, outside your process, or even outside your machine, and wait for the DataSet to return at some future point in time. During its journey, the DataSet will faithfully continue to record changes the adapter can save. 
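The change tracking is easy to see without a database. This sketch builds a DataTable in memory (the Movies table and its rows are made up for the example) and uses AcceptChanges to pretend the rows were just loaded. GetChanges then returns the batch an adapter would be asked to save, and RejectChanges is the undo.

```csharp
using System;
using System.Data;

static class DataSetUnitOfWorkSketch
{
    // Builds a small in-memory table standing in for rows loaded from a database.
    public static DataTable BuildMovies()
    {
        var movies = new DataTable("Movies");
        movies.Columns.Add("ID", typeof(int));
        movies.Columns.Add("Title", typeof(string));
        movies.Rows.Add(100, "Original Title");
        movies.Rows.Add(101, "Another Title");
        movies.AcceptChanges(); // mark every row as Unchanged, as if freshly loaded
        return movies;
    }

    static void Main()
    {
        DataTable movies = BuildMovies();

        // The "business conversation": edit one row, add another.
        movies.Rows[0]["Title"] = "New Title";
        movies.Rows.Add(102, "Brand New");

        Console.WriteLine(movies.Rows[0].RowState);                          // Modified
        Console.WriteLine(movies.Rows[2].RowState);                          // Added
        Console.WriteLine(movies.Rows[0]["Title", DataRowVersion.Original]); // Original Title

        // The batch of pending work an adapter would save.
        DataTable pending = movies.GetChanges();
        Console.WriteLine(pending.Rows.Count);                               // 2

        // Undo: the edit is reverted and the added row disappears.
        movies.RejectChanges();
        Console.WriteLine(movies.Rows.Count);                                // 2
        Console.WriteLine(movies.Rows[0]["Title"]);                          // Original Title
    }
}
```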

DataSets have drawbacks, however, and I’m not going to resurrect that tired conversation. Suffice to say there is a reason that many developers have, for many years, been using persistence frameworks that offer a higher level of abstraction.

Persistence Frameworks and the UoW

The UoW implementation varies widely among the different persistence frameworks. LLBLGen has a serializable class explicitly named UnitOfWork that supports field level versioning, while NHibernate hides a UoW implementation behind an ISession interface. In LINQ to SQL the UoW centers around the DataContext, while in the Entity Framework the UoW centers around the ObjectContext (let’s just refer to both as the Context objects). I’m not going to compare and contrast the different implementations – I’m just pointing out that most frameworks provide a UoW, but the features can vary.

It is the L2S and EF implementations that the rest of this post will discuss because they often trip up the ADO.NET developers who make assumptions about how the Context will behave. With ADO.NET DataSets, every query sent to the database will retrieve the latest data, but this isn’t always true with the Context objects.

[TestMethod]
public void Context_Is_Isolated_From_Outside_Changes()
{           
    using (var ctx = new MovieReviewsContext())
    {
        var movie = ctx.Movies.Where(m => m.ID == 100)
                              .First();
        // make a change
        movie.ReleaseDate = new DateTime(2007, 1, 1);

        SimulateOtherUsersWork();

        // decide we want to refresh from database
        movie = ctx.Movies.Where(m => m.ID == 100)
                               .First();

        // but we don’t see the new data!!!!!!!!!
        Assert.AreEqual(2007, movie.ReleaseDate.Year); 
    }
}

public void SimulateOtherUsersWork()
{
    using (var ctx = new MovieReviewsContext())
    {
        var movie = ctx.Movies.Where(m => m.ID == 100)
                              .First();
        movie.ReleaseDate = new DateTime(2008, 1, 1);
        ctx.SaveChanges();
    }
}

Remember that the context objects implement an identity map. When the second LINQ query goes to the database and fetches the Movie record with an ID of 100, the context will consult its identity map and see it has already created an object to represent the Movie with an ID of 100. The context simply returns a reference to the existing object instead of creating a new Movie entity – meaning the context effectively ignores any changes that have been made in the database.

Some developers will think this is a bug (you mean I can’t get fresh data?), but the context objects are designed to live for a single unit of work - a single business conversation. During this time they also isolate your entities from changes that were persisted in different business conversations. Imagine passing an ExpenseReport for 1000 paperclips through business validation logic and submitting changes – only to have the ExpenseReport contents morph into 1000 voodoo dolls immediately after validation because of some query that was executed implicitly. Practically speaking, many applications will never run into the problem of picking up unintended changes, but this isolated behavior is on by default and the consequences catch more than a few people off guard.

Why Is This Important?

Using LINQ to SQL’s DataContext or the Entity Framework’s ObjectContext for more than one business conversation is usually the wrong thing to do. These classes are designed to support a single unit of work because of the way they track changes and use an identity map. If you keep them around too long and do too much work, then the identity map will grow too big, and the data will become too stale.

For desktop applications it is common to start a unit of work when a dialog opens, and complete the unit of work when the dialog closes (context per view). On the server side it is common to start a unit of work when an HTTP request begins processing, and close the unit of work when the HTTP request ends (context per request).

Of course, not every business conversation can complete in a single request or a single dialog box. Managing “long winded” conversations with EF or L2S is tricky, unfortunately, and it’s part of the debate you’ll see in blogs these days. We’ll have to address this issue in a different post.

XM Radio Player Part III : Choices

Wednesday, December 10, 2008 by scott

WPF, AJAX, or Silverlight RIA?
I gots more options than a block of Velveeta.
XBAP? XCOPY? XAP files are just ZIPs!
Let me run my bits on your silicon chips.

When I started to think of what my XM Radio Player would look like, I could only picture two things:

  • Rectangles with rounded corners.
  • Gradient color fills.

Given this detailed initial vision I knew I had to pick WPF.

What about Silverlight?

I considered Silverlight, but three factors kept me away. One issue would be the cross domain calls to the XM Radio domain. Although XM does put a crossdomain.xml file on their server, it only allows calls from *.xmradio.com, which meant I’d have to deploy a web service just to thunk calls over to XM, which isn’t hard, but added one additional piece of complexity. Secondly, I’ve been up to my ears in Silverlight a few times this year and wanted to try something new. Thirdly, it’s been some time since I’ve written a desktop app and I wanted to have some fun, and also take a look at Prism.

Identity Maps

Monday, December 8, 2008 by scott

There are a couple of important patterns in play when you use a persistence framework. These patterns have been around for quite some time but are relatively new to .NET developers who are jumping into LINQ to SQL and the ADO.NET Entity Framework.

The first of these patterns is the Identity Map pattern. There are various aliases for this pattern. Some frameworks call it the “identity cache” while others have an “entity uniqueing” implementation. Using examples as tests, we can describe its behavior with a passing test (the code is written with the Entity Framework, but you could substitute a LINQ to SQL DataContext (or an LLBLGen Context, or an NHibernate Session) and observe the same behavior – the point being that identity maps can be found in almost every framework for persisting objects).

[TestMethod]
public void Context_Implements_An_Identity_Map()
{
    var context = new MovieReviewsContext();

    var movie1 = context.Movies.Where(m => m.ID == 100).First();
    var movie2 = context.Movies.Where(m => m.ID == 100).First();

    var movie3 = context.Movies.Where(m => m.ID > 98 && m.ID < 105)
                               .ToList()  // bring them all into memory
                               .ElementAt(1);

    Assert.IsTrue(Object.ReferenceEquals(movie1, movie2));
    Assert.IsTrue(Object.ReferenceEquals(movie1, movie3));
}

The test will execute three queries against the database. All three queries will retrieve the movie record with an ID of 100, with the last query also pulling some additional records. No matter how many times we query the database, even using different queries, exactly ONE object represents the movie with an ID of 100. The context uses an identity map to track the entities it creates. The mapping information we provide to the framework allows the context to know which property uniquely identifies each entity (the integer ID property in the case of a Movie).

You can think of the identity map as a Dictionary<int, Movie>. Each time the context sees a resultset from the database, it checks the dictionary to see if it already created an entity for the data. If so, it just returns a reference to the entity it already created. This is different than the behavior of ADO.NET DataSets, where you can issue three queries for Movie.ID == 100 and get back three distinct DataRows. Also, it’s important to know that each DataContext/ObjectContext in LINQ to SQL/Entity Framework maintains its own identity map.

[TestMethod]
public void Each_Context_Has_Distinct_Identity_Map()
{
    var context1 = new MovieReviewsContext();
    var context2 = new MovieReviewsContext();

    var movie1 = context1.Movies.Where(m => m.ID == 100).First();
    var movie2 = context2.Movies.Where(m => m.ID == 100).First();

    Assert.IsFalse(Object.ReferenceEquals(movie1, movie2));
}
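The Dictionary<int, Movie> picture can be turned into a few lines of code. This is a toy stand-in, not how LINQ to SQL or the Entity Framework are implemented: ToyContext and Materialize are invented names, and the "row" is just an id and a title passed in by hand.

```csharp
using System;
using System.Collections.Generic;

class Movie
{
    public int ID;
    public string Title;
}

// A toy stand-in for a context. Each instance owns its own identity map.
class ToyContext
{
    private readonly Dictionary<int, Movie> _identityMap = new Dictionary<int, Movie>();

    // Called for every row that comes back from a (pretend) query.
    public Movie Materialize(int id, string titleFromDatabase)
    {
        Movie existing;
        if (_identityMap.TryGetValue(id, out existing))
        {
            // Already tracked: hand back the same object and ignore the row's values.
            return existing;
        }

        var movie = new Movie { ID = id, Title = titleFromDatabase };
        _identityMap.Add(id, movie);
        return movie;
    }
}

static class IdentityMapDemo
{
    static void Main()
    {
        var context1 = new ToyContext();
        var context2 = new ToyContext();

        Movie first = context1.Materialize(100, "Old Title");
        Movie second = context1.Materialize(100, "Changed By Someone Else");
        Movie other = context2.Materialize(100, "Changed By Someone Else");

        Console.WriteLine(Object.ReferenceEquals(first, second)); // True
        Console.WriteLine(second.Title);                          // Old Title
        Console.WriteLine(Object.ReferenceEquals(first, other));  // False
    }
}
```

The second call returning "Old Title" is the same effect as the stale data described for the real context objects: once an entity is in the map, later rows for that key don't overwrite it.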

Why Is This Important?

Some people I’ve talked to don’t like the behavior of the identity map, or are surprised by the behavior. But, imagine if we could create two objects from the same context and both objects represent the movie with an ID of 100 – and then we start making changes to those objects.  Which object should “win” when we save changes to the database? This is the scenario Fowler analogizes in his definition of the identity map pattern.

An old proverb says that a man with two watches never knows what time it is. If two watches are confusing, you can get in an even bigger mess with loading objects from a database. If you aren't careful you can load the data from the same database record into two different objects. Then, when you update them both you'll have an interesting time writing the changes out to the database correctly.

Identity maps enforce object identity in memory in much the same way that a database enforces row identity in a database table using a primary key value.

Not every application needs the safety net afforded by an identity map implementation, but when one is in place, you need to be aware of its ramifications. We’ll cover these in a future post on another important pattern – the Unit of Work.

XM Radio Player Part II : Scraping

Friday, December 5, 2008 by scott

Just to make sure everything was as easy as it looked in Fiddler – I wrote a quick and dirty piece of throwaway code to see if I could programmatically log in to XM and play a stream of music with Windows Media Player. It was ugly, but …

public void Can_Start_Media_Payer_With_Xm_Stream()
{
    var cookies = new CookieContainer();

    // step 1: get auth cookie
    HttpWebRequest request =
        WebRequest.Create("http://xmro.xmradio.com" +
                          "/xstream/login_servlet.jsp") as HttpWebRequest;
    request.CookieContainer = cookies;
    request.ContentType = "application/x-www-form-urlencoded";
    request.Method = "POST";

    var requestStream = request.GetRequestStream();
    var loginData = Encoding.Default.GetBytes(
        "user_id=*******&pword=******");
    requestStream.Write(loginData, 0, loginData.Length);
    requestStream.Close();

    HttpWebResponse response = request.GetResponse() as HttpWebResponse;
    var data = ReadResponse(response);

    // ...

    // step 4: get player URL for channel 26
    request = WebRequest.Create(
        "http://player.xmradio.com" +
        "/player/2ft/playMedia.jsp?ch=26&speed=high") as HttpWebRequest;
    request.Method = "GET";
    request.CookieContainer = cookies;
    response = request.GetResponse() as HttpWebResponse;
    data = ReadResponse(response);

    Regex regex = new Regex(
        "<param\\s*name=\"FileName\"\\s*value=\"(.*)\"",
        RegexOptions.IgnoreCase);
    string url = regex.Match(data).Groups[1].Value;

    Process.Start("wmplayer.exe", url.ToString());
}

… it worked!

The most difficult piece was digging out the FileName parameter from the playMedia servlet response. The HTML response was almost, but not quite, XHTML compliant. LINQ to XML wasn’t an option, unfortunately, so I used a regular expression. I don’t write regular expressions often enough to build them without a lot of help, which is why I keep a copy of Roy Osherove’s Regulator around. I could paste a copy of the servlet response into Regulator and then iteratively hack at a regular expression syntax until something worked and I could stop cursing. In a couple weeks I’ll have no idea what stuff like \"(.*)\" means anymore, which is both good and bad at the same time.
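For future reference, \"(.*)\" captures everything between a pair of double quotes, and because .* is greedy it runs to the last quote on the line. A slightly more defensive variant is sketched below. It swaps in [^"]* so the capture stops at the first closing quote, which matters if two attributes ever share a line. The markup in the example is invented, not XM's actual response, and ParamScraper is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParamScraper
{
    // [^"]* cannot run past the closing quote of the value attribute,
    // whereas a greedy (.*) extends to the last quote on the same line.
    static readonly Regex FileNameParam = new Regex(
        "<param\\s+name=\"FileName\"\\s+value=\"([^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the value attribute, or null when the markup has no FileName param.
    public static string ExtractFileName(string html)
    {
        Match match = FileNameParam.Match(html);
        return match.Success ? match.Groups[1].Value : null;
    }

    static void Main()
    {
        // Hypothetical markup with two params on one line.
        string html =
            "<object><param name=\"FileName\" value=\"http://example.com/stream.asx\">" +
            "<param name=\"AutoStart\" value=\"true\"></object>";

        Console.WriteLine(ExtractFileName(html));                            // http://example.com/stream.asx
        Console.WriteLine(ExtractFileName("<p>no player here</p>") == null); // True
    }
}
```

Checking match.Success before reading the group also turns a markup change into an obvious null instead of an empty URL handed to the media player.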

I’ve always liked the idea of looking at the high risk areas of a project and writing some throwaway code to see if a solution is technically feasible. Some people call this a “spike” and as I’ve said before – you never want to let spike code into production (but you never want to really throw the code away, either).

Screen Scraping

The regex in the above code is “scraping” data from the HTML. Screen scraping is nearly as old as computers themselves – in fact it’s one of the earliest forms of interoperability. One way to get data from a mainframe is to capture the data from the mainframe’s display output on a terminal – thus the term “screen scraping”. Not that I’ve ever written a terminal scraper, but I’ve seen them in action and I wouldn’t be surprised if 90% of all Fortune 500 companies that have been in business for more than 30 years are still using screen scraping somewhere for enterprise application integration.

Something I have done quite a bit is screen scraping for the web. “Web scraping” (as some call it) has a slight negative connotation these days as it can be used for nefarious purposes – like the spammers who harvest email addresses from web pages. There have also been some legal battles when people screen scrape to repurpose someone else’s content. But generally speaking - screen scraping isn’t inherently bad. In this case I’m only using scraped data to build a custom player and listen to music with my paid subscription.

With the .NET libraries there are really just three key areas to focus on:

  • Properly formulating the request
  • Managing cookies
  • Parsing the response

Properly formulating the request was relatively easy in this scenario, even with the credentials in the POST data. Comparing what your program is sending to the server with what the web browser sends to the server using a tool like HTTP Fiddler is the best way to troubleshoot problems. Some scenarios aren’t as easy. For instance, if you need to programmatically post to an ASP.NET web form you need the proper hidden field values for stuff like viewstate and event validation, which means you need to parse these hidden fields from a previous GET request for the same web form.
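As a rough sketch of the web form case: pull the hidden fields out of the HTML returned by the GET, then URL-encode them into the POST body. The HTML in the example and the class and method names are assumptions for illustration; __VIEWSTATE and __EVENTVALIDATION are the field names ASP.NET web forms render, and a real form would also need its visible fields and the submit button's name added to the body.

```csharp
using System;
using System.Text.RegularExpressions;

static class WebFormPostSketch
{
    // Finds <input ... name="NAME" ... value="VALUE"> and returns VALUE,
    // or an empty string when the field is missing. Assumes the name
    // attribute appears before the value attribute, as in the sample below.
    public static string HiddenField(string html, string name)
    {
        string pattern =
            "<input[^>]*name=\"" + Regex.Escape(name) + "\"[^>]*value=\"([^\"]*)\"";
        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : "";
    }

    // Builds the application/x-www-form-urlencoded body for the hidden fields.
    public static string BuildPostBody(string html)
    {
        string[] names = { "__VIEWSTATE", "__EVENTVALIDATION" };
        var pairs = new string[names.Length];
        for (int i = 0; i < names.Length; i++)
        {
            // Base64 values contain '/', '+' and '=' and must be escaped.
            pairs[i] = names[i] + "=" + Uri.EscapeDataString(HiddenField(html, names[i]));
        }
        return string.Join("&", pairs);
    }

    static void Main()
    {
        // Hypothetical response from the initial GET.
        string html =
            "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEPDw==\" />" +
            "<input type=\"hidden\" name=\"__EVENTVALIDATION\" id=\"__EVENTVALIDATION\" value=\"/wEWAg==\" />";

        Console.WriteLine(BuildPostBody(html));
        // __VIEWSTATE=%2FwEPDw%3D%3D&__EVENTVALIDATION=%2FwEWAg%3D%3D
    }
}
```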

If a web site requires cookies to work properly (as XM does), then you’ll find that managing cookies with the .NET libraries is tricky – but only because it doesn’t happen by default. However, all you need to do is instantiate an instance of the CookieContainer class and attach this same instance to all of your outgoing web requests. .NET will then take care of storing cookies it finds embedded in a server’s response, and transmitting those cookies back to the server on subsequent requests. The framework knows all about cookie domains and secure cookies, so you don’t need to manually finagle HTTP headers at all – just use the container.
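You can watch the container do its bookkeeping without making a single request. In this sketch the cookie is added by hand to stand in for one the framework would store from a server's response; example.com, the cookie name, and the class name are placeholders.

```csharp
using System;
using System.Net;

static class CookieContainerSketch
{
    public static CookieContainer BuildContainer()
    {
        var cookies = new CookieContainer();

        // Stands in for a cookie the framework would store after reading
        // a Set-Cookie header in a response from example.com.
        cookies.Add(new Uri("http://example.com/"), new Cookie("session", "abc123"));
        return cookies;
    }

    static void Main()
    {
        CookieContainer cookies = BuildContainer();

        // What would be sent on a later request to the same site.
        Console.WriteLine(cookies.GetCookieHeader(new Uri("http://example.com/player"))); // session=abc123

        // Nothing is sent to an unrelated domain.
        Console.WriteLine(cookies.GetCookies(new Uri("http://other.example.org/")).Count); // 0
    }
}
```

In the scraping code, the same CookieContainer instance is assigned to request.CookieContainer on every HttpWebRequest, which is all it takes for the login cookie to travel along to the player request.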

Parsing the response is generally where you invest most of your time and feel the most pain. There are managed libraries like the Html Agility Pack that can help, and 1000 other solutions from String.IndexOf, to regular expressions, to XPath queries over XHTML compliant markup. The basic problem is that sites generally don’t design their web pages for machine-to-machine interaction and they can change their markup at any time and break your scraping logic. No matter how robust your scraping logic is – nothing short of a perfect AI can keep your logic from breaking. It’s always better to stick with official interoperability endpoints that use SOAP, REST, or some other form of structured response that is intended for machines. It’s strange that we are almost in the year 2009 and we will still be using strings and regular expressions to interop with most of the web for the foreseeable future.

Next Steps

Knowing that a custom player was possible, the next step was to pick a platform to build on.

Stay tuned…

(c) OdeToCode LLC 2004 - 2025