Herding Code

Monday, July 7, 2008 by scott

herdingcode Herding Code is a podcast about a variety of topics in technology and software development. It’s done roundtable style with myself, Scott Koon, Kevin Dente, and Jon Galloway. The conversations are a blast, and I hope informative, too.

Tune in to the feed here: http://feeds.feedburner.com/HerdingCode

Swimming Upstream Is Hazardous

Thursday, July 3, 2008 by scott

Salmon swim upstream, and look at what happens …

    

salmon

Every developer is familiar with the “work around”. These are the extra bits of extra code we write to overcome limitations in an API, platform, or framework.

But, sometimes those limitations are a feature. The designer of a framework might be guiding you in a specific direction. Take the Silverlight networking APIs as an example. The APIs provide only asynchronous communication options, yet I’ve seen a few people try to block on network operations with code like the following:

AutoResetEvent _event = new AutoResetEvent(false);
WebClient client = new WebClient();
client.DownloadStringCompleted +=
(s, ev) => { _message.Text = ev.Result; _event.Set(); };
client.DownloadStringAsync(new Uri("foo.xml", UriKind.Relative));
_event.WaitOne();

This code results in a deadlock, since the WebClient tries to raise the completed event on the main thread, but the main thread is blocked inside WaitOne and waiting for the completed event to fire. This deadlock is not only fatal to the Silverlight application, but can bring down the web browser, too. Even if this code didn't create a deadlock, do you really want your application to block over a slow network connection?

When you find yourself writing “work around” code, it’s worthwhile to review the situation. Are you really working around a limitation? Or are you working against the intended use of a framework? Working against the framework is rarely a good idea – there can be a lot of hungry bears waiting to catch you in the future.

Pluralsight 2.0

Wednesday, July 2, 2008 by scott
Pluralsight has a new website, and the new site includes some online training options! See Fritz’s post for more details. Be sure to check out one of the newest classes - the LINQ Fundamentals course, too.

Rob's Not So Lazy MVC Storefront

Wednesday, May 21, 2008 by scott

Rob ran into some lazy load problems in his MVC Storefront and later proclaimed:

"…if you set any Enumerable anything as a property, it's Count property will be accessed when you load the parent object. This negates using any deferred loading for any POCOs, period"

Rob thought this was a problem with .NET in general, but I was suspicious. Veeery suspicious. I downloaded Rob's latest bits and found some interesting behavior.

Based on the screen shot of the call stack that Rob posted, it appeared LINQ to SQL was doing some type conversions. If you poke around the classes mentioned in the call stack, you'll eventually wander into a GenerateConvertToType method that uses LCG to build dynamic methods. Just based on the opening conditional logic, I thought Rob might solve his problem by using LazyList<T> for his business object properties, too (whether or nor he'd want to is a different question), so I modified his Category class for a few experiments to see what would really lazy load.

public class Category {

    
// rob's original
    public IList<Product> Products { get; set; }
    
    
// experimental
    public LazyList<Product> ProductsLazy { get; set; }
    
public IQueryable<Product> ProductsQueryable { get; set; }
    
public IEnumerable<Product> ProductsEnumerable { get; set; }

    
// ...

This was in hopes that LINQ to SQL wouldn't feel compelled to do a conversion via List<T>. I just needed to tweak the query to set all four properties.

var result = from c in db.Categories
            
join cn in culturedName on c.CategoryID equals cn.CategoryID
            
let products = from p in GetProducts()
                 
             join cp in db.Categories_Products
                    
            on p.ID equals cp.ProductID
                      
     where cp.CategoryID == c.CategoryID
                
            select p
             select new Category
             {
                 ID = c.CategoryID,
                 Name = cn.CategoryName,
                 ParentID = c.ParentID ?? 0,
                 Products =
new LazyList<Product>(products),
                 ProductsQueryable = products,
                 ProductsEnumerable = products.AsEnumerable(),
                 ProductsLazy =
new LazyList<Product>(products)          
             };
             return result;

This experiment failed in a stunning fashion, because none of the Product properties lazy loaded – they all eagerly populated themselves full of real product objects. Hmmm.

Slight Detour

Watching SQL Profiler, I started to wonder why there were soooo many queries running. Sure, the stuff wasn't lazy loading but the queries were flying by quicker than eggs at a Steve Ballmer talk. Yet, the code that was kicking off the whole process was just looking for a single category:

Category result = _repository.GetCategories()
                             .WithCategoryID(id)
                             .SingleOrDefault();

That problem turned out to be in Rob's WithCategoryID extension method.

public static IEnumerable<Category> WithCategoryID(
    
                                 this IEnumerable<Category> qry, int ID) {

    
return from c in qry
          
where c.ID == ID
          
select c;
}

By taking an IEnumerable<T> parameter, the extension method was forcing the query to execute and then doing all the ID checks using LINQ to Objects. Just switching over to IQueryable<T> made the method a lot more efficient, and the number of queries came down tremendously.

Correlating Problems

Back to the original problem, which was a bit of a mystery because I've been able to lazy load collections using IEnumerable<T> and IQueryable<T>. After some more fiddling, I began to suspect the query itself. The query uses a correlated subquery by virtue of the fact that the range variable c is used inside the query for products (c.CategoryID). I'm guessing that LINQ to SQL felt compelled to take care of all the work in one fell swoop. Instead of using a subquery, I presented LINQ to SQL with a method call that pushed the needed parameter (c.CategoryID) onto the stack, and made things slightly more readable in the process.

       var result = from c in db.Categories
                    
  join cn in culturedName
                       
on c.CategoryID equals cn.CategoryID
                   

                    let
products = GetProducts(c.CategoryID)
                   

                    select
new Category
                    {
                        ID = c.CategoryID,
                        Name = cn.CategoryName,
                        ParentID = c.ParentID ?? 0,
                        Products =
new LazyList<Product>(products),
                        ProductsQueryable = products,
                        ProductsEnumerable = products.AsEnumerable(),
                        ProductsLazy =
new LazyList<Product>(products)                    
                    };
      
return result;

   }

  
public IQueryable<Product> GetProducts(int categoryID)
   {
      
var products = from p in GetProducts()
                      
join cp in db.Categories_Products on p.ID equals cp.ProductID
                      
where cp.CategoryID == categoryID
                      
select p;
      
return products;
   }

And voila! Three of the properties (ProductsQueryable, ProductsEnumerable, ProductsLazy) would lazy load their Products from the database. Only the original IList<Product> property would eagerly fetch data. From what I can decipher in the grungy code, when LINQ to SQL sees it needs to assign to an IList<T>, and it doesn't have an IList<T>, it eagerly loads a new List<T> and copies those elements into the destination. At least, that's my theory.

Knowing what I know now, I could tell Rob to stick with IList<T> as his property type, but to make sure he has IList<T> on both sides of the assignment in his projection (and tuck the product query into a method call). In other words, use the following to create the LazyList<T> - LINQ to SQL won't load up Products during some wierd type conversion:

public class LazyList<T> : IList<T> {

   
public static IList<T> Create(IQueryable<T> query)
    {
    
    return new LazyList<T>(query);
    }

    // ...

Conclusion? Beware of mismatched types, particularly with IList<T>, and watch out for eager execution with correlated subqueries.

Visual Designers Don’t Scale

Tuesday, May 20, 2008 by scott

Microsoft has a long history of being visual. They've made quite a bit of money implementing graphical user interfaces everywhere – from operating system products to database servers, and of course, developer products. What would Visual Studio be if it wasn't visual?

And oh how visual it is! Visual Studio includes a potpourri of visualization tools. There are class diagrams, form designers, data designers, server explorers, schema designers, and more. I want to classify all these visual tools into one of two categories. The first category includes all the visual tools that build user interfaces – the WinForms and WebForms designers, for instance. The second category includes everything else.

Visual tools that fall into the first category, the UI builders, are special because they never need to scale. Nobody is building a Windows app for 5,000 x 5,000 pixel screens. Nobody is building web forms with 5,000 textbox controls. At least I hope not. You can get a pretty good sense of when you are going to overwhelm a user just by looking at the designer screen.

Visual tools that fall into the second category have to cover a wide range of scenarios, and they need to scale. I stumbled across an 8-year-old technical report today entitled "Visual Scalability". The report defines visual scalability as the "capability of visualization tools to display large data sets". Although this report has demographics data in mind, you can also think of large data sets as databases with a large number of tables, or libraries with a large number of classes - these are the datasets that Visual Studio works with, and as the datasets grow, the tools fall down.

Here is an excerpt of a screenshot for an Analysis Services project I had to work with recently:

Here is an excerpt of an Entity Data model screenshot I fiddled with for a medical database:

These are just two samples where the visual tools don't scale and inflict pain. They are difficult to navigate, and impossible to search. The layout algorithms don't function well on these large datasets, and number of mouse clicks required to make simple changes is astronomical. The best you can do is jump into the gnarly XML that hides behind the visual representation.

I'm wondering if the future will see a reversal in the number of visual tools trying to enter our development workflow. Perhaps textual representations, like DSLs in IronRuby, will be the trick.

The Power of Programming With Attributes

Wednesday, May 14, 2008 by scott

Nothing can compare to the Real Power of programming with attributes. Why, just one pair of square brackets and woosh – my object can be serialize to XML. Woosh – my object can persist to a database table. Woosh – there goes my object over the wire in a digitally signed SOAP payload. One day I expect to see a new item template in Visual Studio – the "Add New All Powerful Attributed Class" template: *

[Table]    
[
DataObject]
[
DataContract]    
[
Serializable]
[
TwoKitchenSinks]      
[
CLSCompliant(true)]        
[
DefaultProperty("Name")]
[
DefaultBindingProperty("Name")]
[
DebuggerStepThroughAttribute]
[
GuidAttribute("F0DD2CAA-2132-11DD-AC50-FE9355D89593")]
public class Person
{
    [
Column]        
    [
DataMember]        
    [
XmlAttribute]
    [
Browsable(true)]
    [
ReadOnly(false)]
    [
Category("Advanced")]
    [
Description("The person's name")]        
    
public string Name { get; set; }

    
// TODO: YOUR INSIGNIFIGANT BIZ LOGIC GOES HERE...
}

Which begs the question – could there ever be a way to separate attributes from the class definition?**

* Put down the flamethrower and step away - I'm kidding.

**This part was a serious question.

Two LINQ to SQL Myths

Monday, May 12, 2008 by scott

LINQ to SQL requires you to start with a database schema.

Not true – you can start with code and create mappings later. In fact, you can write plain-old CLR object like this:

class Movie
{
    
public int ID { get; set; }
    
public string Title { get; set; }
    
public DateTime ReleaseDate { get; set; }
}

… and later either create a mapping file (full of XML like <Table> and <Column>), or decorate the class with mapping attributes (like [Table] and [Column]). You can even use the mapping to create a fresh database schema via the CreateDatabase method of the DataContext class.

LINQ to SQL requires your classes to implement INotifyPropertyChanged and use EntitySet<T> for any associated collections.

Not true, although foregoing either does come with a price. INotifyPropertyChanged allows LINQ to SQL to track changes on your objects. If you don't implement this interface LINQ to SQL can still discover changes for update scenarios, but will take snapshots of all objects, which isn't free. Likewise, EntitySet provides deferred loading and association management for one-to-one and one-to-many relationships between entities. You can build this yourself, but with EntitySet being built on top of IList<T>, you'll probably be recreating the same wheel. There is nothing about EntitySet<T> that ties the class to LINQ to SQL (other than living inside the System.Data.Linq namespace).

LINQ to SQL has limitations and it's a v1 product, but don't think of LINQ to SQL as strictly a drag and drop technology.

My Pluralsight Courses

K.Scott Allen OdeToCode by K. Scott Allen
What JavaScript Developers Should Know About ECMAScript 2015
The Podcast!