Downloads, Downloads Everywhere

Friday, June 30, 2006 by scott

A real programmer on the Win-OT list has been waxing eloquent about apt-get. He does have a point.

apt-get is one of the command line tools that can administer software packages on Linux distros. apt-get manages all the gnarled dependencies and makes installing new software on Linux mostly a breeze. Throw on one of the GUI front ends for apt-get, and new software is only a few clicks away. Want to program in Ruby? Click - click. Want a web server? Click - click. It's cool (and by cool I mean totally sweet).

The equivalent of apt-get on a Windows desktop is, I guess, a combination of "Add / Remove Programs" and "Windows Update". Sorry … "Microsoft Update". I have both of these icons on my Start menu and I forget what the difference is, really, but the Microsoft Update icon is the prettier of the two, so it must be better. I'm shallow that way.

"Add / Remove Programs" doesn't really add any software to a machine, unless the software is an operating system component and you have the CD. I mean, you really have the CD close enough to insert into the CD drive. Heaven help you if it's 1 AM in a hotel room in Dog Lick, Kentucky and you need to install MSMQ onto your genuine, activated copy of Windows. Thank goodness for ISO images.

I've never seen anything interesting appear on Microsoft Update, like the Consolas font or Power Toys. On Tuesday, however, two of my machines installed a critical update to remove unacceptable symbols from a font. I thought I'd sleep better that night knowing these foul symbols were gone, but I had strange dreams about super-intelligent aliens covertly landing on Earth, graduating from the finest law schools, and then enslaving the planet using a loophole in international maritime laws. Related? Yes.

It's painful to move to a new machine that doesn't have the software I need. Search Microsoft.com, page through results, download, save, open, click-click-click-click, and install. Why aren't the pastel sunsets of the Nature Theme available through Microsoft Update? Why doesn't the Web Application Project add-on appear? Will I forever need to download and install PowerShell manually?

I've been kicking around the idea of building an app or script to download and install the common software I use, but some of this software requires registration, and other software requires passing the Windows Genuine Advantage test. It might be an interesting challenge. If Microsoft provided a tool that could pick from the smorgasbord of add-ons, utilities, and applications spread across the Microsoft.com domain, it would be cool (and by cool, I mean totally sweet).

Downloads, downloads every where, 
    and I can't find the links;
Downloads, downloads every where,
    I just want machines in synch.

Apologies to Samuel T.

Page Scraping

Thursday, June 29, 2006 by scott

Q: I want to programmatically retrieve a web page and parse out some information inside. What's the best approach?

A: For fetching the contents of a page, the simplest class to use is the System.Net.WebClient class. For more advanced scenarios that require cookies and simulated postbacks to the server, chances are you'll have to graduate to the System.Net.WebRequest and WebResponse classes. You'll find a lot of material on the web that demonstrates how to use these classes.
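For the simple case, a minimal sketch with WebClient might look like the following. The PageFetcher class and its Fetch method are names made up for this illustration; DownloadString is the WebClient method that returns the response body as a string.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration; not part of any library.
static class PageFetcher
{
    // Downloads the resource at the given URL and returns its body as a string.
    public static string Fetch(string url)
    {
        // WebClient is disposable, so release it when the download finishes.
        using (WebClient client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }
}

class FetchDemo
{
    static void Main()
    {
        string html = PageFetcher.Fetch("https://odetocode.com");
        Console.WriteLine("Fetched {0} characters", html.Length);
    }
}
```

Once cookies or posted form data enter the picture, this one-liner style stops being enough, which is where WebRequest and WebResponse come in.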

If you have to pull specific information out of a page, then the "best approach" will depend on the complexity of the page and nuances of the data. Once you have the contents of a page in a string variable, a few IndexOf() and Substring() method calls might be enough to parse out the data you need.
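As a rough sketch of the IndexOf() and Substring() approach, the helper below pulls out the text between two markers. The PageText class and the Between method are hypothetical names for this example, and the HTML is a hard-coded sample rather than a real download.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
static class PageText
{
    // Returns the text between the first startTag and the next endTag,
    // or null when either marker is missing.
    public static string Between(string html, string startTag, string endTag)
    {
        int start = html.IndexOf(startTag, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;

        // Skip past the opening marker so the result excludes it.
        start += startTag.Length;

        int end = html.IndexOf(endTag, start, StringComparison.OrdinalIgnoreCase);
        if (end < 0) return null;

        return html.Substring(start, end - start);
    }
}

class BetweenDemo
{
    static void Main()
    {
        string html = "<html><head><title>OdeToCode</title></head><body></body></html>";
        Console.WriteLine(PageText.Between(html, "<title>", "</title>")); // prints "OdeToCode"
    }
}
```

This holds up only while the markers appear exactly as expected. A title tag with attributes, for instance, would slip past it.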

Many people use the Regex class to find data inside of HTML. I'm not a fan of this approach, though. There are so many edge cases to contend with in HTML that the regular expressions grow hideously complex, and the regular expression language is notorious for being a "write-only" language.

My usual approach is to transform the web page into an object model. This sounds complicated, but not if someone else does all the heavy lifting. Two pieces of software that can help are the SgmlReader on GotDotNet, and Simon Mourier's Html Agility Pack. The agility pack is still a .NET 1.1 project, but I have it running under 2.0 with only minor changes (I just needed to remove some conditional debugging attributes). With these libraries, it is easy to walk through the page like an XML document, perform XSL transformations, or find data using XPath expressions (which to me are a lot more readable than regular expressions).

Here is a little snippet of code that uses the agility pack and will dump the text inside all of the links (<a> anchor tags) on OTC's front page.

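// Requires the System, System.Net, and HtmlAgilityPack namespaces.
WebRequest request = WebRequest.Create("https://odetocode.com");
using (WebResponse response = request.GetResponse())
{
    // Parse the response stream into an HTML document object model.
    HtmlDocument document = new HtmlDocument();
    document.Load(response.GetResponseStream());

    // The XPath expression "//a" selects every anchor element in the page.
    // SelectNodes returns null (not an empty list) when nothing matches.
    HtmlNodeCollection links = document.DocumentNode.SelectNodes("//a");
    if (links != null)
    {
        foreach (HtmlNode node in links)
        {
            Console.WriteLine(node.InnerText);
        }
    }
}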

The "Scraping" term in the title of this post comes from "screen scraping", which is a term almost as old as IT itself.

Importance: High

Wednesday, June 28, 2006 by scott

It's a quiet day.

Too quiet - you think to yourself.

Then it happens….

A message arrives in your Inbox, but not just any message.

This message is adorned with the Red Exclamation of Importance!

You can feel the adrenaline in your fingers as they snap the mouse into position. A breathless moment during the double-click, and the message is now in front of your eyes:

Hi everyone! Just a little reminder to get those TPS status reports in by next week. Also, we've changed the passcode for next month's conference call – it is now 6661212.
Thanks! :)

Oh. It's from that person. Every mail that person sends carries the Red Exclamation of Importance.

Have you ever wondered what that person would do in the event of a real emergency?

Themes For ASP.NET User Controls

Tuesday, June 27, 2006 by scott

Q: I can't apply a Theme attribute to my user control. How am I supposed to theme and skin my user controls?

A: User Controls don't have a theme property, but when you assign a theme to a page, the theme does influence all of the controls in the page. A user control uses the theme of its page.

If the theme includes CSS files, then ASP.NET will link to the CSS files. Any styles inside the CSS that are applicable to classes, names, and markup in the user control will style the user control.

If the theme includes skins, then the skins will skin the appropriate controls inside of user controls. For instance, if you have defined a default skin for Calendar controls, then any Calendar controls inside your user control will have that skin applied.

It is also possible to apply a skin directly to the user control; we just need to know a couple of tricks. First, let's define a simple user control: MyWidget. MyWidget just renders a Message property.

<%@ Control Language="C#" CodeFile="MyWidget.ascx.cs" Inherits="MyWidget" %>
<h1><%= Message %></h1>

The CodeFile for the control declares a Message property, but more importantly, adds a Themeable attribute to the control.

using System;
using System.Web.UI;

[Themeable(true)]
public partial class MyWidget : System.Web.UI.UserControl
{
    private string _message;

    public string Message
    {
        get { return _message; }
        set { _message = value; }
    }
}

Finally, we just need a .skin file. We need to @ Register our control, and then write the skin just like we'd write any other skin.

<%@ Register Src="~/MyWidget.ascx" TagName="MyWidget" TagPrefix="otc" %>

<otc:MyWidget Message="Sahil Looks Funny" runat="server" />

Now, anytime we drop MyWidget on a page, the skin file will set a default message (assuming the control's page has a theme selected).

URL-Rewriting breaks my HREFs

Monday, June 26, 2006 by scott

Q: I've been using the RewritePath method of the HttpContext class to rewrite URLs since ASP.NET 1.1. Everything was working until I moved my CSS files inside an ASP.NET 2.0 Theme directory. ASP.NET automatically writes out link tags for my stylesheets, but the href inside is broken. How do I fix this problem?

A: When rewriting URLs, don't use the RewritePath method that only accepts a path parameter; use the new overload that accepts a path parameter and a rebaseClientPath parameter. Pass a value of false for rebaseClientPath and the auto-magic injection of stylesheets will work.

Full explanation: The href in the stylesheet link provided by ASP.NET is a relative URL computed by Control.ResolveClientUrl. ResolveClientUrl is computing a path relative to the rewritten URL, not the original URL. Unfortunately, the relative URL won't work in the browser, because the browser is using the original URL.

By default, RewritePath rewrites a private field in the HttpRequest class by the name of _clientBaseDir. This field is used by Control.ResolveClientUrl to compute the relative URL. Passing false in the overloaded version of RewritePath tells the method to not rewrite this field. ResolveClientUrl will then use the original URL (the same URL the browser is using) when calculating the relative URL, so everything works, and harmony remains.
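Here is a minimal sketch of what the call might look like in a Global.asax code-behind. The friendly URL and the target page are made-up examples; the part that matters is the second argument to RewritePath.

```csharp
using System;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        HttpContext context = HttpContext.Current;
        string path = context.Request.Path;

        // Hypothetical mapping: serve a friendly URL from a real page.
        if (path.EndsWith("/articles/latest.aspx", StringComparison.OrdinalIgnoreCase))
        {
            // Passing false for rebaseClientPath leaves the client base
            // directory alone, so relative URLs such as the theme's
            // stylesheet links are still computed from the URL the
            // browser actually requested.
            context.RewritePath("~/article.aspx", false);
        }
    }
}
```

The same two-parameter call should work from an HttpModule that handles BeginRequest, if that is where the rewriting logic lives.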

Software Librarian

Friday, June 23, 2006 by scott

One of the nice benefits of an MSDN subscription is having a boatload of software for development, test, and experimentation. The drawback is the boatload of discs that accumulate during the year.

Consonica Software has a product they call the “Software Librarian for MSDN”. After a quick install, you tell the software what type of subscription you have, and what products you use, and then it tells you which discs to feed into the computer. Some time later, everything you use is copied to a hard drive. The monthly updates are a breeze, and no more fumbling around for discs. Watch the video demo.

Disclaimer: Consonica sent me a copy of their software to use. I can honestly say I've found it works very well and saves time. In a partner setting, you typically have 5 MSDN licenses for every 1 set of physical media. In theory, this works well, but in practice, it works like this:

0:00 Decide to install SQL Server 2005
0:01 Start to rummage through 50 DVDs in the MSDN storage case
0:15 Check the DVD drive on dev machine #1
0:16 Check the DVD drive on dev laptop
0:17 Check the DVD drive on dev machine #2
0:18 Start an email rant about DVDs missing from the case
0:19 Remember you sent the same email last month
0:20 Copy last month’s rant into current email, press Send
0:23 Start SQL 2005 download from MSDN subscribers page
1:23 Download completes
1:23 Someone finds the SQL 2005 DVD in the break room

Check out Software Librarian; it does save some time.

The Base Activity Library in Windows Workflow

Thursday, June 22, 2006 by scott

The latest OdeToCode article covers the base activity library in Windows Workflow. We look at the activities to communicate with local and remote services, evaluate rules, listen for events, and every other activity that ships in the WF toolbox.

Read it here: Windows Workflow: The Base Activity Library

As always, I appreciate any feedback you can offer.