The Monad Experiment

Friday, July 8, 2005

When I saw Adam Barr’s “Monad and RSS” posts (Part I, II, III), I thought of something I’ve wanted to have recently: a utility to parse through my OPML file and tell me what feeds were dead or not updated for months.

Monad is the up-and-coming command-line shell for Windows. Adam’s samples made XML in Monad look easy, so I thought I could bang out a quick script…

… then I discovered the pain of umpteen different syndication formats, each with its own quirky implementations. I am now amazed there are any working feed readers in existence. Why does simple stuff always get so difficult?

Still, I have something that works “okish”. The script tells me about dead and deserted feeds. There are some quirks. For instance, Ted Neward’s <pubDate> element refuses to parse into a DateTime in Monad. I blame it on the JSP extension in Ted’s URL.
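Why that one <pubDate> fails is not established here. As a minimal sketch of one way to keep a single unparseable date from raising an exception, the cast could be swapped for a TryParse call. This assumes a current PowerShell release (Monad's successor) rather than the 2005 beta, and ConvertTo-FeedDate is a hypothetical helper name, not part of the script below.

```powershell
# Hypothetical helper (not in the original script): returns a DateTime,
# or $null when the text cannot be parsed, instead of throwing.
function ConvertTo-FeedDate([string]$text)
{
    $parsed  = [DateTime]::MinValue
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $styles  = [System.Globalization.DateTimeStyles]::None

    # TryParse reports failure through its return value and fills $parsed on success
    if ([DateTime]::TryParse($text, $culture, $styles, [ref]$parsed))
    {
        return $parsed
    }
    return $null
}

ConvertTo-FeedDate 'Fri, 08 Jul 2005 10:00:00 GMT'   # parses to a DateTime
ConvertTo-FeedDate 'not a date'                      # yields $null
```

With a helper like this, a feed whose dates do not parse would land in the "Did not parse date" branch rather than the catch-all trap.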

The exception handling took some getting used to, but it is great to have in a script. On the other hand, I kept turning things into strings when I didn’t want strings. I stumbled into the right format to use for namespace-qualified elements, but I do like the way it works. I didn’t want to throw XmlNamespaceManager mumbo jumbo into script.
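For contrast, here is roughly what the XmlNamespaceManager route looks like. This is only a sketch under assumptions: it targets a current PowerShell release (Monad's successor) rather than the 2005 beta, and the inline sample feed is made up for illustration. The Dublin Core namespace URI is the standard one.

```powershell
# Made-up sample feed, for illustration only
$doc = [xml]@'
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item><title>Older</title><dc:date>2005-06-01T12:00:00Z</dc:date></item>
    <item><title>Newer</title><dc:date>2005-07-01T12:00:00Z</dc:date></item>
  </channel>
</rss>
'@

# Register the dc prefix so the XPath query can resolve it
$ns = New-Object System.Xml.XmlNamespaceManager($doc.NameTable)
$ns.AddNamespace('dc', 'http://purl.org/dc/elements/1.1/')

# Select every <dc:date>, convert to DateTime, and keep the most recent
$latest = $doc.SelectNodes('//item/dc:date', $ns) |
          ForEach-Object { [DateTime]$_.InnerText } |
          Sort-Object -Descending |
          Select-Object -First 1

$latest
```

It works, but the prefix registration is exactly the ceremony the script avoids by naming the element directly.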

I’d like to see someone who actually knows Monad take the script and turn it into one of those 5 line masterpieces full of pipe symbols and regular expressions.

$opmldoc = [xml]$(get-content $args[0])
$webclient = new-object System.Net.WebClient
$cutoff = [DateTime]::get_Now().AddDays(-30)

foreach($feed in $opmldoc.opml.body.outline)
{
  $date = $null
  $doc = $null

  trap [System.Net.WebException]
  {
    write-host "Web error fetching feed for " $feed.title
    write-host " Error: " $_.Exception.Status
    continue
  }

  trap [System.Exception]
  {
    write-host "Choked on: " $feed.title
    continue
  }

  #because of goofy leading chars in msdn feed
  $raw = $webclient.DownloadString($feed.xmlUrl)
  if($raw -ne $null)
  {
    $doc = [xml]$raw.SubString($raw.IndexOf("<"))
  }

  #see if this is rss
  if($doc.rss -ne $null)
  {
    if($doc.rss.channel.item[0].pubDate -ne $null)
    {
      # uses <pubDate>
      # sort items by date and pick the most recent
      $date = [DateTime](
                $doc.rss.channel.item |
                sort-object @{ e = { [DateTime]$_.pubDate }; asc=$false}
               )[0].pubDate
    }

    if($doc.rss.channel.item[0].{dc:date} -ne $null)
    {
      # uses <dc:date>
      $date = [DateTime](
                $doc.rss.channel.item |
                sort-object @{ e = { [DateTime]$_.{dc:date} }; asc=$false}
               )[0].{dc:date}
    }

    # if we still don't have a date, try <lastBuildDate>
    if($date -eq $null)
    {
      $date = [DateTime]$doc.rss.channel.lastBuildDate
    }
  }

  # check for RDF
  elseif($doc.{rdf:RDF} -ne $null)
  {
    $date = [DateTime](
               $doc.{rdf:RDF}.item |
               sort-object @{ e = { [DateTime]$_.{dc:date} }; asc=$false}
              )[0].{dc:date}
  }

  # check for ATOM
  elseif($doc.feed -ne $null)
  {
    $date = [DateTime](
               $doc.feed.entry |
               sort-object @{ e = { [DateTime]$_.issued }; asc=$false}
              )[0].issued
  }

  if($date -eq $null)
  {
    write-host "Did not parse date from " $feed.title
  }
  elseif($date -lt $cutoff)
  {
    write-host "Stale feed alert!! : " $feed.title
  }
}


Comments
Adam Barr Saturday, July 9, 2005
A few people on the Monad team looked at the script and the consensus was there wasn't a whole lot of magic that could be done, except perhaps use a switch statement.

But you may be interested in my latest installment of "Monad and RSS" where I talk about handling different syndication formats:

www.proudlyserving.com/.../monad_and_rss_p_2.html

- adam
Thomas Lee Wednesday, August 3, 2005
One small suggestion I'd offer. The statement:

if($doc.rss -ne $null)

could be written simply as

if ($doc.rss)

by K. Scott Allen