Posts for June 2008

Parsing XML with PowerShell

I'm addicted to PowerShell. This cool scripting environment is simple to use, and with very few lines of script; it is possible to accomplish tasks that otherwise often would be a lot of tedious work. (If we didn't have PowerShell, I would propably wip up a C# program to do the same, but PowerShell is really lightweight, is interactive and is generally very forgiving for small tasks where you just "want the job done".

As an example, today I needed to look at a log files generated by Visual Studio to figure out why the environment wouldn't start on my home PC. As it turns out, these log files are actually XML files. Of course I could have just started reading through the XML, but all the angle brackets confuses my brain; when I'm actually mostly interested in the text content of the log file.

So, five minutes later, this 3-line script; parse-vslog.ps1 was born:

1: param( [string]$file = $(throw "required parameter" ) )
2: $log = [xml](get-content $file)
3: $log.activity.entry | select record,type,description | format-table -wrap -auto

This is what happens in the script:

On line 1, we declare that we need a $file parameter (variables and parameters is prefixed with $ in PowerShell), that should be required.

On line 2 we use the get-content cmdlet to get the contents of a file. PowerShell has a lot of XML helping features, one of which is the ability to "cast" the content to XML using the [xml] construct. What really happens behind the scenes, is that PowerShell instantiates an XmlDocument and loads the text content of the file in that.

Last, on line 3, we take advantage of the fact that PowerShell let's us select XML nodes by using simple dotted notation. Here we are interested in all the the /activity/entry nodes. We pass the result along the pipeline and selects the 3 most important values using the select cmdlet. And, lastly, we format the output nicely with format-table, specifying that we would like the cmdlet to auto-select the column widths (-auto) and that text output should be wrapped on multiple lines (-wrap).

So insted of having to look at XML that goes on like this:

1: xml-stylesheet type="text/xsl" href="ActivityLog.xsl"?>
2: activity>
3:   entry>
4:     record>1record>
5:     time>2008/06/15 15:44:18.220time>
6:     type>Informationtype>
7:     source>Microsoft Visual Studiosource>
8:     description>Visual Studio Version: 9.0.21022.8description>
9:   entry>
10:   entry>
11:     record>2record>
12:     time>2008/06/15 15:44:18.221time>
13:     type>Informationtype>
14:     source>Microsoft Visual Studiosource>
15:     description>Running in User Groups: Administrators Usersdescription>
16:   entry>
17:   entry>
18:     record>3record>
19:     time>2008/06/15 15:44:18.221time>
20:     type>Informationtype>
21:     source>Microsoft Visual Studiosource>
22:     description>ProductID: 91904-270-0003722-60402description>
23:   entry>
24:   entry>
25:     record>19record>
26:     time>2008/06/15 15:44:19.094time>
27:     type>type>
28:     source>Microsoft Visual Studiosource>
29:     description>Destroying Main Windowdescription>
30:   entry>
31: activity>
32:  

Now, I can get this much nicer output in the console (note that the XML above has been shortened for the blog. It was actually around 150 lines):

record type        description
------ ----        -----------
1      Information Visual Studio Version: 9.0.21022.8
2      Information Running in User Groups: Administrators Users
3      Information ProductID: 91904-270-0003722-60402
4      Information Available Drive Space: C:\ drive has 42128211968 bytes; D:\ drive has 38531145728 bytes; E:\ drive h
                   as 127050969088 bytes; F:\ drive has 117087354880 bytes
5      Information Internet Explorer Version: 7.0.6001.18063
6      Information Microsoft Data Access Version: 6.0.6001.18000
7      Information .NET Framework Version: 2.0.50727.1434
8      Information MSXML Version: 6.20.1076.0
9      Information Loading UI library
10     Information Entering function CVsPackageInfo::HrInstantiatePackage
11     Information Begin package load [Visual Studio Source Control Integration Package]
12     Information Entering function CVsPackageInfo::HrInstantiatePackage
13     Information Begin package load [team foundation server provider stub package]
14     Information End package load [team foundation server provider stub package]
15     Information End package load [Visual Studio Source Control Integration Package]
16     Information Entering function VBDispatch::GetTypeLib
17     Information Entering function LoadDTETypeLib
18     Error       Leaving function LoadDTETypeLib
19                 Destroying Main Window
 

I think this is a good representative of the strength of PowerShell. Using only a few lines of script and a minimum of time, I created a reusable script, that will probaply save a lot of time in the future.


ReSharper 4 Available

The good folks over at Jetbrains has finally released version 4 of their ReSharper productivity enhancing tool with support for C#3.0.

Highly recommended.


Clever use of C# 3.0 LINQ Expressions

Jafar Husain shows us a quite clever way to use a C# 3.0 LINQ expression to get a symbol name. I think this is a really good idea to use in cases when you need the name of a symbol as a string, since it avoids using hard-coded strings and gives us the option to use automatic refactorings without breaking stuff. And since it is implemented as an extension method; it does not pollute the interface of your classes. There might be a slight performance penalty when using this method; but I think it will be neglible for all but some extreme cases. But if you need to use it in a tight loop, you should propably make some performance measurements in advance ;-)

Now, if only C# 3.0 had been available a couple of years ago, when I wrote a lot of statically typed datasets (automatically generated from the database schema using CodeSmith, of course). The bulk of the code in those datasets where properties that would generally retrieve a value from a DataRow in a column named the same as the property. If I could have used the approach mentioned above, I am sure it would have saved me some sweat during later refactorings of the code in question.

It is a Good Thing our tools and languages continually evolves.