Dot Net Thoughts

January 19, 2008

Serializing with WCF

Filed under: csharp,Uncategorized — dotnetthoughts @ 1:23 pm

First of all, I want to apologize for the amount of time it has taken me to put up a new blog post. I typically try to write once a week, but my life has been crazy as of late. The outcome of all of the madness is that I will be packing up my family and moving from Portland, Oregon to Denver, Colorado at the end of this month. (My wife has been accepted into a Master’s degree program there!) Because of this, posts for the next couple of months may be sporadic as well. (I’m looking for work in the Denver area. If anybody has any leads, let me know… )

In celebration of my move, I thought that I would put together an example that demonstrates two different ways that WCF serializes objects. (This posting will simply show different ways the engine can work. I hope to actually wire it into the WCF pipeline in a future post.) Since I spent last night packing all of my books into boxes, I thought that maybe I’d find a way to organize them. The goal of this application is to create a manifest list that allows me to quickly find any book that I’ve packed away in a moving box.

As always, the complete code is at the end of the blog post. There is a little more code than usual in this post. I’ll try to show the code directly related to the topic within the text, but I would recommend copying the complete listing into Visual Studio if you want to follow along.

Since my main objective is to manage books and boxes, I’m going to begin by creating a very basic object for each. My Book object will contain two properties (Title and Author), and my Box object will contain only an Id. Since we will be using the WCF engine to do our serialization, we will need to mark the classes and properties we wish to expose to the serialization engine with the [DataContract] and [DataMember] attributes.

 
    #region Book Class            

    [DataContract] 
    public class Book 
    { 
        [DataMember] 
        public string Title { get; set; }            

        [DataMember] 
        public string Author { get; set; } 
    } 
    #endregion            

    #region Box Class 
    [DataContract] 
    public class Box 
    { 
        [DataMember] 
        public int Id { get; set; } 
    } 
   #endregion 

I now need a method to associate books with boxes. In order to keep my Book and Box object as loosely coupled as possible, I will add a Manifest object to the project which will be responsible for maintaining the relationship between the two. The Manifest object will expose three properties. The Books property will contain a collection of all of the Books on my bookshelf. The Boxes property will contain all of the Boxes I’m using to move. The LineItems property will contain information as to which book is in which box.

 
        //Property inside of the Manifest Class. 
        private List<Box> _boxes = new List<Box>(); 
        [DataMember] 
        public List<Box> Boxes 
        { 
            get 
            { 
                return _boxes; 
            } 
            private set 
            { 
                _boxes = value; 
            } 
        } 
        //LineItems and Books here… 

Since we are using WCF, the serializer will require both a getter and a setter for every property it serializes. I don’t really want to give consumers of my class the ability to blow away my boxes collection, so I have explicitly created a private setter. This allows WCF the access it needs, while denying direct access to other consumers.

LineItems are added to the collection by calling the AddLineItem method. This method adds the book to the book collection, adds the box to the box collection, and creates a relationship between the two by adding a new LineItem to the LineItems collection. It is important to note that the association created in the LineItems collection does not contain new instances of books and boxes. Instead, it contains references to the master copies in the Books and Boxes collections. (In other words, the association is by reference.)

 
        public void AddLineItem(Book book, Box box) 
        { 
            Books.Add(book); 
            Boxes.Add(box); 
            LineItems.Add(new LineItem(book, box)); 
        } 

I’ve overridden the ToString on the manifest object to display the complete list of books and boxes that it contains. There is also a CreateShippingManifest method which will be used to create a generic manifest of three books contained in two boxes. (See code below.)

To test our code, we’ll create our sample manifest and view the results.

 
      Manifest manifest = CreateShippingManifest(); 
      Console.WriteLine(manifest.ToString()); 

The results which come back are as follows:

 
        Box 1   A Brief History of Time 
        Box 1   Guards Guards 
        Box 2   The Reptile Room 

Now, let’s assume that I made a mistake when entering a box label into the program. Instead of box 1, I meant the Id to be box 100. We’ll make the change to the box Id, and the manifest will automatically pick up the change.

 
        Manifest manifest = CreateShippingManifest(); 
        manifest.Boxes[0].Id = 100; 
        Console.WriteLine(manifest.ToString()); 

Results:

 
        Box 100 A Brief History of Time 
        Box 100 Guards Guards 
        Box 2   The Reptile Room

Now, let’s add some code to serialize and deserialize our manifest. This is actually a fairly simple process. The first method will serialize the data to disk (as Xml). The second method will read the file and deserialize it into a new object.

 
      public void Serialize(string fileName) 
      { 
        DataContractSerializer ds = new DataContractSerializer(this.GetType());            

         using (Stream stream = File.Create(fileName)) 
         { 
             ds.WriteObject(stream, this); 
         } 
      }   

      public void Deserialize(string fileName) 
      { 
          using (Stream stream = File.OpenRead(fileName)) 
          { 
              DataContractSerializer ds = new DataContractSerializer(this.GetType()); 
              Manifest manifest = (Manifest)ds.ReadObject(stream); 
              this.Books = manifest.Books; 
              this.Boxes = manifest.Boxes; 
              this.LineItems = manifest.LineItems; 
          } 
      } 

We can now run code similar to what we ran above. This time, though, let’s serialize and deserialize our objects between steps.

 
      static void Main(string[] args) 
      { 
          string serializationFile = @"c:\tempmw\Manifest.xml";            

          Manifest manifest = CreateShippingManifest(); 
          manifest.Serialize(serializationFile);            

          Manifest manifest2 = new Manifest(); 
          manifest2.Deserialize(serializationFile);            

          Console.WriteLine("Object after deserialization"); 
          Console.WriteLine(manifest2.ToString());            

          manifest2.Boxes[0].Id = 100; 
          Console.WriteLine(manifest2.ToString());            

          Console.ReadLine(); 
      } 

This yields the results:

 
      Object after deserialization 
      Box 1   A Brief History of Time 
      Box 1   Guards Guards 
      Box 2   The Reptile Room            

      Box 1   A Brief History of Time 
      Box 1   Guards Guards 
      Box 2   The Reptile Room 

But wait a minute! What happened here? We clearly changed the Id of the first box to be 100, yet the results still state that our box is Box 1! The code looks very similar to what we used in the non-serialized objects. The reason that this happens is because of the way the default instance of the DataContractSerializer writes out the results. If you take a look at the Xml file that is created from our serializer, you’ll see results similar to the following:

 
<Books> 
  <Book> 
    <Author>Stephen Hawking</Author> 
    <Title>A Brief History of Time</Title> 
  </Book> 
  … 
</Books> 
<Boxes> 
  <Box> 
    <Id>1</Id> 
  </Box> 
  … 
</Boxes> 
<LineItems> 
  <Manifest.LineItem> 
    <Book> 
      <Author>Stephen Hawking</Author> 
      <Title>A Brief History of Time</Title> 
    </Book> 
    <Box> 
      <Id>1</Id> 
    </Box> 
  </Manifest.LineItem> 
  … 
</LineItems>  

The Xml does not contain the associations between the manifest items and the books and boxes that we set up so carefully in our code. Changing the box Id in the master collection no longer changes the box id in all of the children.

We can very easily fix this by updating the DataContractSerializer instantiation in our Serialize method. The important parameters in the new constructor are the third and fifth. The third parameter (maxItemsInObjectGraph) sets the maximum number of objects that the serializer will read or write in a single object graph. If that number is exceeded, an error will be raised. The fifth parameter (preserveObjectReferences) indicates that the associations between objects should be preserved.

 
        public void Serialize(string fileName) 
        { 
            //DataContractSerializer ds = new DataContractSerializer(this.GetType());            

            DataContractSerializer ds = new DataContractSerializer(this.GetType(), null, 100, false, true, null);            

            using (Stream stream = File.Create(fileName)) 
            { 
                ds.WriteObject(stream, this); 
            } 
        } 

Now, when we rerun our demo, we see that the Id of the box is correctly updated.

 
      Object after deserialization 
      Box 1   A Brief History of Time 
      Box 1   Guards Guards 
      Box 2   The Reptile Room            

      Box 100 A Brief History of Time 
      Box 100 Guards Guards 
      Box 2   The Reptile Room 

Examining the Xml now shows the z:Id and z:Ref attributes in place to reassociate the data.

 
<Manifest z:Id="1">        

<Books z:Id="2" z:Size="3"> 
   <Book z:Id="3"> 
     <Author z:Id="4">Stephen Hawking</Author> 
     <Title z:Id="5">A Brief History of Time</Title> 
   </Book> 
   ... 
</Books> 
<Boxes z:Id="12" z:Size="3"> 
   <Box z:Id="13"> 
      <Id>1</Id> 
   </Box> 
   ... 
</Boxes> 
<LineItems z:Id="15" z:Size="3"> 
   <Manifest.LineItem z:Id="16"> 
      <Book z:Ref="3" i:nil="true"/> 
      <Box z:Ref="13" i:nil="true"/> 
   </Manifest.LineItem> 
   ... 
</LineItems> 
</Manifest> 
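
The difference between the two modes can also be checked in memory, without inspecting the Xml. The following is only a minimal sketch: the Pair class and the ReferenceDemo helper are hypothetical names made up for this illustration, and it sets the same flag through DataContractSerializerSettings (available in .NET 4.5 and later) rather than through the six-argument constructor used in this post. It places one Box instance into two properties, round-trips the object, and checks whether the two properties still point at the same instance.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Box
{
    [DataMember]
    public int Id { get; set; }
}

// Hypothetical holder with two properties that can point at the same Box.
[DataContract]
public class Pair
{
    [DataMember]
    public Box First { get; set; }

    [DataMember]
    public Box Second { get; set; }
}

public static class ReferenceDemo
{
    // Serializes a Pair whose two properties share one Box, reads it back,
    // and reports whether the copies still share a single instance.
    public static bool SharesInstanceAfterRoundTrip(bool preserveReferences)
    {
        Box box = new Box { Id = 1 };
        Pair pair = new Pair { First = box, Second = box };

        // Assumption: the settings object is used here in place of the
        // six-argument constructor; it sets the same preserveObjectReferences flag.
        DataContractSerializerSettings settings = new DataContractSerializerSettings
        {
            PreserveObjectReferences = preserveReferences
        };
        DataContractSerializer ds = new DataContractSerializer(typeof(Pair), settings);

        using (MemoryStream stream = new MemoryStream())
        {
            ds.WriteObject(stream, pair);
            stream.Position = 0;
            Pair copy = (Pair)ds.ReadObject(stream);
            return ReferenceEquals(copy.First, copy.Second);
        }
    }

    public static void Main()
    {
        Console.WriteLine(SharesInstanceAfterRoundTrip(false)); // False: two separate Box copies
        Console.WriteLine(SharesInstanceAfterRoundTrip(true));  // True: one shared Box
    }
}
```

With the flag off, each property is written out as its own copy of the Box, which is exactly why changing the Id in the master collection stopped affecting the line items.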

So, there you have it. With WCF you can serialize by value or by reference. Pretty neat stuff. This has been something of a marathon post, so I hope you could follow along. Let me know if you find anything that isn’t clear!
Code Safe!
MW

 
---------------------Sample Code-------------------------------------- 
//Console Application 
using System; 
using WcfSample;        

namespace PersistApp 
{ 
    class Program 
    { 
        static void Main(string[] args) 
        { 
            //You will need to change the filepath to a file on your local machine. 
            string serializationFile = @"c:\tempmw\Manifest.xml"; 
            //*********************************************************************        

            Manifest manifest = CreateShippingManifest();        

            Console.WriteLine("Manifest before serializing...\n"); 
            Console.WriteLine(manifest.ToString());        

            manifest.Serialize(serializationFile);        

            Manifest manifest2 = new Manifest(); 
            manifest2.Deserialize(serializationFile);        

            Console.WriteLine("Object after deserialization"); 
            Console.WriteLine(manifest2.ToString());        

            manifest2.Boxes[0].Id = 100; 
            Console.WriteLine(manifest2.ToString());        

            Console.ReadLine(); 
        }        

        private static Manifest CreateShippingManifest() 
        { 
            Manifest manifest = new Manifest();        

            Book book1 = new Book { Title = "A Brief History of Time", Author = "Stephen Hawking" }; 
            Book book2 = new Book { Title = "Guards Guards", Author = "Terry Pratchett" }; 
            Book book3 = new Book { Title = "The Reptile Room", Author = "Lemony Snicket" };        

            Box box1 = new Box { Id = 1 }; 
            Box box2 = new Box { Id = 2 };        

            manifest.AddLineItem(book1, box1); 
            manifest.AddLineItem(book2, box1); 
            manifest.AddLineItem(book3, box2);        

            return manifest; 
        }        

    } 
}        

//*************************************************************        

//Classes        

using System; 
using System.Collections.Generic; 
using System.IO; 
using System.Runtime.Serialization; 
using System.Text;        

namespace WcfSample 
{        

    #region Manifest 
    [DataContract] 
    public class Manifest 
    { 
        private List<Box> _boxes = new List<Box>(); 
        private List<Book> _books = new List<Book>(); 
        private List<LineItem> _lineItems = new List<LineItem>();        

        [DataMember] 
        public List<Box> Boxes 
        { 
            get 
            { 
                return _boxes; 
            } 
            private set 
            { 
                _boxes = value; 
            } 
        }        

        [DataMember] 
        public List<Book> Books 
        { 
            get 
            { 
                return _books; 
            } 
            private set 
            { 
                _books = value; 
            } 
        }        

        [DataMember] 
        public List<LineItem> LineItems 
        { 
            get 
            { 
                return _lineItems; 
            } 
            private set 
            { 
                _lineItems = value; 
            } 
        }        

        public void AddLineItem(Book book, Box box) 
        { 
            Books.Add(book); 
            Boxes.Add(box); 
            LineItems.Add(new LineItem(book, box)); 
        }        

        public void Serialize(string fileName) 
        { 
            //DataContractSerializer ds = new DataContractSerializer(this.GetType()); 
            DataContractSerializer ds = new DataContractSerializer(this.GetType(), null, 100, false, true, null);        

            using (Stream stream = File.Create(fileName)) 
            { 
                ds.WriteObject(stream, this); 
            } 
        }        

        public void Deserialize(string fileName) 
        { 
            using (Stream stream = File.OpenRead(fileName)) 
            { 
                DataContractSerializer ds = new DataContractSerializer(this.GetType()); 
                Manifest manifest = (Manifest)ds.ReadObject(stream); 
                this.Books = manifest.Books; 
                this.Boxes = manifest.Boxes; 
                this.LineItems = manifest.LineItems; 
            } 
        }        

        public override string ToString() 
        { 
            StringBuilder sb = new StringBuilder();        

            foreach (LineItem li in this.LineItems) 
            { 
                sb.AppendLine(String.Format("Box {0}\t{1}", li.Box.Id, li.Book.Title)); 
            }        

            return sb.ToString(); 
        } 
    #endregion        

        #region LineItemClass 
        [DataContract] 
        public class LineItem 
        { 
            private Book _book; 
            private Box _box;        

            public LineItem(Book book, Box box) 
            { 
                _box = box; 
                _book = book; 
            }        

            [DataMember] 
            public Box Box 
            { 
                get 
                { 
                    return _box; 
                } 
                private set 
                { 
                    _box = value; 
                } 
            }        

            [DataMember] 
            public Book Book 
            { 
                get 
                { 
                    return _book; 
                } 
                private set 
                { 
                    _book = value; 
                } 
            } 
        } 
        #endregion 
    }         

    #region Book Class        

    [DataContract] 
    public class Book 
    { 
        [DataMember] 
        public string Title { get; set; }        

        [DataMember] 
        public string Author { get; set; } 
    } 
    #endregion        

    #region Box Class 
    [DataContract] 
    public class Box 
    { 
        [DataMember] 
        public int Id { get; set; }        

    } 
    #endregion 
}        

January 1, 2008

Persisting the Doubly Linked List

Filed under: Misc Thoughts — dotnetthoughts @ 9:54 am

Recently, we discovered that we needed a doubly linked list to chain objects together in our code. .Net has made this an incredibly easy process, as it provides a generic LinkedList class which manages the creation of the list, as well as the insertion and deletion of nodes.

Our project also requires that we persist our linked list to a database. The task seemed easy enough. All we needed to do was create a table which contains our data, as well as pointers to the previous and next nodes. In other words, our initial table structure looked like this:

Id            int 
ParentId      int (FK to Id) 
ChildId       int (FK to Id) 
Description   nvarchar(30)

This structure seems to work on the surface, but we very quickly realized two very critical problems.

The first problem is that inserting the data into this data structure requires two passes. On the first pass, we insert all of the records into the database. Only after all of the records have been inserted can we assign links to both the parent and child records in the ParentId and ChildId columns.

foreach (var link in theChain) 
{ 
    // Insert the record. 
} 

foreach (var link in theChain) 
{ 
    // Update the record to include the parent and child pointers. 
}

The second problem is that the data can fall out of sync with itself. For example, what happens if the data ends up looking like this due to some misbehaving code? Id 1 believes that its child record is Id 2, but Id 2 believes that it is the top of its own chain.

Id: 1   ParentId: null   ChildId: 2 
Id: 2   ParentId: null   ChildId: 3

Both of these problems can be solved by treating the doubly linked list as a singly linked list in the database. If you have the links of a chain going in one direction, you should be able to determine the links going the other way. We initially avoided this option, because we thought the query to retrieve the data would be extremely complex. (Query the parent with a union of the child, maybe into a temporary table. Ugh.)

While on a walk yesterday, though, I came up with the idea of simply writing a query with an additional join that would return the data with the links in both directions. Our database would no longer need the ChildId column. If we order our data so that parents always fall above their children (the natural state of a linked list), we can insert all of this data in a single pass. Since there is no ChildId, the data can’t become inconsistent.

Id            int 
ParentId      int (FK to Id) 
Description   nvarchar(30)

When retrieving data to recreate the LinkedList in code, we can get both parent and child ids by joining the LinkedList table to itself.

SELECT parentList.Id, parentList.ParentId, childList.Id AS ChildId
FROM LinkedList parentList
LEFT JOIN LinkedList childList ON parentList.Id = childList.ParentId
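
Once the rows are back in code, the chain can be rebuilt from the ParentId column alone. What follows is only a rough sketch of one way to do that, not the code from our project: the Row class and the ChainLoader helper are hypothetical names, and it assumes the table holds a single unbranched chain with exactly one row whose ParentId is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape of one row from the LinkedList table.
public class Row
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string Description { get; set; }
}

public static class ChainLoader
{
    // Rebuilds the list from parent links only.
    // Assumes one chain: a single head row and at most one child per parent.
    public static LinkedList<Row> Rebuild(IEnumerable<Row> rows)
    {
        List<Row> all = rows.ToList();

        // Index every non-head row by the id of its parent.
        Dictionary<int, Row> childByParentId = all
            .Where(r => r.ParentId.HasValue)
            .ToDictionary(r => r.ParentId.Value);

        LinkedList<Row> list = new LinkedList<Row>();
        Row current = all.SingleOrDefault(r => !r.ParentId.HasValue);

        // Walk from the head, following each row to its child.
        while (current != null)
        {
            list.AddLast(current);
            Row next;
            childByParentId.TryGetValue(current.Id, out next);
            current = next;
        }

        return list;
    }

    public static void Main()
    {
        // Rows deliberately supplied out of order.
        Row[] rows =
        {
            new Row { Id = 3, ParentId = 2, Description = "third" },
            new Row { Id = 1, ParentId = null, Description = "first" },
            new Row { Id = 2, ParentId = 1, Description = "second" }
        };

        foreach (Row row in Rebuild(rows))
        {
            Console.WriteLine("{0} {1}", row.Id, row.Description);
        }
    }
}
```

Because LinkedList nodes already expose Previous and Next, the child direction never has to be stored. ToDictionary throws if two rows claim the same parent, which surfaces a branching chain immediately instead of silently dropping rows.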

It’s always a neat experience when an elegant solution comes out of the blue to solve a complex problem. I’m amazed at how often walking away and letting the subconscious mind work leads to a better solution than actively grinding on the problem. Seems like a good New Year’s resolution will be to walk more. Leads to a healthier me, and healthier code.

Good luck and code safe!

MW

December 22, 2007

Merry Christmas!

Filed under: csharp — dotnetthoughts @ 10:39 am

Merry Christmas everyone! I hope this holiday finds you happy and healthy with your loved ones! We’ve made the journey north to Washington to be with our families, and the kids are very excited for Christmas this year.

In celebration of Christmas, I thought that I would share with you a coded Christmas tree. I learned of this Christmas tree back in college when I was taking a math methods class, and first programmed it on my trusty TI-85 graphing calculator.

To build this tree, we’re going to play a simple game. It has four steps:

  1. Define three points that represent the vertices of a triangle.
  2. Starting from one of the vertices, move half the distance to a randomly chosen vertex, and draw a new point.
  3. Starting at the new point, move half the distance to a randomly chosen vertex and draw a new point.
  4. Repeat step 3 until you get bored.

Let’s implement the steps in order. We’ll simply use a windows form project and paint the results directly on the form itself.

Our first step is the definition of the vertices. We’ll declare three points forming our triangle as member variables on our form.

        Point _initialPoint1 = new Point(200, 0); 
        Point _initialPoint2 = new Point(0, 400); 
        Point _initialPoint3 = new Point(400, 400);

Next, we will need a method to draw our individual points on the form itself. My DrawPoint method accepts a point and a graphics object. Accepting the graphics object as a parameter prevents us from continually having to create and dispose the graphics object.

        private void DrawPoint(Point point, Graphics g) 
        { 
            // Pens.Green is a shared system pen, so nothing needs to be disposed here. 
            g.DrawRectangle(Pens.Green, point.X, point.Y, 1, 1); 
        }

To implement steps two and three, we will need a method which, given a point, will move half the distance toward one of the original vertices and return the new point. You’ll notice we had to create a new member variable called _random. I was initially creating a new Random object within the method itself, but I was getting decidedly unrandom results. When a Random object is created without an explicit seed, it takes its seed from the system time. My method was being called faster than the time was changing, so I was seeing repeated “random” numbers. By moving the object creation outside of the method, the object is seeded only once, and the values no longer repeat.

        Random _random = new Random();             

        private Point GetNextPoint(Point startPoint) 
        { 
            Point pointToMoveTo = new Point(); 
            int randomValue = _random.Next(3); 
            int newX = 0; 
            int newY = 0;    
          
            if (randomValue == 0) pointToMoveTo = _initialPoint1; 
            else if (randomValue == 1) pointToMoveTo = _initialPoint2; 
            else if (randomValue == 2) pointToMoveTo = _initialPoint3;         
     
            newX = (startPoint.X + pointToMoveTo.X) / 2; 
            newY = (startPoint.Y + pointToMoveTo.Y) / 2;                   

            return new Point (newX, newY); 
        }
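
The seeding problem is easy to reproduce on purpose. This is just a small sketch, with the SeedDemo class and SameSequence method being hypothetical names: instead of relying on the clock, it hands two Random objects an explicit seed, which is the situation my original code kept ending up in by accident.

```csharp
using System;

public static class SeedDemo
{
    // Returns true when two generators built from the given seeds
    // produce the same first 'count' vertex choices.
    public static bool SameSequence(int seedA, int seedB, int count)
    {
        Random a = new Random(seedA);
        Random b = new Random(seedB);

        for (int i = 0; i < count; i++)
        {
            if (a.Next(3) != b.Next(3))
            {
                return false;
            }
        }

        return true;
    }

    public static void Main()
    {
        // Identical seeds always give identical "random" sequences.
        Console.WriteLine(SameSequence(12345, 12345, 1000)); // True
    }
}
```

Two generators created with the same seed are indistinguishable, so a method that builds a fresh time-seeded Random on every call can keep returning the same first value until the clock ticks over.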

Finally, we just need a method to iterate over it several times. I’m running mine 50,000 times. As always, be sure you dispose of any Graphics objects you create.

        private void ChristmasTree_Load(object sender, EventArgs e) 
        { 
            Point currentPoint;               

            this.Show();               

            using (Graphics g = this.CreateGraphics()) 
            { 
                currentPoint = _initialPoint1; 
                for (int i = 0; i < 50000; i++) 
                { 
                    currentPoint = GetNextPoint(currentPoint); 
                    DrawPoint(currentPoint, g); 
                } 
            }  
        }

Excellent. Let’s build and run our Christmas tree program and see what comes out.

[Image: ChristmasTree]

Isn’t that neat? This code generates a well known fractal called a Sierpinski triangle. An entertaining (non-code) alternative to creating the triangle is to write out Pascal’s triangle, and shade in all of the odd numbers. Pretty neat stuff!
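
For anyone who would rather not shade Pascal’s triangle by hand, here is a small console sketch of the same idea. The PascalSierpinski class and its Render method are hypothetical names for this illustration. It only tracks whether each entry is odd or even, since that is all the shading needs.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PascalSierpinski
{
    // Renders the first 'rows' rows of Pascal's triangle,
    // printing '*' for odd entries and spaces for even ones.
    public static string Render(int rows)
    {
        List<string> lines = new List<string>();
        int[] parity = new int[rows];   // current row, stored modulo 2
        if (rows > 0) parity[0] = 1;

        for (int n = 0; n < rows; n++)
        {
            // Build row n from row n - 1 in place, right to left,
            // using C(n, k) = C(n - 1, k) + C(n - 1, k - 1).
            for (int k = n; k >= 1; k--)
            {
                parity[k] = (parity[k] + parity[k - 1]) % 2;
            }

            StringBuilder sb = new StringBuilder(new string(' ', rows - 1 - n));
            for (int k = 0; k <= n; k++)
            {
                sb.Append(parity[k] == 1 ? "* " : "  ");
            }

            lines.Add(sb.ToString().TrimEnd());
        }

        return string.Join("\n", lines);
    }

    public static void Main()
    {
        // 32 rows is enough to see several levels of the fractal.
        Console.WriteLine(Render(32));
    }
}
```

Row counts that are powers of two give a complete triangle, because each doubling of the rows adds one more level of the pattern.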

Merry Christmas, all! Code Safe!

MW

——————————Complete Code listing——————————————————-

using System; 
using System.Collections.Generic; 
using System.ComponentModel; 
using System.Data; 
using System.Drawing; 
using System.Linq; 
using System.Text; 
using System.Windows.Forms; 

namespace ChristmasTree 
{ 
    public partial class frmChristmasTree : Form 
    { 
        public frmChristmasTree() 
        { 
            InitializeComponent(); 
        }          

        Point _initialPoint1 = new Point(200, 0); 
        Point _initialPoint2 = new Point(0, 400); 
        Point _initialPoint3 = new Point(400, 400); 
        Random _random = new Random();          

        private void ChristmasTree_Load(object sender, EventArgs e) 
        { 
            Point currentPoint;             
            this.Show();             
            using (Graphics g = this.CreateGraphics()) 
            { 
                currentPoint = _initialPoint1; 
                for (int i = 0; i < 50000; i++) 
                { 
                    currentPoint = GetNextPoint(currentPoint); 
                    DrawPoint(currentPoint, g); 
                } 
            }  
        }          

        private void DrawPoint(Point point, Graphics g) 
        { 
            // Pens.Green is a shared system pen, so nothing needs to be disposed here. 
            g.DrawRectangle(Pens.Green, point.X, point.Y, 1, 1); 
        }          

        private Point GetNextPoint(Point startPoint) 
        { 
            Point pointToMoveTo = new Point(); 
            int randomValue = _random.Next(3); 
            int newX = 0; 
            int newY = 0;   
           
            if (randomValue == 0) pointToMoveTo = _initialPoint1; 
            else if (randomValue == 1) pointToMoveTo = _initialPoint2; 
            else if (randomValue == 2) pointToMoveTo = _initialPoint3;              

            newX = (startPoint.X + pointToMoveTo.X) / 2; 
            newY = (startPoint.Y + pointToMoveTo.Y) / 2;              

            return new Point (newX, newY); 
        } 
    } 
}

December 14, 2007

The Linq Jukebox – Part 2 (LINQ Joins)

Filed under: csharp — dotnetthoughts @ 11:47 pm

Last week, I blogged about creating a jukebox using LINQ to query the filesystem. We used a quick-and-dirty query to join properties from three different objects (two directory objects and a file object) into an anonymous type which represented the music folder hierarchy on my machine. It was a pretty neat first attempt at using LINQ, but I want my jukebox to do more.

My goal for this blog is to append additional information to the individual tracks in my library. To do this, I will store comments on individual tracks in an xml file. Each xml element will contain the filename of the track to which the comment applies, as well as the comment itself. Using the filename as a key, we will use LINQ to join the additional Xml information into the data generated from the directory structure.

The xml file is structured as follows. (The file attribute is truncated for space in this posting.)

<TrackData>   
   <Track file="C:\...\Aerosmith\Get A Grip\4 Fever.mp3" comment="Covered by Garth Brooks"/>   
   <Track file="C:\...\The Big Horn Brass\Christmas With The Big Horn Brass\10 Let It Snow.mp3" comment="Merry Christmas!"/> 
</TrackData>

The first step in this process will be to load the Xml. LINQ introduces a whole new series of objects for dealing with Xml data. When using LINQ, XDocuments are used to hold Xml data. These can be loaded directly from a file in much the same way you would load the more familiar XmlDocument object.

     XDocument xmlComments = XDocument.Load(Path.Combine(musicFolder, "CustomData.xml"));

We’re now going to loop through each of the Track elements and extract both the file name and the comment from the data. We will start at the document itself and drill down through the TrackData element using the Elements method. This new method returns an IEnumerable of XElement. Using each of these elements, we will create a new anonymous type from the element’s attributes and store a collection of them in the variable comments.

            var comments = 
                from comment in xmlComments.Elements("TrackData").Elements("Track") 
                select new { 
                    File = (string)comment.Attribute("file"), 
                    Comment = (string)comment.Attribute("comment")};
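
To try this query without a file on disk, the same shape of Xml can be parsed from a string. This is only a sketch: the TrackCommentReader class and the short file names are made up for the example. It also shows what happens when an attribute is missing, since casting a missing attribute to string simply yields null.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class TrackCommentReader
{
    // Parses TrackData Xml and returns one "file -> comment" line per Track element.
    public static string[] ReadComments(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        var comments =
            from track in doc.Elements("TrackData").Elements("Track")
            select new
            {
                File = (string)track.Attribute("file"),
                // The cast returns null when the attribute is absent.
                Comment = (string)track.Attribute("comment")
            };

        return comments
            .Select(c => c.File + " -> " + (c.Comment ?? "(no comment)"))
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample data; the real file uses full paths.
        string xml =
            "<TrackData>" +
            "<Track file=\"one.mp3\" comment=\"First\"/>" +
            "<Track file=\"two.mp3\"/>" +
            "</TrackData>";

        foreach (string line in ReadComments(xml))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // one.mp3 -> First
        // two.mp3 -> (no comment)
    }
}
```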

Next, we will take the LINQ query that we created in last week’s blog, and place it in a variable called tracks.

            var tracks = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                select new { 
                     Track = Path.GetFileNameWithoutExtension(file.Name), 
                     Album = albumDirectory.Name, 
                     Artist = artistDirectory.Name, 
                     TrackFile = file.FullName 
                };

Now comes the fun part. Using query syntax that reads a lot like a SQL join, we can take these two different types and merge them based on a primary/foreign key style relationship.

            var mergedResults = 
                from t in tracks 
                from c in comments 
                where c.File == t.TrackFile 
                select new 
                { 
                    Artist = t.Artist, 
                    Album = t.Album, 
                    Track = t.Track, 
                    TrackFile = t.TrackFile, 
                    Comment = c.Comment 
                };

For those of you following along at home, you’ll be quick to point out that this is not quite the result we’re looking for. This LINQ query returns an INNER JOIN style result. I have 50 or so tracks on my laptop, but only two entries in my Xml file. Using this query, I can only access the two tracks that exist in the Xml. What we really want is a LEFT OUTER JOIN. We wish to include all tracks, even if they don’t have a comment associated with them.

It takes a little bit to convince LINQ to do a flattened OUTER JOIN. First, we will do a group join on our Xml and store the matches for each track in a temporary variable (tracksAndComments). We then select from that group using the DefaultIfEmpty() method, which will force a null into the right hand side of the join if no data matches the key. Next, we will export this data into a new anonymous type to be stored in mergedResults. We will use a ternary operator to substitute an empty string for the comment whenever the joined item (tc) is null.


            var mergedResults = 
            	from t in tracks 
            	join c in comments on t.TrackFile equals c.File into tracksAndComments 
            	from tc in tracksAndComments.DefaultIfEmpty() 
            	select new 
            	{ 
                   Artist = t.Artist, 
                   Album = t.Album, 
                   Track = t.Track, 
                   TrackFile = t.TrackFile, 
                   Comment = tc == null ? String.Empty : tc.Comment 
            	};
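
The same pattern can be tried against small in-memory collections, which makes the effect of DefaultIfEmpty easier to see. This is only a sketch with made-up data: the LeftJoinDemo class, the Merge method, and the two sample tracks are hypothetical, and only one of the tracks has a matching comment.

```csharp
using System;
using System.Linq;

public static class LeftJoinDemo
{
    // Left outer joins two sample tracks to one comment and
    // returns a formatted line for every track.
    public static string[] Merge()
    {
        var tracks = new[]
        {
            new { TrackFile = "a.mp3", Track = "Track A" },
            new { TrackFile = "b.mp3", Track = "Track B" }
        };

        var comments = new[]
        {
            new { File = "a.mp3", Comment = "Nice" }
        };

        var mergedResults =
            from t in tracks
            join c in comments on t.TrackFile equals c.File into tracksAndComments
            from tc in tracksAndComments.DefaultIfEmpty()
            select new
            {
                Track = t.Track,
                // tc is null when the track has no matching comment.
                Comment = tc == null ? String.Empty : tc.Comment
            };

        return mergedResults
            .Select(r => r.Track + ": '" + r.Comment + "'")
            .ToArray();
    }

    public static void Main()
    {
        foreach (string line in Merge())
        {
            Console.WriteLine(line);
        }
        // Prints:
        // Track A: 'Nice'
        // Track B: ''
    }
}
```

Dropping the DefaultIfEmpty() call turns this back into an inner join, and Track B disappears from the results.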

That’s all it takes. The mergedResults variable now contains track information and comments. In other words, we have done the following in eight lines of code:

  • Queried the file system, returned data nested two folders deep, and formatted them into a simplified object.
  • Loaded an XDocument, queried it to retrieve all Track elements, and stored the attribute values in a simplified object.
  • Merged both of these objects into a single result set.
  • Iterated over the results and passed them on to another method for processing.

Not bad for eight lines of work. I’ve included the complete method below.

Hope you find this helpful. Good luck and code safe!

MW

        private void LoadMusicData() 
        { 
            string musicFolder = @"C:\Users\Mike\Music\iTunes\iTunes Music";           

            DirectoryInfo topDirectory = new DirectoryInfo(musicFolder); 
            XDocument xmlComments = XDocument.Load(Path.Combine(musicFolder, "CustomData.xml"));           

            var comments = 
                from comment in xmlComments.Elements("TrackData").Elements("Track") 
                select new 
                { 
                    File = (string)comment.Attribute("file"), 
                    Comment = (string)comment.Attribute("comment") 
                };           

            var tracks = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                select new { 
                     Track = Path.GetFileNameWithoutExtension(file.Name), 
                     Album = albumDirectory.Name, 
                     Artist = artistDirectory.Name, 
                     TrackFile = file.FullName 
                };

            var mergedResults = 
            	from t in tracks 
            	join c in comments on t.TrackFile equals c.File into tracksAndComments 
            	from tc in tracksAndComments.DefaultIfEmpty() 
            	select new 
            	{ 
                   Artist = t.Artist, 
                   Album = t.Album, 
                   Track = t.Track, 
                   TrackFile = t.TrackFile, 
                   Comment = tc == null ? String.Empty : tc.Comment 
            	};           

            foreach (var mergeResult in mergedResults) 
            { 
                AddListViewItem(mergeResult.Artist, mergeResult.Album, mergeResult.Track, 
                      mergeResult.Comment, mergeResult.TrackFile); 
            }           

        }
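
For anybody who wants to try the comments query without building a file first, here is a small sketch that runs it against an inline string. The layout of the Xml (a TrackData root with Track children carrying file and comment attributes) is my inference from the query in the method above, and the paths, comment text, and CommentXmlSketch name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CommentXmlSketch
{
    // Assumed file layout, inferred from the query; the paths and comment are invented.
    public const string SampleXml = @"<TrackData>
  <Track file=""C:\Music\Artist\Album\One.mp3"" comment=""Great opener"" />
  <Track file=""C:\Music\Artist\Album\Two.mp3"" />
</TrackData>";

    // Returns (file, comment) pairs using the same query shape as LoadMusicData.
    public static List<KeyValuePair<string, string>> ReadComments(string xml)
    {
        XDocument xmlComments = XDocument.Parse(xml);
        var comments =
            from comment in xmlComments.Elements("TrackData").Elements("Track")
            select new KeyValuePair<string, string>(
                (string)comment.Attribute("file"),
                (string)comment.Attribute("comment")); // null when the attribute is missing
        return comments.ToList();
    }

    public static void Main()
    {
        foreach (var pair in ReadComments(SampleXml))
        {
            Console.WriteLine(pair.Key + " => " + (pair.Value ?? "(no comment attribute)"));
        }
    }
}
```

One thing the sketch makes visible: casting a missing attribute to string yields null rather than throwing, so a Track element without a comment attribute would still come through the join with a null Comment.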

December 9, 2007

The Linq Jukebox

Filed under: csharp — dotnetthoughts @ 8:18 am

Earlier this week, the Portland Area Dot Net Users’ Group had an installation party for Visual Studio 2008. During the event, they had a contest to see who could come up with the best LINQ sample. The winner would receive a customized Zune. While I had not yet used LINQ, I decided to throw my hat in the ring with the following query:

     var query = from ZuneWinner in db.PeopleInRoom 
     where ZuneWinner.FirstName == "Michael" and   
     ZuneWinner.LastName=="Weier" 
     select new {ZuneWinner.FirstName, ZuneWinner.LastName};

As you’ve probably already guessed, I didn’t leave with the fancy new piece of hardware. I did receive a chuckle from the judge, though.

After everything had ended, I came up with an idea that may have been a serious contender. A cool entry would have been to try and model an mp3 player’s functionality using LINQ. I decided to create a program that would rip through the music structure on my PC and display the results by artist, album and track.

On my laptop, I have an iTunes folder. (Can you say iTunes on an essentially Microsoft blog?) This folder arranges music into three different levels. The topmost folder contains one folder for each artist I have music for on my PC. Each artist folder contains one folder for each album I have by this artist. The album folder, in turn, contains a list of tracks that I have available to play on my PC.

[Image: LinqFolders]

My ultimate goal is to take this hierarchical structure and flatten it into a listbox view similar to the following:

[Image: LinqListbox]

Traditional (pre-LINQ) programming would have achieved this through a simple nested-loop construct: starting at the top-level folder, loop through the subfolders, populating the ListView's items as you go.

 
        private void LoadMusicData2() 
        { 
            string musicFolder = @"C:\Users\Mike\Music\iTunes\iTunes Music"; 
            DirectoryInfo topDirectory = new DirectoryInfo(musicFolder); 
            foreach (DirectoryInfo artistDirectory in topDirectory.GetDirectories()) 
            { 
                foreach (DirectoryInfo albumDirectory in artistDirectory.GetDirectories()) 
                { 
                    foreach (FileInfo trackFile in albumDirectory.GetFiles("*.mp3", 
                        SearchOption.TopDirectoryOnly)) 
                    { 
                        AddListViewItem(artistDirectory.Name, albumDirectory.Name, 
                            Path.GetFileNameWithoutExtension(trackFile.Name), trackFile.FullName); 
                    } 
                } 
            } 
        } 

This method works well enough, but LINQ gives us a much more elegant solution. In the above code, we have three different objects. The first object maintains artist directories, the second maintains album directories, and the third maintains track information. By using LINQ, we can create an anonymous type which will hold only the pieces of data we are interested in dealing with. The following method is the LINQ equivalent to the above code.

 
        private void LoadMusicData() 
        { 
            string musicFolder = @"C:\Users\Mike\Music\iTunes\iTunes Music"; 
            DirectoryInfo topDirectory = new DirectoryInfo(musicFolder); 
            var query = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                select new { Track = Path.GetFileNameWithoutExtension(file.Name), 
                                  Album = albumDirectory.Name, 
                                  Artist = artistDirectory.Name, 
                                  TrackFile = file.FullName};        

            foreach (var trackData in query) 
            { 
                AddListViewItem(trackData.Artist, trackData.Album, trackData.Track, trackData.TrackFile); 
            }         
        } 

The from clauses in this query retrieve data from each level of the folder hierarchy and merge them into a flattened result. The select clause then creates a new anonymous type which contains four properties: Track, Album, Artist, and TrackFile. Not only has the data been reduced to only the data we care about, it has been renamed to make more sense for our application. Finally, we loop through the data returned in the query, adding the values into the ListView.
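
If it helps to see the flattening spelled out, stacked from clauses correspond to chained SelectMany calls. The sketch below runs the same three-level query both ways against a hypothetical in-memory library (nested dictionaries standing in for the artist and album folders), so the FlattenSketch name, the data, and the output format are all assumptions of mine rather than part of the jukebox code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FlattenSketch
{
    // Hypothetical in-memory library: artist -> album -> track file names.
    public static Dictionary<string, Dictionary<string, string[]>> SampleLibrary()
    {
        return new Dictionary<string, Dictionary<string, string[]>>
        {
            { "Artist One", new Dictionary<string, string[]>
                {
                    { "First Album", new[] { "Intro.mp3", "Outro.mp3" } }
                }
            },
            { "Artist Two", new Dictionary<string, string[]>
                {
                    { "Debut", new[] { "Single.mp3" } }
                }
            }
        };
    }

    // Query syntax: one from clause per level of the hierarchy.
    public static List<string> WithQuerySyntax(Dictionary<string, Dictionary<string, string[]>> library)
    {
        var query =
            from artist in library
            from album in artist.Value
            from file in album.Value
            select artist.Key + " / " + album.Key + " / " + Path.GetFileNameWithoutExtension(file);
        return query.ToList();
    }

    // Method syntax: roughly what the stacked from clauses turn into.
    public static List<string> WithSelectMany(Dictionary<string, Dictionary<string, string[]>> library)
    {
        return library
            .SelectMany(artist => artist.Value, (artist, album) => new { artist, album })
            .SelectMany(pair => pair.album.Value,
                        (pair, file) => pair.artist.Key + " / " + pair.album.Key + " / "
                                        + Path.GetFileNameWithoutExtension(file))
            .ToList();
    }

    public static void Main()
    {
        foreach (string row in WithQuerySyntax(SampleLibrary()))
        {
            Console.WriteLine(row);
        }
    }
}
```

Both methods produce one row per track, three in total for this sample data. The query syntax is just the easier of the two to read.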

So, really, what is so amazing? Is the second method really that much better than the first?

What I really think will set LINQ apart is the fact that it is platform agnostic when it comes to querying data. The same syntax can be used to query databases, objects, and Xml. Furthermore, one is able to extract and merge exactly what one needs from these different types and combine them into specialized types on the fly. Sorting and filtering data in LINQ is very simple. Want to see only music by the Big Horn Brass with the tracks in descending order? No problem.

 
            var query = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                where artistDirectory.Name == "The Big Horn Brass" 
                orderby file.Name descending 
                select new { Track = Path.GetFileNameWithoutExtension(file.Name), 
                                  Album = albumDirectory.Name, 
                                  Artist = artistDirectory.Name, 
                                  TrackFile = file.FullName}; 
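
The where and orderby clauses behave the same way on any sequence, not just directory listings. As a quick sketch, assuming a hypothetical TrackRow class and made-up track names, the same filter and sort can be run against an in-memory list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; the jukebox code uses an anonymous type instead.
public class TrackRow
{
    public string Artist { get; set; }
    public string Track { get; set; }
}

public static class FilterSortSketch
{
    // Keeps only the given artist's tracks and sorts them in descending order by name.
    public static List<string> TracksByArtistDescending(IEnumerable<TrackRow> rows, string artist)
    {
        var query =
            from row in rows
            where row.Artist == artist
            orderby row.Track descending
            select row.Track;
        return query.ToList();
    }

    public static void Main()
    {
        // Made-up track names for illustration.
        var rows = new[]
        {
            new TrackRow { Artist = "The Big Horn Brass", Track = "01 Intro" },
            new TrackRow { Artist = "Somebody Else", Track = "01 Other" },
            new TrackRow { Artist = "The Big Horn Brass", Track = "03 Finale" },
            new TrackRow { Artist = "The Big Horn Brass", Track = "02 March" }
        };

        foreach (string track in TracksByArtistDescending(rows, "The Big Horn Brass"))
        {
            Console.WriteLine(track); // 03 Finale, 02 March, 01 Intro
        }
    }
}
```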

The biggest drawback I see at the moment is the syntax for non-trivial queries. I’d originally wanted to merge in some Xml comments to a couple of tracks using the equivalent of a LEFT OUTER JOIN, but never quite got it to work right. The syntax for Xml seems to be entirely new and is not immediately intuitive to someone who has used the old model. (Granted, I’ve probably played with LINQ a total of three hours, now, so I can’t complain too much.) I’ve picked up a LINQ book, and will work to figure that one out. I suspect that the syntax will become easier with time and practice.

Is the demo worth a free Zune? Well, if anybody at Microsoft feels so, let me know <grin>.

That’s pretty much it for today.

Code Safe!

MW
