Dot Net Thoughts

December 14, 2007

The Linq Jukebox – Part 2 (LINQ Joins)

Filed under: csharp — dotnetthoughts @ 11:47 pm

Last week, I blogged about creating a jukebox using LINQ to query the filesystem. We used a quick-and-dirty query to join properties from three different objects (two directory objects and a file object) into an anonymous type which represented the music folder hierarchy on my machine. It was a pretty neat first attempt at using LINQ, but I want my jukebox to do more.

My goal for this blog is to append additional information to the individual tracks in my library. To do this, I will store comments on individual tracks in an xml file. Each xml element will contain the filename of the track to which the comment applies, as well as the comment itself. Using the filename as a key, we will use LINQ to join the additional Xml information into the data generated from the directory structure.

The xml file is structured as follows. (The file attribute is truncated for space in this posting.)

<TrackData>   
   <Track file="C:\...\Aerosmith\Get A Grip\4 Fever.mp3" comment="Covered by Garth Brooks"/>   
   <Track file="C:\...\The Big Horn Brass\Christmas With The Big Horn Brass\10 Let It Snow.mp3" comment="Merry Christmas!"/> 
</TrackData>

The first step in this process will be to load the Xml. LINQ to XML introduces a whole new series of objects for dealing with Xml data. When using LINQ, an XDocument holds the Xml data. It can be loaded directly from a file in much the same way you would load the more familiar XmlDocument object.

     XDocument xmlComments = XDocument.Load(Path.Combine(musicFolder, "CustomData.xml"));
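If you want to experiment without a file on disk, a minimal sketch along these lines may help. It assumes a made-up, shortened version of the xml (the file names and comments are placeholders, not my real data) and uses XDocument.Parse, which reads from a string where XDocument.Load reads from a path. The class and method names are only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XDocumentLoadSketch
{
    // Hypothetical sample data standing in for CustomData.xml.
    public const string SampleXml =
        "<TrackData>" +
        "<Track file=\"a.mp3\" comment=\"First\"/>" +
        "<Track file=\"b.mp3\" comment=\"Second\"/>" +
        "</TrackData>";

    public static XDocument LoadSample()
    {
        // XDocument.Parse builds the document from a string;
        // XDocument.Load does the same from a file path.
        return XDocument.Parse(SampleXml);
    }

    public static void Main()
    {
        XDocument doc = LoadSample();
        // Root plays the role that DocumentElement plays on XmlDocument.
        Console.WriteLine(doc.Root.Name);
        Console.WriteLine(doc.Root.Elements("Track").Count());
    }
}
```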

We’re now going to loop through each of the Track elements and extract both the file name and the comment from the data. We will start at the document itself and drill down into the Xml tree using the Elements method. (XDocument has no DocumentElement property; its closest equivalent is Root.) Elements returns an IEnumerable of type XElement. Using each of these elements, we will create a new anonymous type from the element’s attributes and store a collection of them in the variable comments.

            var comments = 
                from comment in xmlComments.Elements("TrackData").Elements("Track") 
                select new { 
                    File = (string)comment.Attribute("file"), 
                    Comment = (string)comment.Attribute("comment")};
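One detail worth spelling out is why the query casts the attribute to string rather than reading its Value property. The sketch below, using a hypothetical helper named ReadComment and made-up Track elements, shows that the cast returns null when the attribute is missing, where .Value would throw a NullReferenceException.

```csharp
using System;
using System.Xml.Linq;

public static class AttributeCastSketch
{
    // Hypothetical helper: returns the comment attribute, or null if absent.
    public static string ReadComment(string trackXml)
    {
        XElement track = XElement.Parse(trackXml);
        // Attribute() returns null when the attribute does not exist, and the
        // explicit conversion to string passes that null through unchanged.
        return (string)track.Attribute("comment");
    }

    public static void Main()
    {
        Console.WriteLine(ReadComment("<Track file=\"a.mp3\" comment=\"Hi\"/>"));
        Console.WriteLine(ReadComment("<Track file=\"a.mp3\"/>") == null);
    }
}
```

This matters for the jukebox because a Track element with no comment attribute will simply produce a null Comment instead of stopping the whole query.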

Next, we will take the LINQ query that we created in last week’s blog, and place it in a variable called tracks.

            var tracks = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                select new { 
                     Track = Path.GetFileNameWithoutExtension(file.Name), 
                     Album = albumDirectory.Name, 
                     Artist = artistDirectory.Name, 
                     TrackFile = file.FullName 
                };

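Now comes the fun part. Using query syntax that reads much like a SQL join, we can take these two different types and merge them based on a primary/foreign key style relationship. Here a second from clause pairs every track with every comment, and the where clause keeps only the pairs whose file names match.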

            var mergedResults = 
                from t in tracks 
                from c in comments 
                where c.File == t.TrackFile 
                select new 
                { 
                    Artist = t.Artist, 
                    Album = t.Album, 
                    Track = t.Track, 
                    TrackFile = t.TrackFile, 
                    Comment = c.Comment 
                };
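For comparison, here is a small sketch of the same inner join written two ways over made-up in-memory data (three file names and a single comment, none of it from my real library). The first method uses the from/where form shown above. The second uses the join ... on ... equals clause. In LINQ to Objects the join form builds a lookup on the second sequence instead of comparing every pair, so it tends to scale better, though for a library of 50 tracks you are unlikely to notice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InnerJoinSketch
{
    // Made-up sample data: three track files, only one of which has a comment.
    private static readonly string[] TrackFiles = { "a.mp3", "b.mp3", "c.mp3" };
    private static readonly Dictionary<string, string> CommentsByFile =
        new Dictionary<string, string> { { "b.mp3", "Great solo" } };

    // Inner join expressed as two from clauses filtered by a where clause.
    public static List<string> WithWhere()
    {
        var query = from t in TrackFiles
                    from c in CommentsByFile
                    where c.Key == t
                    select t + ": " + c.Value;
        return query.ToList();
    }

    // The same inner join expressed with the join clause.
    public static List<string> WithJoin()
    {
        var query = from t in TrackFiles
                    join c in CommentsByFile on t equals c.Key
                    select t + ": " + c.Value;
        return query.ToList();
    }

    public static void Main()
    {
        foreach (string row in WithWhere()) Console.WriteLine(row);
        foreach (string row in WithJoin()) Console.WriteLine(row);
    }
}
```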

For those of you following along at home, you’ll be quick to point out that this is not quite the result we’re looking for. This LINQ query returns an INNER JOIN style result. I have 50 or so tracks on my laptop, but only two entries in my Xml file. Using this query, I can only access the two tracks that exist in the Xml. What we really want is a LEFT OUTER JOIN. We wish to include all tracks, even if they don’t have a comment associated with them.

It takes a little work to convince LINQ to do a flattened OUTER JOIN. First, we do a group join against our Xml comments, using the into keyword to store the matches for each track in a temporary variable (tracksAndComments). Then we iterate over that group with a second from clause, calling DefaultIfEmpty() on it. DefaultIfEmpty supplies a single null on the right hand side of the join when no comment matches the key. Finally, we project the data into a new anonymous type to be stored in mergedResults, using a ternary operator to replace any null tc value with an empty string.


            var mergedResults = 
                from t in tracks 
                join c in comments on t.TrackFile equals c.File into tracksAndComments 
                from tc in tracksAndComments.DefaultIfEmpty() 
                select new 
                { 
                    Artist = t.Artist, 
                    Album = t.Album, 
                    Track = t.Track, 
                    TrackFile = t.TrackFile, 
                    Comment = tc == null ? String.Empty : tc.Comment 
                };
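If the query syntax feels like magic, it may help to see roughly what it corresponds to in method syntax. The sketch below is my own approximation, not the compiler's exact translation: GroupJoin builds the per-track groups, and SelectMany with DefaultIfEmpty flattens them again. CommentEntry, Merge, and the sample data are hypothetical names introduced just for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the anonymous type produced by the comments query.
public class CommentEntry
{
    public string File { get; set; }
    public string Comment { get; set; }
}

public static class LeftOuterJoinSketch
{
    public static List<string> Merge(
        IEnumerable<string> trackFiles, IEnumerable<CommentEntry> comments)
    {
        return trackFiles
            // Pair each track with the (possibly empty) group of matching comments.
            .GroupJoin(comments, t => t, c => c.File,
                (t, matches) => new { Track = t, Matches = matches })
            // DefaultIfEmpty yields a single null for a track with no comments;
            // SelectMany flattens the groups into one row per track/comment pair.
            .SelectMany(
                x => x.Matches.DefaultIfEmpty(),
                (x, c) => x.Track + ": " + (c == null ? String.Empty : c.Comment))
            .ToList();
    }

    public static void Main()
    {
        var tracks = new[] { "a.mp3", "b.mp3", "c.mp3" };
        var comments = new[] { new CommentEntry { File = "b.mp3", Comment = "Great solo" } };
        foreach (string row in Merge(tracks, comments))
            Console.WriteLine(row);
    }
}
```

Note that the null check only works because CommentEntry is a class. With a struct, DefaultIfEmpty would hand back a default value rather than null.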

That’s all it takes. The mergedResults variable now contains track information and comments. In other words, we have done the following in eight lines of code:

  • Queried the file system, returned data nested two folders deep, and formatted them into a simplified object.
  • Loaded an XDocument, queried it to retrieve all Track elements, and stored the attribute values into a simplified object.
  • Merged both of these objects into a single result set.
  • Iterated over the results and passed them on to another method for processing.

Not bad for eight lines of work. I’ve included the complete method below.

Hope you find this helpful. Good luck and code safe!

MW

        private void LoadMusicData() 
        { 
            string musicFolder = @"C:\Users\Mike\Music\iTunes\iTunes Music";           

            DirectoryInfo topDirectory = new DirectoryInfo(musicFolder); 
            XDocument xmlComments = XDocument.Load(Path.Combine(musicFolder, "CustomData.xml"));           

            var comments = 
                from comment in xmlComments.Elements("TrackData").Elements("Track") 
                select new 
                { 
                    File = (string)comment.Attribute("file"), 
                    Comment = (string)comment.Attribute("comment") 
                };           

            var tracks = 
                from artistDirectory in topDirectory.GetDirectories() 
                from albumDirectory in artistDirectory.GetDirectories() 
                from file in albumDirectory.GetFiles("*.mp3", SearchOption.TopDirectoryOnly) 
                select new { 
                     Track = Path.GetFileNameWithoutExtension(file.Name), 
                     Album = albumDirectory.Name, 
                     Artist = artistDirectory.Name, 
                     TrackFile = file.FullName 
                };

            var mergedResults = 
                from t in tracks 
                join c in comments on t.TrackFile equals c.File into tracksAndComments 
                from tc in tracksAndComments.DefaultIfEmpty() 
                select new 
                { 
                    Artist = t.Artist, 
                    Album = t.Album, 
                    Track = t.Track, 
                    TrackFile = t.TrackFile, 
                    Comment = tc == null ? String.Empty : tc.Comment 
                };

            foreach (var mergeResult in mergedResults) 
            { 
                AddListViewItem(mergeResult.Artist, mergeResult.Album, mergeResult.Track, 
                      mergeResult.Comment, mergeResult.TrackFile); 
            }           

        }