Dot Net Thoughts

January 1, 2008

Persisting the Doubly Linked List

Filed under: Misc Thoughts — dotnetthoughts @ 9:54 am
Tags: , , ,

Recently, we discovered that I needed a doubly linked list to chain objects together in our code. .Net has made this an incredibly easy process, as it provides a LinkedList generic object which manages the creation of the list, as well as the inserting and the deletion of the nodes.

Our project also requires that we persist our linked list to a database. The task seems easy enough. All that needed to be done was to create a table which contains our data, as well as pointers to the previous and next nodes. In other words, our initial table structure looked like this:

Id            int 
ParentId      int (FK to Id) 
ChildId       int (FK to Id) 
Description   nvarchar(30)

This structure seems to work on the surface, but we very quickly realized two very critical problems.

The first problem is that inserting the data into this data structure requires two passes. On the first pass, we insert all of the records into the database. Only after all of the records have been inserted can we assign links to both the parent and child records in the ParentId and ChildId columns.

foreach (link in theChain) 
   {  //Insert the record. }  

foreach (link in theChain) 
   {  //Update the record to include the parent and child pointers }

 The second problem is that the data can fall out of sync with itself. For example, what happens if the data ends up looking like this due to some misbehaving code? Id 1 believes that the child record should be Id 2, but Id 2 believes that it is the top of its own chain.

Id: 1   ParentId: null   ChildId: 2 
Id: 2   ParentId: null   ChildId: 3

Both of these problems can be solved by treating the doubly linked list as a singly linked list in the database. If you have the links of a chain going in one direction, you should be able to determine the links going the other way. We initally avoided this option, because we thought the query to retrieve the data would be extremely complex. (Query the parent with a union of the child, maybe into a temporary table. Ugh.)

While on a walk, yesterday, though, I came up with the idea of simply writing a query with an additional join that would return the data with the links in both directions. Our database would no longer need the ChildId column. If we order our data so that parents always fall above their children (the natural state of a linked list), we can insert all of this data in a single pass. Since there is no ChildId, the data can’t become inconsistent.

Id            int 
ParentId      int (FK to Id) 
Description   nvarchar(30)

We retrieving data to recreate the LinkedList in code, we can get both parent and child ids by linking the LinkedList table to itself.

SELECT parentList.Id, parentList.ParentId, childList.Id AS ChildId
FROM LinkedList parentList
LEFT JOIN LinkedList childList ON parentList.Id = childList.ParentId

It’s always a neat experience when an elegant solution comes out of the blue to solve a complex problem. I’m amazed at how often walking away and letting the subconscience mind work will lead to a better solution than when it is being actively developed. Seems like a good New Year’s resolution will be to walk more. Leads to a healthier me, and healthier code.

Good luck and code safe!

MW

November 10, 2007

TransactionScope and Unit Tests

Filed under: Uncategorized — dotnetthoughts @ 6:29 am
Tags: , , , , ,

Writing unit tests to validate a data access layer (DAL) can be a time consuming (but life saving) task. One of the biggest challanges of DAL unit tests is assuring that you have consistent data to pull from the database. Dedicating a database with static data for unit tests doesn’t always work. As unit tests are added to the project, data may need to be added to the database. This can cause previously created tests to fail, and a lot of time can be lost trying to resync everything.

One technique for getting around the unit test database consistency issue is to write data to the database you expect to find. The steps for doing this would be:

  • Begin a transaction in your unit test.
  • Write the data to the database that your data access layer will need.
  • Test the data access layer functionality against the inserted data.
  • Roll back the transaction.

Using a transaction has a couple of distinct advantages. Since you are running in the scope of an uncommitted transaction, your fellow developers running unit tests will not see your added data (isolation). Also, rolling back the transaction places the database back to the original state in which you found it.

Managing transactions by explicitly attaching them to the connection object doesn’t always work well when testing a data access layer. Since the DAL often contains code to retrieve its own connection, the following sequence often occurs:

  • Begin a transaction in your unit test.
  • Write the data to the database that your data access layer will need.
  • Call the Data Access Layer. The Data Access layer creates its own connection.
  • The DAL attempts to read the new data from the database. It is blocked, though, because it is in a different transaction than the unit test transaction, and will not be able to complete until the DAL commits.

The TransactionScope object helps alleviate this problem. It sits on top of the Distributed Transaction Coordinator, and will assign any code running within its context to the same transaction. In other words, a TransactionScope context within your unit tests will force your data access layer code to run within the same context. Isolation is maintained from other developers, but your DAL can access and manipulate the data as needed. (This does require that you have the Distributed Transaction Coordinator Service running on the box that handles the transaction.)

To demonstrate, let’s assume that I have a DAL method that I want to test that returns all users from a database. This method gets its own connection, retrieves the users, and returns them as a DataSet.

public DataSet GetUsers() 
{ 
    DataSet dataSet = new DataSet(); 
    string connectionString = GetConnectionString();      

    using (SqlConnection connection = new SqlConnection(connectionString)) 
    { 
        connection.Open(); 
        string sql = "SELECT * FROM [User]"; 
        SqlCommand command = new SqlCommand(sql, connection); 
        SqlDataAdapter adapter = new SqlDataAdapter(command); 
        adapter.Fill(dataSet); 
    }      

    return dataSet; 
}

Here is the unit test to test this code. It doesn’t test nearly all of the functionality you would want to check in a real unit test, but it does demonstrate the TransactionScope. Note that the TransactionScope object doesn’t have an explicit RollBack() method. The Rollback occurs if the TransactionScope object is disposed without Complete() being called on it. This occurs at the end of the using block.

[TestMethod()] 
public void GetUsersTest() 
{ 
    string connectionString = GetConnectionString();     

    using (TransactionScope ts = new TransactionScope()) 
    { 
        using (SqlConnection connection = 
            new SqlConnection(connectionString)) 
        { 
            connection.Open(); 
            DataLayer dataAccessLayer = new DataLayer();     

            DataSet dataSet = dataAccessLayer.GetUsers(); 
            AddNewUser("Fred", connection);     

            dataSet = dataAccessLayer.GetUsers(); 
            DataRow[] dr = dataSet.Tables[0].Select("[UserName] = 'Fred'"); 
            Assert.AreEqual(1, dr.Length); 
        } 
    } 
}

Hope this is helpful. Good luck and code safe!

MW

Blog at WordPress.com.