Dot Net Thoughts

January 1, 2008

Persisting the Doubly Linked List

Filed under: Misc Thoughts — dotnetthoughts @ 9:54 am

Recently, I discovered that I needed a doubly linked list to chain objects together in our code. .Net has made this an incredibly easy process, as it provides a generic LinkedList class which manages the creation of the list, as well as the insertion and deletion of nodes.

Our project also requires that we persist our linked list to a database. The task seemed easy enough. All that needed to be done was to create a table containing our data, along with pointers to the previous and next nodes. In other words, our initial table structure looked like this:

Id            int 
ParentId      int (FK to Id) 
ChildId       int (FK to Id) 
Description   nvarchar(30)

This structure seems to work on the surface, but we very quickly realized two critical problems.

The first problem is that inserting the data into this data structure requires two passes. On the first pass, we insert all of the records into the database. Only after all of the records have been inserted can we assign links to both the parent and child records in the ParentId and ChildId columns.

foreach (var link in theChain)
{
   // First pass: insert the record.
}

foreach (var link in theChain)
{
   // Second pass: update the record with the parent and child pointers.
}

The second problem is that the data can fall out of sync with itself. For example, what happens if the data ends up looking like this because of some misbehaving code? Id 1 believes that its child should be Id 2, but Id 2 believes that it is the top of its own chain.

Id: 1   ParentId: null   ChildId: 2 
Id: 2   ParentId: null   ChildId: 3

Both of these problems can be solved by treating the doubly linked list as a singly linked list in the database. If you have the links of a chain going in one direction, you should be able to determine the links going the other way. We initially avoided this option, because we thought the query to retrieve the data would be extremely complex. (Query the parent with a union of the child, maybe into a temporary table. Ugh.)

While on a walk yesterday, though, I came up with the idea of simply writing a query with an additional join that would return the data with the links in both directions. Our database would no longer need the ChildId column. If we order our data so that parents always fall above their children (the natural state of a linked list), we can insert all of this data in a single pass. Since there is no ChildId, the data can’t become inconsistent.

Id            int 
ParentId      int (FK to Id) 
Description   nvarchar(30)
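As a sketch of the single-pass insert against this table (the table and column names are the ones above; the ADO.Net calls are standard SqlCommand usage, and SCOPE_IDENTITY() is assumed to return the identity value of the row just inserted):

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Walks the in-memory chain from head to tail, inserting each node and
// capturing its generated Id so it can serve as the next row's ParentId.
private static void PersistChain(LinkedList<string> theChain, SqlConnection conn)
{
    object parentId = DBNull.Value;  // the head of the chain has no parent
    foreach (string description in theChain)
    {
        using (SqlCommand cmd = new SqlCommand(
            "INSERT INTO LinkedList (ParentId, Description) " +
            "VALUES (@ParentId, @Description); SELECT SCOPE_IDENTITY();", conn))
        {
            cmd.Parameters.AddWithValue("@ParentId", parentId);
            cmd.Parameters.AddWithValue("@Description", description);
            // The new row's Id becomes the ParentId of the next insert.
            parentId = cmd.ExecuteScalar();
        }
    }
}
```

Because each row is inserted after its parent, a single pass is enough, and no second update pass is needed.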

When retrieving data to recreate the LinkedList in code, we can get both parent and child ids by joining the LinkedList table to itself.

SELECT parentList.Id, parentList.ParentId, childList.Id AS ChildId
FROM LinkedList parentList
LEFT JOIN LinkedList childList ON parentList.Id = childList.ParentId
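Rebuilding the in-memory list from that result set can be done in one pass as well. This is just a sketch (the method and variable names are my own, and it assumes the rows have been read into a DataTable); note that the walk itself only needs Id and ParentId:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Rebuilds the chain by finding the head (ParentId is null) and then
// repeatedly looking up the row whose ParentId equals the current Id.
private static LinkedList<string> LoadChain(DataTable rows)
{
    Dictionary<object, DataRow> byParent = new Dictionary<object, DataRow>();
    DataRow head = null;
    foreach (DataRow row in rows.Rows)
    {
        if (row["ParentId"] == DBNull.Value)
            head = row;                       // top of the chain
        else
            byParent[row["ParentId"]] = row;  // index each row by its parent
    }

    LinkedList<string> chain = new LinkedList<string>();
    for (DataRow current = head; current != null; )
    {
        chain.AddLast((string)current["Description"]);
        DataRow next;
        byParent.TryGetValue(current["Id"], out next);  // null at the tail
        current = next;
    }
    return chain;
}
```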

It’s always a neat experience when an elegant solution comes out of the blue to solve a complex problem. I’m amazed at how often walking away and letting the subconscious mind work leads to a better solution than actively grinding on the problem does. Seems like a good New Year’s resolution will be to walk more. Leads to a healthier me, and healthier code.

Good luck and code safe!

MW


October 20, 2007

Doubles, Decimals, and Dividing by Zero

Filed under: Misc Thoughts — dotnetthoughts @ 7:02 am

“.Net Exception Divide by zero” often comes up as a search term when this blog is hit. (I use division by zero a lot when I’m writing about errors and exception handling. It’s easy to create, and it’s easy to understand.) I couldn’t quite figure out why anybody would be querying on it directly. I think I’ve figured it out, though.

Last week, I was typing up a division example for the blog. For whatever reason, I used double instead of decimal for my input parameters and return value.

   private static double Divide(double i, double j) 
      { 
         return (i / j); 
      }

When I passed in values of 5 and 0 to this method, I was expecting a divide by zero exception. Instead, my console app ran just fine and printed the word Infinity on my screen. I was totally caught off guard by this result. If you pull out an old calculus book, you’ll find that mathematicians say that the limit of 1/x, as x approaches 0 from the right, is infinity (a standard limit), but that 1/0 itself is undefined. (You can’t take one object and break it into groups of zero.)

When I run the same code above using decimals, instead of doubles, I get a divide by zero exception. This is what I would expect.
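The contrast is easy to demonstrate side by side. A minimal sketch (note that dividing by a literal 0m is a compile-time error in C#, so the zero has to go through a variable):

```csharp
using System;

private static void CompareDivision()
{
    double dZero = 0.0;
    double d = 5.0 / dZero;          // no exception for floating point
    Console.WriteLine(d);            // prints "Infinity"

    decimal mZero = 0m;
    try
    {
        decimal m = 5m / mZero;      // throws at run time
        Console.WriteLine(m);
    }
    catch (DivideByZeroException)
    {
        Console.WriteLine("Decimal division threw DivideByZeroException.");
    }
}
```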

With a little bit of thought and poking around, I think that there is a method behind the madness.

Doubles are floating-point types. These types are specifically engineered never to throw an exception on arithmetic. Instead, they return special values such as positive infinity, negative infinity, and NaN (not a number). Why?

I suspect that the reason has to do with precision. The double type supports numbers as small as ±5×10^-324 and as large as ±1.7×10^308. Its precision, however, is only 15 to 16 digits. Due to the way that floating points are handled, you can’t ever be totally sure exactly what value your Double contains for very large or small numbers. For example, the following code prints out a value of 9.99988867182683E-321 in my output window when I run it:

   private static void DoublePrecision()  
   { 
       double double1 = 1 * Math.Pow((double)10, (double)-320); 
       Trace.WriteLine("double1: " + double1.ToString()); 
   }

This loss of precision means that the runtime can’t determine whether a value truly is zero, or whether it is a very small number quite close to zero. Mathematically, (1/10^-324)^2 is 1/10^-648, which is 10^648. If that expression were evaluated in .Net, the result wouldn’t literally be infinity, but it would be greater than a double can represent. The convention is to return infinity, which also explains my initial divide by zero result.
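Because the floating-point types never throw here, calling code has to check for the special values itself. A minimal sketch (double.IsNaN and double.IsInfinity are standard members of System.Double):

```csharp
using System;

// Wraps double division and converts the floating-point special values
// back into the exception that decimal division would have thrown.
private static double SafeDivide(double i, double j)
{
    double result = i / j;
    if (double.IsNaN(result) || double.IsInfinity(result))
    {
        // 0/0 yields NaN; x/0 for nonzero x yields +/- Infinity.
        throw new DivideByZeroException("Division produced " + result);
    }
    return result;
}
```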

Decimals, on the other hand, support numbers as small as ±1×10^-28 and as large as ±7.9×10^28. A decimal’s precision is 28 to 29 significant digits. Since the precision covers the full exponential range supported, the runtime knows that the value it holds is accurate, at least to the defined precision. If it thinks it has a zero, it actually has a zero, and it can safely throw a DivideByZeroException if you try to divide by it. This accuracy is why Microsoft encourages decimals for financial and monetary values.

I’m making some educated guesses about the thinking of the writers of the IEEE 754 standard, which defines the floating-point types, but I don’t suspect I’m too far off. Let me know if you have any insights!

Good luck and code safe!

Mike

October 12, 2007

Why Language is Important (Why I prefer C#)

Filed under: Debugging,Misc Thoughts — dotnetthoughts @ 7:13 pm

How many times have you heard this statement?

 “It doesn’t really matter whether you choose Visual Basic.Net or C#. It all compiles down to the CLR, anyway.”

This statement makes me shudder. It’s at least partially true. All managed code does compile down into the common language runtime. This is what allows us to mix and match components written in different languages when building an application. What this statement doesn’t recognize is the fact that every language was created for a specific purpose.

In his book CLR via C#, Jeffrey Richter lists around two dozen different compilers he knows of that target the CLR. These include many well-known languages such as C#, J#, LISP, Perl, and Eiffel, just to name a few. Having this many different compilers would be a lot of wasted effort if each language acted exactly the same as every other language out there.

The reality is that each language has its advantages and disadvantages. C# and VB.Net are great for handling input and output. APL is optimized for mathematics and finance. Perl is a monster when it comes to string manipulation. LISP, one of the oldest programming languages, is still a language of choice for AI. C++/CLI allows both managed and unmanaged code to run within the same module. Every language is designed for a specific type of development.

I would love to learn the ins and outs of all the different languages out there. When working as a consultant, however, reality dictates that I will almost always be programming in either C# or Visual Basic.Net.

Choosing between these two isn’t always an easy decision. In many ways, I find the UI presented by the Visual Basic team to be superior to the UI offered by C#. Filtering down methods and properties to those most used, the My namespace, and better on-the-fly error detection make Visual Basic.Net a very enjoyable programming experience. When I’m in the IDE, I feel that VB wins hands down.

Ultimately, though, when I look at the intent of the entire language (and not just the UI), VB loses some of its shine. C# was designed from scratch around the .Net runtime, with a clean mapping between language features and runtime capabilities. Close alignment with the framework was its intent. VB, on the other hand, was designed to maintain market share by retaining much of the syntax of the previous version of VB, which often does not make sense in the context of the new .Net environment.

Let me give you an example. In legacy VB days, there was no such thing as structured exception handling. All errors were handled either by calling On Error Goto <label>, or by calling On Error Resume Next. On Error Goto isn’t a particularly good way to handle errors. On Error Resume Next is a disaster. Take this code, for example:

  Sub ResumeNextTest()
     On Error Resume Next
     Dim xml As XmlDocument = New XmlDocument
     xml.Load("c:\AMissingFileName.xml")
     Console.WriteLine(xml.OuterXml)
     Console.WriteLine("ResumeNext complete.")
  End Sub

If the attempt to load the XML fails, there will be no indication. You could probably infer that the XML didn’t load when the console prints nothing on the next line, but there is no explicit indication. In this example, the XML data is accessed right away. In a more complex app, though, the problem of the missing XML data might not show up for quite some time.
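For contrast, here is a sketch of the same operation using structured exception handling in C# (XmlDocument.Load throws FileNotFoundException when the file is missing):

```csharp
using System;
using System.IO;
using System.Xml;

static void StructuredTest()
{
    XmlDocument xml = new XmlDocument();
    try
    {
        xml.Load(@"c:\AMissingFileName.xml");
        Console.WriteLine(xml.OuterXml);
    }
    catch (FileNotFoundException ex)
    {
        // The failure is visible at the point it happens, rather than
        // surfacing much later as silently missing data.
        Console.WriteLine("Could not load document: " + ex.Message);
    }
    Console.WriteLine("StructuredTest complete.");
}
```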

The .Net runtime has no concept of On Error Resume Next, though. If an error is thrown, something has to happen. How does VB get around this?

To find out, I wrote a simple console app with the above method included, and ran Reflector against it to see what the VB compiler did when it converted the code to IL. (I’ve included the converted code at the bottom of this post.) Essentially, VB wraps the entire On Error Resume Next method in a try-catch block. As it progresses through the code, the CurrentStatement variable is updated to indicate the location in the code. Should an exception occur, the exception is swallowed, and execution resumes at the statement following the point where the trouble occurred.

While this is a clever way of solving the problem, it allows a very dangerous practice, one that should have been eliminated, to carry on. Furthermore, if a developer ever presented me with this kind of spaghetti code during a code review, we would have to have a serious talk.

Your choice of VB, C#, or any other compiler is up to you. Be sure you know the pros and cons of whatever language you choose before you develop with it, though.

Good luck and code safe!

 Mike

public static void ResumeNextTest() 
{ 
    // This item is obfuscated and can not be translated. 
    int VB$ResumeTarget; 
    try 
    { 
        int VB$CurrentStatement; 
    Label_0001: 
        ProjectData.ClearProjectError(); 
        int VB$ActiveHandler = -2; 
    Label_0009: 
        VB$CurrentStatement = 2; 
        XmlDocument xml = new XmlDocument(); 
    Label_0011: 
        VB$CurrentStatement = 3; 
        xml.Load(@"c:\AMissingFileName.xml"); 
        goto Label_0089; 
    Label_0024: 
        VB$ResumeTarget = 0; 
        switch ((VB$ResumeTarget + 1)) 
        { 
            case 1: 
                goto Label_0001;   

            case 2: 
                goto Label_0009;   

            case 3: 
                goto Label_0011;   

            case 4: 
                goto Label_0089;   

            default: 
                goto Label_007E; 
        } 
    Label_0044: 
        VB$ResumeTarget = VB$CurrentStatement; 
        switch (((VB$ActiveHandler > -2) ? VB$ActiveHandler : 1)) 
        { 
            case 0: 
                goto Label_007E;   

            case 1: 
                goto Label_0024; 
        } 
    } 
    catch (object obj1) when (?) 
    { 
        ProjectData.SetProjectError((Exception) obj1); 
        goto Label_0044; 
    } 
Label_007E: 
    throw ProjectData.CreateProjectError(-2146828237); 
Label_0089: 
    if (VB$ResumeTarget != 0) 
    { 
        ProjectData.ClearProjectError(); 
    } 
}

September 29, 2007

Xml Code Snippets

A couple of months ago, I was giving a presentation about Web Services at the Portland Area Code Camp. The presentation was going okay, but I had the dreaded right-after-lunch time slot. The room was a little bit sleepy, to say the least. As I assembled a class, I typed prop and pressed tab twice. Studio diligently dropped a new property skeleton into my code. Suddenly, the room woke up. Everybody wanted to know what the cool add-in was that I was using, and where they could get it. When I told them it was built into Studio, they were elated. I don’t know if any of them remembered anything I had to say about Web Services, but I do know several of them went home and learned about snippets.

Xml Code Snippets are simply keyboard shortcuts for repetitive tasks that you may need to do in the IDE. If you open up the Studio IDE and select Edit–>Intellisense–>Insert Snippet (or press <ctrl>k – <ctrl>x), a list of snippets available on the machine appears. Selecting prop, for example, from the list of available C# snippets inserts a skeleton property, complete with member variable, into the class.

[Image: snippet.JPG — the prop snippet expanded in the editor]

The property snippet highlights three values when dropped on the screen: the type of the property, the name of the property, and the name of the member variable. In the image above, myVar is highlighted. When I modify this variable name and tab off of it, the myVar in the getter and the setter is updated automatically.

One of the neat things about snippets is that you can create your own, or update the existing ones. The included snippets are defined by default in %Program Files%\Microsoft Visual Studio 8\VC#\Snippets\1033\Visual C#. Modifying a snippet in this location will change the snippet for all users on the machine. Snippets can be modified for individual users by placing them in %MyDocuments%\Visual Studio 2005\Code Snippets\Visual C#\My Code Snippets. You can also define your own locations for storing snippets in the Code Snippet Management window (<ctrl>k – <ctrl>b).

I’ve updated the property code snippet on my machine to take care of a couple of annoyances that I run into when debugging.

First of all, if I’m debugging a method in which I’m passing several properties on an object as parameters, I always have to watch the IDE go down into the getter, and step through retrieving the member variable. Microsoft has given us the ability to automatically move over these steps by adding a DebuggerStepThrough attribute to our code.

Secondly, an object’s data is exposed via the property itself. I don’t need to see the internal variables displayed when I’m using the data tips window. Once again, Microsoft will let us do that with the DebuggerBrowsable attribute.

To update the code snippet on my machine, I simply do the following.

1. Copy the property snippet (prop.snippet) from the location under program files into the location under MyDocuments.
2. Open up the snippet. I’ll see that there are two major areas to the snippet itself. The <Header> element in the Xml provides information about the snippet, and how it will be accessed. The <Snippet> element declares what values the snippet will collect, and how it will apply them.
3. Modify the CDATA section of the Code element to include my new debugger attributes. When I’m done updating, my new Code element appears as follows. (Note that I’m using the fully qualified names so that I don’t have to include System.Diagnostics in every scenario where I want to use my snippet.)


<Code Language="csharp">
<![CDATA[[System.Diagnostics.DebuggerBrowsable(System.Diagnostics.DebuggerBrowsableState.Never)] private $type$ $field$;
 public $type$ $property$
 {
  [System.Diagnostics.DebuggerStepThrough]
  get { return $field$;}
  [System.Diagnostics.DebuggerStepThrough]
  set { $field$ = value;}
 }
 $end$]]>
</Code>
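With that change in place, expanding prop produces something like the following (after renaming the placeholders; the attribute placement comes straight from the CDATA block above, and the type and names here are just examples):

```csharp
[System.Diagnostics.DebuggerBrowsable(System.Diagnostics.DebuggerBrowsableState.Never)]
private int myVar;

public int MyProperty
{
    [System.Diagnostics.DebuggerStepThrough]
    get { return myVar; }
    [System.Diagnostics.DebuggerStepThrough]
    set { myVar = value; }
}
```

The debugger now steps over the getter and setter, and the data tips window no longer shows the backing field.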

If I shut down and restart Studio at this point, my new snippet appears on the snippet menu.

Microsoft has released a lot of snippet add-ons for Visual Studio, which can be found at http://msdn2.microsoft.com/en-us/vstudio/aa718338.aspx.

Code safe!

 MW

September 9, 2007

Source Control

Filed under: Misc Thoughts — dotnetthoughts @ 10:49 pm

If I’m going to be writing a blog that involves code, tools, binaries, images, research and who-knows-what else, it’s critical that an organized process for storing and retrieving data exists. Any professional developer, while sitting at their desk at work, has some version of source control open and running all the time. Why? Because it is critical that one is able to go back to any point in history to examine what the code or documentation looked like.

So why are so many of us at home running without source control? I suspect it’s simply because it is one more application to set up and use on an already overwhelmed desktop. That seems like a fairly weak excuse, however. I can’t count the number of times I’ve wanted to see what a document looked like last week or last month, but was unable to.

So, I started searching the web for different source control options. Functionality, ease of use, and price (i.e. free) were my main requirements. A little research on the web led to a couple of open-source standards. I’ve heard of companies using both CVS and Subversion with quite a bit of success, but I ended up shying away from both of them. CVS was driven by command-line switches that I never quite got to work right. Subversion documentation seemed to imply the same type of command-line interface.

Eventually, I settled on ionForge’s Evolution. ionForge offers a free personal evaluation license for their source control with no expiration date, which is perfect for the needs of an individual developer. It also seems to have a fairly impressive set of features beyond source control, such as process workflow, strong support for different versioning of branches (development vs. production), and good security integration. The user interface also seems to be fairly intuitive for basic activities.


The one drawback that I’ve discovered so far in working with the product is a lack of documentation. Several links in the help menu don’t seem to do anything. Also, I can’t find information in the included admin or client guide about some key features (such as the MS SCC API) that I find on the web site. I suspect that I’ll be able to work through any issues that come up with time, though. The price is right.

Anyway, those are the thoughts for today. Good luck and code safe!

 Mike
