This blog has moved to Medium

Subscribe via email


Archive for January 2009

Never rely on tail recursion elimination

Corollary: Never write a recursively method with an unbounded stack depth.

I previously posted a recursive implementation of Factorial, and asked the readers “not to bother saying it’s inefficient because of recursion, since it will probably be optimized away anyway”. Not true.

I don’t really understand why (just posted a question to StackOverflow), but C# does not generally perform tail recursion elimination (at least as my measly test show).

The following method can be easily optimized, both to work faster, consume less space and not to cause a StackOverflowException … but it is not optimized by the compiler (verified by Reflector and the fact a StackOverflowException was indeed thrown).

private static void Foo(int i)
{
    if (i == 1000000)
        return;
 
    if (i % 100 == 0)
        Console.WriteLine(i);
 
    Foo(i+1);
}

So … if for some reason you prefer writing recursive methods over using loops, do it only where performance is not a consideration and you have a good understanding of the maximal stack depth.

Playing with a few ReaderWriterLocks in .Net

With Delver‘s future clouded by a big question mark, I’m looking for a new job. This reminded me of my previous job hunt, and the part we all “love” about it – job interviews.

Last year, two companies I was interviewing for asked me to implement a ReaderWriterLock. It took me more time than I’d hoped, but I got the implementation done. Since this was one of the hardest interview questions (for me at least), I’ve decided to implement one or two versions, and to test the performance vs the built-in reader writer locks available in .NET.

First, I wrote the test code, which is roughly composed of:

  1. Shared (singleton) counters object with two counters, X & Y.
  2. Some reader threads. When a reader reads, it reads both X & Y and makes sure they are equal.
  3. Some writer threads, that increment X & Y together (thus maintaining X == Y).

Then, I tested this setup without any locks, and as expected the reader threads got different values of X & Y.

Next, I implemented the following locks:

  1. DummyReaderWriterLock – a horribly inefficient implementation that is reader/writer-oblivious. It just locks the object regardless of reads/writes.
  2. SemaphoreReaderWriterLock – already a huge improvment, this locks uses a semphore that holds some large number (~2000) locks. Each reader requests only a single lock from the semaphore, and a writer must obtain ALL locks before continuing. Two writers are prevented from deadlocking by a separate Mutex. One immediate problem with this lock is writer starvation. For a writer to obtain the lock, it must first get the mutex and then all locks in the semaphore. This means a single writer is competing against all readers for the semaphore.
  3. EventReaderWriterLock – implemented by two events and a reader count. The first reader & all writers must get signaled in order to get the lock. Once any reader got the lock, other readers are free to enter without blocking. The last reader out is responsible for signaling the event and letting other writers or readers back in.

I also pulled my team leader Oren into this problem, and he came up with a state based implementation – not that far from my own, with an additional “state” integer that represents whether the lock is in “writing mode”, “reading mode” or free. He also added a few TestAndCompare hacks for checking the state.

Finally, I tested the performance of these locks vs two locks that are availble in .NET 3.5: ReaderWriterLock and ReaderWriterLockSlim. I only tested only one scenario, but was pleasantly surprised to discover the performance of both my EventReaderWriterLock and Oren’s StateReaderWriterLock were identical to that of ReaderWriterLockSlim, and better than ReaderWriterLock! (The performance of DummyReaderWriterLock were literally off the charts).

Here is my implementation:

    /// 
    /// A ReaderWriter lock implemented by events.
    ///
 
    /// An AutoReset event that gives ownership of the lock either to one writer or to all the readers.
    /// A manual reset event that allows readers to enter, and is only reset when the last reader finishes
    /// 
    /// 
    public class EventReaderWriterLock : IDisposable
    {
        /// 
        /// Both readers and writer need this lock to work
        /// 
        private readonly AutoResetEvent _lockAvailble = new AutoResetEvent(true);
 
        /// 
        /// Further readers beyond the first one wait on this object
        /// 
        private readonly ManualResetEvent _canReadEvent = new ManualResetEvent(false);
 
        /// 
        /// Used to synch the reader lock/release
        /// 
        private readonly object _readerLock = new object();
 
        private int _readers;
 
        public void LockReader()
        {
            lock (_readerLock)
            {
                int oldReaders = _readers++;
                if (oldReaders == 0)
                {
                    // I'm the first reader, let's fight for the lock with the writers
                    _lockAvailble.WaitOne();
 
                    // got lock, notify all other readers they can read
                    _canReadEvent.Set();
                }
                else
                {
                    // wait for the first reader to signal me
                    _canReadEvent.WaitOne();
                }
            }
        }
 
        public void ReleaseReader()
        {
            lock (_readerLock)
            {
                int oldReaders = _readers--;
                if (oldReaders != 1)
                {
                    // If I'm not the last reader, I do nothing here
                    return;
                }
 
                // I'm the last, let's forbid other readers but allow writers or a first reader.
                _canReadEvent.Reset();
                _lockAvailble.Set();
            }
        }
 
        public void LockWriter()
        {
            _lockAvailble.WaitOne();
        }
 
        public void ReleaseWriter()
        {
            _lockAvailble.Set();
        }
 
        public void Dispose()
        {
            _lockAvailble.Close();
            _canReadEvent.Close();
        }
    }

And the entire zip with the other RWLocks and test harness.

A few immediate conclusions:

  1. Writing a working reader writer lock is a very non-trivial problem for a job interview – but watching an applicant struggle with it can give you insights about his know-how around multi-threaded code.
  2. Writing an all-purpose RWLock seems like a daunting task. In my exercise I specifically avoided broad considerations such as fairness & readers upgrading to writers. Testing it for all end cases seems almost impossible (though some formal theoretical tools exists for such correction proofs)
  3. At least for some problems, writing your own lean solution can be better (performance wise) than relying on the de facto standard. While our solutions weren’t better than ReaderWriterLockSlim, they were significantly better than ReaderWriterLock – again, only in the context of this test harness.

Bonus: my first implementation didn’t have a lock statement in LockReader() and ReleaseReader(), and used Interlocked.Increment() and Decrement() to update the _readers variable. Still, it contained a hidden bug – can you find it, and understand why the lock is necessary?

Programming – The art of giving percise instructions

I have a friend that likes riddling me riddles, kind of like a sphinx.
The last riddle, out of a computer science course, asked to solve programatically the following problem:

You are in a lattice, at some position (x,y), with x>=0 & y>=0. How many different routes do you have for reaching the origin (0,0), where in each step you either go one unit down, or go one unit left?

I found three solutions, each better in time complexity than the previous one:

Solution 1 – Just translate the question

As the title of this post claims, a great deal of the magic of programming is knowing how to translate problems from human language to machine language. The first algorithm should be straight forward – just do as a human would:

public long CountWays1(int x, int y)
{
  if (x == 0 || y ==0)
  {
    // you've reached the edge of the positive quadrant
    // only one way from here - either straight down or straight left
    return 1;
  }
 
  // else, we're somewhere with x>=1 and y>=1.
  return
    // Either you go down now
    CountWays1(x, y-1) +
    // Or you go left
    CountWays1(x-1, y);
}

It’s easy to verify this solves the problem. No magic involved, just take the literal description of the problem and turn it into a language a computer will understand.

Solution 2 – Add a little cache

A quick complexity analysis will reveal actually running this program on not too large inputs will never complete. The solution is hyper-exponential (grows faster than 2^(x+y)). This is because the same sub problem is computed over and over again. When calculating CountWays1(40,40), you will calculate CountWays1(20,20) many times over again (I’m ignoring integer overflow for the sake of discussion). This can be solved by adding a cache, or memory, to our solution:

public long CountWays2(int x, int y)
{
  long[,] solutions = new long[x,y];
  return CountWays2Impl(x, y, solutions);
}
 
private long CountWays2Impl(int x, int y, long[,] solutions)
{
  if (x == 0 || y ==0)
  {
    // you've reached the edge of the positive quadrant
    // only one way from here - either straight down or straight left
    return 1;
  }
 
  // !!! Change from CountWays1 !!!
  if (solutions[x,y] != 0)
  {
    // we already know the solution for (x,y), let's not recalculate it
    return solutions[x,y];
  }
 
  // else, we're somewhere with x>=1 and y>=1, and you don't know the answer yet
  long solution =
    // Either you go down now
    CountWays1(x, y-1, solutions) +
    // Or you go left
    CountWays1(x-1, y, solutions);
 
  // !!! Change from CountWays1 !!!
  // We found the solution for (x,y), let's store the result in the array to avoid recalculations
  solutions[x,y] = solution;
  return solution;
}

In fact, this solution is so simple it can be “coded” in an Excel sheet. Just put 1 in the first row & column, put “=A2+B1” in the (2,2) cell, and copy this formula to all other cells.

The time and space complexity of this solution is O(x*y) – the size of our matrix, where filling every cell in it takes a constant amount of steps.

Solution 3 – Remember Pascal

After playing with Excel and actually calculated some values, I remembered this problem is actually equivalent to Pascal’s Triangle



An association every programmer should have for Pascal’s Triangle is combinatorics. If you think about the problem in terms of combinatorics instead computer science, you’ll see the solution is actually the total number of ways one can order a certain sequence of actions of length X+Y:

In X+Y moves, you will reach the origin (unless you step out of the positive quadrant). Every move is either Down or Left, so to get the number of ways you can calculate the total number of different orders of length X+Y, where every item is either Down or Left, and you have exactly X number of Lefts and Y number of Downs.

The number of ways to order a sequence of length X+Y is (X+Y)! Then, we divide by the number of ways to order the Left actions (X!), and by the number of ways to order the Down actions (Y!). This is because our initial number (X+Y)! counted the sequence “Left, Left, Down” twice.

So the most efficient way to answer the riddle is simply this:

public long CountWays3(int x, int y)
{
  return Factorial(x+y)/Factorial(x)/Factorial(y);
}
 
private long Factorial(int x)
{
  if (x <= 1)
    return 1;
 
  return x*Factorial(x-1);
}

(Note this is Tail Recursion, so don’t give me arguments about inefficient recursion – this can be optimized into a loop by the compiler).

Another draft walkthough, Grixis

Find it here.

Reading enviornment variables from external processes in C#

I was trying to discern which of 3 java processes is our Tomcat process, to kill rouge Tomcats in our unit tests. We have code that uses Process.GetProcessesByName(), and checks the returned StartInfo.

Apparently, Process.GetProcess() returns empty StartInfo.

Digging around, I failed to find a ready-out-of-the-box way to do this. This StackOverflow article pointed me in the right direction of this CodeProject page.

My final result includes:

  1. A small utility that reads either all enviornment variables from some process, or a specific one.
  2. A small C# wrapper

Here is the usage:

private static string GetJavaWorkingDir(int pid)
{
  string processEnvReader = @"Scripts\ReadProcessEnv\bin\ReadProcessEnv.exe";
  ProcessEnvReader reader = new ProcessEnvReader(processEnvReader);
  return reader.Read(pid, "catalina_base");
}

Another draft walkthrough, 4-color Alara

Check it out here.