This blog has moved to Medium

Subscribe via email


Posts tagged ‘Bugs’

Ouch, System process is locking my folders!

A devilish problem I’ve had in the past week is the the Windows “System” process was locking a folder or two. This was on a build agent, and was causing problems deleting that folder as part of the build. This was all happening while I was introducing a new build type to our collection – automated system tests using Selenium – and I initially thought some voodoo in Selenium was causing the problem.

LockHunter (a 64-bit Unlocker equivalent) wasn’t able to unlock the folder (though it did well on folders locked by other processes). The problem seemed to go away after a reboot of the build agent, but then came back to haunt me. I tried asking this on SuperUser (StackOverflow’s colorful twin), but to no avail.

Well, last night I finally found the problem – my own computer had an open explorer window on one of the folders in the build server. I believe this window persisted even after an agent reboot. Closing it solved the problem, and achieved world peace.

On the evils of yield

I absolutely love yield. Don’t we all? It simplifies writing enumerations to the point of absurdity. Simplicity, however, is a double-edged sword – I spent the better part of this day debugging a most evil bug, that resulted from over-yielding.

At Delver (now Sears) we have a file-based repository containing millions of items. We try to make things as efficient as possible, and sometime we overdo it. Our sin for today is using IEnumerable a bit too much. This repository was designed to be:

  1. Scalable (within our constraints) – should able to hold several million tasks
  2. Fast
  3. Relatively convenient to use – the user should be able to iterate on it using foreach, for one.

To accomplish 1 and 2, we avoided allocating large in-memory structures because they wouldn’t be able to hold the amount of items we’re talking about. To provide a convenient interface, we used IEnumerable.

Here is a mock-up of the code (for simplicity, it doesn’t use the disk but an in-memory serialized dictionary):

public class PeopleRepository
{
    private readonly Dictionary _serializedPeople = new Dictionary();
 
    public void Save(Person person)
    {
        // this method, as innocent as it looks, make it more difficult to discover the bug. See ahead.
        Save(new[]{person});
    }
 
    public void Save(IEnumerable<Person> people)
    {
        var serializedPeople = from p in people select new {p.ID, p.Serialized};
        foreach (var p in serializedPeople)
            _serializedPeople[p.ID] = p.Serialized;
    }
 
    public IEnumerable<Person> Read(Predicate<Person> predicate)
    {
        foreach (int id in _serializedPeople.Keys)
        {
            var person = new Person(_serializedPeople[id]);
            if (predicate(person))
                yield return person;
        }
    }
}

The bug I tracked was that updates to the repository were not taking place, but instead were simply ignored. The first thing I tried, was writing a simple test:

// Setup
var repository = new PeopleRepository();
repository.Save(new Person(1, " John")); // oops, I put an extra space here
 
// Find &amp; Fix John
var john = repository.Read(p => p.ID == 1).First();
john.Name = john.Name.Trim();
 
// Fix poor John back to the repository
repository.Save(john);
 
// Make sure john is saved properly
john = repository.Read(p => p.ID == 1).First();
if (john.Name != "John")
    throw new Exception();

Sadly, this test passed with flying colors. More debugging revealed the problem happened because both our Read and Save methods returned IEnumerables. It appears that Read() read the items and made the required updates … but … when writing the items back to the repository, it iterated on them.

Let me repeat – we read some items, iterated on them and modified some, saved and thus reiterated. Bam!

Bam

The second iteration didn’t iterate on the modified items – because the internal implementation of Read used a yield statement, there was no actual collection returned. So the second iteration just caused the repository to re-read all the items from disk, and ignore the modified items.

Conclusion: whenever you see methods that return IEnumerables, be suspicious. Odds are it should return a List or Collection. And whatever you do, watch out from feeding that IEnumerable back to the same repository.

Here is a final test that almost reproduces the problem. It crashes with a CollectionWasModified exception, while my actual test & code just silently failed (because the repository I mocked up here doesn’t save to the disk, but rather keep everything in-memory).

// Setup
var repository = new PeopleRepository();
repository.Save(new Person(1, " John")); // oops, I saved a space in front of John
 
// Let's read and fix all people starting with a space
var people = repository.Read(p => p.Name.StartsWith(" "));
foreach (var person in people)
    person.Name = person.Name.Trim();
 
// store the modified points back
repository.Save(people);