This blog has moved to Medium

Subscribe via email


Posts tagged ‘Refactoring’

Javascript refactoring is hard

I’ve been refactoring code all my professional career. I started from C/C++, and when I hit C# (Resharper) and Java (IntelliJ), my productivity at refactoring was boosted by a few factors by the wonderful IDEs and refactoring tools that these languages have.

I am rather confident when I write “dirty code” in Java or C#, because I know that I can swiftly refactor it into beautiful code without too much trouble. Both aforementioned IDEs are so great at this, that it’s painless. You can take an ugly 300 line method and break it into several methods, break a long class into several classes, inline, move, and otherwise shape your code.

Enters javascript and web development.

Last year I’ve made the plunge and officially started working on front end web development, first at Google, and recently at Commerce Sciences. And I’ve recently discovered that … refactoring javascript and frontend code is hard.

The language is dynamically typed

Compilers and IDEs can help you less when the language is dynamic. They can make less assurances about your code, making reasoning and refactoring harder. Safely moving a method from class Foo to class Baz is an easy feat in a statically typed language, and a difficult or impossible one in a dynamic language.

Classes are not first-class citizens

While you can do OOP in javascript, classes are not first class citizens, but rather they’re implemented using functions, objects and prototypes. Automatically reasoning about “classes” without having an equivalent of keyword class is difficult.

Your code is not just Javascript, it’s HTML & CSS as well

Web development is not just about javascript, it’s a combination of JS, HTML & CSS (not to mention other potential technologies such as LESS/SCSS, HAML, JSON, and whatever language your backend is written in). Unless you’re already a web development ace and perfectly design your codebase … you will get these mixed up. Refactoring is about changing the design of your program, and when that design is split up between three or four different technology domains, design mistakes are harder to rectify.

You can’t (at least not yet) do an automatic refactoring to “move inline css into an external css file”, or to “convert static html snippet into a javascript DOM manipulation method”. The makers of IDEs and refactoring tools don’t have javascript completely figured out yet, no wonder they haven’t gotten around to building cross-domain refactoring!

Again, if you “know what you’re doing”, you can structure your program perfectly in the first place, and won’t ever have to do this kind of “cross domain refactoring”. Sadly, we’re not all born with this kind of experience.

It’s harder to know when you’ve broken something

Backend code is so much easier to test than frontend code. While frontend testing is important, and growing, it is by no means as well understood or practiced as classical “backend TDD, this is a calculator, assertEquals(30, calc.plus(10, 20))”.

So, since BE code has much wider test coverage than FE code, and is bloody compiled for Java & C#, when you refactor FE code it has a much greater chance of breaking down, often in subtle ways you’ll only notice on IE 8, or when the network is slow, or whatever edge case surfaces your particular bug.

 

 

So … how do we manage?

  • Refactor less – Since we know refactoring is hard, we try to do less refactoring and more pre-factoring. Think a little more than usual before coding. While I’ve grown the habit of “code first, think about design later” over the years due to power of refactoring, it’s less useful in FE dev, so I need to take the time before coding and try harder to get it the design right on my first attempt.
  • Still .. refactor – IntelliJ and Resharper both offer some refactoring capabilities. I’m most comfortable with “limited scope refactoring” – those that affect one function like Extract Variable. Use the tools you do have, instead of whining about the tools you don’t.
  • Try to think harder about the different problem domains (JS/HTML/CSS), and to develop a better understanding of how to structure your program in a way that won’t force you do to refactoring across problem domains.

Please do share your own experience with refactoring in web dev!

The shorter the better

I wrote in a previous post about a few refactorings meant to eliminate code duplication. I was reminded recently this principle has a name – DRY, which stands for Don’t Repeat Yourself, and should be applied everywhere.

Eliminating duplication, while a noble task, is not the only refactoring one should practice and apply. Breaking up large pieces of code into exceedingly smaller pieces is also important. It makes your code more readable to yourself and to other developers, and also make merges much easier to accomplish – nothing is more terrible than looking at a huge merge of a huge method or class and having to guess what combination of the versions is the correct one.

Here are a couple of important tips/techniques to make your code more manageable:

Tip I – Group expression into methods

It doesn’t matter if it’s code you wrote or stumbled on. Whenever you get the chance, select a few statements in a long method, use Resharper’s Extract Method refactoring, invent some name to describe what this group of statements does, and voila – you’ve shortened the original method. It’s easier to understand and maintain, and now the new method can be called and tested on its own. This technique can turn huge 200-line monster methods into responsible, comprehensible 15-liners.

Ideally you’d want to use this refactoring on a bunch of statements that actually have a logical cohesive meaning together. Even if that’s not the case, usually you’ll be better off with the extracted method. One notable exception – when the set of statements you’d refactor will lead to a method will multiple out parameters. Then you’d still have to declare these variables in the calling code, and your gain greatly diminishes.

Tip II – Group methods into classes

You’ve all seen the huge classes with dozens of methods. Hell, Tip 1 above is all about creating even more methods! Well, once you feel a class has grown too much, you should try to spot groups of methods that have related functionality. Perhaps most methods in the group are used only by a single method, but nowhere else. In this case, if you move all these methods away to a new class and make them private members of that class, their absense will certainly not be missed in the original class – it only uses the one method anyway (be sure to keep it public, of course). Like in the previous tip, you want to try and find groups of methods that have a common logical function in order to use this refactoring. However, even if you can’t find a common function to these methods, they will still be better off in another class.

While Resharper does have a Move Method refactoring, in some cases it’s better to just cut and paste, especially when you move a bunch of methods together, with the data members they use and all.

Tip III – Reduce nesting level

This will not shrink your methods by a lot, but it will make them more managable. Nesting level implies context, which you have to maintain in your head when you are processing code. When is this code called? Only if 5 different if statements happen to succeed. It’s painful on the eye.

A useful refactoring for this is Resharper’s “Invert If”. It takes a convoluted piece of code such as

public void ProcessUserInput(string name)
{
   if (name != null)
   {
      int id = FindIdByName(name);
      if (id != 0)
      {
         if (TryToStoreName(name, id))
         {
            Console.WriteLine("Stored {0}, {1}", name, id);
         }
      }
   }
}

And beautifies it into

public void ProcessUserInput(string name)
{
   if (name == null) return;
 
   int id = FindIdByName(name);
   if (id == 0) return;
 
   if (TryToStoreName(name, id))
   {
      Console.WriteLine("Stored {0}, {1}", name, id);
   }
}

Much easier to follow (the difference is highly evident in methods with 5 nesting levels or more – these acutely need this refactoring).

11 Tips for Beginner C# Developers

Today I sat with a friend (let’s call him Joe), who just switched from a job in QA to programming, and passed on to him some of the little tips and tricks I learned over the years. I’m sharing it here because I thought it could be useful to other people that are new to programming. The focus of this post is C#, but analogous tools and methods exist for other languages of course.

Refactorings (A.K.A Resharper)

(This is just a short introduction. I recommend the book Refactoring: Improving the Design of Existing Code as further reading)

As I’ve written here before, I just love Resharper. It is the best refactoring tool for C# I know of (even though it’s a bit heavy sometimes – make sure to get enough RAM and a strong CPU). Joe asked me about a C# feature called partial classes. He said his class was just too big (4000 lines) and becoming unmanageable, and he wanted some way to break it to smaller pieces. He also said that at his workplace, they lock the file whenever anyone edits it, because it’s very hard to avoid merge conflicts on such huge files.

I was happy he wanted to simplify and clarify his code by splitting the file, but the way he thought of doing this was the wrong way.

Tip I: Single Responsibility and Information Hiding

You should strive to minimize the information you require at place in your program. Joe had dozens of unit tests, which all derived from a single base class that contain methods required for all the tests. In a second look, we saw that some of the methods were in fact only needed by some subset of the tests, that were actually logically close.

To solve Joe’s problem, we created an intermediate class, which inherited from the common test base class, and changed these classes to inherit from the intermediate class. We then used Resharper’s Push Down Member refactoring, which removed the methods from the test base class into the intermediate class. No functionality was changed, but we removed code from the huge 4000 lines class! By continuing this process, we can break down the huge class into separate related classes with Single Responsibilities, and no class would have access to uneeded information (like methods it doesn’t care about).

Tip II – Duplicate Code Elimination

I believe the world of software would be a better place if people were not allowed to Copy-Paste code more than a few times a day. Many coders abuse this convenient shortcut and thus create unmaintainable code. The effective counter-measure to the Copy-Paste plague is elimination of duplicate code.

I saw in Joe’s long file code that looked similar to this:

alice = Configuration.Instance.GetUsers(“alice”);
bob = Configuration.Instance.GetUsers(“bob”);
charlie = Configuration.Instance.GetUsers(“charlie”);
diedre = Configuration.Instance.GetUsers(“diedre”);

Even though this is a rather simple example of code duplication, I strongly believe even such minor infractions should be dealt with. Every line of the above code snippet knows how to obtain users from the configuration. This knowledge has to be read and maintained by developers. Instead, why not get all the “user getting” code into one place and let us simply write what we want to do, instead of how?

To solve this, I use one of these two techniques:

Extract Method, Tiger Style

  1. Choose a single instance of duplication, and locate any parameters or code that is not the same among all the instances of the duplicated code. In our examples, the username (and the assignment variable) are the only two different things between the four lines of code.
  2. For every such parameter, use Introduce Variable. The end result of this phase should look something like this:
    string username = “alice”;
    alice = Configuration.Instance.GetUsers(username);
    bob = Configuration.Instance.GetUsers(“bob”);
    charlie = Configuration.Instance.GetUsers(“charlie”);
    diedre = Configuration.Instance.GetUsers(“diedre”);

  3. Now, use Extract Method on this code (the first line in our example). This creates a new method that I would call GetUsers, that simply gets a string argument username and reads it from the configuration.
  4. Perform Inline Variable on the variable you created in step 1.
  5. Now, change all the other instances to use this new method and delete the redundant code.

The end result looks like this:

alice = GetUsers(“alice”);
bob = GetUsers(“bob”);
charlie = GetUsers(“charlie”);
diedre = GetUsers(“diedre”);

Another way to achieve the same refactoring is Crane Style (I’m just enjoying using kung-fu styles here because I saw Kung-Fu Panda not too long ago :). You can use immediately without creating the temporary user, but then you get a method that specifically returns the username for “alice”, which is not what you wanted. Nevertheless, this method can be refactoring by applying Extract Parameter on “alice”, netting us the same result.

Of course these two examples do not begin to cover the myriad of ways you can and should refactor the code. What’s important is that you always keep an eye out on how you can make your code more concise, which in turns leads to readability and maintainability.

Tip III: Code Cleanup

Resharper sprinkles colors to the right of your currently open file. Every such colored line is either a compilation error (for red lines), or a cleanup suggestion. Go over such suggestions and hear what Resharper has to say (in the example below it seems nobody is using the var xasdf, so a quick alt-Enter while standing on it will remove it).

Another thing which you should do is define and run FXCop
rules to perform a deeper analysis on your entire project/solution and spot potential problems.

Source Control

Tip IV – Use Source Control

I almost left this out as this goes without saying, but properly using source control can probably save you more time and money than all the other tips (or cost you if you don’t use it). The best source control tool I know of for Visual Studio is of course Team Foundation Server, as it has the best integration with the IDE. Other tools are possible, but you have to have a good reason for choosing something other than TFS (One good reason might be cross-platform development and the desire to keep all your code base in a single repository).

Automatic Unit Tests

At Delver, we use NUnit to write unit tests. In past projects I’ve used Visual Studio’s built in test tool, but I found NUn
it to be slightly better, mainly due to the integration with Resharper. Resharper adds a small green button next to every test, and allows you to run your test directly from there instead of looking for the “Test View” window. Tomer told me just last week that he uses a keyboard shortcut for running tests, but for me, this is one shortcut I don’t think I’ll bother learning (the brain can only hole so much).
Another benefit of NUnit is that it runs the test suite in place in your current source folder, instead of copying everything aside to a separate folder like Visual Studio’s tool (this used to takes me gigs of space of old unit test sessions which were rarely if ever used).

Tip V – Write Autonomous Unit Tests

Back to Joe, he has a few tests that don’t work right out of the box. Before running tests, he has to manually run a separate application used by his test suite. As his project contains hundreds of tests, I would love seeing this added as an automatic procedure in his TestInit() method. Automating a manual operation, besides saving time for developers, enables you to:

Tip VI – Use Continuous Integration

Tests that nobody runs are no good. Tests that are run once when written and then forgotten are only slightly better. By the same logic, tests that run all the time are the best. Pick and use a Build Automation System. I wrote before about our chosen solution, TeamCity, and to sum it up – we’re extremely happy about it. TeamCity runs our tests on every commit, on multiple configurations and build agents, and helps us detect bugs faster.

Know thy IDE

Visual Studio is one powerful tool (not belittling java IDEs like IntelliJ and eclipse which in many cases are better). Learn how to use it’s features to your advantage:

Tip VII – Edit And Continue

Suppose that while debugging, you found a bug. You can edit the code, save, and continue the debug session without losing the precious time you took getting to this point.

Tip VIII – Move Execution Location

See the little yellow marker that signifies the current location inside the program being debugged? This arrow can be moved! It took me quite a while to discover this (actually heard about it from Sagie), but if you take and drag this arrow, Visual Studio will rewind or “fast forward” your execution to the desired point. None of the code gets executed, you just skip to where you want to go. Excellent for going back after executing a critical method, and rerunning it as many times as you wish.

Tip IX– Use Conditional Breakpoints

Don’t waste your time waiting for some specific value to appear in the watch for an interesting variable – set your breakpoints to stop only when the desired conditions are met (right-click on the red breakpoint circle and choose “Condition”).

Tip X – Attach to Process

Got a bug that only happens on production machines? You can attach your IDE to any running process (preferably one that is compiled in debug mode), and debug away. You can also programmatically cause a debugger to attach using System.Diagnostics.Debugger.Lau

Tip XI – Use The Immediate Window

The Immediate window is a great tool for executing short code snippets solely for debugging. You can place a breakpoint inside a method you wish to debug, and then call the method directly from the immediate window (saves you from doing this through Edit And Continue)

My Arrogance in Finding Bugs

Last night I discovered this piece of code (simplified version):



private void Foo()
{
bool b = false;
new Thread((ThreadStart)delegate { b = true;}).Start();
WaitForBool(b);
}

private void WaitForBool(bool b)
{
while (!b)
{
Thread.Sleep(1000);
}
}

I was immediately filled with disgust. How could someone write such a function (WaitForBool), which is one big bug? Of course the waiting will never be over, because bool is a value type, and no external influence can modify its value.

Later, I realized the fool is me.

I ran into this code a while back, and it did have a “ref” bool, which means it can be modified by the external thread. Resharper helpfully displayed a tip “this parameter can be declared as a value” , which I took without thinking too much about the consequences (I trust Resharper too much sometimes, it appears). So I deleted the “ref” with Resharper’s help and created the bug myself without noticing.

(As a side note, of course this method of waiting for a bool should never be used – use ManualResetEvent instead).

Update
Fixed in the next Resharper build (Resharper 4.0.1, build 913), within a few hours of reporting it. Cool.