This blog has moved to Medium

Archive for May 2011

My walk through the Git book

I’ve been experimenting with git for about the last year, but most of the work I did with it so far was in the “single developer, hack some stuff, push to github” mode of operation, which is very superficial. Now that I’ll be working with it full time (git is one of the “semi wildly adopted” SCMs at Google), I thought it was time to take a closer look at some wisdom accumulated by other folks, so I finally cracked open the Git book and did a pass over it.

The book is great and usually very fluid. It begins by showcasing the simple use cases you’ll encounter with git, and is filled with short code snippets you can try (even on a train with no WiFi – this is a distributed source control system after all). Some of the examples weren’t crystal clear straight out of the box, and relied on some previous knowledge the authors had (after all, much of the book was pulled together from different sources, so I imagine it was relatively easy to accidentally assume a bit of knowledge that its readers don’t necessarily have at that point).

Here is a summary of questions I had while reading the book, followed by some cool stuff I found at the end. I recommend at least some knowledge of git for the rest of this article, best accompanied with a reading of the Git book itself. As usual, if you find a mistake, please let me know. Some more related recommended reading is the Git for beginners SO question.

What happens on double git add?

git add is used not just to add new files, but also to ‘add’ changes in existing files.

When I do:

echo v1 > foo
git add foo
echo v2 > foo
git add foo
git commit -m bar

Are both versions of foo added to the commit log, or just the latest?

The answer is that just the latest version is actually committed.
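Here is a minimal sketch of that experiment, in case you want to check it yourself. The throwaway directory and the placeholder identity are my own assumptions, added only so the snippet runs on its own:

```shell
set -e
# Throwaway repository (the temp directory and identity are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo v1 > foo
git add foo        # stages v1
echo v2 > foo
git add foo        # replaces the staged v1 with v2
git commit -q -m bar

git show HEAD:foo            # prints v2: only the last staged content was committed
git rev-list --count HEAD    # prints 1: a single commit, v1 never made it into history
```

The second git add simply overwrites what was staged for foo, so v1 never becomes part of any commit.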

After I git merge without conflicts, is a git commit needed?

Coming from svn it was my expectation that after I merge changes into my local branch, I will have to commit them. Doing a quick experiment showed that in git this is not the case at all – if a merge is resolved without manual intervention (including concurrent edits to different places of the same file), then no commit is needed. If there are any conflicts that are resolved manually (by git adding the file after fixing the merge), then a git commit is required.
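A small sketch of the conflict-free case, assuming a throwaway repository and a hypothetical branch called feature. The --no-edit flag is only there so the merge does not open an editor for the commit message:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > a
git add a
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # master or main, depending on your configuration

# Diverge: one commit on a side branch, one on the main branch.
git checkout -q -b feature
echo feature > b
git add b
git commit -q -m "feature work"
git checkout -q "$main"
echo main > c
git add c
git commit -q -m "main work"

git merge -q --no-edit feature   # no conflicts, so git records the merge commit itself
git status --porcelain           # prints nothing: there is nothing left to commit
git log --oneline --merges       # lists the merge commit git created
```

The working tree is clean right after the merge, which matches what the experiment showed: the merge commit already exists.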

How does gitk work? Sometimes I see branches, sometimes I don’t … it’s very confusing

This one has been puzzling me for quite a long time. I found that I couldn’t trust gitk, the graphical tool for visualizing commits, branches and merges, because it kept giving me inconsistent results, and for the life of me I couldn’t understand why.

Now I did a few experiments and some digging, and found that by default gitk will only show you the current branch and the commits reachable from it (its ancestors in the version graph). If you create a branch, switch back to master, and run gitk, you will not see this branch. What confused me is that upon refreshing, gitk rescans the current branch and adds any new nodes to its display, while retaining anything already shown – meaning if you run gitk, switch to a new branch, and refresh gitk, the new branch and its relation to the previous one will now be displayed in gitk.

Of course, like all things Linux, gitk can be made to behave the way you want. Just follow the gitk command with the names of the branches you want shown, or simply add “--all” to see all the branches in your repository.
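Since gitk needs a display, here is a rough text-only sketch of the same effect using git log, which takes the same kind of revision arguments. The repository and the branch name feature are made up for the illustration:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > a
git add a
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # master or main, depending on your configuration

git checkout -q -b feature
echo feature > b
git add b
git commit -q -m "feature work"
git checkout -q "$main"

git log --oneline          # only "base": the feature commit is not reachable from here
git log --oneline --all    # both commits, because --all walks every branch
# gitk --all               # the graphical equivalent (needs a display)
```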

How can you see the ‘branch structure’ of a repository?

In svn, there is a well defined directed graph between branches. When a branch is created off its parent, this parent-child relation is created and maintained, and the tools readily show you this branch graph.

I could have guessed this, but sources on Stack Overflow confirmed that there is no direct equivalent in git. Instead of branches having parent-child relations, there is a parent-child relation between commits, and so an individual commit can have multiple parents in the version graph (that is what a merge is), while other parts of the same history stay completely linear. The model is more complex, but more powerful, and it seems to be the core reason why merges in git are supposed to be easier than in svn.

What does ‘fast forward’ really mean?

Using git, I often saw messages with the words “fast forward”, but never really understood what they meant. This bit is explained rather nicely in the Git book – a fast forward happens when you merge branch b1 into b2, resolve any possible conflicts, and then merge the result back into b1. b2 already contains a version that is a descendant of the “heads” of both b1 and b2, meaning all the “merge work” was already done in it. So, when this structure is merged back into b1, all the revisions and merge work that happened on b2 become part of b1 as well, and no new merge commit is needed. The b1 branch (a pointer into the revision DAG) is simply “fast forwarded” to a descendant node that is the head of b2. In effect, the merge’s result becomes the head of b1 in a clean and simple manner.

This is radically different from svn – I still have horror flashbacks sometimes about trying to merge a branch back to trunk. I always first merged trunk to the branch, had to work my ass off to resolve all the conflicts and make the build green, and then sometimes had to do double the work when merging back to trunk. With git, you’re assured that the conflict resolution work you do on your branch is preserved and used to make merging back to master (the git equivalent of trunk) as easy as cake.
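The b1/b2 scenario can be sketched roughly like this. The file names are placeholders, and --ff-only is used on the way back only to make git refuse anything that is not a fast forward:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > base
git add base
git commit -q -m "base"
git branch -m b1            # rename the initial branch to b1 for this example
git branch b2

echo one > x
git add x
git commit -q -m "work on b1"

git checkout -q b2
echo two > y
git add y
git commit -q -m "work on b2"
git merge -q --no-edit b1   # the real merge work happens here, on b2

git checkout -q b1
git merge -q --ff-only b2   # nothing left to merge: b1 just moves forward to b2's head

git rev-parse b1 b2         # prints the same commit id twice
```

After the second merge both branch names point at the merge commit that was created on b2.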

git pull, fetch, and what’s in between

It is said that “git pull” is equivalent to “git fetch”, followed by “git merge”.
The ability to immediately fetch all the content of any remote repository without forcing you to merge it right now is great – you’re free to do the actual merge work and conflict resolution separately, and you only need connectivity to the remote repository for the fetch phase. When I tried this using two local folders, git merge complained, and I couldn’t figure out what arguments I should pass to “git merge” in this case.

This turned out to be a simple technical issue. To merge the changes manually after fetching from an arbitrary remote, simply run git merge FETCH_HEAD (sometimes you just have to know the magic words). Normally, you would fetch from origin (usually the repository you cloned from), or another named remote, so you would just specify the name of its remote-tracking branch (for example origin/master) as the parameter to “git merge”.
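Here is a sketch of the two-local-folders experiment. The folder names alice and bob are just examples, and the fetch goes to a plain path rather than a named remote:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# First folder: a repository with one commit.
git init -q alice
(
  cd alice
  git config user.email "alice@example.com"
  git config user.name "Alice"
  echo hello > a
  git add a
  git commit -q -m "Added a"
)

# Second folder: a clone, after which the first folder moves ahead.
git clone -q alice bob
(
  cd alice
  echo more >> a
  git commit -q -am "More a"
)

cd bob
git config user.email "bob@example.com"
git config user.name "Bob"
git fetch -q ../alice      # fetch by path; the fetched tip is recorded in FETCH_HEAD
git merge -q FETCH_HEAD    # merge what was just fetched (a fast forward here)
cat a                      # now contains both lines
```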

How does pushing actually work?

Let’s say I set up a local “common” repo (it has to be bare for reasons explained in the Git book):

mkdir bare
cd bare
git init --bare
cd ..
git clone bare alice
cd alice
touch a && git add a && git commit -m "Added a"
git push # This fails


Why does the push fail?

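It turns out that the problem was that I tried to push to an empty repository. With the default of the time (push.default set to matching), a plain “git push” only updates branches that already exist on both sides, and an empty repository has none. If I do “git push origin master” once, then subsequent “git push” commands with no arguments succeed.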
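A sketch of the working sequence follows. I swapped in “git push -u origin HEAD” for “git push origin master”, which is my own substitution so the snippet does not depend on the default branch name; -u additionally records the upstream, so the later plain push works under any of the common push.default settings:

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare bare
git clone -q bare alice          # git warns that the cloned repository is empty
cd alice
git config user.email "alice@example.com"
git config user.name "Alice"

touch a
git add a
git commit -q -m "Added a"
git push -q -u origin HEAD       # the first push names what to push explicitly

touch b
git add b
git commit -q -m "Added b"
git push -q                      # the branch now exists on both sides, so this succeeds

branch=$(git symbolic-ref --short HEAD)
git -C ../bare log --oneline "refs/heads/$branch"   # both commits arrived in the bare repo
```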

And now, for some cool stuff:

git bisect ftw

Suppose you just found a critical bug, and have no idea when it was introduced. You write a simple (manual/automated) test for it, and reproduce it, but you’re not sure what is causing it. git bisect to the rescue!

git bisect allows you to do a binary search on your repository to find the exact commit that introduced the bug. While this is possible with other VCSs, it is so natural in git that it’s beautiful. You simply do “git bisect start”, followed by “git bisect good” to indicate the current version works, and “git bisect bad” to indicate it doesn’t, and git will direct you towards the correct half of the version graph until you find the exact version when things turned bad.
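To make this concrete, here is a toy sketch. The eight-commit history, the status file and the BROKEN marker are all invented for the example, and I use “git bisect run” to let a one-line test answer good or bad automatically instead of typing it by hand:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Toy history: eight commits, with the "bug" introduced in commit 6.
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 6 ]; then
    echo "BROKEN $i" > status
  else
    echo "ok $i" > status
  fi
  git add status
  git commit -q -m "commit $i"
done

first=$(git rev-list HEAD | tail -n 1)    # oldest commit, known to be good

git bisect start HEAD "$first" >/dev/null # HEAD is bad, the first commit is good
# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run sh -c '! grep -q BROKEN status' >/dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1          # return to where we started
echo "$culprit"                           # prints: commit 6
```

With a real bug you would replace the grep with whatever test reproduces the problem.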

Configure your defaults for fun and profit

Here are some tweaks I found in the book that you might want to do (if you have any other tweaks you’d like to recommend, please comment!)

oneline log messages

If, like me, you find the “one liner” log messages easier to read, you can make it the default with

git config --global format.pretty oneline

Life is colorful

Make git status and other messages much easier to read with

git config --global color.ui true

How I got hired by Google (this time)

(Yada yada, this post doesn’t represent any opinions except my own, and barely even that :))

For the sake of friends, family, and other readers who might be interested, I wanted to document the process I’ve gone through with Google. I find it a bit similar to the process Steve Yegge went through when he got hired by Google, because both of us were turned down by Google the first time we interviewed there. I hope this might help you if you ever decide you want to try to become a Googler yourself (here is why I chose it, but you should find your own reasons).

My history with the job market is rather short – before interviewing at Google for the first time, I had been “employed” by the Israeli Defense Force for six years, in the course of normal army service. I was doing professional and interesting work, but I wasn’t exactly free to choose (Israel mandates three years of compulsory army service for guys, and I signed up for three extra years at the age of eighteen as part of the Atuda program). While in the army, I started my Master’s degree in CS, and when I finished it (half a year after being released from the army), I finally had to make my first employment choice in the “real world”.

I didn’t know the first thing about what I wanted. Throughout my army service, I had the concept that I really wanted to do “research work” (not that I knew what this means, it just sounded cool). I also remembered that I really enjoyed studying algorithms and data structures back in my B. Sc days, so I considered working as an algorithm developer (I had assumed the job has some correlation to the problems I faced in Algorithms and Data Structures courses – which is not necessarily true). Also, I was really drawn to the startup world, having tried unsuccessfully to create one myself. And, of course, Google was on my list of possible employers, because, well, it’s Google.

I approached interviewing at Google like I would any other company. I ignored the emails they sent me on “how to prepare for the interviews”, and hardly did any research about the Google interview process. Also, since I thought it was such a good company, I interviewed there immediately – actually, it was among the very first companies that I interviewed with. Unfortunately (or fortunately), I completely sucked at the interviews (or at least this was my subjective feeling at the time). The interview problem I most distinctly remember is being asked to sort an array of integers. “Trivial”, I said, and blurted either “Quicksort” or “Mergesort”.

“OK”, said my interviewer, “that’s a good answer. Now please implement it on this sheet of paper.”

“What?!”, I staggered. Still, I didn’t completely freeze, and wrote down some implementation or another. I did understand the concepts of both sort algorithms well enough to explain them, but when I tried to put them down on paper, I made some annoying +1/-1 indexing error, which I was unable to track down under the pressure of the interview. This, and other errors like it, were among the factors that made Google decide I was not ready to work for them yet. “Please interview again in another six months” was the reply I got.

Of course, in another six months I was happily employed at what became my job for the next three years, so the thought of interviewing again didn’t cross my mind until now. Recently, when I decided that it’s time for me to switch jobs, I took a second, more mature look at Google. I thought long and hard about my motivation for leaving Delver, and found that it’s a desire to learn and grow. I then concluded that Google was the best place for me to do this (not an easy decision, but a decision that had to be made nonetheless). Once these thoughts crystallized in my head, I decided that this time, I’m going to be hired. I knew internally that I’m good enough, and I could make it with the right combination of preparations and luck.

The first thing I did was grill a couple of friends that were hired by Google in the last year. They didn’t reveal anything secret about the interviewing process, of course, but talking to them did send me in the right direction – “go study data structures and algorithms!” was the clear message. I first implemented Bubble Sort as a warm-up exercise, and managed to include a horrible yet hard to detect bug that made me take things a bit more carefully from then on. I revisited Introduction to CS, Data Structures, and Algorithms course material, and actually solved exams by coding the solutions. I didn’t anticipate how difficult this would be – not the actual problems, but the act of mentally forcing myself to fill in all the gritty details and code algorithms that actually work (and prove it via test cases). “Google is looking for smart engineers who know how to solve algorithmic problems, and code the solutions”, I was told, and so I made sure I am one of these engineers. One of the most difficult things I had to do was mentally push myself to do things thoroughly and not take shortcuts.

I scoured the web, reading tales of other qualified engineers that went through the interview process. I solved several Google Code Jam problems. I thought about how I would build large, Google scale services such as Search, Gmail and Maps, and wasn’t ashamed to ask questions when I didn’t know the answers. When I was almost ready for a Google interview, I started interviewing at other companies. You can’t believe how bad at interviews we (I) get when we don’t practice it for a few years – interviewing with other companies was a good way to get a feel of the market, validate to myself my choice of Google, and get better at interviewing and solving technical problems. Finally, I submitted my resume via a friend that works at Google (a good recommendation from a Googler tends to help your chances … it certainly can’t hurt). All this time, I kept repeating the basics – I reread the material for Data Structures and Algorithms courses quite a few times until it all sank in, and I must have implemented Quicksort and Mergesort a few dozen times until I was sure I could do it flawlessly and under pressure.

A Google interview process is usually composed of a basic “screening” interview, followed by a series of more difficult interviews. I was actually told by my Google recruiter that because of my results the last time I interviewed, I could skip the screening interview and proceed directly to the more difficult ones. I chose not to, and took the screening interview as a warm-up exercise (I was surprised and encouraged to find that it was really easy for me, and this encouragement helped with the other more difficult interviews). Google usually schedules all your advanced interviews (5-6 of them!) into one long day. One important tip is that you can ask your recruiter to split these interviews into two different days – I found it much easier this way.

Finally, after several challenging interviews and nervous days waiting for Google’s response, I got the “ok, you’re hired”. The process I’ve gone through was, I believe, the right process for me. I might have been slightly over-prepared for some (not all!) of my Google interviews (and I was never actually asked to implement Quicksort or Mergesort), but it’s better to be 400% over prepared and succeed than be 10% under prepared and fail. I understand that not everyone can dedicate as much time for the preparation process as I have – it was relatively easy for me because I announced that I’m quitting before finding another job, and so didn’t have to hide the fact I’m interviewing and studying – this can be very stressful, thinking about your new job while nobody at your current gig knows you’re quitting.

If you want to apply to Google, you should decide what process works for you, and how much you really want it. And study up on your Mergesort/Quicksort – learning to implement them flawlessly is not an exercise in memorization, but rather an exercise in solid algorithm building using pre/post-conditions and an excellent “mental muscle warmer”. I took Steve’s advice of doing “short-term warm-ups” to heart.

I hope this has been a useful post for you, and I encourage you to look up the other available sources. Here are some links to get you started:

Good luck in your interviews!

Noogler training at the Googleplex

Not only did Google agree to pay me money for working on cool products and technologies, they were actually nice enough to send me to two weeks of training in the Mountain View Googleplex – me and every Noogler that joins our forces. Observe the very first photo I took once arriving here.

At the time, I thought only “wow, nice bikes”. I didn’t imagine that there were actually dozens of such bikes available all throughout the Google campus. I was delighted to find that I could use them myself. The principle is really simple – whenever you see a bike, you can pick it up and ride to wherever on campus you want to go, and you just leave the bike there. It’s very addictive.

The first few days were a bit difficult. I was still suffering from jet lag, and for some reason thought Googleplex was much smaller than it actually was. Then, as I tried to meet with some friends, I found out that it’s actually a lot larger. It cost me a missed lunch, but I got the message, and the next day got me a printed map of the campus and simply rode all around on a bike to gain a small sense of direction.

The training courses themselves are a lot of fun, and some of the lecturers were really engaging and funny (not to mention educational – duh!). This week does a good job at showing us Nooglers just how much we don’t know, and how much we’ll never know because the scope of knowledge is always increasing at alarming rates. We get a first hand experience of this here at Google.

What really amazed me is the amount of freedom people seem to have here. From what I’ve seen so far, 20% seems totally real, and people are encouraged to take up those projects. One famous Googler named Meng has taken up “World Peace”, at first as a 20% project, and now full time.

Another great thing about working for Google, and specifically Orientation, is how many interesting people you meet. From other Nooglers, to lecturers, to people whom you share interests with – and not just engineers! I just had beers with a few Nooglers this evening, and they all agreed – we want to feel stupid, we want to improve, and this is the reason we all joined Google.

I haven’t talked a lot about the cool toys. Every company has them, but the variety and quality here is very pleasing. Here is a very small selection that I bothered to photograph … my favorite was a console version of Gauntlet – I believe until now I had only played this on my computer a long long time ago, and it was delightful to see this game on a real size console.

I managed to get a little bit of work done as well. As it happens, my “work” right now is simply learning about the infrastructure and methodologies of both Google and my team, as well as learning web development in general. I’ll get productive soon enough, don’t worry.

And now, off to Toronto for two days before I finally return home. See y’all soon.

Have feedback on Google? Bring it!

Note – this post obviously only represents my personal opinion, and not of Google. Kittens will be killed if you think otherwise.

Now that we have that out of the way, let me come to the main point. Feedback is important, and I don’t think we’re doing a good enough job getting and managing user feedback. Yeah, I’m starting to use “we” in this context, even though I’m a super fresh Noogler. I’m practicing!

I just remembered that a few years ago (about five minutes after I discovered User Voice), I found a User Voice feedback forum for Google. Over the years, a few people actually found it and voted up my #1 suggestion. Now that I’m a Googler, I might have some small ability to actually move things in Google, even if right now it’s mostly finding the right people to talk to, and asking them about stuff I care about.

So … if you have a feature suggestion or frustration relating to one of Google’s many products – feel free to post it at the feedback forum – I’ll try to do my best to promote these ideas internally.