Losing data can be very frustrating. Sometimes data is lost because hardware fails, but other times it is lost by mistake. Thankfully, Git has tools that can assist with the latter case at least. In this article, I will demonstrate how one can use the git-reflog tool to recover lost code and commits.
What is Reflog?
Whenever you add data to your local Git repository or perform destructive operations, Git keeps track of them using reference logs, also known as reflogs. Each log entry contains the SHA-1 hash of the commit associated with it, and is tied to a reference, or ref for short. Refs themselves are branch names, tags, and symbolic refs like HEAD, which always points to the ref or commit id that’s currently checked out.
These reflogs can prove very useful for data recovery in a Git repository if some code is lost in a destructive operation. Reflog records contain the SHA-1 hash that HEAD was pointing to when an operation was performed, along with a description of that operation.
Here is an example of what a reflog might look like:
956eb2f (HEAD -> branch-prefix/v2-1-4, origin/branch-prefix/v2-1-4) HEAD@{0}: commit: fix: post-rebase errors
The first part, 956eb2f, is the abbreviated hash of the commit that was checked out when this entry was added to the reflog. If a ref that currently exists in the repo points to that commit id, such as the branch-prefix/v2-1-4 branch in this case, then that ref is printed alongside the commit id in the reflog entry.
It should be noted that these refs are not stored in the entry itself. Git infers them from the commit id in the entry when printing the reflog. This means that if we were to remove the branch named branch-prefix/v2-1-4, it would no longer appear in the reflog entry here.
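To see that the branch names are decorations computed at print time, here is a small sketch that can be run in a throwaway directory. The branch name topic and the commit identity are made up for illustration, and the --decorate=short flag is passed explicitly because Git typically only decorates by default when printing to a terminal.

```shell
set -eu

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b needs Git 2.28 or newer
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"

# A second branch pointing at the same commit (hypothetical name).
git branch topic

# The entry is decorated with every ref that currently points at the commit.
with_branch=$(git reflog show --decorate=short)
echo "$with_branch"

# Delete the branch and print the very same entry again.
git branch -q -d topic
without_branch=$(git reflog show --decorate=short)
echo "$without_branch"
```

The reflog still holds a single entry after the branch is deleted, but the second printout no longer mentions topic.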
There’s also a HEAD part. This just tells us that HEAD is currently pointing to the commit id in the entry. If we were to switch to a different branch such as main, then the HEAD -> section would disappear from that specific entry.
The HEAD@{n} section is an index that specifies where HEAD was n moves ago. In this example, it is zero, which means that this is where HEAD currently is. Finally, what follows is a text description of the operation that was performed. In this case, it was just a commit. Descriptions for supported operations include, but are not limited to, commit, pull, checkout, reset, rebase, and squash.
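The HEAD@{n} index is also valid revision syntax, so it can be handed to other Git commands. The sketch below assumes a throwaway repository with two made-up commits, and uses the %gd and %gs placeholders of git log -g to print the selector and description of the newest entry.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b needs Git 2.28 or newer
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo "two" >> file.txt
git commit -q -am "second"

# HEAD@{1} means "where HEAD was one move ago", i.e. the first commit.
# The quotes keep the shell from interpreting the braces.
previous=$(git rev-parse 'HEAD@{1}')

# %gd is the reflog selector, %gs the description of the operation.
newest_entry=$(git log -g -1 --format='%gd: %gs')

echo "HEAD@{1} resolves to $previous"
echo "$newest_entry"
```

Quoting HEAD@{1} is a habit worth keeping, since some shells treat braces specially.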
Basic Usage
Running git reflog with no other arguments, or git reflog show, will give you the reflog for HEAD: a list of records that show each time HEAD was updated. The entries are listed newest first, so the most recent operation is at the top. The output for a fresh repository with an initial commit will look something like this.
13deb8e (HEAD -> main) HEAD@{0}: commit (initial): initial commit
Now let’s create a new branch called feature with git switch -c feature and then commit some changes. Doing this will add a couple of entries to the reflog: one for the checkout of the branch, and one for committing some changes.
4f8d10d (HEAD -> feature) HEAD@{0}: commit: more stuff
13deb8e (main) HEAD@{1}: checkout: moving from main to feature
13deb8e (main) HEAD@{2}: commit (initial): initial commit
This log will continue to grow as we perform more operations that write data to Git.
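The steps in this section can be reproduced with a short script. This is a sketch that assumes a throwaway directory and a made-up commit identity, so the hashes will differ from the ones shown above. It reads the descriptions with git log -g, which is the command that git reflog show is shorthand for.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b needs Git 2.28 or newer
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# Create the feature branch and commit on it.
git switch -q -c feature
echo "more" >> notes.txt
git commit -q -am "more stuff"

# Print the reflog the usual way, newest entry first.
git reflog

# Collect only the descriptions so they are easy to inspect.
subjects=$(git log -g --format='%gs')
entry_count=$(printf '%s\n' "$subjects" | wc -l | tr -d ' ')
newest=$(printf '%s\n' "$subjects" | sed -n 1p)
middle=$(printf '%s\n' "$subjects" | sed -n 2p)
oldest=$(printf '%s\n' "$subjects" | sed -n 3p)
echo "$entry_count entries; newest: $newest"
```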
A Rebase Gone Wrong
Let’s do something slightly more complex. We’re going to make some changes to main and then rebase our feature branch on top of it. This is the current history of the feature branch once a few more commits are added.
138afbf (HEAD -> feature) here's some more
cb72b26 even more stuff
4f8d10d more stuff
13deb8e initial commit
And this is what main looks like:
a84bdfa (HEAD -> main) add other content
13deb8e initial commit
After running git rebase main with the feature branch checked out, let’s say some merge conflicts got resolved incorrectly and some code was accidentally lost. A Git log after such a rebase might look something like this.
be44ab0 (HEAD -> feature) here's some more
a84bdfa (main) add other content
13deb8e initial commit
Fun fact: if none of the changes in a commit between the merge base and the tip of the branch survive the rebase, that commit will not appear on the active branch once the rebase has concluded. In this example, I entirely discarded the contents of two commits “by mistake”, and this resulted in Git discarding them from the current branch.
Alright. So we lost some code from some commits, and in this case, even the commits themselves. So how do we get them back as they’re in neither the main branch nor the feature branch?
Reflog to the Rescue
Although our commits are inaccessible on all of our branches, Git did not actually delete them. If we look at the output of git reflog, we will see the following entries detailing all of the changes we’ve made to the repository up to this point:
be44ab0 (HEAD -> feature) HEAD@{0}: rebase (continue) (finish): returning to refs/heads/feature
be44ab0 (HEAD -> feature) HEAD@{1}: rebase (continue): here's some more
a84bdfa (main) HEAD@{2}: rebase (start): checkout main
138afbf HEAD@{3}: checkout: moving from main to feature
a84bdfa (main) HEAD@{4}: commit: add other content
13deb8e HEAD@{5}: checkout: moving from feature to main
138afbf HEAD@{6}: commit: here's some more
cb72b26 HEAD@{7}: commit: even more stuff
4f8d10d HEAD@{8}: commit: more stuff
13deb8e HEAD@{9}: checkout: moving from main to feature
13deb8e HEAD@{10}: commit (initial): initial commit
This can look like a bit much. But we can see that the latest commit on our feature branch before the rebase reads 138afbf HEAD@{6}: commit: here's some more.
The commit behind this SHA-1 is still stored in Git, and we can get back to it by using git-reset. In this case, we can run git reset --hard 138afbf. However, git reset --hard ORIG_HEAD also works. The ORIG_HEAD in the latter command is a special ref that records where HEAD was before the last drastic operation, which includes but is not limited to merging and rebasing. Keep in mind that git reset updates ORIG_HEAD as well, so this shortcut is only reliable if no other such command has run since the rebase.
So if we run either of those commands, we’ll get output saying HEAD is now at 138afbf here's some more, and our git log for the feature branch should look like the following.
138afbf (HEAD -> feature) here's some more
cb72b26 even more stuff
4f8d10d more stuff
13deb8e initial commit
Any code that was accidentally removed should now be accessible once again! Now the rebase can be attempted again.
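The whole recovery can be rehearsed safely in a throwaway repository. The sketch below makes a few assumptions: the file names and commit identity are invented, and the rebase here applies cleanly, so it stands in for the bad rebase rather than reproducing the lost commits. It looks up the pre-rebase tip in two ways: through ORIG_HEAD, and through feature@{1}, the previous entry in the feature branch's own reflog.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b needs Git 2.28 or newer
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# Two commits on the feature branch.
git switch -q -c feature
echo "a" > feature.txt
git add feature.txt
git commit -q -m "more stuff"
echo "b" >> feature.txt
git commit -q -am "even more stuff"

# main moves on in the meantime.
git switch -q main
echo "other" > other.txt
git add other.txt
git commit -q -m "add other content"

# Rebase feature onto main, remembering where it was beforehand.
git switch -q feature
before=$(git rev-parse HEAD)
git rebase -q main
after_rebase=$(git rev-parse HEAD)

# Two ways to find the pre-rebase tip without copying a hash by hand.
from_orig_head=$(git rev-parse ORIG_HEAD)
from_branch_reflog=$(git rev-parse 'feature@{1}')

# Undo the rebase. --hard also discards uncommitted changes.
git reset -q --hard "$from_branch_reflog"
restored=$(git rev-parse HEAD)

git log --oneline
```

Reading the hash out of git reflog by eye, as done above, works just as well. The feature@{1} form is simply convenient right after a rebase, because the rebase adds a single entry to the branch's reflog.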
Reflog Pruning and Garbage Collection
One thing to keep in mind is that the reflog is not permanent. It is subject to garbage collection by Git on occasion. In practice, this isn’t a big deal since most uses of reflog will be against records that were created recently. By default, reflog records expire after 90 days, while records for commits that are no longer reachable from the current tip of the branch, like the ones lost in our rebase, expire after 30 days. These durations can be controlled via the gc.reflogExpire and gc.reflogExpireUnreachable keys in your git config.
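As a sketch of what changing the expiry looks like, the commands below set it for a single throwaway repository. The values are arbitrary examples. The second key, gc.reflogExpireUnreachable, is the companion setting from the git-config documentation that covers entries no longer reachable from the current tip of a branch.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Keep reflog entries for longer than the defaults (example values only).
git config gc.reflogExpire "180 days"
git config gc.reflogExpireUnreachable "60 days"

# Read the values back to confirm what this repository will use.
reachable_expiry=$(git config --get gc.reflogExpire)
unreachable_expiry=$(git config --get gc.reflogExpireUnreachable)
echo "reachable entries:   $reachable_expiry"
echo "unreachable entries: $unreachable_expiry"
```

Adding --global to the git config calls would apply the same settings to every repository for the current user.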
Once reflog records are expired, they then become eligible for removal by git-gc. git gc can be invoked manually, but it usually isn’t. git pull, git merge, git rebase and git commit are all examples of commands that can trigger automatic garbage collection (git gc --auto) behind the scenes, which only does any work when Git decides the repository needs it.
I will abstain from going into detail about git gc as that would be deserving of its own article, but it’s important to know about in the context of git reflog as it does have an effect on it.
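To make the expiry concrete, the following sketch forces it by hand with git reflog expire. It is destructive by design, so it assumes a throwaway repository with an invented commit identity and should not be run against a repository you care about.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q -b main                      # -b needs Git 2.28 or newer
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
echo "two" >> file.txt
git commit -q -am "second"

count_before=$(git reflog show | wc -l | tr -d ' ')

# Expire every reflog entry immediately, for all refs.
# This removes the safety net that the rest of this article relies on.
git reflog expire --expire=now --all

count_after=$(git reflog show | wc -l | tr -d ' ')
echo "entries before: $count_before, after: $count_after"
```

The commits on main are untouched, because the branch still points at them. Only the history of where HEAD has been is gone.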
Conclusion
git reflog is a very helpful tool that allows one to recover lost code and commits when used in conjunction with git reset. We learned how to use git reflog to view changes made to a repository since we’ve cloned it, and to undo a bad rebase to recover some lost commits.