Churn (code)

One of the most useful things you can do when you’re new to a codebase is to look at version control and find out who owns which parts of the code and where the hottest spots are.

Corey Haines posted a gist years ago with a great way to get the second part of that information, under various constraints.

churn count and file name
git log --all -M -C --name-only --format='format:' | sort | grep -v '^$' | uniq -c | sort -n | awk 'BEGIN {print "count,file"} {print $1 "," $2}'

churn count and file name, limited to the last n commits (5000 here)
git log --all -n 5000 -M -C --name-only --format='format:' | sort | grep -v '^$' | uniq -c | sort -n | awk 'BEGIN {print "count,file"} {print $1 "," $2}'

distribution of churn counts (how many files share each count)
git log --all -M -C --name-only --format='format:' | sort | grep -v '^$' | uniq -c | sort -n | awk '{print $1}' | uniq -c | sort -n | awk 'BEGIN {print "frequency,churn_count"} {print $1 "," $2}'

It is fairly simple to turn these into a git extension that lets you just run “git churn”. Many people keep one in their dotfiles repos on GitHub.
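
As a rough idea of what such an extension might look like, here is a minimal sketch. It assumes the script is saved as an executable file named git-churn somewhere on your PATH, since git runs any git-<name> executable it finds there as "git <name>". The function names churn_count and git_churn are made up for this example, and the --format='format:' option and blank-line filter are there to keep commit headers and messages out of the counts. Treat it as one possible starting point rather than the version from the gist.

```shell
#!/bin/sh
# git-churn: list files by how many commits touched them (illustrative sketch).

# Read one file path per line on stdin and print "count,file" CSV rows,
# least-changed first. Blank lines are ignored. The awk step strips the
# count that uniq -c prepends, so paths containing spaces stay intact.
churn_count() {
    grep -v '^$' | sort | uniq -c | sort -n |
        awk 'BEGIN { print "count,file" }
             { count = $1; sub(/^ *[0-9]+ /, ""); print count "," $0 }'
}

# Ask git for the paths touched by each commit and nothing else.
# Extra arguments (for example -n 5000 or a path) go straight to git log.
git_churn() {
    git log --all -M -C --name-only --format='format:' "$@" | churn_count
}

# Run only when invoked under the name git-churn (that is, via "git churn").
case "${0##*/}" in
    git-churn) git_churn "$@" ;;
esac
```

With that in place, "git churn -n 5000" would limit the count to the last 5000 commits, and "git churn | tail" would show the most frequently changed files, since the rows are sorted with the lowest counts first.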
