Tracking Engagement: Parsing Git Logs

So I’ve been having a tremendous amount of fun recently on a pet project, and I’m starting to get some really good results from it.
The premise: the level of engagement inside my team of developers will directly correlate to the quality of their output.
The plan: Have my team add a smiley face to their commit messages that indicates how they are feeling at that point in time. Use this information to take over the world.
When I suggested this to the team I expected a little hesitancy. I was worried it might come off as a lame manager-driven idea, but they’ve really taken to it. The premise makes a lot of sense – if you are feeling a bit off, it could be for a number of reasons, for example:
- You’ve been trudging through an annoying piece of code, with some hacks.
- You’ve had trouble understanding what the requirement was, with a lot of distracting back and forth.
- People have been interrupting you, forcing you to context switch a lot.
- You are distracted by personal problems, and not as focused on the code as you usually would be.
In all of these cases it’s likely that the quality of the code has suffered – and instead of hiding that, we actively want to start gathering data around it.
I’ll write future blog posts on the really cool things we can do with this data, but one piece at a time. In the rest of this post I want to really quickly note down how I’m parsing the raw data out of Git.
Context
Our team has agreed to write commit messages in the format “XXXXXX-111 message :)”. That is, we start with a Jira ticket number, then add a message giving some context to the commit, then finish with a smiley face.
The smiley faces we support are:
>:(
:(
:/
:\
:|
:)
:D
The implementation
For this task I’m simply running a PowerShell script over a number of repositories. I’ve written this function to read the git log, and it uses a regular expression to parse out the data:

    function GetGitLogs([string] $Path, [string] $RepositoryName, [string] $Project)
    {
        # $Days is not a parameter; it is expected to be set by the calling script.
        Set-Location $Path

        # One line per commit: short hash, author, ISO date and subject, tab separated.
        # The expression expects a "+hhmm" UTC offset after the date.
        $logs = @(git --no-pager log --no-merges --all --branches --remotes --pretty=tformat:"%h%x09%an%x09%ad%x09%s" --since="$Days days ago" --date=iso |
            Where-Object { $_ -cmatch '^(?<gitref>\w*)\s*(?<author>[\D\s]*)\s*(?<date>[\s\d:-]*)\s\+\d+\s*(?<message>.+?)\s?(?<morale>[\)\(:>\\\/|D]{2,3})?$' } |
            ForEach-Object {
                New-Object psobject -Property @{
                    GitCommit  = $matches['gitref']
                    Date       = $matches['date']
                    Message    = $matches['message'].Trim()
                    Author     = $matches['author'].Trim()
                    Morale     = $matches['morale']
                    Repository = $RepositoryName
                    Project    = $Project
                    Files      = @()
                }
            })

        # Extract the files associated with each change
        foreach ($log in $logs)
        {
            $log.Files = @(git diff-tree --no-commit-id --name-only -r $log.GitCommit |
                ForEach-Object { $_.Replace("`n", "").Replace("`r", "") })
        }

        return $logs
    }
This is the test data I’ve used to validate the regular expression:
    bf18307 username 09-09-2016 +1100 jira-200 test test test :)
    bf18307 firstname lastname 09-09-2016 +1100 jira-201 test test test :(
    bf18307 username 09-09-2016 +1100 jira-202 test test test:)
    bf18307 firstname lastname 09-09-2016 +1100 jira-203 test test test >:(
    bf18307 username 09-09-2016 +1100 jira-204 test test test
    bf18307 firstname lastname 09-09-2016 +1100 jira-205 test test (tested)
    bf18307 username 09-09-2016 +1100 jira-206 :|
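As a quick check, the expression can be exercised against these sample lines without touching a repository. This is only a minimal sketch: it assumes the fields are separated by single spaces as shown here (the real git output uses tabs), and the variable names `$pattern`, `$samples` and `$parsed` are made up for the example.

```powershell
# The expression from GetGitLogs, single-quoted so PowerShell interpolates nothing
$pattern = '^(?<gitref>\w*)\s*(?<author>[\D\s]*)\s*(?<date>[\s\d:-]*)\s\+\d+\s*(?<message>.+?)\s?(?<morale>[\)\(:>\\\/|D]{2,3})?$'

# The test data, one log line per entry
$samples = @(
    'bf18307 username 09-09-2016 +1100 jira-200 test test test :)',
    'bf18307 firstname lastname 09-09-2016 +1100 jira-201 test test test :(',
    'bf18307 username 09-09-2016 +1100 jira-202 test test test:)',
    'bf18307 firstname lastname 09-09-2016 +1100 jira-203 test test test >:(',
    'bf18307 username 09-09-2016 +1100 jira-204 test test test',
    'bf18307 firstname lastname 09-09-2016 +1100 jira-205 test test (tested)',
    'bf18307 username 09-09-2016 +1100 jira-206 :|'
)

$parsed = @(foreach ($line in $samples) {
    # -cmatch is case sensitive, so a lowercase "d" never counts as part of ":D"
    if ($line -cmatch $pattern) {
        [pscustomobject]@{
            Author  = $Matches['author'].Trim()
            Message = $Matches['message'].Trim()
            Morale  = $Matches['morale']   # $null when the line has no smiley
        }
    }
})

$parsed | Format-Table -AutoSize
```

The two lines worth watching are jira-204 and jira-205: neither should report a morale, and the closing bracket in “(tested)” should stay part of the message rather than being read as half a smiley.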
Once I’ve done my initial parse I want to break these logs down a little more. In the message I should have one Jira ticket, but on occasion a developer will forget, or they might commit against multiple tickets. Either way, I want to parse the commits and correlate them back to the tickets worked against. For this I can use this script:

    # $tickets is expected to be initialised as an array (e.g. $tickets = @()) before this loop,
    # and $JiraProject to be set by the calling script (empty string means "any project").
    foreach ($log in $logs)
    {
        $projectPattern = if ($JiraProject -ne "") { "$JiraProject-\d+" } else { "(?<!([A-Za-z]{1,10})-?)[A-Z]+-\d+" }

        # @() keeps $list an array whether there are zero, one or many matches
        $list = @(Select-String $projectPattern -InputObject $log.Message -AllMatches | ForEach-Object { $_.Matches.Value })

        # One row per ticket mentioned in the commit message
        foreach ($ticket in $list)
        {
            $jira = GetJiraInformation -Ticket $ticket
            $tickets += New-Object psobject -Property @{
                Ticket        = $ticket
                GitCommit     = $log.GitCommit
                Morale        = $log.Morale
                Author        = $log.Author
                Date          = $log.Date
                Repository    = $log.Repository
                Message       = $log.Message
                Project       = $log.Project
                Files         = $log.Files
                JiraTitle     = $jira.Title
                JiraIssueType = $jira.IssueType
            }
        }

        # No ticket in the message: still record the commit, with an empty ticket
        if ($list.Count -eq 0)
        {
            $tickets += New-Object psobject -Property @{
                Ticket        = $null
                GitCommit     = $log.GitCommit
                Morale        = $log.Morale
                Author        = $log.Author
                Date          = $log.Date
                Repository    = $log.Repository
                Message       = $log.Message
                Project       = $log.Project
                Files         = $log.Files
                JiraTitle     = ""
                JiraIssueType = ""
            }
        }
    }
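To see what the ticket expression picks up on its own, here is a small sketch that runs it over three invented commit messages: one with two tickets, one with a single ticket and one where the developer forgot. The messages and the names `$messages` and `$found` are hypothetical; only the pattern comes from the script above.

```powershell
# The default (no $JiraProject) pattern from the script above
$pattern = '(?<!([A-Za-z]{1,10})-?)[A-Z]+-\d+'

# Invented commit messages for illustration
$messages = @(
    'JIRA-200 JIRA-201 shared fix for both tickets :)',
    'JIRA-202 single ticket :|',
    'forgot the ticket number :('
)

$found = @(foreach ($message in $messages) {
    # Wrapping in @() keeps the result an array for zero, one or many matches
    $ids = @(Select-String -Pattern $pattern -InputObject $message -AllMatches |
        ForEach-Object { $_.Matches.Value })

    [pscustomobject]@{
        Message     = $message
        Tickets     = $ids
        TicketCount = $ids.Count
    }
})

$found | Format-Table Message, TicketCount -AutoSize
```

Note that Select-String is case insensitive by default, so a lowercase “jira-200” is picked up as well.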
For completeness this is the simple script to extract some basic metadata from Jira:
    function GetJiraInformation([string] $Ticket)
    {
        $result = New-Object psobject -Property @{ Title = ''; IssueType = '' }

        if ($Ticket.Length -eq 0)
        {
            return $result
        }

        try
        {
            $response = Invoke-WebRequest -Uri "https://ourjiraurl/jira/rest/api/2/issue/$($Ticket)?fields=summary" | ConvertFrom-Json
            $result.Title = $response.fields.summary

            $response = Invoke-WebRequest -Uri "https://ourjiraurl/jira/rest/api/2/issue/$($Ticket)?fields=issuetype" | ConvertFrom-Json
            $result.IssueType = $response.fields.issuetype.name.Replace(' ', '')
        }
        catch
        {
            # Likely a board security error, e.g. the board does not allow unauthenticated access
            $result.Title = "Unable to retrieve issue from Jira"
            $result.IssueType = "Unknown"
        }

        return $result
    }
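The parsing half of this function can be tried offline, which is handy when the Jira board is not reachable. The sketch below feeds a hand-written JSON payload through the same property lookups. The payload is invented for illustration and its shape is only inferred from the properties the function reads (`fields.summary` and `fields.issuetype.name`). Jira's issue endpoint also accepts a comma-separated `fields` list, so the two requests could probably be merged into one with `fields=summary,issuetype`, though that is untested here.

```powershell
# Made-up payload, shaped after the properties GetJiraInformation reads
$json = '{"key":"JIRA-200","fields":{"summary":"Example summary","issuetype":{"name":"New Feature"}}}'

$response = $json | ConvertFrom-Json

$result = New-Object psobject -Property @{
    Title     = $response.fields.summary
    # Same normalisation as the function: strip spaces from the issue type name
    IssueType = $response.fields.issuetype.name.Replace(' ', '')
}

$result | Format-List
```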
From here I log all the information to a database (so we can start building up metrics) and I generate some email reports to be sent out. I’ll go more into that in the next post, and then after that I’ll start talking about the really cool stuff we can do with this data.
Image Credit: Bottled_void