More elegant solutions while keeping my tracks neat and tidy

Imagine you have a programming problem to solve. The problem is a little bit on the border of your competence, and you are not entirely sure what the solution will be when your done. But, on the other hand, it is not the first time in the history of the world this particular kind of problem has been solved. Without a doubt, there is at least one possible solution to the problem.

You start to describe what you want to achieve by writing an automatic unit test. Quite soon you bump into an obstacle in the form the API you need to use. The API is not a good fit with the current design of the code and makes writing the test unnecessarily tedious. You see a way to fix this, you change a little code and test cases in a related part of the system to make the API usage easier to test. But then, you find another part of the code that depends on the code you just changed. So you update that other piece of code accordingly, but only after fixing a tiny little piece of code to make the change smooth. A while later you have completed all the different small changes that followed from your first design adjustment. You do the last bit of the test case and it is time to do some programming on the original task you set out with. Thanks to your test case and the alterations you have made, the solution is rather straightforward and not too much code is needed. Well done!

This kind of sequence is common before it is time to introduce the new code into the master branch by the way of a pull-request (PR) that co-workers review.

When your co-workers look at your PR, they notice that it is rather big, and spans tens of files and is a hodge-podge of code library version updates, renamed variables, removed code, lifted methods, extracted methods, etc. In the worst case it is all one single commit, maybe more than one, but not small, nor obviously correct. Simply, a too large PR.

The prescription

Let’s put aside, for a moment, whether or not this PR should be reviewed at all, or be returned to the author, and instead contemplate why tasks often, but not always, turns out this way?

There is no shortage of hints, good advice and guidelines for how to make good PR:s, good commits, good commit messages, good refactorings and so on. There are also tools, some good, some less good, to help the programmer reconstruct a sequence of commits for a PR, in contrary to actual order of the code changes being done. Git, for example, offers tools that allows us to save pending code changes out our way (git stash), and tools that allow us to select and single out changed parts into own commits (git add –patch). So, there is really no reason for PRs to end up this way! Or?

I notice that I sometimes manage to create better and sometimes worse pull-requests, in regards to its readability. I also notice that even co-workers that almost always succeed in submitting pull-requests as a string of logical, clear and almost trivial commits, can make good and sometimes better PRs.

On occasions, I find myself with a big mess of code changes and my attempts to split the changed rows into commits (git add –patch) does not go well. My desire to start all over again, now that I know the steps, is miniscule as I have already spent enough time on this particular problem. Besides, the solution and the resulting code is not bad! It might very well be really good. It’s just that the pull-request will be messy and harder to review, which increases the risk of bugs slipping through my co-workers’ tough but fair reviews.

And I know that the reason I end up in this situation is the fact that, to me, programming often is a non-linear activity, where the main task, solving the original problem, is interleaved by the need to change code, solve sub-problems, improving conditions to solve sub-problems, etc.

And, I know that when I structure my work as a stack, I can push what I have done thus far onto the stack (git stash), so that whenever I start something that I would want to present as an individual and separate change, it is both possible and trivial. This new interleaving activity may in turn need to go onto the stack as yet a new change is needed. 

This pattern can be repeated multiple times and by committing a small commit at the finishing of each sub-task with a concise message, a sequence of individual commits will come out. A sequence of commits that each introduces a small and almost trivial change. Running your tests each time ensures that the commit would become a successful build, too. Then, to continue, I simply pop the top task from my stack (git stash pop)  and carry on to solve it, then produce a small commit, and so forth.

Git helps me keep track of my code changes when I work like this. But, I, myself, must prepare so that when a sub-task is completed and committed, I can quickly return to the previous task and focus my mind on my previous line of thought and continue without any delays. We know that task switching can be expensive, so I try to make the life of my future me easier by writing a little instruction whenever I stash, e.g. like `git stash save CONTINUE-with-adding-read-only-checks`, as the git default message when stashing is much less helpful.

The reality

So, even though I know a way that is better, why do I not always succeed with this well-arranged pull-requests?

Maybe I do not value well-ordered PRs enough? In my personal experience, the difference between reviewing easily readable and logically arranged commits in a pull-request compared to one similarly sized commit is distinct. Also, the value of small commits is clear when I need to locate which commit that introduced a regression (git bisect). And of course, the value to my future self when months away from now I need to review the history of a certain piece of code to better understand why it looks like it does.

And, even though I know about and master the tools I need to succeed; unit-tests, git stash, git add –patch, etc?

What if it to succeed takes that me as a programmer focuses my consciousness on two things at the same time? Namely, 1) the problem solving that is my original and main task, and 2) conscious reflection on my ongoing work process.

I need to be aware of what I am doing in two different ways. The first, and most obvious, way is that I need to be aware of: the problem I am trying to solve and the tools and solutions alternatives I have available. The second is that I need to diagnose what step I am currently taking. Whether it is small and isolated enough to be obvious to someone that will review my change? 

I must therefore, dedicate a part of my consciousness to decide if what I currently do is a direct part of solving the original problem, or if it is a refactoring/preparation, and then immediately decide if I should stash my workspace, and make a commit of the small change. 

In situations when this is challenging to me, either because I am a novice in the programming language, the technology, the algorithm or the problem domain, most of my focus will be on the first part, the solving of the original problem. Very little, if any, of my mind will concern itself with reflecting on the type of change I am doing at each moment, and therefore I will neglect to stash, and commit the small steps that I do. The result, of course, is not a nice string of small and obviously correct commits.

The paradox

Is this an essential paradox? That, when I do, to me simple problem solving and programming, my pull-requests will turn out neat and nicely ordered, but when I work with complex problems with no solution known in advance, and really could use that extra bit of reviewing of my code, I will make messier pull-requests that will make reviewing more difficult?

I don’t think it has to be a paradox, but I notice that it often turns out that way.

What if we could practice noticing these logical programming changes, such the behaviour become reflexes? As reflexes, noticing micro task changes would not burden our conscious mind. Then, in theory, we could remain completely focused on our original problem solving at the same time as we, by reflex, make our tracks in the source code history, small and obviously correct.

Compare this idea to learning to ride a bicycle. To the complete beginner, biking requires full attention of your mind and body. Even making your feet turn the pedals requires a conscious thought. To us that have been biking since early childhood, the actual biking require so little of our consciousness that we can look around, keep a conversation with our friend, read a map, answer calls on the mobile, or even do tricks, like letting go of the handle-bars.

The bike-ride

Maybe it is so, that the behaviour that makes our work turn out like a series of logical commits can be practiced, much in the same way as biking? In that case, the desired behaviour would eventually turn so fully internalized that we would do it without spending a single thought on it? 

Then we could practice this behaviour, so that we could indeed focus even more of our mind’s attention on making good, clear and elegant solutions, and, at the same time make our pull-requests easy to review to our co-workers and our future selves.

Do you recognize yourself in this essay? Do you have any advice or experience on how to get maximal attention on finding good and elegant solutions and still leave neat and tidy traces in the code repository? Join and share on #softwareexcellence.

Git-hook to stop committing focused tests

Revised 2019-05-13. Code updated. Check git-hub for latest version.

Over and over again, I find myself committing focused tests, e.g. test files that has fdescribe instead of describe, fcontext instead of context and fit instead of it. The effect is that any testruns based on my committed code will not run the whole testsuite. Typically this is noticed by the buildserver (e.g. by watching the count of executed testcases and treating large drops in count as an error) so the risk that actually ends up in the product shouldn’t be very high.

Loosing time and breaking flow

But still, when this happens I break my flow and loose valuable time and it should not have to be this way. For some time, I have thought about fixing myself a git-hook, that will alert me that I am trying to commit a focused test. Today, was the day I turned thought into action. So, I took the time needed to learn the basics about git-hooks and created my first pre-commit git-hook.

The pre-commit source

Here’s the code I wrote in the first round:

#!/bin/sh

if git rev-parse --verify HEAD >/dev/null 2>&1
then
	against=HEAD
else
	# Initial commit: diff against an empty tree object
	against=$(git hash-object -t tree /dev/null)
fi

for FILE in `git diff-index --name-status $against -- | cut -c3-` ; do
    # Check if the file contains 'fdescribe("'
    if grep -q -e '^ *fdescribe(\"' -e '^ *fcontext(\"' -e '^ *fit(\"' "$FILE"
    then
        echo $FILE  'Commit blocked to prevent commiting test focus. (do not commmit fdescribe, fcontext, or fit)'
        exit 1
    fi
done
exit<pre class="lang:bash" title="First iteration">
</pre>

It seemed to work fine in the first few simple tests but soon I realized it didn’t do what I really wanted, namely to check the changes that was actually staged for the commit, and not the files as they were on the filesystem. So, after some trial and error, the hook code turned out like:

#!/bin/sh
echo "Pre-commit hook to block focused tests"

if git rev-parse --verify HEAD >/dev/null 2>&1
then
	against=HEAD
else
	# Initial commit: diff against an empty tree object
	against=$(git hash-object -t tree /dev/null)
fi

SAVEIFS=$IFS
IFS=$'\n'
for FILE in `git diff-index --diff-filter=AM --name-only $against`
do
    # Check if the file contains patterns
    if git grep --cached -q -e '^ *fdescribe(\"' -e '^ *fcontext(\"' -e '^ *fit(\"' -- $FILE
    then
        echo $FILE  ': Commit blocked to prevent commiting test focus. (do not commit fdescribe, fcontext, or fit)'
        exit 1
    fi
done
IFS=$SAVEIFS
echo "Commit ok, no focused tests found in staged changes"
exit

So, first I found a git way of filtering the changes to only additions or modifications, ie. git diff-index –diff-filter=AM and then restricting the output to –name-only which took away the pain of some parsing. Then using git grep –cached I could search the staged data for the patterns and ignore changes to the same file that is not staged. I also added some printouts to let me know that the hook is active and when the commit passes the checks.

Setup git to use your hook for repo

Git allows you to setup hooks in multiple ways; local to the repo, system global, custom absolute path to the hooks, and custom path to the hooks relative to the repo. Here I will show you the most basic way (takes least config changes). To use this hook in this repo, copy the script-file into your repo’s ./git/hooks-folder and name it pre-commit. Like this, assuming your repo is in your home folder and named the-repo:

cp pre-commit-block-focused-test ~/the-repo/.git/hooks/pre-commit
chmod +x ~/the-repo/.git/hooks/pre-commit-block-focused-test

Github repo for comments and issues

You can find the latest version of the script and a README.md that shows some other ways to config git to use your hooks in my git-hooks repo on github.

Data driven unit tests in Swift using Quick/Nimble

I sometimes find myself writing unit tests that is testing the same function for different examples of data. Instead of repeating almost the same code over and over again, I like to parameterize those tests and make them data driven. 

However, when I started coding Swift and making unit testing with Quick and Nimble I couldn’t immediately get that same approach to work. This caused not only sadness in my life, but a lot of repeated code that I really thought it shouldn’t be necessary.

Yesterday, I found myself in the situation, again, where almost the same code would duplicated for every type of message that my code should handle. Then I decided to go to the bottom with why the data driven approach would not work with Quick and with a bit of luck, create a solution and a PR.

So, I started a swift project for the purpose of isolating the problem with the smallest possible example. I didn’t take me long to write a small data driven set of tests that I had not been able to get working a year ago. And imagine my surprise, when my little data driven test suite just ran straight off the bat. The problem wasn’t there anymore.

I thought to myself, that maybe it had been working all the time and I had just messed up before. Perhaps that really is the case, but that doesn’t change the fact that I had not been able to find any examples on the internet where Quick/Nimble was used in a data driven way, including the official documentation. Then I tried the approach in my project code, in a slightly different way, and then it didn’t work. Adapting my project code to follow the exact same pattern as my isolated example, I finally got data driven unit tests to work in my project. 

Because it was not a completely hassle free experience and since there was not much to find on the net about it, I thought I should share an example. So, here is an example using Swift 4.2, Quick 2.0 and Nimble 8.0. I hope you will have good use for it!

Data driven unit tests in Quick/Nimble

import Quick
import Nimble

class QuickDataDrivenTests: QuickSpec {
    override func spec() {
        describe("a function") {
            let specified_examples : [(input: Int, output: Int)] = [
                (input: 0, output: 0),
                (input: 1, output: 2),
                (input: 2, output: 4)
            ]
            specified_examples.forEach { (input: Int, output: Int) in
                context("for input \(input) and output \(output) ") {
                    var for_each_executed_with_input : Int?
                    beforeEach {
                        for_each_executed_with_input = input
                    }
                    it("should execute the beforeEach") {
                        expect(for_each_executed_with_input).toNot(beNil())
                    }
                    it("should execute the before each with each entry") {
                        expect(for_each_executed_with_input).to(equal(input))
                    }
                    it("should double the input for output") {
                        expect(double(input)).to(equal(output))
                    }
                }
            }
        }
    }
}

func double(_ value : Int) -> Int { return 2*value }

Some pitfalls

There are some pitfalls, that I stumbled upon that might cause you an issue.

Duplicate test names.

Remember to use the data to make the strings in the context, and it unique. For example by using the Swift string interpolation like

context("for input \(input) and output \(output)

Your tests are not discovered

It seems that the forEach block must have the tests in at least one context to be discovered and later executed. 

Simple when you know it

As you see the example is pretty straight forward and simple, once you get all the parts in. I hope this will help many swift programmers reduce their boilerplating in Swift unit tests and encourage even more unit testing 🙂

Deinit crashes when DispatchQueue is inactive or suspended

The crash

If you find your code crashing (EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)) in the deinit function of a class you made. Even if you didn’t write a deinit-function yourself, but get a crash in the deinit automatically generated by the compiler. And if you in that class has a property with an instance of a DispatchQueue (but it could be any of the subtypes of DispatchObject, really).
Chances are you fell into the same trap as I did, and had to spend quite some time to figure out what caused it.

Example

Given the following example code, where a class has a property which it creates and assigns a DispatchQueue to:

import Foundation

class Client {
   init() {
      dq = DispatchQueue(label: UUID().uuidString, attributes [.concurrent, .initiallyInactive])
   }

   private let dq : DispatchQueue

   deinit() {
      print("crashes when leaving this scope")
   }
}

When your run the following bit of code (in a test for example):

 var client : Client? = Client()
 client = nil  // this makes the deinit to execute and crash

it will crash in the deinit after printing the message: crashes when leaving this scope.

The problem

In his WWDC talk from 2016, Pierre mentions the criteria for deallocation of dispatch objects and its sub types, like DispatchQueue. The two conditions the DispatchQueue must meet are: it must be active, and it must not be suspended.

In the example above, the queue is created inactive, which means that it cannot directly be deallocated without activating the queue. Hence, the deinit cannot complete successfully, and crashes.

deinit() {
  dq.activate()
}

Other lessons learned

Last, but not least, I also learnt that I cannot take for granted that deinit of my classes always works and that its prudent to test that newly initialized object can be deinit:ed without problems, atleast, when dealing with DispatchObjects.

Swift 3 and NSPredicate implicitness trap

When using Swift 3 and XCode 8.x and Core Data I had the following problems with NSPredicate.

First I wrote something like this:
let uuid = UUID(uuid_string: a_string_with_uuid)
let fetch: NSFetchRequest<UserMO> = UserMO.fetchRequest()
fetch_user_request.predicate = NSPredicate(format: "uuid = %@", uuid)
let users = try! managedObjectContext!.fetch(fetch_user_request)

That didn’t work and compiled with error message:
Argument type 'UUID' does not conform to expected type 'CVarArg'
Some googling found me a possible solution, namely that for swift it was better to use:

init(format predicateFormat: String, argumentArray arguments: [Any]?)
over what I had implicitly used:
init(format predicateFormat: String, arguments argList: CVaListPointer)

I changed my code into the following:
let uuid = UUID(uuid_string: a_string_with_uuid)
let fetch: NSFetchRequest<UserMO> = UserMO.fetchRequest()
fetch_user_request.predicate = NSPredicate(format: "uuid = %@", [uuid])
let users = try! managedObjectContext!.fetch(fetch_user_request)

This removed the compilation error message, BUT the resulting executable did not do what I expected. The query returned an empty results independently of me trying various other parameters to match, e.g. display_name. Instead, the predicate evaluation seemed to fail silently.

As it turns out, the correct way to invoke the NSPredicate initializer with Swift 3 was to have explicit labels for both arguments like below NOTE: the added argument label argumentArray::

let uuid = UUID(uuid_string: a_string_with_uuid)
let fetch: NSFetchRequest<UserMO> = UserMO.fetchRequest()
fetch_user_request.predicate = NSPredicate(format: "uuid = %@", argumentArray: [internally_identified_by])
let users = try! managedObjectContext!.fetch(fetch_user_request)

Hopefully, this might save someone else a minute or two..

« Older posts

© 2024 Niffic AB

Theme by Anders NorenUp ↑