Saturday, April 4, 2015

Build Tooling Discussions, Part II: Style Checking & Linting

In an earlier post I released base-api, a template project for developing Scala API servers.  In this post I'd like to discuss why I included a style checker / linter in that project and the philosophy of automated code repository rule enforcement in general. The last post on this topic covered automated code formatting, which is really just a subset of style rule enforcement that happens to be possible to fix in an automated manner.

Style checking and linting are closely related concepts that differ primarily in that:
  • Style rules are aimed at reducing the cognitive load (see also) imposed on engineers attempting to understand or work with your codebase. 
  • Linting is aimed at preventing bugs enforcing a ban on specific patterns and practices that are both considered harmful and detectable in an automated manner. 
We consider these together because often style rules and linting bleeds together. For example, is it a stylistic concern that all if statements have braces, or is not including them a harmful practice because it frequently leads to bugs where logic accidentally escapes control flow? Hard to say, so we just roll them into one topic.

A common rule among style checkers is the prevention of magic number use. Let's take a look at an example where preventing use of magic numbers might save our bacon. Say we implement this block of code somewhere in our application:
    def hasFlag(bits: Int, flag: Int) = {
      ((bits >> flag) & 1) != 0
    }
    
    def doSomeLogic(userPreferences: Int) {
      // bit 0 is make profile public
      if (hasFlag(userPreferences, 0)) {
        // show public profile
      }
      // bit 1 is receive site emails
      if (hasFlag(userPreferences, 1)) {
        // send daily digest email
      }
    }
Then elsewhere another developer implements the logic to store user preferences in the database:
    // sets the bit at $flag position to $enabled
    def setFlag(bits: Int, flag: Int, enabled: Boolean) = {
      val value = if (enabled) 1 else 0
      bits | (value << flag)
    }

    def storeUserPreferences(receiveEmails: Boolean, publicProfile: Boolean) {
      var userPreferences = 0
      userPreferences = setFlag(userPreferences, 0, receiveEmails)
      userPreferences = setFlag(userPreferences, 1, publicProfile)
      // store userPreferences...
    }
We suddenly have a very hard to find bug that may not even show up in cursory testing. We release and maybe it takes a while before anyone even notices some users complaining about getting emails they weren't supposed to. If you haven't noticed it - the position of the receiveEmails and publicProfile bits are reversed in these two snippets. With a 'no magic numbers' rule enforced, this code would not be allowed to build / commit / merge (depending on where the pipeline you implement the rule). Hopefully the developer's instinct would be to rectify this situation with an enum thusly:
    object UserPreferenceBits extends Enumeration {
      type UserPreferenceBit = Value
      val ReceiveEmails = Value(0)
      val PublicProfile = Value(1) 
    }

    // checks whether an integer has the bit at $flag position enabled
    def hasFlag(bits: Int, flag: UserPreferenceBit) = {
      ((bits >> flag.id) & 1) != 0
    }

    // sets the bit at $flag position enabled
    def setFlag(bits: Int, flag: UserPreferenceBit, enabled: Boolean) = {
      val value = if (enabled) 1 else 0
      bits | (value << flag.id)
    }
However if they do not understand the reasoning behind the 'no magic numbers' rule, they might try to do something unpleasant like:
    def doSomeLogic(userPreferences: Int) {
      val publicProfileFlag = 0 // BAD!
      val receiveEmailFlag = 1 // BAD!
      // bit 0 is make profile public
      if (hasFlag(userPreferences, publicProfileFlag)) {
        // show public profile
      }
      // bit 1 is receive site emails
      if (hasFlag(userPreferences, receiveEmailFlag)) {
        // send daily digest email
      }
    }

    def storeUserPreferences(receiveEmails: Boolean, publicProfile: Boolean) = {
      var userPreferences = 0
      val receiveEmailFlag = 0 // BAD!
      val publicProfileFlag = 1 // BAD!
      userPreferences = setFlag(userPreferences, receiveEmailFlag, receiveEmails)
      userPreferences = setFlag(userPreferences, publicProfileFlag, publicProfile)
      // store userPreferences....
      userPreferences
    }
Unfortunately some bad practices just can't be defended against, and in the face of a truly determined developer all rules will whither and die. That's what collaboration on the rules creation process and a strong commitment to education about it within an organization is crucial to its efficacy.



Where and how to deploy style checking & linting


There are three primary places one can implement these checks - at compile time, at commit time, or as part of the PR merge process. As with most automation, running it earlier is generally better - at least at compile time the developer is likely still in the context of the offending code. Sometimes this isn't possible - maybe your code doesn't even have a compile step, or maybe your linter is really heavy and including it in your compile step would cause an unacceptable delay for the developer. If you have a good reason, later is OK too, it just has to be balanced against the increased cost of getting back into context to fix whatever problems are uncovered.

The other part of the 'how' is your internal education process about these kinds of rules, and process for updating / adding / removing them. All developers have to be invested in the process for it to matter (and for them to not try to work around it), so it's worth your time to demonstrate real benefits in your project and make sure everyone is on board. Another consideration is that you should do this as early as possible, the most common failure of attempts to implement rules like this is that they are made on large extant codebases that have thousands or more violations. Ain't nobody going back through and fixing all that.

For Scala my favorite is ScalaStyle - it's mostly style rules but also has some good functional rules.