Chill Out with the Defaults

I predominantly work in Swift and Kotlin, both of which support default arguments. As with any feature, it’s worth being careful: overuse can lead to unexpected design trade-offs.

A common pattern I keep seeing in various codebases I work on is that data transfer objects are being defined using default arguments in their constructors. I think this leads to a few issues that I’ll explore in this post.

A Simple Example

Here’s a typical example of a type with a default argument on tags.

struct BlogPost {
    let title: String
    let tags: [String]

    init(title: String, tags: [String] = []) {
        self.title = title
        self.tags = tags
    }
}
data class BlogPost(
    val title: String,
    val tags: List<String> = emptyList()
)

It’s not always clear why this is done. I suspect it’s often out of habit or for convenience in testing. My suspicion that testing is the driver comes from seeing this pattern repeatedly in codebases where the production code explicitly provides every argument and never uses the defaults, but the tests do. It gets my spidey senses tingling when it feels like we are weakening our production code in the service of adding tests.

The unintended consequence of these defaults is that the compiler can no longer be as helpful.


Exhaustivity Tangent

Just to make sure we are on the same page, let’s talk about exhaustivity checking with enums, as hopefully most people will have come across it. If we declare an enum and later switch over its cases, the compiler can check that we cover every case (assuming we don’t add a default case).

For example, let’s start with a PostStatus with two cases:

enum PostStatus {
    case draft
    case published
}

func outputFolder(status: PostStatus) -> String {
    switch status {
        case .draft: "/dev/null"
        case .published: "blog"
    }
}
enum class PostStatus {
    Draft,
    Published
}

fun outputFolder(status: PostStatus): String {
    return when (status) {
        PostStatus.Draft -> "/dev/null"
        PostStatus.Published -> "blog"
    }
}

If we add a third case of archived then the compiler will force us to revisit our outputFolder function as it’s no longer exhaustive:

enum PostStatus {
    case archived
    case draft
    case published
}

func outputFolder(status: PostStatus) -> String {
    switch status { // Error -> Switch must be exhaustive
        case .draft: "/dev/null"
        case .published: "blog"
    }
}
enum class PostStatus {
    Archived,
    Draft,
    Published
}

fun outputFolder(status: PostStatus): String {
    return when (status) { // Error -> 'when' expression must be exhaustive. Add the 'Archived' branch or an 'else' branch.
        PostStatus.Draft -> "/dev/null"
        PostStatus.Published -> "blog"
    }
}

This is great because it means the compiler will guide us step by step through every callsite so we can decide on the appropriate action to take at each one.


Missing Exhaustivity 😢

If we agree that exhaustivity checking is a good thing, then we can extend the same logic to the first example. Let’s say we create instances of our BlogPost type in a few places in our codebase and we want to add a new isBehindPaywall property. If we give it a default value, the compiler won’t highlight all the callsites we should reconsider. If we’re lucky, we make the change and all is fine; if we aren’t so lucky, we could accidentally have blog posts hidden or shown when they aren’t supposed to be. In this case I’d much rather the compiler made me check every callsite so I can make the correct decision at each one.
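
To make that concrete, here’s a minimal sketch of the Swift side (the Kotlin version behaves the same way): isBehindPaywall is added without a default, so every existing callsite stops compiling until we make a decision.

struct BlogPost {
    let title: String
    let tags: [String]
    let isBehindPaywall: Bool

    init(title: String, tags: [String] = [], isBehindPaywall: Bool) {
        self.title = title
        self.tags = tags
        self.isBehindPaywall = isBehindPaywall
    }
}

// An existing callsite elsewhere in the codebase:
BlogPost(title: "Some blog post") // Error -> Missing argument for parameter 'isBehindPaywall' in call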

In practice all this means is that I have to explicitly specify the isBehindPaywall argument and accept that there might be some duplication:

// Explicit
BlogPost(title: "Some blog post", isBehindPaywall: false)

// Implicit
BlogPost(title: "Some blog post")
// Explicit
BlogPost(title = "Some blog post", isBehindPaywall = false)

// Implicit
BlogPost(title = "Some blog post")

Local Reasoning

The explicit version above has another strength: improved local reasoning. If I want to know how the isBehindPaywall state was decided, I can simply look at the callsite that instantiated the instance. With defaults this isn’t as simple - first I look at the callsite, then if no value was provided I have to look up the declaration of BlogPost. With an IDE’s click-through this might not seem like a hardship, but it also means we are vulnerable to changes made at a distance, e.g. someone could change the default value and silently affect every callsite that didn’t explicitly pass one.
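
To make that change-at-a-distance risk concrete, here’s a tiny hypothetical sketch - the declaration is edited, the callsite isn’t, and the behaviour flips silently:

struct BlogPost {
    let title: String
    let isBehindPaywall: Bool

    // A later edit flips this default from false to true.
    init(title: String, isBehindPaywall: Bool = false) {
        self.title = title
        self.isBehindPaywall = isBehindPaywall
    }
}

// Compiles before and after that edit, but what it means changes underneath us.
BlogPost(title: "Some blog post")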


Discoverability

You might think it’s fine: when I add the property I’ll go and check every callsite myself manually. This is all well and good if you are the sole owner of the codebase and you aren’t publishing your code as a library. But bear in mind it’s not a permanent fix - anyone can come along later and create an instance of BlogPost, and they may or may not notice the isBehindPaywall option, which means they could get it wrong.

The issue is that when people come to create instances of our BlogPost type they can get away without providing a value for isBehindPaywall, blissfully unaware that it even exists or what impact it has, e.g.

BlogPost(title: "My Blog post")
BlogPost(title = "My Blog post")

Respecting Boundaries

Another subtle issue is whether something like a data transfer object should even have this knowledge bestowed upon it, or whether the code that owns the business rules should be in charge.

Consider this scenario I had recently:

I have a backend service that supplies data for Android and iOS clients. The backend uses kotlinx.serialization, iOS uses some legacy JSONSerialization code and Android uses Gson.

       +---------+
       | Backend |
       | kotlinx |
       +---------+

+---------+  +-------------------+
| Android |  |        iOS        |
|   Gson  |  | JSONSerialization |
+---------+  +-------------------+

With this setup we have 3 different code bases that are bound by an informal contract of what the JSON should look like. Each platform is using different libraries to encode/decode and could have subtle differences in how this is done.

We also have a Kotlin Multiplatform library so it makes sense to refactor like this:

  • Extract the code from the backend service to the Kotlin Multiplatform module
  • Utilise this kotlinx.serialization based code from all 3 places
       +-----------------------+
       | kotlinx.serialization |
       +-----------------------+
            ^      ^      ^
           /       |       \
          /   +---------+   \
         |    | Backend |    |
         |    +---------+    |
         |                   |
       +---------+    +---------+
       | Android |    |   iOS   |
       +---------+    +---------+

We still have 3 code bases that have to agree on how the JSON is structured but now that is handled in a more concrete way by providing the type and the encoding/decoding logic in a shared library.

Originally the type that lived in the backend service had default values baked into it, but with this new split that doesn’t really make sense. The Android/iOS clients are supposed to be totally dumb and just trust the data handed to them, yet the shared type knows too much with defaults baked in. It makes much more sense to strip the defaults and let the business rules on the server populate these values, which keeps the type as simple as possible.
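
Here’s a rough sketch of the shape I ended up preferring (the real shared module is Kotlin Multiplatform with kotlinx.serialization; I’m showing the idea in Swift with hypothetical names): the shared type is a dumb bag of data with no defaults, and the server-side rules populate every field.

// Shared, "dumb" transfer type - no defaults, no business rules baked in.
struct BlogPostDTO: Codable {
    let title: String
    let tags: [String]
    let isBehindPaywall: Bool
}

// Hypothetical server-side rule that owns the decision.
struct PaywallRules {
    let paywalledTags: Set<String>

    func requiresSubscription(tags: [String]) -> Bool {
        !paywalledTags.isDisjoint(with: tags)
    }
}

// The server fills in every field explicitly; the clients just decode and trust it.
func makeBlogPostDTO(title: String, tags: [String], rules: PaywallRules) -> BlogPostDTO {
    BlogPostDTO(
        title: title,
        tags: tags,
        isBehindPaywall: rules.requiresSubscription(tags: tags)
    )
}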


Conclusion

It may seem like I’m beating up on default arguments, but in reality I use them all the time. My main point is: before adding defaults, ask what you might lose. Sometimes explicit arguments add a little duplication, but they make your code safer, more discoverable and easier to reason about.

Explaining Regex Locally with Xcode

Crafting regular expressions is fun - when you nail the incantation you feel like a wizard, but as time passes that once elegant spell can start to look like the output of an Enigma machine. My approach when I needed to figure out what an old regex was doing used to be pasting it into whatever the top search result for “Explain regular expression” was.

Something that I’ve been doing for a while but hadn’t really thought about was using Xcode’s Refactor > Convert to Regex Builder as a way to explain a regular expression without having data leave my machine. Yes, it’s a shocker but one of Xcode’s refactoring tools actually works for me. It works surprisingly well and as a bonus gets those endorphins flowing knowing that my data is safe from some AI drivel.

Here’s what this looks like in practice using an example of decoding a regex for UK postcodes taken from this Stack Overflow post.

let postcodeRegex = /^([a-zA-Z]{1,2}[a-zA-Z\d]{1,2})\s(\d[a-zA-Z]{2})$/

After running Refactor > Convert to Regex Builder we get

Regex {
    /^/
    Capture {
        Regex {
            Repeat(1...2) {
                CharacterClass(
                    ("a"..."z"),
                    ("A"..."Z")
                )
            }
            Repeat(1...2) {
                CharacterClass(
                    ("a"..."z"),
                    ("A"..."Z"),
                    .digit
                )
            }
        }
    }
    One(.whitespace)
    Capture {
        Regex {
            One(.digit)
            Repeat(count: 2) {
                CharacterClass(
                    ("a"..."z"),
                    ("A"..."Z")
                )
            }
        }
    }
    /$/
}

Depending on your level of experience with regex you might think this is actually more wordy/overkill (in this case I’d agree), but the point is that each part of the regex is broken out into smaller pieces with names explaining what they are doing. Normally concepts like Capture, Repeat and CharacterClass require you to understand regex syntax, but the builder makes them much more approachable. With the builder style, a dev with no regex experience could make sense of this without needing to look up regex rules and how they are applied.
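
As a quick spot check (a hypothetical example, not part of the refactor output), both forms are ordinary Regex values, so you can run either one over a sample postcode to convince yourself the conversion didn’t change behaviour:

let postcode = "SW1A 1AA"

if let match = postcode.wholeMatch(of: postcodeRegex) {
    print(match.1, match.2) // SW1A 1AA
}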

Wrap up

Even if it wasn’t designed as an “explain this regex” feature, that’s exactly how I use it and it’s a lifesaver when revisiting old code.

Turning CI Logs into Actions

When a CI job fails, the first thing you usually do is scroll through a wall of logs trying to spot the error. It’s tedious, slow, and often the exact same dance on every project. You could bolt on scripts to comment on PRs or ping Slack, but then you’re duplicating logic across repos and languages and spreading around more auth tokens than you’d like.

What if instead, your build just emitted special log lines, and a wrapper tool noticed them and took action? That way your CI stays simple, projects don’t need extra secrets, and you get rich behaviour like PR comments or Slack alerts “for free.”

One way around this is to do something similar to what Xcode does: if I emit a log like warning: This is a warning, it furnishes the UI with a nice warning triangle. So in our case, if we provide an executable that knows all about pinging Slack, GitHub and whatever other services we care about, we can have this program wrap our build script and look for special logs. When it sees a log it knows how to handle, it can perform the right action. With this new parent process responsible for executing our build script, we can choose to strip out any environment variables we don’t want to share, meaning the parent process can be auth’d to talk to Slack while the child knows nothing about it.


Less writing, more writing code

Thankfully swift-subprocess has us pretty nicely set up for running the child process and parsing logs.

 1  import Foundation
 2  import Subprocess
 3
 4  let terminationStatus = try await run(
 5      .path("/bin/bash"),
 6      arguments: ["-c", ProcessInfo.processInfo.arguments.dropFirst().joined(separator: " ")],
 7      error: .standardError
 8  ) { execution, standardOutput in
 9      for try await line in standardOutput.lines() {
10          // Parse the log text
11          print(line, terminator: "")
12      }
13  }.terminationStatus
14
15  switch terminationStatus {
16  case let .exited(code):
17      exit(code)
18  case let .unhandledException(code):
19      exit(code)
20  }

Let’s unpack the interesting bits:

  • Lines 4-8 execute the command we pass in, inside a bash subprocess.
  • Line 6 drops the first argument as this will be the path to our executable; the rest is the command we want to execute.
  • Line 7 ensures that stderr isn’t dropped on the floor, as we want our wrapper to be as transparent as possible.
  • Lines 9-12 are where the magic is going to happen soon.
  • Line 11 just echoes stdout, otherwise our program would appear to swallow all output.
  • Lines 15-20 take the termination status of the subprocess and make it our own exit status.

I haven’t explored the full API surface yet but this approach seems reasonable.

At a high level, we will invoke our new executable, which I’ll call log-commander, with our normal build script, something like this:

log-commander bundle exec fastlane build

Under the hood log-commander will essentially execute /bin/bash -c "bundle exec fastlane build" and then proxy all output and the final result.


Let’s make it do something interesting

As it stands we’ve not achieved much, so let’s teach log-commander to comment on a PR. We’ll assume that log-commander is provided GITHUB_TOKEN, GITHUB_URL, GITHUB_ISSUE_ID, GITHUB_OWNER and GITHUB_REPO as environment variables. With this we can create a simple client to post a comment to a PR on GitHub

class GitHubClient {
    private let baseURL: URL
    private let session: URLSession

    init() {
        let sessionConfiguration = URLSessionConfiguration.default
        sessionConfiguration.httpAdditionalHeaders = [
            "Accept": "application/vnd.github.v3+json",
            "Authorization": "Bearer \(getOrFatalError("TOKEN"))",
            "X-GitHub-Api-Version": "2022-11-28",
        ]
        self.session = .init(configuration: sessionConfiguration)
        self.baseURL = URL(string: getOrFatalError("URL"))!
            .appending(components: "repos", getOrFatalError("OWNER"), getOrFatalError("REPO"))
    }

    func postComment(_ message: String) async throws {
        let url = baseURL.appending(components: "issues", getOrFatalError("ISSUE_ID"), "comments")
        var urlRequest = URLRequest(url: url)
        urlRequest.httpMethod = "POST"
        urlRequest.httpBody = try JSONEncoder().encode(["body": message])
        let (_, response) = try await session.data(for: urlRequest)

        if (response as? HTTPURLResponse)?.statusCode != 201 {
            throw NSError(domain: "GitHubClient", code: (response as? HTTPURLResponse)?.statusCode ?? -1, userInfo: nil)
        }
    }
}

private func getOrFatalError(_ key: String) -> String {
    if let value = ProcessInfo.processInfo.environment["GITHUB_\(key)"] {
        value
    } else {
        fatalError("Required 'GITHUB_\(key)' environment variable not set")
    }
}

* Proper error handling is left as an exercise for the reader.


Connecting the dots

Now that we have the top level code to wrap our normal build process and a client to communicate with GitHub, we just need to link them together. I’m going to go with the idea that if a line contains an opening and closing curly brace then I’ll attempt to parse it as a JSON payload. If it parses correctly then log-commander will post the comment to GitHub. Any unsuccessful decoding will just be ignored.

To achieve this we’ll introduce a structure that represents the command we care about parsing

struct Action: Decodable {
    let prComment: PRComment?

    struct PRComment: Decodable {
        let message: String
    }
}

I’m nesting the PRComment under an Action struct so I can build up different actions that will require different data.
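
For example, a second hypothetical action could sit alongside the first without disturbing anything - a log line that only contains a pr_comment still decodes, with the other properties left as nil:

struct Action: Decodable {
    let prComment: PRComment?
    let slackMessage: SlackMessage? // hypothetical extra action

    struct PRComment: Decodable {
        let message: String
    }

    struct SlackMessage: Decodable {
        let channel: String
        let message: String
    }
}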

With this we can update our line parsing to something like this

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

for try await line in standardOutput.lines() {
    print(line, terminator: "")

    guard
        let openingBrace = line.firstIndex(of: "{"),
        let closingBrace = line.lastIndex(of: "}"),
        let action = try? decoder.decode(Action.self, from: Data(line[openingBrace...closingBrace].utf8))
    else { continue }

    if let prComment = action.prComment?.message {
        try await client.postComment(prComment)
    }
}
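
In the snippet above, client is assumed to be a GitHubClient we created once up front, e.g.

let client = GitHubClient()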

Now if we just emit a log statement in our normal build tool that looks like this

{"pr_comment":{"message":"Example"}}

The log-commander will parse this out and post a comment on the PR with the message Example.


Tidying up

I mentioned that we don’t want the child process to know about GitHub auth; this can be taken care of by manipulating the environment in the initial run invocation, something like this:

environment: .inherit.updating(["GITHUB_TOKEN": ""])
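
Putting that together with the earlier run call, a sketch of the full invocation might look like this (exact parameter placement may vary between swift-subprocess versions):

let terminationStatus = try await run(
    .path("/bin/bash"),
    arguments: ["-c", ProcessInfo.processInfo.arguments.dropFirst().joined(separator: " ")],
    // Blank out the token so the child process never sees it.
    environment: .inherit.updating(["GITHUB_TOKEN": ""]),
    error: .standardError
) { execution, standardOutput in
    for try await line in standardOutput.lines() {
        print(line, terminator: "")
    }
}.terminationStatus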

Wrap up

I like the idea of writing this logic once and then sharing it among teams to make debugging life easier. Instead of scrolling endlessly through CI logs or wiring up ad-hoc scripts for every project, you get a single wrapper that turns structured logs into actions. This post showed GitHub PR comments, but you could extend it to Slack alerts, build metrics or whatever else you can dream up. It also makes rich behaviour super low friction - just wrap your normal build command invocation with this tool and emit some specially formatted logs.