Why I’m choosing Gherkin over Jira to define my app’s functionality

9 min read · Nov 3, 2024

As I work on Garner, my workplace journalling app, it’s becoming more important to document the functionality of the app so I have a working set of features I can iterate over as new ideas and constraints of the system emerge.

In my day job as a Software Developer it’s common to capture this information in some form of tool. This can be a knowledge base tool like Confluence or Notion, or a “ticket” based tool like Jira or Trello.

I have an issue with the knowledge base approach as it’s far too common for the knowledge base to get out of sync with the functionality in the app. Usually the party that defines the functionality does so before the development happens, and a gap in communication means things aren’t updated once the feature ships.

I also have an issue with the “ticket” approach as I find those systems are too temporal, usually resulting in some form of ticket archaeology to piece together the definition of a feature. One solution to this is to use a testing tool like Zephyr or X-Ray to create test cases that can be linked to the development work, so you have a definition of how to test the functionality.

Another option which I’ve used previously is to write Feature files in Gherkin and keep these in the codebase. These Feature files are then used as a definition of the system’s functionality and when paired with an automation framework become a form of Living Documentation of the system.

There are downsides to Gherkin. It’s often misused and abused, becoming a tool that only the testers use to define the system without the collaboration of the business. Even when the business does collaborate, it’s too easy for people to make the scenarios too complex. I’ve also encountered issues in the past where less technically inclined people have refused to store the definition “in code” as it’s a barrier to access for them.

Luckily I’m working on Garner by myself and will be playing the parts of Product, Designer, Developer and Tester, so the communication and technical complexities of Gherkin aren’t enough to dismiss this approach.

Figuring out how to document the features of the app

As I’ve been exploring the technical and design aspects of the app idea I’ve focused less on the functionality and the edge cases and more on getting comfortable with the tooling I’d be using so I’m confident I can execute my vision.

Now that I’m trying to document the features of the app, I need a quick and flexible means to capture things, one that lets me ask questions and adjust the map when I land on an answer that impacts the functionality of the app.

Some people I’ve worked with like to do this within a tool like Jira, building up epics and stories as they go, but I find this too cumbersome. For me a User Story Map done digitally in a tool like Miro or Apple’s Freeform gives me the flexibility to move stuff about until I land on a hierarchy of sticky notes where I can clearly see the definition of the functionality, and then iterate over it until I’m happy.

Building a User Story Map in Freeform

Freeform is like most of Apple’s free software, in that it sucks, but it has enough functionality and doesn’t cost anything so it feels like the right tool for the job for now. I’ve been spoilt in the past working on Enterprise level projects where I’ve been given the top-tier license for Miro, and if I could justify the cost I would have used that, but I can’t so here we are.

The classic approach to building a User Story Map is to create a sticky note hierarchy of:

Activity (the User Story)
Task (the action the user takes as part of completing the activity)
Sub-task / Task Details (any useful details about how the action is completed)

The User Story Map is then broken into “slices” so that the most important activities and tasks are prioritised to deliver the most value in the shortest amount of time.

I’d already descoped a lot of work to give myself a tight MVP for Garner so I took a different approach. The MVP would be a very basic Create, Read, Update, Delete (CRUD) app based around the creation of journal entries, the questions asked as part of those entries and a scheduled reminder to prompt the user to create an entry.

As I was looking at the CRUD lifecycle of these entities I followed this process:

I started mapping the entities out and then the stages of the CRUD lifecycle underneath
Underneath each CRUD stage I created an Activity / User Story for the user completing that stage
Under each Activity I then mapped out the different conditions that could impact that Activity being completed, as some relationships mean that certain Activities could not be completed or would need to be completed in a different way
Under each condition I then mapped out the tasks the User and the System would perform to complete the Activity
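To make that hierarchy a bit more concrete, here is a rough sketch of how the map could be modelled as data. Every type, property and example value below is hypothetical and invented for illustration; none of it comes from Garner’s codebase. It only mirrors the shape described above: entity, CRUD stage, Activity, condition, then User and System tasks.

```swift
// Hypothetical model of the story map hierarchy described above.
// All names and example data are illustrative only.

enum CRUDStage: String, CaseIterable {
    case create, read, update, delete
}

struct Condition {
    let summary: String        // the condition that changes how the Activity is completed
    let userTasks: [String]    // what the User does
    let systemTasks: [String]  // what the System does in response
}

struct Activity {
    let story: String          // the User Story for this CRUD stage
    var conditions: [Condition]
}

struct Entity {
    let name: String
    var activities: [CRUDStage: Activity]
}

let journalEntry = Entity(
    name: "Journal Entry",
    activities: [
        .create: Activity(
            story: "As a user I want to create a journal entry",
            conditions: [
                Condition(summary: "questions exist",
                          userTasks: ["Answer each question"],
                          systemTasks: ["Save the entry"]),
                Condition(summary: "no questions exist",
                          userTasks: ["Choose to add a question"],
                          systemTasks: ["Show the question form"]),
            ]
        )
    ]
)

// CRUD stages with no Activity yet are the columns still left to map out.
let unmapped = CRUDStage.allCases.filter { journalEntry.activities[$0] == nil }
print(unmapped.map(\.rawValue))
```

Listing the unmapped stages is just one way of spotting the gaps that would otherwise show up as empty columns on the board.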

This meant my User Story Map wasn’t as delivery-focused as a traditional User Story Map because there was no real means to “slice” it, but I found this approach made it easier for me to explore the functionality.

By mapping out the conditions under which a user would be completing an Activity, I was able to ask myself how the Activity would be completed when a certain condition was met and then, if needed, make room for another column in the User Story Map to map it out.

Once I had all the stages of the CRUD lifecycle for all the entities in the app mapped out and I had no unanswered questions about how the user would complete the activities I started to look at how I would turn the map into a definition of the app’s functionality.

I started by building up Epics and User Stories in Jira for the work to be done, and test cases in X-Ray to define the behaviour to test.

Translating the User Story Map to Jira & X-Ray

The process of recreating a User Story Map in Jira is usually pretty straightforward:

You create an Epic to group a set of User Stories, usually based on some form of shared functionality area
You create a User Story for each Activity in the User Story Map
You add the Task and Sub-task information on the ticket to help guide the delivery team

This gives you a set of User Stories that can be picked up by the designers, developers and testers in the delivery team to work on and contains just enough information for them to get started and figure out the details as they build things.

An example User Story with Test Case in Jira and X-Ray

As part of bringing that User Story to life the development team will create a set of test cases in X-Ray and link them to the User Story so that there’s bi-directional traceability between the tests and the User Story. This allows you to report on the quality of the product at a User Story and Epic level as well as a delivery (e.g. a Sprint when doing Scrum) level and is generally quite a nice way of doing things.

This is a really nice set up, but as I went about doing it for Garner I struggled to see the benefit of taking on such a large amount of overhead just so that I could get to work. In a team this overhead is usually justifiable: the User Story acts as a source of truth for all parties and it allows supporting artefacts such as designs, code and conversation to be attached to it to build a shared understanding.

As I’m working on Garner alone I can’t really justify the time investment as I understand what I need to build.

This unjustifiable overhead became even more apparent to me when it came to writing out the test cases in X-Ray.

X-Ray has a couple of different ways to define test cases. You can write out a set of steps and list what should and shouldn’t happen at each point, or you can write out a Gherkin-like Given, When, Then style set of steps.

As I started writing out the steps for a test case I realised that I could kill two birds with one stone with Gherkin: I could define the functionality of the app and also build up a suite of executable tests.

X-Ray costs $10 a month, and I’d also have the additional overhead of getting the results of the automated tests I’d be creating as part of developing the app into X-Ray. With that in mind I decided to just use Gherkin, as I’d get that reporting from the test runner without doing anything extra.

Ditching Jira and moving to Feature files in Gherkin

Before I could completely commit to using Gherkin I wanted to make sure that there was a library for executing the Feature files as tests in Swift, as that is the language I’ll be writing the app in.

It turns out that while Swift has a bunch of Behaviour Driven Development (BDD) testing libraries, most of these have you define the Gherkin in the test file itself, instead of reading the feature files and providing a means to define the step implementations.

I did find one library, Cutworm, which reads the feature files and lets you define the step implementations, and that gave me the confidence to go ahead with Gherkin feature files as a means to define the app’s behaviour.
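To show the difference between the two styles, here is a toy sketch of the “read the feature text, then look up a step implementation” approach. To be clear, this is not Cutworm’s API or any other library’s; StepRegistry, its methods and the scenario text are all made up for illustration, and the matching is plain exact-string lookup rather than anything a real BDD library would do.

```swift
// A toy step registry: the scenario stays as plain text,
// and the test target only supplies step implementations.
// StepRegistry and its methods are hypothetical, not any library's API.

final class StepRegistry {
    private var steps: [String: () -> Void] = [:]

    // Register the code to run for an exact step sentence.
    func define(_ text: String, _ body: @escaping () -> Void) {
        steps[text] = body
    }

    // Run each step line in the scenario text and return
    // the steps that have no implementation registered.
    func run(scenario: String) -> [String] {
        var undefined: [String] = []
        let keywords = ["Given ", "When ", "Then ", "And ", "But "]
        for rawLine in scenario.split(separator: "\n") {
            let line = String(rawLine.drop(while: { $0 == " " || $0 == "\t" }))
            // Lines that don't start with a step keyword (e.g. "Scenario:") are skipped.
            guard let keyword = keywords.first(where: { line.hasPrefix($0) }) else { continue }
            let text = String(line.dropFirst(keyword.count))
            if let body = steps[text] {
                body()
            } else {
                undefined.append(text)
            }
        }
        return undefined
    }
}

// Invented scenario text standing in for the contents of a .feature file.
let scenario = """
Scenario: Create an entry when questions exist
  Given I have at least one question
  When I create a journal entry
  Then the entry is saved
"""

var entries = 0
let registry = StepRegistry()
registry.define("I have at least one question") { }
registry.define("I create a journal entry") { entries += 1 }

// The "Then" step has no implementation yet, so it is reported back.
let missing = registry.run(scenario: scenario)
print(missing)
```

The appeal of this style is that the scenario text lives in one place, and a step with no implementation shows up as a gap rather than silently passing.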

An example of one of the feature files I’ve written to document my app’s behaviour

To create my set of feature files I mapped each activity in the user story map to a feature file, and then wrote a scenario for each condition under that activity in the user story map.

This mapping made things really easy to work with and because feature files are just plain text I was able to do this all quickly with the help of autocomplete in Zed to ensure the steps were consistent.

In the end I had 19 scenarios across 7 features which is quite manageable.
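The activity-to-feature mapping can be sketched in a few lines of Swift. The types, the function and the example content below are all hypothetical, and the feature shown is invented rather than one of Garner’s real feature files; the point is only that one activity becomes one Feature and each condition becomes one Scenario.

```swift
// Hypothetical types: one activity becomes one feature file,
// and each condition under it becomes one Scenario.
struct ScenarioSpec {
    let condition: String
    let steps: [String]   // full Gherkin steps, keyword included
}

struct ActivitySpec {
    let name: String
    let scenarios: [ScenarioSpec]
}

// Render an activity as the plain text of a .feature file.
func featureFile(for activity: ActivitySpec) -> String {
    var lines = ["Feature: \(activity.name)"]
    for scenario in activity.scenarios {
        lines.append("")
        lines.append("  Scenario: \(scenario.condition)")
        lines.append(contentsOf: scenario.steps.map { "    \($0)" })
    }
    return lines.joined(separator: "\n")
}

// Invented example content for illustration.
let addEntry = ActivitySpec(
    name: "Add journal entry",
    scenarios: [
        ScenarioSpec(
            condition: "Questions exist",
            steps: ["Given I have at least one question",
                    "When I add a journal entry",
                    "Then the entry appears in my journal"]
        ),
        ScenarioSpec(
            condition: "No questions exist",
            steps: ["Given I have no questions",
                    "When I add a journal entry",
                    "Then I am prompted to create a question"]
        ),
    ]
)

let text = featureFile(for: addEntry)
print(text)
```

In practice I wrote the files by hand, so treat this as a picture of the structure rather than a generator I actually used.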

Using scenarios in design

One of the benefits of Gherkin is that it doesn’t focus on “how” a user completes a task but “what” they do to complete a task. This makes it really good for design work as you have the freedom to explore different approaches.

I had already been creating mockups in Figma using the iOS 18 design system template so I continued doing this and created a series of screens for each scenario in the feature files.

Prototypes for the different scenarios in the Add Update Feature in Figma

This worked really well. Not only did it give me prototypes for each scenario that I could use for testing my ideas, but while building them I also found a couple of UX problems I hadn’t spotted when initially writing the feature files.

I could then explore the problem and write another scenario to cover off the new UX flows that came out of that exercise.

Now that I have the prototypes done I can conduct user research to catch any additional problems I’ve missed and to give me confidence that the app is easy to use and is something people want to use.

Using scenarios in development

I have yet to build my app properly but there are a couple of benefits that defining the app’s functionality in Gherkin should give me.

I can use Cutworm to turn the feature files into an executable set of tests and use the report from the test runner to give me living documentation for the app.

I should be able to use the scenarios to map out the set of analytics events and logs that I’d expect to trigger and verify that the observability of what I build is working too.
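I haven’t built this yet, so the following is only a guess at what it might look like. It assumes analytics calls go through a small protocol so a test can swap in a recorder; the protocol, the recorder, the event names and the createEntry stand-in are all hypothetical.

```swift
// Hypothetical sketch: each scenario declares the analytics events it should emit,
// and a recording tracker lets a test compare expected against actual.
protocol AnalyticsTracking {
    func track(_ event: String)
}

final class RecordingTracker: AnalyticsTracking {
    private(set) var events: [String] = []
    func track(_ event: String) { events.append(event) }
}

// Expected events keyed by scenario name; all names are invented.
let scenarioName = "Create an entry when questions exist"
let expectedEvents: [String: [String]] = [
    scenarioName: ["entry_started", "entry_saved"],
]

// Stand-in for the app code that would run during the scenario.
func createEntry(tracker: AnalyticsTracking) {
    tracker.track("entry_started")
    tracker.track("entry_saved")
}

let tracker = RecordingTracker()
createEntry(tracker: tracker)

// The scenario passes its observability check only if the recorded
// events match the expected list exactly, including order.
let matches = tracker.events == (expectedEvents[scenarioName] ?? [])
print(matches)
```

Keying the expected events by scenario name keeps the observability check tied to the same definition of behaviour as the feature files.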

I’m also really interested in seeing how I can integrate the scenarios into the logging that OSLog gives us.

Summary

I’m really glad I invested the time in exploring my options for defining the behaviour of my app.

It’s allowed me to rule out a solution that would add a lot of complexity for little gain and it’s allowed me to move forward with the design and UX validation of the app. I’m looking forward to seeing how the user research I conduct brings changes to the scenarios I’ve written.

Once I integrate Cutworm into the development side of things I’ll write a post on how this goes and what benefits this has brought. I can already see how being able to take screenshots and verify my app’s observability will bring a lot of quality checks into the process but it may also unearth some additional value I have yet to foresee.