Spikes and Proofs of Concept: How Using Exploratory Stories Helps Keep You Sane
While it can be tempting to dive right in and start coding as soon as a new story gets assigned to you, it’s often unwise to do so. When operating in an agile environment, we normally need to estimate the level of effort a story will require. If you’re going to commit to saying that a story will take you three days, you need to do whatever you can to stay on the right side of that estimate. You should always be under-promising and over-delivering, spending as little time as possible with your stories festering in the “in progress” column.
While you may want to angle for low-pointed stories in an effort to showcase your capability and efficiency, I’ve found hands-on Spike stories followed by Proofs of Concept (PoCs) to be essential tools not only for producing accurate estimates but also for ensuring that sufficient time is spent designing a reusable approach.
In addition to proving useful for engineers within the flow of a sprint (or a Kanban… epoch?), the Spike/PoC combo can prove useful in epic planning. Having an estimate for upcoming work — no matter the scale — is infinitely better than a shot in the dark.
What is a Spike?
While one might assume SPIKE is some nebulous acronym from the floppy disk era of computing, this isn’t true in the Agile context. The term derives from rock climbing, where spikes are driven into the rock face so climbers can hook in their ropes, giving them a checkpoint that they can hang from if they fall. Preserving the climber’s progress on the wall also ensures that they won’t fall to their death in the event of a misstep.
The rationale behind the metaphor to Agile is that while hammering in a spike doesn’t help your velocity (you’re hammering, not climbing), it ensures the stability of your current position and improves the velocity of future climbers. It is inherently a forward-looking concept.
In its initial use, the term was more akin to a proof of concept than a research task: a time-boxed effort to produce the simplest possible solution to the problem at hand. In the modern workplace, the time-boxed nature of Spike work remains, but the term generally refers to exploratory research into the given user story (or bug, etc.).
Too often, engineers think that working on a research story means that they won’t be writing any code, just digging through the codebase, reading documentation, or learning more about the services being used. While a Spike story will almost certainly include some or all of these activities, sometimes there’s no better or more efficient way to learn than by getting your hands dirty with code.
Once you’re ready to close out the Spike, you should update the story with a description of the work you did and what you learned, then share those findings with another engineer before closing out the story. This helps ensure that the knowledge gained from the story doesn’t disappear with you if you get sick, are otherwise unavailable, or if the follow-up work gets deprioritized. It’s good to find moments where you can keep your teammates up-to-date on what you’re working on with digestible information. This can prove immensely helpful later on.
What is a Proof Of Concept?
Question: “If your Spike work includes coding, then how does that differ from a proof of concept?”
A Proof of Concept (PoC) tests out the desired pattern for the solution with a limited blast radius, whereas Spike coding just encompasses writing the bare minimum of new code needed to indicate the potential of the desired functionality.
Given the time-boxed nature of a Spike story — generally one or two days — you want to do the minimum research required to code a quick solution that indicates whether your intended approach will work. Nothing unearths unforeseen complexity and unintended side effects like testing. As mentioned before, this simple solution does not need to adhere to best practices or any efficiency measures. Therefore, none of the code used in a Spike should be reused. It’s not the first pass at coding a solution, it’s just a test.
The first attempt should come as part of your PoC. If necessary, you can precede the PoC story with a design review story in order to spend more time working out an approach to the problem that does adhere to best practices, and run that approach by other engineers or your architect if need be.
The PoC work, once you start it, will be encompassed within a user or bug story. It should encompass a strategically selected vertical slice of the application(s) and be designed in such a way that it can be scaled horizontally.
Note: A ‘vertical slice’ of an application encompasses an entire workflow as it traverses the front end, your business logic, your database, and then back to the front end. The slice covers all parts of the application, from ‘top’ to ‘bottom’; however, it’s ‘narrow’ in that it doesn’t have to cover all the possible request paths, just one of them. You can then scale the solution designed using a vertical slice of the application ‘horizontally’ by covering the rest of the potential request paths.
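To make the note above concrete, here is a minimal sketch of what a vertical slice could look like for the messaging scenario used later in this article. Every name in it (InMemoryMessageStore, MessageService, handle_load_messages, the sample user) is hypothetical and chosen purely for illustration; the in-memory store is a stand-in for a real database.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryMessageStore:
    """Stand-in for the database layer (hypothetical; a real store would query a database)."""
    rows: dict[str, list[str]] = field(default_factory=dict)

    def fetch_messages(self, user_id: str) -> list[str]:
        return list(self.rows.get(user_id, []))


class MessageService:
    """Business-logic layer: decides which messages the front end receives."""

    def __init__(self, store: InMemoryMessageStore) -> None:
        self.store = store

    def recent_messages(self, user_id: str, limit: int = 2) -> list[str]:
        return self.store.fetch_messages(user_id)[-limit:]


def handle_load_messages(service: MessageService, user_id: str) -> dict:
    """Front-end-facing handler: the single request path this slice covers."""
    return {"user": user_id, "messages": service.recent_messages(user_id)}


# Request paths the slice deliberately leaves for later 'horizontal' scaling.
NOT_YET_COVERED = ["send_message", "edit_message", "delete_message"]

store = InMemoryMessageStore({"ada": ["hi", "how are you?", "lunch?"]})
response = handle_load_messages(MessageService(store), "ada")
print(response)
```

The slice touches every layer but only one workflow (loading messages). Scaling horizontally would mean giving each entry in NOT_YET_COVERED the same top-to-bottom treatment.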
While this first proof of concept often shouldn’t be immediately accessible to users, you’re writing production-quality code, meaning that the solution you’ve settled on should be testable, extensible, and compliant with your team’s agreed-upon best practices.
It’s entirely possible that something goes awry during your proof of concept. If this does occur, don’t despair. Again, it’s a proof of concept in more than name. Given all the research and documentation you’ve done leading up to this point, you’ll have empowered your teammates to be up to speed on your approach and you can pull them in for a second opinion. Given that your story was for a proof of concept, you can close that story out, open up a new Spike or architecture review if necessary, then embark on a new proof of concept.
If your PoC goes according to plan, it may be possible to release the fix/feature to prod and then continue to work on extending it to horizontally cover the rest of your application(s), weaning your code off of the old pattern and onto the new, delivering to your customers iteratively in order to strike a balance between code quality and time to market. If the feature at hand can’t be released iteratively, you can still close out the PoC story and pull in a new story/bug to complete the feature/fix. Given all you know at this point, this story should be easy to point appropriately.
How Exploratory Stories Help: An Example
Let’s look at a hypothetical example of employing this workflow contrasted with a (fairly grim) alternative. Please bear in mind that while I’ll be describing some real technologies (as few as I can), I haven’t really given any thought to whether this change would be useful. I’m just interested in how we track the work.
Imagine you work on an application with a sizable user base that includes peer-to-peer messaging functionality. Due to a recent influx of traffic, users are experiencing high wait times.
The database design is a problem your team inherited, and no one likes it. One of your teammates came up with a NoSQL schema that they’ve proved will halve the size of your database while drastically improving read speeds. They’ve ported the database over to that schema and have set up prod and non-prod instances ready to go in your cloud environment. They have a script prepared that will load the database with any missing info once the time comes to cut over. However, that person is wrapped up in a higher-priority effort, so they won’t be able to be of much assistance.
So the task lands at your feet. It’s time to write your first story and tackle it. You’re tempted to say that while, yes, it’s a sizable body of work, it’s fairly straightforward because a lot of the heavy lifting has been done already. When your teammates have needed to perform architecture shifts in the past, they’ve created one story for the nonprod transition — that essentially functions as a spike — and another for prod.
Reality A: You use a Spike, Design Review, and Proof of Concept
You’re about to create the first story, but something gives you pause. Your SQL writer class is small, but you have a sneaking suspicion that you may not have the whole picture of all the work entailed yet. You decide to spike it.
Reality B: You just make some stories, not giving it much thought
You go ahead and create the nonprod story. How bad can it be?
A:
You write a bit of code to connect to the deployed nonprod NoSQL instance when loading a user’s messages (you can worry about the appropriate time to establish the connection at a later date) and hardcode the queries needed for your CRUD operations for a sample user (programmatic query formation can also come later).
Establishing a connection to the new instance takes work, but you figure it out. You open a local version of the app and go to load your messages and…nothing shows up. You check the logs and see that the query you wrote was indeed run. You check that query against the database and get back the results you expect. You log out the response from the database in the app. It’s all there. A wire has clearly come unplugged but you’re not sure where yet.
B:
You get to work writing a class to establish a connection to the new instance. The application is still using the old SQL database for other purposes, so you need to add an abstracted connection manager class to choose when to establish a connection to which database. This turns out to be much more challenging than you’d anticipated. You pull in some other engineers to help with the designs and they share concerns that the app already isn’t managing its connections very well. They want to augment this design to allow for improvements to be made after the fact. While this shouldn’t be your immediate concern, changes need to be made to the existing connection pattern in order to abstract the concern away and handle it at a later date. You spend a couple of days brainstorming and running snippets of code by them.
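For readers who haven’t built one, here is a minimal sketch of the kind of abstracted connection manager described above. It is an illustration under stated assumptions, not the design from this story: the class name, the routing table, and the string "connections" are all hypothetical, and real factories would call into actual SQL and NoSQL client libraries.

```python
from typing import Callable, Dict


class ConnectionManager:
    """Routes each data domain to the backend that owns it and reuses open connections."""

    def __init__(self, factories: Dict[str, Callable[[], object]], routes: Dict[str, str]) -> None:
        self._factories = factories      # backend name -> function that opens a connection
        self._routes = routes            # data domain -> backend name
        self._open: Dict[str, object] = {}

    def connection_for(self, domain: str) -> object:
        backend = self._routes[domain]   # raises KeyError for an unrouted domain
        if backend not in self._open:    # connect lazily, once per backend
            self._open[backend] = self._factories[backend]()
        return self._open[backend]


# Hypothetical stand-ins that record when a connection is opened.
opened = []


def open_sql() -> str:
    opened.append("sql")
    return "sql-connection"


def open_nosql() -> str:
    opened.append("nosql")
    return "nosql-connection"


manager = ConnectionManager(
    factories={"sql": open_sql, "nosql": open_nosql},
    routes={"messages": "nosql", "accounts": "sql"},
)

manager.connection_for("messages")
manager.connection_for("messages")   # reuses the connection opened above
manager.connection_for("accounts")
print(opened)
```

Even this toy version hints at why the task balloons: the routing rules, connection lifetimes, and pooling all have to be agreed on before any of it can replace an existing pattern.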
A:
You forgot about the application’s query response validation! All database responses get checked against their expected schemas. When you skip this check, your messages show up. You make a note to add the NoSQL schemas to the validation class. However, while you can now see your messages, you’re not able to interact with them like you should. There are no emojis, no clipboard and reply options when you click on a message; the chat feels dead.
You haven’t worked in the reaction system before. You start looking into the reaction code and it’s a total mess. The functionality is strewn across several nebulous classes and some of it lives in classes that should have nothing to do with messaging, let alone reactions. And they ALL rely on the SQL query pattern and response object and table structures that got changed in the move.
It calls for a total rewrite. You want to move everything into one class, but you’re not sure what the side effects of stripping those functions out would be as some of them seem to be reused by unrelated parts of the application. You start to sweat.
B:
You finally have a design for handling the new and existing database connections that you feel pretty good about, but you’re starting to feel the pressure of your deadline. Your team says they don’t do time-based estimations, but in practice everyone knows how long a five-point story usually takes.
Everyone aims to pull in seven points of work each sprint. Do the math. You only have a couple days left before this story will start stinking up your board and you’re maybe a third of the way through.
You start coding the connection design.
A:
You present your findings at stand up. As a team, you discuss that the reactions rewrite can happen after you prove out the new messages pattern. You feel a huge weight lifted off your shoulders. You decide to pull in a design review story to architect connection and query writing patterns. You set up a working meeting with your organization’s architect and the two of you are able to make a lot of progress.
You knock out the query formation and schema validation designs quickly and she points out that you’ll need to make some changes to pooling management in order to create a class to handle both sorts of database connections and puts you in contact with the engineer who wrote the existing connection pattern.
You reach out to that engineer and they’re sort of helpful. They don’t remember how things work super well, but they’re able to elucidate some mechanics that have been mystifying you.
You spend another day and, after setting up another meeting to bounce ideas off a teammate, settle upon a connection management design that you’re happy with.
B:
Okay, you’re connected to the database. Great. You hard code some queries to perform database reads, figuring you can work backwards from there.
You try to load your messages and… nothing shows up. Why isn’t anything showing up? That query was working in the NoSQL console. Whatever. There’s no time to worry about that now. There must be some minor snag somewhere; you can worry about that later. It’s your own fault for trying to cut corners. Just get the query formation class written properly, then figure out why the response isn’t showing up. You start coding.
A:
You’ve pulled in what your team is calling a “messaging-only PoC.” Updating the reactions system for said messages is being deprioritized for now.
With your designs in hand you get to work, slowly swapping out the janky — but working — code from your spike. The fact that you’re always able to revert to having functionality proves tremendously helpful as you’re able to quickly uncover several minor errors in the queries you’d been trying to write.
In the space of just a couple days you’re able to get your design written and tested. You’d be ready to take the new feature to prod, if it weren’t for the mess that is reactions.
You pull in another spike story to see which of the functions involved in the reactions process are used elsewhere so you can try to figure out how to design a standalone class. It’s going to take a lot of work, but your team is happy with what you’ve done so far and they’re caught up with what you’re working on, ready to help out if need be.
B:
You’ve written zero tests. None. And you’re officially overdue. Way overdue. The good news is that this is just a QA story and you don’t need to merge to main in order to deploy to QA.
Given your timeline, you’ve decided that testing is outside the scope of this story. You can worry about that when you go to prod. In retrospect it was odd to have broken this effort into QA and prod deployments; it’s not really an infrastructure story anyway. Adding the prod config took barely any effort, so you’ll need something to spend that story working on.
You’re just about done adding the new schema to the validation class (it took a long night of panicking to figure out what was going on there). You handled the dependency injection for whether it’s a SQL or NoSQL response in a sort-of janky way, but it works.
Great, you’re finally able to load messages! As you’re testing the functionality you realize the reaction options aren’t populating. You look through the code responsible. What a hideous mess. It’s all over the place and tightly coupled to the old database design in multiple respects. How did you forget about this? How could you be so stupid? How…
No time to think. You roll up your sleeves and start coding.
The ‘good’ example here is very idyllic and the ‘bad’ example outright unhinged, but breaking your stories down like this can really have a very powerful effect on the way you think and work.
Benefit 1: Better Intra-Team Communication
‘Exploratory stories’ could also be thought of as ‘explanatory stories’. You’re likely going to be working in a pod with product people, tech leads, and designers in addition to fellow engineers. Your non-coding coworkers are going to understand very little of the technical details you share in stand ups and, to be honest, it’s also a lot to ask your fellow engineers to stay engaged with the problems you’re working on based on the breadcrumbs shared in meetings like that.
In order to keep our teams up to speed, it’s important to leverage our shared language of agile. Everyone on the team knows, or has the capacity to learn, what a Spike, a Proof of Concept, and a Design Review are. Everyone also has the ability to understand the functional utility of going through the necessary steps before leaping into coding a solution. If anyone on your team is resistant to the idea of such due diligence, I hope this article, or one of the many others like it out there, can serve as a resource to convince them otherwise.
If your manager or product owner remains resistant to the idea of pulling in a Spike or PoC before pointing out a large story, I’d strongly advise you to start looking for a new role. Not only will that environment not foster good coding habits, it will likely hold you in a near-constant position of heightened stress.
Benefit 2: Improved Estimations
Insidiously, even if efforts (like adopting Fibonacci or some other abstracted ‘pointing’ strategy) are taken to base story estimates on something other than time, a new shared vocabulary for how long stories of a certain size remain on the board begins to take form.
While you don’t want to commit yourself to an overly optimistic timeline, you also don’t want to look like you’re milking what other engineers or product owners consider to be an ‘easy’ story in order to luxuriate in your task. This aversion often leads new (and some seasoned) engineers to underpoint their stories.
My team practices blind pointing and writing stories in such a way that they could be picked up by any engineer on the team to help offset this phenomenon, but by itself that doesn’t solve the issue of engineers getting stuck with an unexpectedly complicated task.
How blind pointing works: When writing the story, we leave the points section empty. When we refine the story, someone reads it out, then the PO opens up a poll and everyone votes on how many points they think it’s worth. If the result isn’t unanimous, we then have a discussion about why we chose what we did and decide on what we think makes the most sense, normally defaulting to the highest estimation, unless it was based upon an incorrect assumption. A lot of previously unrealized complexity is called out during this process, as everyone has a slightly different frame of reference for the problem at hand.
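The resolution rule described above can be sketched in a few lines. This is only an illustration of the process as described, not a tool my team uses: the function name, the voter names, and the idea of passing in a set of "discounted" voters (those whose estimate the team agreed rested on an incorrect assumption) are all hypothetical.

```python
from typing import Dict, Optional, Set


def resolve_points(votes: Dict[str, int], discounted: Optional[Set[str]] = None) -> dict:
    """Apply the blind-pointing rule: discuss if not unanimous, then default to the highest
    estimate that was not based on an incorrect assumption."""
    discounted = discounted or set()
    counted = {name: pts for name, pts in votes.items() if name not in discounted}
    if not counted:
        raise ValueError("at least one vote must remain after discounting")
    unanimous = len(set(votes.values())) == 1
    return {"points": max(counted.values()), "needs_discussion": not unanimous}


votes = {"ana": 3, "ben": 5, "chi": 8}

# No estimate is discounted, so the team defaults to the highest vote.
print(resolve_points(votes))

# The discussion reveals that the 8 rested on an incorrect assumption.
print(resolve_points(votes, discounted={"chi"}))
```

The code can only record the outcome; the value of the exercise is in the discussion that the needs_discussion flag stands in for.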
An inherent issue with estimation is that if you don’t know all the work that will go into the story, your estimation must be a guess (a fact that, among many others, calls the whole exercise of estimation into question; see #noestimates). So it really behooves the team to pull in a time-boxed Spike to better understand the story’s scope and ensure they’re not trapping an engineer in a thick jungle of code that they’ll have to hack their way out of.
Benefit 3: Good Code Stewardship
That brings us to arguably the most important benefit of using exploratory stories.
An engineer who’s behind on a deadline is a threat to a codebase. In the best case scenario, the delayed engineer will normally wind up repurposing existing solutions or crafting something a little too case-specific, then logging a bunch of tech debt items for later improvement.
In the worst case scenario, like the one depicted in Reality B, missed deadlines result in the perpetuation (and probable further degradation) of an already awful pattern, with no regard for best practices, efficiencies, or testability.
As you take on new work, you should always feel like you have enough time to not just tackle your problem the right way, but leave everything you touch better than you found it. While there’s never enough time to perfect every part of your code base, when you break stories down into planning and research tasks, you can have team-level discussions about the current state of things and what needs to change before you can move forward responsibly.
Operating with transparency and sharing concerns amongst your team with leadership present gives you a lot more leverage to advocate for constructive redesign than when you’re behind schedule. Effectively documenting and sharing your Spike findings also allows your teammates to weigh in on and advocate for well-designed solutions to the problems you unearth.
Conversely, not effectively sharing the issues you’re struggling with and siloing yourself can turn a good team into an antagonistic entity. Rather than serving as other minds to bounce ideas off of, your teammates’ perspectives begin to seem threatening, ready to poke holes in your approach and call out additional work you need to pull in. Your product owner goes from a helpful conduit between engineers and their customers to an exacting taskmaster, and your manager, wondering why this simple story is taking you so long, might put guidelines on the work you take on in the future. Given how little their teammates know, I highly doubt the engineer in Reality B would be able to advocate for a redesign in order to improve upon the reaction functionality.
We’re living in a professional landscape dominated by agile and in order to be good stewards of our codebases (and our psyches), we need to know how to efficiently navigate the shared vocabulary it provides.