Introduction
This post covers how idempotency keys are used in Medusa and how you can implement your own idempotency key logic in a Node.js application to make your API more robust. The post and the implementation discussed here are inspired by this article by Brandur.
What is idempotency?
Idempotence is a mathematical term used to describe algebraic expressions that remain invariant when raised to a natural power. The word itself comes from the Latin words idem and potens, meaning same and powerful respectively. In software, idempotency typically refers to the idea that you can perform an operation multiple times without triggering its side effects more than once. This is an extremely powerful property for fault tolerance in larger systems where service availability cannot be guaranteed. If you are familiar with RESTful design you have probably heard that DELETE requests should be idempotent, meaning that no matter how many times you make a DELETE request on a certain resource, it should always respond with confirmation that the resource has been deleted (unless business rules don't allow it, that is).
In fintech applications, idempotency is typically extended to other types of requests to ensure that sensitive operations, like issuing money transfers, don't erroneously get duplicated. For example, Stripe supports idempotency on its POST requests, controlled by an Idempotency-Key header. This allows you to safely retry requests if necessary. Say you are issuing an "Authorize Payment" request, but just after the request is sent your internet connection drops and you have no way of knowing whether the payment was successfully authorized or not. By using idempotency keys you can safely retry the "Authorize Payment" request without having to worry about making two payment authorizations.
One of the major benefits of headless commerce is that you can pick and choose the tools in your stack and have them integrate with each other for a best-of-breed stack. However, the more systems you connect, the more prone you are to inconsistencies across your tools, for example because of things out of your control such as server outages, connectivity issues or other unexpected situations. To solve this issue Medusa implements idempotency key support so that you can safely retry requests until consistency is confirmed.
How can idempotency keys be used?
There are two perspectives worth considering when answering the question of how idempotency keys can be used: one is from a client perspective, for example, when calling an API from a frontend, the other is from a server perspective when transferring data between systems. The purpose is the same in both circumstances, namely to ensure that an operation is completed correctly.
Client perspective
Imagine that you are adding a line item to a shopping cart through an API like Medusa's. You make a request to add the line item, but right after sending the request your internet drops, resulting in a "Server not reachable" response. At this point it is not clear whether the request made it to the server and the underlying database successfully updated your cart with the new item, or whether the connection dropped before the request was sent and nothing changed in the backend. In the former case, a retry would leave your cart with two items instead of the expected one, so if you retry the request you need a compensating mechanism, which is tricky and tedious to build and test.
A typical retry flow
This is where idempotency keys come in handy as they can help you ensure that the intended state is reached even in fragile environments. In practice the requests would look something like this:
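```js
import { v4 as uuidv4 } from "uuid"

// `api` (an HTTP client) and `variant` are assumed to exist in scope.
// Generate the key once, outside the retry loop.
const idempotencyKey = uuidv4()

const makeRequest = async () => {
  return await api.post("/store/carts/[cart-id]/items", {
    variant_id: variant.id,
    quantity: 1
  }, {
    headers: {
      "Idempotency-Key": idempotencyKey
    }
  })
}

let result
let shouldRetry = true
while (shouldRetry) {
  const { response, error } = await makeRequest()
  if (error) {
    // isRetryable is a hypothetical helper holding your retry logic.
    shouldRetry = isRetryable(response)
    if (shouldRetry) {
      // Assumed completion: wait briefly before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, 1000))
    }
  } else {
    // Assumed completion: success, so keep the response and stop retrying.
    result = response
    shouldRetry = false
  }
}
```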
Notice that it is the same idempotency key that is being passed across all retries. This indicates to the backend: "Hey, I am only interested in this operation happening once - can you please check if the operation has already succeeded. If so just respond with the result of the succeeded operation, otherwise, perform the operation now and store the result under this key so subsequent requests with the same key don't perform the operation multiple times".
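To make the backend side of that conversation concrete, here is a minimal sketch of the "check, replay, or perform and store" behavior. It assumes an in-memory Map in place of a real database, and the helper name runOnce is made up for illustration; a production version would also need durable storage and protection against concurrent requests with the same key.

```javascript
// In-memory stand-in for a persistent store of completed operations.
const completed = new Map()

// Hypothetical helper: run `operation` at most once per idempotency key.
function runOnce(idempotencyKey, operation) {
  if (completed.has(idempotencyKey)) {
    // The operation already succeeded: replay the stored result.
    return completed.get(idempotencyKey)
  }
  // First time this key is seen: perform the operation and remember the result.
  const result = operation()
  completed.set(idempotencyKey, result)
  return result
}

// Example: the same "add line item" call arrives twice with the same key.
const cart = { items: [] }
const addItem = () => {
  cart.items.push({ variant_id: "variant_1", quantity: 1 })
  return { status: 200, item_count: cart.items.length }
}

const first = runOnce("key-123", addItem)
const retry = runOnce("key-123", addItem)

console.log(first, retry, cart.items.length)
```

The retry receives the stored result of the first call, and the cart still holds a single item.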
Server perspective
Now, shifting to the server perspective, imagine that you are integrating your commerce API with a payment provider like Stripe and you need to allow your API's consumers to issue refunds. You have to keep track of the refunds in your own system, and you also have to call Stripe's API to make sure that the money is actually refunded to the customer's bank account. Consider what steps your backend would have to take when handling a refund request. You may come up with something along the lines of this:
- Validate that the requested amount can be refunded (i.e. less than the original payment for the order minus what has already been refunded).
- Create a record of the refund in your database.
- Issue refund through the Stripe API.
- Store Stripe refund id in the internal record of refund.
- Dispatch job to send a refund confirmation email to the customer
- Complete request and respond
A naive implementation would just execute each of the steps and hope for the best, but that would be a bad idea. Consider what would happen if the server experiences an outage and you have to decide whether to retry the request or not. You don't know which of the steps failed, so it is unclear whether the Stripe request has been processed. If it has, a new request would duplicate the refund, which is obviously bad. If it hasn't, you may have stale data in your internal database.
A slightly better approach would be to wrap everything into an ACID transaction and roll back if something fails. This way you don't end up having records in your database if something fails unexpectedly. However, in the case of an error you are still left in the dark as to whether the Stripe request was successfully processed or not, so how might you safely retry your failed request? Luckily, Stripe has support for idempotency keys, so if your implementation makes sure to forward the idempotency key to Stripe you can safely retry your request without having to worry about refunding the requested amount more than once. Not all external systems support idempotency keys, though, and under such circumstances you need to take additional measures for your requests to be idempotent. You will see how this can be accomplished through atomic phases shortly.
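As a rough sketch of forwarding the key, the function below passes the caller's idempotency key along when creating a refund. It relies on the official Stripe Node library accepting a per-request idempotencyKey option as the second argument to stripe.refunds.create. The names issueRefund and createFakeStripe are hypothetical, and the fake client only imitates key-based deduplication so the sketch can run without network access.

```javascript
// Sketch: forward the incoming idempotency key to Stripe when refunding.
// `stripe` is expected to be a client from the official `stripe` package.
async function issueRefund(stripe, { paymentIntentId, amount, idempotencyKey }) {
  return stripe.refunds.create(
    { payment_intent: paymentIntentId, amount },
    { idempotencyKey } // per-request option; same key means same refund
  )
}

// Hypothetical fake client for local experiments. It mimics how a provider
// deduplicates by key; it is not how Stripe is implemented.
function createFakeStripe() {
  const seen = new Map()
  let refundsCreated = 0
  return {
    refunds: {
      async create(params, options = {}) {
        const key = options.idempotencyKey
        if (key && seen.has(key)) {
          return seen.get(key) // replay the earlier result
        }
        refundsCreated += 1
        const refund = { id: `re_fake_${refundsCreated}`, ...params }
        if (key) {
          seen.set(key, refund)
        }
        return refund
      },
    },
    refundCount: () => refundsCreated,
  }
}
```

Calling issueRefund twice with the same idempotencyKey against the fake client returns the same refund id and creates only one refund, which is the property you rely on when retrying after a rollback.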
Idempotency Key implementation in Expressjs
The outline here shows how to implement idempotency keys in a Node.js application with Express. It is assumed that the underlying database for the application is an ACID-compliant relational database like PostgreSQL. Before going further it should be clarified what exactly is meant by an idempotency key in this context: an idempotency key is a string that identifies a database entity that tracks the progress of an API request. By tracking the progress, idempotency keys can either pick up where previously failed requests left off or, if a previous request succeeded, be used to return a cached result of the request.
Building further on the idea of a Cart API, consider the API request needed to transform a Cart into an Order. The steps to take will be something like the following:
Consider the steps in the above request and what your system and your payment provider will have recorded at each of the failure points. You may consider each of them and find the following:
Failure point #1
You have created a record of the incoming request, but have failed to authorize the payment and no order has been created. You can safely retry the request.
Failure point #2
The payment has successfully been authorized and a record of the payment is stored. The order has not been created. If you retry the request now you will be authorizing the payment again. This may fail or, worse, authorize a new payment, duplicating the payment from the previous request. Unless your authorization logic includes a compensation mechanism that checks for a previous payment, it is generally not safe to retry the request.
Failure point #3
At this point you have both authorized the payment and created an order in your system. Retrying the request may result in both a duplicate order and a duplicate payment authorization.
Now consider what will happen if you wrap your entire request in a transaction that rolls back after each of the failure points. For failure point 1 you can safely retry, but rolling back at failure points 2 and 3 will result in your own state and the external state of the payment provider being out of sync. Namely, the payment provider will have a payment that your internal system has no record of. To overcome this problem you must be able to recover from failed requests depending on whether the external system mutation has been completed or not. In simple terms a request retry should be able to say: "If the payment was already authorized skip that step and continue with creating the order. If the payment wasn't authorized do that now and continue". The points in the request lifetime where you wish to be able to retry from will be called recovery points in the following discussion.
Atomic phases
Between each recovery point you will complete an atomic phase, which is a set of operations that happen within a transaction. If one of the operations fails you will roll back the atomic phase, and a retry of the request can then pick up from the recovery point that came before the atomic phase. Considering the request lifecycle above once again, you should realize that you will want 3 atomic phases: one before the payment authorization when the idempotency key is created, one containing the payment authorization and one after the payment authorization has been completed. The diagram below illustrates the atomic phases and each of the recovery points:
Retrying failed requests will now pick up from the most recently reached recovery point meaning that new requests will either skip the payment authorization or retry it if it failed, but will never duplicate it.
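The behavior can be sketched with in-memory stand-ins. Everything below is hypothetical: db plays the role of your database, provider stands for the external payment provider, and atomicPhase imitates a transaction by working on a copy of the state and committing only if the phase succeeds.

```javascript
// In-memory stand-ins; all names here are hypothetical.
const db = { key: { recovery_point: "started" }, payments: [], orders: [] }
const provider = { authorizations: 0 }

// One atomic phase: work on a copy and commit only if `work` succeeds.
function atomicPhase(work) {
  const draft = JSON.parse(JSON.stringify(db))
  const nextRecoveryPoint = work(draft) // a throw here leaves db untouched
  draft.key.recovery_point = nextRecoveryPoint
  Object.assign(db, draft) // "commit"
}

function completeCart({ failWhileCreatingOrder = false } = {}) {
  while (db.key.recovery_point !== "finished") {
    switch (db.key.recovery_point) {
      case "started":
        atomicPhase((tx) => {
          provider.authorizations += 1 // external mutation
          tx.payments.push({ amount: 1000 })
          return "payment_authorized"
        })
        break
      case "payment_authorized":
        atomicPhase((tx) => {
          if (failWhileCreatingOrder) {
            throw new Error("crashed while creating the order")
          }
          tx.orders.push({ total: 1000 })
          return "finished"
        })
        break
      default:
        throw new Error(`Unknown recovery point: ${db.key.recovery_point}`)
    }
  }
}

// First attempt fails after the payment was authorized.
try {
  completeCart({ failWhileCreatingOrder: true })
} catch (err) {
  console.log("first attempt failed at:", db.key.recovery_point)
}

// The retry resumes from "payment_authorized" and skips the authorization.
completeCart()
console.log(provider.authorizations, db.orders.length, db.key.recovery_point)
```

After the retry the provider has seen exactly one authorization and one order exists, because the second attempt started from the stored recovery point instead of from the beginning.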
Now that you have a rough idea about the parts of the system that you will need to keep track of it is time to look at how you might implement this starting with a simplified database schema.
```
IdempotencyKey
- id
- idempotency_key
- request_path
- request_params
- response_code
- response_body
- recovery_point

Payment
- id
- payment_provider
- idempotency_key
- amount

Cart
- id
- items
- completed_at
```
Note that the idempotency key entity notes which path and which parameters an API call is requesting. It also has fields for the response code and body to send after the API call has succeeded so that retries of completed requests can skip directly to the response.
To make atomic phases easy to work with consider the implementation below from Medusa's IdempotencyKeyService.
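```js
// Method on IdempotencyKeyService. `func` receives the transaction manager.
async workStage(idempotencyKey, func) {
  try {
    return await this.transaction(async (manager) => {
      let key

      const { recovery_point, response_code, response_body } = await func(manager)

      if (recovery_point) {
        // The phase completed: move the key to the next recovery point.
        key = await this.update(idempotencyKey, {
          recovery_point,
        })
      } else {
        // No recovery point returned: the request is done, so store the response.
        key = await this.update(idempotencyKey, {
          recovery_point: "finished",
          response_body,
          response_code,
        })
      }

      // Assumed completion of the truncated listing: return the updated key.
      return { key }
    })
  } catch (err) {
    // Assumed: the transaction has rolled back, so report the error to the caller.
    return { error: err }
  }
}
```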
The IdempotencyKeyService in Medusa allows you to execute an atomic phase by using the service method called workStage, which takes an idempotencyKey string and a func function containing the operations to be executed inside the atomic phase. The function can return either a recovery_point string, in which case the idempotency key's recovery point is updated to that value, or alternatively a response_body and response_code, in which case it is assumed that the operation is completed and we can allow the recovery point to be updated to "finished".
API controller implementation
Now it is time to implement the API controller that takes in the request to create an order from a cart. Below you are using a state machine pattern to step through each of the API request's atomic phases.
Notice that the first step in the implementation is to upsert the idempotency key: either by using a provided token in the Idempotency-Key header or alternatively by creating a new one at random (this happens in initializeRequest).
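To illustrate the upsert, here is a hedged sketch of what a function with the same argument order as initializeRequest might do. It is not Medusa's implementation: the keys Map stands in for the IdempotencyKey table, and the request_method field is an assumption added to hold the method argument.

```javascript
const { randomUUID } = require("node:crypto")

// In-memory stand-in for the IdempotencyKey table.
const keys = new Map()

// Hypothetical sketch of the upsert; not Medusa's actual code.
function initializeRequest(headerKey, method, params, path) {
  // A known key means this is a retry: return the existing record so the
  // request can resume from its stored recovery point.
  if (headerKey && keys.has(headerKey)) {
    return keys.get(headerKey)
  }

  // Otherwise create a record, generating a random key if none was provided.
  const record = {
    idempotency_key: headerKey || randomUUID(),
    request_method: method, // assumed field
    request_params: params,
    request_path: path,
    recovery_point: "started",
    response_code: null,
    response_body: null,
  }
  keys.set(record.idempotency_key, record)
  return record
}

// A request without a header gets a fresh key; a retry reuses the record.
const created = initializeRequest("", "POST", { id: "cart_1" }, "/store/carts/cart_1/complete")
const retried = initializeRequest(created.idempotency_key, "POST", { id: "cart_1" }, "/store/carts/cart_1/complete")
console.log(created === retried)
```

Returning the generated key to the client, for example in a response header, lets the client reuse it on retries even when it did not supply one itself.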
Once the idempotency key is retrieved, the request moves into the state machine, where the recovery point of the idempotency key determines which atomic phase should be executed first. If the most recent recovery point is "started", the request moves to authorization of the payment; if that has already been completed, the request goes straight to creating the order.
The code snippet below is a simplified version of Medusa's request handler.
```js
export default async (req, res) => {
  const { id } = req.params

  const idempotencyKeyService = req.scope.resolve("idempotencyKeyService")
  const cartService = req.scope.resolve("cartService")
  const orderService = req.scope.resolve("orderService")

  const headerKey = req.get("Idempotency-Key") || ""

  // Upsert the idempotency key for this request.
  let idempotencyKey
  try {
    idempotencyKey = await idempotencyKeyService.initializeRequest(
      headerKey,
      req.method,
      req.params,
      req.path
    )
  } catch (error) {
    res.status(409).send("Failed to create idempotency key")
    return
  }

  // The state machine over idempotencyKey.recovery_point follows from here;
  // the remainder of the handler is not reproduced in this listing.
}
```
Notice how unexpected errors are bubbled out to the application controller. It is assumed that your Express app has an error boundary that handles such errors properly. Expected errors that are definitive, that is, errors where the request will result in the same error code no matter how many times you make it, can be stored in the idempotency key so that subsequent requests can short circuit and send the cached response directly.
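The distinction between the two kinds of errors can be sketched as follows. DefinitiveError and handle are hypothetical names, and the key object stands in for an idempotency key record with the fields from the schema above.

```javascript
// Hypothetical error type for failures that will never succeed on retry.
class DefinitiveError extends Error {
  constructor(code, message) {
    super(message)
    this.code = code
  }
}

// `key` is an idempotency key record; `work` performs the actual operation.
async function handle(key, work) {
  if (key.recovery_point === "finished") {
    // Short circuit: replay the stored response, success or definitive error.
    return { code: key.response_code, body: key.response_body }
  }

  try {
    const body = await work()
    Object.assign(key, {
      recovery_point: "finished",
      response_code: 200,
      response_body: body,
    })
  } catch (err) {
    if (!(err instanceof DefinitiveError)) {
      // Unexpected: leave the key untouched and let the error boundary decide.
      throw err
    }
    // Definitive: retrying cannot change the outcome, so cache the error.
    Object.assign(key, {
      recovery_point: "finished",
      response_code: err.code,
      response_body: { message: err.message },
    })
  }

  return { code: key.response_code, body: key.response_body }
}
```

With this shape, a request that fails with, say, a 409 because the cart has already been completed is answered from the stored response on every retry, while a database timeout leaves the recovery point unchanged so the retry can run the phase again.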
Using this pattern across your API endpoints will improve the robustness of your API by making it safe to retry all requests. This is useful for requests that modify the internal state alone, but the concept is especially powerful when dealing with requests that modify external states outside the control of your system. The key to making requests like these safe is to wrap external state modifications in atomic phases and allow retries to pick up either before or after such modifications, depending on the progress made by previous requests.
Idempotency in Medusa
In Medusa idempotency has so far been implemented for a handful of API requests, and support is continually being added to more endpoints. The goal is to support idempotency keys for all state-mutating requests so that you can be certain that retrying your requests is safe and harmless. The next step for Medusa will be to add idempotency patterns into the plugin APIs so that Medusa's core can implement self-healing logic that identifies and resolves inconsistencies between systems in your e-commerce stack. This will be a major improvement for the developer experience related to building headless commerce solutions, where there are lots of moving parts and hence lots of potential points of failure.
What's next?
If you wish to dive deeper into how idempotency keys are implemented in Medusa visit the Medusa GitHub repository. You are also more than welcome to join the Medusa Discord server, where you can get direct access to the Medusa engineering team, who will be happy to answer any questions you might have.
Thanks for reading, and if you haven't already, go check out the post by Brandur that inspired the implementation of idempotency keys in Medusa. Brandur also has a number of other articles that are definitely worth reading if you are looking to improve the robustness of your APIs.