**What**
This PR introduces experimental Hot Module Replacement (HMR) for the Medusa backend, enabling developers to see code changes reflected immediately without restarting the server. This significantly improves the development experience by reducing iteration time.
### Key Features
- Hot reload support for:
- API Routes
- Workflows & Steps
- Scheduled Jobs
- Event Subscribers
- Modules
- IPC-based architecture: The dev server runs in a child process, communicating with the parent watcher via IPC. When HMR fails, the child process is killed and restarted, ensuring
clean resource cleanup.
- Recovery mechanism: Automatically recovers from broken module states without manual intervention.
- Graceful fallback: When HMR cannot handle a change (e.g., medusa-config.ts, .env), the server restarts completely.
### Architecture
```mermaid
flowchart TB
subgraph Parent["develop.ts (File Watcher)"]
W[Watch Files]
end
subgraph Child["start.ts (HTTP Server)"]
R[reloadResources]
R --> MR[ModuleReloader]
R --> WR[WorkflowReloader]
R --> RR[RouteReloader]
R --> SR[SubscriberReloader]
R --> JR[JobReloader]
end
W -->|"hmr-reload"| R
R -->|"hmr-result"| W
```
### How to enable it
Backend HMR is behind a feature flag. Enable it by setting:
```ts
// medusa-config.ts
module.exports = defineConfig({
featureFlags: {
backend_hmr: true
}
})
```
or
```bash
export MEDUSA_FF_BACKEND_HMR=true
```
or
```
// .env
MEDUSA_FF_BACKEND_HMR=true
```
Co-authored-by: Oli Juhl <59018053+olivermrbl@users.noreply.github.com>
Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>
**What**
It seems that for some reason the weak map fail in some scenario, but after investigation, the usage of map would not have a bad impact as it will be released after the Distributed transaction if finished. Therefore, falling back to Map instead
FIXES https://github.com/medusajs/medusa/issues/13654
NOTE: Waiting for the user feedback as he is also using node 18. We also use the exact same pattern in all our core flows without issues 🤔
RESOLVES CORE-1171
**What**
- Reduced async overhead for objects with mixed sync/async properties
- Lower memory pressure from eliminated promise allocations
- Faster primitive value processing with early returns
- Better concurrency through selective batching of truly async operations
- Event loop friendly behavior preventing artificial delays
- Reduced memory allocation from eliminated duplicate processing
- Decreased GC pressure from WeakMap-based caching instead of map
**note**
Now, `resolveValue`always treat every resoltion as sync operation unless it is not, meaning we do not create promise overhead when not necessary and only when actually treating with promises
RESOLVES CORE-1163
RESOLVES CORE-1164
**What**
### Add support for non auto retryable steps.
When marking a step with `maxRetries`, when it will fail it will be marked as temporary failure and then retry itself automatically. Thats the default behaviour, if you now add `autoRetry: false`, when the step will fail it will be marked as temporary failure but not retry automatically. you can now call the workflow engine run to resume the workflow from the failing step to be retried.
### Add support for `maxAwaitingRetries`
When setting `retyIntervalAwaiting` a step that is awaiting will be retried after the specified interval without maximun retry. Now you can set `maxAwaitingRetries` to force a maximum awaiting retry number
### Add support to manually retry an awaiting step
In some scenario, either a machine dies while a step is executing or a step is taking longer than expected, you can now call `retryStep` on the workflow engine to force a retry of the step that is supposedly stucked
* fix(orchestration): Use the step definition max retries on set step failure
* Create sweet-turkeys-wait.md
* allow to force permanent failure
* update changeset
** What
- Allow auto-loaded Medusa files to export a config object.
- Currently supports isDisabled to control loading.
- new instance `FeatureFlag` exported by `@medusajs/framework/utils`
- `feature-flags` is now a supported folder for medusa projects, modules, providers and plugins. They will be loaded and added to `FeatureFlag`
** Why
- Enables conditional loading of routes, migrations, jobs, subscribers, workflows, and other files based on feature flags.
```ts
// /src/feature-flags
import { FlagSettings } from "@medusajs/framework/feature-flags"
const CustomFeatureFlag: FlagSettings = {
key: "custom_feature",
default_val: false,
env_key: "FF_MY_CUSTOM_FEATURE",
description: "Enable xyz",
}
export default CustomFeatureFlag
```
```ts
// /src/modules/my-custom-module/migration/Migration20250822135845.ts
import { FeatureFlag } from "@medusajs/framework/utils"
export class Migration20250822135845 extends Migration {
override async up(){ }
override async down(){ }
}
defineFileConfig({
isDisabled: () => !FeatureFlag.isFeatureEnabled("custom_feature")
})
```
FIXES CLO-524
**What**
Add hidden stepDefinition object as part of the step argument and ensure the runAsStep handlers rely on the latest definition when config is being used on the returned step in order to ensure async configuration propagation and nested configuration
**What**
Currently, runAsStep keep reference of the workflow context that is being run as step, except that the step is composed for the current workflow composition and not the workflow being run as a step. Therefore, the context are currently miss matched leading to wrong configuration being used in case of async workflows.
**BUG**
This fix allow the runAsStep to use the current composition context to configure the step for the sub workflow to be run
**BUG BREAKING**
fix the step config wrongly used to wrap async step handlers. Now steps configured async through .config that returns a new step response will indeed marked itself as success without the need for background execution or calling setStepSuccess (as it was expected originally)
**FEATURE**
This pr also add support for cancelling running transaction, the transaction will be marked as being cancelled, once the current step finished, it will cancel the transaction to start compensating all previous steps including itself
Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>
What:
- Added step config `skipOnPermanentFailure`. Skip all the next steps when the current step fails. If a string is used, the workflow will resume from the given step.
- Fix `continueOnPermanentFailure` to continue the execution of the flow when a step fails.
```ts
createWorkflow("some-workflow", () => {
errorStep().config({
skipOnPermanentFailure: true,
})
nextStep1() // skipped
nextStep2() // skipped
})
createWorkflow("some-workflow", () => {
errorStep().config({
skipOnPermanentFailure: "resume-from-here",
});
nextStep1(); // skipped
nextStep2(); // skipped
nextStep3().config({ name: "resume-from-here" }); // executed
nextStep4(); // executed
});
```
**What**
Now that all events management are fixed in the workflows life cycle, the run as step needs to leverage the workflow engine if present (which should always be the case for async workflows) in order to ensure the continuation and the ability to mark parent step in parent workflow as success or failure
Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>
* fix(workflow-engine-*): Prevent passing shared context reference
* fix(workflow-engine-*): Prevent passing shared context reference
* prevent tests from hanging
* fix event handling
* add integration tests
* use interval for scheduled in tests
* skip tests for now
* Create silent-glasses-enjoy.md
* fix cancel
* changeset
* push multiple aliases
* test multiple field alias
* increase wait time to index on test
---------
Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>
Co-authored-by: Carlos R. L. Rodrigues <rodrigolr@gmail.com>
**What**
- Prevent event release when a workflow is run as step and finish
- Use `unlink` instead of `del` when removing keys from redist to push the execution to async thread
FIXES FRMW-2852
**What**
A workflow distributed transaction expect any response and error to be serializable. When it is not the case, the distributed transaction might fail during the save checkpoint that occurs for async steps. This can lead to unexpected behaviour.
With this pr, we introduce a way to handle non serialazable object in a more sustainable manner, this means the following:
- If a workflow throw any non serialazable error (e.g AWS error that contains full IncomingMessage object that related to network communication, think of req/res) then we identify that this object is not serialzable and we clean up the object to make it serializable without loosing the main information, add a new error to the workflow to informed of this issue and can be handled by the user.
- If a response is not serializable (which should not happen at this point because it is handled before by the value resolver), in that case, we wont be able to reuse that response to continue the workflow which means that the workflow is in a non runnable state. In that case we throw a specific error stating that a non serializable context is being provided
**second what**
This pr refactor the `runAsStep` to add better support for workflow cancelation, especially async ones
What:
- copy data before saving checkpoint
- removed unused data format function
- properly handle registerStepFailure to not throw
- emit onFinish event even when execution failed