medusa-store

Author	SHA1	Message	Date
Adrien de Peretti	d51ae2768b	chore(workflow-engine-): cleanup and improvements (#13789 ) What* Cleanup recent work on workflows	2025-10-23 10:50:24 +00:00
Sebastian Rindom	bad0858348	fix: prevent jobId collisions on workflow step retries (#13786 ) ## Summary What — What changes are introduced in this PR? This PR fixes a bug where async workflow steps with retry intervals would get stuck after the first retry attempt due to Bull queue jobId collisions preventing retry jobs from executing. Why — Why are these changes relevant or necessary? Workflows using async steps with retry configurations (e.g., `retryInterval: 1`, `maxRetries: 5`) would fail once, schedule a retry, but the retry job would never execute, causing workflows to hang indefinitely. How — How have these changes been implemented? Root Cause: Bull queue was rejecting retry jobs because they had identical jobIds to the async execution jobs that already completed. Both used the format: `retry:workflow:transaction:step_id:attempts`. Solution: Modified `getJobId()` in `workflow-orchestrator-storage.ts` to append a `:retry` suffix when `interval > 0`, creating unique jobIds: - Async execution (interval=0): `retry:...:step_id:1` - Retry scheduling (interval>0): `retry:...:step_id:1:retry` Updated methods: `getJobId()`, `scheduleRetry()`, `removeJob()`, and `clearRetry()` to pass and handle the interval parameter. Testing — How have these changes been tested, or how can the reviewer test the feature? Added integration test `retry-interval.spec.ts` that verifies: 1. Step with `retryInterval: 1` and `maxRetries: 3` executes 3 times 2. Retry intervals are approximately 1 second between attempts 3. Workflow completes successfully after retries 4. Uses proper async workflow completion pattern with `subscribe()` and `onFinish` event --- ## Examples ```ts // Example workflow step that would previously get stuck export const testRetryStep = createStep( { name: "test-retry-step", async: true, retryInterval: 1, // 1 second retry interval maxRetries: 3, }, async (input: any) => { // Simulate failure on first 2 attempts if (attempts < 3) { throw new Error("Temporary failure - will retry") } return { success: true } } ) // Before fix: Step would fail once, schedule retry, but retry job never fired (jobId collision) // After fix: Step properly retries up to 3 times with 1-second intervals ``` --- ## Checklist Please ensure the following before requesting a review: - [ ] I have added a changeset for this PR - Every non-breaking change should be marked as a patch - To add a changeset, run `yarn changeset` and follow the prompts - [ ] The changes are covered by relevant tests - [ ] I have verified the code works as intended locally - [ ] I have linked the related issue(s) if applicable --- ## Additional Context - Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>	2025-10-21 18:27:21 +00:00
Adrien de Peretti	516f5a3896	fix: workflow async concurrency (#13769 ) * executeAsync * \|\| 1 * wip * stepId * stepId * wip * wip * continue versioning management changes * fix and improve concurrency * update in memory engine * remove duplicated test * fix script * Create weak-drinks-confess.md * fixes * fix * fix * continuation * centralize merge checkepoint * centralize merge checkpoint * fix locking * rm only * Continue improvements and fixes * fixes * fixes * hasAwaiting will be recomputed * fix orchestrator engine * bump version on async parallel steps only * mark as delivered fix * changeset * check partitions * avoid saving when having parent step * cart test --------- Co-authored-by: Carlos R. L. Rodrigues <rodrigolr@gmail.com> Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com> Co-authored-by: Oli Juhl <59018053+olivermrbl@users.noreply.github.com>	2025-10-20 15:29:19 +02:00
Adrien de Peretti	0cbd9f0bc3	chore(): Improve caching rollout (#13702 ) * chore(): Improve caching rollout * Create bright-cobras-complain.md * chore(): Improve caching rollout * downgrade orm to 6.4.3 * chore(): Improve caching rollout * chore(): Improve caching rollout * chore(): Improve caching rollout * chore(): Improve caching rollout * chore(): Improve caching rollout * fix * update changeset * update modules definition * update engine tests * update engine tests * improve integration * improve integration * gracefully disconnect * update test * another attempt * another attempt * fix workflow storage * fix remote joiner * fix remote joiner --------- Co-authored-by: Oli Juhl <59018053+olivermrbl@users.noreply.github.com>	2025-10-08 17:44:00 +02:00
Adrien de Peretti	8734866eb1	fix(): Transform map (#13655 ) What It seems that for some reason the weak map fail in some scenario, but after investigation, the usage of map would not have a bad impact as it will be released after the Distributed transaction if finished. Therefore, falling back to Map instead FIXES https://github.com/medusajs/medusa/issues/13654 NOTE: Waiting for the user feedback as he is also using node 18. We also use the exact same pattern in all our core flows without issues 🤔	2025-10-02 15:54:11 +00:00
Adrien de Peretti	76aa4a48b3	fix(): workflows concurrency (#13645 )	2025-10-02 11:11:38 -03:00
Adrien de Peretti	12a96a7c70	chore(): Move peer deps into a single package and re export from framework (#13439 ) * chore(): Move peer deps into a single package and re export from framework * WIP * update core packages * update cli and deps * update medusa * update exports path * remove analyze * update modules deps * finalise changes * fix yarn * fix import * Refactor peer dependencies into a single package Consolidate peer dependencies into one package and re-export from the framework. * update changeset * Update .changeset/brown-cows-sleep.md Co-authored-by: Oli Juhl <59018053+olivermrbl@users.noreply.github.com> * rm deps * fix deps * increase timeout * upgrade version * update versions * update versions * fixes * update lock * fix missing import * fix missing import --------- Co-authored-by: Oli Juhl <59018053+olivermrbl@users.noreply.github.com>	2025-09-22 18:36:22 +02:00
Adrien de Peretti	8ece06d8ed	chore(): upgrade mikro orm (#13450 )	2025-09-19 21:39:18 +02:00
Adrien de Peretti	ebf33bea43	fix(): pipeline missing suites (#13457 ) * fix(): pipeline missing suites * fix tax integration tests * fix tax integration tests * fix pipeline * fix link integration tests * remove old tests and move current one * fix workflow execution integration tests * fix tests and orchestrator * Fix missing suites in pipeline Remove integration-tests-modules from patch list.	2025-09-15 12:54:57 +02:00
Adrien de Peretti	0079464f32	fix(): temporary transform cached data (#13467 ) What Properly store key obj reference for weak map usage	2025-09-11 07:31:01 +00:00
Adrien de Peretti	fc4d5f0ac9	chore(): Workflow engine timers and notification improvements (#13434 ) RESOLVES CORE-1177 What main changes are: - not blocking execution when notifying - timers management - race condition checks improvements	2025-09-08 18:19:55 +00:00
Adrien de Peretti	8e5c22a8e8	chore(): Improve workflows sdk tooling (#13421 ) RESOLVES CORE-1171 What - Reduced async overhead for objects with mixed sync/async properties - Lower memory pressure from eliminated promise allocations - Faster primitive value processing with early returns - Better concurrency through selective batching of truly async operations - Event loop friendly behavior preventing artificial delays - Reduced memory allocation from eliminated duplicate processing - Decreased GC pressure from WeakMap-based caching instead of map note Now, `resolveValue`always treat every resoltion as sync operation unless it is not, meaning we do not create promise overhead when not necessary and only when actually treating with promises	2025-09-08 16:24:10 +00:00
Adrien de Peretti	d7692100e7	chore(orchestration): add support for autoRetry, maxAwaitingRetries, retryStep (#13391 ) RESOLVES CORE-1163 RESOLVES CORE-1164 What ### Add support for non auto retryable steps. When marking a step with `maxRetries`, when it will fail it will be marked as temporary failure and then retry itself automatically. Thats the default behaviour, if you now add `autoRetry: false`, when the step will fail it will be marked as temporary failure but not retry automatically. you can now call the workflow engine run to resume the workflow from the failing step to be retried. ### Add support for `maxAwaitingRetries` When setting `retyIntervalAwaiting` a step that is awaiting will be retried after the specified interval without maximun retry. Now you can set `maxAwaitingRetries` to force a maximum awaiting retry number ### Add support to manually retry an awaiting step In some scenario, either a machine dies while a step is executing or a step is taking longer than expected, you can now call `retryStep` on the workflow engine to force a retry of the step that is supposedly stucked	2025-09-08 12:46:30 +00:00
Carlos R. L. Rodrigues	bd571aca82	chore(orchestration): remote joiner query planner (#13364 ) What: - Added query planning to the Remote Joiner, enabling phased and parallel execution of data aggregation. - Replaced object deletes with non-enumerable property hiding to improve performance.	2025-09-04 14:18:02 +00:00
Carlos R. L. Rodrigues	9412669e65	chore: idempotent cart operations (#13236 ) * chore(core-flows): idempotent cart operations * changeset * add tests * revert * revert route * promo test * skip bugs * fix test * tests * avoid workflow name conflict * prevent nested workflow from being deleted until the top level parent finishes * remove unused setTimeout * update changeset * rm comments --------- Co-authored-by: adrien2p <adrien.deperetti@gmail.com>	2025-08-28 15:04:00 +02:00
Adrien de Peretti	ff152e7ace	fix(orchestration): Use the step definition max retries on set step failure (#13319 ) * fix(orchestration): Use the step definition max retries on set step failure * Create sweet-turkeys-wait.md * allow to force permanent failure * update changeset	2025-08-28 14:35:31 +02:00
Adrien de Peretti	c5d609d09c	fix(orchestration): Prevent workf. cancellation to execute while rescheduling (#12903 ) What Currently, when cancelling async workflows, the step will get rescheduled while the current worker try to continue the execution leading to concurrency failure on compensation. This pr prevent the current worker from executing while an async step gets rescheduled Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>	2025-07-16 14:44:09 +00:00
Carlos R. L. Rodrigues	e74044af4d	chore(orchestration): improve transaction errors (#12951 )	2025-07-14 11:37:54 -03:00
Adrien de Peretti	e9a33d0700	fix(orchestration): Handle expected lifecycle errors (#12886 ) * fix(orchestration): Handle expected lifecycle errors * fix(orchestration): Handle expected lifecycle errors * Create fast-ears-own.md	2025-07-06 23:11:03 +02:00
Adrien de Peretti	95d282e8ef	fix: test utils events + workflow storage (#12834 ) * feat(test-utils): Make event subscriber waiter robust and concurrent * feat(test-utils): Make event subscriber waiter robust and concurrent * fix workflows storage * remove timeout * Create gentle-teachers-doubt.md * revert timestamp * update changeset * fix execution loop * exit if no steps to await * typo * check next * check next * changeset * skip when async steps * wait workflow executions utils * wait workflow executions utils * wait workflow executions utils * increase timeout * break loop --------- Co-authored-by: Carlos R. L. Rodrigues <rodrigolr@gmail.com> Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>	2025-06-30 13:34:08 +02:00
Adrien de Peretti	316a325b63	fix(workflow-engine-*): Cleanup expired executions and reduce redis storage usage (#12795 )	2025-06-24 13:32:10 +02:00
Carlos R. L. Rodrigues	ebe5cc7acd	chore(index): return ids only (#12543 )	2025-05-20 11:16:02 -03:00
Adrien de Peretti	7fdbf2a965	fix(workflows-sdk): Miss match context usage within run as step (#12449 ) What Currently, runAsStep keep reference of the workflow context that is being run as step, except that the step is composed for the current workflow composition and not the workflow being run as a step. Therefore, the context are currently miss matched leading to wrong configuration being used in case of async workflows. BUG This fix allow the runAsStep to use the current composition context to configure the step for the sub workflow to be run BUG BREAKING fix the step config wrongly used to wrap async step handlers. Now steps configured async through .config that returns a new step response will indeed marked itself as success without the need for background execution or calling setStepSuccess (as it was expected originally) FEATURE This pr also add support for cancelling running transaction, the transaction will be marked as being cancelled, once the current step finished, it will cancel the transaction to start compensating all previous steps including itself Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>	2025-05-14 13:28:16 +00:00
Adrien de Peretti	80007f3afd	feat(workflows-*): Allow to re run non idempotent but stored workflow with the same transaction id if considered done (#12362 )	2025-05-06 17:17:49 +02:00
Carlos R. L. Rodrigues	e180253d60	feat(orchestration): skip on permanent failure (#12027 ) What: - Added step config `skipOnPermanentFailure`. Skip all the next steps when the current step fails. If a string is used, the workflow will resume from the given step. - Fix `continueOnPermanentFailure` to continue the execution of the flow when a step fails. ```ts createWorkflow("some-workflow", () => { errorStep().config({ skipOnPermanentFailure: true, }) nextStep1() // skipped nextStep2() // skipped }) createWorkflow("some-workflow", () => { errorStep().config({ skipOnPermanentFailure: "resume-from-here", }); nextStep1(); // skipped nextStep2(); // skipped nextStep3().config({ name: "resume-from-here" }); // executed nextStep4(); // executed }); ```	2025-04-17 12:49:58 +00:00
Adrien de Peretti	8618e6ee38	fix(): Properly handle workflow as step now that events are fixed entirely (#12196 ) What Now that all events management are fixed in the workflows life cycle, the run as step needs to leverage the workflow engine if present (which should always be the case for async workflows) in order to ensure the continuation and the ability to mark parent step in parent workflow as success or failure Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com>	2025-04-17 09:34:19 +00:00
Adrien de Peretti	2f6963a5fb	fix(): Event group id propagation and event managements (#12157 )	2025-04-14 15:57:52 -03:00
Carlos R. L. Rodrigues	31abba8cde	fix(orchestrator): save checkpoint before async step (#12138 )	2025-04-10 15:36:36 +00:00
Adrien de Peretti	13e159d8ad	fix(workflow-engine-): Prevent passing shared context reference (#11873 ) fix(workflow-engine-): Prevent passing shared context reference fix(workflow-engine-): Prevent passing shared context reference prevent tests from hanging * fix event handling * add integration tests * use interval for scheduled in tests * skip tests for now * Create silent-glasses-enjoy.md * fix cancel * changeset * push multiple aliases * test multiple field alias * increase wait time to index on test --------- Co-authored-by: Carlos R. L. Rodrigues <37986729+carlos-r-l-rodrigues@users.noreply.github.com> Co-authored-by: Carlos R. L. Rodrigues <rodrigolr@gmail.com>	2025-04-09 10:39:29 +02:00
Carlos R. L. Rodrigues	92c7baa561	fix(orchestration): handle multiple shortcuts with same name (#12066 )	2025-04-07 11:59:28 -03:00
Carlos R. L. Rodrigues	0625f76cd4	chore(workflow-engine): export cancel method (#11844 ) What: * Workflow engine exports the method `cancel` to revert a workflow.	2025-03-17 12:59:09 +00:00
Adrien de Peretti	fc652ea51e	fix(workflow-engine-): scheduled jobs interval (#11800 ) What* Currently only cron pattern are supported by scheduled jobs, this can lead to issue. for example you set the pattern to execute every hours at minute 0 and second 0 (as it is expected to execute at exactly this constraint) but due to the moment it gets executed we our out of the second 0 then the job wont get executed until the next scheduled cron table execution. With this pr we introduce the `interval` configuration which allows you the specify a delay between execution in ms (e.g every minute -> 60 * 1000 ms) which ensure that once a job is executed another one is scheduled for a minute later. Usage ```ts // jobs/job-1.ts const thirtySeconds = 30 * 1000 export const config = { name: "job-1", schedule: { interval: thirtySeconds }, } ```	2025-03-13 15:05:13 +00:00
Adrien de Peretti	72d2cf9207	fix(workflow-engines): race condition when retry interval is used (#11771 )	2025-03-12 09:53:34 -03:00
Carlos R. L. Rodrigues	22276648ad	feat: query.index (#11348 ) What: - `query.index` helper. It queries the index module, and aggregate the rest of requested fields/relations if needed like `query.graph`. Not covered in this PR: - Hydrate only sub entities returned by the query. Example: 1 out of 5 variants have returned, it should only hydrate the data of the single entity, currently it will merge all the variants of the product. - Generate types of indexed data example: ```ts const query = container.resolve(ContainerRegistrationKeys.QUERY) await query.index({ entity: "product", fields: [ "id", "description", "status", "variants.sku", "variants.barcode", "variants.material", "variants.options.value", "variants.prices.amount", "variants.prices.currency_code", "variants.inventory_items.inventory.sku", "variants.inventory_items.inventory.description", ], filters: { "variants.sku": { $like: "%-1" }, "variants.prices.amount": { $gt: 30 }, }, pagination: { order: { "variants.prices.amount": "DESC", }, }, }) ``` This query return all products where at least one variant has the title ending in `-1` and at least one price bigger than `30`. The Index Module only hold the data used to paginate and filter, and the returned object is: ```json { "id": "prod_01JKEAM2GJZ14K64R0DHK0JE72", "title": null, "variants": [ { "id": "variant_01JKEAM2HC89GWS95F6GF9C6YA", "sku": "extra-variant-1", "prices": [ { "id": "price_01JKEAM2JADEWWX72F8QDP6QXT", "amount": 80, "currency_code": "USD" } ] } ] } ``` All the rest of the fields will be hydrated from their respective modules, and the final result will be: ```json { "id": "prod_01JKEAY2RJTF8TW9A23KTGY1GD", "description": "extra description", "status": "draft", "variants": [ { "sku": "extra-variant-1", "barcode": null, "material": null, "id": "variant_01JKEAY2S945CRZ6X4QZJ7GVBJ", "options": [ { "value": "Red" } ], "prices": [ { "amount": 20, "currency_code": "CAD", "id": "price_01JKEAY2T2EEYSWZHPGG11B7W7" }, { "amount": 80, "currency_code": "USD", "id": "price_01JKEAY2T2NJK2E5468RK84CAR" } ], "inventory_items": [ { "variant_id": "variant_01JKEAY2S945CRZ6X4QZJ7GVBJ", "inventory_item_id": "iitem_01JKEAY2SNY2AWEHPZN0DDXVW6", "inventory": { "sku": "extra-variant-1", "description": "extra variant 1", "id": "iitem_01JKEAY2SNY2AWEHPZN0DDXVW6" } } ] } ] } ``` Co-authored-by: Adrien de Peretti <25098370+adrien2p@users.noreply.github.com>	2025-02-12 12:55:09 +00:00
Shahed Nasser	d7342c6f82	chore(types): add TSDocs for retentionTime (#11345 )	2025-02-06 15:22:32 +02:00
Carlos R. L. Rodrigues	65fae943c9	feat(orchestration): hydrate resultset (#11263 ) What: * Add support for aggregating data into existing resultset. * `query.graph` new option `initialData` containing the resultset to be hydrated * It fetches data where the requested fields are not present and merge with the existing resultset	2025-02-03 11:24:57 +00:00
Carlos R. L. Rodrigues	e98d3c615e	chore(orchestration): validate PK when throwIfKeyNotFound (#11190 ) CLOSES: FRMW-2895	2025-01-29 19:17:36 +00:00
Carlos R. L. Rodrigues	01acf9e700	fix(orchestration): avoid retry when finished (#10913 )	2025-01-10 13:12:08 -03:00
Adrien de Peretti	7d8f6cf39f	fix(): Workflow cancellation + gracefully handle non serializable state (#10674 ) FIXES FRMW-2852 What A workflow distributed transaction expect any response and error to be serializable. When it is not the case, the distributed transaction might fail during the save checkpoint that occurs for async steps. This can lead to unexpected behaviour. With this pr, we introduce a way to handle non serialazable object in a more sustainable manner, this means the following: - If a workflow throw any non serialazable error (e.g AWS error that contains full IncomingMessage object that related to network communication, think of req/res) then we identify that this object is not serialzable and we clean up the object to make it serializable without loosing the main information, add a new error to the workflow to informed of this issue and can be handled by the user. - If a response is not serializable (which should not happen at this point because it is handled before by the value resolver), in that case, we wont be able to reuse that response to continue the workflow which means that the workflow is in a non runnable state. In that case we throw a specific error stating that a non serializable context is being provided second what This pr refactor the `runAsStep` to add better support for workflow cancelation, especially async ones	2025-01-05 13:30:17 +00:00
Carlos R. L. Rodrigues	aeb5b43692	feat(workflows-sdk): add response to permanent failure (#10177 )	2024-11-20 17:29:37 -03:00
Carlos R. L. Rodrigues	3e265229f2	chore(event-bus): event bus error handling (#10085 )	2024-11-13 18:03:29 -03:00
Carlos R. L. Rodrigues	1eef324af3	fix(orchestration): fix set step failure (#10031 ) What: - copy data before saving checkpoint - removed unused data format function - properly handle registerStepFailure to not throw - emit onFinish event even when execution failed	2024-11-12 10:06:36 +00:00
Adrien de Peretti	f295596df2	fix(orchestration): Ids wrongly processed when using operators map (#9781 ) * fix(orchestration): Ids processing when using operators map * fix(orchestration): Ids processing when using operators map * address feedback * address feedback	2024-10-25 11:10:35 +02:00
Carlos R. L. Rodrigues	6fa98b6a4d	chore(orchestration): modules method context (#9669 ) What * Test to check if the MedusaContext is being injected when calling a Module's method	2024-10-18 21:08:06 +00:00
Carlos R. L. Rodrigues	d3c1580c06	fix(orchestration): local workflow proxy (#9664 )	2024-10-18 13:30:16 +00:00
Adrien de Peretti	34d57870ad	chore: workflow internals improvementss (#9455 )	2024-10-10 09:11:56 +02:00
Carlos R. L. Rodrigues	f9e8403d29	fix(orchestrator, workflows-sdk): skip async step (#9482 )	2024-10-07 07:35:30 -03:00
Oli Juhl	f7472a6fa6	fix: Idempotent cart completion (#9231 ) What - Store result of cart-completion workflow for three days by default - This enables the built-in idempotency mechanism to kick-in, provided the same transaction ID is used on workflow executions - Return order from cart-completion workflow if the cart has already been completed - In case transaction ID is not used on workflow executions, we still only want to complete a cart once	2024-10-04 12:01:09 +00:00
Carlos R. L. Rodrigues	8155c4e9ee	fix(workflows-sdk): when then return value (#9427 )	2024-10-02 07:40:39 -03:00
Adrien de Peretti	02629625ec	feat(orchestration): Provide hint in workflows error (#9400 ) * feat(orchestration): Provide hint in workflows error * remove log * fix tests * improve stack * fix type * formatting --------- Co-authored-by: Riqwan Thamir <rmthamir@gmail.com>	2024-10-02 11:54:07 +02:00

1 2

81 Commits