SUMMARY:
Boomi’s parallel processing feature improves performance for high-volume integrations by splitting the workload to process multiple documents simultaneously using either threads or processes.
- Boomi enables parallelism primarily through the Flow Control shape, where you can configure the number of threads (for parallelism within a single process execution) or processes (for parallelism across multiple concurrent executions).
- While parallel processing dramatically reduces execution time for tasks like bulk data loads, you must carefully consider and tune the number of threads to avoid overwhelming external systems, such as hitting Salesforce’s concurrent API call limits.
- Parallel processing is ideal for large, independent data loads but should be avoided when strict record ordering is required or when integrating with systems that cannot handle multiple parallel requests.
A successful implementation requires balancing the speed benefits of parallelism against external system constraints and the specific requirements of the integration workflow.
As integrations grow in size, performance often becomes a challenge. A process that works well for a few records may slow down significantly when dealing with thousands. To handle such situations, Boomi provides parallel processing, which allows data to be processed faster by splitting the workload.
What is Parallel Processing?
Parallel processing means that Boomi can process multiple documents at the same time instead of one after another. This reduces overall execution time and helps improve performance when working with large volumes of data.
How Does Boomi Enable Parallel Processing?
Boomi provides parallelism mainly through the Flow Control shape and the Atom’s runtime settings:
- Flow Control Shape
  - Allows you to define the number of threads that should process documents in parallel.
  - Parallel Processing Type (in the Flow Control dialog): choose Threads or Processes.
    - Threads is the default and runs work in multiple threads within the same JVM.
    - Processes runs work as separate JVM processes (heavier, more isolated).
    - The Parallel Processing Type option is visible and configurable only if Molecules are enabled for your account. On a single Atom, use Threads.
  - Each thread/process handles a portion of the documents independently and then continues with the process flow.
- Atom/Molecule Runtime Settings
  - Control how many process executions can run at the same time on the Atom or Molecule.
  - This applies when multiple executions of the same process are triggered or scheduled concurrently.
Threads vs. Processes in Parallelism
A common question is the difference between threads and processes in Boomi:
- Threads – Run in parallel within a single execution of a process. For example, if you configure four threads, a batch of 1000 records may be split into four sets of 250, and each set is processed at the same time.
- Processes – Run in parallel as separate executions, each in its own JVM. This is also the kind of parallelism you get when the process is scheduled or triggered several times at once: the Atom can run those executions concurrently, depending on its configuration.
In simple terms:
- Threads = parallelism inside one run of the process.
- Processes = parallelism across multiple runs of the process.
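The thread example above (1000 records, four threads, four sets of 250) can be sketched in plain Python. This is only an analogy for what happens inside a single execution, not Boomi's implementation: `split_into_sets` and `process_set` are hypothetical names, and the even, contiguous split is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_sets(records, thread_count):
    """Split records into thread_count contiguous sets of near-equal size."""
    base, extra = divmod(len(records), thread_count)
    sets, start = [], 0
    for i in range(thread_count):
        # The first `extra` sets take one additional record each.
        size = base + (1 if i < extra else 0)
        sets.append(records[start:start + size])
        start += size
    return sets


def process_set(record_set):
    # Hypothetical stand-in for the steps that follow the Flow Control shape.
    return [record * 2 for record in record_set]


records = list(range(1000))
sets = split_into_sets(records, 4)

# One worker thread per set, all inside a single run of the program.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process_set, sets))

set_sizes = [len(s) for s in sets]
print(set_sizes)  # [250, 250, 250, 250]
```

Each worker handles its own set independently, which mirrors the "parallelism inside one run" idea. Parallelism across runs would correspond to starting this whole program several times at once.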
Real-World Example: Salesforce Bulk Load
Imagine you are syncing 50,000 customer records from an ERP system into Salesforce. If you process them sequentially, the integration may take hours, especially since Salesforce imposes API usage limits.
Instead, you configure the Flow Control shape with five threads. Boomi splits the 50,000 records into five sets of 10,000 each, and all five sets are processed simultaneously.
- This reduces total processing time dramatically.
- However, you must also consider Salesforce’s concurrent API call limits. If five threads push too many requests at once, you may hit errors. Lowering the thread count, or switching to the Salesforce Bulk API, may be the better option.
This example highlights how parallel processing can provide speed but must be balanced against external system constraints.
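One way to picture this balance is a cap on in-flight requests that is independent of the thread count. The sketch below is a generic Python illustration, not a Boomi or Salesforce API: `MAX_CONCURRENT_CALLS`, `send_batch`, and the batch sizes are hypothetical values chosen for the example, and the sleep only simulates a network call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_CALLS = 3  # hypothetical limit imposed by the target system
THREAD_COUNT = 5          # worker threads, as in the example above

gate = threading.BoundedSemaphore(MAX_CONCURRENT_CALLS)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0


def send_batch(batch):
    """Simulate one API call while tracking how many calls overlap."""
    global in_flight, peak_in_flight
    with gate:  # blocks while MAX_CONCURRENT_CALLS calls are already running
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        time.sleep(0.01)  # stand-in for the real request
        with lock:
            in_flight -= 1
    return len(batch)


# Hypothetical workload: 20 batches of 200 records each.
batches = [list(range(200)) for _ in range(20)]

with ThreadPoolExecutor(max_workers=THREAD_COUNT) as pool:
    sent = sum(pool.map(send_batch, batches))

print(sent, peak_in_flight)
```

Even with five worker threads, no more than three simulated calls overlap. In Boomi the more direct lever is usually the thread count itself, so treat this as an illustration of the constraint rather than a recommended configuration.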
Key Considerations
When using parallel processing, it is important to plan carefully:
- Splitting of Documents – Threads divide work by document, so a single large document gives them nothing to share. Split the data into multiple documents before the Flow Control shape.
- Thread/Process Count – More isn’t always better. Too many threads/processes can pressure CPU and memory. Start small, test, and tune.
- External Systems – Target applications (like Salesforce or SAP) may have limits on API calls or connections. Parallelism could cause failures if those limits are exceeded.
- Ordering of Records – Parallel processing does not guarantee order. If order matters, parallelism may not be suitable.
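The ordering point can be demonstrated with a small Python sketch. It assumes nothing about Boomi internals: `transform` is a hypothetical step with a random delay, and tagging each record with its original index is just one common way to restore order afterward when that is acceptable for the integration.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def transform(index, record):
    # A random delay stands in for uneven processing time per record.
    time.sleep(random.uniform(0, 0.01))
    return index, record.upper()


records = ["a", "b", "c", "d", "e", "f"]

with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(transform, i, r) for i, r in enumerate(records)]
    # as_completed yields results in finish order, which can differ per run.
    completed = [future.result() for future in as_completed(futures)]

# Sorting by the original index restores the input order after the fact.
restored = [value for _, value in sorted(completed)]

print([value for _, value in completed])  # order may vary between runs
print(restored)
```

Restoring order this way only works when the whole batch can be collected before anything is sent onward. If the target system must receive records strictly in sequence as they are produced, sequential processing remains the safer choice.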
When to Use Parallel Processing
- Large data loads such as bulk migrations or nightly syncs.
- CPU-intensive transformations that can be divided into smaller, independent tasks.
- Scenarios where records are independent of each other and do not need to be processed sequentially.
When to Avoid Parallel Processing
- Real-time flows with small volumes (where the overhead is greater than the benefit).
- Cases that require strict ordering of records.
- Integrations with systems that cannot handle multiple parallel requests.
Conclusion
Parallel processing is a powerful feature in Boomi that can significantly improve performance for high-volume or heavy workloads. By understanding how Flow Control (including Parallel Processing Type) and Atom/Molecule settings work—and by testing carefully—you can choose the right balance for your integration needs.
To learn more, check out our Boomi Services.
Contact us with any questions.