Looping Construct for ETL – Simplified by Talend DI

I am not a regular ETL developer, but I do “dabble” in developing code from time to time. One requirement that I come across is looping (while, for, infinite) in a job: performing a task based on a condition and then going to sleep for a period of time. This can certainly be accomplished through a scheduler as well, but it is often easier to accomplish in the ETL job design.

I have had the opportunity to implement this construct using SSIS and Datastage. In my opinion, Datastage was the most complicated: it took me a good 5 hours to get the code to do exactly what I wanted. SSIS was more cooperative, and I was able to do the same in about 3 hours.

As I started working with Talend Open Studio earlier this week, I was amazed to see how simple it was to work with Talend.

As you open the Studio, you can find the various orchestration steps very quickly, something the other two tools do not make as easy. In other words, the IDE user interface is more intuitive in Talend.

[Image: Talend palette]

From here, it was drag and drop easy…

Step 1: Drag the while loop onto the canvas and define its parameters.

[Image: defining the loop]

Step 2: Create a job run step to be executed in each iteration.

[Image: executing the job]

Step 3: Drag the “sleep” step and assign values.

I was done start to finish in about 20 minutes and the code generated in Java was pretty easy to read and understand. Very cool!
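
For readers who think in code rather than canvases, the three steps above boil down to something like the plain Java sketch below. This is not the code Talend generates; it is only a rough illustration of the loop, run, sleep pattern. The class name `LoopSketch`, the method `runLoop`, and the "pending batches" counter are all made up for the example, and the iteration cap and sleep interval are arbitrary values.

```java
import java.util.function.BooleanSupplier;

public class LoopSketch {

    /**
     * Hypothetical helper: runs the task while the condition holds,
     * up to maxIterations times, sleeping after each run.
     * Returns the number of iterations actually executed.
     */
    static int runLoop(BooleanSupplier condition, Runnable task,
                       int maxIterations, long sleepMillis) {
        int iterations = 0;
        // Step 1: the while loop and its parameters
        while (iterations < maxIterations && condition.getAsBoolean()) {
            // Step 2: the job executed in each iteration
            task.run();
            iterations++;
            // Step 3: sleep before checking the condition again
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop looping
                Thread.currentThread().interrupt();
                break;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Made-up stand-in for "work left to do"
        int[] pendingBatches = {3};

        int runs = runLoop(
                () -> pendingBatches[0] > 0,
                () -> {
                    System.out.println("Processing batch, remaining: " + pendingBatches[0]);
                    pendingBatches[0]--;
                },
                10,    // safety cap on iterations (assumed value)
                100L); // sleep 100 ms between runs (assumed value)

        System.out.println("Iterations run: " + runs);
    }
}
```

The cap on iterations is there so that a condition that never turns false does not leave the job spinning forever; for a deliberately infinite loop you would drop it and rely on the sleep interval alone.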
