Pentaho Data Integration 4 Cookbook – Win a Free Copy

Pentaho is very fortunate to have such a fantastic community. A few community rockstars find time in their busy lives to write books about using Pentaho. The latest of these, the Pentaho Data Integration 4 Cookbook by co-authors Adrián Pulvirenti and Maria Carina Roldán, is making its way to the top of the Amazon bestseller tech list. Even more impressive, this is Maria’s second book about PDI in just 15 months! (In April 2010 she published PDI 3.2: Beginner’s Guide.) We were interested to learn more about the book and the authors. Check out our interview below to get the inside scoop on the PDI 4 Cookbook.

Read on to learn how to win a FREE copy of the PDI 4 Cookbook and to get a special discount offer from Packt Publishing.

1) What inspired you to write the PDI 4 cookbook so soon after “PDI 3.2 for beginners”?
Maria: At the time PDI 3.2 for Beginners was published, there was a clear need for a book that revealed the secrets of Kettle, in particular for those who barely knew the tool. The book was very well received, especially by the Pentaho Community. Today I can say that the main inspiration was definitely that rewarding feedback.

On the other hand, at the time that book was published, Pentaho was about to release PDI 4. From a beginner’s perspective, there aren’t big differences between Kettle 3.2 and Kettle 4, so nothing prevents you from learning Kettle 4 with the help of the Beginner’s book. However, Kettle 4 brought a lot of new features that deserved to be explained. This was also a motivation for writing this new book.

2) What is the main goal behind the book?  What do you aim to bring across?
Adrián: This book is intended to help readers quickly solve the problems that might appear while they are developing jobs and transformations. It doesn’t cover PDI basics; the Beginner’s book does. Instead, it focuses on giving PDI users quick solutions to particular issues.

  • Can I generate complex XML structures with Kettle?
  • How do I execute a transformation in a loop?
  • What do I need to attach a file to an email?

These are common questions that the book answers through quick, easy-to-follow recipes with different difficulty levels.

3) Where did you find the inspiration for this new book?
Maria: The main inspiration for this book was the PDI forum. Many of the recipes in the book are answers to questions that appear in the forum again and again, for example: how to use variables, how to read an XML file, how to create multi-sheet Excel files, how to pass parameters to transformations, etc. To give one example, the recipe “Executing part of a job once for every row in a dataset” explains how to loop over a set of entities (people, product codes, filenames, or whatever), which is a frequently recurring issue in the Kettle forum.

Besides that, Kettle itself was an inspiration. While outlining the contents of the book, and with the aim of having a diverse set of recipes, we browsed the list of steps and job entries many times, thinking: Is there something that we aren’t covering? Are there steps that deserve a recipe of their own? Many of the recipes that you can find in the Cookbook came out of that exercise. “Programming custom functionality,” a recipe that explains how to use the UDJC step and briefly covers other scripting-related steps, is just one example of this set of recipes.

4) What do you like so much about Pentaho (Data Integration) to make you write books about it?
Maria: I have used Kettle since version 2.4, when many tasks could only be done with JavaScript steps. Despite that, I already admired the flexibility and power of the tool. Since then, Kettle has really improved in performance, functionality, and look and feel. Its capabilities are endless, and this goes unnoticed by many users. That’s what makes me write about it: the need to uncover those hidden features and explain how easily you can do things with Kettle.

Adrián: In my daily work I integrate all kinds of data: XML files, plain text files, databases, and so on. Anyone facing these tasks knows the time and effort required to accomplish them. Meeting Kettle was love at first sight. Thanks to Kettle I realized that these formerly tedious tasks can be done in a fast, fun, and easy way. I liked the idea of writing this book to share my own experiences with other people.

5) When can we expect the next book(s)?
Adrián: Just like Kettle, the whole Pentaho Suite has grown a lot in recent years. There is undoubtedly much to write about it.

However, at this time we’d like to enjoy the recently published book and look forward to the feedback of the Pentaho community.


Win a free Pentaho Data Integration 4 Cookbook. Like Pentaho on Facebook and leave a comment here about which chapter(s) or recipe(s) you think will be most useful for you and why (you can see the full index in the book here). You also have the chance to win on Twitter by following Pentaho and tweeting your comment with the hashtag #PDI4. Maria and Adrián will pick their favorite comment to win. Deadline to leave a comment is July 26 at 12pm/EST.

Packt Publishing is offering readers of Pentaho BI from the Swamp an exclusive 20% discount on the Pentaho Data Integration 4 Cookbook. At the shopping cart, simply enter the discount code PentahoDI20 (case sensitive).

***Update July 27***
The winner of the free book is Mike Dugan. As Adrián explains, “Because he expressed in a few words the essence of chapter 7, which is one of our favorites.”

Mike’s response on his favorite chapter and why: “Chapter 7 is the key here. Who wants to recreate the wheel??? Just like Newton I believe in the conservation of energy…. Especially MY energy. Do it once, use it a lot, look like a rock star with minimal effort.”

Well said! Congrats Mike, you will receive a free copy of the PDI Cookbook courtesy of Packt Publishing soon.

Read all the responses here

4 Responses to Pentaho Data Integration 4 Cookbook – Win a Free Copy

  1. Kuldeep says:

    Chapt. 9 is more useful as it throws lot of light on how one can use Pentaho ETL task as well as other than ETL task. Like data fetch from web services.

  2. friendkak says:

    Chapter 7 is quite good and useful for my realtime projects. Thanks,AK

  3. krishna says:

    Chapter 4 File Management is very useful because we can know more about deleting file,deleting files and moving files all about files is given in brief.
    chapter 7 Executing and Reusing Jobs and Transformations explains about transformations and jobs completely really usefull

  4. Madhavi says:

    Chapter 6 is most important one as it talks about data flow….
