PowerUp
"How to create TOS components" Tutorial : Part 11

Component Creation - Part 11

It has been a while since I last posted a lesson in this tutorial (sorry, I have been busy with other projects).
However, I received quite a lot of feedback (thanks!), including some requests for help with the exercises.
You have probably figured out that debugging your code is not trivial. Personally I sometimes find it hard to check the template code, and I found that the best solution is to keep it as simple as possible.

Today I would like to illustrate my methodology for creating and debugging components.
DISCLAIMER : I doubt this is the best methodology; it just happens to work for me. If you have suggestions, let's discuss them in the talendforge forum, as they could be interesting for many others.

Getting organized


I don't use the internal editor. I simply use GEdit (Linux) or Notepad++ (Windows), because I use them for just about everything (including writing this tutorial in HTML).
I have nothing against the internal editor, it's just my personal preference. The downside is that I don't get much syntax highlighting...
So the first "rule" I adopted is to compile the component every few lines I type.
I know what you are thinking : it's painful to switch to another tool, push the components to the palette, etc. every few lines of code.
Here is how I do it (it works for me) : I have a dual-monitor setup. On my right monitor I keep TOS open in the job design perspective (oh, yeah : I don't use the component designer perspective other than to register a new component).
On the left monitor I keep my code editor, with all the files I am working on open at the same time (code snippets, other components from which I copy parts of the code, etc.).
Every few lines I type, I click on the TOS window, where I normally have a blank project open, and press Ctrl+Shift+F3. That does the magic : it recompiles your custom components.
Warning : this does not install a new component, so when creating a new component you first need to push it from the component designer view.

Starting a new component project


Now you know what my desktop looks like, but the real problem is the project idea in my head, which is normally much messier than my desktop.
So I try to figure out a few basic things before starting :
  • A : Is it going to be simple processing, or will it require some complex java?
  • B : Which kind of component is it going to be? Input, output, process?
  • C : Which key parameters should it have?
That's it. Answering those simple questions gives me an idea of what I am going to do next.
Suppose the answer to A is : "simple processing". Then in most cases I can create a small prototype using a tJavaFlex component (remember I told you I normally keep an empty project open in my TOS window...).
It will not have the same flexibility as a component, but there is already a lot you can do there. In a few cases I even discovered I did not really need a component at all, as it was far too easy to implement with tJavaFlex.
Let's imagine instead that we need some complex java processing. In this case my approach is completely different.
To address more complex needs I normally prefer to create a standalone java class to test my idea. I might even use other libraries (some made by myself, others available in the java world).
This is done completely outside the Talend environment. As a matter of fact, I might even close TOS and spend some time on pure java coding (even days in some cases).
Once I am reasonably happy with my java code, I can decide how to bring it into the component.
My preferred way is normally to create a jar and simply import it into the component. That keeps the component code simpler and makes the overall project easier to maintain.
However, in some cases it may turn out that the solution takes only a few lines of code, so it might make sense to simply copy and paste them into the component itself.
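
As a sketch of what such a standalone prototype might look like, here is a small self-contained class with a main method used as a throwaway test harness. The class name, the method and the logic (cleaning up a string field) are hypothetical and chosen only for illustration. The point is that nothing in it depends on Talend, so it can later be packaged into a jar or pasted into the component.

```java
import java.util.Locale;

// Hypothetical prototype, developed and tested outside TOS.
// FieldCleaner and normalize() are illustrative names, not part of any Talend API.
final class FieldCleaner {

    private FieldCleaner() {
        // utility class, no instances
    }

    /** Trims the value, collapses runs of whitespace and upper-cases it; null stays null. */
    static String normalize(String value) {
        if (value == null) {
            return null;
        }
        return value.trim().replaceAll("\\s+", " ").toUpperCase(Locale.ROOT);
    }

    // Throwaway harness : run it from the command line while the idea is still changing.
    public static void main(String[] args) {
        String[] samples = { "  hello   world ", "Talend", "" };
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> [" + normalize(sample) + "]");
        }
    }
}
```

With the logic isolated like this, the component code would only need a single call to the tested method, which keeps the template short.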

Answering question B often helps in understanding which sections I may need, which metadata / data flow classes I may need to use, where loops should be opened and closed in the java output code, etc.
About loops : if you have an input component (a component that "injects" data into a job), you are likely to have a loop opened in the begin section and closed in the end section. That would be my starting point.
An output or process component will normally need to get metadata information in the main section, so in that case the main section is a good candidate to start off your component.
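
To make the loop placement concrete, here is a minimal sketch of how the output of the three sections of an input component ends up stitched together in the job code. It is plain Java written by hand, not real generated code. The class name, the row variable and the hard-coded data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-written illustration of how begin, main and end output fit together.
// Names and data are hypothetical; real generated code looks different.
final class InputSectionsSketch {

    static List<String> run() {
        List<String> emitted = new ArrayList<>();

        // ---- output of the begin section : get the source, OPEN the loop ----
        List<String> source = Arrays.asList("row1", "row2", "row3");
        for (String row : source) { // loop opened in begin

            // ---- output of the main section : runs once per row ----
            emitted.add(row.toUpperCase());

        // ---- output of the end section : CLOSE the loop ----
        } // loop closed in end

        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Since the brace opened in one section is closed in another, no section is valid java on its own, and an unbalanced brace in one of them tends to show up as an error somewhere else.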
Answering question C helps a lot to figure out what your XML descriptor file should look like.
Even before coding some java / jet code into your sections, you need to set up an XML skeleton that describes your component. I like to start with the minimum I need and enrich it as I go.
So, depending on which section of the java code I am starting with, I will prepare the needed parameters in the XML (and the XML header section : the component will not be installable without a proper header).

Debugging


Every coder knows the fun of debugging, right?
Whenever I am coding (and therefore debugging) I normally keep a few cardboard boxes close to my desk, so I can kick and smash them to let off some steam. It's not too bad once your colleagues get used to it :)
Just to say that when developing Talend components I go through a few boxes, probably a bit more than my average. Your mileage may vary, of course.
As frustrating as this experience can sometimes be, the basic debugging techniques also work here :
  • Test your code every few lines
  • Add debug output to the console : System.out.println("debug 1234") will output to the TOS console
  • Identify immediately whether the issue comes from the output code or from the template, and focus ONLY on that part to fix it
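
For the debug output point, a labeled line is easier to spot in the console than a bare number. Here is a minimal sketch, assuming a hypothetical helper class. In a component you would more likely write the System.out.println call inline in the output code, and the component id used here is made up.

```java
// Hypothetical debug helper; DebugSketch, format() and the label layout are illustrative only.
final class DebugSketch {

    /** Builds a debug line tagged with the component id and a checkpoint name. */
    static String format(String componentId, String checkpoint, Object value) {
        return "[DEBUG " + componentId + "] " + checkpoint + " = " + value;
    }

    public static void main(String[] args) {
        // prints : [DEBUG tMyComponent_1] rows read = 42
        System.out.println(format("tMyComponent_1", "rows read", 42));
    }
}
```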
To minimize the chance of typos etc., I use code snippets I have collected over time. You can copy them from existing components.
As an example, I have a snippet that gives me the skeleton of the loop needed to iterate over the input connections and their metadata. It also includes comments reminding me which imports I need in the template.
In these snippets, all the code blocks that are opened (tags or { ) are also closed with the matching closing tags and braces.
I also tend to label the blocks with comments, so that it's easier to match a closing brace to the one that opens its block.
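
The block-labeling habit can be sketched in plain Java. The nested loops below mimic the shape of a connections / columns iteration, but the Connection and Column types are stand-ins invented for this example, not the Talend model classes. Only the comment style is the point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types, invented for this sketch (not the Talend model classes).
final class LabeledBlocksSketch {

    static final class Column {
        final String label;
        Column(String label) { this.label = label; }
    }

    static final class Connection {
        final String name;
        final List<Column> columns;
        Connection(String name, List<Column> columns) {
            this.name = name;
            this.columns = columns;
        }
    }

    /** Lists "connection.column" names; every closing brace is labeled with the block it ends. */
    static List<String> describe(List<Connection> connections) {
        List<String> result = new ArrayList<>();
        if (connections != null) { // BEGIN if connections
            for (Connection conn : connections) { // BEGIN for conn
                for (Column col : conn.columns) { // BEGIN for col
                    result.add(conn.name + "." + col.label);
                } // END for col
            } // END for conn
        } // END if connections
        return result;
    }

    public static void main(String[] args) {
        Connection row1 = new Connection("row1",
                Arrays.asList(new Column("id"), new Column("name")));
        System.out.println(describe(Arrays.asList(row1)));
    }
}
```

The labels pay off most in a template, where the opening and closing brace of the same block can be separated by many lines of output code.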

Finally, to check your template code, there is a nice feature in the studio :
Check your window menu, click on "show view", then make sure your "navigator" view is active.
In the navigator, locate the entry .JETEmitters/src/org/talend/designer/codegen/translators/[your component family]/[your component].
There you will find some java files that are java versions of your jet template files (I know!! that kind of gives me a headache too, another representation of the code!! aargh).
The java output code may be hard to inspect there, because it is expressed as a set of string constants. The template code, instead, is translated into pure java.
Basically, when you compile a data integration job, this java code is executed to generate the job code.
When you install a component into the palette, instead, this JETEmitters code is compiled, and you get compilation errors if you have issues in your jet sections.
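
To give an idea of what those files look like, here is a simplified, hand-written imitation of the shape of a translated template. It is not copied from a real JETEmitters file. The class name, the contents of the constants and the cid value are assumptions. The static text of the template becomes string constants, while the logic written between <% and %> becomes ordinary java that appends them to a buffer.

```java
// Hand-written imitation of a translated template, simplified for illustration.
final class TemplateTranslationSketch {

    // Static text of the template (the java output code) ends up as string constants.
    private static final String TEXT_1 = "int nb_line_";
    private static final String TEXT_2 = " = 0;";
    private static final String NL = "\n";

    /** Runs the template logic and returns the generated java output code as text. */
    static String generate(String cid, boolean withCounter) {
        StringBuilder out = new StringBuilder();
        // Template logic is plain java here, so this is the part the compiler checks.
        if (withCounter) {
            out.append(TEXT_1);
            out.append(cid); // what an expression like <%=cid%> turns into
            out.append(TEXT_2);
            out.append(NL);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("tMyComponent_1", true));
    }
}
```

A mistake in the template logic breaks the compilation of a class like this one, while a mistake inside the string constants goes unnoticed until the generated job code is compiled.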

For the java output code, I found the best way to debug it is to drop the component in a blank project and set up a simple test job.
Start the job once, check any errors generated, and then switch to the "code" view of the job itself.
Besides locating any errors, you actually learn a lot by doing this (at least I did).
Remember : the java code is updated ONLY when you compile the job (that is, when you try to execute it).

Common issues


At least, these issues are pretty common for me; they might not apply to you.
  • Template tags opened and not closed (how to avoid : open and close the tags first, then write the code inside)
  • Code blocks opened and not closed in the template (happens a lot when moving code from one section to another)
  • Missing jet imports for classes you are using in the template (reduce the chances of this happening by using code snippets that state which imports are needed)


Happy coding!

P.S. : While writing this tutorial and describing the code snippet thing, I was thinking that it might make sense to organize an online archive for them : some kind of place where you can search by keyword and get the needed code into your clipboard. Anyone could contribute code and share it with others.
Do you think this could be a useful feature in talendforge? I could organize that on my website, but it would probably be better on the talend community website.
Let me know your opinions in the forums or in our feedback module.

