Out of Memory and Java heap space errors are two of the most common errors in Talend jobs that handle large volumes of data. They can be avoided to an extent by following a few design guidelines.
(1) Keep in mind that tMap is a heavy component. Minimize its use in your jobs.
- Avoid tMap if you need only simple transformations such as trimming string values or replacing null numbers with zeroes. Use the tJavaRow component instead.
- If you need only a small set of columns from a large input, avoid tMap. Use the lighter tFilterColumns component instead.
- Similarly, to filter rows, use tFilterRow instead of tMap.
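As a rough illustration of the kind of logic that fits in a tJavaRow rather than a tMap, the sketch below trims a string and replaces a null number with zero. It is a standalone class so that it can be run outside Talend. The class name, the helper names and the column names `name` and `quantity` are assumptions made for this example. They do not come from any particular job.

```java
public class SimpleTransforms {

    // Trim a string value; a null input stays null.
    public static String trimOrNull(String value) {
        return value == null ? null : value.trim();
    }

    // Replace a null number with zero.
    public static int zeroIfNull(Integer value) {
        return value == null ? 0 : value;
    }

    public static void main(String[] args) {
        // Inside a tJavaRow the same expressions would be written per column,
        // for example (hypothetical column names):
        //   output_row.name = input_row.name == null ? null : input_row.name.trim();
        //   output_row.quantity = input_row.quantity == null ? 0 : input_row.quantity;
        System.out.println("[" + trimOrNull("  Talend  ") + "]");
        System.out.println(zeroIfNull(null));
    }
}
```

Each row is processed with plain Java expressions, so no lookup or join structures need to be built for such simple clean-up steps.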
(2) Use the store on disk option whenever necessary.
This option is available in tMap, tUniqRow, tSortRow, etc.
When the store on disk option is used in tMap, the directory for the temporary data is created automatically. This data is not deleted or replaced on subsequent runs of the job, so it is advisable to delete the temporary directory from within the job using a tFileDelete component. You can trigger it from the On Subjob Ok link of the tPostJob component.
In the case of tUniqRow, the temporary directory must be created manually before the job runs, or its creation can be handled within the job. If the temporary directory does not exist, tUniqRow throws a FileNotFoundException.
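One way to handle the tUniqRow directory from within the job is to create it with a few lines of Java before the component runs, for instance in a tJava placed at the start of the job. The following is a minimal sketch under that assumption. The class name, the helper name and the directory name `talend_uniq_tmp` are hypothetical, and the path must match whatever is configured in tUniqRow.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EnsureTempDir {

    // Create the directory (and any missing parents) if it does not exist yet.
    // Files.createDirectories does nothing when the directory is already there.
    public static Path ensureDirectory(String dir) {
        try {
            return Files.createDirectories(Paths.get(dir));
        } catch (IOException e) {
            throw new UncheckedIOException("Could not create " + dir, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical location under the system temp directory.
        String tempDir = Paths.get(System.getProperty("java.io.tmpdir"), "talend_uniq_tmp").toString();
        Path created = ensureDirectory(tempDir);
        System.out.println("Temp directory ready: " + created);
    }
}
```

Because the call is harmless when the directory already exists, it can run on every execution of the job.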
In the case of tSortRow, the temporary directory is created automatically (a check box for this can be seen in the image below).
(3) Modify the JVM arguments as and when needed.
-Xms256M - the initial heap size available to the JVM is 256 MB
-Xmx1024M - the maximum heap size available to the JVM is 1024 MB
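To check which heap limits a job is actually running with, the values can be printed from Java, for example in a tJava at the start of the job. This is only a sketch: the class and helper names are made up for the example, and the maximum reported by the JVM can differ slightly from the -Xmx value depending on the garbage collector in use.

```java
public class HeapReport {

    // Convert a byte count to whole megabytes.
    public static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: upper limit the heap may grow to (governed by -Xmx).
        System.out.println("Max heap (MB): " + toMegabytes(rt.maxMemory()));
        // totalMemory: heap currently reserved by the JVM (starts near -Xms).
        System.out.println("Current heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory: unused part of the currently reserved heap.
        System.out.println("Free in current heap (MB): " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it once with the default arguments and once with the modified ones shows whether the new settings have taken effect.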