Wednesday, 7 March 2012

ODI Series – tips for improving essbase load times.

I thought I would cover a few tips for speeding up data loading to essbase using ODI.

The first tip is for speeding up data loads to essbase databases and while it may be obvious to many people I often find this setting in the IKM left as default and hear complaints that apparently the essbase adaptor is slow at loading data.

When you first create an interface to load data to an essbase database you will see an option in the IKM called COMMIT_INTERVAL.

The default setting for this option is 1000 which basically means that the data will be streamed using the Java API to the database in chunks of 1000 records.

Now let me go through an example of the difference in load times when using this option.

I successfully loaded over 311,000 records to an ASO database from a RDBMS source, in the operator you will see that it took the worrying time of 309 seconds.

In the log you can see that there is an entry for each of the 1000 records being loaded to the buffer, this is not only slowing the data load by a massive amount but also increasing the log size because for just one load the process is being repeated over 300 times.

So let’s see what happens when the COMMIT_INTERVAL is increased to a size that is bigger than the number of records to be loaded.

The data load is down to 14 seconds which is a huge improvement.

As the ODI agent is now handling loading the data in larger chunks then this means there will be an increase in the amount of the memory required for the JVM.

The graph displays the memory being used by the agent with the default 1000 commit records and then with the increased commit size so be prepared to increase the maximum heap size of the agent.

So if you are loading 5 million records it is not as simple as setting the commit size to that amount, testing is required to get a balance between an acceptable load time and JVM size.

If you are data loading to essbase using the ODI KM then I suggest to always use a load rule as it is the optimal method, if you don’t use a load rule if a record is rejected for any reason eg. Unknown member the load method changes and starts loading record by record.

If you have a large data load with possible error records then you could be waiting for an eternity for the load to finish without using a rule.

There are other optimisation techniques outside of ODI but If are loading data to a BSO database then it is also definitely worth reading the section “Optimizing Data Loads” which I am not going to cover as I believe in this case the documentation does a good job of explaining and if applied can drastically reduce load times.

There is another possible option to speeding up loading but I will be perfectly honest and say it may not actually make that much of a difference when using the Hyperion technologies and has a bigger impact when using RDBMS technologies.

If you look at a Hyperion related data server in the topology you will see “Array Fetch Size” which is set to 30 and cannot be changed.

This basically means is that when the knowledge module Java code retrieves data from the source or staging area it will retrieve 30 rows at a time.

If you edit one of the Hyperion knowledge modules and look at the code for a load step then you will see the fetch array being used.

The code retrieves the value stored in the “Array Fetch Size” and then uses that as the fetch size in the JDBC SQL queries.

If you look in the operator at the step the values is always set to the default of 30.

One way of being able to set the fetch size in the interface KM options is to first create a new option, this is done by right clicking a Knowledge module and selecting “New Option”.

I have created an option called FETCH_SIZE with a type of Value and a default of 30.

Now the load step of the KM is edited and the fetch size is set by retrieving the value in the option using.


In the interface KM options the FETCH_SIZE should now be available to be set.

If you are going to test out changing the default fetch size then the answer is not just to set the size as big as possible because it will probably have a detrimental effect on the performance or crash the agent JVM, it does need playing around with to find the optimal value.


Baguetex said...

Hello John,

Can you explain a bit what do you mean by Load Rule?

It's this option?
"RULES_FILE :- You don’t have to use a rules file with this IKM but an important thing to note is that without a rules file the data will always be sent as “overwrite” and you will need to use a rules file if you want to add to existing data."


John Goodwin said...

It means an essbase load rule which you can create via EAS console.