Wednesday, 22 February 2012

ODI Series – ASO parallel data loads

I have been meaning to write up this blog for a long time but never got round to it; a recent post on the essbase Oracle forum prompted me to revisit the topic.

As you may be aware, there are a couple of methods available to load data into an ASO database in parallel: one is to have multiple sessions of Maxl each performing a data load, and the other, much more efficient, way is to use a single Maxl statement with multiple load rules (up to eight).

The Maxl syntax is
import database <appname>.<dbname> data connect as <SQLUSER> identified by <SQLPASSWORD> using multiple rules_file '<RULENAME>','<RULENAME>'... to load_buffer_block starting with buffer_id <ID_NUMBER> on error write to ...

A buffer is created for each load rule, and the ids start with the number defined in the statement. So if you have two load rules starting at buffer id 100, then once the statement is executed buffers 100 and 101 will be created; data from the first load rule will be loaded to buffer 100 and from the second to 101. Once both data loads are complete, the contents of the buffers are committed to the database in one operation.

As an example I am going to load data into the ASOsamp.Sample database using two load rules with the parallel load Maxl, and then go through the process to achieve the same using ODI and the essbase adaptor.

I created two SQL load rules, DLSQL2 and DLSQL3, which were pretty much identical in format.

import database AsoSamp.Sample data connect as ODISTAGE identified by 'password' using multiple rules_file 'DLSQL2','DLSQL3' to load_buffer_block starting with buffer_id 100 on error write to "F:\Scripts\dloaderror.txt";

A simple Maxl statement was created to use the parallel data load functionality starting with buffer id 100 and using the two rules files I had created.

If you query the load buffers while the data loads are taking place you can see that two buffers (100,101) have been created.

The default aggregation method is AGGREGATE_SUM which means “Add values when the buffer contains multiple values for the same cell.”

By default missing values are ignored but zero values are not.

Later on I will compare how ODI manages the buffers.

If you check the essbase application log while the parallel loads are being run, you will see that the load buffers are initialised.

The data loads then take place

And finally once the data loads are complete the buffers are committed in one operation.

So that is a nice and simple example; I will now try to replicate the functionality using ODI.

I am going to be using ODI 11g as load plans are available, which make it much simpler to execute scenarios in parallel; it is certainly possible to achieve the same in 10g using a package with the scenarios set to asynchronous mode.

The essbase ODI model reversing options were set to multiple data columns, with Measures as the data column and members "Original Price", "Price Paid", "Returns", "Units" and "Transactions"; this matches the data load in the rules files which were originally set up for the Maxl method.

I created two similar interfaces with the source as a relational table and the target being the essbase database datastore. I had to use the SUM function on the members due to an issue I encountered when loading multiple values against the same cell; I will cover this issue towards the end of this blog.

In the IKM options you will notice there are three options relating to ASO. These options appeared in one of the patch releases of 10g, though I am not sure exactly which version; if you don't have them then you really should be patching ODI, because many issues with the Hyperion adaptors have been addressed over time.

BUFFER_ID – This is exactly the same as the buffer id concept used with Maxl.

BUFFER_SIZE – When performing an incremental data load, Essbase uses the aggregate storage cache for sorting data. You can control how much of the cache a data load buffer can use by specifying the percentage (between 0 and 100% inclusive). By default, the resource usage of a data load buffer is set to 100, and the total resource usage of all data load buffers created on a database cannot exceed 100. For example, if a buffer of 90 exists, you cannot create another buffer of a size greater than 10. A value of 0 indicates to Essbase to use a self-determined, default load buffer size.
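That resource-usage rule can be illustrated with a toy Python sketch (this is just a model of the constraint described above, not the Essbase API):

```python
# Toy model of the ASO load buffer resource-usage rule: the combined
# size of all open buffers on a database cannot exceed 100, and a
# requested size of 0 means "let Essbase choose a default".
def can_create_buffer(open_buffer_sizes, requested_size):
    if not 0 <= requested_size <= 100:
        raise ValueError("size must be between 0 and 100")
    if requested_size == 0:          # self-determined default size
        return True
    return sum(open_buffer_sizes) + requested_size <= 100

print(can_create_buffer([90], 10))   # True - exactly fills the cache
print(can_create_buffer([90], 20))   # False - would exceed 100
```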

This is the same as resource usage in the essbase world; it differs slightly in that in Maxl it is specified as a fraction between 0.1 and 1, while in ODI it is a percentage between 0 and 100.

GROUP_ID – This option is used for parallel data loads; each interface that is going to be run in parallel will require the same group id. I am not sure exactly how it works internally, but it looks like the core Java code for the adaptor has a buffer id manager: when interfaces are executed, each buffer id is added to the manager, and once all executing sessions are complete the array of ids in the buffer manager is committed at once.
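My guess at how this hangs together can be sketched like this (purely conceptual Python with hypothetical names; the real adaptor code is Java and commits through the essbase Java API):

```python
# Conceptual sketch of a group-id based buffer manager: parallel
# sessions register their buffer ids under a shared group id, and the
# whole group is committed in one operation when all sessions finish.
class BufferIdManager:
    def __init__(self):
        self.groups = {}                      # group id -> buffer ids

    def register(self, group_id, buffer_id):
        self.groups.setdefault(group_id, []).append(buffer_id)

    def commit_group(self, group_id):
        # In the real adaptor this would be a single commit of all the
        # listed buffers to the cube in one operation.
        return sorted(self.groups.pop(group_id))

mgr = BufferIdManager()
mgr.register(1, 1)          # first interface, buffer id 1
mgr.register(1, 2)          # second interface, buffer id 2
print(mgr.commit_group(1))  # both buffers committed together
```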

Second interface KM options -

Each of the interfaces has a load rule defined, and I set both interfaces to a group id of 1; the first interface had a buffer id of 1 and the second a buffer id of 2. As there are two loads in parallel, the buffer size in each of the interfaces was set to 50%.

A scenario was generated for each interface; this is because when using load plans it is not possible to add an interface directly, so a scenario is required.

A simple load plan was created and the two scenarios generated from the interfaces were added and set to execute in parallel.

After executing the Load Plan you can see in the operator that both the scenarios are executed in parallel.

Looking at the essbase application logs, it seems to follow the same concept as the Maxl method. It is not going to be exactly the same, as the Maxl method uses SQL load rules while ODI uses the Java API and streams the data via the load rule.

SQL load rule using Maxl -

ODI Data load -

If you query the load buffers when using ODI to load data then there is a difference in some of the default properties being used.

The aggregation method being used is “AGGREGRATE_ASSUME_EQ” compared to the Maxl default of “AGGREGATE_SUM”; I am not aware of the property that ODI is using and I am not sure why it doesn’t use the aggregate sum method.

Ignore missing is also set to false, compared to the default of true when using Maxl; setting ignore missing to true gives the optimal performance.

The problem here is that with ODI you have no option to change these default values and are stuck with what has been hardcoded in the Java code. This is pretty poor in my opinion, and I wish Oracle would wake up a little, start enhancing the adaptors and take data integration with EPM seriously before they are left behind; if the adaptors were enhanced then I am sure more people would use them instead of being forced down the route of firing off Maxl from within ODI.

I had a hunt around in the Java code to see how the buffers are initialised.

They use the loadBufferInit method which is available in the essbase Java API.

The aggregation method in the adaptor code is being fixed to duplicates assume equal.

The ignore missing and ignore zero values options are also hardcoded.

The options when the buffer is committed are also fixed so you lose the options to work with slices and overriding data.

In my opinion it really wouldn’t be much of a development exercise to make these options available to be set in the IKM, like buffer id and size.

Anyway, this leads me on to the issue I experienced when loading multiple values to the same cell, which may be down to the fixing of the aggregation method as shown above.

The above source data has multiple values being loaded against the same cell and if the data is loaded using Maxl then it loads without any issues.

com.hyperion.odi.essbase.ODIEssbaseException: Cannot Load buffer term. Essbase Error(1270089): Data load failed: input contains different values for the same cell [(Original Price, Curr Year, Jan, Sale, Credit Card, No Promotion, 26 to 30 Years, 20,000-29,999, Photo Printers, 017589, 13835): 240 / 236]

If you load the same data using ODI then the interface fails with an error about the input data containing different values for the same cell; this may be because of the aggregation property being used when the essbase adaptor initialises the buffer.
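The difference between the two duplicate-handling methods can be mimicked with a small Python toy (just an illustration of the behaviour, not Essbase code):

```python
# Toy model of the two aggregation methods: "sum" adds values for the
# same cell, while "assume_equal" rejects differing values, mirroring
# the 1270089 error thrown when loading via ODI.
def load_to_buffer(records, method="sum"):
    buffer = {}
    for cell, value in records:
        if cell not in buffer:
            buffer[cell] = value
        elif method == "sum":
            buffer[cell] += value
        elif buffer[cell] != value:
            raise ValueError("input contains different values for the "
                             "same cell [%s: %s / %s]"
                             % (cell, buffer[cell], value))
    return buffer

rows = [("Original Price,Jan", 240), ("Original Price,Jan", 236)]
print(load_to_buffer(rows, "sum"))        # the two values sum to 476
# load_to_buffer(rows, "assume_equal")    # would raise ValueError
```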

The workaround is to sum the data in the interface, but I still consider this to be a bug with the knowledge module because you have no way of changing any of the default values. Over to you, Oracle.

Wednesday, 8 February 2012

ODI Series – Launching Calculation Manager rules

This blog has come about due to a post on the Oracle Planning forum; basically it addresses an issue with executing Calculation Manager rules from ODI using the command line utility CalcMgrCmdLineLauncher. In the past I wrote a blog on how to execute Hyperion business rules using ODI, but to be perfectly honest I have never tried the same with Calculation Manager rules, so I thought I would give it a go.

Once again it would be nice if there was an API available to launch the rules, but unfortunately there is not one available yet, so we are left with the command line utility, which can be restrictive in terms of logging and functionality.

First of all let’s start with a simple rule which also uses a variable with a runtime prompt for defining the scenario member.

Before even going near ODI I am going to make sure that I can get the command line utility running successfully.

The syntax for using the utility is

CalcMgrCmdLineLauncher.cmd [-f:passwordFile] /A:appname /U:username /D:database [/R:business rule name | /S:business ruleset name] /F:runtime prompts file [/validate]

The parameters are pretty much self-explanatory but if you are looking for detailed information then have a read of the section “Launching Business Rules With a Utility” in the planning admin guide.
If you are unsure on how to create an encrypted password file then have a read here

I created an RTP file for the runtime prompt, and these are the values used for the utility's parameters:

Password file: password.txt
Application name: PLANSAMP
Username: admin
Database/plan type name: Consol
Rule name: SIMPLE
RTP file: RTP.txt
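For reference, the runtime prompts file simply pairs each prompt with a value, one per line, using the name::value syntax described in the planning admin guide; the prompt name Scenario and member Actual below are assumptions based on this rule having a runtime prompt for the scenario member:

```
Scenario::Actual
```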

The utility is located in the Planning folder of the EPM instance, in this case E:\Oracle\Middleware\user_projects\epmsystem1\Planning\planning1.

This means the command line syntax to run the SIMPLE rule would be
E:\Oracle\Middleware\user_projects\epmsystem1\Planning\planning1\CalcMgrCmdLineLauncher.cmd -f:password.txt /A:PLANSAMP /U:admin /D:Consol /R:SIMPLE /F:RTP.txt

Personally I find the output from the utility to be pretty poor, and the assumption is that if there is no error message then it ran successfully.

It is possible to check the Job Console from within planning to see more details around the execution of the rule; this information can also be queried from the planning application relational table HSP_JOB_STATUS.

If the rule is run from planning then there is additional information written to calcmgrlaunch.log.

The command line utility unfortunately does not write to this log.

So now the rule is running from the command line, it is time to transfer this logic to ODI. I am using 11g and I am going to use an ODI procedure with the ODI Tools technology; if you are using 10g the process is very similar.

The ODI tool I am using is OdiOSCommand, which, as you probably guessed, invokes an OS command shell: cmd on Windows and sh on *nix.

There are parameters to split out the standard output and errors; the working directory is the location the command is executed from, and synchronous waits for the completion of the command.

OdiOSCommand "-OUT_FILE=E:\ODIDEMO\CalcManager\SIMPLE.log" "-ERR_FILE=E:\ODIDEMO\CalcManager\SIMPLE.err" "-FILE_APPEND=NO" "-WORKING_DIR=E:\ODIDEMO\CalcManager" "-SYNCHRONOUS=YES" E:\Oracle\Middleware\user_projects\epmsystem1\Planning\planning1\CalcMgrCmdLineLauncher.cmd  -f:password.txt /A:PLANSAMP /U:admin /D:Consol /R:SIMPLE /F:RTP.txt

If you have issues running the command then there is also an extra parameter "-COMMAND=", so the syntax would be

OdiOSCommand "-OUT_FILE=E:\ODIDEMO\CalcManager\SIMPLE.log" "-ERR_FILE=E:\ODIDEMO\CalcManager\SIMPLE.err" "-FILE_APPEND=NO" "-WORKING_DIR=E:\ODIDEMO\CalcManager" "-SYNCHRONOUS=YES" "-COMMAND=E:\Oracle\Middleware\user_projects\epmsystem1\Planning\planning1\CalcMgrCmdLineLauncher.cmd  -f:password.txt /A:PLANSAMP /U:admin /D:Consol /R:SIMPLE /F:RTP.txt"

I am not a fan of hardcoding parameter values and prefer to use variables as much as possible.

Each variable was created as text type.

CM_DIR – Working directory containing the RTP, password and log files.
CM_PASSWDFILE – Password file containing the encrypted password for the administrator user.
CM_PLANAPP – Planning application name.
CM_PLANDIR – Path to the Calculation Manager command line utility.
CM_PLANTYPE – Plan type to run the calculation against.
CM_RTPFILE – Runtime prompt file.
CM_RULE – Calculation Manager rule to execute.
CM_USER – Planning application administrator.

These variables were then transferred into the syntax of the OS command.

A scenario was generated from the ODI procedure as I will be using a Load Plan to execute the command line call.

It is possible to achieve the same results using a package if you prefer or are using 10g.

The scenario was added to the load plan and all the variable values were set.

The logs generated from running the command line utility will be based on the name of the rule, e.g. SIMPLE.log and SIMPLE.err.

The standard out log includes the variable information.

The error log includes information messages as well as any errors that were generated.

As I said earlier there is not really any information in the logs to say that executing the rule was successful.

So let’s change the RTP to an invalid member “Actual1” and see what happens when it is executed.

The Operator shows a successful execution; this is because the utility has not returned an error code to the calling OS shell.

The output log now contains the invalid member in the launch variables.

The error log does have an additional line with a failure message, even though it doesn’t explicitly say it has failed and the line really should have started with ERROR:, but unfortunately that is the best you get at the moment with the utility.

This means the logic is: if the rule completed successfully, the last entry in the error log will be

INFO: Application PLANSAMP exists in Registry: {1}

Otherwise it will have failed and the last entry will contain an error message.

Using ODI there are a number of different ways to handle this logic and generate a failure; one method could be to load the error log into a table and then analyse the records, a process I have described in a previous blog.

Another technique is to simply read in the lines of the error log and check whether the last line starts with “INFO”; this could be done using Java, Groovy or Jython, which is what I am going to use.

I added a step to the ODI procedure using Jython technology which will execute after calling the command line utility. The code basically reads in all the lines of the error log, which is not a problem because the file is so small, and then checks if the last line starts with ‘INFO’; if it doesn’t, an error is raised.
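The Jython step looks something like this (a sketch with an illustrative path and exception message; the Jython version bundled with your ODI release may need slightly more conservative syntax):

```python
# Read the command line utility's error log; if the last non-blank
# line does not start with INFO, assume the rule failed and raise an
# error so the ODI session is marked as failed.
def check_calc_log(err_file):
    f = open(err_file)
    lines = [line.strip() for line in f.readlines() if line.strip()]
    f.close()
    if not lines:
        raise RuntimeError("Calc Manager error log is empty: " + err_file)
    if not lines[-1].startswith("INFO"):
        raise RuntimeError("Calc Manager rule failed: " + lines[-1])

# check_calc_log("E:/ODIDEMO/CalcManager/SIMPLE.err")
```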

If the load plan is run again, this time it fails and generates an error with a message explaining why it failed, so at least now any errors can be trapped and acted upon.