Sunday, 22 March 2009

ODI Series - Excel functionality

I was going to start taking a look at using the web services functionality in ODI to invoke a web service, one of the products within the EPM area that using web services is EPMA so my idea was to look at integrating the two components. Unfortunately during my testing I ran into a few issues and surprisingly they were with ODI and not EPMA, I spent a while trying to resolve the issue but with no success.

I have since logged the problem with Oracle and they have confirmed it as a bug, I am not sure of the timeframe to fix the bug and there is no workaround so I am going to have to wait on that one.

So today I am going to look at a few areas where you can use excel with ODI. The first method I will be looking at is extracting data from excel and loading into a database table, now the database table could easily be replaced with planning or essbase. I first would like to stress that extracting information directly from excel is not the most efficient method and can be resource hungry but it is just to give you an idea of how it can be done. The best method would be to try to manipulate the excel file using VBA and turn it into a csv format which could then be loaded in faster using ODI.

Anyway lets consider that there is no option to change the excel file and the requirement is to extract the data from it.

Apologises for the style of the report I didn’t want to spend too long creating a masterpiece.

So the object of this exercise is to get the information from the excel file into the above database table. To achieve this the structure of the excel file has to be reversed in the ODI datastore format but before this is done named ranges have to be created in the file, this is because the ODI excel technology works on the basis of turning named ranges into datastores.

The named range “Colas” takes care of the product and period fields.

The named range “MrktScenMeas” relates to the fields Market,Scenario and Measures, the top row is hidden in the excel file but is required in the named range so it can be converted into column names in the ODI datastore. A final named range “RootBeers” was added that highlighted the second table of data in the excel file.

Next step is to create an ODBC connection to the excel file as this is the technology that ODI uses to integrate with excel.

Now to set up the connection details in the Topology manager, first insert a data server using “Microsoft Excel” technology.

The user/password section is left blank.
In the JDBC information the driver to be used is :- sun.jdbc.odbc.JdbcOdbcDriver
and the format for the JDBC url is :- jdbc:odbc: <odbc_dsn_alias>

Replacing <odbc_dsn_alias> for the name which was given in the previous step so in my case it will be :- jdbc:odbc:ODI_EXCEL_REPORT

Once this is applied the physical schema window is opened, the only information you have to supply is the logical schema name.

On to the designer where the reversing can being, first a new model is created, the technology of “Microsoft Excel” and logical schema is selected.

In the reverse section just the context is chosen and Standard is left as default for the type of reverse, in the “Selective Reverse” section the named ranges that were created will be displayed and you then have the option to pick and choose the ones you want to reverse into datastores.

Once reversed the datastores will be created with the column names to match the excel file.

If you right click the datastore you should be able to view the data from the excel file

Now that the models are complete it is time to build an interface to load the data from excel straight into the database table. I am not going into reversing the database table as I have covered that off in past blogs.

I am using the target database repository, as the staging area so there is no need to select the staging area different from target option.
In the diagram section the db table datastore are dragged as the target datastore and the two excel datastores dragged onto the source area.

The problem at the moment is there is no join between the two source datastores and there is no common column to create a join.

To get around this you can drag one of the columns on the source datastores on to the other source datastore to create a join, select the staging area to execute on and click the Cross option, this creates a cross join or otherwise known as a Cartesian join which combines each row of the two datastores.

In the flow area the LKM should be automatically selected as LKM SQL to SQL and the IKM I chose was IKM SQL to Control Append, the FLOW_CONTROL option has to be set to No as there are no key fields in the integration.

To load the other table of data in the excel file then the interface would be exactly same with just the datastore with the product/periods changing, the two interfaces can be then put into a package to seamlessly load the data. Make sure you don’t have the excel file open when you run the interfaces otherwise it will fail.

Next time I will look at automatically generating file datastores from information held in an excel template using the RKM File (FROM EXCEL)


osmaneyup said...
This comment has been removed by the author.
Araceli said...

Hello Jhon,

I hope you could help me with a problem i'm facing when trying to load data from excel to Oracle:

In the step number 2 (creating table) it indicates error because of the generated script doens't indicate the columns but the columns are shown in the model of Excel.

The generated script is:

create table STAGING.C$_0GRUPOS


Note: I'm using Oracle XE.

Could you help me,please? Thank you