Saturday, 18 October 2008

ODI & Planning - Brief look into file manipulation

Just when you thought the blog about ODI and essbase was going to happen another quick sidetrack comes along, after my previous blog I had some really positive feedback and a request to run through file manipulation, so I am just going to go over briefly a few of the areas that Glenn has raised.

Ok, I think it is best to show the file I have quickly created, I am not going to bother with any of the planning member properties as I have covered all this before.

The logic is the in column A the first item in each cell will be the parent, the child will be the remaining string, the alias will be a concatenation of the child and product code in column B.

The first step as always was to create a DataStore pointing the csv file and reverse it to bring back the columns.

I am going to use the same DataStore I have used for previous examples for the target planning application.

This time I am going to load the data into the Segments dimension, I added a couple of members in planning so the metadata file could be loaded against them.

So now we have our source and target we can create the interface to bring them together.

As usual in the definition of the interface name it, set the context and use “Sunopsis Memory Engine” as the staging area.

In the diagram drag the PRODFILE DataStore into the source and the Segments DataStore into the target.

Usually you would just map the source straight to the target but we have to do a little bit of manipulation to get it into the format we want.

Lets start with the Parent, this will be formed from the Product column in the source. If you select the Parent column in the target it will display the mapping window, from here you can click the “Launch Expression Editor” button.

In the Expression editor there are a number of functions available, we are going to concentrate on the Strings functions.

For the parent we need part of the Product string so the ideal function is SUBSTRING.

The format of the function is SUBSTRING(<string>, <start>[, <length>])

An example of one of the rows from the product file is:- “LCD Sony Bravia 40”

In our case the <string> element will be in the Product column.
The <start> is a numerical value of where you want the function to begin; as we want to start from the beginning of the string it will be 1.

The <length> of the string we want to extract is varied so we can’t just enter a numerical value, so at the moment our function will be :-

SUBSTRING(PRO.Product, 1, <length>)

To find the length we need to use another function, we are trying to find the first occurrence of a space in the string and to achieve this the LOCATE function can be used.

The format is :- LOCATE(<string1>, <string2>[, <start>])

<string1> is the string you are trying to locate
<string2> is the string which you want to find <string1> in
<start> is a numerical value of the location in the <string2> where you want to start from.

From this we can generate:- LOCATE(<string1>, PRO."Product", 1)

The first time I used this function to locate a space character I just used :-
LOCATE(’ ‘, PRO."Product", 1) but unfortunately this invalid, luckily there is a function called SPACE

SPACE(<count>) – where equals the number of character spaces, so we just want 1.

This gives us :- LOCATE(SPACE(1), PRO."Product", 1)

And if we add this to our original SUBSTRING function we come up with.

SUBSTRING(PRO.Product, 1, LOCATE(SPACE(1), PRO.Product, 1))

On the example from the flat file:- “LCD Sony Bravia 40” this will extract “LCD” exactly what we want for the parent.

Right lets move on to mapping the child (Segments), once again we can the same sort of format with the SUBSTRING function, the only difference this time is we want to extract from the first occurrence of a space character and exclude the beginning of the string.

With the use of the LOCATE that we used in the previous example then we come up with.

SUBSTRING(PRO.Product, LOCATE(SPACE(1), PRO.Product, 1))

With the SUBSTRING function you have to specify the <length>, if you don’t include specify it will extract to the end of the string.

On “LCD Sony Bravia 40” it will extract “Sony Bravia 40” which is correct for the child member.

Finally we have to generate the Alias which is a concatenation of the child member and the product code column, in this example I am going to separate the two strings with “ – “.

As we know how to extract the child member we can use that in our formula.

Once again there is a function we can use and this time it is CONCAT

CONCAT(<string1>, <string2>) which is self-explanatory how it works.

To start with we want to concatenate the child string with ‘ ‘ and this produces.

CONCAT(SUBSTRING(PRO.Product, LOCATE(SPACE(1), PRO.Product, 1)),' - ')

We also need to concatenate the product code the CONCAT function is required again.

Producing :-
CONCAT(CONCAT(SUBSTRING(PRO.Product, LOCATE(SPACE(1), PRO.Product, 1)),' - '),PRO."ProdCode")

The flow diagram should resemble

Ready to execute and check the results.

All looks good in the Operator and the generated log files show no errors.

There we have it from 2 columns in a flat file and some quick manipulation we have generated a hierarchy with parent/child/alias.

Hopefully this will give you enough insight to start being able to manipulate various sources and begin to use some of the inbuilt functions.

1 comment:

GlennS said...

Thanks John.As usual, nicely put together so any simpleton like me can follow