The problem

Recently I have participated in a discussion regarding ODI mappings complexity. More specifically reducing mappings complexity. The ODI mappings I am talking about are using datastores built on behalf of bitemporal tables in Oracle databases. Worths mentioning that temporal behavior is implemented using plain "start date" and "end date" fields for both valid time and transaction time periods. Oracle 12c's temporal validity feature is not used.

ETL jobs process data in a snapshot based manner. Each job receives two date parameters (day granularity) used for temporal selection:

  • a date used to filter source datastores on valid time

  • a date used to filter source datastores upon transaction time

 

These date type values are used to populate VT/TT fields for the inserted and updated data in target tables and to filter source tables data. It is possible to have inside an ODI mapping multiple instances of the same datastore, each instance being filtered on a different valid time. For example, comparing today's data with yesterday's data for a given table.

The necessity to filter source tables on valid time(VT) and transaction time(TT) could lead to a proliferation of Filter components or in the best case increases the complexity and verbosity of existing Filter conditions. It would be nice if we could find a way to eliminate the Filter components created exclusively for VT/TT filtering purposes but preserve the VT/TT conditions in the final SQL code generated by the mapping. Otherwise, if a temporal datastore has to be filtered on some other condition, it would be nice to not have to specify VT/TT conditions (in order to keep the filter condition as simple as possible) yet preserving VT/TT conditions on the final SQL.

It is assumed that temporal source tables are filtered by a reference date so we don't want to complicate the already complex ODI mapping logical and physical layouts with well-known filtering conditions. This is something similar to Oracle application context and fine-grained access control feature. Indeed this was one of the solutions proposed for simplifying the mappings but:

  • it will not work for non-Oracle database

  • it will not work when you have to filter on different reference dates inside the same mapping

 

The solution I describe below uses ODI component knowledge module to inject valid time/transaction time conditions directly in mapping's Abstract Syntax Tree. The conditions are specified using XKM options. Because each source datastore instance(or better said the associated physical node) will have an associated XKM, we can specify a filtering condition for each. If necessary we can specify a different reference date for each instance of a given temporal source datastore. Proceeding this way we will have the valid time/transaction time conditions present on the final SQL code without creating dedicated Filter components. We would obtain simpler ODI mappings but we have to create a custom XKM.

 

Component Knowledge Modules

Component KMs have been introduced with ODI 12c but it was not possible to create/change them until 12.2.1.2.1 release. In ODI documentation we find the following description of Component KMs:

 

Component KM is a new, improved style of KM development, which is applicable for IKMs, LKMs, XKMs. KMs for mappings, have two different types of implementation styles: The legacy 11g-style and component-style. 11g-style KMs are designed to use the monolithic odiRef substitution API object and syntax in their template commands. Component-style KMs are designed to use the newer object-oriented substitution API objects and newer template flow control syntax in their template commands. Both styles of KM can be seen in the project and global KM tree, as IKMs, LKMs, and CKMs with a predefined set of component-style KMs called XKMs. All new LKMs, IKMs, and XKMs should be designed and coded using the new component KM style. Component KMs are new since ODI 12c and are first exposed in the ODI studio from 12.2.1.2.1 release. A component KM has the same functionality as any other KM, which includes tasks and options. But it also includes some added new functions that includes:
  • An associated delegate script whose purpose is to produce an Abstract Syntax Tree (AST). The produced AST object is a tree of java class instances that describes the metadata for the mapping component, in a way that is tailored to be used to generate code for a specific language and technology.

...

What is an AST? As mentioned above an AST is a bunch of java objects organized in a hierarchic way (tree). These objects are used by the mapping code generator to generate the final generated result code (i.e. INSERT INTO...AS SELECT...). How are the AST objects created? ODI documentation provides a very good explanation:

 

The XKM is responsible for gathering and assembling the mapping logic for source and intermediate mapping components, and storing it in a set of code generation object known as an “Abstract Syntax Tree” or AST objects. The AST objects are tailored to generating executable code in some form.

Prior to the 12c release of ODI, each interface or mapping was assigned one LKM and one IKM, and possibly one CKM. Starting from ODI 12c, each component in a mapping physical design will have an assigned Component KM. A component KM can be an XKM, LKM, IKM, or CKM. An XKM is assigned to each source or intermediate node, an LKM is assigned to each AP node, and an IKM is assigned to each target datastore node. During code generation, the mapping code generator iterates through all the mapping component nodes, and each Component KM contributes some information to the final generated result code. Each Component KM has an associated delegate script, that is implemented as an ODI internal java class or by a groovy script. The delegate script for each KM is used to generate a java AST object. The generated AST object is then passed as an input parameter to the next node’s KM delegate class. When an LKM for an AP node or an IKM for a target node is reached, a combined AST tree is produced, which includes all the AST objects produced by all the upstream nodes. The AST tree can then be used as a substitution API object to substitute values into the LKM or IKM task line commands.

Back to our initial problem: we want to introduce some filters on the final generated code without creating Filter components and/or without adding extra filter condition on the existing Filter components. How could be this done? My idea was to try to modify the AST object corresponding to temporal datastores and inject there the temporal selection conditions (i.e. start_date <= reference date < end_date) which I want to add. Which type of Component KM is best to implement such change? The XKM. In ODI documentation is specified this:

 

The XKM typically does not generate any session steps of its own. Its main function is to provide java AST object instances to the downstream Component KMs.

This is what I want. No extra steps, just an additional condition on the code generated for reading data from a temporal datastore. Let's see how we can create an XKM which adds the required temporal selection conditions.

  

Creating a custom XKM 

As mentioned we want to create an XKM which should offer the possibility to specify the valid time and the transaction time values and create a filter on the source table based on these two values. Considering that column names used for temporal fields are named VT_START_DATE, VT_END_DATE, TT_START_DATE, and TT_END_DATE the XKM should add to generated SQL code conditions like the following one (the effectively SQL code generated by XKM will look differently):

VT_START_DATE <=  TO_DATE(VT_REF_DATE, 'YYYY-MM-DD') < VT_END_DATE

TT_START_DATE <=  TO_DATE(TT_REF_DATE, 'YYYY-MM-DD') < TT_END_DATE

Of course, the reference values for valid time and transaction time needs to be specified using KM options. In the KM which we will build, these options are named VALID_TIME and TRANSACTION_TIME.

We want to have some flexibility regarding the names of the columns used for the fields which store the start and end timestamps of valid and transaction periods. For this purpose, we will introduce in the KM four options which allow the user to specify the names of these fields.

In the version of ODI which I use (12.2.1.2.6) it is possible to create a new Component KM which inherits an existing Component KM. Therefore I have created a new XKM named "XKM Bitemporal Extract" which inherits the existing "XKM SQL Extract" knowledge module. The following step was to create the KM options mentioned before in this article:

The next step is to customize the new XKM behavior: adding the code necessary for the injection of temporal validity filter. Here I have faced two problems:

  • Changing XKM's code related to AST. I didn't find a method to change the delegate class or the Groovy delegate script in ODI Studio. The only way I have managed to edit this information was to export the opening the XKM and edit the resulted XML file in a text editor. Exporting some of the XKMs provided with ODI and opening the resulting XML files we see references to a delegate class and/or a delegate Groovy script. When both are mentioned the delegate class refers a class from the delegate Groovy script. This matches the Component KM description from ODI documentation: Each Component KM has an associated delegate script, that is implemented as an ODI internal java class or by a groovy script.

 

What has been left is to insert the delegate Groovy script shown above in the XKM. I have done it by editing the XML file obtained from KM export. The script should be introduced in the field with the name "DelegateScript" as shown below:

 

At this point the new XKM was ready so I have imported it back to ODI. I have to mention that "Import Replace" functionality didn't produce the desired result. The KM was imported without visible error but it didn't function as expected. Therefore I had to first delete the existing "XKM Bitemporal Extract" KM before importing the XML file.

I didn't try to implement the delegate class as an external class (i.e. a Java class in a jar file included in ODI Studio's classpath). The pre-defined component KMs are using this approach so I expect this approach is possible.

The XKM can be downloaded here

.