vdmgen is a code generator based on the same principle as vdmiter, however it is intended to generate code libraries from manifests and templates, not to execute generated code on the fly. Various versions of this script exist and others can be customized, focused on more specialized and typically more complex tasks. Here are some examples of its functionality:
Generates DDL and DML for system or user-maintained MQTs (MQTGEN.) Coupled with the VDM Aggregation Strategy, a minimal set of optimal MQTs can be defined, generated and maintained.
Allows expansion of cascading templates that can be "macro-included" in other templates.
Handles structured comments as well as variable substitution
The vdmiter command is a simple mechanism for iterating over the rows of a manifest and macro-substituting values in a template script or an in-line SQL query. Here is an overview of the command parameters: vdmiter -m manifestfile -t templatefile -vx SQLtext
Lexical are dimensions that combine and give a name to combinations of two or more attributes. A common use is for hierarchical codes, such as the ZIP code system, or breakdown codes. A lexical dimesions essentially encodes all combinations of its attributes into a single key that is used in the fact. It is also possible that the fact contains the individual keys that the lexical dimension table is made of. In those cases the table is not acting as a dimension but can be used to drive drop-downs and pick-lists in a progressive selection of the user interface so that the user sees only valid combinations of these attributes.
Simply put, the VDMGEN code generator reads manifest files and templates, and produces SQL/DDL and DB2 command language data-specific script libraries.
Manifests are files that list a series of names in the first (header) row and corresponding values in each subsequent row. VDMGEN reads-in the header names and processes each subsequent row. Processing entails the following:
Identify and read appropriate templates
Construct substitution tokens from manifest header names and values
Scan each template and apply substitutions
Store output to the appropriate library directories
Repeat through all nested/associated manifests and templates
Templates look like regular SQL scripts, including two special kinds of literals:
Release 2.2 adds templates for incremental maintenance and includes the following enhancements:
Improvements in the use of ksh93 in AIX and Linux and generalization of the vdmgen code generator
Conversion of the directory structure to be compliant with Unix best practices
Extensions and refinements in the manifest structure and the ability to support many families of objects
The ability to maintain two tiers of dependent structures, e.g. Data Mart and their MQTs
New templates for MQT definition compliant with Cubing Services, incremental maintenance of data marts and MQTs, support for DB2-maintained MQTs in deferred mode and various new utilities
The following templates from (VDMGEN) release 2.1 will be supported:
The problem addressed is illustrated by the following example:
The data warehouse (DW) contains a large multi-billion transaction fact and five major dimensions
Cubing Services accesses seven relational data marts (DMs) built from a subset of the DW data. Private dimensions are introduced, specific to each data mart, to support the cube models. These data marts are identical in structure and functionality, but defined over separate subsets of the fact after being transformed through views, one for each data mart.
Three MQTs are defined over each of the data marts to enhance performance of queries issued from Cubing Services
Two additional MQTs are defined directly over the DW fact to support reporting off the DW fact
The framework assumes the existence of one or more prime facts and dimensions that are maintained from external sources. Transformations and aggregations are encapsulated in defining views which in turn are materialized into physical tables. Two types of such derived structures are supported:
First tier data marts, aggregates or MQTs over the prime facts and dimensions.
Second tier MQTs or aggregations built on top of first tier objects.
Methods for maintaining the first tier of data marts and derived tables can be:
Full Refresh
Periodically refresh the target from the defining view over the source. This approach has the following characteristics
The VDMGEN 2.2 framework supports the definition and maintenance of data marts and MQTs in DB2 release 9.7 using incremental or full-refresh methodologies. It contains three major components:
The VDMGEN code generator which builds DB2 command language and SQL script libraries suitable for DB2 9.7
The VDMGEN Customizable Script Templates used by the code generator
Top level scripts and utilities suitable for AIX 5.2 and ksh93
Let's take a simple example to illustrate the approach. Our database contains five facts, and each of them has millions or billions of rows. The facts, although they share dimensions were developed at different times by different teams, so the naming conventions are not 100% consistent. Our objective is to list the count of rows and distinct count of load-batches and distinct count of accounts for each of the facts. Ideally we could write a simple query like the following, and only change the name of the table we run it against: