An Example

Let's take a simple example to illustrate the approach. Our database contains five facts, and each of them has millions or billions of rows. The facts, although they share dimensions, were developed at different times by different teams, so the naming conventions are not 100% consistent. Our objective is to list the count of rows and distinct count of load-batches and distinct count of accounts for each of the facts. Ideally we could write a simple query like the following, and only change the name of the table we run it against:

select count_big(*) CNT,
       count(distinct BATCH_NUMBER) BATCH_CNT,
       count(distinct ACCOUNT_NBR)  ACCOUNT_CNT
  from FACT

The problem is that the spelling of the fields BATCH_NUMBER and ACCOUNT_NUMBER is not consistent across all facts, so we need to vary each query accordingly. To handle this using the VDM approach we can take the following steps:

  1. Create a manifest, that contains three columns:
    • The Fact Table name
    • The column name used for Batch Number
    • The column name used for Account Number
  2. The first row of the manifest file contains the labels used as variable names. Subsequent five rows describe each of the five facts.
    
    TABNAME      BATCHNAME        ACCOUNTNAME
    TABLE1       BATCH_NUMBER     ACCNT_NBR
    TABLE2       BATCH_NUMBER     ACCOUNT_NUMBER
    TABLE3       BATCH_NBR        ACCNT_NBR
    TABLE4       BATCH_NUMBER     ACCOUNT_NBR
    TABLE5       BATCH_NO         ACCNT_NO
    
    
    let's call this file "manifestfile.txt"
  3. Write the SQL template as above, but instead of using the actual names for the three items in the manifest, use variables that will be macro-substituted by the code generator as follows:
    • Use __TABNAME__ in place of the FACT in the from clause
    • Use __BATCHNAME__ in place of the BATCH_NUMBER
    • Use __ACCOUNTNAME__ in place of the ACCOUNT_NUMBER
  4. The query in the template should look like this:
    select count_big(*) CNT,
           count(distinct __BATCHNAME__) BATCH_CNT,
           count(distinct __ACCOUNTNAME__)  ACCOUNT_CNT
      from __TABNAME__
    
    
    let's call this file "templatefile.sql"
  5. Use the vdmiter function to display the SQL after the substitution, or simply execute it.
    
    to see the generated SQL use debug mode:
    vdmiter -D -m manifestfile.txt -t templatefile.sql 
    
    To run the SQL drop the debug mode:
    vdmiter    -m manifestfile.txt -t templatefile.sql 
    
    
VDM Access: