Versioned Dimensional Model (VDM)

The Versioned Dimensional Model (VDM) originated in 1991, before dimensional modeling was popularized by Kimball. The earliest publications (1992) appeared in the Relational Journal (Codd and Date's Relational Institute), followed through the 1990s by articles in trade magazines and conference presentations. Since 2000, various white papers and reports have been available only to existing and prospective customers upon request.

The VDM Registry

The Registry is a collection of persistent functions that reliably handle the following areas:
Common Codes
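
As a sketch of how persistent common-code lookups could be exposed to shell-based ETL jobs (the file layout and function names below are illustrative assumptions, not the Registry's actual interface):

  # codes.txt holds one "area|code|description" record per line, e.g.:
  #   country|US|United States
  #   country|DE|Germany

  # Look up the description for a code in a given area.
  # Usage: code_lookup country US
  code_lookup() {
      awk -F'|' -v a="$1" -v c="$2" \
          '$1 == a && $2 == c { print $3; found = 1 }
           END { exit !found }' codes.txt
  }

  # Register a new code by appending it to the persistent store.
  # Usage: code_put country FR France
  code_put() {
      printf '%s|%s|%s\n' "$1" "$2" "$3" >> codes.txt
  }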

Parallelism Models

There are three modes of parallelism, shown below:
Three Parallelism Models
VDMETL on Regular Unix
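
As a minimal illustration of the plain-Unix flavor of parallelism (file names and the parser command are assumptions; a real channel would use a VDMGEN©-generated parser):

  # Split a large stage file into four chunks and parse them in
  # parallel, one Unix process per chunk; 'parser' stands in for a
  # generated parser executable (hypothetical name).
  split -n l/4 stage_file.dat chunk.
  for f in chunk.*; do
      parser < "$f" > "$f.out" &
  done
  wait                               # block until all parsers finish
  cat chunk.*.out > stage_file.parsed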

Operational Setup

To facilitate easy management of files, we propose one of two methods of setting up the environment.
Moving Files Through Directories
Under this approach, files are moved into a landing directory. When the files are ready for processing, they are moved again to a directory monitored by daemon processes, which recognize each file, choose the correct channel it should be processed in, and invoke the vdmetl command to process it. After processing is complete, the file is moved to a "check" directory and from there to the archive.
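
A minimal sketch of such a daemon loop, assuming illustrative directory names and vdmetl options (the real channel-selection rules are not shown here):

  # Poll the monitored directory and route each arriving file.
  while true; do
      for f in /data/processing/*; do
          [ -f "$f" ] || continue
          case "$(basename "$f")" in    # choose a channel by file name
              claims_*) channel=claims  ;;
              policy_*) channel=policy  ;;
              *)        channel=default ;;
          esac
          # --channel is a hypothetical option used for this sketch
          vdmetl --channel "$channel" "$f" \
              && mv "$f" /data/check/   # hand off for verification
      done
      sleep 30
  done
  # A separate step later moves verified files from /data/check to
  # the archive.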

The Mapping Manifest - A Machine-Readable Specification

The specific manifests discussed here capture mappings from COBOL copybook names and positions to table columns. The purpose of these manifests is twofold:
  1. As human-readable documents, they provide detailed mapping and transformation specifications.
  2. As machine-readable documents, they assist in the development of correct and validated parsers that can process stage files and populate database tables.
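
For illustration only, a manifest entry might pair a copybook field with its target column along these lines (the column layout and names below are invented, not the actual VDM manifest format):

  # copybook field   pos  len  picture  target column        transform
  WS-CUST-ID           1    9  9(9)     customer.cust_id     none
  WS-CUST-NAME        10   30  X(30)    customer.cust_name   trim
  WS-EFF-DATE         40    8  9(8)     customer.eff_date    yyyymmdd to date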

HDFS and Unix Integration

There are two straightforward ways and one more complex way of achieving HDFS-VDM integration.
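
As an illustration of the straightforward end of the spectrum, files can be copied or streamed across the HDFS boundary with the standard HDFS shell (paths are illustrative, and 'parser' stands in for a hypothetical VDM-generated parser):

  # Copy a processed file into HDFS for use by Hadoop jobs.
  hdfs dfs -put stage_file.parsed /landing/stage_file.parsed

  # Or stream HDFS data through a Unix parser without a local copy.
  hdfs dfs -cat /landing/raw_file.dat | parser > stage_file.parsed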

HDFS Design and Synergies with VDM

The Hadoop Distributed File System (HDFS) uses divide-and-conquer techniques under the covers to distribute data and processing. According to Tom White's Hadoop: The Definitive Guide (O'Reilly), the design of HDFS is driven by three primary objectives: storing very large files, streaming (write-once, read-many-times) data access, and running on commodity hardware.

VDMETL - Hadoop Integration

There are two levels at which integration can occur:

  • Use HDFS...
    • to transform and store files for use by Hadoop
    • to land, store, manage, process and archive source data targeted to traditional DBMS platforms
  • Enable VDM-generated parsers and other modules to run under Hadoop using the map-reduce architecture (a sketch follows this list)
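
For the second level, Hadoop Streaming can run an ordinary stdin/stdout parser as a mapper. A sketch, assuming a VDM-generated executable named 'parser' and illustrative HDFS paths (the streaming jar location varies by installation):

  # Run a VDM-generated parser as a map-only streaming job; -files
  # ships the parser executable to the task nodes.
  hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -D mapreduce.job.reduces=0 \
      -files parser \
      -input /landing/raw \
      -output /parsed/raw \
      -mapper parser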

The VDMETL© Run-Time Framework

The run-time VDMETL© framework consists of the components required to decompress, read, parse, and load source files into their intended target. It uses parsers generated with VDMGEN© and services for handling code lookups, exceptions, and logging. The framework components are described in more detail through the links below.
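
Conceptually the run-time chain is an ordinary Unix pipeline; a minimal sketch with hypothetical component names ('parser' for a VDMGEN©-generated parser, 'loader' for the target-specific load step):

  # Decompress, parse, and load one source file; in this sketch the
  # parser writes rejected records to stderr (an assumption) and the
  # loader appends its messages to a log.
  zcat source_file.dat.gz \
      | parser 2> rejects.dat \
      | loader >> load.log 2>&1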

VDMETL Framework

The VDMETL© framework is a customizable, open set of scripts and processes that uses the VDMGEN© capabilities to create, maintain, and execute ETL data provisioning activities with simple, readily available Unix tools. The run-time framework can vary from the provided basic, parallel, scalable framework relying on communicating Unix processes to a Hadoop/HDFS streaming framework. A totally redesigned Spark-based solution is under development. Parsers are generated from human- and machine-readable mapping specification documents.
