Data Modeling Standards and Principles
The purpose of UVU's Data Modeling Standards (DMS) is to ensure consistency, quality, and efficiency in how data models are created, documented, and shared across UVU. These standards are built on the four pillars of Enterprise Architecture Standards: Leverage Before Buy Before Build (LBB), Strategy First, Technology Second (SFTS), Standardization with Flexibility (SWF), and Governance & Risk Management (GRM).
Before we expose data for analytics or reporting, we will first model the data. Here are things to keep in mind as you go about generating a data model. We use a 3-tiered data model development strategy as outlined in this presentation. Please use that as a guide while developing your model.
Data Modeling Strategy - Main Points
Data modeling documentation must include an Entity Relationship Diagram (ERD) using Crows Feet notation, a Data Definition Table and possibly a Data Flow Diagram.
Crows Feet Notation:
Design around business concepts and entities (student, course, enrollment etc.). For example, the goal would be to have one entity for student that can be shared amongst many datasets rather than a different representation of student just for your specific data model.
Denormalization: Try and avoid duplicate data, but not at the cost of performance. We may include a field in more than one place if it means faster querying by reducing joins.
Naming Conventions: Use Proper case for names (i.e. FieldName, TableName). Avoid using abbreviations.
Relationships and Cardinality: Define clear relationships and cardinality between entities (i.e. Every class has at least one student but a student may have zero classes).
Surrogate Keys: Use surrogate keys for primary keys in all fact and dimension tables. Surrogate key names should end in _DWKey (i.e. FieldName_DWKey) so we know it is a surrogate key that is not from the source system.
Data Modeling Tools: We recommend using Erwin, Power Designer, or Power BI. Check with your local area techs about licensing and approved software options.
Data Governance: Work closely with data governance to understand any security concerns, transformation rules, validation rules, etc. that have been cataloged for a piece of data. If it doesn’t exist in the catalog, work with data governance to get it added.
Modeling for our Azure Data Lake (ADL)
[THIS SECTION IS A WORK IN PROGRESS]
Our Business Intelligence Data Platform (BIDP) team is our primary source for getting a new model into ADL. The following process will make sure the deployment of your model goes smoothly and quickly. DO NOT move on to a new step before the previous is completed as that will likely cause extensive rework.
Inform the BIDP team (currently managed by Ken Dahl) and the EA team (currently managed by Shane Green) of your intent to have a new model loaded into ADL. They will likely set up some preliminary meetings and answer any questions you have about the process.
Design the data model you need to see in ADL. You might use Visio, Lucid Charts, or you might write queries off the source system that you would like to see converted to pull from ADL objects. The BIDP and EA team can help you decide which is best.
If you decide to write your own queries and want the BIDP team to convert them to something appropriate for ADL, be sure to check out our SQL Development Standards page. Following those guidelines will make your deployment to ADL go that much faster.
When you think your representation of the proposed data model is ready, it will need to be submitted to BIDP for review. There will be some back and forth as changes are requested and implemented.
Once the proposed data model is approved by BIDP, it will get prioritized and added to their workload.
While BIDP is developing your model, you will need to participate in multiple cycles of review with them.
When everyone is satisfied the model is ready for testing, BIDP will execute data loads into a DEV or QA version of the model and you can point your application (whatever you plan to use to consume the data) to that version.
After multiple cycles of review to make sure the output of your application is as expected, you will approve the model.
BIDP will then execute their process for moving the model to production.