Best Practices in Data Modeling

Submitted By marcosh
Dan English

Objectives

• Understand how QlikView is different from SQL
• Understand how QlikView works with(out) a data warehouse
• Don't throw the baby out with the bathwater
• Adopt applicable data modeling best practices
• Know where to go for more information

QlikView is not SQL (SQL Schemas)

SQL takes a large schema and queries a subset of its tables. Each query creates a temporary “schema” of only a few tables. Query result sets are independent of each other.

[Diagram: Query 1 and Query 2, each with its own temporary schema]

QlikView is not SQL (QV Schemas)

QlikView builds a smaller, more reporting-friendly schema from the transactional database. This schema is persistent and reacts as a whole to user “queries”: a selection affects the entire schema.

QlikView is not SQL (Aggregation and Granularity)
Store Table

Store  SqrFootage
A      1000
B      800

Sales Table

Store  Prod  Price  Date
A      1     $1.25  1/1/2006
A      2     $0.75  1/2/2006
A      3     $2.50  1/3/2006
B      1     $1.25  1/4/2006
B      2     $0.75  1/5/2006

Select * From Store, Sales Where Store.Store = Sales.Store will return:

SqrFootage  Store  Prod  Price  Date
1000        A      1     $1.25  1/1/2006
1000        A      2     $0.75  1/2/2006
1000        A      3     $2.50  1/3/2006
800         B      1     $1.25  1/4/2006
800         B      2     $0.75  1/5/2006

Sum(SqrFootage) over this result will return 4600 (1000 × 3 + 800 × 2), not the true total of 1800, because each store's square footage is repeated once per sales row. If you want an accurate Sum of SqrFootage in SQL, you cannot join to the Sales table in the same query!
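The double counting above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever SQL database is in use (an assumption; the slides do not name one), and the dates are rewritten in ISO form only for convenience.

```python
import sqlite3

# In-memory database holding the two example tables from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store (Store TEXT, SqrFootage INTEGER);
    CREATE TABLE Sales (Store TEXT, Prod INTEGER, Price REAL, Date TEXT);
    INSERT INTO Store VALUES ('A', 1000), ('B', 800);
    INSERT INTO Sales VALUES
        ('A', 1, 1.25, '2006-01-01'),
        ('A', 2, 0.75, '2006-01-02'),
        ('A', 3, 2.50, '2006-01-03'),
        ('B', 1, 1.25, '2006-01-04'),
        ('B', 2, 0.75, '2006-01-05');
""")

# Joining to Sales repeats each store's footage once per sales row.
joined_sum = conn.execute(
    "SELECT SUM(SqrFootage) FROM Store, Sales WHERE Store.Store = Sales.Store"
).fetchone()[0]

# Querying Store alone aggregates at the store grain.
store_only_sum = conn.execute("SELECT SUM(SqrFootage) FROM Store").fetchone()[0]

print(joined_sum, store_only_sum)  # 4600 1800
```

The join changes the grain of the result set from one row per store to one row per sale, and the aggregate follows the result set.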

QlikView is not SQL (Benefits)

• QlikView allows you to see the results of a selection across the entire schema, not just a limited subset of tables.
• QlikView will aggregate at the lowest level of granularity in the expression, not the lowest level of granularity in the schema (query) like SQL.
• This means that QlikView will allow a user to interact with a broader range of data than will ever be possible in SQL!
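As a rough illustration of aggregating at the granularity of the expression, the sketch below keeps the Store and Sales example tables separate and lets a selection on one table narrow the other through the shared Store key. This is a toy emulation written for this example; it is not how the QlikView engine is implemented, and the function names are hypothetical.

```python
# The two example tables, kept separate and linked only by the "Store" key.
store_table = [
    {"Store": "A", "SqrFootage": 1000},
    {"Store": "B", "SqrFootage": 800},
]
sales_table = [
    {"Store": "A", "Prod": 1, "Price": 1.25},
    {"Store": "A", "Prod": 2, "Price": 0.75},
    {"Store": "A", "Prod": 3, "Price": 2.50},
    {"Store": "B", "Prod": 1, "Price": 1.25},
    {"Store": "B", "Prod": 2, "Price": 0.75},
]

def associated_stores(selected_prod):
    """Propagate a selection on Sales.Prod to the Store keys linked to it."""
    return {row["Store"] for row in sales_table if row["Prod"] == selected_prod}

def sum_sqr_footage(selected_prod):
    """Sum SqrFootage at the Store grain: each associated store counts once."""
    stores = associated_stores(selected_prod)
    return sum(row["SqrFootage"] for row in store_table if row["Store"] in stores)

def sum_price(selected_prod):
    """Sum Price at the Sales grain for the same selection."""
    return sum(row["Price"] for row in sales_table if row["Prod"] == selected_prod)

print(sum_sqr_footage(1), sum_price(1))  # 1800 2.5
print(sum_sqr_footage(3), sum_price(3))  # 1000 2.5
```

Selecting product 1 touches both stores, yet the footage total stays at 1800 because the sum runs over Store rows rather than over a joined result.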





QlikView is not SQL (Challenges)



• Several SQL queries can join different tables together in completely different ways.
• In QlikView there is only ever one way that tables join in any one QlikView file.
• This means that schema design is much more important in QlikView!





A Word about Requirements

• Requirements will always inform your schema design.
• If you do not fully understand your requirements, and those requirements are not thoroughly documented, you are not ready to begin scripting. No exceptions.
• Requirements are focused on the problem domain, not the solution domain.
• Most schema design questions are not really schema design questions; they are requirements questions.

The Traditional Data Warehouse

[Diagram: four layers (Source Data, Data Staging, Data Presentation, Access Tools) with the components AR, GL, ERP, ODS, Data Mart, OLAP Cube, ISQL, Cube Viewer, Reports, and Other Viewer]

How QlikView Can Be Used

[Diagram: the layers Source Data, Data Staging, Data Presentation, and Access Tool, with QlikView shown three times alongside AR, GL, ERP, ODS, and Data Mart]

Observations

• There is no single best practice for data modeling.
• Data modeling is entirely dependent on requirements: systems, skill sets, security, functionality, flexibility, time, money, and above all, business requirements!
• Likewise, best practices are not universal.
• Apply best practices situationally.
• Sometimes (gasp!) even QlikView may not be the right tool.

Relational vs. Dimensional Modeling

Relational
• Complex schemas
• Efficient data storage
• Schema quicker to build
• Schema easier to maintain
• Queries more complicated
• Confuses end users

Dimensional
• Simpler schemas
• Less normalized
• Schema complex to build
• Schema complex to maintain
• Simpler queries
• Understood by end users

4 Steps to Dimensional Modeling

1. Select the business process to model.
2. Declare the grain of the business process.
   Ex.: one trip, one segment, one flight, one historical booking record.
3. Choose the dimensions that apply to each fact table row.
4. Identify the numeric facts that will populate each fact table row.
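The four steps can be sketched in code. The example below is a minimal illustration that assumes store sales as the business process and one row per store, product, and day as the grain; the class and field names are hypothetical and simply reuse the Store and Sales example from earlier in the deck.

```python
from dataclasses import dataclass
from datetime import date

# Step 1 (assumed): the business process is store sales.
# Step 2 (assumed): the grain is one row per store, product, and day.

# Step 3: dimensions that describe each fact row.
@dataclass(frozen=True)
class StoreDim:
    store: str
    sqr_footage: int

@dataclass(frozen=True)
class ProductDim:
    prod: int

# Step 4: the fact row carries dimension keys plus the numeric fact.
@dataclass(frozen=True)
class SalesFact:
    store: str        # key into StoreDim
    prod: int         # key into ProductDim
    sale_date: date   # key into a date dimension
    price: float      # numeric fact

def grain_is_unique(facts):
    """Check the declared grain: no two rows share store, product, and date."""
    keys = [(f.store, f.prod, f.sale_date) for f in facts]
    return len(keys) == len(set(keys))

facts = [
    SalesFact("A", 1, date(2006, 1, 1), 1.25),
    SalesFact("A", 2, date(2006, 1, 2), 0.75),
    SalesFact("A", 3, date(2006, 1, 3), 2.50),
    SalesFact("B", 1, date(2006, 1, 4), 1.25),
    SalesFact("B", 2, date(2006, 1, 5), 0.75),
]
print(grain_is_unique(facts))  # True
```

Writing the grain down as an explicit check makes it harder for rows at a different level of detail to slip into the fact table unnoticed.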

Multiple Star Schemas and Conformed Dimensions

Common dimensions (columns): Date, Product, Store, Promotion, Warehouse, Vendor, Contract, Shipper

Business processes (rows), each marked with an X under the dimensions it uses:
Store Sales (4 dimensions)
Store Inventory (3 dimensions)
Store Deliveries (3 dimensions)
Warehouse Inventory (4 dimensions)
Warehouse Delivery (4 dimensions)
Purchase Orders (6 dimensions)

Using QVD Files to Conform Dimensions

[Diagram labels: DB, .QVW
Dimension files: Date.qvd, Prod.qvd, Store.qvd, Promo.qvd, Warehouse.qvd, Vendor.qvd, Contract.qvd, Shipper.qvd
Application files: StoreSales.qvw, StoreInv.qvw, StoreDelivery.qvw, WHInventory.qvw, WHDelivery.qvw, PurchaseOrders.qvw]
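The idea of conforming dimensions through shared files can be illustrated outside QlikView. In the sketch below a CSV file stands in for a QVD file (an assumption made so the example runs anywhere; QVD is a QlikView-specific format), and two hypothetical "applications" load the same Store dimension from that single file.

```python
import csv
import tempfile
from pathlib import Path

def write_store_dimension(directory, rows):
    """Write the Store dimension once, to a single shared file."""
    path = Path(directory) / "Store.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Store", "SqrFootage"])
        writer.writeheader()
        writer.writerows(rows)
    return path

def load_dimension(path):
    """Load a dimension exactly as it was written; no per-application changes."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))

with tempfile.TemporaryDirectory() as shared_dir:
    store_file = write_store_dimension(
        shared_dir,
        [{"Store": "A", "SqrFootage": 1000}, {"Store": "B", "SqrFootage": 800}],
    )
    # Two separate applications read the same file instead of
    # building their own Store dimension from the database.
    store_sales_app = {"Store": load_dimension(store_file)}
    store_inventory_app = {"Store": load_dimension(store_file)}

dimensions_conform = store_sales_app["Store"] == store_inventory_app["Store"]
print(dimensions_conform)  # True
```

Because both applications read one definition, a store means the same thing in each of them, which is the point of a conformed dimension.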

Slowly Changing Dimensions

• Dimension values change over time in relationship to each other.
• Classic example: sales force territory reorganization.
• Postal code 28429 was in territory A1, but as of June 1st 2006 it moved to territory D3.

Slowly Changing Dimensions

Three ways to deal with this:

1. Overwrite Original Value
   Very easy. Now all sales for 28429 roll into D3 regardless of date.

2. Add a Dimension Row (requires surrogate key)
   Preserves history.

   FakeKey  PostalCode  TerrID
   123      28429       A1
   124      28429       D3

3. Add a Dimension Field
   Allows comparison.

   FakeKey  PostalCode  TerrID  OldTerrID
   123      28429       D3      A1

It is possible to combine solutions.
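The three options can be compared side by side in a short sketch. It models the dimension as a list of dictionaries using the field names from the tables above; the function names are hypothetical, and a real implementation would live in the load script or ETL layer.

```python
import copy

# Territory dimension before the reorganization.
dimension = [{"FakeKey": 123, "PostalCode": "28429", "TerrID": "A1"}]

def overwrite_value(rows, postal_code, new_terr):
    """Option 1: overwrite the original value; history is lost."""
    result = copy.deepcopy(rows)
    for row in result:
        if row["PostalCode"] == postal_code:
            row["TerrID"] = new_terr
    return result

def add_dimension_row(rows, postal_code, new_terr):
    """Option 2: add a row under a new surrogate key; history is preserved."""
    result = copy.deepcopy(rows)
    next_key = max(row["FakeKey"] for row in result) + 1
    result.append({"FakeKey": next_key, "PostalCode": postal_code, "TerrID": new_terr})
    return result

def add_dimension_field(rows, postal_code, new_terr):
    """Option 3: keep the previous value in OldTerrID for comparison."""
    result = copy.deepcopy(rows)
    for row in result:
        if row["PostalCode"] == postal_code:
            row["OldTerrID"] = row["TerrID"]
            row["TerrID"] = new_terr
    return result

print(overwrite_value(dimension, "28429", "D3"))
print(add_dimension_row(dimension, "28429", "D3"))
print(add_dimension_field(dimension, "28429", "D3"))
```

With option 2, fact rows must carry the surrogate key rather than the postal code, so that sales before and after the change attach to different dimension rows.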

Circular References

Any time the links between tables enclose an area in the table viewer, you have a circular reference.

Circular References

Circular References are common in QlikView because you get only one set of join relationships per QlikView file. When you get a circular reference ask yourself if you could live without one of the joins. If you can, cut it. Otherwise you may have to resort to concatenation or a link table to remove the circular reference. Don’t kill yourself with technical link tables if you don’t have to!
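A circular reference is a loop in the graph whose nodes are tables and whose edges are the key links between them. The sketch below finds such a loop with a union-find structure; the table names are hypothetical, and this is a generic graph check rather than anything QlikView exposes.

```python
def find_circular_link(links):
    """Return the first link that closes a loop between tables, or None.

    `links` is a list of (table_a, table_b) pairs, one per key link.
    """
    parent = {}

    def find(table):
        # Follow parent pointers to the root of this table's group,
        # shortening the path along the way.
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]
            table = parent[table]
        return table

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both tables are already connected, so this link closes a loop.
            return (a, b)
        parent[root_a] = root_b
    return None

links = [
    ("Orders", "Customers"),
    ("Orders", "Products"),
    ("Shipments", "Customers"),
    ("Shipments", "Products"),
]
print(find_circular_link(links))       # ('Shipments', 'Products')
print(find_circular_link(links[:3]))   # None
```

The link reported is only the one that happened to close the loop in the given order; any link on the loop is a candidate for cutting.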

Link Tables

Link tables essentially allow you to join two or more fact tables against a common set of dimensions without the usual circular references.

Wrong!
[Diagram of tables: FactTable1, FactTable2, Dimension1, Dimension2, Dimension3]

Right!
[Diagram of tables: FactTable1, FactTable2, LinkTable, Dimension1, Dimension2, Dimension3]
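One common way to build a link table is sketched below, under the assumption that both fact tables share the key fields Dim1 and Dim2. The link table holds each distinct combination of those keys plus a composite LinkKey, and the fact tables keep only the LinkKey and their own measures. The field names, measures, and key format are hypothetical.

```python
# Two fact tables that share the dimension keys Dim1 and Dim2.
fact1 = [
    {"Dim1": "a", "Dim2": "x", "Amount": 10},
    {"Dim1": "b", "Dim2": "x", "Amount": 5},
]
fact2 = [
    {"Dim1": "a", "Dim2": "x", "Budget": 8},
    {"Dim1": "a", "Dim2": "y", "Budget": 3},
]
SHARED_FIELDS = ("Dim1", "Dim2")

def link_key(row):
    """Composite key built from the shared dimension fields."""
    return "|".join(str(row[field]) for field in SHARED_FIELDS)

def build_link_table(*fact_tables):
    """One row per distinct combination of shared keys across all fact tables."""
    combos = {}
    for table in fact_tables:
        for row in table:
            key = link_key(row)
            combos[key] = {"LinkKey": key, **{f: row[f] for f in SHARED_FIELDS}}
    return list(combos.values())

def strip_shared_fields(table):
    """Replace the shared fields in a fact table with the composite key."""
    return [
        {"LinkKey": link_key(row),
         **{k: v for k, v in row.items() if k not in SHARED_FIELDS}}
        for row in table
    ]

link_table = build_link_table(fact1, fact2)
fact1_linked = strip_shared_fields(fact1)
fact2_linked = strip_shared_fields(fact2)
print(link_table)
```

After this step the dimensions attach to the link table only, so each fact table reaches them through a single path.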

Last Words



• If your end users reject your application then you have failed, regardless of your technical execution.
• End user requirements and end user experience should always dictate your approach to developing QlikView applications, including data modeling.
• Many data warehousing techniques and best practices are directly applicable to QlikView data modeling.
• Data modeling has been practiced for many years, and brilliant minds have contributed to the field; we don’t always need to reinvent the wheel.







Recommended Resources

Data Modeling: The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (2nd Edition), Ralph Kimball and Margy Ross, Wiley, ISBN 0471200247

Requirements Gathering: Exploring Requirements: Quality Before Design, Donald C. Gause and Gerald M. Weinberg, Dorset House, ISBN 0932633137

Questions?

Thank You!
Dan English
