Search
Twitter

Entries in Master Data Services (13)

Saturday
Feb162013

Importing Data Into Master Data Services 2012 โ€“ Part 3

Welcome to Part 3 of a 3-part series on using the new Entity-Based Staging Process in Master Data Services (MDS) 2012.

Part 1: Overview of the Staging Process

Part 2: Loading a Model Where All Attributes Come From a Source System

Part 3: Loading a Model Where Some Attributes Are Maintained Directly in MDS

Sample Model

This builds upon the model built in Part 2 (so please review that blog entry if you’ve not done so already). 

Conceptually the Account Model looks like this:

     image_thumb[4]

The Account Entity in MDS looks like this:

     image_thumb[5]

In Part 2 we assumed that all of the attribute values come from a source system.  However, in Part 3 we are changing that up a little.  We are going to say that the Account Type is maintained directly in MDS.

When MDS is the System of Record

One of the most common use cases for MDS is to augment the data which comes from your source system(s) with additional context.  This could be groupings or descriptive information not stored elsewhere.

For purposes of Part 3, the Account Type entity is maintained in MDS only.  When an attribute is maintained directly in MDS, we need to alter the process described in Part 2 just a bit to ensure the values are preserved during the import process.

In Part 2 we said all the values come from the source.  Put another way, all data is pushed to MDS.  Conceptually, that looks like this:

     image

However, if one or more values come from MDS, we need to add a step to retrieve those values.  Otherwise, we’ll lose them.  Two reasons for this:

  1. If we use an ImportType of 2, which allows updates to occur for existing records, then a null value in the staging table will overwrite an existing value.
  2. The stored procedures provided to populate an MDS Model are not parameterized to allow us to specify which attribute(s) are loaded.  Since all attributes are part of the import process, we need to make sure all data is present in staging to preserve the existing values.

Think of it this way:  We need to retrieve the values out of MDS – as if it were any other source system – to load staging before the stored procedure executes.  Conceptually, that looks like this:

     image

SSIS Package

We’re going to use the same SSIS Package from Part 2.  (Below only shows the modifications to it, so please refer back to Part 2 for any details not listed here.)

Note the same 5 steps are present.  Loading the Account Type Entity is not present in the SSIS package, because it’s maintained directly in MDS.

     image

image

Step 2 is where the important change occurs.  Within the data flow for the Account entity, we need to query MDS to pull back the existing values for Account Type.  If we skip this step and leave it null in staging, those null values will indeed overwrite the existing values because we’re using an ImportType of 2. 

How you retrieve the data out of MDS depends on your personal preference.  In this example I used the Lookup data flow transformation in SSIS.  I don’t want to do a cross-database join in my source query, so I matched up the data from MDS after the source data is in my SSIS pipeline.

     SNAGHTML14529a07

Prerequisite for this step:  A subscription view must exist for the entity to export the data out of MDS.  A subscription view is what’s used to extract data out of MDS.  Another reminder we’re not to interact directly with the MDS source tables.  In my situation, I want a subscription view for the main Account entity – this is where the accounts are actually mapped to the values.  The Account Type entity is just a distinct list of lookup values, and doesn’t give me the mappings to actual accounts.

And that’s it!  Once you know to identify the source of all MDS attributes before you begin creation of a staging package, then it’s smooth sailing from there on.

Finding More Information

Part 1 of this series (Overview of the Staging Process)

Part 2 of this series (discusses Loading a Model Where All Attributes Come From a Source System)

MSDN – Importing Data (Master Data Services)

MSDN – Exporting Data (Master Data Services)

 

Saturday
Feb162013

Importing Data Into Master Data Services 2012 โ€“ Part 2

Welcome to Part 2 of a 3-part series on using the new Entity-Based Staging Process in Master Data Services (MDS) 2012.

Part 1:  Overview of the Staging Process

Part 2:  Loading a Model Where All Attributes Come From a Source System

Part 3:  Loading a Model Where Some Attributes Are Maintained Directly in MDS

Sample Model

First, a few quick words about the sample model we’re going to use.  It’s a simple Account Model which has 3 entities.  The Account Type and Account Class entities exist solely to populate the primary Account entity.

Conceptually the Account Model looks like this:

     image

The Account Entity in MDS looks like this:

     image

The remainder of this blog entry discusses loading the Account model at the leaf (most detailed) level.  All attributes, for all entities, come from a source system.

Leaf Member Staging Tables

To support the Account model, 3 staging tables were created by MDS when the entities and attributes were set up:

     image

Recall that the Account Class and Account Type are domain-based attributes, used as follows:

  • Account Type:  Contains a distinct list of Account Type code and name – used as the lookup table for the main Account entity.
  • Account Class:  Contains a distinct list of Account Class code and name – used as the lookup table for the main Account entity.
  • Account:  Contains the accounts, plus the mapped values of Account Type and Account Class. 

For purposes of this example, we are going to assume all of the attributes in these entities come from a source system external to MDS.  (Note:  Part 3 of this series looks at how to handle it if one or more attributes are maintained directly in MDS – i.e., the data doesn’t exist in a source system anywhere.)

SSIS Package

Following is an example of the flow for the SQL Server Integration Services (SSIS) package.  Note that in a Production ETL environment a few things will be more dynamic, but the purpose of this example is to be simple & straightforward.

     image

 

image

Step 1:  Truncates each of the 3 staging tables.  This is to eliminate the data from the last time this process ran.

     image

 

imageStep 2:  Data flow to populate each entity in the model.

  image

In your source query, make sure you pull back the code field for the domain-based attributes rather than the name field.  For the Account entity, we’ll only populate Account Type and Account Class fields with the code value.  However, if we were populating the Account Class or Account Type entity, we would populate both code and name.

The derived columns include:

  • ImportType of 2 (which means updates can be accepted vs. a failure).  There are 6 codes to choose from to control the behavior.  The list of ImportType codes can be found here:  http://msdn.microsoft.com/en-us/library/ee633854.aspx
  • ImportStatus_id of 0 (which means each record is ready).
  • BatchTag which I chose to create from an ETLExecution_id (i.e., a unique code used for ETL job management) plus a string which refers to this entity.  The reason the string is appended is so that each BatchTag is unique for every data flow.  MDS will generate an error if > 1 of the same BatchTag executes at the same time.  Therefore, each of the 3 data flows in my example append a different string so they are allowed to run in parallel for speed.

     image

The data conversions are only needed because my source data was non-Unicode, and MDS is Unicode.

     image

Mapping of fields into the Leaf Member Staging Table looks like this:

     image

The Code can be ignored if you specified for the Code to be created automatically when the entity was set up (a new feature in MDS 2012).

More information about populating the Leaf Member Staging Table can be found here:  http://msdn.microsoft.com/en-us/library/ee633854.aspx 

image

Step 3:  Load the MDS Model.  With this step, we’re going to be invoking the MDS-provided stored procedure that takes the data out of staging and moves it into the actual model. 

Each staging table has its own stored procedure.  The stored procedure name will be in the format of udp_EntityName_Leaf (unless a different name was chosen when the entity was created). 

     image

Syntax is as follows:

DECLARE @RC int
DECLARE @VersionName nvarchar(50)
DECLARE @LogFlag int
DECLARE @BatchTag nvarchar(50)

SET @VersionName = N'VERSION_1'
SET @LogFlag = 1
SET @BatchTag = (? + ' Account')

EXECUTE @RC = [stg].[udp_Account_Leaf]
   @VersionName
  ,@LogFlag
  ,@BatchTag
GO


Note that the BatchTag needs to match what was specified in the Step 2 data flow.  I’ve set mine to be unique per entity so the batches can be executed in parallel.  The ? in the screen shot above signifies I’m using a parameter (which is mapped to a variable which contains the ETLExecution_id); this helps the BI team managing nightly loads know which run this was associated to.

A LogFlag of 1 specifies logging is enabled.  The VersionName also needs to be specified – a slight complication if you do a lot of versioning in your environment.

Execution of this step to load the model is what correlates to the Integration Management page in MDS:

     image

More information about loading the MDS Model can be found here: http://msdn.microsoft.com/en-us/library/hh231028.aspx

image

Step 4:  Validation of the Model.  Until validation is performed, the new inserts and updates are there but in a “waiting validation” status.  If any Business Rules have been defined for the model, they will be applied when Validation is run.

The Validation applies to the model as a whole, so it only needs to be done once regardless of how many entities we just loaded.

     image

Syntax is as follows:

DECLARE @ModelName nVarchar(50) = 'Account Model'
DECLARE @Model_id int
DECLARE @UserName nvarchar(50)=?
DECLARE @User_ID int
DECLARE @Version_ID int

SET @User_ID =  (SELECT ID 
                 FROM  mdm.tblUser u
                 WHERE u.UserName = @UserName)

SET @Model_ID = (SELECT Model_ID
                 FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                 WHERE Model_Name = @ModelName)

SET @Version_ID = (SELECT MAX(ID)
                   FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                   WHERE Model_ID = @Model_ID)

EXECUTE mdm.udpValidateModel @User_ID, @Model_ID, @Version_ID, 1

More information about validating the Model can be found here:  http://msdn.microsoft.com/en-us/library/hh231023.aspx 

Execution of this step changes the yellow question marks to green check marks:

     image

imageThe 5th and final step I have in my package is to check if any errors occurred and then alert the BI Team and/or Data Steward if needed.  This is done by querying the views provided by MDS for each entity. 

In my process, I just do a simple query that checks if 1 or more error records exist and if so, send an email.  Note the ResultSet is set to Single row.  The total count is passed to a variable.  If the value of the variable is >=1, the next step of generating an email will be kicked off.

     image

The syntax is as follows:

;WITH CTE_CheckEachTable AS
(
SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_MemberErrorDetails

UNION ALL

SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_Class_MemberErrorDetails

UNION ALL

SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_Type_MemberErrorDetails

)

SELECT Count_MDSErrors = SUM(Count_MDSErrors)
FROM CTE_CheckEachTable

More information on viewing errors can be found here:  http://msdn.microsoft.com/en-us/library/ff486990.aspx 

Out Of Scope

The above process excludes the following:

  • Changing the code for an existing record (i.e., usage of the New Code field in the staging table).
  • Deactivating or deleting an existing record (i.e., via usage of specific ImportType codes).

Finding More Information

Part 1 of this series (Overview of the Staging Process)

Part 3 of this series (discusses Loading a Model Where Some Attributes Are Maintained Directly in MDS)

 

Saturday
Feb162013

Importing Data Into Master Data Services 2012 โ€“ Part 1

Welcome to Part 1 of a 3-part series on using the new Entity-Based Staging Process in Master Data Services (MDS) 2012.

Part 1:  Overview of the Staging Process

Part 2:  Loading a Model Where All Attributes Come From a Source System

Part 3:  Loading a Model Where Some Attributes Are Maintained Directly in MDS

Introduction to the MDS Entity-Based Staging Structure

One of the new features of MDS in SQL Server 2012 is a new process for staging data to be imported into MDS.  Key things to be aware of:

  • Each entity has its own staging table in the stg schema.
  • Each entity has its own stored procedure (udp) to load data from staging into the model.
  • Within the stored procedure, you have an ImportType parameter which specifies how new and updated members are treated.  New in 2012 is the ability to update existing values, if you so choose. 
  • If more than one batch will be running at the same time, each batch needs a unique BatchTag. 
  • The model needs to be validated after the stored procedure to load the model is executed.
  • Each entity has its own view to display errors which occurred while loading the model.

The objective of this process is for us to interact with the staging table (in the stg schema), then allow the MDS-generated stored procedure interact directly with the model (in the mdm schema).

An overview of the process is as follows:

       image

Advantages of this new structure include:

  • Ability to handle updates (as well as inserts and deletes), if you choose the Import Type which permits updates.
  • Easier to understand ETL processing.
  • Much faster and efficient ETL processing.
  • Security may be set up per individual staging table, if necessary.  Permission to insert data to the staging table(s) is required.
  • Security may be set up per stored procedure, if necessary.  Permission to execute the stored procedure(s) is required.
  • Members and attributes may be loaded in single batches related to one specific entity.

Tables, Stored Procedures, and Views in the Staging Schema

For each entity, up to 3 staging tables may exist depending if consolidated members exist, and if explicit hierarchies exist.  The leaf member table will always exist once the entity has been created.  It would be very common to not use all 3 possibilities.

Note:  All names are based on the Entity name unless a different name was chosen when the entity was created.

Leaf Members

     Staging table:  stg.EntityName_Leaf

     Stored procedure to load model:  stg.udp_EntityName_Leaf

     View which displays errors: stg.viw_EntityName_MemberErrorDetails

     Details re: fields loaded to the staging table:  http://msdn.microsoft.com/en-us/library/ee633854.aspx

Consolidated Members

     Staging table:  stg.EntityName_Consolidated

      Stored procedure to load model: stg.udp_EntityName_Consolidated

     View which displays errors: stg.viw_EntityName_MemberErrorDetails

     Details re: fields loaded to the staging table:  http://msdn.microsoft.com/en-us/library/ee633772.aspx

Explicit Hierarchies

     Staging table name: stg.HierarchyName_Relationship

     Stored procedure to modify hierarchy:  stg.udp_HierarchyName_Relationship

     View which displays errors: stg.viw_HierarchyName_RelationshipErrorDetails

     Details re: fields loaded to the staging table:  http://msdn.microsoft.com/en-us/library/ee633902.aspx

Managing Import Processes

In the Integration Management section of the MDS web interface, staging batches which have completed, or are queued to run are displayed:

     image

Finding More Information

Part 2 of this series (discusses Loading a Model Where All Attributes Come From a Source System)

Part 3 of this series (discusses Loading a Model Where Some Attributes Are Maintained Directly in MDS)

MSDN – Load or Update Members in Master Data Services by Using the Staging Process

 

Saturday
Feb022013

Comparison of Master Data Services Functionality in Web Interface versus Excel Add-In

There are some differences in functionality available in the MDS web interface (known as Master Data Manager) versus the MDS Excel Add-In.  Following is a quick reference which is applicable to SQL Server 2012.

Management of Data:

Feature

Web

Interface

Excel

Add-In

Add or update members individually

Yes

Yes

Delete members individually

Yes

Yes

Add or update members in bulk

No

Yes

Delete members in bulk

Yes

Yes

Create annotations of changes

Yes

Yes

Create and manage attribute groups

Yes

No

View attribute groups

Yes

Yes

(Via filters)

View transactions

Yes

No

Filters

Yes

Yes

(Fewer options)

Match data

No

Yes

(If DQS integration

is enabled)

Combine data

No

Yes

Create shortcut query file

No

Yes

 

Model Structure:

Feature

Web

Interface

Excel

Add-In

Create model

Yes

No

Create entity

Yes

Yes

(Code and Name columns only)

Create attribute

Yes

No

Edit attribute data type or length

No

Yes

 

Business Rules:

Feature

Web

Interface

Excel

Add-In

Create and edit business rules

Yes

No

Apply business rules (validate)

Yes

Yes

 

Hierarchies:

Feature

Web

Interface

Excel

Add-In

Create explicit hierarchy

Yes

No

View explicit hierarchy

Yes

Yes

(Via filters)

Manage explicit hierarchy

Yes

No

Create derived hierarchy

Yes

No

View derived hierarchy

Yes

No

 

Collections:

Feature

Web

Interface

Excel

Add-In

Create and view collections

Yes

No

Add or update collection members

Yes

No

 

Administration:

Feature

Web

Interface

Excel

Add-In

Administrator tasks such as managing permissions, versions, subscription views, and deployment

Yes

No

 

Friday
Jan252013

New and Discontinued MDS Features in SQL Server 2012

Recently I did some research to become more familiar with the new features of Master Data Services in SQL Server 2012, as well as what’s discontinued and deprecated.  Below are my notes compiled from reading a variety of sources.

New MDS 2012 Features

New Excel Add-In

  • A new Excel Add-In is introduced, which provides an optional alternative to working in the Web Interface.  The Add-In can be downloaded here:  http://www.microsoft.com/en-gb/download/details.aspx?id=35581
  • Works on Excel 2007 forward.  (Exception:  To use the “Send Query” functionality on the Master Data ribbon, this requires Outlook 2010.)

               image

New Shortcut Query Files

  • After you have loaded MDS data into an Excel worksheet, you can use the “Save As Query” menu item on the Master Data ribbon.  This will save a Shortcut Query File to be reused.  There’s also a “Manage Queries” option to load, rename, or delete previously saved queries.
  • Shortcut Query Files have the following advantages:  (1) Saves time and improves accuracy when reloading data for a particular set of data on a regular basis, especially if you've placed filters on the set data.  (2)  Allows you to share the query with a coworker without having to email the data itself in the workbook – this is particularly helpful if it’s sensitive data.  (3) By posting the Shortcut Query File to a SharePoint Document Library, data stewards may utilize it as a starting point within their own Excel workbook.  This improves accuracy, consistency, and efficiency.
  • File format of a Shortcut Query File is *.mdsqx XML format which can be imported and exported for ease in sharing between coworkers.  If you execute the *.mdsqx file, it uses an MDSQueryOpener program (which opens Excel, runs the query, and renders the data).

               image

Redesign of the Web Interface

  • Master Data Manager now uses Silverlight 5.
  • Advantages of the redesigned interface include:  (1) It is faster.  (2) Less page refreshes are required.  (3) Adding, deleting & moving records is quicker.  (4) The layouts are more consistent from screen to screen.  
  • Additional filtering capabilities have been added to the Web Interface (note: the Web UI offers a few more filtering capabilities than the Excel Add-In).

               image

New Integration with Data Quality Services

  • Purpose of the integration with DQS is to ensure no duplicate records exist.  Integration is currently limited to this particular function.
  • This functionality works only in the Excel Add-In.  You can use the “Match Data” button on the Master Data ribbon (or if you set up a staging ETL process using SSIS).  It’s not supported in the Web Interface with this release.
  • Three requirements to use the DQS matching functionality:  (1) The integration with DQS needs to be enabled in the MDS Configuration Manager – the Data Quality section of the Master Data ribbon is not visible until enabled.  (2) A matching policy needs to exist within the DQS Knowledgebase.  (3) Both MDS and DQS_MAIN need to exist on the same SQL Server instance.

              

Restructured Entity-Based Staging Structure

  • Rather than using shared staging tables for use with ETL processing, each entity now has its own group of staging tables including stg.name_leaf, stg.name_consolidated, and stg.name_relationship.
  • Advantages of having separate staging tables per entity:  (1) The ETL runs more efficiently.  (2) Security can be set up per individual table.  (3) Members and attributes may be loaded in single batches.
  • The stored procedures which populate your model from the staging tables support inserts, updates, and deletes.  This is a HUGE improvement in the 2012 version, considering the 2008 R2 staging tables could not handle updates to existing attributes.

               image

Simplified Security Model

  • Derived hierarchies will inherit its permissions from the model.  Security for a derived hierarchy can no longer be set explicitly.
  • Explicit hierarchies will inherit its permissions from the entity.  Security for an explicit hierarchy can no longer be set explicitly.
  • Attribute Groups now have a new Attribute Group Maintenance page to assign update permissions.  Read-only permissions can no longer be set.

               image

New Deployment Command-Line Tool

  • A new command-line tool called “MDSModelDeploy” is introduced.  This allows deployment of not only the model structure, but the data as well.  (Note:  the Web Interface only supports deployment of the structure, not the data.)
  • After deployment, the following items must be manually updated:  (1) User-defined metadata, (2) File attributes,  (3) User and Group permissions.

               image

New Option to Create Code Values Automatically

  • A new option exists to have Code Values created automatically when a new member is added to an entity. In the previous version of MDS, this could be accomplished via a business rule.  However, this new option under Manage>Entities is much more convenient because it takes effect as soon as the member is created.

               image

Miscellaneous

  • Collections now may have a weight assigned to each item.
  • If you add the MDS Web Interface to a page with a SharePoint portal, you can add “&hosted=true” as a query parameter to the URL string.  This reduces the space required to display Master Data Manager.
  • Installation of Master Data Services is now part of SQL Server.
  • Master Data Services is available in the Business Intelligence or Enterprise editions.

Discontinued MDS 2012 Features

  • One side-effect of the Web Interface improvements (see the Redesign of Web Interface section above) is that batch updates are no longer supported.  Of course, doing updates one by one is not efficient if you have a large number of members to update.  Your alternatives for handling a large number of updates is the Excel Add-In or a staging ETL process.
  • The new staging process does not support maintenance of Collections, nor does the Excel Add-In.  Collections may only be maintained via the Web Interface.
  • Business Rules to generate Code Values are no longer supported.  Alternatively, you will want to use the new feature when the entity is set up (see the New Option to Create Code Values Automatically section above).
  • The ability to explicitly secure hierarchies and attribute groups has been eliminated.  (See the Simplified Security Model section above.)
  • Attribute Groups can no longer have read-only permissions (update permissions only).
  • The Web Interface no longer displays an “Export to Excel” button.  Instead of pushing the data from MDS to Excel, you may use the Excel Add-In to pull the data from MDS to Excel.
  • Users (i.e., a typical data steward) no longer have the ability to reverse their own transactions.  Reversal of an MDS transaction now requires an administrator.
  • Annotations cannot be deleted.  Rather, annotations are now retained permanently.
  • Powershell cmdlets are no longer available in this release.

Deprecated MDS 2012 Features

  • The consolidated staging table approach as used with SQL Server 2008 R2 will be removed from a future version.  This includes tblStgMember, tblStgMemberAttribute, and tblStgRelationship plus the udpStagingSweep stored procedure.  If you go with Upgrade Choice #2 (see Upgrade Considerations below), you can still use the 2008 R2 staging processes at this point in time.  However, they will be likely removed in a future release.
  • The Metadata model will likely be removed in a future release.  Although you can see it, it shouldn’t be used for anything.

Upgrade Considerations

When you are ready to upgrade MDS from SQL Server 2008 R2 to SQL Server 2012, you need to make a decision if you are ready to upgrade the database engine or not.  There are two approaches:

  1. Upgrade the Web Application and the MDS database to SQL Server 2012.
  2. Upgrade the Web Application, but leave the MDS database at SQL Server 2008 R2.  With this choice, the database schema is updated to support new features, and you can use the new Web Interface.  You may continue using the 2008 R2 processes (such as the consolidated staging format, or the management of Collections via the staging table structure) as-is until you are ready to redesign them.  A downside to this approach is that you don’t see all staging processes displayed within the Web Interface.  It does buy you some time to redesign though, which is helpful if you have a lot of MDS SSIS packages.

Finding More Information

MSDN – SQL Server Master Data Services

TechNet – What’s New in Master Data Services in SQL Server 2012 RC0

MSDN – Upgrade Master Data Services

MSDN – Discontinued Master Data Services Features in SQL Server 2012

Jeremy Kashel’s Blog – Master Data Services SQL Server 2012 Vs 2008 R2

Ross Mistry and Stacia Misner - Introducing Microsoft SQL Server 2012

 

Sunday
Dec232012

Connecting to MDS from the Excel Add-In

A new feature of Master Data Services (MDS) in SQL Server 2012 is the MDS Excel add-in.  Much of the data steward's work can be performed in Excel rather than the web interface, if desired.  One situation where you definitely want to use Excel:  if a lot of members need to be added or changed, the Excel add-in is much more efficient (the 2012 web interface only does one at a time).

After the Excel Add-In is installed, the first bit of housekeeping is creation of a connection to the MDS Server.  This is done using the first item in the ribbon:

image

You might be inclined to copy and paste the URL from the Master Data Manager home page – but this does not work because it's pointing to a web page.  The Excel connection wants the actual server.  Here's the message received when using the wrong connection:

The connection failed because the URL is not valid or the Master Data Manager web application version is not supported.

image

The correct thing to do is remove /default.aspx from the end of the URL.  In the screen shot below, you’d want to use the yellow highlighted portion which is http://mds.contoso.com:90

image

Within Excel, you need to create the new connection before you can test that it works.

image

You know you’ve made a successful connection when you see the Master Data Explorer pane on the right.

image