Source to target mapping document tool

Let me rephrase "simple mappings" as below. A mapping is simple if there are no shortcuts such as session overrides; the Source Qualifier (SQ) columns and the SQ query match each other; and there are no column aliases in the SQ, meaning you use the source column names and rename them only in Expression transformations. If a mapping uses Mapplets, you need to report each Mapplet separately. If you maintain all coding standards then you can go for it; if not, it is a waste of time. I hope I have recollected everything; I last worked on this about five years ago, on version 8. Thanks for your response.
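As a rough illustration of how such checks could be automated, here is a minimal Python sketch that flags mappings failing the "simple mapping" criteria above. The mapping metadata structure (session_overrides, sq_columns, and so on) is an assumption for illustration, not an Informatica API; in practice you would derive it from an exported mapping XML or the repository views.

# Hypothetical mapping metadata; field names are assumptions, not an Informatica API.
def is_simple_mapping(mapping: dict) -> bool:
    """Return True if the mapping meets the 'simple mapping' criteria above."""
    if mapping.get("session_overrides"):        # no shortcuts such as session overrides
        return False
    sq_cols = set(mapping.get("sq_columns", []))
    query_cols = set(mapping.get("sq_query_columns", []))
    if sq_cols != query_cols:                   # SQ columns and SQ query must match
        return False
    if mapping.get("sq_aliases"):               # no column aliases in the SQ
        return False
    if mapping.get("mapplets"):                 # Mapplets must be reported separately
        return False
    return True

example = {
    "session_overrides": [],
    "sq_columns": ["CUST_ID", "ACCT_TYPE"],
    "sq_query_columns": ["CUST_ID", "ACCT_TYPE"],
    "sq_aliases": [],
    "mapplets": [],
}
print(is_simple_mapping(example))  # True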

We do have all of the points that you mention above in most of our mappings. Metadata Manager (MM) has been designed specifically to deliver this "data lineage" information: you can retrieve the complete lineage from source attributes to target values using MM once you have imported the respective PowerCenter repositories. MM is licensed as part of certain PowerCenter editions, so please consult your Informatica sales representative for more information and perhaps a demo or a proof of concept.
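To make the idea of lineage concrete, here is a generic, hand-rolled sketch (it is not Metadata Manager's API) that walks a hypothetical list of upstream-to-downstream column edges from a source attribute to the target values it feeds.

# Hypothetical lineage edges (upstream column -> downstream column); illustration only.
from collections import defaultdict

edges = [
    ("SRC.CUSTOMER.ACCT_TYPE", "STG.CUSTOMER.ACCT_TYPE"),
    ("STG.CUSTOMER.ACCT_TYPE", "EXP_CLEANSE.ACCT_TYPE_OUT"),
    ("EXP_CLEANSE.ACCT_TYPE_OUT", "DW.DIM_ACCOUNT.ACCOUNT_TYPE"),
]

downstream = defaultdict(list)
for up, down in edges:
    downstream[up].append(down)

def lineage(attr, path=()):
    """Yield every path from attr down to a target column."""
    path = path + (attr,)
    if not downstream[attr]:          # no further hops: this is a target column
        yield path
        return
    for nxt in downstream[attr]:
        yield from lineage(nxt, path)

for p in lineage("SRC.CUSTOMER.ACCT_TYPE"):
    print(" -> ".join(p))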

Once the source-to-target mapping document has been completed, the ETL lead developer, with the help of the database administrator and the data quality analyst, must design the ETL process flow, as illustrated by the example in Figure 9. The purpose of the ETL process flow diagram is to show the process dependencies between all extracts, sorting and merging utilities, transformations, temporary work files or tables, error-handling processes, reconciliation activities, and the load sequence.
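One lightweight way to capture those dependencies and derive the load sequence, sketched below under assumed step names (a real flow would come from the diagram itself), is to model the process flow as a dependency graph and topologically sort it.

# Sketch: the ETL process flow as a dependency graph; step names are illustrative only.
from graphlib import TopologicalSorter  # Python 3.9+

flow = {  # step -> set of steps it depends on
    "extract_customers": set(),
    "extract_accounts": set(),
    "sort_and_merge": {"extract_customers", "extract_accounts"},
    "common_transforms": {"sort_and_merge"},
    "error_handling": {"common_transforms"},
    "reconciliation": {"error_handling"},
    "load_target": {"reconciliation"},
}

print(list(TopologicalSorter(flow).static_order()))
# A valid run sequence, e.g. ['extract_accounts', 'extract_customers', 'sort_and_merge', ...]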

Extracts: There may be operational interdependencies among several source files and source databases from which data is being extracted. These interdependencies have to be understood because they may affect the timing and the sequence of running the ETL extract programs.

Sorting and merging utilities: Almost every step requires the extracted data to be sorted in a particular fashion so that it can be merged before further processing can occur. Sorting can also greatly improve load performance.

Transformations: Most of the data has to be transformed for a variety of reasons. It is important to examine the most opportune times to perform the transformations. Therefore, transformations applicable to all source data, such as data type and code conversions or data domain enforcements, should be performed early in the ETL process flow. Transformations specific to a target database, such as aggregation and summarization for a specific data mart, should occur toward the end of the ETL process flow.
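As a small sketch of both ideas (the data, field names, and code table below are assumptions), the following sorts two extracts on a shared key, merges them, and applies a code conversion as an early, common transformation.

# Sort two extracts on a shared key, merge them, and apply a common code conversion early.
from heapq import merge
from operator import itemgetter

customers = [{"cust_id": 2, "status": "A"}, {"cust_id": 1, "status": "I"}]
accounts = [{"cust_id": 1, "status": "A"}, {"cust_id": 3, "status": "C"}]

STATUS_CODES = {"A": "ACTIVE", "I": "INACTIVE", "C": "CLOSED"}

def common_transform(rec):
    # A transformation that applies to all source data, so it runs early in the flow.
    return dict(rec, status=STATUS_CODES.get(rec["status"], "UNKNOWN"))

sorted_customers = sorted(customers, key=itemgetter("cust_id"))
sorted_accounts = sorted(accounts, key=itemgetter("cust_id"))

for rec in merge(sorted_customers, sorted_accounts, key=itemgetter("cust_id")):
    print(common_transform(rec))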

Temporary work files or tables: Sorting, merging, and transforming require a lot of temporary storage space to hold the interim results. These temporary work files and tables can be as large as or larger than the original extracts. Furthermore, these temporary work files and tables are not really "temporary."

Error-handling processes: During the ETL process many errors are detected as the data-cleansing specifications are applied to the source data. If error reports are created or erroneous records are rejected into a suspense file, they should be shown on the ETL process flow diagram.

Reconciliation activities: Every program module that manipulates data should produce reconciliation totals. This can be in the form of input and output record counts, specific domain counts, or amount counts. Record counts are sufficient for extract, sort, and merge modules. Domain counts are appropriate for more complicated transformation specifications, such as separating data values from one source data element into multiple target columns.
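A minimal sketch of such reconciliation totals, assuming illustrative field names, might compute input and output record counts plus a domain count on one attribute.

# Reconciliation totals: input/output record counts and a domain count; names are illustrative.
from collections import Counter

def reconcile(input_rows, output_rows, domain_field="account_type"):
    return {
        "input_count": len(input_rows),
        "output_count": len(output_rows),
        "rejected_count": len(input_rows) - len(output_rows),
        "domain_counts": Counter(r[domain_field] for r in output_rows),
    }

extracted = [{"account_type": "SAVINGS"}, {"account_type": "CHECKING"},
             {"account_type": "SAVINGS"}]
loaded = [{"account_type": "SAVINGS"}, {"account_type": "SAVINGS"}]

print(reconcile(extracted, loaded))
# {'input_count': 3, 'output_count': 2, 'rejected_count': 1, 'domain_counts': Counter({'SAVINGS': 2})}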

This extraction filters only customers' savings accounts, so I consider this my data extraction criteria.

Filter Criteria — After we have extracted the data from the source, we may need to filter it for one or more target tables, and that filtering is covered in this space.
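For example, a minimal sketch of that extraction criterion (column and value names are assumptions) is simply a filter on the account type.

# Keep only savings accounts; column names and values are assumptions for illustration.
rows = [
    {"cust_id": 1, "acct_type": "SAVINGS", "balance": 1200.0},
    {"cust_id": 2, "acct_type": "CURRENT", "balance": 300.0},
    {"cust_id": 3, "acct_type": "SAVINGS", "balance": 50.0},
]

savings_only = [r for r in rows if r["acct_type"] == "SAVINGS"]
print(savings_only)  # records for cust_id 1 and 3 only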

Target Database — The target database name or names are mentioned in this space.

Target Column — All the target columns from the respective target database are mentioned in this space; these columns undergo the transformations if required. Key column means the column that makes the record unique, and Value means what makes the record time-variant.

Comments and Version Changes — This space explains what was in the S2T before and what has changed now.
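One possible way (an assumption rather than a standard layout) to represent a single S2T row so that the Key/Value distinction and the version comments stay explicit:

# Hypothetical representation of one S2T mapping row; field names are assumptions.
from dataclasses import dataclass

@dataclass
class S2TRow:
    source_column: str
    target_database: str
    target_table: str
    target_column: str
    is_key: bool           # makes the record unique
    is_value: bool         # makes the record time-variant
    transformation: str    # rule the target column undergoes, if any
    comments: str = ""     # what changed between S2T versions, and why

row = S2TRow(
    source_column="STG_ACCOUNT.ACCT_TYPE",
    target_database="DW",
    target_table="DIM_ACCOUNT",
    target_column="ACCOUNT_TYPE",
    is_key=False,
    is_value=True,
    transformation="decode code to description",
    comments="v1.1: column renamed from ACCT_TYPE_DESC",
)
print(row.target_column, "key" if row.is_key else "value")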

Comments tell us more about the transformation. As soon as you get the S2T, query your staging source tables and check that the data you have received satisfies your transformation rules. So, to achieve good quality of testing, we are always interested in UAT data. Example: suppose the data modeler marked a column as a Key column, which should not change across consecutive loads; a minimal check along these lines is sketched below.
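Here is that sketch, assuming illustrative column names: it compares the flagged column for the same business key across two consecutive staging loads.

# Verify that a Key column does not change for the same business key across two loads.
def key_column_changes(previous_load, current_load, business_key, key_column):
    prev = {r[business_key]: r[key_column] for r in previous_load}
    changed = []
    for r in current_load:
        bk = r[business_key]
        if bk in prev and prev[bk] != r[key_column]:
            changed.append((bk, prev[bk], r[key_column]))
    return changed

load_1 = [{"cust_id": 1, "acct_no": "A-100"}, {"cust_id": 2, "acct_no": "A-200"}]
load_2 = [{"cust_id": 1, "acct_no": "A-100"}, {"cust_id": 2, "acct_no": "A-999"}]

print(key_column_changes(load_1, load_2, "cust_id", "acct_no"))
# [(2, 'A-200', 'A-999')] -> raise this with the data modeler and the source system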

But if you notice that the column's values are changing in the source on consecutive loads, then you should notify the data modeler and the source system team; one of the two needs to be corrected, and the owning team has to make the change on their end. In such a case, we might look for the record in the target when it has Unknown values.

The article is very helpful and informative, especially the breakdown of each of the critical components needed to complete a successful S2T data mapping exercise.

I have often found data professionals frustrated with the following when it comes to the data mapping exercise: 1) Excel is not scalable — it is painful for version control, scattered copies, team collaboration, etc. Although Excel is extensively used in the data mapping phase, it is prone to errors.
