Append vs. Join?
Hi Datawatchers -
I'm fairly new to Monarch and the DPS, but so far I'm really enjoying it. That said, can anyone explain (in as simple terms as possible) the difference between creating an append versus a join in Data Prep Studio? I'm mostly confused with how the append function works, as well as how this is different than creating an append template in report design. If able, can you please provide a scenario where an append would be more appropriate than join? Any insight/explanation is greatly appreciated!
CD
Answers
-
Chris,
The two "Appends" are for entirely different purposes.
In Classic terms an Append is a data element that can be associated with one or more detail records.
So, for example, in an invoice the detail lines would be the line items listing what was purchased and an append (data added to be available to every detail line) might be the Invoice header details such as Invoice Number, Data, Due date or whatever information is felt to be necessary.
The Append feature in DPS is more like Concatenation if you are looking for a simple comparison.
Basic concatenation would allow the creation of a single file from multiple files.
If all source files are of the same information structure then you create a bigger report or a larger database (if a csv file for example.) However the expectation it that no new FIELDS are added to the original table which is to host the new appended data. Just more rows populating some or all of the existing fields.
In certain circumstances a "smart" concatenation tool, as in DPS Append functionality, allows inputs from data with basically the same data fields but some differences in which fields are available, to be treated as is that are in fact "same enough" to be concatenated into the same structure. There is a bit more to it than that but to keep it simple that is a starting point.
A Join, on the other hand, is usually the combining of data form two different source related but dissimilar sources to create a new data table. It makes use of some common values present in fields in both data sources to create a new, expanded, data table. In effect each record will be expanded to include additional fields.
In some rare situations there may just seem to be overlapping options for whether to use an Append (for example on a predefined table to be populated from several source files that may not all be consistent to each other) or a Join (probably followed by some new field calculation to homogenize the enhanced data set) but mostly it should relatively easy to decide whether DPS Append or Join is the way to go for a specific need.
HTH.
Grant
0