Data Analytics - Ask The Expert Recurring Series: Monarch & Monarch Server (Automator) - Basic Process Input and Output with Distribution in Monarch Server v.2020/v.2021

Rebecca_Cronin
Altair Employee
edited July 2 in Altair RapidMiner

Ask The Expert Series on February 8, 2022 
Monarch & Monarch Server (Automator)


The session opened with a brief overview of Basic Process with Input and Output Distribution in Monarch Server v.2020/v.2021 before moving on to the live questions.

Our Expert reviewed the following:

  • The session began with an answer about a bug discovered in Export File by ALL Filters (bursting summary) within Monarch Complete (DPS Mode): the Browse Folder option cannot be selected. This is to be fixed in the next release, v.2021.2. The issue is not experienced in Classic Mode. (Review the first 6 minutes of the session.)
  • Basic Process with Input and Output Distribution in Monarch Server v.2020/v.2021 (reviewed approximately 10 minutes after the session started):
    • Reviewed Server Library
    • Standard Processes - review Projects (13 mins-26 mins)
    • Visual Process w/Model - reviewing Input/Output (26-33 minute time frame)
    • Visual Process w/Workspace - reviewed Input/Output; the Workspace MUST BE in the Server Library (33-minute time frame)

LIVE Questions submitted:

Monarch: 
1. How does DPS help with fuzzy matches between two files or tables, e.g., AR Aging and AP Aging, when looking for AR contra account names?

  • During a join operation, Monarch Data Prep Studio allows for potential spelling errors that would otherwise cause a mismatch even when the keys are highly similar (e.g., "bond" vs. "bund"). Fuzzy matching is accomplished in two steps, Qualification and Refinement, which are completed in sequence:
    • Step 1 (Qualification): Monarch Data Prep Studio computes a "Phonetic Key" for each key using an algorithm that produces matches in a "sounds-like" manner.
    • Step 2 (Refinement): Once a phonetic match is achieved, an "Edit Distance" (i.e., tolerance) is calculated. The edit distance is defined as the minimum number of keystrokes required to make the two keys equal (maximum of 20 strokes).
  • If two keys are not phonetically similar (Step 1), Step 2 is not performed. Because "Bills" sounds very similar to "Bulls," the rows containing these values are matched, and the percent match of these keys is considered 95%. However, because "Patriots" is phonetically dissimilar to "Pastriots," the rows containing these values are not matched, even though only a very small number of strokes is needed to make them equal (Step 2). The same is true for rows containing "Dolphins" and "Dolphons."
  • Tables obtained through fuzzy matching are added to the table selector as usual and may be prepped as necessary.
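
The two-step idea described above (phonetic qualification, then edit-distance refinement) can be sketched roughly in Python. This is only an illustration, not Monarch's implementation: the session does not name the phonetic algorithm, so classic Soundex is used here as an assumed stand-in, Levenshtein distance is assumed for the "Edit Distance," and the function names (`soundex`, `edit_distance`, `fuzzy_match`) are hypothetical. Because the real phonetic algorithm may differ, results for other key pairs will not necessarily agree with Data Prep Studio.

```python
# Illustrative sketch only: Soundex and Levenshtein are assumed stand-ins,
# not the algorithms Monarch Data Prep Studio actually uses.

_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return a 4-character Soundex key (assumed phonetic key)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (key + "000")[:4]


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def fuzzy_match(a, b, max_distance=20):
    """Step 1: phonetic keys must agree. Step 2: edit distance within tolerance."""
    if soundex(a) != soundex(b):
        return False, None  # Step 2 is skipped when Step 1 fails
    distance = edit_distance(a.lower(), b.lower())
    return distance <= max_distance, distance


print(fuzzy_match("Bills", "Bulls"))         # (True, 1)
print(fuzzy_match("Patriots", "Pastriots"))  # (False, None)
```

In this sketch, "Bills" and "Bulls" share a phonetic key and differ by one keystroke, while "Patriots" and "Pastriots" are rejected at Step 1 even though they are also only one keystroke apart. The percent-match figure mentioned above is not computed here because its formula is not given.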

Monarch Server: 

1. Is there a way to organize my long list of Processes into a folder structure?

  • No, this is not currently possible.

Additional Resources:

Adding a Process

Distributions

Processes

Using Fuzzy Joins