#
Transformer Reference
Transformers extract and convert values, which can then be used by matchers. For a minimal working configuration you need at least one
#
Convert List
The convert list transformation accepts a list of values as an input, applies the selected operation and returns a modified list.
This transformer supports a dynamic number of inputs. For each input exactly one output will be available. Each input is independent of the other inputs.
The following operations are available:
#
Extract
The extract operation selects one or multiple sub values from each entry in the list using the provided path(s). If multiple values were selected, then they will be joined as a single text, optionally separated by the provided separator.
For a detailed description about the path syntax, please refer to the
#
Example with One Path
- Path: aValue
- Separator: ignored
- Must Exist: disabled
[
{
"aValue": "A1",
"bValue": "B1"
},
{
"aValue": "A2"
},
{
"bValue": "B3"
},
{
"aValue": "A4",
"bValue": "B4"
}
]
[
"A1",
"A2",
null,
"A4"
]
#
Example with Two Paths
- Paths: aValue and bValue
- Separator: ,
- Must Exist: disabled
[
{
"aValue": "A1",
"bValue": "B1"
},
{
"aValue": "A2"
},
{
"bValue": "B3"
},
{
"aValue": "A4",
"bValue": "B4"
}
]
[
"A1,B1",
"A2",
"B3",
"A4,B4"
]
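The two examples above can be reproduced with a minimal Python sketch. The function name `extract_from_list` is hypothetical, the sketch only handles top-level field names rather than the full path syntax, and returning null when no path matches is an assumption taken from the first example.

```python
def extract_from_list(entries, paths, separator=""):
    """Select the given top-level fields from each entry (illustrative only)."""
    result = []
    for entry in entries:
        # Collect the values of all paths that exist in this entry.
        found = [entry[path] for path in paths if path in entry]
        if not found:
            result.append(None)  # assumption: nothing selected yields null
        elif len(found) == 1:
            result.append(found[0])
        else:
            # Multiple values are joined into a single text.
            result.append(separator.join(str(value) for value in found))
    return result


entries = [
    {"aValue": "A1", "bValue": "B1"},
    {"aValue": "A2"},
    {"bValue": "B3"},
    {"aValue": "A4", "bValue": "B4"},
]
print(extract_from_list(entries, ["aValue"]))
print(extract_from_list(entries, ["aValue", "bValue"], ","))
```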
#
Make Unique
Make unique removes duplicate entries from the list. Two entries are considered duplicates if their type and value are equal.
#
Example:
[
"A",
"A",
1234,
"B",
1234,
"1234"
]
[
"A",
1234,
"B",
"1234"
]
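A minimal Python sketch of the "type and value" rule, assuming hashable entries; `make_unique` is a hypothetical name. Keying on the type as well as the value keeps `1234` and `"1234"` apart, and in Python it also keeps `1`, `1.0` and `True` from collapsing into one entry.

```python
def make_unique(values):
    """Drop later duplicates, comparing both type and value (illustrative only)."""
    seen = set()
    result = []
    for value in values:
        key = (type(value), value)  # same value with a different type is not a duplicate
        if key not in seen:
            seen.add(key)
            result.append(value)
    return result


print(make_unique(["A", "A", 1234, "B", 1234, "1234"]))
```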
#
Slice
Slice returns a part of the provided list. If the offset is larger than the length of the provided list, then an empty list will be returned. If offset plus limit is larger than the provided list, then only the remaining part will be returned. The first element in the list is at offset 0.
#
Example:
- Offset: 1
- Limit: 2
[
"A",
"B",
"C",
"D",
"E"
]
[
"B",
"C"
]
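The offset and limit rules map directly onto Python slicing, which already returns an empty list for an offset past the end and truncates when the limit overshoots. This is only a sketch; `slice_list` is a hypothetical name.

```python
def slice_list(values, offset, limit):
    """Return up to `limit` entries starting at the zero-based `offset`."""
    return values[offset:offset + limit]


letters = ["A", "B", "C", "D", "E"]
print(slice_list(letters, 1, 2))   # the example above
print(slice_list(letters, 10, 2))  # offset beyond the end of the list
print(slice_list(letters, 3, 5))   # offset plus limit beyond the end
```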
#
Sort
Sort orders the list's entries in ascending or descending order.
When sorting entries of different data types in ascending order, numeric values come first, followed by texts, other values and null values. Sorting in descending order reverses this. The order among non-numeric and non-text values is undefined.
#
Example
- Ascending: enabled
[
"A",
2,
1.5,
"C",
"B",
"1"
]
[
1.5,
2,
"1",
"A",
"B",
"C"
]
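One way to picture the mixed-type ordering is a sort key that ranks the type first and the value second. The sketch below is an assumption about how such an ordering could be written in Python, not the actual implementation; `sort_key` and `sort_list` are hypothetical names, and booleans are treated as "other values".

```python
def sort_key(value):
    """Rank numbers first, then texts, then other values, then null."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return (0, value, "")
    if isinstance(value, str):
        return (1, 0, value)
    if value is None:
        return (3, 0, "")
    return (2, 0, "")  # order among other values is left undefined


def sort_list(values, ascending=True):
    return sorted(values, key=sort_key, reverse=not ascending)


print(sort_list(["A", 2, 1.5, "C", "B", "1"]))
```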
#
Verify Size
Verify size ensures that the provided list's size is within the configured boundaries.
The maximum or minimum size check can be disabled by providing -1. If the size requirements are not met, an error is returned; otherwise the original list is returned.
#
Example Outside Bounds
- Minimum Size: 3
- Maximum Size: 10
[
"A",
"B"
]
(error)
#
Example Inside Bounds
- Minimum Size: 3
- Maximum Size: 10
[
"A",
"B",
"C"
]
[
"A",
"B",
"C"
]
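The boundary check, including the -1 convention for disabling a bound, can be sketched as follows. `verify_size` is a hypothetical name, and raising `ValueError` stands in for the returned error.

```python
def verify_size(values, minimum=-1, maximum=-1):
    """Return the list unchanged if its size is within bounds, else raise."""
    size = len(values)
    if minimum != -1 and size < minimum:
        raise ValueError(f"list has {size} entries, expected at least {minimum}")
    if maximum != -1 and size > maximum:
        raise ValueError(f"list has {size} entries, expected at most {maximum}")
    return values


print(verify_size(["A", "B", "C"], minimum=3, maximum=10))
```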
#
Convert Text
The convert text transformation accepts a text, applies the selected operation and returns the modified text.
This transformer supports a dynamic number of inputs. For each input exactly one output will be available. Each input is independent of the other inputs.
The following operations are available:
#
Hash Value
Hash value applies one of the provided hash functions on the text input and returns the hash in hex format.
#
Example
- Hash Function: MD5
"Tilores"
"ef8394d0e4896d07e70e4106df2bf560"
#
Keep Only Numbers
Keep only numbers removes all non-number characters.
#
Example
"T1L0R35"
"1035"
#
Normalize Diacritical Characters
Normalize diacritical characters replaces characters with diacritics, such as German umlauts, with their base character.
#
Example
"än éᶍample"
"an example"
#
Normalize White-Spaces
Normalize white-spaces collapses consecutive horizontal white-spaces, such as space characters or tabs, into a single space. White-spaces at the beginning or the end of the text will be removed completely.
#
Example
" an Examp le "
"an Examp le"
#
Remove All Numbers
Remove all numbers removes all number characters.
#
Example
"T1L0R35"
"TLR"
#
Remove Common Names
Remove common names replaces the text with an empty text if it is in the selected preset name list.
#
Example with Common Name
- Preset: First Names (US)
- Number of Top Most Common Names to Ignore: 20
"michael"
""
#
Example with Uncommon Name
- Preset: First Names (US)
- Number of Top Most Common Names to Ignore: 20
"tilo"
"tilo"
#
Remove Spaces
Remove spaces removes all spaces from the provided text.
#
Example
" an Examp le "
"anExample"
#
Replace Text
Replace text searches for the provided text and replaces all occurrences with the new value. If the search value was not found, then the original text is returned.
#
Example with Simple Replacement
- Search:
old
- Replace With:
new
- Use Regular Expression: disabled
"old town with old church"
"new town with new church"
#
Example with Regular Expression
- Search:
^(.*), (.*)$
- Replace With:
new: $2 $1
- Use Regular Expression: enabled
"Smith, John"
"new: John Smith"
#
Example with Non-Matching Regular Expression
- Search:
^(.*), (.*)$
- Replace With:
new: $2 $1
- Use Regular Expression: enabled
"John Smith"
"John Smith"
#
Use Substring
Use substring returns the first characters of the provided text, up to the configured length.
#
Example
- Length: 4
"Tilores"
"Tilo"
#
Extract
The extract transformation selects a value from the input using the provided path.
The transformer supports exactly one input, but provides a dynamic number of outputs (each containing the same value).
The extract can work in either the simple path mode or in a jq-like path syntax.
For the jq-like path syntax, please refer to the official jq documentation. Please note, that not all jq features might be available. Furthermore, please note, that using the jq-like syntax might have a negative impact on the performance and should be avoided if possible.
In simple path mode, the field names and list indexes are separated by a single dot (.).
#
Simple Path Mode Examples
- Path: firstName
- Case Sensitive: disabled
- Must Exist: disabled
{
"firstName": "John",
"lastName": "Smith"
}
"john"
- Path: name.first
- Case Sensitive: disabled
- Must Exist: disabled
{
"name": {
"first": "John",
"last": "Smith"
}
}
"john"
- Path: names.1.first
- Case Sensitive: disabled
- Must Exist: disabled
{
"names": [
{
"first": "Jane",
"last": "Doe"
},
{
"first": "John",
"last": "Smith"
}
]
}
"john"
- Path: name
- Case Sensitive: disabled
- Must Exist: disabled
{
"name": {
"first": "John",
"last": "Smith"
}
}
{
"first": "john",
"last": "smith"
}
- Path:
- Case Sensitive: disabled
- Must Exist: disabled
"John"
"john"
- Path: firstName
- Case Sensitive: disabled
- Must Exist: disabled
{
"firstName": "",
"lastName": "Smith"
}
null
- Path: firstName
- Case Sensitive: disabled
- Must Exist: disabled
{
"lastName": "Smith"
}
null
- Path: firstName
- Case Sensitive: disabled
- Must Exist: enabled
{
"firstName": "",
"lastName": "Smith"
}
(error)
- Path: firstName
- Case Sensitive: enabled
- Must Exist: disabled
{
"firstName": "John",
"lastName": "Smith"
}
"John"
#
Filter In/Out
The filter in/out transformer removes or keeps only certain values.
This transformer supports exactly one input and output.
#
Examples
- Filter Out: disabled
- List of Values to Filter: ["A", "B"]
"A"
"A"
- Filter Out: disabled
- List of Values to Filter: ["A", "B"]
"C"
""
- Filter Out: enabled
- List of Values to Filter: ["A", "B"]
"A"
""
- Filter Out: enabled
- List of Values to Filter: ["A", "B"]
"C"
"C"
#
Flip Values
The flip values transformation accepts two inputs and flips them during indexing.
This is an easy way to compare values from input a with values from input b during matching or searching.
This transformer supports exactly two inputs and two outputs.
#
Example
- Phase: Indexing
"John"
"Jane"
"Jane"
"John"
- Phase: Linking or Searching
"John"
"Jane"
"John"
"Jane"
#
Fork
The fork transformer can be used for simple and complex branching, including IF and SWITCH statements or cloning values for different results.
The transformer supports a dynamic number of inputs and will create exactly the same number of outputs for each configured condition, e.g. two inputs and three conditions will result in six outputs.
The fork strategy will define whether all conditions will be checked or if the check stops after the first satisfied condition (treating all other conditions as if they were not satisfied).
#
Conditions
The output for each condition will contain a value if the condition is satisfied, otherwise its outputs will all be null.
Each condition must have one of the following condition types.
#
Equal Values
Equal values will be satisfied if the input value equals the provided static value or another input. Whether two inputs or one input and a static value are compared can be toggled using the "compare with other input" checkbox.
All equality checks are case sensitive. If the data was extracted originally
without the case sensitive option in the
#
Always True
This condition will always be satisfied. Use this for cloning values or creating an else or default branch when setting up an IF or SWITCH statement.
#
Match Regular Expression
This will be satisfied if the input matches the provided regular expression.
All equality checks are case sensitive. If the data was extracted originally
without the case sensitive option in the
#
Rule Set Type
This will be satisfied if the current phase is one of the selected phases. Use with caution as this might lead to unexpected, but correct results.
#
Example for Cloning Values
When working with the same data and applying two or more different transformations
afterwards an easy way would be to use multiple outputs from one
- 1 Input
- Fork Strategy: All Satisfied Conditions
- Condition 1:
  - Condition Type: Always True
- Condition 2:
  - Condition Type: Always True
"John"
"John"
"John"
- 2 Inputs
- Fork Strategy: All Satisfied Conditions
- Condition 1:
  - Condition Type: Always True
- Condition 2:
  - Condition Type: Always True
"John"
"Smith"
"John"
"Smith"
"John"
"Smith"
#
Example for IF Statement
A simple IF statement (allowing a value to pass if a condition is satisfied) is also possible.
- 2 Inputs
- Fork Strategy: First Satisfied Condition
- Condition 1:
  - Condition Type: Equal Values
  - Value From Input: 1
  - Compare With Other Input: disabled
  - Equals Value: John
With matching condition:
"John"
"Smith"
"John"
"Smith"
Without matching condition:
"Jim"
"Smith"
null
null
#
Example for IF-ELSE Statement
This is an extension of an IF statement with an else branch.
- 2 Inputs
- Fork Strategy: First Satisfied Condition
- Condition 1:
  - Condition Type: Equal Values
  - Value From Input: 1
  - Compare With Other Input: disabled
  - Equals Value: John
- Condition 2:
  - Condition Type: Always True
With matching condition:
"John"
"Smith"
"John"
"Smith"
null
null
Without matching condition:
"Jim"
"Smith"
null
null
"Jim"
"Smith"
#
Example for Comparing Inputs
This is similar to the simple IF statement, with the difference that two inputs are compared.
- 2 Inputs
- Fork Strategy: First Satisfied Condition
- Condition 1:
  - Condition Type: Equal Values
  - Value From Input: 1
  - Compare With Other Input: enabled
  - Equals Value from Input: 2
With matching condition:
"John"
"John"
"John"
"John"
Without matching condition:
"Jim"
"John"
null
null
#
Example for SWITCH Statement
This is an example for a SWITCH statement where multiple conditions can be satisfied. For simplicity this example only uses one input.
- 1 Input
- Fork Strategy: All Satisfied Conditions
- Condition #1:
  - Condition Type: Equal Values
  - Value From Input: 1
  - Compare With Other Input: disabled
  - Equals Value: John
- Condition #2:
  - Condition Type: Equal Values
  - Value From Input: 1
  - Compare With Other Input: disabled
  - Equals Value: Jim
- Condition #3:
  - Condition Type: Match Regular Expression
  - Value From Input: 1
  - Matches With Regular Expression: J.* (meaning: must start with J)
Conditions #1 and #3 satisfied:
"John"
"John"
null
"John"
Conditions #2 and #3 satisfied:
"Jim"
null
"Jim"
"Jim"
Condition #3 satisfied:
"Jane"
null
null
"Jane"
#
Generate Value
The generate value transformation creates new values that can be used within the whole transformation process. This may be useful for fallback/default values.
This transformer supports exactly one input and output.
The following generation strategies are available:
#
Static Text
This strategy returns the configured value at the output.
#
Example
- Value: John Smith
(anything)
"John Smith"
#
Random Number
This strategy returns a random number. Both bounds are inclusive, so the lowest and highest numbers are also possible return values.
#
Example
- Lowest Number: 0
- Highest Number: 10
(anything)
3
#
Number Range
This strategy returns a list with all the numbers between the lowest and highest number (including both values).
#
Example
- Lowest Number: 0
- Highest Number: 10
(anything)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
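Both numeric strategies use inclusive bounds, which is worth spelling out because Python's `range` excludes its upper bound while `random.randint` includes it. A minimal sketch with hypothetical function names:

```python
import random


def number_range(lowest, highest):
    """All whole numbers from lowest to highest, both included."""
    return list(range(lowest, highest + 1))


def random_number(lowest, highest):
    """A random whole number; lowest and highest are possible results."""
    return random.randint(lowest, highest)


print(number_range(0, 10))
print(random_number(0, 10))
```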
#
Join
The join transformer can be used for merging multiple branches into a single
branch. This is e.g. helpful after a
The join supports a dynamic number of outputs and by default one input for each output. The number of expected inputs can be changed by modifying the inputs per output configuration.
The join strategy will define how the data is processed.
#
First Non-Empty Value
This will return the first non-empty value from the inputs with the same name as the output.
#
Example
- 2 Outputs
- Inputs Per Output: 2
"John"
null
"J."
"Smith"
"John"
"Smith"
#
Concatenate Texts
This will concatenate the texts with the same input name into the corresponding output, optionally separated by the provided separator. Empty values (null) and empty texts will be ignored.
If an input contains a string list, then each element will be concatenated.
#
Example with String Inputs
- 2 Outputs
- Inputs Per Output: 2
"John"
null
"J."
"Smith"
"John,J."
"Smith"
#
Example with String Lists
- 1 Output
- Inputs Per Output: 1
["John", "", "Jim", null, "Jane"]
"John,Jim,Jane"
#
Merge Into Array
This will merge the values with the same input name into a list for the corresponding output. Empty values (null) will be ignored.
- 2 Outputs
- Inputs Per Output: 2
"John"
null
"J."
"Smith"
["John", "J."]
["Smith"]
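The three join strategies differ only in how they combine the inputs that belong to one output. The Python sketch below works on one such group of inputs at a time; the function names are hypothetical, and how inputs are assigned to outputs is deliberately left out.

```python
def first_non_empty(values):
    """Return the first value that is neither null nor an empty text."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


def concatenate_texts(values, separator=""):
    """Join all non-empty texts; a string list contributes each element."""
    texts = []
    for value in values:
        items = value if isinstance(value, list) else [value]
        texts.extend(item for item in items if item is not None and item != "")
    return separator.join(texts)


def merge_into_array(values):
    """Collect all non-null values into a list."""
    return [value for value in values if value is not None]


print(first_non_empty(["John", "J."]))
print(concatenate_texts(["John", "J."], ","))
print(merge_into_array([None, "Smith"]))
```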
#
Make Date
The make date transformer will construct a date from different inputs. This is an easy way to combine multiple date related fields into a single text representation.
Examples of valid values for the full input, where 2023 represents the year, 4 represents the month and 5 represents the day:
- 2023-04-05
- 2023-4-5
- 2023-04-5
- 2023-04
- 2023
The other inputs must be either a number or a text that can be interpreted as a number.
The output will combine different inputs, but prioritize the value from the full input.
The full output will always be in the 2023-04-05 format. Missing values will be filled with 0, unless the validate option is enabled. The other outputs will be strings representing the corresponding part from the full output, e.g. 04.
If the validate option is enabled and the year, month or day is empty, then an error will be returned.
This transformer supports the following inputs and outputs: full, year, month and day.
#
Example with Full Input Priority
- Validate: enabled
"2023-04"
null
"12"
5
"2023-04-05"
"2023"
"04"
"05"
#
Examples with Partial Date
- Validate: enabled
"2023-04"
null
null
null
(error)
(error)
(error)
(error)
- Validate: disabled
"2023-04"
null
null
null
"2023-04-00"
"2023"
"04"
"00"
#
Examples with Invalid Date
- Validate: enabled
null
2023
2
31
(error)
(error)
(error)
(error)
- Validate: disabled
null
2023
2
31
"2023-02-31"
"2023"
"02"
"31"
#
Normalize Company
The normalize company transformer will split the provided text into a company name and its legal form if present.
This transformer supports exactly one input and two outputs.
#
Example
"Tilo Tech GmbH"
"Tilo Tech"
"GmbH"
#
Normalize Phone Number
The normalize phone number transformer will normalize the provided phone number and split it into a local number and a country code.
This transformer supports exactly two inputs and two outputs.
The default country is used to provide context information for better number
recognition. You can either provide a record specific value using the
The presence of a country code in the phone number (typically represented by a + sign) will override the default country. If the country cannot be inferred and no default country was provided, then the phone number cannot be normalized.
#
Example
"+49650-253-0000"
"US"
"6502530000"
"49"
In this example, the default country is ignored, because the +49 in the input number clearly indicates a German phone number.
#
Output
The output must be the final step for each transformation branch. Only outputs can be used by matchers. Each output must have a unique label to identify it later in the matchers.