# Transformer Reference

Transformers extract and convert values so that matchers can use them. A minimal working configuration needs at least one extract transformer and one output.

# Convert List

The convert list transformation accepts a list of values as an input, applies the selected operation and returns a modified list.

This transformer supports a dynamic number of inputs. Exactly one output is available for each input, and each input is processed independently of the others.

The following operations are available:

# Extract

The extract operation selects one or multiple sub values from each entry in the list using the provided path(s). If multiple values were selected, then they will be joined as a single text, optionally separated by the provided separator.

For a detailed description of the path syntax, refer to the extract transformer description.

# Example with One Path

  • Path: aValue
  • Separator: ignored
  • Must Exist: disabled
Input
Output
[
  {
    "aValue": "A1",
    "bValue": "B1"
  },
  {
    "aValue": "A2"
  },
  {
    "bValue": "B3"
  },
  {
    "aValue": "A4",
    "bValue": "B4"
  }
]
[
  "A1",
  "A2",
  null,
  "A4"
]

# Example with Two Paths

  • Paths: aValue and bValue
  • Separator: ,
  • Must Exist: disabled
Input
Output
[
  {
    "aValue": "A1",
    "bValue": "B1"
  },
  {
    "aValue": "A2"
  },
  {
    "bValue": "B3"
  },
  {
    "aValue": "A4",
    "bValue": "B4"
  }
]
[
  "A1,B1",
  "A2",
  "B3",
  "A4,B4"
]
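
The two examples above can be reproduced with a short Python sketch. The function name `extract_from_list` and its parameters are hypothetical, it only handles top-level field names, and it does not model the Must Exist option.

```python
def extract_from_list(entries: list, paths: list, separator: str = "") -> list:
    """Pick the given top-level fields from each entry and join the values found."""
    result = []
    for entry in entries:
        # Collect only the values that are actually present in this entry.
        found = [str(entry[p]) for p in paths if entry.get(p) is not None]
        # An entry without any selected value is represented as None (null).
        result.append(separator.join(found) if found else None)
    return result


data = [
    {"aValue": "A1", "bValue": "B1"},
    {"aValue": "A2"},
    {"bValue": "B3"},
    {"aValue": "A4", "bValue": "B4"},
]
print(extract_from_list(data, ["aValue"]))
print(extract_from_list(data, ["aValue", "bValue"], ","))
```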

# Make Unique

Make unique removes duplicate entries from the list. Two entries are considered duplicates if their type and value are equal.

# Example

Input
Output
[
  "A",
  "A",
  1234,
  "B",
  1234,
  "1234"
]
[
  "A",
  1234,
  "B",
  "1234"
]
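
A minimal Python sketch of the "type and value" rule, assuming all entries are hashable. The function name `make_unique` is hypothetical. The type is part of the key because plain Python equality would otherwise treat values such as `1` and `True` as duplicates.

```python
def make_unique(values: list) -> list:
    """Remove duplicates, keeping the first occurrence of each entry."""
    seen = set()
    result = []
    for value in values:
        # Including the type keeps 1234 (number) and "1234" (text) apart.
        key = (type(value), value)
        if key not in seen:
            seen.add(key)
            result.append(value)
    return result


print(make_unique(["A", "A", 1234, "B", 1234, "1234"]))
```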

# Slice

Slice returns a part of the provided list. If the offset is larger than the length of the provided list, then an empty list will be returned. If offset plus limit is larger than the provided list, then only the remaining part will be returned. The first element in the list is at offset 0.

# Example

  • Offset: 1
  • Limit: 2
Input
Output
[
  "A",
  "B",
  "C",
  "D",
  "E"
]
[
  "B",
  "C"
]
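
The edge cases described above map directly onto Python list slicing. This is only a sketch: the name `slice_list` is hypothetical, and offset and limit are assumed to be non-negative.

```python
def slice_list(values: list, offset: int, limit: int) -> list:
    """Return up to `limit` entries starting at the zero-based `offset`."""
    # Python slicing clamps to the end of the list, so an offset beyond the
    # end yields an empty list and an oversized limit returns the remainder.
    return values[offset:offset + limit]


letters = ["A", "B", "C", "D", "E"]
print(slice_list(letters, 1, 2))
```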

# Sort

Sort orders the list's entries in ascending or descending order.

When a list with mixed data types is sorted in ascending order, numeric values come first, followed by texts, other values and null values. Sorting in descending order reverses this. The relative order of values that are neither numeric nor text is undefined.

# Example

  • Ascending: enabled
Input
Output
[
  "A",
  2,
  1.5,
  "C",
  "B",
  "1"
]
[
  1.5,
  2,
  "1",
  "A",
  "B",
  "C"
]
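
The mixed-type ordering can be sketched in Python with a sort key that ranks the type group first and the value second. The names `type_rank` and `sort_mixed` are hypothetical, and plain string comparison is assumed for texts.

```python
def type_rank(value) -> int:
    """Rank the groups as described above: numbers, texts, other values, null."""
    if value is None:
        return 3
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return 0
    if isinstance(value, str):
        return 1
    return 2


def sort_mixed(values: list, ascending: bool = True) -> list:
    def key(value):
        rank = type_rank(value)
        # Only numbers and texts are compared by value; entries of the other
        # groups are not compared with each other.
        return (rank, value) if rank in (0, 1) else (rank,)

    return sorted(values, key=key, reverse=not ascending)


print(sort_mixed(["A", 2, 1.5, "C", "B", "1"]))
```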

# Verify Size

Verify size ensures that the size of the provided list is within the configured boundaries. The minimum or maximum size check can be disabled by providing -1. If the size requirements are not met, an error is returned; otherwise the original list is returned unchanged.

# Example Outside Bounds

  • Minimum Size: 3
  • Maximum Size: 10
Input
Output
[
  "A",
  "B"
]
(error)

# Example Inside Bounds

  • Minimum Size: 3
  • Maximum Size: 10
Input
Output
[
  "A",
  "B",
  "C"
]
[
  "A",
  "B",
  "C"
]

# Convert Text

The convert text transformation accepts a text, applies the selected operation and returns the modified text.

This transformer supports a dynamic number of inputs. Exactly one output is available for each input, and each input is processed independently of the others.

The following operations are available:

# Hash Value

Hash value applies one of the provided hash functions on the text input and returns the hash in hex format.

# Example

  • Hash Function: MD5
Input
Output
"Tilores"
"ef8394d0e4896d07e70e4106df2bf560"

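As a rough illustration of hashing a text and returning the hex digest, the following Python sketch uses the standard `hashlib` module. The function name `hash_value` is hypothetical, and encoding the text as UTF-8 before hashing is an assumption.

```python
import hashlib


def hash_value(text: str, algorithm: str = "md5") -> str:
    """Return the hex digest of the UTF-8 encoded text."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


print(hash_value("Tilores"))
```

An MD5 digest in hex format is always 32 characters long, regardless of the input length.
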
# Keep Only Numbers

Keep only numbers removes all non-number characters.

# Example

Input
Output
"T1L0R35"
"1035"

# Normalize Diacritical Characters

Normalize diacritical characters replaces characters such as German Umlaut with their character base.

# Example

Input
Output
"än éᶍample"
"an example"

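One common way to approximate this normalization in Python is Unicode decomposition followed by dropping the combining marks. This is a sketch, not necessarily how the transformer works: the name `strip_diacritics` is hypothetical, and characters without a Unicode decomposition (for example `ø`) pass through unchanged, so a complete solution would need an additional mapping.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("än éxample"))
```
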
# Normalize White-Spaces

Normalize white-spaces fixes duplicate horizontal white-spaces such as the space character or tabs and replaces them with a single white space. White-spaces at the beginning or the end of the text will be removed completely.

# Example

Input
Output
"  an       Examp le  "
"an Examp le"

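The behavior described above can be sketched with a single regular expression in Python. The name `normalize_whitespace` is hypothetical, and only spaces and tabs are treated as horizontal white-space here.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces and tabs into one space and trim both ends."""
    return re.sub(r"[ \t]+", " ", text).strip(" \t")


print(normalize_whitespace("  an       Examp le  "))
```
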
# Remove All Numbers

Remove all numbers removes all number characters.

# Example

Input
Output
"T1L0R35"
"TLR"

# Remove Common Names

Remove common names replaces the text with an empty text if it is in the selected preset name list.

# Example with Common Name

  • Preset: First Names (US)
  • Number of Top Most Common Names to Ignore: 20
Input
Output
"michael"
""

# Example with Uncommon Name

  • Preset: First Names (US)
  • Number of Top Most Common Names to Ignore: 20
Input
Output
"tilo"
"tilo"

# Remove Spaces

Remove spaces removes all spaces from the provided text.

# Example

Input
Output
"  an       Examp le  "
"anExample"

# Replace Text

Replace text searches for the provided text and replaces all occurrences with the new value. If the search value was not found, then the original text is returned.

# Example with Simple Replacement

  • Search: old
  • Replace With: new
  • Use Regular Expression: disabled
Input
Output
"old town with old church"
"new town with new church"

# Example with Regular Expression

  • Search: ^(.*), (.*)$
  • Replace With: new: $2 $1
  • Use Regular Expression: enabled
Input
Output
"Smith, John"
"new: John Smith"

# Example with Non-Matching Regular Expression

  • Search: ^(.*), (.*)$
  • Replace With: new: $2 $1
  • Use Regular Expression: enabled
Input
Output
"John Smith"
"John Smith"

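The three examples above can be reproduced with the following Python sketch. The name `replace_text` is hypothetical. Because Python's `re` module writes group references as `\g<1>` rather than `$1`, the sketch translates them first; it does not handle literal backslashes in the replacement value.

```python
import re


def replace_text(text: str, search: str, replacement: str,
                 use_regex: bool = False) -> str:
    """Replace every occurrence of `search`; return `text` unchanged if none."""
    if not use_regex:
        return text.replace(search, replacement)
    # Translate $1, $2, ... into Python's \g<1>, \g<2>, ... group references.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(search, py_replacement, text)


print(replace_text("Smith, John", r"^(.*), (.*)$", "new: $2 $1", use_regex=True))
```
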
# Use Substring

Use substring returns the first characters of the provided text.

# Example

  • Length: 4
Input
Output
"Tilores"
"Tilo"

# Extract

The extract transformation selects a value from the input using the provided path.

The transformer supports exactly one input, but provides a dynamic amount of outputs (each containing the same value).

The extract can work in either the simple path mode or in a jq-like path syntax.

For the jq-like path syntax, refer to the official jq documentation. Note that not all jq features may be available, and that the jq-like syntax can have a negative impact on performance, so it should be avoided where possible.

In simple path mode, field names and list indexes are separated by a single dot (.).

# Simple Path Mode Examples

  • Path: firstName
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "firstName": "John",
  "lastName": "Smith"
}
"john"
  • Path: name.first
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "name": {
    "first": "John",
    "last": "Smith"
  }
}
"john"
  • Path: names.1.first
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "names": [
    {
      "first": "Jane",
      "last": "Doe",
    },
    {
      "first": "John",
      "last": "Smith"
    }
  ]
}
"john"
  • Path: name
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "name": {
    "first": "John",
    "last": "Smith"
  }
}
{
  "first": "john",
  "last": "smith"
}
  • Path: (empty)
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
"John"
"john"
  • Path: firstName
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "firstName": "",
  "lastName": "Smith"
}
null
  • Path: firstName
  • Case Sensitive: disabled
  • Must Exist: disabled
Input
Output
{
  "lastName": "Smith"
}
null
  • Path: firstName
  • Case Sensitive: disabled
  • Must Exist: enabled
Input
Output
{
  "firstName": "",
  "lastName": "Smith"
}
(error)
  • Path: firstName
  • Case Sensitive: enabled
  • Must Exist: disabled
Input
Output
{
  "firstName": "John",
  "lastName": "Smith"
}
"John"

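The simple path mode examples above can be summarized in a Python sketch. The names `extract` and `lowercase` are hypothetical, and the behavior is inferred from the examples only: values are lower-cased unless Case Sensitive is enabled, an empty text counts as missing, and Must Exist turns a missing value into an error. Whether the Case Sensitive option also affects how field names are matched is not modeled.

```python
from typing import Any


def lowercase(value: Any) -> Any:
    """Lower-case every text inside the value, descending into dicts and lists."""
    if isinstance(value, str):
        return value.lower()
    if isinstance(value, dict):
        return {k: lowercase(v) for k, v in value.items()}
    if isinstance(value, list):
        return [lowercase(v) for v in value]
    return value


def extract(data: Any, path: str, case_sensitive: bool = False,
            must_exist: bool = False) -> Any:
    """Follow a dot-separated path of field names and list indexes."""
    current = data
    for part in (path.split(".") if path else []):
        if isinstance(current, dict):
            current = current.get(part)
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            current = None
        if current is None:
            break
    # In the examples an empty text is reported like a missing value.
    if current == "":
        current = None
    if current is None:
        if must_exist:
            raise ValueError(f"no value found at path {path!r}")
        return None
    return current if case_sensitive else lowercase(current)


record = {"names": [{"first": "Jane", "last": "Doe"}, {"first": "John", "last": "Smith"}]}
print(extract(record, "names.1.first"))
```
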
# Filter In/Out

The filter in/out transformer removes or keeps only certain values.

This transformer supports exactly one input and output.

# Examples

  • Filter Out: disabled
  • List of Values to Filter: ["A", "B"]
Input
Output
"A"
"A"
  • Filter Out: disabled
  • List of Values to Filter: ["A", "B"]
Input
Output
"C"
""
  • Filter Out: enabled
  • List of Values to Filter: ["A", "B"]
Input
Output
"A"
""
  • Filter Out: enabled
  • List of Values to Filter: ["A", "B"]
Input
Output
"C"
"C"

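The four examples above follow a simple membership rule, sketched here in Python. The name `filter_value` is hypothetical, and a rejected value is represented as an empty text, as in the examples.

```python
def filter_value(value: str, values: list, filter_out: bool = False) -> str:
    """Keep or drop `value` depending on its membership in `values`."""
    listed = value in values
    # Filter out enabled: listed values are dropped.
    # Filter out disabled: only listed values pass.
    keep = not listed if filter_out else listed
    return value if keep else ""


print(filter_value("A", ["A", "B"]))
```
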
# Flip Values

The flip values transformation accepts two inputs and flips them during indexing. This is an easy way to compare values from input a with values from input b during matching or searching.

This transformer supports exactly two inputs and two outputs.

# Example

  • Phase: Indexing

| Input (a) | Input (b) | Output (a) | Output (b) |
|---|---|---|---|
| "John" | "Jane" | "Jane" | "John" |

  • Phase: Linking or Searching

| Input (a) | Input (b) | Output (a) | Output (b) |
|---|---|---|---|
| "John" | "Jane" | "John" | "Jane" |

# Fork

The fork transformer can be used for simple and complex branching, including IF and SWITCH statements or cloning values for different results.

The transformer supports a dynamic number of inputs and will create exactly the same number of outputs for each configured condition, e.g. two inputs and three conditions will result in six outputs.

The fork strategy will define whether all conditions will be checked or if the check stops after the first satisfied condition (treating all other conditions as if they were not satisfied).

# Conditions

The output for each condition will contain a value if the condition is satisfied, otherwise its outputs will all be null.

Each condition must have one of the following condition types.

# Equal Values

Equal values is satisfied if the input value equals either the provided static value or another input. The "compare with other input" checkbox toggles between comparing two inputs and comparing one input with a static value.

All equality checks are case sensitive. If the data was extracted originally without the case sensitive option in the extract transformer, then you must provide a lower case variant for the static value.

# Always True

This condition is always satisfied. Use it for cloning values or for creating an else or default branch when setting up an IF or SWITCH statement.

# Match Regular Expression

This will be satisfied if the input matches the provided regular expression.

The regular expression match is case sensitive. If the data was originally extracted without the case sensitive option in the extract transformer, ensure that your regular expression matches the lower case variant of the input.

# Rule Set Type

This will be satisfied if the current phase is one of the selected phases. Use with caution as this might lead to unexpected, but correct results.

# Example for Cloning Values

When the same data needs two or more different transformations, one option is to use multiple outputs from a single extract transformer, although this may require you to duplicate some common transformations. The alternative is to use the fork to split a single branch into multiple branches that contain the same data.

  • 1 Input
  • Fork Strategy: All Satisfied Conditions
  • Condition 1:
    • Condition Type: Always True
  • Condition 2:
    • Condition Type: Always True

| Input | Output #1 | Output #2 |
|---|---|---|
| "John" | "John" | "John" |

  • 2 Inputs
  • Fork Strategy: All Satisfied Conditions
  • Condition 1:
    • Condition Type: Always True
  • Condition 2:
    • Condition Type: Always True

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) | Output #2 (1) | Output #2 (2) |
|---|---|---|---|---|---|
| "John" | "Smith" | "John" | "Smith" | "John" | "Smith" |

# Example for IF Statement

A simple IF statement (allowing a value to pass if a condition is satisfied) is also possible.

  • 2 Inputs
  • Fork Strategy: First Satisfied Condition
  • Condition 1:
    • Condition Type: Equal Values
    • Value From Input: 1
    • Compare With Other Input: disabled
    • Equals Value: John

With matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) |
|---|---|---|---|
| "John" | "Smith" | "John" | "Smith" |

Without matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) |
|---|---|---|---|
| "Jim" | "Smith" | null | null |

# Example for IF-ELSE Statement

This extends the IF statement with an else branch.

  • 2 Inputs
  • Fork Strategy: First Satisfied Condition
  • Condition 1:
    • Condition Type: Equal Values
    • Value From Input: 1
    • Compare With Other Input: disabled
    • Equals Value: John
  • Condition 2:
    • Condition Type: Always True

With matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) | Output #2 (1) | Output #2 (2) |
|---|---|---|---|---|---|
| "John" | "Smith" | "John" | "Smith" | null | null |

Without matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) | Output #2 (1) | Output #2 (2) |
|---|---|---|---|---|---|
| "Jim" | "Smith" | null | null | "Jim" | "Smith" |

# Example for Comparing Inputs

This is similar to the simple IF statement, except that two inputs are compared with each other.

  • 2 Inputs
  • Fork Strategy: First Satisfied Condition
  • Condition 1:
    • Condition Type: Equal Values
    • Value From Input: 1
    • Compare With Other Input: enabled
    • Equals Value from Input: 2

With matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) |
|---|---|---|---|
| "John" | "John" | "John" | "John" |

Without matching condition:

| Input (1) | Input (2) | Output #1 (1) | Output #1 (2) |
|---|---|---|---|
| "Jim" | "John" | null | null |

# Example for SWITCH Statement

This is an example for a SWITCH statement where multiple conditions can be satisfied. For simplicity this example only uses one input.

  • 1 Input
  • Fork Strategy: All Satisfied Conditions
  • Condition #1:
    • Condition Type: Equal Values
    • Value From Input: 1
    • Compare With Other Input: disabled
    • Equals Value: John
  • Condition #2:
    • Condition Type: Equal Values
    • Value From Input: 1
    • Compare With Other Input: disabled
    • Equals Value: Jim
  • Condition #3:
    • Condition Type: Match Regular Expression
    • Value From Input: 1
    • Matches With Regular Expression: J.* (meaning: must start with J)

Conditions #1 and #3 satisfied:

| Input (1) | Output #1 (1) | Output #2 (1) | Output #3 (1) |
|---|---|---|---|
| "John" | "John" | null | "John" |

Conditions #2 and #3 satisfied:

| Input (1) | Output #1 (1) | Output #2 (1) | Output #3 (1) |
|---|---|---|---|
| "Jim" | null | "Jim" | "Jim" |

Condition #3 satisfied:

| Input (1) | Output #1 (1) | Output #2 (1) | Output #3 (1) |
|---|---|---|---|
| "Jane" | null | null | "Jane" |

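The fork behavior shown in the examples above can be modeled in a few lines of Python. This is a sketch under assumptions: all names (`fork`, `equals_static`, `matches`, `always_true`) are hypothetical, inputs are addressed with zero-based indexes (the configuration above counts from 1), and the regular expression is anchored at the start of the input to match the "must start with J" reading.

```python
import re


def equals_static(index: int, expected: str):
    """Condition: the input at `index` equals a static value (case sensitive)."""
    return lambda inputs: inputs[index] == expected


def matches(index: int, pattern: str):
    """Condition: the input at `index` matches the pattern from its start."""
    return lambda inputs: inputs[index] is not None and re.match(pattern, inputs[index]) is not None


def always_true():
    """Condition that is always satisfied (clone, else or default branch)."""
    return lambda inputs: True


def fork(inputs: list, conditions: list, first_only: bool = False) -> list:
    """Return one output group per condition: a copy of the inputs or all None."""
    outputs = []
    satisfied_before = False
    for condition in conditions:
        # With the "first satisfied" strategy, every condition after the
        # first satisfied one is treated as not satisfied.
        ok = not (first_only and satisfied_before) and condition(inputs)
        satisfied_before = satisfied_before or ok
        outputs.append(list(inputs) if ok else [None] * len(inputs))
    return outputs


switch = [equals_static(0, "John"), equals_static(0, "Jim"), matches(0, "J.*")]
if_else = [equals_static(0, "John"), always_true()]
print(fork(["John"], switch))
print(fork(["Jim", "Smith"], if_else, first_only=True))
```

Two inputs and two conditions yield two groups of two values, which corresponds to the four outputs in the IF-ELSE example.
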
# Generate Value

The generate value transformation creates new values that can be used within the whole transformation process. This may be useful for fallback/default values.

This transformer supports exactly one input and output.

The following generation strategies are available:

# Static Text

This strategy returns the configured value at the output.

# Example

  • Value: John Smith
Input
Output
(anything)
"John Smith"

# Random Number

This strategy returns a random number. The range is inclusive, so both the lowest and the highest number can be returned.

# Example

  • Lowest Number: 0
  • Highest Number: 10
Input
Output
(anything)
3

# Number Range

This strategy returns a list with all the numbers between the lowest and highest number (including both values).

# Example

  • Lowest Number: 0
  • Highest Number: 10
Input
Output
(anything)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Join

The join transformer merges multiple branches into a single branch. This is helpful, for example, after a fork or when the same data is present in different fields. The join is also helpful for turning a list of texts into a single text.

The join supports a dynamic number of outputs and, by default, one input for each output. The number of expected inputs can be changed with the inputs per output configuration.

The join strategy will define how the data is processed.

# First Non-Empty Value

This strategy returns the first non-empty value from the inputs with the same name as the output.

# Example

  • 2 Outputs
  • Inputs Per Output: 2

| Input #1 (1) | Input #1 (2) | Input #2 (1) | Input #2 (2) | Output (1) | Output (2) |
|---|---|---|---|---|---|
| "John" | null | "J." | "Smith" | "John" | "Smith" |

# Concatenate Texts

This will concatenate the texts with the same input name into the corresponding output, optionally separated by the provided separator. Empty values (null) and empty texts will be ignored.

If an input contains a string list, then each element will be concatenated.

# Example with String Inputs

  • 2 Outputs
  • Inputs Per Output: 2

| Input #1 (1) | Input #1 (2) | Input #2 (1) | Input #2 (2) | Output (1) | Output (2) |
|---|---|---|---|---|---|
| "John" | null | "J." | "Smith" | "John,J." | "Smith" |

# Example with String Lists

  • 1 Output
  • Inputs Per Output: 1
Input
Output
["John", "", "Jim", null, "Jane"]
"John,Jim,Jane"

# Merge Into Array

This will merge the values with the same input name into a list for the corresponding output. Empty values (null) will be ignored.

  • 2 Outputs
  • Inputs Per Output: 2

| Input #1 (1) | Input #1 (2) | Input #2 (1) | Input #2 (2) | Output (1) | Output (2) |
|---|---|---|---|---|---|
| "John" | null | "J." | "Smith" | ["John", "J."] | ["Smith"] |
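
The three join strategies can be compared side by side in a Python sketch. Each function receives the list of inputs that belong to one output, for example `["John", "J."]` for Output (1) in the examples above. All function names are hypothetical.

```python
def is_empty(value) -> bool:
    """Null values and empty texts count as empty."""
    return value is None or value == ""


def first_non_empty(inputs: list):
    """First Non-Empty Value: return the first usable input, or None."""
    return next((v for v in inputs if not is_empty(v)), None)


def concatenate(inputs: list, separator: str = "") -> str:
    """Concatenate Texts: join non-empty texts; list inputs are flattened."""
    parts = []
    for value in inputs:
        # A list input contributes each of its elements.
        parts.extend(value if isinstance(value, list) else [value])
    return separator.join(str(p) for p in parts if not is_empty(p))


def merge_into_array(inputs: list) -> list:
    """Merge Into Array: collect the inputs, skipping null values only."""
    return [v for v in inputs if v is not None]


print(first_non_empty([None, "Smith"]))
print(concatenate([["John", "", "Jim", None, "Jane"]], ","))
print(merge_into_array(["John", "J."]))
```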

# Make Date

The make date transformer will construct a date from different inputs. This is an easy way to combine multiple date related fields into a single text representation.

Examples of valid values for the full input, where 2023 represents the year, 4 represents the month and 5 represents the day:

  • 2023-04-05
  • 2023-4-5
  • 2023-04-5
  • 2023-04
  • 2023

The other inputs must be either a number or a text that can be interpreted as a number.

The output will combine different inputs, but prioritize the value from the full input.

The full output will always be in the 2023-04-05 format. Missing values will be filled with 0, unless the validate option is enabled. The other outputs will be strings representing the corresponding part from the full output, e.g. 04.

If the validate option is enabled and the year, month or day is empty, then an error will be returned.

This transformer supports the following inputs and outputs: full, year, month and day.

# Example with Full Input Priority

  • Validate: enabled

| Input (full) | Input (year) | Input (month) | Input (day) | Output (full) | Output (year) | Output (month) | Output (day) |
|---|---|---|---|---|---|---|---|
| "2023-04" | null | "12" | 5 | "2023-04-05" | "2023" | "04" | "05" |

# Examples with Partial Date

  • Validate: enabled

| Input (full) | Input (year) | Input (month) | Input (day) | Output (full) | Output (year) | Output (month) | Output (day) |
|---|---|---|---|---|---|---|---|
| "2023-04" | null | null | null | (error) | (error) | (error) | (error) |

  • Validate: disabled

| Input (full) | Input (year) | Input (month) | Input (day) | Output (full) | Output (year) | Output (month) | Output (day) |
|---|---|---|---|---|---|---|---|
| "2023-04" | null | null | null | "2023-04-00" | "2023" | "04" | "00" |

# Examples with Invalid Date

  • Validate: enabled

| Input (full) | Input (year) | Input (month) | Input (day) | Output (full) | Output (year) | Output (month) | Output (day) |
|---|---|---|---|---|---|---|---|
| null | 2023 | 2 | 31 | (error) | (error) | (error) | (error) |

  • Validate: disabled

| Input (full) | Input (year) | Input (month) | Input (day) | Output (full) | Output (year) | Output (month) | Output (day) |
|---|---|---|---|---|---|---|---|
| null | 2023 | 2 | 31 | "2023-02-31" | "2023" | "02" | "31" |

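The rules and examples above can be condensed into a Python sketch. It rests on assumptions drawn only from the examples: parts present in the full input take priority over the separate inputs, missing parts become 0, and the validate option rejects both missing parts and impossible calendar dates. The names `make_date` and `to_int` are hypothetical.

```python
import datetime


def to_int(value) -> int:
    """Interpret a number or numeric text; empty values become 0."""
    return int(value) if value not in (None, "") else 0


def make_date(full=None, year=None, month=None, day=None,
              validate: bool = False) -> dict:
    """Combine the inputs, letting parts of `full` win over the separate inputs."""
    parts = [to_int(year), to_int(month), to_int(day)]
    for i, text in enumerate((full or "").split("-")[:3]):
        if text:
            parts[i] = int(text)
    y, m, d = parts
    if validate:
        # datetime.date raises ValueError for missing parts (0) and for
        # impossible dates such as February 31.
        datetime.date(y, m, d)
    return {"full": f"{y:04d}-{m:02d}-{d:02d}", "year": f"{y:04d}",
            "month": f"{m:02d}", "day": f"{d:02d}"}


print(make_date(full="2023-04", month="12", day=5, validate=True))
```
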
# Normalize Company

The normalize company transformer will split the provided text into a company name and its legal form if present.

This transformer supports exactly one input and two outputs.

# Example

| Input | Output (companyName) | Output (legalForm) |
|---|---|---|
| "Tilo Tech GmbH" | "Tilo Tech" | "GmbH" |

# Normalize Phone Number

The normalize phone number transformer will normalize the provided phone number and split it into a local number and a country code.

This transformer supports exactly two inputs and two outputs.

The default country is used to provide context information for better number recognition. You can either provide a record specific value using the extract transformer, or provide a static value using the generate value transformer. Different country name formats will be accepted, but it is recommended to provide an ISO 3166-1 Alpha-2 or Alpha-3 code.

The presence of a country code in the phone number (typically indicated by a + sign) will override the default country. If the country cannot be determined from the number and no default country was provided, then the phone number cannot be normalized.

# Example

| Input (number) | Input (defaultCountry) | Output (nationalNumber) | Output (countryCode) |
|---|---|---|---|
| "+49650-253-0000" | "US" | "6502530000" | "49" |

In this example, the default country is ignored, because the +49 in the input number clearly indicates a German phone number.

# Output

The output must be the final step of each transformation branch. Only outputs can be used by matchers. Each output must have a unique label so that it can be identified later in the matchers.