# How to Create a Golden Record?

A golden record represents a unified and aggregated version of a single entity. Typically, an entity in Tilores is represented as a collection of records, where each record was added and automatically linked to other relevant records. Each of these records may contain a different spelling of the same thing (e.g. different variations of a person's first name). Identifying the true value is challenging and may depend on the use case.

# Person Example

For the purpose of this example, we will focus on entities representing a person. The same concepts can be applied to other types of entities. Thanks to Tilores' highly flexible API, you can select the relevant attributes and choose the strategy that works best for your data.

For this example, we assume that a person record has the following fields:

  • firstName
  • lastName
  • dateOfBirth
  • city
  • street
  • postalCode
  • phoneNumber
  • email
  • receivedDate

A very simple example of such a record might look like this:

{
  "firstName": "John",
  "lastName": "Smith",
  "dateOfBirth": "1990-12-31",
  "city": "New York",
  "street": "1000 5th Ave",
  "postalCode": "10028",
  "phoneNumber": "646 555-1234",
  "email": "john.smith@example.com",
  "receivedDate": "2024-06-15"
}

# Name Attributes - Selection by Frequency

When selecting attributes such as the first and last name of a person, we recommend choosing the value that occurs most often in the underlying records.

For example, given the following first names from five different records

  • John
  • J.T.
  • John
  • John
  • Thomas

we recommend choosing the first name John, because it occurs in three of the five records.
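
The selection rule itself can be sketched locally in a few lines of Python. This is only an illustration of the idea, not how Tilores implements it; the helper name `most_frequent` is hypothetical, and skipping empty values is an assumption made for this sketch.

```python
from collections import Counter


def most_frequent(values):
    """Return the value that occurs most often, or None if there is none."""
    # Assumption for this sketch: empty or missing values are ignored.
    counts = Counter(v for v in values if v)
    if not counts:
        return None
    # most_common(1) returns a list holding one (value, count) pair.
    return counts.most_common(1)[0][0]


first_names = ["John", "J.T.", "John", "John", "Thomas"]
print(most_frequent(first_names))  # John
```

If two values occur equally often, `Counter.most_common` returns the one that was seen first, so a real application may want an explicit tie-breaking rule.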

The following example query retrieves both the first and last name using this approach. Note the (optional) alias goldenRecord for the recordInsights field.

query {
  entity(input: {id: "123"}) {
    goldenRecord: recordInsights {
      firstName: frequencyDistribution(field: "firstName", top: 1, direction: DESC) { value }
      lastName: frequencyDistribution(field: "lastName", top: 1, direction: DESC) { value }
    }
  }
}

The response looks like this:

{
  "data": {
    "entity": {
      "goldenRecord": {
        "firstName": [{
          "value": "John"
        }],
        "lastName": [{
          "value": "Smith"
        }]
      }
    }
  }
}

# Address Attributes - Selection by Date

When the previous approach is applied to the address fields, the result may be incorrect, because attributes that belong together are stored in separate fields. The most frequent city might not belong to the most frequent street. Hence, all address values must come from the same source record. Additionally, you may only be interested in the latest address.
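
The problem can be reproduced locally with a small Python sketch. The sample records and the helper names (`most_frequent`, `exists`) are invented for illustration, and the sketch assumes that receivedDate is an ISO 8601 date string, so that string comparison matches chronological order.

```python
from collections import Counter

# Hypothetical records of one entity, invented for this illustration.
records = [
    {"city": "Boston", "street": "12 Elm St", "receivedDate": "2021-03-01"},
    {"city": "Boston", "street": "7 Oak Rd", "receivedDate": "2022-08-15"},
    {"city": "NYC", "street": "1000 5th Ave", "receivedDate": "2023-02-10"},
    {"city": "New York", "street": "1000 5th Ave", "receivedDate": "2024-06-15"},
]


def most_frequent(field):
    """Most frequent value of a single field across all records."""
    return Counter(r[field] for r in records).most_common(1)[0][0]


def exists(address):
    """True if some record contains exactly this city/street combination."""
    return any(
        r["city"] == address["city"] and r["street"] == address["street"]
        for r in records
    )


# Field-by-field frequency mixes values from different records.
mixed = {"city": most_frequent("city"), "street": most_frequent("street")}

# Taking all values from the newest record keeps them consistent.
newest = max(records, key=lambda r: r["receivedDate"])
latest = {"city": newest["city"], "street": newest["street"]}

print(mixed, exists(mixed))
print(latest, exists(latest))
```

Here the field-by-field result combines Boston with 1000 5th Ave, an address that appears in none of the records, while the newest record yields a combination that actually exists.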

The following example extends the previous one by querying the latest address of John Smith. For this purpose, we query the newest record using the receivedDate field and then return the individual values.

query {
  entity(input: {id: "123"}) {
    goldenRecord: recordInsights {
      firstName: frequencyDistribution(field: "firstName", top: 1, direction: DESC) { value }
      lastName: frequencyDistribution(field: "lastName", top: 1, direction: DESC) { value }
      address: newest(field: "receivedDate") {
        city
        street
        postalCode
      }
    }
  }
}

The response looks like this:

{
  "data": {
    "entity": {
      "goldenRecord": {
        "firstName": [{
          "value": "John"
        }],
        "lastName": [{
          "value": "Smith"
        }],
        "address": {
          "city": "New York",
          "street": "1000 5th Ave",
          "postalCode": "10028"
        }
      }
    }
  }
}

# Email and Phone Attributes - Select All Unique Values

For the remaining attributes, email and phone number, you often need a list of all available contact options. There are two alternatives for this. When using valuesDistinct, you will receive an unordered list of all unique attribute values. If you need a list that is prioritized by how often each value occurs, you can use frequencyDistribution again.
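
The difference between the two alternatives can be sketched locally in Python. The helper names `distinct` and `ranked` are hypothetical and the sample values are invented; sorting the distinct values is done here only to make the output reproducible and is not something the text above promises for valuesDistinct.

```python
from collections import Counter

# Hypothetical attribute values collected from the records of one entity.
emails = [
    "john.smith@example.com",
    "j.smith@example.com",
    "john.smith@example.com",
    "tommy-the-wild@example.com",
]
phone_numbers = ["646 555-9876", "646 555-1234", "646 555-1234"]


def distinct(values):
    """All unique values; sorted only to get a stable order in this sketch."""
    return sorted(set(values))


def ranked(values):
    """Unique values ordered from most to least frequent."""
    return [value for value, _count in Counter(values).most_common()]


print(distinct(emails))
print(ranked(phone_numbers))  # ['646 555-1234', '646 555-9876']
```

The ranked variant is useful when the first entry should be the preferred contact, while the distinct variant is enough when every value is treated equally.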

The following example shows both approaches to give you an idea of how to access the different structures. It also extends the previous examples.

query {
  entity(input: {id: "123"}) {
    goldenRecord: recordInsights {
      firstName: frequencyDistribution(field: "firstName", top: 1, direction: DESC) { value }
      lastName: frequencyDistribution(field: "lastName", top: 1, direction: DESC) { value }
      address: newest(field: "receivedDate") {
        city
        street
        postalCode
      }
      emails: valuesDistinct(field: "email")
      phoneNumbers: frequencyDistribution(field: "phoneNumber", direction: DESC) { value }
    }
  }
}

The response looks like this:

{
  "data": {
    "entity": {
      "goldenRecord": {
        "firstName": [{
          "value": "John"
        }],
        "lastName": [{
          "value": "Smith"
        }],
        "address": {
          "city": "New York",
          "street": "1000 5th Ave",
          "postalCode": "10028"
        },
        "emails": [
          "j.smith@example.com",
          "john.smith@example.com",
          "tommy-the-wild@example.com"
        ],
        "phoneNumbers": [
          {"value": "646 555-1234"},
          {"value": "646 555-9876"}
        ]
      }
    }
  }
}
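
On the client side, the nested response can be flattened into a single golden record. The following Python sketch assumes the response shape shown above; the helper names `first_value` and `flatten` are hypothetical and not part of any Tilores client library.

```python
# The goldenRecord part of the response above, written as a Python dict.
golden = {
    "firstName": [{"value": "John"}],
    "lastName": [{"value": "Smith"}],
    "address": {
        "city": "New York",
        "street": "1000 5th Ave",
        "postalCode": "10028",
    },
    "emails": [
        "j.smith@example.com",
        "john.smith@example.com",
        "tommy-the-wild@example.com",
    ],
    "phoneNumbers": [{"value": "646 555-1234"}, {"value": "646 555-9876"}],
}


def first_value(entries):
    """Value of the first frequency entry, or None if the list is empty."""
    return entries[0]["value"] if entries else None


def flatten(golden_record):
    """Turn the nested query result into one flat dictionary."""
    # Assumption: the address may be missing, so fall back to an empty dict.
    address = golden_record.get("address") or {}
    return {
        "firstName": first_value(golden_record["firstName"]),
        "lastName": first_value(golden_record["lastName"]),
        "city": address.get("city"),
        "street": address.get("street"),
        "postalCode": address.get("postalCode"),
        "emails": list(golden_record["emails"]),
        "phoneNumbers": [e["value"] for e in golden_record["phoneNumbers"]],
    }


record = flatten(golden)
print(record["firstName"], record["lastName"])  # John Smith
```

Keeping the flattening in one place makes it easy to change the selection strategy in the query later without touching the code that consumes the golden record.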

# Conclusion

With just a few simple queries and a little bit of GraphQL alias magic, we can easily create a customized golden record. If none of these approaches fits your use case, we recommend having a look at the full record insights documentation. Most advanced cases can be implemented with a little bit of grouping and sorting.