
User Fields

User fields are a filtered and aggregated view of your raw data. Although Lytics will collect and store all data you send, in order to keep your marketing efforts focused, not all of the data is surfaced. For instance, you may want to integrate Salesforce with Lytics. Not all of your Salesforce data will be relevant for your marketing efforts, so instead of immediately making everything available as a user field, Lytics favors a selective approach by mapping and processing key user fields.

Processing events into user fields is the second stage of the Lytics data funnel.

How Lytics filters data

Lytics has a proprietary data mapping system called Lytics Query Language (LQL). LQL is used to map and transform fields from data stream events to user fields. Since Lytics stores the raw data, mappings can be changed and reapplied. This process, called rebuilding, allows for transformations of existing user field data or the ability to add new user fields from the same raw data.
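The idea of rebuilding can be illustrated with a minimal Python sketch. This is not LQL and not how Lytics implements mappings; the event keys, field names, and the `build` helper are hypothetical. It only shows why keeping the raw events makes it possible to change a mapping and reapply it later.

```python
from typing import Callable, Dict, List

# Hypothetical raw events, kept exactly as they arrived on a data stream.
RAW_EVENTS: List[dict] = [
    {"Email": "A@Example.com", "Country": "us"},
    {"Email": "b@example.com", "Country": "ca"},
]

# A mapping pairs a user field name with a function that derives
# that field's value from one raw event.
Mapping = Dict[str, Callable[[dict], object]]


def build(events: List[dict], mapping: Mapping) -> List[dict]:
    """Apply a mapping to every stored raw event and return the user fields."""
    return [{field: fn(event) for field, fn in mapping.items()} for event in events]


# First version of the mapping: only surface a normalized email.
mapping_v1: Mapping = {"email": lambda e: e["Email"].lower()}
fields_v1 = build(RAW_EVENTS, mapping_v1)

# "Rebuild": the raw events are still available, so a new user field
# can be added and the whole mapping reapplied to the same data.
mapping_v2: Mapping = {**mapping_v1, "country": lambda e: e["Country"].upper()}
fields_v2 = build(RAW_EVENTS, mapping_v2)
```

After the second call, `fields_v2` contains a `country` value for every event even though that field was not mapped when the events were first collected.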

How Lytics aggregates data

When the same event for the same user is seen more than once, Lytics will react in one of a few ways depending on the desired aggregation.

  • The first value seen may always be preserved, such as the time that the user first visited your site.
  • The last value seen may overwrite the previous value, such as the user's current location.
  • New values may be appended to the list of all seen values, for example to collect all the pages the user has visited.
  • Combinations of raw events can be aggregated together, for example to count the number of times each page has been visited.

Aggregation is determined by several attributes set during the mapping process, such as the field's data structure, its merge operation, and any aggregate functions applied.
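The four aggregation behaviors listed above can be sketched in plain Python. This is an illustration only, not Lytics code: the field names (`first_visit`, `location`, `pages`, `page_counts`), the event shape, and the `merge_event` helper are hypothetical, and events are assumed to arrive in chronological order.

```python
def merge_event(profile: dict, event: dict) -> dict:
    """Fold one raw event into a profile using four different aggregations."""
    # First value preserved: only set if the field has never been seen.
    profile.setdefault("first_visit", event["ts"])

    # Last value wins: every new event overwrites the previous value.
    profile["location"] = event["country"]

    # Append to the list of all seen values, skipping duplicates.
    pages = profile.setdefault("pages", [])
    if event["url"] not in pages:
        pages.append(event["url"])

    # Combine events: count how many times each page was visited.
    counts = profile.setdefault("page_counts", {})
    counts[event["url"]] = counts.get(event["url"], 0) + 1
    return profile


events = [
    {"ts": "2024-01-01", "url": "/home", "country": "US"},
    {"ts": "2024-01-02", "url": "/pricing", "country": "US"},
    {"ts": "2024-01-03", "url": "/home", "country": "CA"},
]

profile: dict = {}
for event in events:
    merge_event(profile, event)
```

The same three events produce four differently shaped user fields, which is why the choice of aggregation is part of the mapping rather than a property of the raw data.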

Unique identifiers

Unique identifiers, sometimes referred to as “by fields”, are a special type of user field. Lytics uses these fields to match and merge user data across different data sources. The two most common unique identifiers are _uid and email address. A unique identifier should be a field that reliably identifies an individual.

For example, you may collect email addresses from a lead form on your website. You may also be running an import integration with your email service provider (ESP) to collect email data. Lytics can merge a user’s web activity data, such as visits and clicks, and email activity data, such as clicks and opens, for users who appear in both data sources if the email address is mapped as a unique identifier for both streams. In the scenario provided, Lytics will merge these profile fragments out of the box, because mappings are already provided for both streams.

How Lytics ranks identifiers

Lytics ranks identifiers by their reliability. An identifier is considered to be highly reliable if it has a 1:1 relationship with a user. If there is a chance that it is associated with more than one user, it is considered to be less reliable. Between email and _uid, email is a more reliable identifier. It is uncommon for users to share email addresses, so usually, all data collected for a single email address belongs to a single user. In contrast, it is more common for many users to share a computer and a browser.

Lytics maintains a ranking for profile identifiers from most reliable to least reliable. Here is an example of a very basic ranking.

IDENTIFIER RANK CONSTRAINT (
  email EXCLUSIVE,
  _uid EXPIRES
) ON user

The EXCLUSIVE keyword dictates that only one email can be included in each profile. The EXPIRES keyword indicates that a particular _uid will only be seen for a short period of time, since a user can clear their cookies or the user’s browser could clear them automatically. These statements are generated automatically, so no additional work is required on your part.

Graph Compaction

Lytics uses an identity graph to stitch together a full user profile from the sources of data you send to your account. For example, one source might contain emails, another might contain emails and cookies, and yet another might contain cookies and user_id. Because of the identity graph, Lytics can stitch all three together automatically.
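One common way to model this kind of stitching is a union-find (disjoint-set) structure, where identifiers seen together in the same record are linked into one group. The sketch below uses that technique purely as an illustration; the source does not say how Lytics implements its identity graph, and the `IdentityGraph` class and sample identifiers are hypothetical.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph: identifiers linked together form one profile."""

    def __init__(self):
        self.parent = {}

    def find(self, node):
        """Return the representative identifier for the node's group."""
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            # Path halving keeps lookups fast as the graph grows.
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def link(self, *nodes):
        """Record that these identifiers were seen together in one record."""
        root = self.find(nodes[0])
        for node in nodes[1:]:
            self.parent[self.find(node)] = root

    def profiles(self):
        """Group every known identifier by its representative."""
        groups = defaultdict(set)
        for node in list(self.parent):
            groups[self.find(node)].add(node)
        return list(groups.values())


graph = IdentityGraph()
graph.link(("email", "a@example.com"))                     # source 1: email only
graph.link(("email", "a@example.com"), ("cookie", "c1"))   # source 2: email + cookie
graph.link(("cookie", "c1"), ("user_id", "42"))            # source 3: cookie + user_id
graph.link(("email", "b@example.com"))                     # an unrelated user

profiles = graph.profiles()
```

In this run the email, cookie, and user_id for the first user end up in a single group even though no single source contained all three, while the unrelated email stays in its own group.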

The identity graph grows over time with weak identifiers, such as cookies, which can become outdated. Graph compaction moves the behavioral data associated with these weak identifiers to stronger identifiers, such as email or user_id, when those exist in the identity graph. Once the behavioral data has been moved, the oldest and most outdated weak identifiers can be removed from the graph, since they are unlikely to contribute any useful information. In this way, compaction keeps the identity graph small and limited to relevant information.
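A minimal sketch of the compaction idea follows. It is an assumption-laden illustration rather than the Lytics algorithm: the profile shape, the `STRONG_TYPES` set, the 90-day cutoff, and the `compact` function are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumption for this sketch: these identifier types count as "strong".
STRONG_TYPES = {"email", "user_id"}


def compact(profile: dict, now: datetime, max_age: timedelta) -> dict:
    """Move data from stale weak identifiers onto a strong one, then drop them.

    `profile` maps (id_type, value) -> {"last_seen": datetime, "events": list}.
    A new dict is returned; the input is left untouched.
    """
    strong = [key for key in profile if key[0] in STRONG_TYPES]
    if not strong:
        # No stronger identifier exists to receive the data, so keep everything.
        return profile

    target = strong[0]
    result = {
        key: {"last_seen": node["last_seen"], "events": list(node["events"])}
        for key, node in profile.items()
    }
    for key, node in profile.items():
        is_weak = key[0] not in STRONG_TYPES
        if is_weak and now - node["last_seen"] > max_age:
            result[target]["events"].extend(node["events"])  # move the behavior
            del result[key]                                   # drop the stale id
    return result


now = datetime(2024, 6, 1)
profile = {
    ("email", "a@example.com"): {"last_seen": datetime(2024, 5, 30), "events": ["open"]},
    ("cookie", "old"): {"last_seen": datetime(2023, 1, 1), "events": ["visit", "click"]},
    ("cookie", "new"): {"last_seen": datetime(2024, 5, 31), "events": ["visit"]},
}
compacted = compact(profile, now, timedelta(days=90))
```

Here the stale cookie disappears from the result, its events are preserved under the email identifier, and the recently seen cookie is left in place.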

How user fields relate to user profiles

User profiles are a grouping of all known values of user fields seen for one or more unique identifiers. Through the process of merging, a user profile becomes a rich set of information collected from a variety of channels. For example, see how Lytics reports anonymous visitors.

Viewing available user fields

User fields can be found at Data > User Fields.

The user fields section lists all user fields that have data. There are three available filters for finding a particular user field.

Filter      Description
Search      Only user fields with names or identifiers that match the search value will be shown.
Coverage    Only user fields that are seen on at least this percentage of users will be shown.
Stream      Only user fields that receive data from the specified stream will be shown.
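The three filters combine as simple conditions that must all hold. The sketch below models that logic in Python for illustration; the `UserField` class, the `filter_fields` function, the sample stream names, and the choice of case-insensitive substring matching for Search are assumptions, not a description of the Lytics UI or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UserField:
    name: str
    identifier: str
    coverage: float      # percentage of users with this field, 0 to 100
    streams: List[str]   # streams that provide data for this field


def filter_fields(
    fields: List[UserField],
    search: Optional[str] = None,
    min_coverage: float = 0.0,
    stream: Optional[str] = None,
) -> List[UserField]:
    """Return the fields that pass every filter that was supplied."""
    result = []
    for field in fields:
        if search:
            needle = search.lower()
            # Search: match against the name or the identifier.
            if needle not in field.name.lower() and needle not in field.identifier.lower():
                continue
        # Coverage: keep fields seen on at least this percentage of users.
        if field.coverage < min_coverage:
            continue
        # Stream: keep fields that receive data from the given stream.
        if stream and stream not in field.streams:
            continue
        result.append(field)
    return result


fields = [
    UserField("Last Visit Country", "last_visit_country", 92.0, ["default"]),
    UserField("Email", "email", 40.0, ["default", "esp_import"]),
    UserField("Conversion Events", "conversion_events", 3.5, ["default"]),
]

common = filter_fields(fields, min_coverage=30)
from_esp = filter_fields(fields, stream="esp_import")
matched = filter_fields(fields, search="visit")
```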

Reading a user field card

[Image: an example user field card with numbered callouts, described below]

  1. Name: The name will be seen throughout Lytics, notably in the audience builder and integrations.
  2. Identifier: (Hover over the name to view) This is generally a variant of the name in all lowercase with underscores. It is used in some exports and in APIs and SDKs.
  3. Source: The stream(s) that provides data for the user field.
  4. Coverage: The percentage of users with this field. Common fields, such as last visit country, will be high while uncommon fields, like conversion events, will be low.
  5. Distinct Values: The number of different values seen for this user field.
  6. Volume Visualization: The type of visualization varies with the type of the user field, but it always aims to illustrate the distribution of values. In the example card, the United States is the dominant value.
  7. Sampling Note: Displayed when a user field has a large number of distinct values, to indicate that not all values are shown.
  8. Audience Shortcut: This button opens the audience builder with the user field preselected as a custom rule.
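To make the Coverage and Distinct Values figures concrete, here is a small sketch of how they could be computed from a set of profiles. It is an illustration only: the `field_stats` helper, the profile shape, and the sample data are hypothetical, and field values are assumed to be simple scalars.

```python
from collections import Counter
from typing import List


def field_stats(profiles: List[dict], field: str) -> dict:
    """Compute coverage, distinct values, and the dominant value for one field."""
    values = [p[field] for p in profiles if p.get(field) is not None]
    # Coverage: percentage of profiles that have any value for the field.
    coverage = 100.0 * len(values) / len(profiles) if profiles else 0.0
    counts = Counter(values)
    return {
        "coverage": coverage,
        "distinct_values": len(counts),
        # Dominant value: the most frequently seen value, if there is one.
        "top_value": counts.most_common(1)[0][0] if counts else None,
    }


profiles = [
    {"email": "a@example.com", "country": "US"},
    {"email": "b@example.com", "country": "US"},
    {"country": "CA"},
    {"email": "d@example.com"},
]

country_stats = field_stats(profiles, "country")
```

With these four sample profiles, three have a country, two different countries appear, and the United States is the most frequent value.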