By using a set of data science techniques to look at topic overlap between classified content, Lytics will programmatically build a topic taxonomy. In addition to programmatically buiding this taxonomy, Lytics dynamically adjusts the taxonomy as new content gets published.
The topic taxonomy is stored as a weighted and directional graph. Although this structure may be daunting to look at—and even more daunting to try and utilize by hand—Lytics makes use of it when determining content recommendations.
Topics are determined to be related by evaluating how they occur together and how they occur independently. When two topics occur together frequently (meaning in multiple documents), it is safe to assume they are related. In addition to the relationship existing, Lytics will determine the direction of the relationship. Is "Cookies" a subtopic of "Baking" or is "Baking" a subtopic of "Cookies"?
The relationship is determined by the co-occurrence of topics. The direction is determined by the independent occurence of a topic. Since the topic "Cookies" frequently occurs with "Baking", but "Baking" frequently occurs without "Cookies", "Cookies" must be a subtopic of "Cookies". To draw this example further, "Baking" may be a subtopic of "Recipes", and "Recipes" may be a subtopic of "Cooking", and "Cooking" may be a subtopic of "Hobbies".
This is important because it allows Lytics to make affinity inferences. A user with a high affinity for "Cookies" may be interested in general "Baking" content.
The weight between relationships is incredibly important to consider when making affinity inferences. For instance, Michael Jordan played baseball professionally for one dark year in the 90s. This means there are valid relationships between Michael Jordan and Baseball as well as Michael Jordan and Basketball. Most people who know anything about sports or pop culture know Michael Jordan as a basketball icon. The way this gets reflected in a taxonomy is through relationship weights. The weight between Michael Jordan and Basketball is strong, while the weight between Michael Jordan and Baseball is weak.
By having weights, affinity inferences can use those weights as thresholds for building further relationships or recommending content. A user who has shown interest in Michael Jordan is more likely to be interested in Basketball than they are to be interested in Baseball. Unless they are actually interested in Sports Icons From The 90s, in which case they might be more interested in Ken Griffey Jr. A rich topic taxonomy will surface this nuanced information.
The Graph Representation of the Taxonomy
A graph is the ideal data structure for topic taxonomies, and taxonomies in general. Since each topic can have many subtopics as well as be the subtopic of many things, the correct way to structure this data is with a graph.