Developing with Lytics / Lytics API Documentation

Query

Schema management api to add/edit queries and user-fields.

Lytics Query Language

The Lytics Query Language (LQL) is used to define the transformation of uploaded records, and event data into user profiles. It transforms row-level event data into document-oriented user info. This Query langage is similar to the HIVE or SQL query langauges, however departs from these in order to offer more of a Rich Document (json user profile) construction.

Query example

# Build a user from web data
SELECT
    name                       -- Simple field, by default = string
    , age            KIND INT  -- cast field as int
    , last_visit_ts  KIND DATE -- cast as date

    -- Showing the aggregate counter function and aliasing name of output column AS
    , count(_ref)         AS ref_ct

    -- Valuect makes a map[string]int count of occurences of a key
    , valuect(`my field`) AS myfield_mapct

    -- showcase every optional syntax element in column
    --     meregeop oldest we don't want to over-write this value, keep oldest
    --     KIND INT   normally we don't have to cast as most functions have a specific type
    , amt AS first_order_amount
          IF  event == "cart checkout"
          SHORTDESC "Amount of First Order"
          LONGDESC "Amount of First Order"
          KIND INT
          MERGEOP OLDEST

    -- lets keep around the date at which they signed up (mergeop oldest)
    , now()               AS signedup_date                  IF event == "signed up"    KIND DATE  MERGEOP oldest

    -- maps:   map all fields that start with "user." into a fact map
    , match("user.")        AS user_attributes     KIND map[string]string

    -- list of strings
    , set(event) AS all_events


    -- Identified By Columns allow merging across streams
    , email(EmailAddress) AS email
    , _uid
    , fbuid

FROM
    default
INTO
    user
BY
    _uid OR email OR fbuid
WHERE
    _bot = "f" OR NOT EXISTS _bot
ALIAS
    web_user;

# validate the query
curl -s -XPOST "https://api.lytics.io/api/query/_validate" \
   -H "Authorization: $LIOKEY" \
   -H "Content-Type: text/plain" \
   --data-binary @/tmp/tmp.lql | jq '.'

# upload the query
curl -s -XPOST "https://api.lytics.io/api/query" \
   -H "Authorization: $LIOKEY" \
   -H "Content-Type: text/plain" \
   --data-binary @your_file.lql | jq '.'

# look at schema it output:
curl -s -H "Authorization: $LIOKEY" \
 -XGET "https://api.lytics.io/api/schema/user" | jq '.'

Standard Syntax

Select = "SELECT" COLUMNS FROM INTO BY [WHERE] ALIAS

# required from, the stream to operate on for this query
FROM = "FROM" Identifier

# Required Identified By field, name of column "AS" from Column
BY = "BY" Identifier ["OR" Identifier]

# Required Alias for giving a query a unqique identifier
ALIAS = "ALIAS" Identifier

# Optional Where Filter, same as SQL where
WHERE = "WHERE" LogicalExpression

COLUMNS         = COLUMN [, COLUMN]

COLUMN     = Expression ["AS" Identifier]
     ["IF" LogicalExpression] ["SHORTDESC" String]
     ["LONGDESC" String] ["KIND" Kind] ["MERGEOP" MergeOp]

LogicalExpression = NOT
           | Comparison
           | EXISTS
           | IN
           | CONTAINS
           | LIKE
           | Function
           | Expression
           | "(" LogicalExpression ")"
           | LogicalExpression OR LogicalExpression
           | LogicalExpression AND LogicalExpression

Expression =
    Identifier
    | Function
    | Literal

Function = Identifier "(" Expression [, Expression] ")"

NOT            = "NOT" LogicalExpression
Comparison     = Identifier ComparisonOp Literal
ComparisonOp   = ">" | ">=" | "<" | "<=" | "==" | "!="
EXISTS         = "EXISTS" Identifier
IN             = Identifier "IN" (Literal, Literal, ...)
CONTAINS       = Identifier "CONTAINS" Literal
LIKE           = Identifier "LIKE" String # uses * for wildcards


Literal = String | Int | Float | Bool | Timestamp

Identifier = [a-zA-Z][a-zA-Z0-9_]+ | "`" + String + "`"

Kind = "int" | "number" | "string" | "date" | "[]string" |
    "ts[]string" | "map[string]int" | "map[string]number" | "map[string]string*

# MergeOp's are very seldom used and have to be used on the right Kind
#  ie string can use Latest, Oldest (but not min, max)
MergeOp =  "max" | "min" | "latest" | "oldest" | "mapmax"
  • SELECT Select data to be added to user profiles. Including Maps, Counts, and other complex data types.

  • FROM The stream to select from

  • INTO This is USER for all user profiles. (technically you could create other types, such as "account")

  • WHERE Filters out entire records to not be included/analyzed. Bots, Employees, Test data.

  • BY What field are we going to identify this entity by

  • ALIAS When a selection query has an alias, that is the profile-fragment(table) name to use

Functions

There are a variety of functions for transformation and logic evaluation.

Aggregate Functions

There are a variety of expressions for building document type structures (maps, lists, sets). These are functional expressions but can only be used in Columns.

  • cap Limit the items stored in a field by count or by date. cap(field, int) cap(field, "number_of_days") (e.g. cap(field, "30d")). Returns an array containing the values within the capped count or length of time.

  • count Count of this key. For instance, count occurences of sessions that have started (ie, visited web site).

  • set Create a unique list/array of each value we have seen from this field

  • min,max Minimum or Maxium value (for numerics)

  • sum Sum values (keep track of total video play time, etc)

Logical Functions Local Evaluation, return boolean true/false.

  • all checks for existince of n keys all(key1,key2,key3,...) returns boolean.

  • any accepts a list of values and returns True if any are the contents of a field any(fieldname, value1,value2,value3)

  • exists Check for field (aka key) existence.

    • exists(purchase_total) checks to see if purchase_total is defined for the current message
    • valuect(yymm()) AS visits_by_yymm IF exists(_sesstart) Only fires valuect(yymm()) if _sesstart exists
  • in Determines if a field value is in a set of values.

    • "t" AS is_student IF role_type IN ("student","other")
    • dailyContact AS dailyContact IF dailyContact IN ("student","other")
  • eq Equal to eq(domain,"google.com")

  • ne Not Equal to ne(domain,"google.com")

  • lt Less Than lt(seconds(video_time), 30)

  • le Less Than or Equal to le(seconds(video_time), 30)

  • gt Greater Than gt(seconds(video_time), 30)

  • ge Greater Than or Equal to ge(seconds(video_time), 30)

  • not Not not(exists(domain))

  • or Or or(exists(domain), contains(domain,"google.com")) AS from_google

String Functions

  • join Join together multiple values, coerce them into strings. Last argument is which string to use to join (may be empty string).

    • join("apples","oranges",",") => "apples,oranges"
    • join("apples","oranges","") => "applesoranges"
  • len Length (of array, string)

  • oneof Choose value from the first field that has a non nil value.

    • oneof(fielda,fieldb,fieldc)
  • replace - Replace a matching part of a string with an empty string. Converts to string first.

    • replace(url,"/search/apachesolr_search/") - Removes /search/apachesolr_search/ from URL (in this case, leaving the search term
  • split Breaks a variable into smaller fragments given a specific delimiter

    • split(cc,",") - Splits the variable cc at each comma it contains
  • strip(field) Strips leading and trailing whitespace (spaces, tabs, newline, carriage-return) from string, or arrays of strings.

  • string.lowercase Convert strings to lower case

  • string.uppercase Convert strings to upper case

  • string.titlecase Convert strings to title case

  • contains Does this value contain this string? Is a sub-string match, not full match (eq)

    • IF contains(total_price, "$") - Check to see if total_price has a $ in it
    • IF not(contains(subscriber_key,"-")) AND not(contains(subscriber_key,"@")) check to make sure - or @ is not in it.
  • hasprefix Does this value start with this string?

    • hasprefix(event, "created") - Check to see if event starts with "created"
  • hassuffix Does this value start with this string?
    • hassuffix(subscriber_key, "user") - Check to see if subscriber_key ends with "user"

Hash & Encoding Functions

  • hash.sip hash.sip(email) Hash the given value using sip hash to integer output.

  • hash.md5 hash.md5(email) Hash the given value using md5

  • hash.sha1 hash.sha1(email) Hash the given value using sha1

  • hash.sha256 hash.sha256(email) Hash the given value using sha256

  • hash.sha512 hash.sha512(email) Hash the given value using sha512

  • encoding.b64encode(field) base64 encode.

  • encoding.b64decode(field) base64 decode.

Cast & Convert

  • toint Converts strings to integers. Useful for converting a string to a number before applying a number-based expression.

    • toint(order_total) - Converts order_total to an int
    • set(toint(split(cc,","))) - Takes the field cc and splits it at commas, and converts the results to integers. Then adds them to a set.
  • tonumber Convert to Number

  • todate Converts strings to dates, see full doc in Date/Time section below.

  • tobool(field) Cast to Boolean.

Map & Set/Array Functions

  • filter Filter out Values that match specified list of match filter criteria

    • filter(split("apples,oranges",","),"ora*") => ["apples"]
  • len Length (of array, string)

  • map Type: Map map(key1, todate(date_field))

    • map(key1, todate(date_field)) KIND map[string]time By default the map is generic map, cast to map[string]time with
  • match Type: Map (generic map, use KIND to cast) Match a key, and then keep a map of key/values with the match value removed

    • , match("topic_") AS global KIND map[string]number
  • mapkeys Type: Map input, []string{} output. Given a map, return a list of string of each of the keys.

  • mapvalues Type: Map input, []string{} output. Given a map, return a list of string values of each of the values.

  • mapinvert Type: Map input, MapString output. Given a map, return a map[string]string inverting keys/values.

  • array.index Cherry pick a single item out of an array:

    • array.index(split("apples,oranges,peaches",","),1) => ["oranges"]
  • array.slice Slice an array of items selecting some sub-set of them.
    • array.slice(split("apples,oranges,peaches,pineapple",","),2) => ["peaches","pineapple"]
    • array.slice(split("apples,oranges,peaches,pineapple",","),1,3) => ["oranges","peaches"]

Url/Http & Email Functions

  • email Extract email address from "Bob <[email protected]>" format

  • emailname Extract Bob from "Bob <[email protected]>" or [email protected]

  • emaildomain Extract gmail.com from "Bob <em[email protected]>" or [email protected]

  • domain Extract domain from url

  • host Extract host from url

  • path Extract the url path from url (no query string or domain), must be valid url parserable string.

  • qs Extract the querystring parameter from url qs(urlfield, "nameOfParam")

    • qs(url, "mc_eid") - Extracts the MailChimp user ID
    • set(qs(url, "video_id") - Creates a set of video_id
    • qs(tolower(url), "riid") - Converts the complete URL to lowercase before attempting to match
    • email(oneof(email, qs(url, "email"))) - Attempts to get the email address from the URL and from the regular fields, chooses whichever is populated and treats it like an email field
  • qs2 Extract a querystring parameter without lowercasing before checking for the parameter. qs2 is the same as qs above except that it does not lowercase before checking for a querystring.

  • urldecode Perform URL decode on a field. urldecode(field)

    • If field contains "my%20value", urldecode(field) will return "my value"
  • urlmain Extract the domain, subdomain, and path from the url

  • urlminusqs The url minus the querystring portion

  • useragent Extract info from user-agent string. Below examples based on Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11

    • useragent(user_agent, "bot") - Extracts True/False is this a bot?
    • useragent(user_agent, "mobile") - Extracts True/False is this mobile?
    • useragent(user_agent, "mozilla") - Extracts "5.0"
    • useragent(user_agent, "platform") - Extracts "X11"
    • useragent(user_agent, "os") - Extracts "Linux x86_64"
    • useragent(user_agent, "engine") - Extracts "Linux x86_64"
    • useragent(user_agent, "engine_version") - Extracts "AppleWebKit"
    • useragent(user_agent, "browser") - Extracts "Chrome"
    • useragent(user_agent, "browser_version") - Extracts "23.0.1271.97"
  • useragent.map(field) Extract map of all of above.

Date & Time Functions

Our core date parser recognizes about 50 date formats, so in general these will operate on any format. If you are using EU dates, you will need to specify the parser format.

  • dayofweek Type: Integer. 0-6 integer of day of week.

    • Examples: dayofweek() => 4 OR dayofweek(mydatefield)
  • epochms Type: Integer. Unix MS of the date stamp on the current message being processed

  • extract Can be used to extract parts of date and time. Example usage on the strftime site

    • extract(reg_date, "%B") Returns name of month
    • extract(reg_date, "%d") Returns day of month
  • hourofday Type: Integer. Hour of day (in 24 hour utc time). hourofday() OR hourofday(field)

  • hourofweek 0-167 integer for hour of week

  • mm Type: Integer. 0-11 month (alias for monthofyear) mm() => current month, 6 for june, mm(my_date_field)

  • monthofyear Type: Integer Output the 0-11 month value

  • now Type: Date The current message/event times.

  • seconds Type: Integer. Seconds, extracts things like seconds("00:30") => 30 and seconds("10:30") => 630

  • todate Converts strings to dates.

    • Datemath: todate("now-3m") Date math relative to message timestamp.
    • Parser: todate("02/01/2006") More than 30 formats supported. Date Parser
    • Examples with 2 arguments: todate("02/01/2006","07/04/2014") use golang's time package formatting
      • todate("02/01/2006","07/04/2014") Reformats the date 07/04/2014 from US formatting to UK formatting, with the resulting output being 04/07/2014
      • todate("02/01/2006",date_field_name) Outputs date_field_name as European format (where 01 is a placeholder for month, 02 is a placeholder for day, and 2006 is a placeholder for year)
  • todatein Converts strings to dates, if no location info is provided in date string such as "2017-09-30 17:00:00" this will allow you to apply a timezone. We still convert back to UTC for storage.

  • totimestamp Convert to Integer Unix Seconds (UTC).

  • yy Type: int Date conversion to YY format, so May 1 2014 is expressed as 14. yy(dob), or yy() for record time stamp

  • yymm String The YYMM date format, so May 1 2014 is expressed as 1405. yy(dob), or yy() for record time stamp

  • timebucket Creates a tabulation of timestamps which can be used to segment based on timewindows. See Segment examples for more information. timebucket(now()) for collect time, or timebucket(todate(field)) to bucket on the value of a field

KINDS (aka Data Types)

Allows explicitly setting data type. Often this os optional as it is inferred from functional expression.

  • int 64 bit signed integer

  • number 64 bit signed Float value

  • bool Boolean

  • date Date-Time

  • string string

  • []time Array of times

  • []string Array of strings

  • ts[]string Time ordered Unique set of strings (useful for keeping track of order in which they performed set of unique events)

  • map[string]int Map of key/integers

  • map[string]number

  • map[string]string

  • map[string]time

Merge Operations

MERGEOP Allow Merge behavior's to determine if given new data we want the new field, or keep the previous.

  • , my_date KIND DATE MERGEOP oldest -- Holds the first value seen for my_date

  • , old_score KIND INT MERGEOP oldest -- Holds the oldest value passed in to the field

  • set(lists) AS lists KIND []string MERGEOP latest -- only store latest set (all previous values of set discarded)

Query List

/api/query

Get all current queries.

GET

Query List

/api/query

List of all queries

Response 200

Headers
Content-Type: application/json
Body
{
    "status": "success",
    "data": [
      {
          "id": "abcdef123",
          "updated": "2014-10-30T21:01:06.493Z",
          "created": "2014-10-30T21:01:06.493Z",
          "text": "SELECT ..."
      }
    ]
}
POST

Query Upsert

/api/query{?version}

Uses ALIAS name inside parsed query for ID, and either creates/updates a query.

CHANGE NOTIFICATION This api in the past has returned a single object but is changing to return an array of query objects (because posted QL text may contain more than one statement). To get the old behavior of single object pass version=old.

# new version, will be default in Sept 2017.
curl -s -XPOST "https://api.lytics.io/api/query?version=new" \
   -H "Authorization: $LIOKEY" \
   --data-binary @your_file.lql

# old version, returns object
curl -s -XPOST "https://api.lytics.io/api/query?version=old" \
   -H "Authorization: $LIOKEY" \
   --data-binary @your_file.lql
Parameters
versionstring (optional)
Ex: old
Use the "new" array based response or "old" object.

Request

Headers
Content-Type: text/plain
Body
SELECT .....
FROM stream_name
BY field_name
ALIAS my_query_name

Response 200

Headers
Content-Type: application/json
Body
{
    "status": "success",
    "data": [
      {
          "id": "abcdef123",
          "updated": "2014-10-30T21:01:06.493Z",
          "created": "2014-10-30T21:01:06.493Z",
          "text": "SELECT ..."
      }
    ]
}

Query

/api/query/{idOrAlias}

Get a single Query resource.

Parameters
idOrAliasstring (required)
Ex: 1234
query Id OR Alias in path
GET

Query Fetch

/api/query/{idOrAlias}

Get single query by ALIAS OR ID

Response 200

Headers
Content-Type: application/json
Body
{
    "status": "success",
    "data": {
          "id": "abcdef123",
          "updated": "2014-10-30T21:01:06.493Z",
          "created": "2014-10-30T21:01:06.493Z",
          "text": "SELECT ..."
    }
}
DELETE

Query Delete

/api/query/{idOrAlias}

Delete A query by idOrAlias.

Response 204

POST

Query Validation

/api/query/_validate{?version}

Upload a query for syntax validation only.

Optionally, a segments=true parameter can be passed that will validate this query both for syntax, as well as checking that it doesn't invalidate existing segments. If a segment uses a field that this query is about to remove/alter such that the segment is no longer valid, this will warn.

# validate query syntax
curl -s -XPOST "https://api.lytics.io/api/query/_validate" \
   -H "Authorization: $LIOKEY" \
   -H "Content-Type: text/plain" \
   --data-binary @your_file.lql

# validate query syntax AND segments still valid
curl -s -XPOST "https://api.lytics.io/api/query/_validate?segments=true" \
   -H "Authorization: $LIOKEY" \
   -H "Content-Type: text/plain" \
   --data-binary @your_file.lql
Parameters
versionstring (optional)
Ex: old
Use the "new" array based response or "old" object.

Request

Headers
Content-Type: text/plain
Body
SELECT .....
FROM stream_name
BY field_name
ALIAS my_query_name

Response 200

Headers
Content-Type: application/json
Body
{
    "status": "success",
    "data": {
          "id": "abcdef123",
          "updated": "2014-10-30T21:01:06.493Z",
          "created": "2014-10-30T21:01:06.493Z",
          "text": "SELECT ..."
    }
}
POST

Query Test Evaluation

/api/query/_test

Upload a query AND data to see how it will be interpreted.

# add any name/value paris to query string param for data input
# then upload query and get evaluation response
curl -s -XPOST "https://api.lytics.io/api/query/_test?name=value" \
   -H "Authorization: $LIOKEY" \
   -H "Content-Type: text/plain" \
   --data-binary @your_file.lql

Request

Headers
Content-Type: text/plain
Body
    SELECT .....
    FROM stream_name
    BY field_name
    ALIAS my_query_name

Response 200

Headers
Content-Type: application/json
Body
{
    "status": "success",
    "data": {
          "id": "abcdef123",
          "updated": "2014-10-30T21:01:06.493Z",
          "created": "2014-10-30T21:01:06.493Z",
          "text": "SELECT ..."
    }
}