BYOA Model Features

Custom Brain allows the client to use the BYOA API to upload a set of logistic coefficients corresponding to any of the variables currently in use by the MediaMath Brain.

Available Features

We support the features listed below. Please refer to the "Data Platform Schemas" list on MediaMath Support for the variable type and a description of what each feature represents. Only some of the Categorical Types below appear in that list.

Intercept

Feature Name	Type	Corresponding field(s) in LLDS Impressions	Notes
`__const`	-	-	represents the Intercept weight

Mapped numerical Types

Feature Name	Type	Corresponding field(s) in LLDS Impressions	Notes
bidder_pixel_frequency	Mapped numerical	overlapped_brain_pixel_selections	See 'Audience Data' section below.
bidder_pixel_recency	Mapped numerical	overlapped_brain_pixel_selections	See 'Audience Data' section below.

Numerical Types

Feature Name	Type	Corresponding field(s) in LLDS Impressions	Notes
exchange_ctr	Numerical	prebid_historical_ctr	Click Through Rate. If it is -1, we populate it with 0.
exchange_vcr	Numerical	prebid_video_completion	Video Completion Rate. If it is -1, we populate it with 0.
exchange_viewability_rate	Numerical	prebid_viewability	Viewability Rate. If it is -1, we populate it with 0.
id_vintage	Numerical	id_vintage	Age of the primary MM user ID from the bid request: 0 if it is 0 to 1 week old, 1 if it is 1 to 18 weeks old, 2 if it is 18 to 52 weeks old, 3 if it is more than 52 weeks old. The value can also be 999 or higher. If there is no MM UUID or the incoming request has id_vintage >= 999, the calculation of the response rate uses id_vintage = 0, but LLDS Impressions still logs id_vintage >= 999.
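
The substitution rules in the table above can be sketched as follows. This is a minimal illustration only; the helper names `sanitizeRate` and `modelIDVintage` are hypothetical.

```go
package main

import "fmt"

// sanitizeRate maps the sentinel -1 to 0, as described for exchange_ctr,
// exchange_vcr, and exchange_viewability_rate.
func sanitizeRate(v float64) float64 {
	if v == -1 {
		return 0
	}
	return v
}

// modelIDVintage returns the id_vintage used when calculating the response
// rate: logged values of 999 or higher are treated as 0.
func modelIDVintage(logged int) int {
	if logged >= 999 {
		return 0
	}
	return logged
}

func main() {
	fmt.Println(sanitizeRate(-1), sanitizeRate(0.35))   // prints 0 0.35
	fmt.Println(modelIDVintage(999), modelIDVintage(2)) // prints 0 2
}
```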

Hardcoded Interaction Types

Feature Name	Type	Corresponding field(s) in LLDS Impressions	Notes
country_id_cs_region_id	Hardcoded interaction	region_id and country_id	country_id and region_id joined by a dash
exchange_id_cs_category_id	Hardcoded interaction	N/A	Read exchange_id and category_id from impression table. Combination of exchange_id and category_id
exchange_id_cs_vcr	Hardcoded interaction	N/A	Read exchange_id and prebid_video_completion from impression table. Combination of exchange_id and prebid_video_completion
exchange_id_cs_ctr	Hardcoded interaction	N/A	Read exchange_id and prebid_historical_ctr from impression table. We round prebid_historical_ctr using the same rounding convention as prebid_video_completion above. If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information.
exchange_id_cs_vrate	Hardcoded interaction	N/A	Read exchange_id and prebid_viewability from impression table. Combination of exchange_id and prebid_viewability. We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120. If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information.
exchange_id_cs_site_id	Hardcoded interaction	N/A	Combination of exchange_id and site_id

Categorical Types

Feature Name	Type	Corresponding field(s) in LLDS Impressions	Notes
base_domain	Simple categorical	site_url	Extract the effective top-level domain from t1.site_url (see 'Base Domain' section below)
browser	Simple categorical	contextual_data	See WURFL features extraction
browser_id	Simple categorical	browser_id	We recommend using the 'browser' and 'browser_version' features instead of this one due to better granularity and accuracy.
browser_language_id	Simple categorical	browser_language_id	If browser_language_id is 0 in LLDS Impressions, the browser language ID was not sent to BYOA.
browser_version	Simple categorical	contextual_data	See WURFL features extraction
category_id	Simple categorical	category_id	-
channel_type	Simple categorical	channel_type	1 = Display, 2 = Video, 3 = Social, 4 = mobile display (web), 5 = mobile video (web), 6 = search, 7 = email, 8 = mobile display (in-app), 9 = mobile video (in-app), 10 = Newsfeed (FBX)
conn_speed	Simple categorical	conn_speed_id	read LLDS Impressions.conn_speed_id and store as conn_speed in the model
cookieless	Simple categorical	cross_device_flag	cookieless can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (2 or 3) then cookieless=TRUE else FALSE
creative_id	Simple categorical	creative_id	-
cross_device	Simple categorical	cross_device_flag	cross_device can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (1 or 3) then cross_device=TRUE else FALSE
country_id	Simple categorical	country_id	-
device_id	Simple categorical	device_id	We recommend using ‘device_manufacturer’, ‘device_model’ and ‘device_type’ features instead of this one due to better granularity and accuracy.
device_manufacturer	Simple categorical	contextual_data	See WURFL features extraction
device_model	Simple categorical	contextual_data	See WURFL features extraction
device_type	Simple categorical	contextual_data	See WURFL features extraction
day_of_week	Simple categorical	timestamp_gmt	0 = Sunday ... 6 = Saturday
day_part	Simple categorical	timestamp_gmt	The day part is based on when the impression was served and the user’s timezone. 0 = 12AM to 5:59AM; 1 = 6AM to 11:59AM; 2 = 12PM to 5:59PM; 3 = 6PM to 11:59PM
deal_id	Simple categorical	deal_id	-
dma_id	Simple categorical	dma_id	-
exchange_id	Simple categorical	exchange_id	-
fold_position	Simple categorical	fold_position	1 = Above the fold, 2 = Below the fold, 0 = Unknown
hashed_app_id	Simple categorical	app_id	Read LLDS Impressions.app_id and calculate hashed_app_id. See 'AppID' section below.
isp_id	Simple categorical	isp_id	-
interstitial	Simple categorical	interstitial	-
num_device_ids	Simple categorical		If t1.num_device_ids is null or 0, the value is 0; if it is 1, the value is 1; if it is greater than 1, the value is 2.
os	Simple categorical	contextual_data	See 'Device Data' below
os_id	Simple categorical	os_id	We recommend using the 'os' and 'os_version' fields instead of this one due to better granularity and accuracy.
os_version	Simple categorical	contextual_data	See WURFL features extraction
pixel	Simple categorical	overlapped_brain_pixel_selections	See 'Audience Data' section below.
region_id	Simple categorical	region_id	-
size	Simple categorical	size	32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height, see 'Size Encoding/Decoding' section below
site_id	Simple categorical	site_id	-
user_frequency	Simple categorical	user_frequency	Refers to 'Session Frequency' on T1 Knowledge Base page
video_placement_type	Simple categorical	video_placement_type	Video placement type. 1: In-stream; 2: In-banner; 3: In-article; 4: In-feed; 5: Interstitial/slider/floating
video_skippability	Simple categorical	video_skippability	Flag to identify video skippability ; 0: non-skippable; 1: skippable; null: unknown
week_part	Simple categorical	timestamp_gmt	The week part is based on when the impression was served and the user’s timezone. 0 = weekday; 1 = weekend

Audience Data

Three audience-based features in our model are derived from the overlapped_brain_pixel_selections field in LLDS Impressions: pixel, bidder_pixel_frequency, and bidder_pixel_recency. Entries in that field are separated by the | character.

For context, overlapped_brain_pixel_selections is a pipe-delimited list of tuples that contain segment membership information. Each tuple is of the format mm:px1:r1:f1; the components of the tuple are separated by colons and can be interpreted as follows:

"mm" - the namespace of the pixel

"px1" - the pixel_id of the audience segment. In the logistic model, this is converted into a binary field indicating whether the user is in this segment.

"r1" - the recency, or amount of time that has elapsed, since the user was added to px1. In the logistic model, this is converted into a mapped numerical field whose value is equal to 1440.0/r1. By way of background, 1440.0/recency is simply converting the recency value, which is denominated in minutes, to its inverse, measured in days—there are 1,440 minutes in a day.

Input: recencyMinutes

We first calculate recencyDays := math.Max(recencyMinutes/1440.0, 1.0)

and then cap recencyDays at 200 before inverting it:
recencyDaysFn = 1.0 / math.Min(200, recencyDays) // math.Min(200, recencyDays) keeps recencyDaysFn in the range 0.005...1

If the recency is zero (perhaps because recency data is not available for that audience segment), the corresponding map entry for recency does not exist, i.e. we do not allow division by zero.

"f1" - the frequency of the user for px1. In the logistic model, this is converted into a mapped numerical field whose value is simply equal to f1.

If the frequency is greater than 200, it is set to 200.
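
The parsing and clamping rules above can be sketched as follows. This is a minimal illustration, not the production implementation: the type `audienceFeatures`, the function `parseSelections`, and the sample pixel IDs 1001 and 1002 are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// audienceFeatures holds the three model inputs derived from one tuple.
type audienceFeatures struct {
	Pixel      string  // pixel_id; membership is the binary "pixel" feature
	Recency    float64 // bidder_pixel_recency, in the range 0.005...1
	HasRecency bool    // false when the raw recency is zero (no map entry)
	Frequency  float64 // bidder_pixel_frequency, capped at 200
}

// parseSelections splits a pipe-delimited list of
// namespace:pixel:recency:frequency tuples.
func parseSelections(raw string) ([]audienceFeatures, error) {
	var out []audienceFeatures
	for _, tuple := range strings.Split(raw, "|") {
		if tuple == "" {
			continue
		}
		parts := strings.Split(tuple, ":")
		if len(parts) != 4 {
			return nil, fmt.Errorf("malformed tuple %q", tuple)
		}
		recencyMinutes, err := strconv.ParseFloat(parts[2], 64)
		if err != nil {
			return nil, err
		}
		frequency, err := strconv.ParseFloat(parts[3], 64)
		if err != nil {
			return nil, err
		}
		f := audienceFeatures{Pixel: parts[1], Frequency: math.Min(frequency, 200)}
		if recencyMinutes != 0 {
			recencyDays := math.Max(recencyMinutes/1440.0, 1.0)
			f.Recency = 1.0 / math.Min(200, recencyDays)
			f.HasRecency = true
		}
		out = append(out, f)
	}
	return out, nil
}

func main() {
	feats, err := parseSelections("mm:1001:2880:3|mm:1002:0:250")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, f := range feats {
		fmt.Printf("%s recency=%v hasRecency=%t frequency=%v\n", f.Pixel, f.Recency, f.HasRecency, f.Frequency)
	}
	// prints:
	// 1001 recency=0.5 hasRecency=true frequency=3
	// 1002 recency=0 hasRecency=false frequency=200
}
```

In the first tuple, 2880 minutes is 2 days, so the recency feature is 1/2. In the second tuple the recency is zero, so no recency entry is produced, and the frequency of 250 is capped at 200.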

Device Data

Model features related to a device, including browser, browser_version, os, os_version, device_manufacturer, device_model, and device_type, are derived from the contextual_data field in LLDS Impressions. For example:

{
  "24": {
    "1": {
      "targeted": [],
      "untargeted": [
        "br_Chrome:ve_60.0.3112"
      ]
    }
  },
  "25": {
    "1": {
      "targeted": [],
      "untargeted": [
        "os_Windows:ve_10.0.0"
      ]
    }
  },
  "26": {
    "1": {
      "targeted": [],
      "untargeted": [
        "fo_Desktop"
      ]
    }
  },
  "27": {
    "1": {
      "targeted": [],
      "untargeted": [
        "ma_Desktop Make:mo_Desktop Model"
      ]
    }
  },
  "28": {
    "1": {},
    "2": {},
    "3": {}
  }
}

This would be read as:

browser = "br_Chrome"
browser_version = "br_Chrome:ve_60.0.3112"
os = "os_Windows"
os_version = "os_Windows:ve_10.0.0"
device_model = "ma_Desktop Make:mo_Desktop Model"
device_manufacturer = "ma_Desktop Make"
device_type = "fo_Desktop"

We include the browser name in browser_version (i.e. we keep the "br_Chrome" prefix in "br_Chrome:ve_60.0.3112") because two different browsers could have the same version. The same logic holds for os_version and device_model.
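
The extraction described in this section can be sketched as follows. The prefixes (br_, os_, fo_, ma_) are taken from the example above. The function name `deviceFeatures` is hypothetical, and the sketch reads only the "untargeted" lists, because the source does not state how "targeted" entries are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// selection mirrors the innermost object of contextual_data.
type selection struct {
	Untargeted []string `json:"untargeted"`
}

// deviceFeatures decodes contextual_data and derives the device features.
// Entries with a prefix other than br_, os_, fo_, or ma_ are ignored.
func deviceFeatures(contextualData string) (map[string]string, error) {
	var data map[string]map[string]selection
	if err := json.Unmarshal([]byte(contextualData), &data); err != nil {
		return nil, err
	}
	features := make(map[string]string)
	for _, group := range data {
		for _, sel := range group {
			for _, entry := range sel.Untargeted {
				// name is the part before the first ':' (the whole entry if none).
				name := strings.SplitN(entry, ":", 2)[0]
				switch {
				case strings.HasPrefix(entry, "br_"):
					features["browser"] = name
					features["browser_version"] = entry
				case strings.HasPrefix(entry, "os_"):
					features["os"] = name
					features["os_version"] = entry
				case strings.HasPrefix(entry, "fo_"):
					features["device_type"] = entry
				case strings.HasPrefix(entry, "ma_"):
					features["device_manufacturer"] = name
					features["device_model"] = entry
				}
			}
		}
	}
	return features, nil
}

func main() {
	raw := `{
	  "24": {"1": {"targeted": [], "untargeted": ["br_Chrome:ve_60.0.3112"]}},
	  "27": {"1": {"targeted": [], "untargeted": ["ma_Desktop Make:mo_Desktop Model"]}},
	  "28": {"1": {}, "2": {}}
	}`
	features, err := deviceFeatures(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(features["browser"])             // prints br_Chrome
	fmt.Println(features["browser_version"])     // prints br_Chrome:ve_60.0.3112
	fmt.Println(features["device_manufacturer"]) // prints ma_Desktop Make
}
```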

Hardcoded Interactions

For exchange_id_cs_site_id: the feature value is formed by joining the exchange ID and the other half of the feature value with a dash. For example, with exchange_id = 4 and site_id = 100, we will look up exchange_id_cs_site_id^4-100 in the features -> weights map.
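
The key construction for exchange_id_cs_site_id can be sketched as follows. The helper name `interactionKey` and the weight 0.73 are hypothetical; only the key format comes from the example above.

```go
package main

import "fmt"

// interactionKey builds "<feature>^<left>-<right>", the lookup key format
// shown for exchange_id_cs_site_id.
func interactionKey(feature, left, right string) string {
	return feature + "^" + left + "-" + right
}

func main() {
	// Hypothetical features -> weights map with a single entry.
	weights := map[string]float64{"exchange_id_cs_site_id^4-100": 0.73}

	key := interactionKey("exchange_id_cs_site_id", "4", "100")
	weight, found := weights[key]
	fmt.Println(key, weight, found) // prints exchange_id_cs_site_id^4-100 0.73 true
}
```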

For exchange_id_cs_vcr

package main

import (
    "fmt"
    "strconv"
)

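// Round formats input with 3 decimals (trailing zeros trimmed) when numberDec
// is 3 and the value is positive, and with 1 decimal otherwise.
// Negative values yield an empty string.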
func Round(input string, numberDec int) (string, error) {
    flval, err := strconv.ParseFloat(input, 64)
    if err != nil {
        return "", err
    }
    if flval < 0 {
        return "", nil
    }
    if numberDec == 3 && flval > 0 {
        return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
    }
    return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
    res, err := Round(input, numberDec)
    if err != nil {
        return "", err
    }
    return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
    count := 0
    for i := len(input) - 1; i > 0; i-- {
        if input[i] == cut && input[i-1] != '.' {
            count++
        } else {
            break
        }
    }

    return input[0 : len(input)-count]
}

func exchange_id_cs_vcr(exchangeID, videoCompletion string) {
    vcRounded, _ := RoundAndTrim(videoCompletion, 1)
    fmt.Println("exchange_id_cs_vcr^" + exchangeID + "-" + vcRounded)
}

func main() {
    exchangeID := "10"
    prebidVideoCompletion := "20"
    exchange_id_cs_vcr(exchangeID, prebidVideoCompletion) // prints exchange_id_cs_vcr^10-20.0
}

For exchange_id_cs_ctr

package main

import (
    "fmt"
    "strconv"
)

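// Round formats input with 3 decimals (trailing zeros trimmed) when numberDec
// is 3 and the value is positive, and with 1 decimal otherwise.
// Negative values yield an empty string.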
func Round(input string, numberDec int) (string, error) {
    flval, err := strconv.ParseFloat(input, 64)
    if err != nil {
        return "", err
    }
    if flval < 0 {
        return "", nil
    }
    if numberDec == 3 && flval > 0 {
        return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
    }
    return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
    res, err := Round(input, numberDec)
    if err != nil {
        return "", err
    }
    return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
    count := 0
    for i := len(input) - 1; i > 0; i-- {
        if input[i] == cut && input[i-1] != '.' {
            count++
        } else {
            break
        }
    }
    return input[0 : len(input)-count]
}

func exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr string) {
    ctRounded, _ := RoundAndTrim(prebidHistoricalCtr, 3)
    fmt.Println("exchange_id_cs_ctr^" + exchangeID + "-" + ctRounded)
}

func main() {
    exchangeID := "10"
    prebidHistoricalCtr := "0.002"
    exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr) // prints exchange_id_cs_ctr^10-0.002
}

For exchange_id_cs_vrate

We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120.

package main

import (
    "fmt"
    "math"
    "strconv"
)

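// exchange_id_cs_vrate rounds prebidViewability down to a multiple of 10
// and prints the resulting lookup key.
func exchange_id_cs_vrate(exchangeID, prebidViewability string) {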
    vr, _ := strconv.ParseFloat(prebidViewability, 64)
    finalVal := int64(math.Floor(vr/10)) * 10
    fmt.Println("exchange_id_cs_vrate^" + exchangeID + "-" + strconv.FormatInt(finalVal, 10))
}

func main() {
    exchangeID := "10"
    prebidViewability := "19"
    exchange_id_cs_vrate(exchangeID, prebidViewability) // prints exchange_id_cs_vrate^10-10
}

AppID

The raw bid request sends the hashed_app_id, but we log app_id in the impression_log. If impression.app_id is equal to "N/A", then the hashed_app_id was equal to "0" in the raw bid request. If impression.app_id is different from "N/A", then the hashed_app_id needs to be calculated manually as per the following pseudo-code.

Use Boost Library 1.58

uint32_t m_HashedAppId = 0;

void setHashedAppId(const char* appid)
{
    if (appid) {
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = MM::Utils::pstr_ihash()(appid) % INT_MAX;
        }
    }
}

struct pstr_ihash
    : std::unary_function<const char*, std::size_t>
{
    std::size_t operator()(const char* x) const
    {
        std::size_t seed = 0;

        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};

Please use the following code to test calcHashedAppId. It takes an app_id and calculates the hashed_app_id to be used in the model; the test cases help confirm that your implementation is correct.

#include <iostream>
#include <string>
#include <vector>
#include <functional>
#include <boost/functional/hash.hpp>
#include <climits>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cstring>

struct pstr_ihash
    : std::unary_function<const char*, std::size_t>
{
    std::size_t operator()(const char* x) const
    {
        std::size_t seed = 0;

        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};

// Pass app_id from LLDS Impressions.
// Special case: if the App ID is absent, LLDS Impressions logs N/A for app_id,
// and app_id = N/A maps to hashed_app_id = 0.
unsigned int calcHashedAppId(const char* appid)
{
    unsigned int m_HashedAppId = 0;
    if (appid) {
        if (std::strcmp(appid, "N/A") == 0) {
            return 0;
        }
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = pstr_ihash()(appid) % INT_MAX;
        }
    }
    return m_HashedAppId;
}

struct testcase {
    const char* input;
    unsigned int expected_output;
};
  
int main() {
   std::vector<testcase> tests {
      {"com.fivemobile.thescore", 1453566594},
      {"605581486", 605581486},
      {"tunein.player", 1173358324},
      {"com.aws.android", 1903276095},
      {"com.document.pdf.scanner.docscan", 1812585910},
      {"com.apalon.weatherlive.free", 591449217},
      {"com.pandora.android", 1387399900},
      {"com.weather.weather", 447752198},
      {"de.wetteronline.wetterapp", 1107246225},
      {"439873467", 439873467},
      {"N/A", 0},
   };

   for (unsigned int i = 0; i < tests.size(); i++) {
      assert(calcHashedAppId(tests[i].input) == tests[i].expected_output);
   }

   return 0;
}

Base Domain

We recommend using a library from https://publicsuffix.org/learn/ to derive the base_domain from site_url. Below are examples of what we expect.

site_url	base_domain	comment
www.yahoo.com	yahoo.com
finance.yahoo.com	yahoo.com
https://www.sports.yahoo.com	yahoo.com
w.main.welcomescreen.aol.com	aol.com
bap.navigator.web.de	web.de
www.ebay.co.uk	ebay.co.uk
www.u.gg	u.gg
https://www.u.gg	u.gg
https://u.gg	u.gg
http://sqlserverbuilds.blogspot.com	sqlserverbuilds.blogspot.com
prebidsetup.an.r.appspot.com	prebidsetup.an.r.appspot.com	*.r.appspot.com is a valid suffix
r.appspot.com	r.appspot.com	appspot.com is a valid suffix
this.is.a.test.readthedocs.io	test.readthedocs.io	readthedocs.io is a valid suffix
pythonguidecn.readthedocs.io	pythonguidecn.readthedocs.io	readthedocs.io is a valid suffix
check-ozmall.global.ssl.fastly.net	check-ozmall.global.ssl.fastly.net	global.ssl.fastly.net is a valid suffix
test.fastly.net	fastly.net	fastly.net is a valid suffix
http://mp2f-m-env.ap-northeast-1.elasticbeanstalk.com	mp2f-m-env.ap-northeast-1.elasticbeanstalk.com	ap-northeast-1.elasticbeanstalk.com is a valid suffix
thisisatest		error: publicsuffix: cannot derive eTLD+1 for domain "thisisatest"
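
Some site_url values in the table carry a scheme (for example https://www.u.gg) and others do not. The sketch below shows one way to reduce either form to a bare host before handing it to a Public Suffix List library, for example EffectiveTLDPlusOne in golang.org/x/net/publicsuffix. The helper name `hostFromSiteURL` is hypothetical, and the eTLD+1 step itself is deliberately left to the library.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostFromSiteURL returns the lowercased host of a site_url that may or may
// not carry a scheme. It does not compute the base domain.
func hostFromSiteURL(siteURL string) (string, error) {
	s := strings.TrimSpace(siteURL)
	if !strings.Contains(s, "://") {
		// A leading "//" makes net/url parse the value as a host, not a path.
		s = "//" + s
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", err
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return "", fmt.Errorf("no host in site_url %q", siteURL)
	}
	return host, nil
}

func main() {
	for _, in := range []string{"www.yahoo.com", "https://www.sports.yahoo.com", "https://u.gg"} {
		host, err := hostFromSiteURL(in)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(host)
	}
	// prints:
	// www.yahoo.com
	// www.sports.yahoo.com
	// u.gg
}
```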

Size Encoding/Decoding

32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height.

// encode_decode_size.go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

func encode(width uint32, height uint32) uint32 {
	var size uint32
	size = (0xFFFF & height)
	size |= (width << 16)
	return size
}

func decode(val string) (int, int, error) {
	size, err := strconv.ParseUint(val, 10, 32)
	if err != nil {
		return 0, 0, errors.New("strconv.ParseUint() failed")
	}
	width := int((size >> 16) & 0xffff)
	height := int(size & 0xffff)

	return width, height, nil
}

func main() {
	fmt.Printf("%d\n", encode(320, 50)) // Will output 20971570

	width, height, err := decode("20971570")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%dx%d\n", width, height) // Will output 320x50
}