BYOA Model Features
Custom Brain allows the client to use the BYOA API to upload a set of logistic coefficients corresponding to any of the variables currently in use by the MediaMath Brain.
Available Features
We accommodate the features listed below. Please refer to the "Data Platform Schemas" list on MediaMath Support for the variable type and a description of each feature. Only some of the categorical types below appear in that list.
Intercept
Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
---|---|---|---|
__const | - | - | represents the Intercept weight |
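As a rough illustration of how a set of uploaded coefficients could be combined with the intercept, the sketch below uses the textbook logistic form. The `score` function, the map layout, the key used for the numerical feature, and the weight values are all hypothetical; this page does not confirm how the Brain evaluates a model internally.

```go
package main

import (
	"fmt"
	"math"
)

// score applies the standard logistic function to the intercept plus the
// weighted values of the features present on an impression. This is the
// textbook form, assumed here purely for illustration.
func score(weights, active map[string]float64) float64 {
	z := weights["__const"] // intercept weight
	for key, value := range active {
		if w, ok := weights[key]; ok {
			z += w * value
		}
	}
	return 1.0 / (1.0 + math.Exp(-z))
}

func main() {
	// Hypothetical coefficients; real values come from the client's own model.
	weights := map[string]float64{
		"__const":                      -2.0,
		"exchange_id_cs_site_id^4-100": 0.5,
		"exchange_ctr":                 1.5,
	}
	// Categorical features contribute 1 when present; numerical ones their value.
	active := map[string]float64{
		"exchange_id_cs_site_id^4-100": 1.0,
		"exchange_ctr":                 0.02,
	}
	fmt.Printf("%.4f\n", score(weights, active))
}
```

Features that have no entry in the weights map simply contribute nothing in this sketch.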
Mapped numerical Types
Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
---|---|---|---|
bidder_pixel_frequency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
bidder_pixel_recency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
Numerical Types
Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
---|---|---|---|
exchange_ctr | Numerical | prebid_historical_ctr | Click Through Rate. If it is -1, we populate it with 0. |
exchange_vcr | Numerical | prebid_video_completion | Video Completion Rate. If it is -1, we populate it with 0. |
exchange_viewability_rate | Numerical | prebid_viewability | Viewability Rate. If it is -1, we populate it with 0. |
id_vintage | Numerical | id_vintage | 0 if the primary MM user id from the bid request is between 0 and 1 week old, 1 if it is between 1 and 18 weeks old, 2 if it is between 18 and 52 weeks old, 3 if it is more than 52 weeks old, 999 or higher if there is no MM UUID. If there is no MM UUID or the incoming request has id_vintage >= 999, the calculation of the response rate uses id_vintage = 0, but LLDS Impressions still logs id_vintage >= 999. |
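The substitutions described in the notes above can be sketched as follows. The function names are hypothetical, and the sketch assumes the logged values have already been parsed into numbers.

```go
package main

import "fmt"

// normalizeRate maps the -1 "no data" marker to 0, as the notes for
// exchange_ctr, exchange_vcr and exchange_viewability_rate describe.
func normalizeRate(v float64) float64 {
	if v == -1 {
		return 0
	}
	return v
}

// modelIDVintage returns the id_vintage value used when scoring: logged
// values of 999 or higher are treated as 0.
func modelIDVintage(logged int) int {
	if logged >= 999 {
		return 0
	}
	return logged
}

func main() {
	fmt.Println(normalizeRate(-1), normalizeRate(0.25)) // prints: 0 0.25
	fmt.Println(modelIDVintage(999), modelIDVintage(2)) // prints: 0 2
}
```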
Hardcoded Interaction Types
Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
---|---|---|---|
country_id_cs_region_id | Hardcoded interaction | region_id and country_id | country_id joins region_id by dash |
exchange_id_cs_category_id | Hardcoded interaction | N/A | Read exchange_id and category_id from impression table. Combination of exchange_id and category_id |
exchange_id_cs_vcr | Hardcoded interaction | N/A | Read exchange_id and prebid_video_completion from impression table. Combination of exchange_id and prebid_video_completion |
exchange_id_cs_ctr | Hardcoded interaction | N/A | Read exchange_id and prebid_historical_ctr from impression table. Combination of exchange_id and prebid_historical_ctr. We round prebid_historical_ctr to three decimal places and trim trailing zeros (see the exchange_id_cs_ctr code under 'Hardcoded Interactions'). If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information. |
exchange_id_cs_vrate | Hardcoded interaction | N/A | Read exchange_id and prebid_viewability from impression table. Combination of exchange_id and prebid_viewability. We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120. If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information. |
exchange_id_cs_site_id | Hardcoded interaction | N/A | Combination of exchange_id and site_id |
Categorical Types
Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
---|---|---|---|
base_domain | Simple categorical | site_url | Extract the effective top-level domain plus one label (eTLD+1) from t1.site_url (see 'Base Domain' section below) |
browser | Simple categorical | contextual_data | See WURFL features extraction |
browser_id | Simple categorical | browser_id | We recommend using the 'browser' and 'browser version' features instead of this one due to better granularity and accuracy. |
browser_language_id | Simple categorical | browser_language_id | BrowserLanguageID: If browserLanguage is set to 0 in LLDS Impressions, the browserLanguage id was not sent to BYOA. |
browser_version | Simple categorical | contextual_data | See WURFL features extraction |
category_id | Simple categorical | category_id | - |
channel_type | Simple categorical | channel_type | 1 = Display, 2 = Video, 3 = Social, 4 = mobile display (web), 5 = mobile video (web), 6 = search, 7 = email, 8 = mobile display (in-app), 9 = mobile video (in-app), 10 = Newsfeed (FBX) |
conn_speed | Simple categorical | conn_speed_id | read LLDS Impressions.conn_speed_id and store as conn_speed in the model |
cookieless | Simple categorical | cross_device_flag | cookieless can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (2 or 3) then cookieless=TRUE else FALSE |
creative_id | Simple categorical | creative_id | - |
cross_device | Simple categorical | cross_device_flag | cross_device can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (1 or 3) then cross_device=TRUE else FALSE |
country_id | Simple categorical | country_id | - |
device_id | Simple categorical | device_id | We recommend using ‘device_manufacturer’, ‘device_model’ and ‘device_type’ features instead of this one due to better granularity and accuracy. |
device_manufacturer | Simple categorical | contextual_data | See WURFL features extraction |
device_model | Simple categorical | contextual_data | See WURFL features extraction |
device_type | Simple categorical | contextual_data | See WURFL features extraction |
day_of_week | Simple categorical | timestamp_gmt | 0 = Sunday ... 6 = Saturday |
day_part | Simple categorical | timestamp_gmt | The day part is based on when the impression was served and the user’s timezone. 0 = 12AM to 5:59AM; 1 = 6AM to 11:59AM; 2 = 12PM to 5:59PM; 3 = 6PM to 11:59PM |
deal_id | Simple categorical | deal_id | - |
dma_id | Simple categorical | dma_id | - |
exchange_id | Simple categorical | exchange_id | - |
fold_position | Simple categorical | fold_position | 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
hashed_app_id | Simple categorical | app_id | Read app_id from LLDS Impressions and calculate hashed_app_id from it. See 'AppID' section below. |
isp_id | Simple categorical | isp_id | - |
interstitial | Simple categorical | interstitial | - |
num_device_ids | Simple categorical | num_device_ids | 0 if t1.num_device_ids is null or 0; 1 if it is 1; 2 if it is greater than 1 |
os | Simple categorical | contextual_data | See 'Device Data' section below |
os_id | Simple categorical | os_id | We recommend using the 'os' and 'os_version' fields instead of this one due to better granularity and accuracy. |
os_version | Simple categorical | contextual_data | See WURFL features extraction |
pixel | Simple categorical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
region_id | Simple categorical | region_id | - |
size | Simple categorical | size | 32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height, see 'Size Encoding/Decoding' section below |
site_id | Simple categorical | site_id | - |
user_frequency | Simple categorical | user_frequency | Refers to 'Session Frequency' on T1 Knowledge Base page |
video_placement_type | Simple categorical | video_placement_type | Video placement type. 1: In-stream; 2: In-banner; 3: In-article; 4: In-feed; 5: Interstitial/slider/floating |
video_skippability | Simple categorical | video_skippability | Flag to identify video skippability. 0: non-skippable; 1: skippable; null: unknown |
week_part | Simple categorical | timestamp_gmt | The week part is based on when the impression was served and the user’s timezone. 0 = weekday; 1 = weekend |
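Several of the categorical features above are derived rather than read directly. The sketch below illustrates those derivations under stated assumptions: the function names are hypothetical, a nil pointer stands in for a null num_device_ids, the weekend is taken to be Saturday and Sunday, and the user's timezone is assumed to be already known (the page does not say how it is determined).

```go
package main

import (
	"fmt"
	"time"
)

// cookieless is true when cross_device_flag is 2 or 3.
func cookieless(flag int) bool { return flag == 2 || flag == 3 }

// crossDevice is true when cross_device_flag is 1 or 3.
func crossDevice(flag int) bool { return flag == 1 || flag == 3 }

// numDeviceIDs buckets the logged count into 0, 1 or 2 (more than one).
// Negative counts are not expected; they are treated as 0 here (assumption).
func numDeviceIDs(n *int) int {
	switch {
	case n == nil || *n <= 0:
		return 0
	case *n == 1:
		return 1
	default:
		return 2
	}
}

// dayPart maps a local hour (0-23) to 0..3 in six-hour blocks.
func dayPart(hour int) int { return hour / 6 }

// weekPart returns 1 for Sunday (0) or Saturday (6), else 0.
func weekPart(dayOfWeek int) int {
	if dayOfWeek == 0 || dayOfWeek == 6 {
		return 1
	}
	return 0
}

func main() {
	// timestamp_gmt converted to a hypothetical user timezone (UTC-5).
	loc := time.FixedZone("UTC-5", -5*60*60)
	local := time.Date(2023, time.July, 1, 20, 30, 0, 0, time.UTC).In(loc)
	dow := int(local.Weekday()) // Go's Weekday also uses 0 = Sunday
	fmt.Println(dow, dayPart(local.Hour()), weekPart(dow)) // prints: 6 2 1
	fmt.Println(cookieless(3), crossDevice(2))             // prints: true false
}
```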
Audience Data
Three audience-based features in our model are derived from the overlapped_brain_pixel_selections field in LLDS Impressions: pixel, bidder_pixel_frequency, and bidder_pixel_recency.
For context, overlapped_brain_pixel_selections is a pipe-delimited (|) list of tuples that contain segment membership information. Each tuple is of the format mm:px1:r1:f1; the components of the tuple are separated by colons and can be interpreted as follows:
"mm" - the namespace of the pixel
"px1" - the pixel_id of the audience segment. In the logistic model, this is converted into a binary field indicating whether the user is in this segment.
"r1" - the recency, or amount of time that has elapsed, since the user was added to px1. In the logistic model, this is converted into a mapped numerical field whose value is equal to 1440.0/r1. By way of background, 1440.0/recency is simply converting the recency value, which is denominated in minutes, to its inverse, measured in days—there are 1,440 minutes in a day.
In practice the value is clamped. Given the input recencyMinutes, we calculate
recencyDays := math.Max(recencyMinutes/1440.0, 1.0)
and then limit recencyDays as follows:
recencyDaysFn = 1.0 / math.Min(200, recencyDays) // math.Min(200, recencyDays) keeps recencyDaysFn in the range 0.005...1
If the recency is zero (perhaps because recency data is not available for that audience segment), the corresponding map entry for recency does not exist; that is, we never divide by zero.
"f1" - the frequency value for the user on px1. In the logistic model, this is converted into a mapped numerical field whose value is simply equal to f1.
If frequency > 200, frequency is set to 200.
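The tuple handling described above can be sketched as follows. This is a minimal illustration, not the production parser: the `audienceFeature` type and `parseSelections` function are hypothetical, every tuple is assumed to have exactly four numeric-or-text components, and the recency follows the clamped calculation rather than the plain 1440.0/r1 ratio.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// audienceFeature holds the model values derived from one mm:px:r:f tuple.
type audienceFeature struct {
	Pixel      string  // pixel_id; presence means the user is in the segment
	Recency    float64 // 1 / min(200, max(minutes/1440, 1))
	HasRecency bool    // false when the logged recency is zero
	Frequency  float64 // capped at 200
}

// parseSelections splits the pipe-delimited field into per-pixel features.
func parseSelections(raw string) ([]audienceFeature, error) {
	var out []audienceFeature
	for _, tuple := range strings.Split(raw, "|") {
		if tuple == "" {
			continue
		}
		parts := strings.Split(tuple, ":")
		if len(parts) != 4 {
			return nil, fmt.Errorf("unexpected tuple %q", tuple)
		}
		recencyMinutes, err := strconv.ParseFloat(parts[2], 64)
		if err != nil {
			return nil, fmt.Errorf("bad recency in %q: %v", tuple, err)
		}
		frequency, err := strconv.ParseFloat(parts[3], 64)
		if err != nil {
			return nil, fmt.Errorf("bad frequency in %q: %v", tuple, err)
		}
		f := audienceFeature{Pixel: parts[1], Frequency: math.Min(frequency, 200)}
		if recencyMinutes > 0 { // zero recency: no recency entry, no division
			recencyDays := math.Max(recencyMinutes/1440.0, 1.0)
			f.Recency = 1.0 / math.Min(200, recencyDays)
			f.HasRecency = true
		}
		out = append(out, f)
	}
	return out, nil
}

func main() {
	// Hypothetical field value: two segments, the second with no recency.
	feats, err := parseSelections("mm:1001:2880:3|mm:1002:0:450")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, f := range feats {
		fmt.Printf("%+v\n", f)
	}
}
```

In this example the first tuple has a recency of 2880 minutes (two days), giving a recency value of 0.5, and the second tuple's frequency of 450 is capped at 200.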
Device Data
Model features related to a device, including browser, browser_version, os, os_version, device_manufacturer, device_model, and device_type, are derived from the contextual_data field in LLDS Impressions. For example:
{
"24": {
"1": {
"targeted": [],
"untargeted": [
"br_Chrome:ve_60.0.3112"
]
}
},
"25": {
"1": {
"targeted": [],
"untargeted": [
"os_Windows:ve_10.0.0"
]
}
},
"26": {
"1": {
"targeted": [],
"untargeted": [
"fo_Desktop"
]
}
},
"27": {
"1": {
"targeted": [],
"untargeted": [
"ma_Desktop Make:mo_Desktop Model"
]
}
},
"28": {
"1": {},
"2": {},
"3": {}
}
}
This would be read as:
browser = "br_Chrome"
browser_version = "br_Chrome:ve_60.0.3112"
os = "os_Windows"
os_version = "os_Windows:ve_10.0.0"
device_model = "ma_Desktop Make:mo_Desktop Model"
device_manufacturer = "ma_Desktop Make"
device_type = "fo_Desktop"
We include the browser name in the browser version (i.e. we keep the "br_Chrome" prefix in "br_Chrome:ve_60.0.3112") because two different browsers could have the same version. The same logic holds for os_version and device_model.
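The split between the short and the long form of each device feature can be sketched as below. The helper name is hypothetical, and the sketch assumes the relevant entry has already been pulled out of the contextual_data JSON.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixBeforeColon returns the part of value before the first ':' or the
// whole value when there is no colon.
func prefixBeforeColon(value string) string {
	if i := strings.Index(value, ":"); i >= 0 {
		return value[:i]
	}
	return value
}

func main() {
	browserEntry := "br_Chrome:ve_60.0.3112"
	deviceEntry := "ma_Desktop Make:mo_Desktop Model"

	// The short feature is the prefix; the long feature keeps the full entry.
	fmt.Println("browser =", prefixBeforeColon(browserEntry))
	fmt.Println("browser_version =", browserEntry)
	fmt.Println("device_manufacturer =", prefixBeforeColon(deviceEntry))
	fmt.Println("device_model =", deviceEntry)
}
```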
Hardcoded Interactions
For exchange_id_cs_site_id: The lookup key is formed by joining the exchange id and the other half of the feature value with a dash. For example, with ExchangeID = 4 and site_id = 100, we will look up exchange_id_cs_site_id^4-100 in features -> weights.
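A minimal sketch of that key construction and lookup is shown below. The `interactionKey` helper and the weight value are hypothetical; only the `name^left-right` key shape is taken from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// interactionKey builds "<feature>^<part1>-<part2>" from the feature name and
// the values being combined.
func interactionKey(feature string, parts ...string) string {
	return feature + "^" + strings.Join(parts, "-")
}

func main() {
	// Hypothetical weights map with a single uploaded coefficient.
	weights := map[string]float64{"exchange_id_cs_site_id^4-100": 0.37}

	key := interactionKey("exchange_id_cs_site_id", "4", "100")
	w, ok := weights[key]
	fmt.Println(key, w, ok) // prints: exchange_id_cs_site_id^4-100 0.37 true
}
```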
For exchange_id_cs_vcr
package main
import (
"fmt"
"strconv"
)
//Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
flval, err := strconv.ParseFloat(input, 64)
if err != nil {
return "", err
}
if flval < 0 {
return "", nil
}
if numberDec == 3 && flval > 0 {
return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
}
return fmt.Sprintf("%.1f", flval), nil
}
// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
res, err := Round(input, numberDec)
if err != nil {
return "", err
}
return TrimRight(res, '0'), nil
}
// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
count := 0
for i := len(input) - 1; i > 0; i-- {
if input[i] == cut && input[i-1] != '.' {
count++
} else {
break
}
}
return input[0 : len(input)-count]
}
func exchange_id_cs_vcr(exchangeID, videoCompletion string) {
vcRounded, _ := RoundAndTrim(videoCompletion, 1)
fmt.Println("exchange_id_cs_vcr^" + exchangeID + "-" + vcRounded)
}
func main() {
exchangeID := "10"
prebidVideoCompletion := "20"
exchange_id_cs_vcr(exchangeID, prebidVideoCompletion)
}
For exchange_id_cs_ctr
package main
import (
"fmt"
"strconv"
)
//Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
flval, err := strconv.ParseFloat(input, 64)
if err != nil {
return "", err
}
if flval < 0 {
return "", nil
}
if numberDec == 3 && flval > 0 {
return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
}
return fmt.Sprintf("%.1f", flval), nil
}
// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
res, err := Round(input, numberDec)
if err != nil {
return "", err
}
return TrimRight(res, '0'), nil
}
// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
count := 0
for i := len(input) - 1; i > 0; i-- {
if input[i] == cut && input[i-1] != '.' {
count++
} else {
break
}
}
return input[0 : len(input)-count]
}
func exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr string) {
ctRounded, _ := RoundAndTrim(prebidHistoricalCtr, 3)
fmt.Println("exchange_id_cs_ctr^" + exchangeID + "-" + ctRounded)
}
func main() {
exchangeID := "10"
prebidHistoricalCtr := "0.002"
exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr)
}
For exchange_id_cs_vrate
We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120.
package main
import (
"fmt"
"math"
"strconv"
)
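func exchange_id_cs_vrate(exchangeID, prebidViewability string) {
	vr, err := strconv.ParseFloat(prebidViewability, 64)
	if err != nil {
		// Do not emit a key for an unparsable value.
		fmt.Println("invalid prebid_viewability:", err)
		return
	}
	// Round down to the nearest multiple of 10, e.g. 19 -> 10.
	finalVal := int64(math.Floor(vr/10)) * 10
	fmt.Println("exchange_id_cs_vrate^" + exchangeID + "-" + strconv.FormatInt(finalVal, 10))
}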
func main() {
exchangeID := "10"
prebidViewability := "19"
exchange_id_cs_vrate(exchangeID, prebidViewability)
}
AppID
The raw bid request sends the hashed_app_id, but we log app_id in the impression_log. If impression.app_id is equal to "N/A", then the hashed_app_id was equal to "0" in the raw bid request. If impression.app_id is different from "N/A", then the hashed_app_id needs to be calculated manually as per the following pseudo-code.
Use Boost Library 1.58
uint32_t m_HashedAppId = 0;
void setHashedAppId(const char* appid)
{
if (appid) {
m_HashedAppId = atoi(appid);
if (m_HashedAppId == 0) {
m_HashedAppId = MM::Utils::pstr_ihash()(appid) % INT_MAX;
}
}
}
struct pstr_ihash
: std::unary_function<const char*, std::size_t>
{
std::size_t operator()(const char* x) const
{
std::size_t seed = 0;
while (*x) {
boost::hash_combine(seed, ::toupper(*x++));
}
return seed;
}
};
Please use the following code to test calcHashedAppId. It takes an app_id and calculates the hashed_app_id to be used in the model; the test cases confirm that your implementation is correct.
#include <iostream>
#include <string>
#include <vector>
#include <functional>
#include <boost/functional/hash.hpp>
#include <cctype>
#include <climits>
#include <cstdlib>
#include <cassert>
#include <cstring>
struct pstr_ihash
: std::unary_function<const char*, std::size_t>
{
std::size_t operator()(const char* x) const
{
std::size_t seed = 0;
while (*x) {
boost::hash_combine(seed, ::toupper(*x++));
}
return seed;
}
};
// pass appId from impressions
// Special case: if the App ID is absent, LLDS Impressions logs N/A for app_id,
// and app_id = N/A maps to hashed_app_id = 0
unsigned int calcHashedAppId(const char* appid)
{
unsigned int m_HashedAppId = 0;
if (appid) {
if (std::strcmp(appid, "N/A") == 0) {
return 0;
}
m_HashedAppId = atoi(appid);
if (m_HashedAppId == 0) {
m_HashedAppId = pstr_ihash()(appid) % INT_MAX;
}
}
return m_HashedAppId;
}
struct testcase {
const char* input;
unsigned int expected_output;
};
int main() {
// Known app_id -> hashed_app_id pairs
std::vector<testcase> tests {
{"com.fivemobile.thescore", 1453566594},
{"605581486", 605581486},
{"tunein.player", 1173358324},
{"com.aws.android", 1903276095},
{"com.document.pdf.scanner.docscan", 1812585910},
{"com.apalon.weatherlive.free", 591449217},
{"com.pandora.android", 1387399900},
{"com.weather.weather", 447752198},
{"de.wetteronline.wetterapp", 1107246225},
{"439873467", 439873467},
{"N/A", 0},
};
for (unsigned int i = 0; i < tests.size(); i++) {
assert(calcHashedAppId(tests[i].input) == tests[i].expected_output);
}
return 0;
}
Base Domain
We recommend using a library from https://publicsuffix.org/learn/ to derive the base_domain from site_url. Below are examples of what we expect.
site_url | base_domain | comment |
---|---|---|
www.yahoo.com | yahoo.com | |
finance.yahoo.com | yahoo.com | |
https://www.sports.yahoo.com | yahoo.com | |
w.main.welcomescreen.aol.com | aol.com | |
bap.navigator.web.de | web.de | |
www.ebay.co.uk | ebay.co.uk | |
www.u.gg | u.gg | |
https://www.u.gg | u.gg | |
https://u.gg | u.gg | |
http://sqlserverbuilds.blogspot.com | sqlserverbuilds.blogspot.com | |
prebidsetup.an.r.appspot.com | prebidsetup.an.r.appspot.com | *.r.appspot.com is a valid suffix |
r.appspot.com | r.appspot.com | appspot.com is a valid suffix |
this.is.a.test.readthedocs.io | test.readthedocs.io | readthedocs.io is a valid suffix |
pythonguidecn.readthedocs.io | pythonguidecn.readthedocs.io | readthedocs.io is a valid suffix |
check-ozmall.global.ssl.fastly.net | check-ozmall.global.ssl.fastly.net | global.ssl.fastly.net is a valid suffix |
test.fastly.net | fastly.net | fastly.net itself is not a public suffix, so it is the base domain |
http://mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | ap-northeast-1.elasticbeanstalk.com is a valid suffix |
thisisatest | - | error: publicsuffix: cannot derive eTLD+1 for domain "thisisatest" |
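Because site_url appears both with and without a scheme in the table above, it may help to normalize it to a bare hostname before the public-suffix lookup. The sketch below covers only that first step using the Go standard library; the `hostFromSiteURL` name is hypothetical. The resulting host would then be passed to a public-suffix library, for example `EffectiveTLDPlusOne` in golang.org/x/net/publicsuffix, which is an assumption about tooling rather than a requirement stated on this page.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostFromSiteURL returns the lowercase hostname of site_url, whether or not
// the value carries a scheme.
func hostFromSiteURL(siteURL string) (string, error) {
	s := strings.TrimSpace(siteURL)
	if !strings.Contains(s, "://") {
		s = "http://" + s // add a placeholder scheme so url.Parse fills in Host
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", err
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return "", fmt.Errorf("no host in site_url %q", siteURL)
	}
	return host, nil
}

func main() {
	for _, s := range []string{"www.yahoo.com", "https://www.u.gg", "http://sqlserverbuilds.blogspot.com"} {
		host, err := hostFromSiteURL(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "->", host)
	}
}
```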
Size Encoding/Decoding
32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height.
// encode_decode_size.go
package main
import (
"errors"
"fmt"
"strconv"
)
func encode(width uint32, height uint32) uint32 {
var size uint32
size = (0xFFFF & height)
size |= (width << 16)
return size
}
func decode(val string) (int, int, error) {
size, err := strconv.ParseUint(val, 10, 32)
if err != nil {
return 0, 0, errors.New("strconv.ParseUint() failed")
}
width := int((size >> 16) & 0xffff)
height := int(size & 0xffff)
return width, height, nil
}
func main() {
fmt.Printf("%d\n", encode(320, 50)) // Will output 20971570
width, height, err := decode("20971570")
if err != nil {
fmt.Println(err)
return
}
fmt.Printf("%dx%d\n", width, height) // Will output 320x50
}