# BYOA Model Features

Custom Brain allows the client to use the BYOA API to upload a set of logistic coefficients corresponding to any of the variables currently in use by the MediaMath Brain.

## Available Features

We accommodate the features listed below. Please refer to the "Data Platform Schemas" list on [MediaMath Support](/apis/log-level-data-service/overview) for the variable type and a description of what these features represent. Only some of the Categorical Types below appear in that list.

### Intercept

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
| --- | --- | --- | --- |
| `__const` | - | - | Represents the intercept weight |

### Mapped Numerical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
| --- | --- | --- | --- |
| bidder_pixel_frequency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
| bidder_pixel_recency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |

### Numerical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
| --- | --- | --- | --- |
| exchange_ctr | Numerical | prebid_historical_ctr | Click Through Rate. If it is -1, we populate it with 0. |
| exchange_vcr | Numerical | prebid_video_completion | Video Completion Rate. If it is -1, we populate it with 0. |
| exchange_viewability_rate | Numerical | prebid_viewability | Viewability Rate. If it is -1, we populate it with 0. |
| id_vintage | Numerical | id_vintage | 0 if the primary MM user id from the bid request is between 0 and 1 week old, 1 if it's between 1 and 18 weeks old, 2 if it's between 18 and 52 weeks old, 3 if it's more than 52 weeks old; values of 999 or higher are also possible. If there is no MM UUID or the incoming request has id_vintage >= 999, the calculation of the response rate uses id_vintage = 0, but LLDS Impressions still logs id_vintage >= 999. |

### Hardcoded Interaction Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
| --- | --- | --- | --- |
| country_id_cs_region_id | Hardcoded interaction | region_id and country_id | country_id is joined to region_id by a dash |
| exchange_id_cs_category_id | Hardcoded interaction | N/A | Read exchange_id and category_id from the impression table. Combination of exchange_id and category_id |
| exchange_id_cs_vcr | Hardcoded interaction | N/A | Read exchange_id and prebid_video_completion from the impression table. Combination of exchange_id and prebid_video_completion |
| exchange_id_cs_ctr | Hardcoded interaction | N/A | Read exchange_id and prebid_historical_ctr from the impression table. Combination of exchange_id and prebid_historical_ctr. We round prebid_historical_ctr to 3 decimal places and trim trailing zeros (prebid_video_completion is rounded to 1 decimal place). If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information. |
| exchange_id_cs_vrate | Hardcoded interaction | N/A | Read exchange_id and prebid_viewability from the impression table. Combination of exchange_id and prebid_viewability. We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120. If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information. |
| exchange_id_cs_site_id | Hardcoded interaction | N/A | Combination of exchange_id and site_id |

### Categorical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
| --- | --- | --- | --- |
| base_domain | Simple categorical | site_url | Extract the effective top-level domain from t1.site_url (see the [Base Domain](#base-domain) notes below) |
| browser | Simple categorical | contextual_data | See WURFL features extraction |
| browser_id | Simple categorical | browser_id | We recommend using the 'browser' and 'browser_version' features instead of this one due to better granularity and accuracy. |
| browser_language_id | Simple categorical | browser_language_id | If browserLanguage is set to 0 in LLDS Impressions, the browser language ID was not sent to BYOA. |
| browser_version | Simple categorical | contextual_data | See WURFL features extraction |
| category_id | Simple categorical | category_id | - |
| channel_type | Simple categorical | channel_type | 1 = Display, 2 = Video, 3 = Social, 4 = mobile display (web), 5 = mobile video (web), 6 = search, 7 = email, 8 = mobile display (in-app), 9 = mobile video (in-app), 10 = Newsfeed (FBX) |
| conn_speed | Simple categorical | conn_speed_id | Read LLDS Impressions.conn_speed_id and store it as conn_speed in the model |
| cookieless | Simple categorical | cross_device_flag | cookieless can be derived from the cross_device_flag field in LLDS Impressions using the following logic: if cross_device_flag = (2 or 3) then cookieless=TRUE else FALSE |
| creative_id | Simple categorical | creative_id | - |
| cross_device | Simple categorical | cross_device_flag | cross_device can be derived from the cross_device_flag field in LLDS Impressions using the following logic: if cross_device_flag = (1 or 3) then cross_device=TRUE else FALSE |
| country_id | Simple categorical | country_id | - |
| device_id | Simple categorical | device_id | We recommend using the 'device_manufacturer', 'device_model' and 'device_type' features instead of this one due to better granularity and accuracy. |
| device_manufacturer | Simple categorical | contextual_data | See WURFL features extraction |
| device_model | Simple categorical | contextual_data | See WURFL features extraction |
| device_type | Simple categorical | contextual_data | See WURFL features extraction |
| day_of_week | Simple categorical | timestamp_gmt | 0 = Sunday ... 6 = Saturday |
| day_part | Simple categorical | timestamp_gmt | The day part is based on when the impression was served and the user's timezone. 0 = 12AM to 5:59AM; 1 = 6AM to 11:59AM; 2 = 12PM to 5:59PM; 3 = 6PM to 11:59PM |
| deal_id | Simple categorical | deal_id | - |
| dma_id | Simple categorical | dma_id | - |
| exchange_id | Simple categorical | exchange_id | - |
| fold_position | Simple categorical | fold_position | 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
| hashed_app_id | Simple categorical | app_id | Read LLDS Impressions.app_id and calculate hashed_app_id; see the 'AppID' section below. |
| isp_id | Simple categorical | isp_id | - |
| interstitial | Simple categorical | interstitial | - |
| num_device_ids | Simple categorical | | If t1.num_device_ids is null or 0, use 0; if it is 1, use 1; if it is greater than 1, use 2 |
| os | Simple categorical | contextual_data | See 'Device Data' below |
| os_id | Simple categorical | os_id | We recommend using the 'os' and 'os_version' fields instead of this one due to better granularity and accuracy. |
| os_version | Simple categorical | contextual_data | See WURFL features extraction |
| pixel | Simple categorical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
| region_id | Simple categorical | region_id | - |
| size | Simple categorical | size | 32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height; see 'Size Encoding/Decoding' section below |
| site_id | Simple categorical | site_id | - |
| user_frequency | Simple categorical | user_frequency | Refers to 'Session Frequency' on the T1 Knowledge Base page |
| video_placement_type | Simple categorical | video_placement_type | Video placement type. 1: In-stream; 2: In-banner; 3: In-article; 4: In-feed; 5: Interstitial/slider/floating |
| video_skippability | Simple categorical | video_skippability | Flag to identify video skippability; 0: non-skippable; 1: skippable; null: unknown |
| week_part | Simple categorical | timestamp_gmt | The week part is based on when the impression was served and the user's timezone. 0 = weekday; 1 = weekend |

### Audience Data

Three audience-based features in our model are derived from the overlapped_brain_pixel_selections field in LLDS Impressions: pixel, bidder_pixel_frequency, and bidder_pixel_recency.

For context, overlapped_brain_pixel_selections is a pipe-delimited (`|`) list of tuples that contain segment membership information. Each tuple is of the format `mm:px1:r1:f1`; the components of the tuple are separated by colons and can be interpreted as follows:

"mm" - the namespace of the pixel.

"px1" - the pixel_id of the audience segment. In the logistic model, this is converted into a binary field indicating whether the user is in this segment.

"r1" - the recency, or amount of time that has elapsed since the user was added to px1. In the logistic model, this is converted into a mapped numerical field whose value is equal to 1440.0/r1. By way of background, 1440.0/recency simply converts the recency value, which is denominated in minutes, to its inverse measured in days (there are 1,440 minutes in a day). The result is clamped as follows:

```go
// Input: recencyMinutes
recencyDays := math.Max(recencyMinutes/1440.0, 1.0)
// math.Min(200, recencyDays) limits recencyDaysFn to the range 0.005...1
recencyDaysFn := 1.0 / math.Min(200, recencyDays)
```

If the recency is zero (perhaps because recency data is not available for that audience segment), the corresponding map entry for recency does not exist; that is, we do not allow division by zero.
"f1" - the frequency, or amount of time that has elapsed, since the user was added to px1. In the logistic model, this is converted into a mapped numerical field whose value is simply equal to f1. ```bash If frequency > 200 frequency will be set to 200. ``` ### Device Data Model features related to a device, including browser, browser_version, os, os_version, device_manufacturer, device_model, device_type are derived from the contextual_data field in LLDS Impressions. For example ```json { "24": { "1": { "targeted": [], "untargeted": [ "br_Chrome:ve_60.0.3112" ] } }, "25": { "1": { "targeted": [], "untargeted": [ "os_Windows:ve_10.0.0" ] } }, "26": { "1": { "targeted": [], "untargeted": [ "fo_Desktop" ] } }, "27": { "1": { "targeted": [], "untargeted": [ "ma_Desktop Make:mo_Desktop Model" ] } }, "28": { "1": {}, "2": {}, "3": {} } } ``` Would be read as ```ini browser = "br_Chrome" browser_version = "br_Chrome:ve_60.0.3112" os = "os_Windows" os_version = "os_Windows:ve_10.0.0" device_model = "ma_Desktop Make:mo_Desktop Model" device_manufacturer = "ma_Desktop Make" device_type = "fo_Desktop" ``` We include browser name in browser version (i.e. we prepend "br" in "br_Chrome:vs60.0.3112") because two different browsers could have the same version. The same logic holds for os_version and device_model. ### Hardcoded Interactions For exchange_id_cs_site_id: Formed by appending exchange id with other half of feature value, e.g. ExchangeID = 4 and site_id = 100. We will lookup `exchange_id_cs_site_id^4-100` in features -> weights. 

#### For exchange_id_cs_vcr

```go
package main

import (
	"fmt"
	"strconv"
)

// Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
	flval, err := strconv.ParseFloat(input, 64)
	if err != nil {
		return "", err
	}
	if flval < 0 {
		return "", nil
	}
	if numberDec == 3 && flval > 0 {
		return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
	}
	return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
	res, err := Round(input, numberDec)
	if err != nil {
		return "", err
	}
	return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
	count := 0
	for i := len(input) - 1; i > 0; i-- {
		if input[i] == cut && input[i-1] != '.' {
			count++
		} else {
			break
		}
	}
	return input[0 : len(input)-count]
}

func exchange_id_cs_vcr(exchangeID, videoCompletion string) {
	vcRounded, _ := RoundAndTrim(videoCompletion, 1)
	fmt.Println("exchange_id_cs_vcr^" + exchangeID + "-" + vcRounded)
}

func main() {
	exchangeID := "10"
	prebidVideoCompletion := "20"
	exchange_id_cs_vcr(exchangeID, prebidVideoCompletion)
}
```

#### For exchange_id_cs_ctr

```go
package main

import (
	"fmt"
	"strconv"
)

// Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
	flval, err := strconv.ParseFloat(input, 64)
	if err != nil {
		return "", err
	}
	if flval < 0 {
		return "", nil
	}
	if numberDec == 3 && flval > 0 {
		return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
	}
	return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
	res, err := Round(input, numberDec)
	if err != nil {
		return "", err
	}
	return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding.
// E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
	count := 0
	for i := len(input) - 1; i > 0; i-- {
		if input[i] == cut && input[i-1] != '.' {
			count++
		} else {
			break
		}
	}
	return input[0 : len(input)-count]
}

func exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr string) {
	ctRounded, _ := RoundAndTrim(prebidHistoricalCtr, 3)
	fmt.Println("exchange_id_cs_ctr^" + exchangeID + "-" + ctRounded)
}

func main() {
	exchangeID := "10"
	prebidHistoricalCtr := "0.002"
	exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr)
}
```

#### For exchange_id_cs_vrate

We round `prebid_viewability` down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

func exchange_id_cs_vrate(exchangeID, prebidViewability string) {
	vr, _ := strconv.ParseFloat(prebidViewability, 64)
	// Round down to the nearest multiple of 10.
	finalVal := int64(math.Floor(vr/10)) * 10
	fmt.Println("exchange_id_cs_vrate^" + exchangeID + "-" + strconv.FormatInt(finalVal, 10))
}

func main() {
	exchangeID := "10"
	prebidViewability := "19"
	exchange_id_cs_vrate(exchangeID, prebidViewability)
}
```

### AppID

The raw bid request sends the `hashed_app_id`, but we log `app_id` in the impression_log. If `impression.app_id` is equal to "N/A", then the `hashed_app_id` was equal to "0" in the raw bid request. If `impression.app_id` is different from "N/A", then the `hashed_app_id` needs to be calculated manually as per the following pseudo-code.

```cpp
// Use Boost Library 1.58
uint32_t m_HashedAppId = 0;

void setHashedAppId(const char* appid) {
    if (appid) {
        // Numeric app IDs are used as-is; anything else is hashed.
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = MM::Utils::pstr_ihash()(appid) % INT_MAX;
        }
    }
}

// Case-insensitive hash over a C string.
struct pstr_ihash : std::unary_function<const char*, std::size_t> {
    std::size_t operator()(const char* x) const {
        std::size_t seed = 0;
        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};
```

Please use the following code to test `calcHashedAppId`.
It takes an app_id and calculates the hashed_app_id to be used in the model; the test cases below help confirm that your implementation is correct.

```cpp
#include <boost/functional/hash.hpp>
#include <cassert>
#include <cctype>
#include <climits>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <vector>

// Case-insensitive hash over a C string.
struct pstr_ihash : std::unary_function<const char*, std::size_t> {
    std::size_t operator()(const char* x) const {
        std::size_t seed = 0;
        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};

// Pass appId from impressions.
// Special case: if App ID is absent, LLDS Impressions logs N/A for app_id;
// if app_id = N/A -> hashed_app_id = 0
unsigned int calcHashedAppId(const char* appid) {
    unsigned int m_HashedAppId = 0;
    if (appid) {
        if (std::strcmp(appid, "N/A") == 0) {
            return 0;
        }
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = pstr_ihash()(appid) % INT_MAX;
        }
    }
    return m_HashedAppId;
}

struct testcase {
    const char* input;
    unsigned int expected_output;
};

int main() {
    std::vector<testcase> tests {
        {"com.fivemobile.thescore", 1453566594},
        {"605581486", 605581486},
        {"tunein.player", 1173358324},
        {"com.aws.android", 1903276095},
        {"com.document.pdf.scanner.docscan", 1812585910},
        {"com.apalon.weatherlive.free", 591449217},
        {"com.pandora.android", 1387399900},
        {"com.weather.weather", 447752198},
        {"de.wetteronline.wetterapp", 1107246225},
        {"439873467", 439873467},
        {"N/A", 0},
    };
    for (unsigned int i = 0; i < tests.size(); i++) {
        assert(calcHashedAppId(tests[i].input) == tests[i].expected_output);
    }
    return 0;
}
```

### Base Domain

We recommend using a library from https://publicsuffix.org/learn/ to derive the `base_domain` from `site_url`. Below is an example of what we expect.

| site_url | base_domain | comment |
| --- | --- | --- |
| www.yahoo.com | yahoo.com | |
| finance.yahoo.com | yahoo.com | |
| https://www.sports.yahoo.com | yahoo.com | |
| w.main.welcomescreen.aol.com | aol.com | |
| bap.navigator.web.de | web.de | |
| www.ebay.co.uk | ebay.co.uk | |
| www.u.gg | u.gg | |
| https://www.u.gg | u.gg | |
| https://u.gg | u.gg | |
| http://sqlserverbuilds.blogspot.com | sqlserverbuilds.blogspot.com | |
| prebidsetup.an.r.appspot.com | prebidsetup.an.r.appspot.com | *.r.appspot.com is a valid suffix |
| r.appspot.com | r.appspot.com | appspot.com is a valid suffix |
| this.is.a.test.readthedocs.io | test.readthedocs.io | readthedocs.io is a valid suffix |
| pythonguidecn.readthedocs.io | pythonguidecn.readthedocs.io | readthedocs.io is a valid suffix |
| check-ozmall.global.ssl.fastly.net | check-ozmall.global.ssl.fastly.net | global.ssl.fastly.net is a valid suffix |
| test.fastly.net | fastly.net | fastly.net is a valid suffix |
| http://mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | ap-northeast-1.elasticbeanstalk.com is a valid suffix |
| thisisatest | | error: publicsuffix: cannot derive eTLD+1 for domain "thisisatest" |

### Size Encoding/Decoding

32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height.

```go
// encode_decode_size.go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

func encode(width uint32, height uint32) uint32 {
	var size uint32
	size = (0xFFFF & height)
	size |= (width << 16)
	return size
}

func decode(val string) (int, int, error) {
	size, err := strconv.ParseUint(val, 10, 32)
	if err != nil {
		return 0, 0, errors.New("strconv.ParseUint() failed")
	}
	width := int((size >> 16) & 0xffff)
	height := int(size & 0xffff)
	return width, height, nil
}

func main() {
	fmt.Printf("%d\n", encode(320, 50)) // Will output 20971570
	width, height, err := decode("20971570")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%dx%d\n", width, height) // Will output 320x50
}
```
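
Returning to the 'Audience Data' section: the rules for turning `overlapped_brain_pixel_selections` into the pixel, bidder_pixel_recency, and bidder_pixel_frequency values can be sketched in the same style. This is a minimal sketch under stated assumptions, not the production parser: the `segment` type, the `parseSelections` function, and the sample pixel IDs are hypothetical, and the tuple layout is taken to be exactly `mm:px1:r1:f1` as described in that section.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// segment holds the model-side values derived from one "mm:px:r:f" tuple.
// The type and field names are illustrative only.
type segment struct {
	Pixel      string  // pixel_id; its presence is the binary pixel feature
	Recency    float64 // 1 / min(200, max(recencyMinutes/1440, 1))
	HasRecency bool    // false when the logged recency is zero
	Frequency  float64 // f1, capped at 200
}

// parseSelections splits the pipe-delimited field into tuples and applies
// the recency and frequency rules from the Audience Data section.
func parseSelections(raw string) ([]segment, error) {
	var out []segment
	for _, tuple := range strings.Split(raw, "|") {
		if tuple == "" {
			continue // tolerate an empty field or a trailing pipe
		}
		parts := strings.Split(tuple, ":")
		if len(parts) != 4 {
			return nil, fmt.Errorf("malformed tuple %q", tuple)
		}
		recencyMinutes, err := strconv.ParseFloat(parts[2], 64)
		if err != nil {
			return nil, err
		}
		frequency, err := strconv.ParseFloat(parts[3], 64)
		if err != nil {
			return nil, err
		}
		seg := segment{Pixel: parts[1], Frequency: math.Min(frequency, 200)}
		if recencyMinutes > 0 { // zero recency: no recency entry is created
			recencyDays := math.Max(recencyMinutes/1440.0, 1.0)
			seg.Recency = 1.0 / math.Min(200, recencyDays)
			seg.HasRecency = true
		}
		out = append(out, seg)
	}
	return out, nil
}

func main() {
	// Made-up sample: pixel 1001 added two days ago with frequency 3,
	// pixel 1002 with no recency data and a frequency above the cap.
	segs, err := parseSelections("mm:1001:2880:3|mm:1002:0:450")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, s := range segs {
		fmt.Printf("pixel=%s recency=%v hasRecency=%t frequency=%v\n",
			s.Pixel, s.Recency, s.HasRecency, s.Frequency)
	}
}
```

In this sample, the first tuple yields a recency value of 0.5 (two days) and the second has no recency entry and a frequency capped at 200. Whether the namespace should form part of the pixel feature key is not specified in the section, so the sketch keeps only the pixel_id.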