# BYOA Model Features

Custom Brain allows the client to use the BYOA API to upload a set of logistic coefficients corresponding to any of the variables currently in use by the MediaMath Brain.

## Available Features

We accommodate the features listed below. Please refer to the "Data Platform Schemas" list on [MediaMath Support](/apis/log-level-data-service/overview) for the variable type and a description of what each feature represents. Only some of the Categorical Types below appear in that list.

### Intercept

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
|  --- | --- | --- | --- |
| `__const` | - | - | represents the Intercept weight |


### Mapped Numerical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
|  --- | --- | --- | --- |
| bidder_pixel_frequency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
| bidder_pixel_recency | Mapped numerical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |


### Numerical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
|  --- | --- | --- | --- |
| exchange_ctr | Numerical | prebid_historical_ctr | Click Through Rate. If it is -1, we populate it with 0. |
| exchange_vcr | Numerical | prebid_video_completion | Video Completion Rate. If it is -1, we populate it with 0. |
| exchange_viewability_rate | Numerical | prebid_viewability | Viewability Rate. If it is -1, we populate it with 0. |
| id_vintage | Numerical | id_vintage | Age bucket of the primary MM user ID from the bid request: 0 if it is between 0 and 1 week old, 1 if between 1 and 18 weeks old, 2 if between 18 and 52 weeks old, 3 if more than 52 weeks old. Values of 999 or higher can also occur. If there is no MM UUID or the incoming request has id_vintage >= 999, the calculation of the response rate uses id_vintage = 0, but LLDS Impressions still logs id_vintage >= 999. |

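The substitution rules in the Numerical Types table can be sketched as two small helpers. The function names are hypothetical, and the sketch assumes the rate fields arrive as floats and id_vintage as an integer.

```go
package main

import "fmt"

// normalizeRate applies the rule from the table for exchange_ctr,
// exchange_vcr, and exchange_viewability_rate: -1 is replaced with 0.
func normalizeRate(v float64) float64 {
	if v == -1 {
		return 0
	}
	return v
}

// modelIDVintage returns the id_vintage used when scoring:
// logged values of 999 or higher are treated as 0.
func modelIDVintage(logged int) int {
	if logged >= 999 {
		return 0
	}
	return logged
}

func main() {
	fmt.Println(normalizeRate(-1), normalizeRate(0.35))
	fmt.Println(modelIDVintage(2), modelIDVintage(999))
}
```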

### Hardcoded Interaction Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
|  --- | --- | --- | --- |
| country_id_cs_region_id | Hardcoded interaction | region_id and country_id | country_id joins region_id by dash |
| exchange_id_cs_category_id | Hardcoded interaction | N/A | Read exchange_id and category_id from impression table. Combination of exchange_id and category_id |
| exchange_id_cs_vcr | Hardcoded interaction | N/A | Read exchange_id and prebid_video_completion from impression table. Combination of exchange_id and prebid_video_completion, with prebid_video_completion rounded to one decimal place. See the 'Hardcoded Interactions' section below for more information. |
| exchange_id_cs_ctr | Hardcoded interaction | N/A | Read exchange_id and prebid_historical_ctr from impression table. Combination of exchange_id and prebid_historical_ctr, with prebid_historical_ctr rounded to three decimal places and trailing zeros trimmed. If there is no record for this section, it will record -1. See the 'Hardcoded Interactions' section below for more information. |
| exchange_id_cs_vrate | Hardcoded interaction | N/A | Read exchange_id and prebid_viewability from impression table. Combination of exchange_id and prebid_viewability. We round prebid_viewability down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120. If there is no record for this section, it will record -1. See the rest of 'Hardcoded Interactions' for more information. |
| exchange_id_cs_site_id | Hardcoded interaction | N/A | Combination of exchange_id and site_id |


### Categorical Types

| Feature Name | Type | Corresponding field(s) in LLDS Impressions | Notes |
|  --- | --- | --- | --- |
| base_domain | Simple categorical | site_url | Extract effective top-level domain from t1.site_url (see Base Domain notes below [here](#base-domain)) |
| browser | Simple categorical | contextual_data | See WURFL features extraction |
| browser_id | Simple categorical | browser_id | We recommend using the 'browser' and 'browser version' features instead of this one due to better granularity and accuracy. |
| browser_language_id | Simple categorical | browser_language_id | BrowserLanguageID: If browserLanguage is set to 0 in LLDS Impressions, the browserLanguage id was not sent to BYOA. |
| browser_version | Simple categorical | contextual_data | See WURFL features extraction |
| category_id | Simple categorical | category_id | - |
| channel_type | Simple categorical | channel_type | 1 = Display, 2 = Video, 3 = Social, 4 = mobile display (web), 5 = mobile video (web), 6 = search, 7 = email, 8 = mobile display (in-app), 9 = mobile video (in-app), 10 = Newsfeed (FBX) |
| conn_speed | Simple categorical | conn_speed_id | read LLDS Impressions.conn_speed_id and store as conn_speed in the model |
| cookieless | Simple categorical | cross_device_flag | cookieless can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (2 or 3) then cookieless=TRUE else FALSE |
| creative_id | Simple categorical | creative_id | - |
| cross_device | Simple categorical | cross_device_flag | cross_device can be derived from the cross_device_flag field in LLDS Impressions using the following logic: If cross_device_flag = (1 or 3) then cross_device=TRUE else FALSE |
| country_id | Simple categorical | country_id | - |
| device_id | Simple categorical | device_id | We recommend using ‘device_manufacturer’, ‘device_model’ and ‘device_type’ features instead of this one due to better granularity and accuracy. |
| device_manufacturer | Simple categorical | contextual_data | See WURFL features extraction |
| device_model | Simple categorical | contextual_data | See WURFL features extraction |
| device_type | Simple categorical | contextual_data | See WURFL features extraction |
| day_of_week | Simple categorical | timestamp_gmt | 0 = Sunday ... 6 = Saturday |
| day_part | Simple categorical | timestamp_gmt | The day part is based on when the impression was served and the user’s timezone. 0 = 12AM to 5:59AM; 1 = 6AM to 11:59AM; 2 = 12PM to 5:59PM; 3 = 6PM to 11:59PM |
| deal_id | Simple categorical | deal_id | - |
| dma_id | Simple categorical | dma_id | - |
| exchange_id | Simple categorical | exchange_id | - |
| fold_position | Simple categorical | fold_position | 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
| hashed_app_id | Simple categorical | read LLDS Impressions. app_id -> calculate hashed_app_id | See App ID notes. |
| isp_id | Simple categorical | isp_id | - |
| interstitial | Simple categorical | interstitial | - |
| num_device_ids | Simple categorical |  | If t1.num_device_ids is null or 0, the value is 0; if it is 1, the value is 1; if it is greater than 1, the value is 2. |
| os | Simple categorical | contextual_data | See 'Device Data' section below. |
| os_id | Simple categorical | os_id | We recommend using the 'os' and 'os_version' fields instead of this one due to better granularity and accuracy. |
| os_version | Simple categorical | contextual_data | See WURFL features extraction |
| pixel | Simple categorical | overlapped_brain_pixel_selections | See 'Audience Data' section below. |
| region_id | Simple categorical | region_id | - |
| size | Simple categorical | size | 32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height, see 'Size Encoding/Decoding' section below |
| site_id | Simple categorical | site_id | - |
| user_frequency | Simple categorical | user_frequency | Refers to 'Session Frequency' on T1 Knowledge Base page |
| video_placement_type | Simple categorical | video_placement_type | Video placement type. 1: In-stream; 2: In-banner; 3: In-article; 4: In-feed; 5: Interstitial/slider/floating |
| video_skippability | Simple categorical | video_skippability | Flag to identify video skippability. 0: non-skippable; 1: skippable; null: unknown |
| week_part | Simple categorical | timestamp_gmt | The week part is based on when the impression was served and the user’s timezone. 0 = weekday; 1 = weekend |

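The derived flags and the num_device_ids bucketing in the table above can be sketched as follows. The helper names are hypothetical, cookieless is taken to be TRUE for flag values 2 or 3, and a nil pointer stands in for a null value.

```go
package main

import "fmt"

// crossDevice is true when cross_device_flag is 1 or 3.
func crossDevice(flag int) bool { return flag == 1 || flag == 3 }

// cookieless is true when cross_device_flag is 2 or 3.
func cookieless(flag int) bool { return flag == 2 || flag == 3 }

// bucketNumDeviceIDs maps a nullable count to 0, 1, or 2.
// Negative counts are not described in the table; this sketch treats them as 0.
func bucketNumDeviceIDs(n *int) int {
	switch {
	case n == nil || *n <= 0:
		return 0
	case *n == 1:
		return 1
	default:
		return 2
	}
}

func main() {
	for flag := 0; flag <= 3; flag++ {
		fmt.Println(flag, crossDevice(flag), cookieless(flag))
	}
	seven := 7
	fmt.Println(bucketNumDeviceIDs(nil), bucketNumDeviceIDs(&seven))
}
```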

### Audience Data

Three audience-based features in our model are derived from the overlapped_brain_pixel_selections field in LLDS Impressions: pixel, bidder_pixel_frequency, and bidder_pixel_recency. Entries in the field are separated by the | character.

For context, overlapped_brain_pixel_selections is a pipe-delimited list of tuples that contain segment membership information. Each tuple has the format `mm:px1:r1:f1`; the components of the tuple are separated by colons and can be interpreted as follows:

"mm" - the namespace of the pixel

"px1" - the pixel_id of the audience segment. In the logistic model, this is converted into a binary field indicating whether the user is in this segment.

"r1" - the recency, or amount of time that has elapsed since the user was added to px1. In the logistic model, this is converted into a mapped numerical field whose value is equal to 1440.0/r1. By way of background, 1440.0/recency converts the recency value, which is denominated in minutes, to its inverse measured in days (there are 1,440 minutes in a day). The result is bounded as follows:


```go
// Input: recencyMinutes (r1, in minutes)

// Convert minutes to days, with a floor of one day.
recencyDays := math.Max(recencyMinutes/1440.0, 1.0)

// Cap recencyDays at 200 so that recencyDaysFn stays in the range 0.005...1.
recencyDaysFn := 1.0 / math.Min(200, recencyDays)
```

If the recency is zero (perhaps because recency data is not available for that audience segment), the corresponding map entry for recency does not exist. In other words, we never divide by zero.

"f1" - the frequency recorded for the user on px1. In the logistic model, this is converted into a mapped numerical field whose value is simply equal to f1.


If frequency > 200, frequency is set to 200.

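Putting the tuple format, the recency transform, and the frequency cap together, a minimal parsing sketch might look like the following. The `audienceFeature` type and `parseSelections` function are hypothetical names, and skipping malformed tuples is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// audienceFeature holds the model inputs derived from one mm:px:r:f tuple.
type audienceFeature struct {
	PixelID    string
	Recency    float64 // bidder_pixel_recency value
	HasRecency bool    // false when r is 0, so no recency entry exists
	Frequency  float64 // bidder_pixel_frequency value, capped at 200
}

// parseSelections splits the pipe-delimited field and converts each tuple.
// Malformed tuples are skipped (an assumption of this sketch).
func parseSelections(field string) []audienceFeature {
	var out []audienceFeature
	for _, tuple := range strings.Split(field, "|") {
		parts := strings.Split(tuple, ":")
		if len(parts) != 4 {
			continue
		}
		r, errR := strconv.ParseFloat(parts[2], 64)
		f, errF := strconv.ParseFloat(parts[3], 64)
		if errR != nil || errF != nil {
			continue
		}
		feat := audienceFeature{PixelID: parts[1], Frequency: math.Min(f, 200)}
		if r > 0 {
			recencyDays := math.Max(r/1440.0, 1.0)
			feat.Recency = 1.0 / math.Min(200, recencyDays)
			feat.HasRecency = true
		}
		out = append(out, feat)
	}
	return out
}

func main() {
	// Hypothetical field value with two tuples.
	for _, f := range parseSelections("mm:1001:2880:3|mm:1002:0:250") {
		fmt.Printf("%+v\n", f)
	}
}
```

In this example the first tuple has a recency of 2,880 minutes (two days), and the second has no recency entry and a frequency above the cap.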
### Device Data

Model features related to a device (browser, browser_version, os, os_version, device_manufacturer, device_model, and device_type) are derived from the contextual_data field in LLDS Impressions. Consider this example:


```json
{
  "24": {
    "1": {
      "targeted": [],
      "untargeted": [
        "br_Chrome:ve_60.0.3112"
      ]
    }
  },
  "25": {
    "1": {
      "targeted": [],
      "untargeted": [
        "os_Windows:ve_10.0.0"
      ]
    }
  },
  "26": {
    "1": {
      "targeted": [],
      "untargeted": [
        "fo_Desktop"
      ]
    }
  },
  "27": {
    "1": {
      "targeted": [],
      "untargeted": [
        "ma_Desktop Make:mo_Desktop Model"
      ]
    }
  },
  "28": {
    "1": {},
    "2": {},
    "3": {}
  }
}
```

This would be read as:


```ini
browser = "br_Chrome"
browser_version = "br_Chrome:ve_60.0.3112"
os = "os_Windows"
os_version = "os_Windows:ve_10.0.0"
device_model = "ma_Desktop Make:mo_Desktop Model"
device_manufacturer = "ma_Desktop Make"
device_type = "fo_Desktop"
```

We include the browser name in the browser version (i.e. we keep the "br_Chrome" prefix in "br_Chrome:ve_60.0.3112") because two different browsers could have the same version. The same logic holds for os_version and device_model.

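The mapping from contextual_data to device features can be sketched as below. This assumes the values can be dispatched on their prefixes (`br_`, `os_`, `fo_`, `ma_`) and that both the `targeted` and `untargeted` lists are read; the `entry` type and `deviceFeatures` function are hypothetical names, not part of any MediaMath library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// entry mirrors the innermost objects of contextual_data.
type entry struct {
	Targeted   []string `json:"targeted"`
	Untargeted []string `json:"untargeted"`
}

// deviceFeatures extracts the device-related model features from the raw
// contextual_data JSON, dispatching on each value's prefix.
func deviceFeatures(raw string) (map[string]string, error) {
	var data map[string]map[string]entry
	if err := json.Unmarshal([]byte(raw), &data); err != nil {
		return nil, err
	}
	out := map[string]string{}
	for _, group := range data {
		for _, e := range group {
			values := append(append([]string{}, e.Targeted...), e.Untargeted...)
			for _, v := range values {
				head := strings.SplitN(v, ":", 2)[0] // text before the first colon
				switch {
				case strings.HasPrefix(v, "br_"):
					out["browser"], out["browser_version"] = head, v
				case strings.HasPrefix(v, "os_"):
					out["os"], out["os_version"] = head, v
				case strings.HasPrefix(v, "ma_"):
					out["device_manufacturer"], out["device_model"] = head, v
				case strings.HasPrefix(v, "fo_"):
					out["device_type"] = v
				}
			}
		}
	}
	return out, nil
}

func main() {
	raw := `{"24":{"1":{"targeted":[],"untargeted":["br_Chrome:ve_60.0.3112"]}},"28":{"1":{}}}`
	features, err := deviceFeatures(raw)
	fmt.Println(features, err)
}
```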
### Hardcoded Interactions

#### For exchange_id_cs_site_id

The lookup key is formed by joining the exchange ID and the other half of the feature value with a dash. For example, with exchange_id = 4 and site_id = 100, we look up `exchange_id_cs_site_id^4-100` in the features -> weights map.

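The key construction for exchange_id_cs_site_id can be sketched in a few lines. The `interactionKey` helper and the `weights` map are hypothetical; only the `exchange_id_cs_site_id^4-100` key format comes from the description above.

```go
package main

import "fmt"

// interactionKey builds a lookup key of the form feature^left-right.
func interactionKey(feature, left, right string) string {
	return feature + "^" + left + "-" + right
}

func main() {
	// Hypothetical weights table keyed by feature key.
	weights := map[string]float64{"exchange_id_cs_site_id^4-100": 0.12}

	key := interactionKey("exchange_id_cs_site_id", "4", "100")
	if w, ok := weights[key]; ok {
		fmt.Println(key, w)
	} else {
		fmt.Println(key, "not in model")
	}
}
```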
#### For exchange_id_cs_vcr


```go
package main

import (
    "fmt"
    "strconv"
)

//Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
    flval, err := strconv.ParseFloat(input, 64)
    if err != nil {
        return "", err
    }
    if flval < 0 {
        return "", nil
    }
    if numberDec == 3 && flval > 0 {
        return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
    }
    return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
    res, err := Round(input, numberDec)
    if err != nil {
        return "", err
    }
    return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
    count := 0
    for i := len(input) - 1; i > 0; i-- {
        if input[i] == cut && input[i-1] != '.' {
            count++
        } else {
            break
        }
    }

    return input[0 : len(input)-count]
}

func exchange_id_cs_vcr(exchangeID, videoCompletion string) {
    vcRounded, _ := RoundAndTrim(videoCompletion, 1)
    fmt.Println("exchange_id_cs_vcr^" + exchangeID + "-" + vcRounded)
}

func main() {
    exchangeID := "10"
    prebidVideoCompletion := "20"
    exchange_id_cs_vcr(exchangeID, prebidVideoCompletion)
}
```

#### For exchange_id_cs_ctr


```go
package main

import (
    "fmt"
    "strconv"
)

//Round is a bit slower but easier to read
func Round(input string, numberDec int) (string, error) {
    flval, err := strconv.ParseFloat(input, 64)
    if err != nil {
        return "", err
    }
    if flval < 0 {
        return "", nil
    }
    if numberDec == 3 && flval > 0 {
        return TrimRight(fmt.Sprintf("%.3f", flval), '0'), nil
    }
    return fmt.Sprintf("%.1f", flval), nil
}

// RoundAndTrim Rounds and Trims
func RoundAndTrim(input string, numberDec int) (string, error) {
    res, err := Round(input, numberDec)
    if err != nil {
        return "", err
    }
    return TrimRight(res, '0'), nil
}

// TrimRight removes zero padding. E.g.,
// 10.000 -> 10.0
// 10.100 -> 10.1
// 10.120 -> 10.12
func TrimRight(input string, cut byte) string {
    count := 0
    for i := len(input) - 1; i > 0; i-- {
        if input[i] == cut && input[i-1] != '.' {
            count++
        } else {
            break
        }
    }
    return input[0 : len(input)-count]
}

func exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr string) {
    ctRounded, _ := RoundAndTrim(prebidHistoricalCtr, 3)
    fmt.Println("exchange_id_cs_ctr^" + exchangeID + "-" + ctRounded)
}

func main() {
    exchangeID := "10"
    prebidHistoricalCtr := "0.002"
    exchange_id_cs_ctr(exchangeID, prebidHistoricalCtr)
}
```

#### For exchange_id_cs_vrate

We round `prebid_viewability` down to the nearest multiple of 10. For example, 120, 121, and 129 all become 120.


```go
package main

import (
    "fmt"
    "math"
    "strconv"
)

func exchange_id_cs_vrate(exchangeID, prebidViewability string) {
    vr, _ := strconv.ParseFloat(prebidViewability, 64)
    finalVal := int64(math.Floor(vr/10)) * 10
    fmt.Println("exchange_id_cs_vrate^" + exchangeID + "-" + strconv.FormatInt(finalVal, 10))
}

func main() {
    exchangeID := "10"
    prebidViewability := "19"
    exchange_id_cs_vrate(exchangeID, prebidViewability)
}
```

### AppID

The raw bid request sends the `hashed_app_id` but we log `app_id` in the impression_log. If the `impression.app_id` is equal to "N/A" then the `hashed_app_id` was equal to "0" in the raw bid request. If the `impression.app_id` is different from "N/A" then the `hashed_app_id` needs to be calculated manually as per the following pseudo-code.


```cpp
// Use Boost Library 1.58

uint32_t m_HashedAppId = 0;

void setHashedAppId(const char* appid)
{
    if (appid) {
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = MM::Utils::pstr_ihash()(appid) % INT_MAX;
        }
    }
}

struct pstr_ihash
    : std::unary_function<const char*, std::size_t>
{
    std::size_t operator()(const char* x) const
    {
        std::size_t seed = 0;

        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};
```

Use the following code to test `calcHashedAppId`. It takes an app_id and calculates the hashed_app_id to be used in the model; the test cases confirm that the implementation is correct.


```cpp
#include <iostream>
#include <string>
#include <vector>
#include <functional>
#include <boost/functional/hash.hpp>
#include <climits>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cstring>

struct pstr_ihash
    : std::unary_function<const char*, std::size_t>
{
    std::size_t operator()(const char* x) const
    {
        std::size_t seed = 0;

        while (*x) {
            boost::hash_combine(seed, ::toupper(*x++));
        }
        return seed;
    }
};

// Pass app_id from LLDS Impressions.
// Special case: if the app ID is absent, LLDS Impressions logs "N/A" for app_id,
// and hashed_app_id is 0.
unsigned int calcHashedAppId(const char* appid)
{
    unsigned int m_HashedAppId = 0;
    if (appid) {
        if (std::strcmp(appid, "N/A") == 0) {
            return 0;
        }
        m_HashedAppId = atoi(appid);
        if (m_HashedAppId == 0) {
            m_HashedAppId = pstr_ihash()(appid) % INT_MAX;
        }
    }
    return m_HashedAppId;
}

struct testcase {
    const char* input;
    unsigned int expected_output;
};
  
int main() {
   std::vector<testcase> tests {
      {"com.fivemobile.thescore", 1453566594},
      {"605581486", 605581486},
      {"tunein.player", 1173358324},
      {"com.aws.android", 1903276095},
      {"com.document.pdf.scanner.docscan", 1812585910},
      {"com.apalon.weatherlive.free", 591449217},
      {"com.pandora.android", 1387399900},
      {"com.weather.weather", 447752198},
      {"de.wetteronline.wetterapp", 1107246225},
      {"439873467", 439873467},
      {"N/A", 0},
   };

   for (unsigned int i = 0; i < tests.size(); i++) {
      assert(calcHashedAppId(tests[i].input) == tests[i].expected_output);
   }

   return 0;
}
```

### Base Domain

We recommend using a library from https://publicsuffix.org/learn/ to derive the `base_domain` from `site_url`. Below are examples of what we expect.

| site_url | base_domain | comment |
|  --- | --- | --- |
| www.yahoo.com | yahoo.com |  |
| finance.yahoo.com | yahoo.com |  |
| https://www.sports.yahoo.com | yahoo.com |  |
| w.main.welcomescreen.aol.com | aol.com |  |
| bap.navigator.web.de | web.de |  |
| www.ebay.co.uk | ebay.co.uk |  |
| www.u.gg | u.gg |  |
| https://www.u.gg | u.gg |  |
| https://u.gg | u.gg |  |
| http://sqlserverbuilds.blogspot.com | sqlserverbuilds.blogspot.com |  |
| prebidsetup.an.r.appspot.com | prebidsetup.an.r.appspot.com | *.r.appspot.com is a valid suffix |
| r.appspot.com | r.appspot.com | appspot.com is a valid suffix |
| this.is.a.test.readthedocs.io | test.readthedocs.io | readthedocs.io is a valid suffix |
| pythonguidecn.readthedocs.io | pythonguidecn.readthedocs.io | readthedocs.io is a valid suffix |
| check-ozmall.global.ssl.fastly.net | check-ozmall.global.ssl.fastly.net | global.ssl.fastly.net is a valid suffix |
| test.fastly.net | fastly.net | fastly.net is a valid suffix |
| http://mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | mp2f-m-env.ap-northeast-1.elasticbeanstalk.com | ap-northeast-1.elasticbeanstalk.com is a valid suffix |
| thisisatest |  | error: publicsuffix: cannot derive eTLD+1 for domain "thisisatest" |

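The site_url values in the table mix bare hostnames and full URLs, so a hostname usually has to be extracted before a public-suffix library can be applied. The sketch below covers only that normalization step with the Go standard library; `hostFromSiteURL` is a hypothetical helper, and the eTLD+1 lookup itself is left to whichever public-suffix library you choose.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostFromSiteURL returns the lowercased hostname of a site_url value,
// whether or not the value carries a scheme.
func hostFromSiteURL(siteURL string) (string, error) {
	s := strings.TrimSpace(siteURL)
	if !strings.Contains(s, "://") {
		s = "//" + s // lets net/url parse the leading part as the host
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", err
	}
	return strings.ToLower(u.Hostname()), nil
}

func main() {
	for _, in := range []string{"www.yahoo.com", "https://www.sports.yahoo.com", "https://u.gg"} {
		host, err := hostFromSiteURL(in)
		fmt.Println(host, err)
	}
}
```

The returned hostname is what would then be passed to the public-suffix lookup to obtain `base_domain`.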

### Size Encoding/Decoding

32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height.


```go
// encode_decode_size.go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

func encode(width uint32, height uint32) uint32 {
	var size uint32
	size = (0xFFFF & height)
	size |= (width << 16)
	return size
}

func decode(val string) (int, int, error) {
	size, err := strconv.ParseUint(val, 10, 32)
	if err != nil {
		return 0, 0, errors.New("strconv.ParseUint() failed")
	}
	width := int((size >> 16) & 0xffff)
	height := int(size & 0xffff)

	return width, height, nil
}

func main() {
	fmt.Printf("%d\n", encode(320, 50)) // Will output 20971570

	width, height, err := decode("20971570")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%dx%d\n", width, height) // Will output 320x50
}
```