The MediaMath Log Level Data Service provides access to raw, event-level data generated through the MediaMath platform. Leveraging cloud resources like AWS, clients can build value-added analytics services on top of their data with a minimum of developer time, money, or hassle.
- Architecture
- Accessing Log Level Data Using AWS
- Working with the Log Level Data Service
- Data Schemas - Impressions - Events - Attributed Events - Viewability Events
- Appendix - Day/Hour/Week Part Logic
- Appendix - Batch ID Logic
Data Platform Architecture
As campaigns and strategies created within the MediaMath Platform bid and win, logs are written recording the dimensions and metrics of each individual impression. In addition, MediaMath pixel servers log every MathTag event that is generated on an advertiser's site or app (such as a page load, interaction, or purchase). These log files are collected into a central processing system where attribution is run.
At the end of this process, raw log files are made available through the Logs Service, which collects the different log types and stores each organization's data separately in Amazon S3 buckets.
The following data sets are made available as part of Logs Service:
- Impressions -- Impressions served by the MediaMath Platform
- Events -- Event pixel fires from advertiser sites and apps
- Attributed Events -- Clicks, video and conversions based on campaign merit events
- Viewability Events -- Viewability event fires from viewability partners
Each event type is stored in a corresponding S3 bucket, partitioned by client account (organization ID), and secured using the AWS Identity and Access Management framework.
For Attributed Events, there is one additional partition, which is event type (event_type), and possible values for this partition are "click", "conversion", and "video". This added partition groups the attributed events into click events, campaign merit conversion events, and video events.
Accessing Log Level Data Using AWS
How Data is Stored in S3
Data sets in the Log Level Data Service are stored in S3 in the following locations:
Data set | S3 location |
---|---|
Impressions | s3://mm-prod-platform-impressions/data/organization_id=[MediaMath_ORGANIZATION_ID]/ |
Events | s3://mm-prod-platform-events/data/organization_id=[MediaMath_ORGANIZATION_ID]/ |
Attributed Events | s3://mm-prod-platform-attributed-events/data/organization_id=[MediaMath_ORGANIZATION_ID]/ |
Viewability | s3://mm-prod-platform-viewability-events/data/organization_id=[MediaMath_ORGANIZATION_ID]/ |
In addition, each bucket has a public common directory, which includes various utility data sets. See Using assets from the common/ folder for more details.
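As a rough illustration of the layout above, the following Python sketch builds the S3 location for a data set. The function names and dictionary keys are our own, and the event_type=<value>/ path segment for Attributed Events is an assumption (it mirrors the organization_id= convention; check a bucket listing for the actual layout).

```python
# Bucket names as listed in the table above; the short keys are hypothetical.
BUCKETS = {
    "impressions": "mm-prod-platform-impressions",
    "events": "mm-prod-platform-events",
    "attributed_events": "mm-prod-platform-attributed-events",
    "viewability": "mm-prod-platform-viewability-events",
}

ATTRIBUTED_EVENT_TYPES = ("click", "conversion", "video")


def data_prefix(organization_id, event_type=None):
    """Key prefix (inside a bucket) for one organization's data."""
    prefix = f"data/organization_id={organization_id}/"
    if event_type is not None:
        if event_type not in ATTRIBUTED_EVENT_TYPES:
            raise ValueError(f"unknown event_type: {event_type!r}")
        # Assumption: the extra partition uses the same key=value layout.
        prefix += f"event_type={event_type}/"
    return prefix


def data_uri(dataset, organization_id, event_type=None):
    """Full s3:// URI for a data set and organization."""
    if event_type is not None and dataset != "attributed_events":
        raise ValueError("event_type applies only to attributed_events")
    return f"s3://{BUCKETS[dataset]}/" + data_prefix(organization_id, event_type)
```

For example, data_uri("impressions", 100123) gives the Impressions location from the table with the organization ID filled in.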
Data Security and Authorization
Data security is enforced using AWS Identity and Access Management. To access the Log Level Data Service you will need your own valid AWS account and an IAM user with the sts:AssumeRole privilege.
To grant access to the Log Level Data Service, MediaMath takes your MediaMath account (i.e. organization) ID and your AWS account ID and creates a new role within the MediaMath AWS account. This role has a unique identifier called a Role ARN. The role grants access to your data and can only be assumed by IAM users within your AWS account.
If you're working with LiveRamp, MediaMath will authorize LiveRamp's user (User ARN: arn:aws:iam::609251445204:user/svc-liveramp-ingestion-connectors) to access your data.
Below is a diagram showing how MediaMath leverages IAM to ensure the highest level of security for your data.
Obtaining Temporary Credentials to Access Log Level Data
In order to access your data from the MediaMath AWS account, first make sure that you have received a role ARN and an External ID from MediaMath. The role ARN should take the form of arn:aws:iam::794878508631:role/<ROLE_NAME> and the External ID should take the form of XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
Next, attach a group policy to the IAM user who is trying to assume the role. The policy should use the role ARN provided by MediaMath and take the following form:
{
"Version": "2012-10-17",
"Statement": {
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::794878508631:role/<ROLE_NAME>"
}
}
Then, you need to call the sts:AssumeRole action from within your own AWS account, using both the role ARN and External ID provided by MediaMath. If the call succeeds (you are allowed to assume roles, and have been authorized to assume the role MediaMath created) you will receive a set of temporary credentials to access your data.
The Python script below can be used as a starting point.
import os
import boto3 # 1.7.4
import sys
AWS_ACCESS_KEY_ID = '' # Access key ID
AWS_SECRET_ACCESS_KEY = '' # Secret access key
ROLE_NAME = '' # DPA-{USER} ex: DPA-Mediamath
EXTERNAL_ID = '' # MediaMath provided
ORGANIZATION_ID = '' # MediaMath organization id
S3_BUCKET = 'mm-prod-platform-impressions'
# ------------------------------------------------ DO NOT ALTER BELOW ------------------------------------------------ #
ROLE_SESSION_NAME = 'data-platform'
BASE_ROLE_ARN = 'arn:aws:iam::794878508631:role/'
ROLE_ARN = BASE_ROLE_ARN + ROLE_NAME
DURATION_SECONDS = 3600
client = boto3.client(
'sts',
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
role = client.assume_role(
RoleArn=ROLE_ARN,
RoleSessionName=ROLE_SESSION_NAME,
DurationSeconds=DURATION_SECONDS,
ExternalId=EXTERNAL_ID
)
session = boto3.session.Session(
aws_access_key_id=role['Credentials']['AccessKeyId'],
aws_secret_access_key=role['Credentials']['SecretAccessKey'],
aws_session_token=role['Credentials']['SessionToken']
)
S3 = session.resource('s3')
my_bucket = S3.Bucket(S3_BUCKET)
# List every object stored under this organization's prefix
for object_summary in my_bucket.objects.filter(Prefix='data/organization_id=' + ORGANIZATION_ID + '/'):
    print(object_summary.key)
NOTE - If you have requested access to the Log Level Data Service partitioned at the MediaMath advertiser level, the below Python script can be used as a starting point.
import boto3 # 1.7.4
AWS_ACCESS_KEY_ID = '' # Access key ID
AWS_SECRET_ACCESS_KEY = '' # Secret access key
ROLE_NAME = '' # DPA-{USER} ex: DPA-Mediamath
EXTERNAL_ID = '' # MediaMath provided
ORGANIZATION_ID = '' # MediaMath organization id
AGENCY_ID = '' # MediaMath agency id
ADVERTISER_ID = '' # MediaMath advertiser id
S3_BUCKET = 'mm-prod-platform-impressions-v1-1' #'mm-prod-platform-attributed-events-v1-1' #'mm-prod-platform-events-v1-1'
# ------------------------------------------------ DO NOT ALTER BELOW ------------------------------------------------ #
ROLE_SESSION_NAME = 'data-platform'
BASE_ROLE_ARN = 'arn:aws:iam::794878508631:role/'
ROLE_ARN = BASE_ROLE_ARN + ROLE_NAME
DURATION_SECONDS = 3600
client = boto3.client(
'sts',
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)
role = client.assume_role(
RoleArn=ROLE_ARN,
RoleSessionName=ROLE_SESSION_NAME,
DurationSeconds=DURATION_SECONDS,
ExternalId=EXTERNAL_ID
)
session = boto3.session.Session(
aws_access_key_id=role['Credentials']['AccessKeyId'],
aws_secret_access_key=role['Credentials']['SecretAccessKey'],
aws_session_token=role['Credentials']['SessionToken']
)
S3 = session.resource('s3')
my_bucket = S3.Bucket(S3_BUCKET)
# List every object stored under this organization's and agency's prefix
for object_summary in my_bucket.objects.filter(Prefix='data/organization_id=' + ORGANIZATION_ID + '/agency_id=' + AGENCY_ID + '/'):
    print(object_summary.key)
Temporary security credentials last a maximum of 1 hour. See the best practices section for information about loading many days' worth of data in bulk.
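Because the credentials expire, a long-running load needs a way to renew them. The sketch below is one possible approach, not part of the service: the RefreshingCredentials class and its parameters are hypothetical, and it assumes you pass in a zero-argument callable that performs the AssumeRole call and returns the 'Credentials' dictionary of the response (which includes an 'Expiration' timestamp).

```python
from datetime import datetime, timedelta, timezone


class RefreshingCredentials:
    """Caches temporary credentials and renews them shortly before expiry."""

    def __init__(self, assume_role, margin=timedelta(minutes=5), now=None):
        # assume_role: zero-argument callable returning the 'Credentials'
        # dict of an STS AssumeRole response (hypothetical wiring).
        self._assume_role = assume_role
        self._margin = margin
        self._now = now or (lambda: datetime.now(timezone.utc))
        self._credentials = None

    def get(self):
        """Return cached credentials, calling assume_role again if needed."""
        creds = self._credentials
        if creds is None or self._now() >= creds["Expiration"] - self._margin:
            creds = self._credentials = self._assume_role()
        return creds
```

With the scripts above, the callable could be something like lambda: client.assume_role(...)['Credentials']; calling get() before each phase of a load then yields credentials that are not about to expire.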
When connecting to a Log Level Data Service bucket using AWS CLI, you will need to configure the client to assume the correct role. To do this, we will need to set up two things:
- A valid Access Key ID and Secret Access Key from the source account in .aws/credentials
- The role ARN provided by MediaMath and source profile (from step 1) in .aws/config
For instance:
.aws/credentials:
[log_level_source]
aws_access_key_id = myaccesskey
aws_secret_access_key = mysecretkey
.aws/config:
[profile log_level]
role_arn = arn:aws:iam::794878508631:role/DPA-MyCompany
source_profile = log_level_source
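If the role requires the External ID issued by MediaMath, also add an external_id line (external_id = <EXTERNAL_ID>) to the [profile log_level] section. With this set up, you can use the CLI with the --profile flag, like so: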
$ aws --profile log_level s3 ls s3://mm-prod-platform-impressions/common/
The AWS CLI will take care of assuming the role and caching the temporary credentials for you. When the token expires, the CLI re-issues the AssumeRole call and stores the new temporary credentials. You can see the result of the AssumeRole call in .aws/cli/cache, which should contain a file named, in this example, log_level--arn_aws_iam__794878508631_role-DPA-MyCompany.json.
Working with the Log Level Data Service
The following section details information needed to process the raw data available via the Log Level Data Service.
File Formats and Schemas
Log level data sets are stored as tab-separated (TSV) files. TSV is an easy-to-parse format, ready for loading into any downstream system such as an RDBMS or Hadoop.
Any tab characters inside string fields are replaced during cleansing so that they do not break downstream parsers.
Unless noted otherwise, all columns should be considered nullable. A null is written in the TSV files as \N.
See Data Schemas for a detailed description of the schemas, including data types.
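A minimal sketch of reading one line of these files in Python, assuming plain tab splitting is sufficient (tabs inside fields are cleansed, as noted above). The parse_line name is our own.

```python
NULL_TOKEN = r"\N"  # how a null is written in the TSV files


def parse_line(line):
    """Split one TSV line into fields, mapping the null token to None."""
    fields = line.rstrip("\r\n").split("\t")
    return [None if field == NULL_TOKEN else field for field in fields]
```

Note that an empty string and a null are kept distinct here: only the literal \N token becomes None.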
Data Update Cycle
Data is updated three times per day. However, because the data sets are constantly changing, refresh completion times are not held to a strict SLA. The times below are estimates; please allow for delays, as data growth can slow processing unexpectedly.
GMT (London)
Batch number | Delivered around (GMT) | Includes data from (GMT) |
---|---|---|
0 | 06:30 today | 18:00 yesterday - 02:00 today |
1 | 14:30 today | 02:00 today - 10:00 today |
2 | 22:30 today | 10:00 today - 18:00 today |
EST (New York)
Batch number | Delivered around (GMT-5) | Includes data from (GMT-5) |
---|---|---|
1 | 09:30 today | 21:00 yesterday - 05:00 today |
2 | 17:30 today | 05:00 today - 13:00 today |
0 | 01:30 tomorrow | 13:00 today - 21:00 today |
JST (Tokyo)
Batch number | Delivered around (GMT+9) | Includes data from (GMT+9) |
---|---|---|
2 | 07:30 today | 19:00 yesterday - 03:00 today |
0 | 15:30 today | 03:00 today - 11:00 today |
1 | 23:30 today | 11:00 today - 19:00 today |
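Read at face value, the GMT table maps each batch number to a fixed eight-hour window. The sketch below encodes that reading; the names are hypothetical, the batch date is assumed to be the GMT delivery date, and since the schedule is an estimate the result should be treated as an expectation rather than a guarantee.

```python
from datetime import date, datetime, timedelta, timezone

# Offsets in hours from 00:00 GMT on the batch date, taken from the GMT
# table above: batch number -> (window start, window end).
BATCH_WINDOWS = {0: (-6, 2), 1: (2, 10), 2: (10, 18)}


def batch_window(batch_date, batch_number):
    """Return (start, end) GMT datetimes a batch is expected to cover."""
    start_h, end_h = BATCH_WINDOWS[batch_number]
    midnight = datetime(batch_date.year, batch_date.month, batch_date.day,
                        tzinfo=timezone.utc)
    return midnight + timedelta(hours=start_h), midnight + timedelta(hours=end_h)
```

For batch 0 on a given day, the window starts at 18:00 GMT on the previous day and ends at 02:00 GMT.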
Using assets from the common/ folder
The common/ directory of each S3 bucket has several utility data sets, including:
- Create table statements for different RDBMSs
- Header row files which list the column titles
- Refresh logs that can be used to tell when Data Platform data sets have been updated
Update logs
Every time a batch of logs is completely uploaded to S3, an empty file is uploaded to the /common/completion-status/ directory of the bucket (this is true for impression logs, event logs, and attributed event logs). The first time a batch of data is processed and uploaded to S3, a 0-byte file with one of the following names is loaded to /common/completion-status/:
mm_impressions_batch_[BATCH_ID].done
mm_events_batch_[BATCH_ID].done
mm_attributed_events_batch_[BATCH_ID].done
The batch ID is a 10-digit code of the form YYYYMMDDNN, where YYYYMMDD represents a date and NN is either 00, 01, or 02 (representing the first, second, and third batches of data processed every day).
For example, on 7/1, we would normally expect the directory s3://mm-prod-platform-impressions/common/completion-status/ to contain the following three files:
- mm_impressions_batch_2018070100.done - loaded upon completion of the first batch on 7/1
- mm_impressions_batch_2018070101.done - loaded upon completion of the second batch on 7/1
- mm_impressions_batch_2018070102.done - loaded upon completion of the third batch on 7/1
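The naming rule can be expressed in a few lines of Python. This is only a sketch of the convention described above; the function names are our own, and the data set prefix (for instance mm_impressions) is passed in by the caller.

```python
from datetime import date


def batch_id(day, batch_number):
    """Build a YYYYMMDDNN batch ID from a date and a batch number (0, 1 or 2)."""
    if batch_number not in (0, 1, 2):
        raise ValueError("standard batch numbers are 0, 1 and 2")
    return f"{day:%Y%m%d}{batch_number:02d}"


def expected_done_files(dataset_prefix, day):
    """Names of the three marker files normally expected for one day."""
    return [f"{dataset_prefix}_batch_{batch_id(day, n)}.done" for n in range(3)]
```

Calling expected_done_files("mm_impressions", date(2018, 7, 1)) reproduces the three file names in the example above.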
It is important not to download or begin processing any log data before this .done file appears: a batch usually uploads several files to S3, even for a single organization and impression date. Thus, even if some files from batch 2018070102 have appeared in /data/, we cannot be certain that no more files will arrive until we find the 0-byte mm_impressions_batch_2018070102.done file in /common/completion-status/.
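One way to enforce this rule is to derive the set of completed batches from a listing of the completion-status directory and to process a batch only if it is in that set. The sketch below assumes you already have the object keys as strings (for example from a bucket listing); the function names and the regular expression are our own.

```python
import re

# Matches keys such as
# common/completion-status/mm_impressions_batch_2018070102.done
DONE_KEY = re.compile(
    r"common/completion-status/(mm_[a-z_]+)_batch_(\d{10})\.done$"
)


def completed_batches(keys):
    """Map data set name -> set of batch IDs that have a .done marker."""
    done = {}
    for key in keys:
        match = DONE_KEY.search(key)
        if match:
            done.setdefault(match.group(1), set()).add(match.group(2))
    return done


def is_ready(keys, dataset, batch_id):
    """True only if the .done marker for this data set and batch exists."""
    return batch_id in completed_batches(keys).get(dataset, set())
```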
Occasionally, a particular batch of data may be processed later than expected. For example, suppose an unexpected service interruption prevented batch 2018070102 of impression log data from being processed on 7/1 (batch 2 is usually delivered around 22:30 GMT), and the batch was not processed until 7/3. In this case, the file mm_impressions_batch_2018070102.done would be uploaded into /common/completion-status/ once the upload of data from batch 2018070102 completed.
Rewrite window
From time to time the data may change after it has been written to S3 (for example, because of late-arriving data, margin changes, or latent conversions). For those cases we have automated a rewrite of the data as follows:
- Impressions - every day, the past 5 days are rewritten (into the original partition date, with the current date included in the batch name)
- Events - every day, the past 5 days are rewritten (into the original partition date, with the current date included in the batch name)
- Attributed events - every day, the past 30 days are rewritten (into the original partition date, with the current date included in the batch name)
- Viewability - no lookback window is defined. Rewrites are made to the original partition date, with the current date included in the batch name
Please implement reload logic on your side to ensure the data is complete.
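A reload job can start from the list of partition dates that fall inside the rewrite window. The sketch below is one interpretation, with hypothetical names: it assumes "the past N days" means the N days before the current date, and it leaves Viewability out because no window is defined for it.

```python
from datetime import date, timedelta

# Lookback windows in days, as listed above.
REWRITE_WINDOW_DAYS = {"impressions": 5, "events": 5, "attributed_events": 30}


def partitions_to_reload(dataset, today):
    """Partition dates that may have been rewritten and should be reloaded."""
    days = REWRITE_WINDOW_DAYS[dataset]
    return [today - timedelta(days=n) for n in range(1, days + 1)]
```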
Special backfills
On the rare occasion of a major incident where we are unable to backfill the data during the standard rewrite window, we will write a new batch (and a .done file) outside of the standard 00, 01, 02 batch names, into the partition of the date on which it was written to S3.
Special Considerations for Loading Attributed Events
Clients have the ability to change the merit pixel, post-click (pc) window, and post-view (pv) window at any time throughout the duration of a campaign. If changed, the changes to the window will only be reflected moving forward. MediaMath systems and services will not recalculate the past data with the new window.
Attributed events data is not lzo-compressed.
Schema changes
From time to time we may add columns in a non-breaking change to the end of our schema. We may send out a courtesy notification when this occurs to let users know the description of the new field and what it can be used for; however, the processing of these files from your side must be set up to handle these additional fields.
Please note: an up-to-date schema can always be found on each bucket under the "common/create-table/" prefix.
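Since new columns are appended at the end, a loader can stay compatible by pairing each row with the columns it already knows and ignoring anything after them. A minimal sketch, with a hypothetical to_record helper:

```python
def to_record(fields, known_columns):
    """Pair fields with the columns a loader knows, ignoring appended ones."""
    if len(fields) < len(known_columns):
        raise ValueError(
            f"expected at least {len(known_columns)} fields, got {len(fields)}"
        )
    # zip stops at the shorter sequence, so trailing new columns are dropped.
    return dict(zip(known_columns, fields))
```

The known_columns list would typically come from the header row files or create table statements in the common/ folder.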
Best Practices for Loading Data
- Load into a staging area
- Load N days at a time to account for late arriving data
- Split the load into phases so your credentials don't expire
- Reload conversion data for past 30 days and for impressions for the past 5 days
- Be defensive with invalid lines
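As an illustration of the last point, the sketch below sets aside lines with too few columns instead of failing the whole load. The function name and the min_columns parameter are hypothetical; what counts as invalid in your pipeline may well go beyond a column count.

```python
def split_valid_invalid(lines, min_columns):
    """Separate loadable rows from lines that should be set aside for review."""
    valid, rejected = [], []
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= min_columns:
            valid.append(fields)
        else:
            # Keep the line number so the rejected line can be traced back.
            rejected.append((line_number, line))
    return valid, rejected
```

The valid rows can then go to the staging area, while the rejected lines are logged or written to a separate location.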
Data Schemas
All columns should be considered nullable unless otherwise noted.
Please note: an up-to-date schema can always be found on each bucket under the "common/create-table/" prefix.
Impressions
Name | Type | Description |
---|---|---|
timestamp_gmt | timestamp | time of the impression in GMT |
report_timestamp | timestamp | time of the impression in campaign's local timezone |
auction_id | long | auction ID that uniquely identifies an impression |
mm_uuid | string | MediaMath unique user ID |
organization_id | integer | |
organization_name | string | |
agency_id | integer | |
agency_name | string | |
advertiser_id | integer | |
advertiser_name | string | |
campaign_id | integer | MediaMath unique ID for campaign |
campaign_name | string | MediaMath name for campaign |
strategy_id | integer | MediaMath unique ID for strategies |
strategy_name | string | MediaMath name for strategies |
concept_id | integer | MediaMath unique ID for creative concept |
concept_name | string | MediaMath name for creative concept |
creative_id | integer | MediaMath ID of the MediaMath creative displayed for the impression |
creative_name | string | MediaMath name of the MediaMath creative displayed for the impression |
exchange_id | integer | MediaMath unique ID for exchange of impression bid |
exchange_name | string | MediaMath unique name for exchange of impression bid |
width | integer | Creative width in pixels |
height | integer | Creative height in pixels |
site_url | string | domain where impression was served |
day_of_week | integer | day of week of impression served; 0=Sunday to 6=Saturday. |
week_hour_part | integer | hour of week of impression served; 0=Sunday 12am, 1=Sunday 12:15am, ... 671=Saturday 11:45pm. |
media_cost_cpm | double | Total of clearing prices for impressions won |
total_ad_cost_cpm | double | Total of Media cost plus any cost designated as Ad Serving, Ad Verification, Audience/Data, Contextual, Privacy Compliance, Surveys |
total_spend_cpm | double | Total Ad Cost plus MediaMath Fees and Prospective Margin |
mm_creative_size | integer | 32-bit encoding of creative size, calculated by making the high 16 bits the width and the low 16 bits the height |
placement_id | long | The PlacementSlot ID for PMP-D (the value in "pmp" param in a PMP-D request) |
deal_id | long | deal ID when the impression is associated with a private deal arranged to be served through the exchange |
country_id | integer | MediaMath unique ID of country in which the impression is served |
country | string | country in which the impression is served |
region_id | integer | state ID for the state in which the impression is served |
region | string | state in which the impression is served |
dma_id | integer | DMA ID in which the impression is served |
dma | string | DMA in which the impression is served |
zip_code_id | integer | zip code ID for the zip code in which the impression is served |
zip_code | string | zip code in which the impression is served |
conn_speed_id | integer | MediaMath unique ID for Internet connection speed through which the impression is served |
conn_speed | string | Internet connection speed through which the impression is served |
isp_id | integer | MediaMath unique ID for Internet service provider through which the impression is served |
isp | string | Internet service provider through which the impression is served |
category_id | integer | The exchange-provided category ID for the page (e.g. sports page vs. news page). |
publisher_id | long | online publisher where the impression is served (passed from exchange) |
site_id | long | siteID as provided by the exchange |
watermark | integer | whether impression used in Watermark process by MediaMath's optimization engine |
fold_position | integer | indication of fold position (above, below, unknown) for the impression served where 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
empty_field_1 | null | Deprecated field. Will always be NULL |
user_frequency | integer | exchange provided user frequency |
browser_id | integer | MediaMath unique ID of Browser where impression served |
browser | string | browser where impression served |
os_id | integer | MediaMath unique ID for the Operating System on which the impression is served |
os | string | Operating System on which the impression is served |
browser_language_id | integer | MediaMath unique ID for the language setting on user's browser on which impression is served |
user_agent | string | user agent header string as passed by Browser |
week_part | integer | week part of impression served (based on user's time zone); 0=weekday, 1=weekend. |
day_part | integer | day Part in which the impression was served (based on the user's timezone); 0=12am-6am, 1=6am-12pm, 2=12pm-6pm, 3=6pm-12am. |
day_hour | integer | hour of day impression served (based on user's timezone); Irrespective of day (0-23). |
week_part_hour | integer | hour of week of impression served 0-23 for weekday hours, 24-47 for weekend |
hour_part | integer | hour of a given day in which the impression served (based on the user's timezone), in 15 min increments: 0=12:00am, 1=12:15am, ... 95=11:45pm |
week_part_hour_part | integer | 0-95 weekday, 96-191 weekend. |
week_hour | integer | 0=Sunday 12am, 1=Sunday 1am, ...167=Saturday 11pm. |
page_url | string | page URL of impression served as passed by exchange |
batch_id | long | ten-digit identifier of the batch that generated the file, of the format YYYYMMDDNN, where YYYYMMDD is the processing date and NN is either 00, 01 or 02, corresponding to the first, second, and third batches of data processed each day |
browser_language | string | the language of the browser on which impression is served |
inventory_type_id | integer | Internal unique identifier that corresponds to inventory_type: 50000=Web, 50001=Optimized, 50002=In-App, -1=Unknown. |
channel_type | long | 1 = Display, 2 = Video, 3 = Social, 4 = mobile display (web), 5 = mobile video (web), 6 = search, 7 = email, 8 = mobile display (in-app), 9 = mobile video (in-app), 10 = Newsfeed (FBX) |
empty_field_2 | null | Deprecated field. Will always be NULL |
empty_field_3 | null | Deprecated field. Will always be NULL |
inventory_type | string | Environment in which the user received the impression: Web, Optimized, In-App, Unknown. Derived from the user_agent string. |
device | string | Device type on which the impression is served, e.g. "Smartphone", "Laptop/Desktop" |
device_id | integer | Numeric ID corresponding to the value of the 'device' field |
empty_field_4 | null | Deprecated field. Will always be NULL |
empty_field_5 | null | Deprecated field. Will always be NULL |
connected_id | string | NOTE: this column will only be populated with data if the Organization has additional agreements established with MediaMath. Please contact your account representatives if you are interested in receiving data for this. |
app_id | string | either App ID or Bundle ID of app to which impression is served, if present |
city | integer | |
city_code | string | |
city_code_id | integer | |
aib_pixel_ids | string | |
aib_recencies | string | |
supply_source_id | integer | |
id_vintage | integer | 0 if the primary MM user id from bid request is between [0 and 1 week old], 1 if it's between [1 and 18 weeks old], 2 if it's between [18 and 52 weeks old], 3 if it's more than 52 weeks old, 999 if there is no MM UUID or if the ID is invalid for some reason. |
interstitial | integer | 1 if it's an interstitial inventory |
cross_device_flag | integer | A flag to tell if the impression is cross-device and/or cookieless, value from 0 to 8 |
overlapped_brain_pixel_selections | string | Pipe-separated list of pixel IDs eligible for the user from the list of nominated pixels in MediaMath. Includes the Recency (measured in minutes) and Frequency (number of fires over the past 30 days) of the pixel. Example: mm:745587:20488:5|mm:745334:20500:1 |
ip_address | string | defaulted to "NULL" due to GDPR |
delphi_metadata | string | Price engine metadata. Currently formatted as campaign_id&"_"&model_id |
norm_rr | float | Predicted response rate from FMV Brain. Denominated in RR per 1k imps |
browser_name | string | Name of the browser from WURFL Enrichment |
browser_version | string | Version of the browser from WURFL Enrichment |
os_name | string | Name of the operating system from WURFL Enrichment |
os_version | string | Version of the operating system from WURFL Enrichment |
model_name | string | Name of the model from WURFL Enrichment |
brand_name | string | Name of the brand from WURFL Enrichment |
form_factor | string | Device Type from WURFL Enrichment |
contextual_data | string | |
viewability_tag_injection | integer | Whether we injected viewability tag |
prebid_viewability | string | Prebid viewability value as reported by the exchange |
prebid_historical_ctr | string | Prebid historical click through rate as reported by the exchange |
prebid_video_completion | string | Prebid predicted video completion as reported by the exchange |
overlapped_pixels | string | Comma-separated list of any pixel IDs targeted and matched for this impression |
auction_type | integer | from bid request (0 - unknown; 1 - first price; 2 - second price plus; 3 - fixed price (deal) ; 501 - Not Indicated) |
deal_group | integer | The deal group of the deal which is associated with the impression |
brain_type_id | integer | An internal MediaMath ID that describes which brain model was used to bid for the corresponding impression |
video_skippability | integer | Flag to identify video skippability ; 0: non-skippable; 1: skippable; null: unknown |
video_placement_type | integer | Video placement type. 1: In-stream; 2: In-banner; 3: In-article; 4: In-feed; 5: Interstitial/slider/floating |
display_manager | string | Name of ad mediation partner, SDK technology, or player responsible for rendering ad (typically video or mobile) |
display_manager_ver | string | Version of ad mediation partner, SDK technology, or player responsible for rendering ad |
click_browser | integer | Type of browser opened upon clicking the creative in an app, where 0 = embedded, 1 = native. |
ads_txt_verified | integer | Whether an impression is ads.txt verified, where 0 = no, 1 = yes, 0 = null |
ads_txt_type | string | Ads.txt type, either Direct or Reseller |
request_id | string | Request ID sent by supply source (Exchange auction ID), truncated to 40 characters |
gdpr_flag | boolean | Flag from publisher/exchange saying whether we need to respect GDPR for this user |
gdpr_string | string | Daisy bit from publisher/exchange expressing consent for the user |
uuid_cid_connection_type | integer | This flag represents the relationship between the MM_UUID (primary device ID in bid request) and the ConnectedID that is retrieved from Cross-Device ID Graph (IDM). 0 - Probabilistic; 1 - Deterministic. |
first_party_id_value | string | 1P ID value for the user identified and targeted for this impression |
first_party_id_vendor | string | 1P ID vendor for the ID in the 'first_party_id_value' column |
ccpa_privacy_string | string | Version 1 of the US Privacy String, which supports CCPA compliance. Sample values: "1N--", "1---", "1-N-", "1YNN", "1YNY" |
imp_count | float | Represents the multiplier reporting will count for each impression record. The value will be 1.0 for all non-DOOH imps and either 1.0 for a full DOOH imp or less than 1.0 for a fractional DOOH impression. |
total_audience | float | Represents the total number of audience members who actually watch the DOOH ad play. The value will be 1.0 for all non-DOOH impressions and for DOOH exchanges that don't support it, and the actual number sent by the DOOH exchange at win notice time if supported. |
targeted_audience_percentage | float | Represents the number of targeted audience members over the total number of audience members who watch the DOOH ad play, as a percentage. The value will be 1.0 for all non-DOOH impressions and for DOOH exchanges that don't support it, and the actual number sent by the DOOH exchange at win notice time if supported. |
advanced_content_type | integer | Shows if an impression is a CTV or DOOH impression. 1 = CTV, 2 = DOOH |
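Two of the encoded columns above can be unpacked with simple arithmetic. The sketch below follows the descriptions of mm_creative_size (width in the high 16 bits, height in the low 16 bits) and week_hour_part (15-minute slots counted from Sunday 12am); the function names are our own.

```python
def encode_creative_size(width, height):
    """Pack width into the high 16 bits and height into the low 16 bits."""
    return (width << 16) | height


def decode_creative_size(mm_creative_size):
    """Return (width, height) from a packed mm_creative_size value."""
    return mm_creative_size >> 16, mm_creative_size & 0xFFFF


def decode_week_hour_part(week_hour_part):
    """Return (day_of_week, hour, minute) for a 15-minute slot (0-671)."""
    day, slot = divmod(week_hour_part, 96)  # 96 quarter-hours per day
    hour, quarter = divmod(slot, 4)
    return day, hour, quarter * 15
```

For example, a 300x250 creative is encoded as 19661050, and week_hour_part 671 decodes to day 6 (Saturday) at 23:45.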
Notes on 1P ID columns
This is information that you as a client need for the values you will see in these columns.
Merkury ID - Merkle provides clients with user data in their M1 platform. Merkle can pick up client logs from a client's S3 bucket to ingest data with Merkury IDs. Please loop your client team in with your Merkury client and provide written permission for Merkle reps to pick up this data. Merkle will then be permissioned to your data in our S3 buckets.
LiveRamp RampID - Currently, the RampID (formerly IDL) provided in logs is in the MediaMath format (LiveRamp encrypts RampIDs per partner, so MediaMath's RampIDs are different from a client's RampIDs). To receive impression log data along with their own RampID, clients can work with LiveRamp to share logs in their preferred format and compression. If needed, MediaMath can assist clients in setting up a new feed to share logs with LiveRamp according to their desired specs; note that an SOW with an additional charge will apply.
Notes on total_spend_cpm field
As stated, total_spend_cpm includes the campaign's margin cpm. MediaMath allows for use of the margin management tool that applies changes to margin going back five days in arrears. When this happens, the log level impression data does not update to reflect this change. Therefore when comparing total spend to the Performance report in MediaMath you may see a discrepancy. However, when comparing Media Cost or Impression Count you should find little to no discrepancy as these metrics are not affected by retroactive changes in margin.
Events
Name | Type | Description |
---|---|---|
timestamp_gmt | timestamp | timestamp of event pixel fire in GMT |
mm_uuid | string | MediaMath unique user ID |
hostname | string | The hostname of the server handling the request, almost always pixel.mathtag.com |
organization_id | integer | |
organization_name | string | |
agency_id | integer | |
agency_name | string | |
advertiser_id | integer | |
advertiser_name | string | |
pixel_id | integer | MediaMath unique ID for event pixel |
pixel_name | string | event pixel name (user defined) |
v1 | string | custom variable (specific to the pixel_id above, since not all pixels pass back the same variables); max 64 characters |
v2 | string | custom variable; max 64 characters |
v3 | string | custom variable; max 64 characters |
v4 | string | custom variable; max 64 characters |
v5 | string | custom variable; max 64 characters |
v6 | string | custom variable; max 64 characters |
v7 | string | custom variable; max 64 characters |
v8 | string | custom variable; max 64 characters |
v9 | string | custom variable; max 64 characters |
v10 | string | custom variable; max 64 characters |
s1 | string | custom variable; max 64 characters |
s2 | string | custom variable; max 64 characters |
s3 | string | custom variable; max 64 characters |
s4 | string | custom variable; max 64 characters |
s5 | string | custom variable; max 64 characters |
s6 | string | custom variable; max 64 characters |
s7 | string | custom variable; max 64 characters |
s8 | string | custom variable; max 64 characters |
s9 | string | custom variable; max 64 characters |
s10 | string | custom variable; max 64 characters |
country_id | integer | MediaMath unique ID of country in which the event pixel fired |
country | string | country in which the event pixel fired |
region_id | integer | state ID for the state in which the event pixel fired |
region | string | state in which the event pixel fired |
dma_id | integer | DMA ID in which the event pixel fired |
dma | string | DMA in which the event pixel fired |
zip_code_id | integer | zip code ID for the zip code in which the event pixel fired |
zip_code | string | zip code in which the event pixel fired |
conn_speed_id | integer | MediaMath unique ID for Internet connection speed through which the event pixel fired |
conn_speed | string | Internet connection speed through which the event pixel fired |
isp_id | integer | MediaMath unique ID for Internet service provider through which the event pixel fired |
isp | string | Internet service provider through which the event pixel fired |
user_agent | string | user agent header string as passed by Browser |
referrer | string | referring URL of the page where the event pixel has fired |
batch_id | long | ten-digit identifier of the batch that generated the file, of the format YYYYMMDDNN, where YYYYMMDD is the processing date and NN is either 00, 01 or 02, corresponding to the first, second, and third batches of data processed each day |
empty_field_1 | null | Deprecated field. Will always be NULL |
empty_field_2 | null | Deprecated field. Will always be NULL |
connected_id | string | NOTE: this column is populated only if the Organization has additional agreements established with MediaMath. Please contact your account representative if you are interested in receiving this data. |
city | integer | |
city_code | string | |
city_code_id | integer | |
pixel_params | string | |
id_type | string | |
ip_address | set to NULL | defaulted to "NULL" due to GDPR |
gdpr_flag | boolean | Flag from advertiser saying whether we need to respect GDPR for this user |
gdpr_string | string | Daisy bit from advertiser expressing consent for the user |
imp_auction_id | long | Auction ID will only be present if the Auction ID was included in the click-through URL. It can be used to connect event pixel fires to a clicked ad. Introduced: 2024-08-05 |
Attributed Events
Name | Type | Description |
---|---|---|
impression_timestamp_gmt | timestamp | timestamp of impression served in GMT |
event_timestamp_gmt | timestamp | timestamp of merit pixel fire in GMT |
event_report_timestamp | timestamp | timestamp of event pixel record in MediaMath |
imp_auction_id | long | unique ID generated by MediaMath's bidder to identify an impression |
mm_uuid | string | MediaMath unique user identifier |
organization_id | integer | |
organization_name | string | |
agency_id | integer | |
agency_name | string | |
advertiser_id | integer | |
advertiser_name | string | |
event_type | string | 'click' or 'conversion' or 'video' |
pixel_id | integer | merit pixel ID for the attributed event record |
pixel_name | string | merit pixel name for the attributed event record |
pv_pc_flag | string | post-View or post-Click conversion Flag. Returns PV or PC |
pv_time_lag | integer | time in seconds following impression view to event pixel fire |
pc_time_lag | integer | time in seconds following impression click to event pixel fire |
campaign_id | integer | MediaMath unique ID for campaign |
campaign_name | string | MediaMath name for campaign |
strategy_id | integer | MediaMath unique ID for strategies |
strategy_name | string | MediaMath name for strategies |
concept_id | integer | MediaMath unique ID for creative concept |
concept_name | string | MediaMath name for creative concept |
creative_id | integer | MediaMath unique ID for specific creative |
creative_name | string | MediaMath name for specific creative |
exchange_id | integer | MediaMath unique ID for exchange of impression bid |
exchange_name | string | MediaMath name for exchange of impression bid |
width | integer | size encoded as a 32-bit integer in which the high 16 bits are the width and the low 16 bits are the height |
height | integer | size encoded as a 32-bit integer in which the high 16 bits are the width and the low 16 bits are the height |
site_url | string | base url where impression was served |
mm_v1 | string | custom variable; Max 64 characters |
mm_v2 | string | custom variable; Max 64 characters |
mm_s1 | string | custom variable; Max 64 characters |
mm_s2 | string | custom variable; Max 64 characters |
day_of_week | integer | day of a given week in which the impression served (based on the user's timezone) |
week_hour_part | integer | hour of week of conversion |
mm_creative_size | integer | 32-bit encoding of creative size |
placement_id | long | PlacementSlot ID for PMP-D (the value in "pmp" param in a PMP-D request) |
deal_id | long | deal ID when the impression is associated with a private deal arranged to be served through the exchange |
country_id | integer | MediaMath unique ID of country where impression served |
country | string | country where impression served |
region_id | integer | state ID for the state in which the impression is served |
region | string | state in which the impression is served |
dma_id | integer | DMA ID in which the impression is served |
dma | string | DMA in which the impression is served |
zip_code_id | integer | zip code ID for the zip code in which the impression is served |
zip_code | string | zip code in which the impression is served |
conn_speed_id | integer | MediaMath unique ID for Internet connection speed through which the impression is served |
conn_speed | string | Internet connection speed through which the impression is served |
isp_id | integer | MediaMath unique ID for Internet service provider through which the impression is served |
isp | string | Internet service provider through which the impression is served |
category_id | long | content category of page - passed from publisher |
publisher_id | string | exchange provided publisher ID |
site_id | long | siteID as provided by the exchange; The online website where the impression is served |
watermark | integer | whether impression used as part of the watermark process by MediaMath's optimization algorithm |
fold_position | integer | indication of fold position (above, below, unknown) for the impression served where 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
user_frequency | integer | exchange provided user frequency |
browser_id | integer | MediaMath unique ID of browser where impression served |
browser | string | browser where impression served |
os_id | integer | MediaMath unique ID for the Operating System on which the impression is served |
os | string | Operating System on which the impression is served |
browser_language_id | integer | MediaMath unique ID for the language setting on the user's browser on which the impression is served |
week_part | integer | enumerated value representative of weekday vs weekend in which the impression was served (based on the user's timezone) |
day_part | integer | day part of impression served |
day_hour | integer | hour of day impression served |
week_part_hour | integer | enumerated value representative of the hour within a given day and with distinction between weekday and weekend hours in which the impression served (based on the user's timezone) |
hour_part | integer | 15 min block of a given day in which the impression served (based on the user's timezone) |
week_part_hour_part | integer | enumerated value representative of the 15 min block of a day with distinction between weekday and weekend 15 min blocks in which the impression served (based on the user's timezone) |
week_hour | integer | enumerated value representative of the hour of a given week in which the impression served (based on the user's timezone) |
batch_id | long | ten-digit identifier of the batch that generated the file, of the format YYYYMMDDNN, where YYYYMMDD is the processing date and NN is either 00, 01, or 02, corresponding to the first, second, and third batches of data processed each day |
browser_language | string | language configured on the browser at the time the impression served |
empty_field_1 | null | Deprecated field. Will always be NULL |
empty_field_2 | null | Deprecated field. Will always be NULL |
empty_field_3 | null | Deprecated field. Will always be NULL |
inventory_type_id | integer | corresponding ID to the inventory type field |
inventory_type | string | type of environment in which the ad was shown (e.g., web, in-application) |
device_type_id | integer | MediaMath unique ID for the device on which the impression is served |
device_type | string | the device on which the impression is served |
connected_id | string | NOTE: this column is populated only if the Organization has additional agreements established with MediaMath. Please contact your account representative if you are interested in receiving this data. |
app_id | string | either App ID or Bundle ID of App to which impression is served, if present |
event_subtype | string | This field is blank for non-video events. For video events, here are the possible values: companion_click, companion_impression, accept_invitation, close, collapse, vv, engagedView, expand, fullscreen, mute, pause, q1, q2, q3, q4, resume, rewind, skip, skippable, vst, unmute, na, impression, error. |
city | integer | |
city_code | string | |
city_code_id | integer | |
supply_source_id | integer | |
ip_address | set to NULL | defaulted to "NULL" due to GDPR |
browser_name | string | Name of the browser from WURFL Enrichment |
browser_version | string | Version of the browser from WURFL Enrichment |
os_name | string | Name of the operating system from WURFL Enrichment |
os_version | string | Version of the operating system from WURFL Enrichment |
model_name | string | Name of the model from WURFL Enrichment |
brand_name | string | Name of the brand from WURFL Enrichment |
form_factor | string | Device Type from WURFL Enrichment |
impressions_stream_uuid | string | Impression Stream Unique Identifier |
clicks | bigint | Clicks |
pc_conversions | bigint | Post click conversion |
pv_conversions | bigint | Post view conversion |
pc_revenue | float | Post click revenue |
pv_revenue | float | Post view revenue |
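The 32-bit size encoding described in the table (high 16 bits for width, low 16 bits for height) can be unpacked with two bit operations. The following is a minimal sketch, assuming the encoded value is a non-negative integer laid out exactly as that description states; the function names are hypothetical and not part of the Log Level Data Service.

```python
from typing import Tuple


def decode_creative_size(encoded: int) -> Tuple[int, int]:
    """Split a 32-bit encoded size into (width, height)."""
    width = (encoded >> 16) & 0xFFFF   # high 16 bits
    height = encoded & 0xFFFF          # low 16 bits
    return width, height


def encode_creative_size(width: int, height: int) -> int:
    """Inverse of decode_creative_size, useful for building filter values."""
    return ((width & 0xFFFF) << 16) | (height & 0xFFFF)


if __name__ == "__main__":
    encoded = encode_creative_size(300, 250)
    print(encoded, decode_creative_size(encoded))
```

Keeping both directions side by side makes it easy to round-trip a known size and confirm the assumed layout against real log rows before relying on it.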
Viewability Events
Name | Type | Description |
---|---|---|
impression_timestamp_gmt | timestamp | Timestamp of impression served in GMT |
event_timestamp_gmt | timestamp | Timestamp of viewability event fired by the vendor in GMT. This is not related to a MediaMath pixel or campaign merit event |
event_report_timestamp | timestamp | Timestamp of viewability event fired by the vendor in campaign's local timezone |
imp_auction_id | long | unique ID generated by MediaMath's bidder to identify an impression |
mm_uuid | string | MediaMath unique user identifier |
organization_id | integer | |
organization_name | string | |
agency_id | integer | |
agency_name | string | |
advertiser_id | integer | |
advertiser_name | string | |
pixel_id | integer | merit pixel ID for the attributed event record |
pixel_name | string | merit pixel name for the attributed event record |
campaign_id | integer | MediaMath unique ID for campaign |
campaign_name | string | MediaMath name for campaign |
strategy_id | integer | MediaMath unique ID for strategies |
strategy_name | string | MediaMath name for strategies |
concept_id | integer | MediaMath unique ID for creative concept |
concept_name | string | MediaMath name for creative concept |
creative_id | integer | MediaMath unique ID for specific creative |
creative_name | string | MediaMath name for specific creative |
exchange_id | integer | MediaMath unique ID for exchange of impression bid |
exchange_name | string | MediaMath name for exchange of impression bid |
supply_source_id | integer | |
width | integer | size encoded as a 32-bit integer in which the high 16 bits are the width and the low 16 bits are the height |
height | integer | size encoded as a 32-bit integer in which the high 16 bits are the width and the low 16 bits are the height |
site_url | string | base url where impression was served |
mm_v1 | string | custom variable; Max 64 characters |
mm_v2 | string | custom variable; Max 64 characters |
mm_s1 | string | custom variable; Max 64 characters |
mm_s2 | string | custom variable; Max 64 characters |
day_of_week | integer | day of a given week in which the impression served (based on the user's timezone) |
week_hour_part | integer | hour of week of conversion |
mm_creative_size | integer | 32-bit encoding of creative size |
placement_id | long | PlacementSlot ID for PMP-D (the value in "pmp" param in a PMP-D request) |
deal_id | long | deal ID when the impression is associated with a private deal arranged to be served through the exchange |
country_id | integer | MediaMath unique ID of country where impression served |
country | string | country where impression served |
region_id | integer | state ID for the state in which the impression is served |
region | string | state in which the impression is served |
dma_id | integer | DMA ID in which the impression is served |
dma | string | DMA in which the impression is served |
zip_code_id | integer | zip code ID for the zip code in which the impression is served |
zip_code | string | zip code in which the impression is served |
city | integer | |
city_code_id | integer | |
city_code | string | |
conn_speed_id | integer | MediaMath unique ID for Internet connection speed through which the impression is served |
conn_speed | string | Internet connection speed through which the impression is served |
isp_id | integer | MediaMath unique ID for Internet service provider through which the impression is served |
isp | string | Internet service provider through which the impression is served |
category_id | long | content category of page - passed from publisher |
publisher_id | long | exchange provided publisher ID |
site_id | long | siteID as provided by the exchange; The online website where the impression is served |
watermark | integer | whether impression used as part of the watermark process by MediaMath's optimization algorithm |
fold_position | integer | indication of fold position (above, below, unknown) for the impression served where 1 = Above the fold, 2 = Below the fold, 0 = Unknown |
user_frequency | integer | exchange provided user frequency |
browser_id | integer | MediaMath unique ID of browser where impression served |
browser | string | browser where impression served |
os_id | integer | MediaMath unique ID for the Operating System on which the impression is served |
os | string | Operating System on which the impression is served |
browser_language_id | integer | MediaMath unique ID for the language setting on the user's browser on which the impression is served |
browser_language | string | |
device_id | integer | |
device | string | |
week_part | integer | enumerated value representative of weekday vs weekend in which the impression was served (based on the user's timezone) |
day_part | integer | day part of impression served |
hour | integer | hour of day impression served |
week_part_hour | integer | enumerated value representative of the hour within a given day and with distinction between weekday and weekend hours in which the impression served (based on the user's timezone) |
hour_part | integer | 15 min block of a given day in which the impression served (based on the user's timezone) |
week_part_hour_part | integer | enumerated value representative of the 15 min block of a day with distinction between weekday and weekend 15 min blocks in which the impression served (based on the user's timezone) |
week_hour | integer | enumerated value representative of the hour of a given week in which the impression served (based on the user's timezone) |
batch_id | long | ten-digit identifier of the batch that generated the file, of the format YYYYMMDDNN, where YYYYMMDD is the processing date and NN is either 00, 01, or 02, corresponding to the first, second, and third batches of data processed each day |
inventory_type_id | integer | |
inventory_type | string | |
empty_field_1 | null | Deprecated field. Will always be NULL |
empty_field_2 | null | Deprecated field. Will always be NULL |
connected_id | string | NOTE: this column is populated only if the Organization has additional agreements established with MediaMath. Please contact your account representative if you are interested in receiving this data. |
app_id | string | either App ID or Bundle ID of App to which impression is served, if present |
viewability_event_id | integer | Metrics that are available to filter on: 1 = measurable impression (display); 2 = in view impression (display); 3 = 100% in view impression (display); 4 = in view 5 seconds impression (display); 5 = in view 15 seconds impression (display); 51 = measurable impression (video); 52 = viewable impression (video); 53 = 100% in view impression (video); 54 = in view 5 seconds impression (video); 55 = in view 15 seconds impression (video) |
viewability_event_name | string | The string corresponding to the viewability_event_id. |
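The viewability_event_id values listed in the table lend themselves to a small lookup. The sketch below is illustrative only: the labels are copied from the table description and are not necessarily the exact strings found in viewability_event_name, and the in-view rate (in-view events divided by measurable events) is an assumed metric definition rather than one stated by this document.

```python
from collections import Counter

# viewability_event_id -> (media type, description), as listed in the table.
VIEWABILITY_EVENTS = {
    1: ("display", "measurable impression"),
    2: ("display", "in view impression"),
    3: ("display", "100% in view impression"),
    4: ("display", "in view 5 seconds impression"),
    5: ("display", "in view 15 seconds impression"),
    51: ("video", "measurable impression"),
    52: ("video", "viewable impression"),
    53: ("video", "100% in view impression"),
    54: ("video", "in view 5 seconds impression"),
    55: ("video", "in view 15 seconds impression"),
}


def in_view_rate(event_ids, media="display"):
    """Return in-view events / measurable events, or None if none are measurable.

    Assumes one log row per viewability event (hypothetical usage).
    """
    measurable_id, in_view_id = (1, 2) if media == "display" else (51, 52)
    counts = Counter(event_ids)
    if counts[measurable_id] == 0:
        return None
    return counts[in_view_id] / counts[measurable_id]


if __name__ == "__main__":
    sample = [1, 1, 2, 3, 51, 52]
    print(in_view_rate(sample), in_view_rate(sample, media="video"))
```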
Appendix - Day/Hour/Week Part Logic
NOTE: The values below are intended primarily for Bring Your Own Algorithm integrations.
Day/hour/week part columns are generated from GMT timestamps using the formulas below, all based on week_hour_part (with values from 0 for Sunday 12:00 – 12:15am through 671 for Saturday 11:45 – 11:59pm).
// 24 hours * 4 15-minute parts in an hour
day_of_week = (int) week_hour_part / 96;
// 7 days * 4 15-minute parts in an hour
day_hour = (int) week_hour_part / 28;
// 7 days * 4 parts * 6 hours in a day part
day_part = (int) week_hour_part / 168;
// 5 days * 4 parts * 24 hours
week_part = (week_hour_part < 480) ? 0 : 1;
// Check if it's after Friday then divide by 7 days
// times 4 hours. Add 24 hours if over weekend
week_part_hour = (week_hour_part < 480) ? (int) week_hour_part / 28 : (int) (week_hour_part / 28) + 24;
// Check if it's after Friday then mod by 4
week_part_hour_part = (week_hour_part < 480) ? week_hour_part % 4 : (week_hour_part % 4) + 4;
// Straight mod by 4
hour_part = week_hour_part % 4;
// Divide by 4 parts then mod by 24 hours
week_hour = (int) (week_hour_part / 4) % 24;
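For readers who want to reproduce these columns outside the platform, the formulas above can be transcribed directly. The following is a minimal Python sketch that mirrors the formulas exactly as written, with integer division standing in for the `(int)` casts. It makes no attempt to reinterpret them, and the function name `derive_parts` is hypothetical.

```python
def derive_parts(week_hour_part: int) -> dict:
    """Derive the day/hour/week part columns from week_hour_part (0..671)."""
    if not 0 <= week_hour_part <= 671:
        raise ValueError("week_hour_part must be between 0 and 671")

    # The formulas treat the first 480 quarter-hours as the weekday portion.
    weekday = week_hour_part < 480

    return {
        "day_of_week": week_hour_part // 96,
        "day_hour": week_hour_part // 28,
        "day_part": week_hour_part // 168,
        "week_part": 0 if weekday else 1,
        "week_part_hour": week_hour_part // 28 + (0 if weekday else 24),
        "week_part_hour_part": week_hour_part % 4 + (0 if weekday else 4),
        "hour_part": week_hour_part % 4,
        "week_hour": (week_hour_part // 4) % 24,
    }


if __name__ == "__main__":
    for value in (0, 480, 671):
        print(value, derive_parts(value))
```

Comparing the output for a handful of week_hour_part values against the corresponding columns in a delivered log file is a quick way to confirm that a local implementation matches the data.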
Appendix - Batch ID Logic
Data platform data files for impressions and events are of the format:
/data/organization_id=[ORGANIZATION-ID]/[LOG-TYPE]_date=[YYYY-MM-DD]/mm_[LOG-TYPE]s_[ORGANIZATION-ID]_[YYYYMMDD]_[BATCH-ID]_[FILE-NUM].txt.lzo
LOG-TYPE
: "impression" for impression log files, "event" for event log files

ORGANIZATION-ID
: the six-digit TerminalOne organization ID

YYYY-MM-DD / YYYYMMDD
: the GMT date of the data contained in the directory/file

BATCH-ID
: the ten-digit identifier of the batch that generated the file, of the format YYYYMMDDNN, where YYYYMMDD is the processing date and NN is either 00, 01, or 02, corresponding to the first, second, and third batches of data processed each day. Note that the date in the batch ID is not necessarily the date of the data in the files; for example, batch 2014070100 would be expected to contain a great deal of data from both 2014-06-30 and 2014-07-01, and would write impressions files as both:

impression_date=2014-06-30/mm_impressions_123456_20140630_2014070100_0000.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0000.txt.lzo

FILE-NUM
: a particular batch may write many files for a single date. If the first batch on 2014-07-01 wrote, for organization 123456, two files of 2014-06-30 data and five files of 2014-07-01 data, the files would appear as:

impression_date=2014-06-30/mm_impressions_123456_20140630_2014070100_0000.txt.lzo
impression_date=2014-06-30/mm_impressions_123456_20140630_2014070100_0001.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0000.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0001.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0002.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0003.txt.lzo
impression_date=2014-07-01/mm_impressions_123456_20140701_2014070100_0004.txt.lzo
The file numbers are sequential but have no meaning otherwise; i.e., they do not represent a distribution of records by a particular campaign/creative/pixel, and are not sorted by time.
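When processing files incrementally, it can help to split a file name into its components so that the data date and the batch date can be tracked separately. The sketch below assumes file names follow the impression/event pattern shown above, with a batch ID of the form YYYYMMDDNN; the `LogFileName` type and `parse_log_file_name` function are hypothetical helpers, not part of the service.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class LogFileName(NamedTuple):
    log_type: str          # "impression" or "event"
    organization_id: str
    data_date: date        # GMT date of the records in the file
    batch_date: date       # processing date taken from the batch ID
    batch_number: int      # NN portion of the batch ID
    file_number: int       # sequential, carries no other meaning


_PATTERN = re.compile(
    r"mm_(?P<log_type>impression|event)s_"
    r"(?P<org>\d+)_(?P<data_date>\d{8})_"
    r"(?P<batch_date>\d{8})(?P<batch_num>\d{2})_"
    r"(?P<file_num>\d+)\.txt\.lzo$"
)


def parse_log_file_name(path: str) -> LogFileName:
    """Parse the trailing file name of a log path into its components."""
    match = _PATTERN.search(path)
    if match is None:
        raise ValueError(f"unrecognized log file name: {path!r}")
    return LogFileName(
        log_type=match["log_type"],
        organization_id=match["org"],
        data_date=datetime.strptime(match["data_date"], "%Y%m%d").date(),
        batch_date=datetime.strptime(match["batch_date"], "%Y%m%d").date(),
        batch_number=int(match["batch_num"]),
        file_number=int(match["file_num"]),
    )


if __name__ == "__main__":
    print(parse_log_file_name(
        "impression_date=2014-07-01/"
        "mm_impressions_123456_20140701_2014070100_0003.txt.lzo"
    ))
```

Because a single batch can write files for more than one data date, keying any bookkeeping on the pair (data_date, batch ID) rather than on the data date alone avoids missing late-arriving records.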