Parsing the last 4 days of google analytics table GA_sessions in google BigQuery

The problems with querying google BigQuery:

  1. The table is sharded not partitioned – this means each table has it owen table name e.g ga_sessions_20190202
  2. The GA sharded table also has some temporary data on a table called ga_sessions_intraday_20190505
  3. so when you select * from ga_sessions_* you will also get some unexpected ga_sessions_intraday_* shards

Below is example of querying the last 4 days in any GA_sessions table

select * from myProject.MyDataset.ga_sessions_*
where TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 4 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) OR (_TABLE_SUFFIX = CONCAT('intraday',CAST(FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY)) AS string) ) OR TABLE_SUFFIX = CONCAT('intraday',CAST(FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) AS string) ))

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s