Identifying High-Cost Queries Using BigQuery's INFORMATION_SCHEMA

gcp

You can retrieve executed queries and total_bytes_billed from INFORMATION_SCHEMA.JOBS. To avoid duplication, exclude statement_type = “SCRIPT”.

SELECT
  CASE
    WHEN STARTS_WITH(job_id, 'materialized_view_refresh_') then 'materialized_view_refresh'
    WHEN STARTS_WITH(job_id, 'scheduled_query_') then 'scheduled_query'
    WHEN STARTS_WITH(job_id, 'bquxjob_') then 'bq user interface'
    ELSE 'other' 
  END AS job_type,
  user_email,
  query,
  state,
  total_bytes_billed / 1024 / 1024 / 1024 / 1024 * 6.25 /* USD / TB */ as cost
FROM
  `region-us.INFORMATION_SCHEMA.JOBS`
WHERE DATE(creation_time) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AND statement_type != 'SCRIPT'
ORDER BY total_bytes_billed DESC

For materialized views, since the query value will be like CALL BQ.REFRESH_MATERIALIZED_VIEW(view_name), INFORMATION_SCHEMA.TABLES is also helpful.

SELECT table_name, ddl
FROM (account).(dataset).INFORMATION_SCHEMA.TABLES
WHERE table_type = 'MATERIALIZED VIEW'

Reference

Categorize bigquery queries based on job_id coming from information_schema.jobs to look at costs