sql - Selecting the top performers from a BigQuery table

I have a BigQuery table that looks like this:

User  | URL            | Sessions
user1 | example.com/1/ | 3000
user2 | example.com/2/ | 4000
user3 | example.com/2/ | 5000
user4 | example.com/1/ | 1000
...   | ...            | ...

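I want to pull the top-performing user for each URL. Ideally, the final output would be a smaller table with one row per URL, showing the user who drove the most sessions for that URL.

I tried a SQL query like this: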

SELECT User, URL, ARRAY_AGG(Sessions ORDER BY Sessions DESC LIMIT 1) FROM 'table'

but I keep getting errors. Any help is much appreciated!

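Answer 1

Assuming I've understood the question correctly, you want to total the sessions for each URL, broken out by user? If no user appears more than once for the same URL, SUM has nothing to combine, but it still lets you show Sessions while grouping by the other columns.

Try this: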

SELECT 
  User,
  URL,
  SUM(Sessions) AS Total_Sessions
FROM `table`
GROUP BY User, URL
ORDER BY Total_Sessions DESC
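
The query above returns one row per (User, URL) pair, so a URL with several users still appears several times. If the goal is a single row per URL, the totals could be ranked within each URL and only the first row kept. The following is a minimal sketch, not a definitive implementation: the `sample` CTE is a hypothetical stand-in for the real table, filled with the four rows shown in the question, and `totals`, `ranked`, and `url_rank` are illustrative names.

```sql
-- Sketch: total sessions per (User, URL), then keep the top user for each URL.
-- `sample` is a hypothetical inline stand-in for the real table.
WITH sample AS (
  SELECT 'user1' AS User, 'example.com/1/' AS URL, 3000 AS Sessions
  UNION ALL SELECT 'user2', 'example.com/2/', 4000
  UNION ALL SELECT 'user3', 'example.com/2/', 5000
  UNION ALL SELECT 'user4', 'example.com/1/', 1000
),
totals AS (
  -- Same aggregation as the query above
  SELECT User, URL, SUM(Sessions) AS Total_Sessions
  FROM sample
  GROUP BY User, URL
),
ranked AS (
  -- Number the users within each URL, highest total first
  SELECT
    User,
    URL,
    Total_Sessions,
    ROW_NUMBER() OVER (PARTITION BY URL ORDER BY Total_Sessions DESC) AS url_rank
  FROM totals
)
SELECT User, URL, Total_Sessions
FROM ranked
WHERE url_rank = 1
ORDER BY Total_Sessions DESC
```

With the four sample rows this keeps user3 for example.com/2/ (5000) and user1 for example.com/1/ (3000). ROW_NUMBER breaks ties arbitrarily; RANK could be used instead if every tied user should be returned.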

Answer 2

You will need a window function such as RANK or ROW_NUMBER:

Sample:

WITH input AS (
  -- Inline sample rows standing in for the real table
  SELECT 1 AS user, 'x' AS url, 100 AS session
  UNION ALL SELECT 1 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 1 AS user, 'y' AS url, 400 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 300 AS session
)
SELECT user, url, session
FROM (
  SELECT user, url, session,
    -- Partition by url only, so each URL yields exactly one top row
    ROW_NUMBER() OVER (PARTITION BY url ORDER BY session DESC) AS top_rank
  FROM input
)
WHERE top_rank = 1
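
The ARRAY_AGG attempt from the question can also be made to work. Two things likely caused the errors there: a table name in single quotes is read as a string literal (BigQuery uses backticks for table names), and User and URL were selected next to an aggregate without a GROUP BY. The sketch below groups by URL and aggregates User instead of Sessions. It assumes the column names from the question; the `sample` CTE and the `Top_User` and `Top_Sessions` aliases are hypothetical names used only for illustration.

```sql
-- Sketch: one row per URL using ARRAY_AGG ... ORDER BY ... LIMIT 1.
-- `sample` is a hypothetical inline stand-in for the real table.
WITH sample AS (
  SELECT 'user1' AS User, 'example.com/1/' AS URL, 3000 AS Sessions
  UNION ALL SELECT 'user2', 'example.com/2/', 4000
  UNION ALL SELECT 'user3', 'example.com/2/', 5000
  UNION ALL SELECT 'user4', 'example.com/1/', 1000
)
SELECT
  URL,
  -- Collect users ordered by sessions, keep only the first, then unwrap the array
  ARRAY_AGG(User ORDER BY Sessions DESC LIMIT 1)[OFFSET(0)] AS Top_User,
  MAX(Sessions) AS Top_Sessions
FROM sample
GROUP BY URL
```

For the sample rows this returns user1 with 3000 for example.com/1/ and user3 with 5000 for example.com/2/. Against a real table, the CTE would be dropped and `sample` replaced with the backtick-quoted table name.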
