DISTINCT ON 是 PostgreSQL 对标准 SQL 的扩展,通常是最简单和最快的方法。示例代码如下:
1 2 3 4
SELECTDISTINCTON (customer) id, customer, total FROM purchases ORDERBY customer, total DESC, id;
如果 total 可能为 null,可以添加 NULLS LAST:
1 2 3 4
SELECTDISTINCTON (customer) id, customer, total FROM purchases ORDERBY customer, total DESCNULLS LAST, id;
2. 使用 row_number() 窗口函数
1 2 3 4 5 6 7 8
WITH cte AS ( SELECT id, customer_id, total , row_number() OVER (PARTITIONBY customer_id ORDERBY total DESC) AS rn FROM purchases ) SELECT id, customer_id, total FROM cte WHERE rn = 1;
3. 使用 array_agg() 函数
1 2 3 4 5
SELECT (array_agg(id ORDERBY total DESC))[1] AS id , customer_id , max(total) AS total FROM purchases GROUPBY customer_id;
WITHRECURSIVE cte AS ( ( SELECT id, customer_id, total FROM purchases ORDERBY customer_id, total DESC LIMIT1 ) UNIONALL SELECT u.* FROM cte c , LATERAL ( SELECT id, customer_id, total FROM purchases WHERE customer_id > c.customer_id ORDERBY customer_id, total DESC LIMIT1 ) u ) SELECT id, customer_id, total FROM cte ORDERBY customer_id;
5. 使用 customer 表与 LATERAL 子查询
1 2 3 4 5 6 7 8 9
SELECT l.* FROM customer c , LATERAL ( SELECT id, customer_id, total FROM purchases WHERE customer_id = c.customer_id ORDERBY total DESC LIMIT1 ) l;
核心代码
以下是几种常见方法的核心代码示例:
DISTINCT ON 方法
1 2 3 4
SELECTDISTINCTON (customer) id, customer, total FROM purchases ORDERBY customer, total DESC, id;
row_number() 方法
1 2 3 4 5 6 7 8
WITH cte AS ( SELECT id, customer_id, total , row_number() OVER (PARTITIONBY customer_id ORDERBY total DESC) AS rn FROM purchases ) SELECT id, customer_id, total FROM cte WHERE rn = 1;
array_agg() 方法
1 2 3 4 5
SELECT (array_agg(id ORDERBY total DESC))[1] AS id , customer_id , max(total) AS total FROM purchases GROUPBY customer_id;
最佳实践
数据分布与索引:对于每个客户行数较少(customer 列基数高)的情况,使用 DISTINCT ON 并结合合适的多列索引(如 CREATE INDEX purchases_3c_idx ON purchases (customer, total DESC, id);)非常高效。对于每个客户行数较多(customer 列基数低)的情况,在 Postgres 18 及以后可以使用索引跳过扫描(Index skip scan),之前可以考虑使用递归 CTE 或带有 LATERAL 子查询的方法。