SQL面试题

上周去面试了某电商的数据分析师职位。
面试前几天临时抱佛脚看完了3年前买的《SQL必知必会》。不过最后还是跪在了2道SQL题目上。
Google之后发现2道题分别代表一类常见的查询。都是套路啊~


问题描述:以AMD和Intel近5年的Revenue数据为例。首先创建表和插入数据。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- MySQL, PostgreSQL, SQL Server
CREATE TABLE financials (
company_name VARCHAR(10) NOT NULL,
fiscal_year SMALLINT NOT NULL,
revenue DECIMAL(10,2) NOT NULL
);
INSERT INTO financials (company_name, fiscal_year, revenue) VALUES
('AMD', 2012, 5.42),
('AMD', 2013, 5.30),
('AMD', 2014, 5.51),
('AMD', 2015, 3.99),
('AMD', 2016, 4.27),
('Intel', 2012, 53.34),
('Intel', 2013, 52.71),
('Intel', 2014, 55.87),
('Intel', 2015, 55.36),
('Intel', 2016, 59.39);


Q1: 将原始的表转换成如下形式。

Company 2012 2013 2014 2015 2016
AMD 5.42 5.3 5.51 3.99 4.27
Intel 53.3 52.7 55.9 55.4 59.4

A1: 这是行转列,或者说创建透视表(pivot table) / 列联表(cross tabulation)的操作。
标准SQL可以用CASE语句来实现。MySQL的IF和Oracle的DECODE可以代替CASE

1
2
3
4
5
6
7
8
9
10
-- MySQL, PostgreSQL, SQL Server, Oracle
SELECT
company_name AS Company,
SUM(CASE WHEN fiscal_year = 2012 THEN revenue END) AS "2012",
SUM(CASE WHEN fiscal_year = 2013 THEN revenue END) AS "2013",
SUM(CASE WHEN fiscal_year = 2014 THEN revenue END) AS "2014",
SUM(CASE WHEN fiscal_year = 2015 THEN revenue END) AS "2015",
SUM(CASE WHEN fiscal_year = 2016 THEN revenue END) AS "2016"
FROM financials
GROUP BY company_name;

SQL Server提供了PIVOT操作符来简化一串SELECT...CASE语句。Oracle也支持PIVOT

1
2
3
4
5
6
7
8
9
10
-- SQL Server
SELECT
company_name AS Company,
[2012], [2013], [2014], [2015], [2016]
FROM financials
PIVOT
(
SUM(revenue) FOR fiscal_year IN ([2012], [2013], [2014], [2015], [2016])
) AS PivotTable
ORDER BY company_name;

PostgreSQL的tablefunc模块提供了类似功能的crosstab函数。

1
2
3
4
CREATE extension tablefunc;
SELECT *
FROM crosstab('SELECT company_name, fiscal_year, revenue FROM financials ORDER BY 1,2')
AS ct(Company VARCHAR, "2012" DECIMAL, "2013" DECIMAL, "2014" DECIMAL, "2015" DECIMAL, "2016" DECIMAL)


Q2: 列出每家公司Revenue第二多的年份。

A2: 最简单的方法是使用窗口函数。
PostgreSQL, SQL Server, Oracle支持ROW_NUMBER函数,MySQL不支持。

1
2
3
4
5
6
-- PostgreSQL, SQL Server
SELECT company_name, fiscal_year, revenue
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY company_name ORDER BY revenue DESC) AS row_num
FROM financials ) AS ranked
WHERE row_num = 2;

MySQL可以使用变量来实现分组排序。

1
2
3
4
5
6
7
SELECT company_name, fiscal_year, revenue
FROM (SELECT *,
@rank := IF(@company = company_name, @rank + 1, 1) AS row_num,
@company := company_name
FROM financials
ORDER BY company_name, revenue DESC) AS ranked
WHERE row_num = 2;


P.S.

  • 《SQL必知必会》的内容很基础,没有CASE语句、PIVOT、窗口函数等。 后来在书城看到《SQL基础教程》,觉得更适合入门。
  • 发现MySQL相比其他关系数据库缺少一些有用的特性。