在 MySQL 表中查找重复记录

Victor A. Oguntuase 2023年1月30日 2022年5月13日
  1. 使用 SELECT DISTINCT 语句检查 MySQL 表中的重复项
  2. 使用 SELECT COUNT 语句查找 MySQL 表中重复出现的次数
  3. 使用 INNER JOIN 语句查看 MySQL 表中的重复记录
在 MySQL 表中查找重复记录

本教程探讨如何分别使用 SELECT DISTINCTSELECT COUNT 语句检查 MySQL 表中的重复项并检索重复项的数量。

使用 SELECT DISTINCT 语句检查 MySQL 表中的重复项

数据库表中有重复记录的影响可能从轻微的不便到灾难。幸运的是,MySQL 有一些漂亮的关键字可以组合起来扫描表中的重复项。

此外,我们可以计算重复记录的出现次数,并在必要时将其删除。

让我们使用名为 employee_details 的示例表创建一个 test_company 数据库。

CREATE DATABASE test_company;
USE test_company;

CREATE TABLE employees_details(
    id INT AUTO_INCREMENT,
    name VARCHAR (255) NOT NULL,
    title VARCHAR(255) NOT NULL,
    salary INT,

    PRIMARY KEY (id)
);

该表填充了值,包括重复值。

INSERT INTO employees_details (name, title, salary) Values
('James Maddison','Computer Engineer',80000),
('Matthew Defoe','Software Architect',150000),
('Daniel Jameson','Software Engineer II', 95000),
('Jules Reddington','Senior Software Engineer',120000),
('Carlos Rodriguez','Data Engineer',100000),
('Matthew Defoe','Software Architect',150000),
('Daniel Jameson','Software Engineer II', 95000),
('Jules Reddington','Senior Software Engineer',120000);

SELECT * FROM employees_details;

输出:

+----+------------------+--------------------------+--------+
| id | name             | title                    | salary |
+----+------------------+--------------------------+--------+
|  1 | James Maddison   | Computer Engineer        |  80000 |
|  2 | Matthew Defoe    | Software Architect       | 150000 |
|  3 | Daniel Jameson   | Software Engineer II     |  95000 |
|  4 | Jules Reddington | Senior Software Engineer | 120000 |
|  5 | Carlos Rodriguez | Data Engineer            | 100000 |
|  6 | Matthew Defoe    | Software Architect       | 150000 |
|  7 | Daniel Jameson   | Software Engineer II     |  95000 |
|  8 | Jules Reddington | Senior Software Engineer | 120000 |
+----+------------------+--------------------------+--------+
8 rows in set (0.00 sec)

此表中有三个重复项,由于表的大小较小,因此很容易发现。我们对较大的表使用 SELECT DISTINCT 语句从表中检索唯一记录。

根据官方文档SELECT DISTINCT 语句仅检索输出记录的一个实例,即使它出现多次。

因此,当 SELECT DISTINCT 语句返回的记录数小于表中的总记录数时,我们可以确定存在重复。

-- Retrieving only distinct records.
SELECT DISTINCT name,title,salary FROM employees_details;

输出:

+------------------+--------------------------+--------+
| name             | title                    | salary |
+------------------+--------------------------+--------+
| James Maddison   | Computer Engineer        |  80000 |
| Matthew Defoe    | Software Architect       | 150000 |
| Daniel Jameson   | Software Engineer II     |  95000 |
| Jules Reddington | Senior Software Engineer | 120000 |
| Carlos Rodriguez | Data Engineer            | 100000 |
+------------------+--------------------------+--------+
5 rows in set (0.00 sec)

请注意,查询中不包含 id 列,因为 MySQL 将 id 列标识为唯一记录。

因此,在查询中包含 id 列将返回所有记录(包括重复项)作为唯一记录。

从结果集中,我们可以推断出有三 (3) 条重复记录,因为查询返回五 (5) 条不同的记录并且表中有八 (8) 条记录。

使用 SELECT COUNT 语句查找 MySQL 表中重复出现的次数

现在,在检测到重复项的存在后,我们可以使用 SELECT COUNT 语句来查找重复项的出现次数。

SELECT name AS 'employee name', COUNT(*) AS Occurrence FROM employees_details
GROUP BY name
HAVING Occurrence > 1;

输出:

+------------------+------------+
| employee name    | Occurrence |
+------------------+------------+
| Daniel Jameson   |          2 |
| Jules Reddington |          2 |
| Matthew Defoe    |          2 |
+------------------+------------+
3 rows in set (0.001 sec)

这将检索重复的记录和表中每条记录的重复数。正如预期的那样,有三 (3) 条重复记录。

SELECT COUNT 语句的使用将通过此官方参考进一步讨论。

使用 INNER JOIN 语句查看 MySQL 表中的重复记录

我们可以在目标表和 SELECT DISTINCT 查询之间使用 INNER JOIN 查询来查看主记录旁边的重复记录。

SELECT I.id, O.name, O.title, O.salary
FROM employees_details AS I
INNER JOIN(SELECT DISTINCT name, title, salary FROM employees_details) AS O
ON I.name = O.name;

输出:

+----+------------------+--------------------------+--------+
| id | name             | title                    | salary |
+----+------------------+--------------------------+--------+
|  1 | James Maddison   | Computer Engineer        |  80000 |
|  2 | Matthew Defoe    | Software Architect       | 150000 |
|  6 | Matthew Defoe    | Software Architect       | 150000 |
|  3 | Daniel Jameson   | Software Engineer II     |  95000 |
|  7 | Daniel Jameson   | Software Engineer II     |  95000 |
|  4 | Jules Reddington | Senior Software Engineer | 120000 |
|  8 | Jules Reddington | Senior Software Engineer | 120000 |
|  5 | Carlos Rodriguez | Data Engineer            | 100000 |
+----+------------------+--------------------------+--------+
8 rows in set (0.001 sec)
Victor A. Oguntuase avatar Victor A. Oguntuase avatar

Victor is an experienced Python Developer, Machine Learning Engineer and Technical Writer with interests across various fields of science and engineering. He is passionate about learning new technologies and skill and working on challenging problems. He enjoys teaching, intellectual discourse, and gaming, among other things.

LinkedIn GitHub