How To Delete Duplicate Rows in MySQL
Duplicate rows in a table can cause data inconsistencies and affect the integrity of your database. Deleting duplicates in MySQL requires identifying rows with identical values in one or more columns and keeping exactly one row from each group of duplicates.
Steps to Delete Duplicate Rows
Step 1: Identify Duplicates
First, identify the duplicate rows using a GROUP BY query with an aggregate function such as COUNT(*), keeping only the groups whose count is greater than 1.
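A minimal sketch of such a query follows. The table name your_table is a placeholder, and column1 and column2 stand for whichever columns define a duplicate in your data.

```sql
-- List each combination of values that occurs more than once,
-- together with how many times it occurs.
SELECT column1, column2, COUNT(*) AS occurrences
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;
```

Each returned row represents one group of duplicates; an empty result means no duplicates exist for the chosen columns.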
- Replace column1, column2, etc., with the columns you want to check for duplicates.
Step 2: Delete Duplicates Using a Temporary Table
- Create a temporary table to store unique rows.
- Delete all data from the original table.
- Reinsert the unique rows into the original table.
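The three steps above can be sketched as follows. The names your_table and temp_unique are placeholders, and the sketch assumes that duplicate rows are identical in every column, since SELECT DISTINCT * only collapses rows that match completely.

```sql
-- 1. Copy one representative of each distinct row into a temporary table.
CREATE TEMPORARY TABLE temp_unique AS
SELECT DISTINCT * FROM your_table;

-- 2. Empty the original table.
DELETE FROM your_table;

-- 3. Put the unique rows back.
INSERT INTO your_table
SELECT * FROM temp_unique;

-- 4. Clean up (the temporary table would also vanish when the session ends).
DROP TEMPORARY TABLE temp_unique;
```

If the table has a surrogate key such as an auto-increment id, duplicates differ in that column and DISTINCT * will not remove them. In that case, one option is to fill the temporary table with only the row holding the smallest id in each group, or to use one of the id-based methods in the following steps.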
Step 3: Delete Duplicates Using ROW_NUMBER() (MySQL 8.0 and later)
You can use the ROW_NUMBER() window function to number the rows within each group of duplicates and then delete every row whose row number is greater than 1.
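One way to write this is with a common table expression (CTE), as sketched below. The names your_table, ranked, and row_num are placeholders chosen for illustration, and ordering by id (so that the lowest id in each group is kept) is an assumption.

```sql
-- Number the rows inside each group of identical (column1, column2) values.
-- The row with the smallest id in each group gets row_num = 1.
WITH ranked AS (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY column1, column2
               ORDER BY id
           ) AS row_num
    FROM your_table
)
-- Delete every row that is not the first one in its group.
DELETE FROM your_table
WHERE id IN (SELECT id FROM ranked WHERE row_num > 1);
```

Changing ORDER BY id to ORDER BY id DESC would keep the row with the largest id instead.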
- Replace id with the primary key or a unique column in your table.
- The PARTITION BY clause groups rows with identical values in column1, column2, etc.
- The ROW_NUMBER() function assigns a sequential number to each row in the group.
Step 4: Delete Duplicates Without Using a Temporary Table or CTE
If your table has a unique identifier, you can use a self-join to keep the row with the smallest (or largest) identifier in each group of duplicates and delete the rest.
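A sketch of such a self-join delete, using MySQL's multi-table DELETE syntax, is shown here. The table name your_table is a placeholder, and column1 and column2 stand for the columns that define a duplicate.

```sql
-- Join the table to itself on the duplicate-defining columns and
-- delete every row for which a matching row with a smaller id exists.
DELETE t1
FROM your_table AS t1
INNER JOIN your_table AS t2
    ON  t1.column1 = t2.column1
    AND t1.column2 = t2.column2
    AND t1.id > t2.id;
```

Because the = operator never matches NULL, rows whose compared columns contain NULL are not treated as duplicates by this sketch; MySQL's NULL-safe operator <=> could be used in the join conditions if they should be.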
- This query compares the table to itself (t1 and t2 are aliases) to find duplicates.
- The condition t1.id > t2.id ensures that only rows with a greater id are deleted, leaving the row with the smallest id in each group intact.
Examples
Example Table: products
id | product_name | category | price |
---|---|---|---|
1 | Laptop | Electronics | 800 |
2 | Laptop | Electronics | 800 |
3 | Phone | Electronics | 500 |
4 | Phone | Electronics | 500 |
5 | Tablet | Electronics | 300 |
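To try the examples yourself, the table above could be created along these lines. The column types are assumptions made for this sketch; only the column names and values come from the table.

```sql
-- Hypothetical definition; adjust the types to match your own schema.
CREATE TABLE products (
    id           INT PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    category     VARCHAR(100) NOT NULL,
    price        DECIMAL(10, 2) NOT NULL
);

INSERT INTO products (id, product_name, category, price) VALUES
    (1, 'Laptop', 'Electronics', 800),
    (2, 'Laptop', 'Electronics', 800),
    (3, 'Phone',  'Electronics', 500),
    (4, 'Phone',  'Electronics', 500),
    (5, 'Tablet', 'Electronics', 300);
```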
Deleting Duplicate Rows Using ROW_NUMBER()
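Applied to the products table, the ROW_NUMBER() approach might look like the sketch below. It treats rows as duplicates when product_name, category, and price all match, and the CTE name ranked and the alias row_num are chosen for illustration.

```sql
WITH ranked AS (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY product_name, category, price
               ORDER BY id
           ) AS row_num
    FROM products
)
-- Rows 2 and 4 are the second row in their groups, so they are removed.
DELETE FROM products
WHERE id IN (SELECT id FROM ranked WHERE row_num > 1);
```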
Result:
id | product_name | category | price |
---|---|---|---|
1 | Laptop | Electronics | 800 |
3 | Phone | Electronics | 500 |
5 | Tablet | Electronics | 300 |
Deleting Duplicate Rows Without ROW_NUMBER()
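On versions without window functions, a self-join along the following lines should give the same outcome for the products table. The aliases p1 and p2 are chosen for illustration.

```sql
-- Delete each product row that has an identical row with a smaller id.
DELETE p1
FROM products AS p1
INNER JOIN products AS p2
    ON  p1.product_name = p2.product_name
    AND p1.category     = p2.category
    AND p1.price        = p2.price
    AND p1.id > p2.id;
```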
Result:
id | product_name | category | price |
---|---|---|---|
1 | Laptop | Electronics | 800 |
3 | Phone | Electronics | 500 |
5 | Tablet | Electronics | 300 |
Best Practices
- Backup Data: Always create a backup of your table before performing delete operations.
- Test Query: Run a SELECT query first to ensure the rows you intend to delete are correct.
- Use Transactions: Wrap your delete operations in a transaction to allow rollback if something goes wrong.
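These practices can be combined roughly as follows. This is a sketch under several assumptions: your_table, your_table_backup, column1, and column2 are placeholder names, the table uses a transactional storage engine such as InnoDB, and the self-join delete is the method being applied.

```sql
-- Back up first. CREATE TABLE causes an implicit commit,
-- so it runs before the transaction starts.
-- Note: this copies the data but not the original table's indexes.
CREATE TABLE your_table_backup AS
SELECT * FROM your_table;

START TRANSACTION;

-- Preview the rows that the DELETE below would remove.
SELECT DISTINCT t1.*
FROM your_table AS t1
INNER JOIN your_table AS t2
    ON  t1.column1 = t2.column1
    AND t1.column2 = t2.column2
    AND t1.id > t2.id;

-- Remove the duplicates, keeping the smallest id in each group.
DELETE t1
FROM your_table AS t1
INNER JOIN your_table AS t2
    ON  t1.column1 = t2.column1
    AND t1.column2 = t2.column2
    AND t1.id > t2.id;

-- Confirm that no duplicate groups remain (should return no rows).
SELECT column1, column2, COUNT(*) AS occurrences
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;

COMMIT;  -- or ROLLBACK; to undo the DELETE
```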
Conclusion
Deleting duplicate rows in MySQL can be achieved using various methods, depending on the version of MySQL and the structure of your data. For MySQL 8.0 and later, the ROW_NUMBER() function is highly effective. For older versions, self-joins or temporary tables provide a reliable alternative. Always ensure data safety by testing and backing up your database before proceeding.