One common task is looking for duplicate records and deleting them. I locate them by looking for records where the field in question has the same value across multiple records:
select field1 from
(
select cnt,field1 from
(
select count(1) cnt,field1 from table1 group by field1
) x where cnt > 1
) x where cnt > 1
But I can’t just delete all of those, because I want to keep 1 record and delete only the duplicate. So, I find the lowest id for the duplicate records (this is the primary key, hopefully it is always an integer), and that is the record that I want to delete:
select min(id) id,field1
from table1
where field1 in
(
select field1 from
(
select count(1) cnt,field1 from table1 group by field1
) x where cnt > 1
)
group by field1
Here’s how I “delete” them. First, I change the parentid (e.g. I change the foreign key to a non-existent record.). Then, I test my application. If the application breaks, I update the parentid to make it what it used to be. It is a quick way of testing and restoring.
update table1
set parentfield = -999
where field1 in
(
select id from
(
select min(id) id,field1
from table1
where field1 in
(
select field1 from
(
select count(1) cnt,field1 from table1 group by field1
) x where cnt > 1
)
group by field1
) x
)
When I’m convinced that my changes didn’t cause a problem, I then delete them:
delete from table1 where parentid = -999