I need to find duplicates in a table named lead, which contains around 100k records. The duplicates have similar values in the company column, such as:
The goal is to retain only the latest leadid (95803 in this case). However, there's an issue with leadid 95803, which has some extra characters after a space.
I've attempted to use the following script, but it's not providing the desired results:
SELECT t1.*
FROM [dbo].[LEAD] t1
LEFT JOIN (
    SELECT
        company,
        city,
        MAX(leadid) AS keep_leadid
    FROM [dbo].[LEAD]
    GROUP BY company, city
) t2 ON t1.company = t2.company AND t1.city = t2.city
WHERE t1.leadid <> t2.keep_leadid
  AND t1.company LIKE '%Uvalde Country%'
Any assistance in refining the script to achieve the intended outcome would be greatly appreciated.
I want to delete all of them except this one:
There are many companies with different strings, and I want to apply the same script to all of them.
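
To make the requirement concrete, here is a rough sketch of one possible direction, not a confirmed solution. It assumes that two rows count as "similar" when they have the same city and either the same company value or one company value equal to the other followed by a space and extra characters. That rule is my own reading of the description, and the aliases old and newer are only illustrative. A row is treated as a duplicate when a similar row with a higher leadid exists, so the query does not depend on any particular company string.

```sql
-- Sketch only. Assumption: "similar" means same city and either an identical
-- company value or one value extending the other after a space.
-- Caveat: a company value containing %, _ or [ is interpreted as a LIKE
-- wildcard here and would need escaping.

-- Step 1: preview the rows that would be removed
-- (each has a newer similar row, so it is not the latest leadid).
SELECT old.*
FROM [dbo].[LEAD] AS old
WHERE EXISTS (
    SELECT 1
    FROM [dbo].[LEAD] AS newer
    WHERE newer.leadid > old.leadid
      AND newer.city = old.city
      AND (
             newer.company = old.company
          OR newer.company LIKE old.company + ' %'
          OR old.company LIKE newer.company + ' %'
      )
);

-- Step 2: the same condition as a delete, wrapped in a transaction
-- so the result can be inspected before it becomes permanent.
BEGIN TRANSACTION;

DELETE old
FROM [dbo].[LEAD] AS old
WHERE EXISTS (
    SELECT 1
    FROM [dbo].[LEAD] AS newer
    WHERE newer.leadid > old.leadid
      AND newer.city = old.city
      AND (
             newer.company = old.company
          OR newer.company LIKE old.company + ' %'
          OR old.company LIKE newer.company + ' %'
      )
);

-- Check the remaining rows, then run COMMIT or ROLLBACK.
```

The space in the LIKE pattern is what keeps a name from matching a longer, unrelated word that merely starts with the same letters. Even so, a short company name followed by a space can still be the start of a genuinely different company, so the preview in step 1 is worth reading before deleting anything.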