Below is the original post from Richard French at Axiom Systems Inc.
From: Richard S. French
Sent: Wednesday, October 10, 2001 1:10 PM
To: robelle-l
Subject: [robelle-l] Suprtool - Deleting duplicates in detail dataset

Using Suprtool, I inadvertently created duplicate records in my detail dataset. Is there an easier way to delete these duplicate records, other than what I have described below? Note - currently using Suprtool Version 4.3. Thanks.

    Open database
    Get detail dataset records
    Sort records
    Duplicate none keys
    Output flat file
    Delete
    Xeq
    Input flat file
    Put to detail dataset
    Xeq

Richard S. French
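For readers less familiar with Suprtool, here is a minimal sketch of what Richard's outline could look like as actual commands. The database SALESDB, detail dataset D-ORDERS, key field ORDER-NO, and password are hypothetical placeholders, not names from Richard's post:

    base salesdb,1,"writepass"   {open the database in modify mode - hypothetical names}
    get d-orders                 {read every record in the detail dataset}
    sort order-no                {sort on the field that should be unique}
    duplicate none keys          {keep only one copy of each key value}
    output tempfile              {write the de-duplicated records to a flat file}
    delete                       {delete all records read from the dataset}
    xeq
    input tempfile               {read the de-duplicated copies back in}
    put d-orders                 {re-add them to the detail dataset}
    xeq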
Richard's code above deletes all the records from the dataset and then re-puts them, but without the duplicates. Glenn Cole suggested using Adager to erase the dataset, which would be faster. But Hans Hendriks from Robelle support came up with another way to do the entire job, only much faster:
Sent: Wednesday, October 10, 2001 5:35 PM
To: robelle-l
Subject: Re: Suprtool - Deleting duplicates in detail dataset

Richard,

Your example deletes *all* records, then writes back *all* "originals". This could be expensive if you have a large number of records and only a few duplicates. If this is the case, I suggest you do the following:

Pass 1 - find duplicated records:

    get dataset
    define longfield,1,80        (enough to make records unique)
    sort longfield
    duplicate only keys
    output dupfile,link
    xeq

Pass 2 - delete all records that have duplicates:

    get dataset
    table duptab,longfield,sorted,dupfile
    if $lookup(duptab,longfield)
    delete
    output $null
    xeq

Pass 3 - add back 1 of each previously duplicated record:

    input dupfile
    sort longfield
    duplicate none keys          {in case there were 3 or more of some....}
    put dataset
    xeq

/Hans
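If you want to confirm the cleanup before declaring victory, one simple check is to rerun the duplicate-detection pass and verify that it selects nothing. This is only a sketch reusing Hans's generic placeholders (dataset, longfield):

    get dataset
    define longfield,1,80        {same definition used to find the duplicates}
    sort longfield
    duplicate only keys          {select only records that still have duplicates}
    output $null                 {discard the records; we only care about the count}
    xeq

If Suprtool reports an output count of zero at the end of the task, no duplicates remain.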
And Richard confirmed the beauty of the solution:
Subject: Re: Suprtool - Deleting duplicates in detail dataset
To: robelle-l

Thanks, Hans. That was just what I was looking for - as my detail dataset had 1.5 million records and I had roughly 20,000 duplicates. I just tested this and it worked great!