Can you devise an algorithm to de-duplicate a billion sets, given that the parameters in each set may vary slightly? Use searching, sorting, approximations, or statistics. (When can you say that two sets are more or less the same?)
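One possible starting point, offered only as a sketch and not as the expected answer: snap each parameter to a tolerance grid so that slightly different values compare equal, call two sets "more or less the same" when their Jaccard similarity exceeds a threshold, and use MinHash signatures with banding (locality-sensitive hashing) so that only likely matches are compared instead of all pairs. Everything named here is an assumption made for illustration: the function names, the tolerance `tol`, the `threshold`, the number of hashes and bands, and the premise that the parameters are numeric.

```python
import hashlib
from collections import defaultdict


def quantize(values, tol=0.01):
    """Snap numeric parameters to a grid of width `tol` (assumed tolerance)."""
    return frozenset(round(v / tol) for v in values)


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(seed, item):
    """Deterministic 64-bit hash of `item`, salted with `seed`."""
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """One minimum per salted hash; similar sets tend to share many minima."""
    if not items:
        return (0,) * num_hashes
    return tuple(min(_hash(seed, x) for x in items) for seed in range(num_hashes))


def candidate_pairs(signatures, bands=16):
    """Pair up sets whose signatures agree on at least one whole band."""
    rows = len(signatures[0]) // bands
    buckets = defaultdict(list)
    for idx, sig in enumerate(signatures):
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].append(idx)
    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs


def deduplicate(raw_sets, tol=0.01, threshold=0.8):
    """Return the indices of one representative per group of near-duplicates."""
    if not raw_sets:
        return []
    sets = [quantize(s, tol) for s in raw_sets]
    sigs = [minhash_signature(s) for s in sets]
    parent = list(range(len(sets)))  # union-find forest over set indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in candidate_pairs(sigs):
        # Verify each candidate exactly before merging its group.
        if jaccard(sets[i], sets[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)  # lowest index represents the group
    return sorted({find(i) for i in range(len(sets))})


raw = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.001, 2.001, 2.999, 4.0, 5.002],  # same as the first, up to small noise
    [10.0, 20.0, 30.0, 40.0, 50.0],
]
print(deduplicate(raw))
```

This version keeps everything in memory on one machine, so it illustrates the idea and not the scale. For a billion sets, the signatures and band buckets would presumably have to be computed and grouped in a distributed or external-memory fashion, for example by sorting on band keys. Grid quantization can also split two nearly equal values that fall on opposite sides of a bucket boundary, which a more careful similarity measure would need to handle.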
Sample data: can be provided on request; contact Shri B.S. Jagadeesh, firstname.lastname@example.org, or use the link below.