Python: Removing Duplicates while Maintaining Order

Python provides several techniques to remove duplicates from a list while preserving the original order of the elements. Let us explore 4 such methods for removing duplicate elements and compare their performance, syntax, and use cases.

1. Different Methods to Remove Duplicates and Maintain Order

Let’s start by comparing the 4 different techniques:

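Method | When to Use | Performance
Seen Set | Lists of hashable elements; an explicit loop that is easy to extend | Linear time on average
OrderedDict | Lists of hashable elements; short, clean code | Linear time on average
NumPy Array | Numerical and large lists or arrays | O(n log n) because unique() sorts; fast on large numeric arrays
Pandas Dataframe | Dataframes or tabular data | Fast on large tabular data; conversion overhead for small lists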

Now, let’s delve into each method in detail.

2. Seen Set: Removing Duplicates using Iteration

This method utilizes a set to keep track of seen elements while iterating through the list. When encountering a new element, it checks if it’s already in the set. If not, it adds the element to both the result list and the set. This ensures that only unique elements are retained, preserving the original order.

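It is suitable for lists whose elements are hashable, such as numbers, strings, or tuples. Non-hashable elements such as lists, dictionaries, or other sets cannot be stored in a set and raise a TypeError. Set membership tests take constant time on average, so the whole pass runs in linear time.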

def remove_duplicates_seen(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Example Usage
original_list = [5, 1, 2, 4, 2, 3, 1]
print(remove_duplicates_seen(original_list))

The program output:

[5, 1, 2, 4, 3]
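
A set can only hold hashable elements, so the function above raises a TypeError for input such as a list of lists. The following is a minimal sketch of one possible workaround, assuming every element can be mapped to a hashable key. The name remove_duplicates_by_key and its key parameter are illustrative and not part of the original code.

```python
def remove_duplicates_by_key(items, key=None):
    """Keep the first occurrence of each item, comparing items by key(item)."""
    seen = set()
    result = []
    for item in items:
        # The marker must be hashable; the item itself does not have to be.
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result

# Example Usage: inner lists are not hashable, but their tuple form is
rows = [[1, 2], [3, 4], [1, 2], [5]]
print(remove_duplicates_by_key(rows, key=tuple))
```

With key=tuple, two inner lists count as duplicates when they hold the same values in the same order. A different key function would define a different notion of "duplicate".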

3. OrderedDict: Removing Duplicates using Collections Module

This method uses OrderedDict to maintain the order of elements while removing duplicates. OrderedDict is a dictionary subclass that remembers the order in which its contents are added. This method creates an OrderedDict from the list, which automatically eliminates duplicates, and then converts it back to a list.

This method provides clean code and guarantees the preservation of the original order of elements. Like any dictionary key, each element must be hashable.

from collections import OrderedDict

def remove_duplicates_ordered(lst):
    return list(OrderedDict.fromkeys(lst))

# Example Usage
original_list = [5, 1, 2, 4, 2, 3, 1]
print(remove_duplicates_ordered(original_list))

The program output:

[5, 1, 2, 4, 3]
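
Since Python 3.7, the built-in dict also preserves insertion order as part of the language, so the same idea can be sketched without importing OrderedDict. The function name remove_duplicates_dict is illustrative, and the sketch assumes Python 3.7 or later.

```python
def remove_duplicates_dict(lst):
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(lst))

# Example Usage
original_list = [5, 1, 2, 4, 2, 3, 1]
print(remove_duplicates_dict(original_list))
```

OrderedDict remains a reasonable choice when the code must also run on older interpreters.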

4. Numpy Array: Remove Duplicates from Large Numerical Arrays

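NumPy's unique() function returns the sorted unique elements of an array, so on its own it does not preserve the original order. Passing return_index=True also returns the index of the first occurrence of each unique value. Sorting those indices and using them to index the original array restores the original order.

It is ideal for numerical lists or large arrays and offers high performance due to optimized C-level implementations, although the internal sort makes it O(n log n).

import numpy as np

def remove_duplicates_numpy(lst):
    arr = np.asarray(lst)
    # unique() sorts its result; return_index gives each value's first position
    _, first_idx = np.unique(arr, return_index=True)
    # Sorting the first positions puts the values back in their original order
    return arr[np.sort(first_idx)].tolist()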

# Example Usage
original_list = [5, 1, 2, 4, 2, 3, 1]
print(remove_duplicates_numpy(original_list))

The program output:

[5, 1, 2, 4, 3]

5. Pandas Dataframe: Remove Duplicates from Dataframe or Tabular Data

Pandas provides efficient data manipulation tools, and its DataFrame can be used to remove duplicates while maintaining order, suitable for dataframes or tabular data. This method converts the list into a pandas DataFrame, removes duplicates using the drop_duplicates() function, and then converts the result back to a list.

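By default, drop_duplicates() keeps the first occurrence of each value, which is what preserves the order. This method performs well when the data is already in a DataFrame or is tabular; for a short plain list, building the DataFrame adds overhead.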

import pandas as pd

def remove_duplicates_pandas(lst):
    return pd.DataFrame(lst, columns=['Original']).drop_duplicates()['Original'].tolist()

# Example Usage
original_list = [5, 1, 2, 4, 2, 3, 1]
print(remove_duplicates_pandas(original_list))

The program output:

[5, 1, 2, 4, 3]

6. Conclusion

In this Python tutorial, we explored 4 different techniques to remove duplicates from a list while preserving order. Each method has its use cases, performance considerations, and syntax ease. Depending on the data type and requirements, you can choose the most suitable method.
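
To compare performance on your own data, a rough timing sketch using the standard timeit module might look like the following. The function names, the random test data, and the run count are assumptions chosen for illustration. Timings depend on the machine, the Python version, and the input, so no numbers are shown here.

```python
import random
import timeit
from collections import OrderedDict

def with_seen_set(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

def with_ordered_dict(lst):
    return list(OrderedDict.fromkeys(lst))

# Hypothetical test data: 10,000 integers with many repeats
random.seed(0)
data = [random.randrange(1_000) for _ in range(10_000)]

for func in (with_seen_set, with_ordered_dict):
    # Time 20 calls of each function on the same input
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```

The NumPy and Pandas variants can be added to the loop in the same way when those libraries are installed.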

Happy Learning !!
