Python | Pandas TimedeltaIndex.drop_duplicates()

Published: March 28, 2022

Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one such package, and it makes importing and analyzing data much easier.

The Pandas TimedeltaIndex.drop_duplicates() function returns an Index with duplicate values removed. Its keep parameter controls which occurrence of each duplicated value is retained; the rest are dropped.

Syntax : TimedeltaIndex.drop_duplicates(keep='first')

Parameters :
keep : {'first', 'last', False}, default 'first'
-> 'first' : Drop duplicates except for the first occurrence.
-> 'last' : Drop duplicates except for the last occurrence.
-> False : Drop every value that occurs more than once, keeping no copy of it.

Return : deduplicated : Index
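
The three keep options described above can be compared side by side. The following is a minimal sketch; the sample values and the variable names (keep_first, keep_last, keep_none) are chosen here for illustration and are not part of the original examples.

```python
import pandas as pd

# Illustrative data: "1 days" and "2 days" each appear twice, "3 days" once
tidx = pd.TimedeltaIndex(["1 days", "2 days", "1 days", "3 days", "2 days"])

# Keep the first occurrence of every value
keep_first = tidx.drop_duplicates(keep="first")

# Keep the last occurrence of every value
keep_last = tidx.drop_duplicates(keep="last")

# Drop every value that has a duplicate, keeping only values that occur once
keep_none = tidx.drop_duplicates(keep=False)

print(keep_first)
print(keep_last)
print(keep_none)
```

With keep="last" the surviving values follow the positions of the last occurrences, so the result order can differ from the keep="first" result.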

Example #1: Use the TimedeltaIndex.drop_duplicates() function to drop the duplicate values from the given TimedeltaIndex object, keeping only the first occurrence of each.

# importing pandas as pd
import pandas as pd
  
# Create the TimedeltaIndex object
tidx = pd.TimedeltaIndex(data=["06:05:01.000030", "+23:59:59.999999",
                               "22 day 2 min 3us 10ns", "+23:59:59.999999",
                               "+23:29:59.999999", "+12:19:59.999999"])
  
# Print the TimedeltaIndex object
print(tidx)

Output :

Now we will use the TimedeltaIndex.drop_duplicates() function to drop all the duplicate values while keeping the first occurrence.

# drop all duplicates and keep the first occurrence
print(tidx.drop_duplicates(keep="first"))

Output :

As we can see in the output, the TimedeltaIndex.drop_duplicates() function has returned a new object which has all the duplicate values removed except the first occurrence.
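
That the function returns a new object rather than modifying the index in place can be checked directly. This is a small sketch with made-up sample values; the name deduped is introduced here for illustration only.

```python
import pandas as pd

# Illustrative data: "06:05:01" appears twice
tidx = pd.TimedeltaIndex(["06:05:01", "23:59:59", "06:05:01"])

# drop_duplicates returns a new index; tidx itself is left unchanged
deduped = tidx.drop_duplicates(keep="first")

print(len(tidx), tidx.has_duplicates)      # original still holds the duplicate
print(len(deduped), deduped.is_unique)     # result contains each value once
```

Because the original index is untouched, the result has to be assigned to a name (or printed) to be of any use.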
 
Example #2: Use the TimedeltaIndex.drop_duplicates() function to drop the duplicate values from the given TimedeltaIndex object, keeping only the last occurrence of each.

# importing pandas as pd
import pandas as pd
  
# Create the TimedeltaIndex object
tidx = pd.TimedeltaIndex(data =["1 days 02:00:00", "1 days 06:05:01.000030",
           "1 days 02:00:00", "1 days 02:00:00", "21 days 06:15:01.000030"])
  
# Print the TimedeltaIndex object
print(tidx)

Output :

Now we will use the TimedeltaIndex.drop_duplicates() function to drop all the duplicate values while keeping the last occurrence.

# drop all duplicates and keep the last occurrence
print(tidx.drop_duplicates(keep="last"))

Output :

As we can see in the output, the TimedeltaIndex.drop_duplicates() function has returned a new object which has all the duplicate values removed except the last occurrence.
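
One way to see which entries are removed is to look at the boolean mask from duplicated(), which accepts the same keep argument. The sketch below reuses the values from Example #2; the names mask, via_mask, and via_drop are assumptions made for illustration.

```python
import pandas as pd

tidx = pd.TimedeltaIndex(["1 days 02:00:00", "1 days 06:05:01.000030",
                          "1 days 02:00:00", "1 days 02:00:00",
                          "21 days 06:15:01.000030"])

# True marks entries treated as duplicates when the last occurrence is kept
mask = tidx.duplicated(keep="last")
print(mask)

# Selecting the unmarked entries gives the same values as drop_duplicates
via_mask = tidx[~mask]
via_drop = tidx.drop_duplicates(keep="last")
print(via_mask.equals(via_drop))
```

Here the first and third entries are marked, because the fourth entry is the last occurrence of "1 days 02:00:00" and is the one that survives.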
