Python | Pandas dataframe.infer_objects()

Published: March 28, 2022

Python is a great language for data analysis, primarily because of its fantastic ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

The Pandas dataframe.infer_objects() function attempts to infer a better data type for each object-dtyped column. It performs a soft conversion of object columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.
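To make the link to normal construction concrete, here is a minimal sketch. The variable names are illustrative only. It builds the same values twice, once normally and once forced to object dtype, and checks that infer_objects() recovers the dtype that construction would have picked.

```python
import pandas as pd

values = [5, 8, 11, 100]

# Normal construction infers an integer dtype from the values.
constructed = pd.Series(values)

# Forcing dtype="object" hides that information...
as_object = pd.Series(values, dtype="object")

# ...and infer_objects() applies the construction-time rules again.
recovered = as_object.infer_objects()

print(constructed.dtype, as_object.dtype, recovered.dtype)
```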

Syntax: DataFrame.infer_objects()
Returns : converted : same type as input object
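As a small illustration of the return value, the sketch below (with made-up data) suggests that the method returns a new object of the same type as its input and does not modify the original. It is also available on a Series.

```python
import pandas as pd

# An object-dtyped frame whose values are all integers (illustrative data).
df = pd.DataFrame({"A": [1, 2, 3]}, dtype="object")

converted = df.infer_objects()

print(type(converted).__name__)  # a DataFrame comes back for a DataFrame
print(df["A"].dtype)             # the original column keeps its object dtype
print(converted["A"].dtype)      # the returned column has an inferred dtype

# The same method exists on Series and returns a Series.
s = pd.Series([1.5, 2.5], dtype="object")
s_converted = s.infer_objects()
print(type(s_converted).__name__, s_converted.dtype)
```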

Example #1: Use infer_objects() function to infer better data type.

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe 
df = pd.DataFrame({"A":["sofia", 5, 8, 11, 100],
                   "B":[2, 8, 77, 4, 11],
                   "C":["amy", 11, 4, 6, 9]})
  
# Print the dataframe
df

Output :

Let’s see the dtype (data type) of each column in the dataframe.

# to print the basic info
df.info()

As we can see in the output, the first and third columns are of object type, whereas the second column is of int64 type. Now slice the dataframe and create a new dataframe from it.

# slice from the 1st row till end
df_new = df[1:]
  
# Let's print the new data frame
df_new
  
# Now let's print the data type of the columns
df_new.info()

Output :

As we can see in the output, columns “A” and “C” are still of object type even though they now contain only integer values, because slicing keeps the dtype of the original columns. So, let’s try the infer_objects() function.

# applying infer_objects() function.
df_new = df_new.infer_objects()
  
# Print the dtype after applying the function
df_new.info()

Output :

Now, if we look at the dtype of each column, we can see that columns “A” and “C” are now of int64 type.
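The same check can be done without reading through the info() output. The following is a compact sketch of Example #1 that compares the dtypes attribute before and after the call. The names before and after are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": ["sofia", 5, 8, 11, 100],
                   "B": [2, 8, 77, 4, 11],
                   "C": ["amy", 11, 4, 6, 9]})

# dtypes of the sliced frame, before and after inference
before = df[1:].dtypes
after = df[1:].infer_objects().dtypes

# Put both side by side for easy comparison
print(pd.DataFrame({"before": before, "after": after}))
```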
 

Example #2: Use infer_objects() function to infer better data type for the object.

# importing pandas as pd
import pandas as pd
  
# Creating the dataframe 
df = pd.DataFrame({"A":["sofia", 5, 8, 11, 100], 
                   "B":[2 + 2j, 8, 77, 4, 11],
                   "C":["amy", 11, 4, 6, 9]})
  
# Print the dataframe
df

Let’s see the dtype (data type) of each column in the dataframe.

# to print the basic info
df.info()

As we can see in the output, the first and third columns are of object type, whereas the second column is of complex128 type. Now slice the dataframe and create a new dataframe from it.

# slice from the 1st row till end
df_new = df[1:]
  
# Let's print the new data frame
df_new
  
# Now let's print the data type of the columns
df_new.info()


As we can see in the output, columns “A” and “C” are still of object type even though they now contain only integer values. Column “B” is not an object column: it remains complex128 after slicing. So, let’s try the infer_objects() function.

# applying infer_objects() function.
df_new = df_new.infer_objects()
  
# Print the dtype after applying the function
df_new.info()

Output :

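Notice that the dtype of column “B” did not change: it was already complex128, a non-object dtype, so infer_objects() left it alone. The function only attempts a soft conversion of object-dtyped columns and leaves non-object and unconvertible columns unchanged.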
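The three possible outcomes can be seen side by side in a small sketch. The column names mixed, complex and ints are hypothetical and chosen only for this illustration: an object column that cannot be converted, a non-object column, and an object column that can be converted.

```python
import pandas as pd

df = pd.DataFrame({
    # strings and integers together: object dtype, not convertible
    "mixed": ["amy", 11, 4],
    # already complex128, so not an object column at all
    "complex": [2 + 2j, 8, 77],
    # integers stored as object: convertible
    "ints": pd.Series([1, 2, 3], dtype="object"),
})

result = df.infer_objects()

print(df.dtypes)
print(result.dtypes)
```

Only the ints column should change dtype here. The other two come back as they went in.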
