The Pandas insert() Method in Python
Suraj Joshi
January 30, 2023
January 22, 2021
This tutorial explains how to use the insert() method to insert a column into a Pandas DataFrame.
import pandas as pd

# Sample DataFrame used throughout this tutorial
countries_df = pd.DataFrame({
    'Country': ["Nepal", "Switzerland", "Germany", "Canada"],
    'Continent': ["Asia", "Europe", "Europe", "North America"],
    'Primary Language': ["Nepali", "French", "German", "English"]
})

print("Countries DataFrame:")
print(countries_df, "\n")
Output:
Countries DataFrame:
       Country      Continent  Primary Language
0        Nepal           Asia            Nepali
1  Switzerland         Europe            French
2      Germany         Europe            German
3       Canada  North America           English
We will use the countries_df DataFrame shown in the example above to explain how to use the insert() method to insert a column into a Pandas DataFrame.
pandas.DataFrame.insert() Method
Syntax
DataFrame.insert(loc, column, value, allow_duplicates=False)
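It inserts a column named column into the DataFrame at position loc, with its values given by value. Here loc is a zero-based integer position between 0 and the current number of columns. The method modifies the DataFrame in place and returns None.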
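As a minimal sketch of the syntax above, the following uses a small made-up DataFrame (the name df and its column names are assumptions for illustration only) to show that insert() works in place and that loc may be as large as the current number of columns.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# insert() modifies df in place and returns None,
# so its result should not be assigned back to df.
result = df.insert(0, "first", [10, 20])

# A loc equal to the current number of columns places
# the new column at the far right.
df.insert(len(df.columns), "last", [30, 40])

print(result)            # None
print(list(df.columns))  # ['first', 'a', 'b', 'last']
```

Because the method returns None, writing df = df.insert(...) would discard the DataFrame.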
Insert a Column With the Same Value for All Rows Using the insert() Method
import pandas as pd

countries_df = pd.DataFrame({
    'Country': ["Nepal", "Switzerland", "Germany", "Canada"],
    'Continent': ["Asia", "Europe", "Europe", "North America"],
    'Primary Language': ["Nepali", "French", "German", "English"]
})

print("Countries DataFrame:")
print(countries_df, "\n")

# Insert a Capital column at position 3; the scalar is repeated for every row
countries_df.insert(3, "Capital", "Unknown")

print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent  Primary Language
0        Nepal           Asia            Nepali
1  Switzerland         Europe            French
2      Germany         Europe            German
3       Canada  North America           English

Countries DataFrame after inserting Capital column:
       Country      Continent  Primary Language  Capital
0        Nepal           Asia            Nepali  Unknown
1  Switzerland         Europe            French  Unknown
2      Germany         Europe            German  Unknown
3       Canada  North America           English  Unknown
It inserts the Capital column at index position 3 of the countries_df DataFrame and sets the Capital value of every row to Unknown.
Positions start at 0, so position 3 refers to the fourth column of the DataFrame.
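Counting positions by hand is easy to get wrong, so one option is to derive loc from the name of an existing column. The sketch below assumes we want the new column immediately after Continent; the variable name position is our own choice.

```python
import pandas as pd

countries_df = pd.DataFrame({
    "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
    "Continent": ["Asia", "Europe", "Europe", "North America"],
    "Primary Language": ["Nepali", "French", "German", "English"],
})

# Look up the zero-based position of an existing column by name,
# then add 1 to target the slot right after it.
position = countries_df.columns.get_loc("Continent") + 1

# The scalar "Unknown" is repeated for every row.
countries_df.insert(position, "Capital", "Unknown")

print(list(countries_df.columns))
# ['Country', 'Continent', 'Capital', 'Primary Language']
```

Note that get_loc() returns a single integer only when the column name is unique in the DataFrame.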
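Insert a Column Into a DataFrame With a Specific Value for Each Row
To specify a value for each row of the inserted column, we can pass a list of values as the value argument of the insert() method. The list must contain one value per row of the DataFrame.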
import pandas as pd

countries_df = pd.DataFrame({
    'Country': ["Nepal", "Switzerland", "Germany", "Canada"],
    'Continent': ["Asia", "Europe", "Europe", "North America"],
    'Primary Language': ["Nepali", "French", "German", "English"]
})

print("Countries DataFrame:")
print(countries_df, "\n")

# One value per row, in row order
capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]
countries_df.insert(2, "Capital", capitals)

print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent  Primary Language
0        Nepal           Asia            Nepali
1  Switzerland         Europe            French
2      Germany         Europe            German
3       Canada  North America           English

Countries DataFrame after inserting Capital column:
       Country      Continent    Capital  Primary Language
0        Nepal           Asia  Kathmandu            Nepali
1  Switzerland         Europe     Zurich            French
2      Germany         Europe     Berlin            German
3       Canada  North America     Ottawa           English
It inserts the Capital column at index 2 of the countries_df DataFrame and assigns a value to each row of the Capital column.
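The value argument does not have to be a list. As a rough sketch (the shortened one-column DataFrame and the names length_error and partial are assumptions for illustration), the following contrasts a list of the wrong length, which is rejected, with a Series, which is matched to the rows by index label rather than by position.

```python
import pandas as pd

countries_df = pd.DataFrame(
    {"Country": ["Nepal", "Switzerland", "Germany", "Canada"]}
)

# A list whose length differs from the number of rows raises ValueError,
# and the DataFrame is left unchanged.
length_error = None
try:
    countries_df.insert(1, "Capital", ["Kathmandu", "Berlin"])
except ValueError as err:
    length_error = str(err)
    print("ValueError:", length_error)

# A Series is aligned on the index: labels 0 and 2 receive values,
# and the rows without a matching label are filled with NaN.
partial = pd.Series({0: "Kathmandu", 2: "Berlin"})
countries_df.insert(1, "Capital", partial)

print(countries_df)
```

Passing a Series is convenient when the new values are known for only some rows, but it can hide mistakes if the index labels do not match the ones we expect.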
Set allow_duplicates = True in the insert() Method to Add a Column That Already Exists
import pandas as pd

countries_df = pd.DataFrame({
    'Country': ["Nepal", "Switzerland", "Germany", "Canada"],
    'Continent': ["Asia", "Europe", "Europe", "North America"],
    'Primary Language': ["Nepali", "French", "German", "English"],
    'Capital': ["Kathmandu", "Zurich", "Berlin", "Ottawa"]
})

print("Countries DataFrame:")
print(countries_df, "\n")

# Capital already exists, so allow_duplicates=True is required
capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]
countries_df.insert(4, "Capital", capitals, allow_duplicates=True)

print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent  Primary Language    Capital
0        Nepal           Asia            Nepali  Kathmandu
1  Switzerland         Europe            French     Zurich
2      Germany         Europe            German     Berlin
3       Canada  North America           English     Ottawa

Countries DataFrame after inserting Capital column:
       Country      Continent  Primary Language    Capital    Capital
0        Nepal           Asia            Nepali  Kathmandu  Kathmandu
1  Switzerland         Europe            French     Zurich     Zurich
2      Germany         Europe            German     Berlin     Berlin
3       Canada  North America           English     Ottawa     Ottawa
It adds a Capital column to the countries_df DataFrame even though a Capital column already exists there.
If we try to insert a column that already exists in the DataFrame without setting allow_duplicates = True in the insert() method, it raises an error that names the column: ValueError: cannot insert Capital, already exists.
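The sketch below reproduces that error on a small made-up DataFrame (the name df and the two-row data are assumptions for illustration) and then shows one consequence of allowing the duplicate: selecting the repeated name returns both columns.

```python
import pandas as pd

# Hypothetical frame that already has a Capital column.
df = pd.DataFrame({
    "Country": ["Nepal", "Germany"],
    "Capital": ["Kathmandu", "Berlin"],
})

# Without allow_duplicates=True the insert is refused.
message = None
try:
    df.insert(2, "Capital", ["Kathmandu", "Berlin"])
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# With allow_duplicates=True the second Capital column is added.
df.insert(2, "Capital", ["Kathmandu", "Berlin"], allow_duplicates=True)

# Selecting a duplicated name returns a DataFrame holding both columns,
# not a single Series.
print(type(df["Capital"]).__name__)
print(df["Capital"].shape)
```

Since duplicate column names make selection by name ambiguous, it is usually worth considering a distinct name for the new column before reaching for allow_duplicates.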
Author: Suraj Joshi
Suraj Joshi is a backend software engineer at Matrice.ai.