Create a Pandas DataFrame From a List of Dictionaries
Luqman Khan
May 16, 2022
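A dictionary is a compact and flexible Python container that stores key-value mappings. Dictionaries are written in curly braces ({}) and hold key-value pairs separated by commas (,), with a colon (:) separating each key from its value.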
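As a minimal illustration of the syntax described above, the following sketch builds a single dictionary and reads values back from it. The name `roll` and its contents are placeholders chosen for this illustration.

```python
# A hypothetical dictionary for one roll: each key sits to the left of a colon,
# its value to the right, and the pairs are separated by commas.
roll = {"Harry": 1, "Josh": 3, "dices": "first dice"}

print(roll["Harry"])       # look up a value by its key
print(list(roll.keys()))   # the keys, in insertion order
```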
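The example below models a dice game with a list of six dictionaries. Two players, Harry and Josh, each roll six dice, and the result of each roll is stored under the corresponding player's name.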
import pandas as pd

# create a dataset from multiple dictionaries, one per roll
dataset_list = [{'Harry': 1, 'Josh': 3, 'dices': 'first dice'},
                {'Harry': 5, 'Josh': 1, 'dices': 'second dice'},
                {'Harry': 6, 'Josh': 2, 'dices': 'third dice'},
                {'Harry': 2, 'Josh': 3, 'dices': 'fourth dice'},
                {'Harry': 6, 'Josh': 6, 'dices': 'fifth dice'},
                {'Harry': 4, 'Josh': 3, 'dices': 'sixth dice'}]
df = pd.DataFrame(dataset_list)  # each dictionary becomes one row
print(df)
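We built the dataset from a list of dictionaries. pd.DataFrame treats each dictionary as one row and uses its keys as column labels, which is why a list of dictionaries maps naturally onto a table.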
Output:
   Harry  Josh        dices
0      1     3   first dice
1      5     1  second dice
2      6     2   third dice
3      2     3  fourth dice
4      6     6   fifth dice
5      4     3   sixth dice
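To make the mapping from dictionaries to rows and columns concrete, here is a small sketch. The `records` list is made up for this illustration, and its second dictionary deliberately leaves out the `Josh` key to show how pandas handles a missing key.

```python
import pandas as pd

# Hypothetical records; the second one deliberately omits the "Josh" key.
records = [
    {"Harry": 1, "Josh": 3},
    {"Harry": 5},
]
frame = pd.DataFrame(records)

print(frame.columns.tolist())          # column labels come from the dictionary keys
print(frame.shape)                     # one row per dictionary
print(frame["Josh"].isna().tolist())   # a missing key shows up as NaN in that row
```

Because a missing key becomes NaN rather than raising an error, it can be worth checking that every dictionary carries the same keys before building the DataFrame.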
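In the previous example, we set the dice values manually; now we use the randint function from the numpy.random module. We first create two empty lists named harry and josh. Then a for loop over range(6) uses the append() method to add a random number to each of the two lists, as shown below.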
import pandas as pd
from numpy.random import randint

# roll six dice for each player
harry = []
josh = []
for i in range(6):
    harry.append(randint(1, 7))  # randint(1, 7) returns an integer from 1 to 6
    josh.append(randint(1, 7))

# create a dataset from multiple dictionaries
dataset_list = [{'Harry': harry[0], 'Josh': josh[0], 'dices': 'first dice'},
                {'Harry': harry[1], 'Josh': josh[1], 'dices': 'second dice'},
                {'Harry': harry[2], 'Josh': josh[2], 'dices': 'third dice'},
                {'Harry': harry[3], 'Josh': josh[3], 'dices': 'fourth dice'},
                {'Harry': harry[4], 'Josh': josh[4], 'dices': 'fifth dice'},
                {'Harry': harry[5], 'Josh': josh[5], 'dices': 'sixth dice'}]
df = pd.DataFrame(dataset_list)
print(df)
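Remember that randint(low, high) returns integers from low up to high-1: the lower bound is included and the upper bound is excluded. If only one argument n is given, the values range from zero to n-1. That is why we call randint(1, 7) to get values from 1 to 6.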
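The bounds can be checked empirically. The sketch below draws many values at once through the `size` argument of `randint` and prints the smallest and largest value seen; the sample size of 1000 is an arbitrary choice for this illustration.

```python
from numpy.random import randint

# Draw many values so that both ends of the range are very likely to appear.
rolls = randint(1, 7, size=1000)   # low=1 is included, high=7 is excluded
print(rolls.min(), rolls.max())

# With a single argument, values start at 0 and stop before that argument.
zero_based = randint(6, size=1000)
print(zero_based.min(), zero_based.max())
```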
Output:
   Harry  Josh        dices
0      4     1   first dice
1      4     2  second dice
2      3     4   third dice
3      1     1  fourth dice
4      4     5   fifth dice
5      4     4   sixth dice
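Next, we shorten the code with a for loop that appends each complete dictionary to a list. The same loop appends a label for each roll to a list named index, which is then passed to DataFrame as the row index.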
import pandas as pd
from numpy.random import randint

dataset_list = []
index = []
for i in range(1, 7):
    # one dictionary per roll, plus a matching index label
    dataset_list.append({'Harry': randint(1, 7), 'Josh': randint(1, 7)})
    index.append('dice ' + str(i))
print('\nAfter reducing the code\n')
df = pd.DataFrame(dataset_list, index=index)
print(df)
Output:
        Harry  Josh
dice 1      2     4
dice 2      2     3
dice 3      6     5
dice 4      5     2
dice 5      4     2
dice 6      1     1
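The same idea can be written more compactly with list comprehensions. The sketch below is one possible variant, not the approach used in this article: it swaps `randint` for NumPy's `default_rng` generator so that a seed can make the rolls repeatable. The seed value 0 is an arbitrary assumption for this illustration.

```python
import pandas as pd
from numpy.random import default_rng

rng = default_rng(0)  # the seed value 0 is an arbitrary choice for this sketch

# One dictionary per roll, built in a single expression.
# rng.integers(1, 7) excludes the upper bound, just like randint(1, 7).
dataset_list = [
    {"Harry": int(rng.integers(1, 7)), "Josh": int(rng.integers(1, 7))}
    for _ in range(6)
]
index = [f"dice {i}" for i in range(1, 7)]

df = pd.DataFrame(dataset_list, index=index)
print(df)
```

Seeding the generator is mainly useful when the table needs to be reproduced exactly, for instance in a test.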
All examples combined:
import pandas as pd
from numpy.random import randint

# example 1: create a dataset from dictionaries with fixed values
dataset_list = [{'Harry': 1, 'Josh': 3, 'dices': 'first dice'},
                {'Harry': 5, 'Josh': 1, 'dices': 'second dice'},
                {'Harry': 6, 'Josh': 2, 'dices': 'third dice'},
                {'Harry': 2, 'Josh': 3, 'dices': 'fourth dice'},
                {'Harry': 6, 'Josh': 6, 'dices': 'fifth dice'},
                {'Harry': 4, 'Josh': 3, 'dices': 'sixth dice'}]
df = pd.DataFrame(dataset_list)
print(df)
print()

# example 2: roll six dice for each player with randint
harry = []
josh = []
for i in range(6):
    harry.append(randint(1, 7))
    josh.append(randint(1, 7))

# create a dataset from multiple dictionaries
dataset_list = [{'Harry': harry[0], 'Josh': josh[0], 'dices': 'first dice'},
                {'Harry': harry[1], 'Josh': josh[1], 'dices': 'second dice'},
                {'Harry': harry[2], 'Josh': josh[2], 'dices': 'third dice'},
                {'Harry': harry[3], 'Josh': josh[3], 'dices': 'fourth dice'},
                {'Harry': harry[4], 'Josh': josh[4], 'dices': 'fifth dice'},
                {'Harry': harry[5], 'Josh': josh[5], 'dices': 'sixth dice'}]
df = pd.DataFrame(dataset_list)
print(df)

# example 3: build the dictionaries and the index labels in one loop
dataset_list = []
index = []
for i in range(1, 7):
    dataset_list.append({'Harry': randint(1, 7), 'Josh': randint(1, 7)})
    index.append('dice ' + str(i))
print('\nAfter reducing the code\n')
df = pd.DataFrame(dataset_list, index=index)
print(df)
Output:
   Harry  Josh        dices
0      1     3   first dice
1      5     1  second dice
2      6     2   third dice
3      2     3  fourth dice
4      6     6   fifth dice
5      4     3   sixth dice

   Harry  Josh        dices
0      4     1   first dice
1      4     2  second dice
2      3     4   third dice
3      1     1  fourth dice
4      4     5   fifth dice
5      4     4   sixth dice

After reducing the code

        Harry  Josh
dice 1      2     4
dice 2      2     3
dice 3      6     5
dice 4      5     2
dice 5      4     2
dice 6      1     1