python – Why does my PanelND factory throw a KeyError?
I'm using Pandas 0.12.0 on Ubuntu 13.04. I'm trying to create a 5D panel object to hold some EEG data split by condition.

How I chose to structure my data:

Let me first demonstrate my use of pandas.core.panelnd.create_nd_panel_factory.

    Subject = panelnd.create_nd_panel_factory(
        klass_name='Subject',
        axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
        axis_slices={'labels': 'location',
                     'items': 'vfield',
                     'major_axis': 'major_axis',
                     'minor_axis': 'minor_axis'},
        slicer=pd.Panel4D,
        axis_aliases={'ss': 'setsize',
                      'loc': 'location',
                      'vf': 'vfield',
                      'major': 'major_axis',
                      'minor': 'minor_axis'}
        # stat_axis=2  # dafuq is this?
    )
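Essentially, the organization is as follows:

> setsize: an experimental condition, can be 1 or 2

The last two axes correspond to a DataFrame's major_axis and minor_axis. They have been renamed for clarity:

> channels: columns, the EEG channels (129 of them)

What I'm doing:

Each experimental condition (subject x setsize x location x vfield) is stored in its own tab-delimited file, which I read with pandas.read_table to get a DataFrame object. I want to create one 5-dimensional panel (i.e. a Subject) per subject, containing all of that subject's experimental conditions (i.e. DataFrames).

First, I build a nested dictionary for each subject:

    # ... do some boring stuff to get the text files, etc...
    for _, factors in df.iterrows():
        # `factors` is a 5-tuple containing
        # (subject number, setsize, location, vfield,
        # and path to the tab-delimited file).
        sn, ss, loc, vf, path = factors
        eeg = pd.read_table(path, sep='\t', names=range(1, 129) + ['ref'], header=None)
        # build nested dict
        subjects.setdefault(sn, {}).setdefault(ss, {}).setdefault(loc, {})[vf] = eeg

    # and now attempt to build `Subject`
    for sn, d in subjects.iteritems():
        subjects[sn] = Subject(d)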
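
The chained setdefault call used to build the nested dictionary can be tried in isolation. The following is a minimal sketch in plain Python with no pandas; the rows and the string payloads such as "eeg_a" are hypothetical stand-ins for the DataFrames read from disk.

```python
# Hypothetical stand-in rows: (subject, setsize, location, vfield, payload).
# In the question the payload would be a DataFrame read from a file.
rows = [
    (1, 1, "left", "upper", "eeg_a"),
    (1, 1, "left", "lower", "eeg_b"),
    (1, 2, "right", "upper", "eeg_c"),
    (2, 1, "left", "upper", "eeg_d"),
]

subjects = {}
for sn, ss, loc, vf, eeg in rows:
    # Each setdefault returns the existing inner dict or inserts an empty one,
    # so the path sn -> ss -> loc is created on demand before the leaf is set.
    subjects.setdefault(sn, {}).setdefault(ss, {}).setdefault(loc, {})[vf] = eeg

print(subjects[1][1]["left"])
```

Each top-level value is then a three-level dictionary (setsize, location, vfield), which is the shape that gets passed to Subject(d) in the loop above.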
Full stack trace:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-2-831fa603ca8f> in <module>()
----> 1 import_data()
/home/louist/Dropbox/Research/VSTM/scripts/vstmlib.py in import_data()
64
65 import ipdb; ipdb.set_trace()
---> 66 for sn,d in subjects.iteritems():
67 subjects[sn] = Subject(d)
68
/usr/local/lib/python2.7/dist-packages/pandas/core/panelnd.pyc in __init__(self,*args,**kwargs)
65 if 'dtype' not in kwargs:
66 kwargs['dtype'] = None
---> 67 self._init_data(*args,**kwargs)
68 klass.__init__ = __init__
69
/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_data(self,data,copy,dtype,**kwargs)
250 mgr = data
251 elif isinstance(data,dict):
--> 252 mgr = self._init_dict(data,passed_axes,dtype=dtype)
253 copy = False
254 dtype = None
/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _init_dict(self,axes,dtype)
293 raxes = [self._extract_axis(self,axis=i)
294 if a is None else a for i,a in enumerate(axes)]
--> 295 raxes_sm = self._extract_axes_for_slice(self,raxes)
296
297 # shallow copy
/usr/local/lib/python2.7/dist-packages/pandas/core/panel.pyc in _extract_axes_for_slice(self,axes)
1477 """ return the slice dictionary for these axes """
1478 return dict([(self._AXIS_SLICEMAP[i],a) for i,a
-> 1479 in zip(self._AXIS_ORDERS[self._AXIS_LEN - len(axes):],axes)])
1480
1481 @staticmethod
KeyError: 'location'
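I understand that panelnd is an experimental feature, but I'm fairly certain I'm doing something wrong. Can somebody point me in the right direction? If it is a bug, is there something that can be done about it?

As usual, thank you very much!

Solution

Here is a working example. In axis_slices you need to specify the mapping from your axes to the internal axis names of the slicer. This fiddles with the internal structure, but the fixed pandas axis names still exist (and are somewhat hardcoded via Panel / Panel4D), so you need to provide the mapping. I would create a Panel4D first, then create your Subject from it as shown below.

If you find more bugs, please post them on GitHub or here. This is not a heavily used feature.

Output

    <class 'pandas.core.panelnd.Subject'>
    Dimensions: 3 (setsize) x 1 (location) x 1 (vfield) x 10 (channels) x 2 (samples)
    Setsize axis: level0_0 to level0_2
    Location axis: level1_0 to level1_0
    Vfield axis: level2_0 to level2_0
    Channels axis: level3_0 to level3_9
    Samples axis: level4_1 to level4_2

Code

    import pandas as pd
    import numpy as np
    from pandas.core import panelnd

    Subject = panelnd.create_nd_panel_factory(
        klass_name='Subject',
        axis_orders=['setsize', 'location', 'vfield', 'channels', 'samples'],
        # keys are this class's axis names, values are the Panel4D axis names
        axis_slices={'location': 'labels',
                     'vfield': 'items',
                     'channels': 'major_axis',
                     'samples': 'minor_axis'},
        slicer=pd.Panel4D,
        axis_aliases={'loc': 'labels',
                      'vf': 'items',
                      'minor': 'minor_axis'})

    subjects = dict()
    for i in range(3):
        eeg = pd.DataFrame(np.random.randn(10, 2),
                           columns=['level4_1', 'level4_2'],
                           index=["level3_%s" % x for x in range(10)])
        loc, vf = ('level1_0', 'level2_0')
        # wrap each subject's data in a Panel4D before building the 5D object
        subjects["level0_%s" % i] = pd.Panel4D({loc: {vf: eeg}})

    print Subject(subjects)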
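
To see why the direction of the axis_slices mapping matters, the lookup shown in the traceback (self._AXIS_SLICEMAP[i] for names taken from self._AXIS_ORDERS) can be imitated in plain Python. This is only a rough sketch under stated assumptions, not pandas' actual code: slice_axes and the placeholder axis values are hypothetical, and it assumes four inner axes are passed, which is consistent with the error naming 'location' rather than 'setsize'.

```python
# Axis names of the 5D class, outermost first (taken from the question).
AXIS_ORDERS = ["setsize", "location", "vfield", "channels", "samples"]

# Mapping as written in the question: keyed by the slicer's (Panel4D) names.
question_slices = {
    "labels": "location",
    "items": "vfield",
    "major_axis": "major_axis",
    "minor_axis": "minor_axis",
}

# Mapping as written in the answer: keyed by the new class's own axis names.
answer_slices = {
    "location": "labels",
    "vfield": "items",
    "channels": "major_axis",
    "samples": "minor_axis",
}


def slice_axes(axis_orders, slicemap, axes):
    """Hypothetical stand-in for the lookup in the traceback: pair the
    trailing axis names with the given axes and translate each name."""
    names = axis_orders[len(axis_orders) - len(axes):]
    return dict((slicemap[name], axis) for name, axis in zip(names, axes))


# Four placeholder axes, one per inner (Panel4D) dimension.
inner_axes = [["l0"], ["v0"], ["c0", "c1"], ["s0"]]

# Keys match the class's own axis names, so every lookup succeeds.
print(slice_axes(AXIS_ORDERS, answer_slices, inner_axes))

# Keys are the slicer's names, so the first lookup ('location') fails.
try:
    slice_axes(AXIS_ORDERS, question_slices, inner_axes)
except KeyError as err:
    print("KeyError:", err)
```

With the question's mapping the very first lookup fails on 'location', the same key reported in the stack trace. With the answer's mapping each of the class's axis names resolves to a Panel4D axis name.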
