DataFrame and Series in pandas
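DataFrame: holds data in tabular form; here it is used to turn records parsed from JSON strings into a table.
If frame is a DataFrame, you can think of it as a table.
frame['column name'] returns a single column of data, which is called a Series.
series.value_counts() returns how many times each value occurs in the Series.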
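The steps above can be sketched roughly as follows. This is a minimal example, not the original data loading code: the raw_lines sample is made up for illustration (only the tz and c column names come from the output in this post), and it assumes each record arrives as one JSON object per string.

```python
import json

import pandas as pd

# Hypothetical sample data: one JSON object per string.
# Only the column names "tz" and "c" are taken from the post.
raw_lines = [
    '{"tz": "America/New_York", "c": "US"}',
    '{"tz": "America/Denver", "c": "US"}',
    '{"tz": "America/New_York", "c": "US"}',
    '{"tz": "America/Sao_Paulo", "c": "BR"}',
    '{"tz": "America/New_York", "c": "US"}',
]

# Parse each JSON string into a dict, then build a table from the dicts.
records = [json.loads(line) for line in raw_lines]
frame = pd.DataFrame(records)

# Selecting one column by name gives a Series.
tz = frame["tz"]

# value_counts() counts how often each distinct value appears,
# sorted from most to least frequent.
tz_counts = tz.value_counts()
print(tz_counts)
```

Passing a list of dicts to pd.DataFrame uses the dict keys as column names, which is why no explicit column list is needed here.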
In [64]: frame
Out[64]:
                                                   a              al   c  \
0  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US
1                             GoogleMaps/RochesterNY             NaN  US
2  Mozilla/4.0 (compatible; MSIE 8.0; Windows NT ...           en-US  US
3  Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8)...           pt-br  BR
4  Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKi...  en-US,en;q=0.8  US

           cy       g  gr       h          hc         hh         l  \
0     Danvers  A6qOVH  MA  wfLQtf  1331822918  1.usa.gov   orofrog
1       Provo  mwszkS  UT  mwszkS  1308262393       j.mp     bitly
2  Washington  xxr3Qb  DC  xxr3Qb  1331919941  1.usa.gov     bitly
3        Braz  zCaLwp  27  zUtuOu  1331923068  1.usa.gov  alelex88
4  Shrewsbury  9b6kNl  MA  9b6kNl  1273672411     bit.ly     bitly

                         ll  nk  \
0   [42.576698, -70.954903]   1
1  [40.218102, -111.613297]   0
2     [38.9007, -77.043098]   1
3  [-23.549999, -46.616699]   0
4   [42.286499, -71.714699]   0

                                                   r           t  \
0  http://www.Facebook.com/l/7AQEFzjSi/1.usa.gov/...  1331923247
1                           http://www.AwareMap.com/  1331923249
2                               http://t.co/03elZC4Q  1331923250
3                                             direct  1331923249
4                http://www.shrewsbury-ma.gov/selco/  1331923251

                  tz                                                  u
0   America/New_York        http://www.ncbi.nlm.nih.gov/pubmed/22415991
1     America/Denver        http://www.monroecounty.gov/etc/911/rss.php
2   America/New_York  http://boxer.senate.gov/en/press/releases/0316...
3  America/Sao_Paulo            http://apod.nasa.gov/apod/ap120312.html
4   America/New_York  http://www.shrewsbury-ma.gov/egov/gallery/1341...

In [65]: frame['tz']
Out[65]:
0     America/New_York
1       America/Denver
2     America/New_York
3    America/Sao_Paulo
4     America/New_York
Name: tz, dtype: object

In [66]: frame['tz'].value_counts()
Out[66]:
America/New_York     3
America/Sao_Paulo    1
America/Denver       1
Name: tz, dtype: int64
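Two ways to fill in unknown values:
# 1. Replace missing values (NaN) with the label "Missing"
clean_tz = frame['tz'].fillna("Missing")
# 2. Replace empty strings with the label "unknown" using a boolean mask
clean_tz[clean_tz == ''] = "unknown"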
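To see how the two steps differ, here is a small self-contained sketch. The tz Series below is invented for illustration and stands in for frame['tz']; it contains one NaN and two empty strings so that each step has something to replace.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for frame['tz']: one NaN and two empty strings.
tz = pd.Series(
    ["America/New_York", np.nan, "", "America/Denver", ""],
    name="tz",
)

# Step 1: fillna() only touches missing values (NaN); it returns a new Series.
clean_tz = tz.fillna("Missing")

# Step 2: empty strings are not NaN, so they need a separate boolean mask.
clean_tz[clean_tz == ""] = "unknown"

# Both kinds of unknown value now show up as ordinary labels in the counts.
counts = clean_tz.value_counts()
print(counts)
```

Both steps are needed because fillna() does not treat an empty string as missing; without the second step, the empty strings would be counted under a blank label.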