The data used in this pandas chapter can be downloaded from the following link:
https://files.cnblogs.com/files/AI-robort/Titanic_Data-master.zip
import pandas as pd
df=pd.read_csv('./Titanic_Data-master/Titanic_Data-master/train.csv')
.head(): returns the first rows of the DataFrame, 5 by default; pass a number to choose how many
df.head(6)
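To try .head() without downloading the Titanic file, here is a minimal sketch on a small hand-made frame. The name `demo` and its columns are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Hypothetical toy data, only for illustration (not the Titanic set).
demo = pd.DataFrame({"id": range(1, 9), "score": [3, 1, 4, 1, 5, 9, 2, 6]})

first_default = demo.head()   # no argument: the first 5 rows
first_three = demo.head(3)    # explicit count: the first 3 rows
last_two = demo.tail(2)       # tail() mirrors head() from the end

print(first_three)
```

Both calls return a new DataFrame and leave `demo` itself unchanged.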
.info(): prints a summary of the DataFrame (index range, columns, non-null counts, dtypes, memory usage)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
PassengerId    891 non-null int64
Survived       891 non-null int64
Pclass         891 non-null int64
Name           891 non-null object
Sex            891 non-null object
Age            714 non-null float64
SibSp          891 non-null int64
Parch          891 non-null int64
Ticket         891 non-null object
Fare           891 non-null float64
Cabin          204 non-null object
Embarked       889 non-null object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.6+ KB
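The non-null counts in the summary show that Age, Cabin and Embarked have missing values. A possible way to get those numbers directly is sketched below on a made-up frame; `demo` and its values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with some missing entries.
demo = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": ["C85", None, None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

non_null = demo.count()       # non-null entries per column, as listed by info()
missing = demo.isna().sum()   # complementary view: missing entries per column

print(missing)
```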
df.index  # the row index (row labels)
RangeIndex(start=0, stop=891, step=1)
df.columns  # the name of each column
Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'], dtype='object')
df.dtypes  # the data type of each column
PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object
df.values  # the underlying data as a NumPy array, one inner array per row
array([[1, 0, 3, ..., 7.25, nan, 'S'], [2, 1, 1, ..., 71.2833, 'C85', 'C'], [3, 1, 3, ..., 7.925, nan, 'S'], ..., [889, 0, 3, ..., 23.45, nan, 'S'], [890, 1, 1, ..., 30.0, 'C148', 'C'], [891, 0, 3, ..., 7.75, nan, 'Q']], dtype=object)
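The `dtype=object` in the output above comes from mixing text and numeric columns in one array. The sketch below, on a made-up two-column frame (`demo` is hypothetical), shows the difference when only numeric columns are selected.

```python
import pandas as pd

# Hypothetical toy data: one text column, one numeric column.
demo = pd.DataFrame({"name": ["a", "b"], "fare": [7.25, 71.2833]})

mixed = demo.values               # mixed column types -> dtype=object
numeric = demo[["fare"]].values   # numeric-only selection keeps float64
same = demo.to_numpy()            # method form that the pandas docs recommend over .values

print(mixed.dtype, numeric.dtype)
```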
Creating a DataFrame yourself
data = {'country': ['aaa', 'bbb', 'ccc'], 'population': [10, 12, 14]}
df_data = pd.DataFrame(data)
df_data
df_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
country       3 non-null object
population    3 non-null int64
dtypes: int64(1), object(1)
memory usage: 128.0+ bytes
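A dictionary of columns is only one way to build a DataFrame. As a rough sketch, the same data can be given explicit row labels or assembled row by row; the labels `r1` to `r3` and the variable names are made up for illustration.

```python
import pandas as pd

data = {"country": ["aaa", "bbb", "ccc"], "population": [10, 12, 14]}

# Same dictionary as in the text, with hypothetical explicit row labels.
labeled = pd.DataFrame(data, index=["r1", "r2", "r3"])

# Equivalent content built from a list of per-row records.
records = [
    {"country": "aaa", "population": 10},
    {"country": "bbb", "population": 12},
    {"country": "ccc", "population": 14},
]
by_rows = pd.DataFrame(records)

print(labeled)
```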
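age = df['Age']  # select a single column
age[:5]  # show its first 5 values
0    22.0
1    38.0
2    26.0
3    35.0
4    35.0
Name: Age, dtype: float64
Series: a one-dimensional labeled array; each column (or row) of a DataFrame is a Series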
age.index
age.values[:5]
array([22., 38., 26., 35., 35.])
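Both a column and a row can be pulled out of a DataFrame as a Series. A minimal sketch on a made-up frame (`demo` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data.
demo = pd.DataFrame({"Age": [22.0, 38.0], "Fare": [7.25, 71.2833]})

col = demo["Age"]    # a column: Series indexed by the row labels
row = demo.iloc[0]   # a row: Series indexed by the column names

print(type(col).__name__, list(col.index))
print(type(row).__name__, list(row.index))
```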
df.head()
df['Age'][:5]
Changing the index
df = df.set_index('Name')
df.head()
age = df['Age']
age[:5]
Name
Braund, Mr. Owen Harris                                22.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    38.0
Heikkinen, Miss. Laina                                 26.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           35.0
Allen, Mr. William Henry                               35.0
Name: Age, dtype: float64
age['Allen, Mr. William Henry']  # look up a value by its index label
35.0
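Once the names are the index, label-based lookups can also go through `.loc`. The sketch below uses a made-up three-row frame with shortened names; `demo` and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data with the name column promoted to the index.
demo = pd.DataFrame({
    "Name": ["Allen", "Braund", "Cumings"],
    "Age": [35.0, 22.0, 38.0],
    "Fare": [8.05, 7.25, 71.2833],
}).set_index("Name")

one_value = demo["Age"]["Allen"]                 # column first, then label
same_value = demo.loc["Allen", "Age"]            # one .loc call: row label, column label
subset = demo.loc[["Allen", "Cumings"], "Age"]   # several labels at once

print(subset)
```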
age = age + 10
age[:5]
Name
Braund, Mr. Owen Harris                                32.0
Cumings, Mrs. John Bradley (Florence Briggs Thayer)    48.0
Heikkinen, Miss. Laina                                 36.0
Futrelle, Mrs. Jacques Heath (Lily May Peel)           45.0
Allen, Mr. William Henry                               45.0
Name: Age, dtype: float64
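Adding a number to a Series is applied element by element and produces a new Series, so the column stored in the DataFrame keeps its original values. Statistics computed on `age` afterwards reflect the shifted values. A small sketch on made-up data (`demo` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data.
demo = pd.DataFrame({"Age": [22.0, 38.0, 26.0]})

age = demo["Age"]
age = age + 10   # builds a new Series and rebinds the name `age`

print(list(age))           # shifted values
print(list(demo["Age"]))   # the frame's own column is unchanged
```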
Summary statistics of the values
age.mean()
39.69911764705882
age.max()
90.0
age.min()
10.42
df.describe()  # basic summary statistics for every numeric column in one call
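The result of describe() is itself a DataFrame, so individual statistics can be read back out of it. A minimal sketch on made-up data (`demo` and its values are assumptions), which also shows that missing values are skipped:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing age.
demo = pd.DataFrame({"Age": [22.0, 38.0, np.nan, 35.0], "Pclass": [3, 1, 3, 1]})

summary = demo.describe()   # rows: count, mean, std, min, 25%, 50%, 75%, max

age_count = summary.loc["count", "Age"]   # NaN is not counted
age_mean = summary.loc["mean", "Age"]     # mean of 22, 38 and 35

print(summary)
```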
Original article: http://www.cnblogs.com/AI-robort/p/11636703.html