Basic

import pandas as pd
print(f" Using {pd.__name__},Version {pd.__version__}")
 Using pandas,Version 0.23.0

Create an empty DataFrame

df = pd.DataFrame() 
print(df)
Empty DataFrame
Columns: []
Index: []

Create a DataFrame from a dict

# Use a name other than `dict` so the built-in type is not shadowed.
data = {'name': ["Tom", "Bob", "Mary", "James"],
        'age': [18, 30, 25, 40],
        'city': ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"]}

df = pd.DataFrame(data)
df

    name  age       city
0    Tom   18    Beijing
1    Bob   30   ShangHai
2   Mary   25  GuangZhou
3  James   40   ShenZhen

        age       city
person
Tom      18    Beijing
Bob      30   ShangHai
Mary     25  GuangZhou
James    40   ShenZhen
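
The cell that produced the table indexed by person is not shown. One way to get that shape, assuming the index is simply the name column relabeled as person, is sketched here (the variable name df_person is hypothetical):

```python
import pandas as pd

data = {
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "city": ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"],
}

# Assumption: move "name" into the index, then label the index "person".
df_person = pd.DataFrame(data).set_index("name").rename_axis("person")
print(df_person)
```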

Basic column operations

Add columns

    name  age       city
0    Tom   18    Beijing
1    Bob   30   ShangHai
2   Mary   25  GuangZhou
3  James   40   ShenZhen

    name  age       city  country
0    Tom   18    Beijing      USA
1    Bob   30   ShangHai      USA
2   Mary   25  GuangZhou      USA
3  James   40   ShenZhen      USA

    name  age       city  country  adress
0    Tom   18    Beijing      USA     USA
1    Bob   30   ShangHai      USA     USA
2   Mary   25  GuangZhou      USA     USA
3  James   40   ShenZhen      USA     USA
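
The code for these outputs is missing from the page. A minimal sketch that would produce the same frames, assuming plain column assignment was used (the column name adress is spelled as in the output above):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "city": ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"],
})

# Assigning a scalar broadcasts it to every row.
df["country"] = "USA"

# Assigning an existing column creates a new column with the same values.
df["adress"] = df["country"]
print(df)
```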

Change column values

    name  age       city  country  adress
0    Tom   18    Beijing    China     USA
1    Bob   30   ShangHai    China     USA
2   Mary   25  GuangZhou    China     USA
3  James   40   ShenZhen    China     USA

    name  age       city  country           adress
0    Tom   18    Beijing    China    Beijing,China
1    Bob   30   ShangHai    China   ShangHai,China
2   Mary   25  GuangZhou    China  GuangZhou,China
3  James   40   ShenZhen    China   ShenZhen,China
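
The two outputs above are consistent with overwriting country first and then rebuilding adress from city and country. The original cells are not shown, so the following is only a sketch under that assumption:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "city": ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"],
    "country": ["USA"] * 4,
    "adress": ["USA"] * 4,
})

# Overwrite every value in an existing column.
df["country"] = "China"

# Build a column from other columns; string Series concatenate element-wise.
df["adress"] = df["city"] + "," + df["country"]
print(df)
```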

Delete columns

    name  age           adress
0    Tom   18    Beijing,China
1    Bob   30   ShangHai,China
2   Mary   25  GuangZhou,China
3  James   40   ShenZhen,China
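
The output shows city and country gone. One way to do this, assuming DataFrame.drop was used (del df["city"] or df.pop("city") would also work), is:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "city": ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"],
    "country": ["China"] * 4,
    "adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
})

# drop returns a new frame; reassign to keep the result.
df = df.drop(columns=["city", "country"])
print(df)
```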

Select columns

   age   name
0   18    Tom
1   30    Bob
2   25   Mary
3   40  James
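
Passing a list of labels selects those columns in the order given, which matches the age, name ordering above. A short sketch (the variable name selected is hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
})

# A list inside the brackets returns a DataFrame with those columns, in that order.
selected = df[["age", "name"]]
print(selected)
```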

Rename columns

    Name  Age           Adress
0    Tom   18    Beijing,China
1    Bob   30   ShangHai,China
2   Mary   25  GuangZhou,China
3  James   40   ShenZhen,China

    name  age           adress
0    Tom   18    Beijing,China
1    Bob   30   ShangHai,China
2   Mary   25  GuangZhou,China
3  James   40   ShenZhen,China

    Name  Age           Adress
0    Tom   18    Beijing,China
1    Bob   30   ShangHai,China
2   Mary   25  GuangZhou,China
3  James   40   ShenZhen,China
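
The three outputs above look like a rename that returns a new frame, the unchanged original, and then the rename being kept. That reading is an assumption; under it, the steps could be sketched as:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Tom", "Bob", "Mary", "James"],
    "age": [18, 30, 25, 40],
    "adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
})

# rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns={"name": "Name", "age": "Age", "adress": "Adress"})
print(renamed)
print(df)  # still has the lowercase labels

# Reassign to keep the new labels.
df = renamed
print(df)
```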

Set column value with conditions

    Name  Age           Adress        Group
0    Tom   18    Beijing,China        young
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly
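
The age cut-offs behind the Group column are not given. The sketch below assumes "under 20" for young and "40 or over" for elderly, which reproduces the labels shown but may not be the thresholds originally used:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
})

# Start with a default label, then overwrite rows that match each condition.
df["Group"] = "middle_aged"
df.loc[df["Age"] < 20, "Group"] = "young"      # assumed threshold
df.loc[df["Age"] >= 40, "Group"] = "elderly"   # assumed threshold
print(df)
```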

Basic row operations

Querying with loc

    Name  Age           Adress        Group
0    Tom   18    Beijing,China        young
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly

    Name  Age           Adress        Group
0    Tom   18    Beijing,China        young
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly
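
Both outputs contain all four rows, but the loc calls themselves are not shown. As an illustration only, here are two label-based selections that return every row (the variable names are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# .loc slices by label and includes both endpoints, so 0:3 returns four rows.
by_slice = df.loc[0:3]

# A list of labels selects exactly those rows.
by_list = df.loc[[0, 1, 2, 3]]
print(by_slice)
print(by_list)
```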

Conditional queries with loc

    Name  Age           Adress        Group
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly
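
The result keeps rows 1 to 3 and drops Tom. The exact condition is not shown; Age > 18 is assumed here because it yields the same rows:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# A boolean Series passed to .loc keeps the rows where it is True.
adults = df.loc[df["Age"] > 18]  # assumed condition
print(adults)
```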

Conditional row and column queries with loc
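
No code or output survives for this step. A minimal sketch of the idea, with an assumed condition and an assumed pair of columns:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# First argument filters rows, second argument picks columns.
subset = df.loc[df["Age"] > 18, ["Name", "Group"]]  # condition and columns are assumptions
print(subset)
```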

Querying with where

    Name   Age          Adress        Group
0    NaN   NaN             NaN          NaN
1    Bob  30.0  ShangHai,China  middle_aged
2    NaN   NaN             NaN          NaN
3  James  40.0  ShenZhen,China      elderly
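
Unlike boolean indexing, where keeps the original shape and fills non-matching rows with NaN, which is why Age is shown as a float. The condition is not given in the source; Age >= 30 is assumed because it keeps Bob and James:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# Rows where the condition is False are replaced with NaN in every column.
masked = df.where(df["Age"] >= 30)  # assumed condition
print(masked)
```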

Filtering with query

    Name  Age           Adress        Group
0    Tom   18    Beijing,China        young
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly

    Name  Age          Adress    Group
3  James   40  ShenZhen,China  elderly
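
The query expressions are missing from the page. The second output contains only James, so a filter such as Age > 30 would fit; both expressions below are assumptions chosen to match the row counts shown:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# query takes a string expression that refers to column names directly.
everyone = df.query("Age >= 18")  # assumed expression, matches all four rows
over_30 = df.query("Age > 30")    # assumed expression, matches only James
print(everyone)
print(over_30)
```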

Other DataFrame information

             Age
count   4.000000
mean   28.250000
std     9.251126
min    18.000000
25%    23.250000
50%    27.500000
75%    32.500000
max    40.000000

    Name  Age           Adress        Group
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly
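
The first output has the shape of describe(), which summarizes numeric columns by default. The second shows the last three rows, which tail(3) would return; that call is an assumption, since the original cell is missing:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

# Summary statistics for the numeric columns (only Age here).
stats = df.describe()
print(stats)

# Assumed call for the second output: the last three rows.
last_three = df.tail(3)
print(last_three)
```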

Reading and writing CSV

Export df to CSV without the index

Read the CSV into a DataFrame

    Name  Age           Adress        Group
0    Tom   18    Beijing,China        young
1    Bob   30   ShangHai,China  middle_aged
2   Mary   25  GuangZhou,China  middle_aged
3  James   40   ShenZhen,China      elderly
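
A round trip through CSV could look like the sketch below. The file name people.csv and the temporary directory are hypothetical, used only so the example is self-contained:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Bob", "Mary", "James"],
    "Age": [18, 30, 25, 40],
    "Adress": ["Beijing,China", "ShangHai,China", "GuangZhou,China", "ShenZhen,China"],
    "Group": ["young", "middle_aged", "middle_aged", "elderly"],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "people.csv")  # hypothetical file name

    # index=False leaves the row labels out of the file.
    df.to_csv(csv_path, index=False)

    # read_csv builds a fresh default index (0, 1, 2, ...).
    df_read = pd.read_csv(csv_path)

print(df_read)
```

Values that contain commas, such as the Adress entries, are quoted on write and parsed back as single fields on read.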
