
# Basic

```python
import pandas as pd
print(f"Using {pd.__name__}, version {pd.__version__}")
```

```python
Using pandas, version 0.23.0
```

## Create an empty DataFrame <a href="#chuang-jian-kong-dataframe" id="chuang-jian-kong-dataframe"></a>

```python
df = pd.DataFrame() 
print(df)
```

```
Empty DataFrame
Columns: []
Index: []
```

## Create a DataFrame from a dict <a href="#cong-dict-chuang-jian-dataframe1" id="cong-dict-chuang-jian-dataframe1"></a>

```python
# Named `data` rather than `dict` to avoid shadowing the built-in type.
data = {'name': ["Tom", "Bob", "Mary", "James"],
        'age': [18, 30, 25, 40],
        'city': ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"]}

df = pd.DataFrame(data)
df
```

|   | name  | age | city      |
| - | ----- | --- | --------- |
| 0 | Tom   | 18  | Beijing   |
| 1 | Bob   | 30  | ShangHai  |
| 2 | Mary  | 25  | GuangZhou |
| 3 | James | 40  | ShenZhen  |

```python
index = pd.Index(["Tom", "Bob", "Mary", "James"],name = 'person')
cols = ['age','city']
data = [[18,'Beijing'],
        [30,'ShangHai'],
        [25,'GuangZhou'],
        [40,'ShenZhen']]

df = pd.DataFrame(index=index, data=data, columns=cols)
df
```

|        | age | city      |
| ------ | --- | --------- |
| person |     |           |
| Tom    | 18  | Beijing   |
| Bob    | 30  | ShangHai  |
| Mary   | 25  | GuangZhou |
| James  | 40  | ShenZhen  |

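A list of dicts is another common input shape. The sketch below is illustrative only: the `records` list and the `df_records` / `df_indexed` names are made up for this example and simply mirror the data above.

```python
import pandas as pd

# Hypothetical records, mirroring the table above.
records = [
    {"name": "Tom", "age": 18, "city": "Beijing"},
    {"name": "Bob", "age": 30, "city": "ShangHai"},
]

# Each dict becomes one row; the keys become column names.
df_records = pd.DataFrame(records)

# Promote an existing column to the index instead of the default 0..n-1.
df_indexed = df_records.set_index("name")
print(df_indexed)
```
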
## Basic column operations <a href="#dui-columns-de-ji-chu-cao-zuo" id="dui-columns-de-ji-chu-cao-zuo"></a>

### add column <a href="#add-column" id="add-column"></a>

```python
# Named `data` rather than `dict` to avoid shadowing the built-in type.
data = {'name': ["Tom", "Bob", "Mary", "James"],
        'age': [18, 30, 25, 40],
        'city': ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"]}

df = pd.DataFrame(data)
df
```

|   | name  | age | city      |
| - | ----- | --- | --------- |
| 0 | Tom   | 18  | Beijing   |
| 1 | Bob   | 30  | ShangHai  |
| 2 | Mary  | 25  | GuangZhou |
| 3 | James | 40  | ShenZhen  |

```python
df['country'] = 'USA'
df
```

|   | name  | age | city      | country |
| - | ----- | --- | --------- | ------- |
| 0 | Tom   | 18  | Beijing   | USA     |
| 1 | Bob   | 30  | ShangHai  | USA     |
| 2 | Mary  | 25  | GuangZhou | USA     |
| 3 | James | 40  | ShenZhen  | USA     |

```python
df['adress'] = df['country']
df
```

|   | name  | age | city      | country | adress |
| - | ----- | --- | --------- | ------- | ------ |
| 0 | Tom   | 18  | Beijing   | USA     | USA    |
| 1 | Bob   | 30  | ShangHai  | USA     | USA    |
| 2 | Mary  | 25  | GuangZhou | USA     | USA    |
| 3 | James | 40  | ShenZhen  | USA     | USA    |

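Besides plain assignment, `assign` and `insert` are two other ways to add a column. This is a minimal sketch on a small stand-in frame; `df_demo`, `age_next_year` and the `id` values are hypothetical names chosen for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"name": ["Tom", "Bob"], "age": [18, 30]})

# assign returns a new DataFrame and leaves df_demo unchanged.
df_new = df_demo.assign(age_next_year=df_demo["age"] + 1)

# insert modifies df_demo in place and puts the column at position 0.
df_demo.insert(0, "id", [101, 102])

print(df_new)
print(df_demo)
```

`assign` is convenient in method chains, while `insert` is useful when the column position matters.
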
### Change column values <a href="#change-column-values" id="change-column-values"></a>

```python
df['country'] = 'China'
df
```

|   | name  | age | city      | country | adress |
| - | ----- | --- | --------- | ------- | ------ |
| 0 | Tom   | 18  | Beijing   | China   | USA    |
| 1 | Bob   | 30  | ShangHai  | China   | USA    |
| 2 | Mary  | 25  | GuangZhou | China   | USA    |
| 3 | James | 40  | ShenZhen  | China   | USA    |

```python
df['adress'] = df['city']+','+ df['country']
df
```

|   | name  | age | city      | country | adress          |
| - | ----- | --- | --------- | ------- | --------------- |
| 0 | Tom   | 18  | Beijing   | China   | Beijing,China   |
| 1 | Bob   | 30  | ShangHai  | China   | ShangHai,China  |
| 2 | Mary  | 25  | GuangZhou | China   | GuangZhou,China |
| 3 | James | 40  | ShenZhen  | China   | ShenZhen,China  |

### Delete columns <a href="#delete-columns" id="delete-columns"></a>

```python
df.drop('country',axis=1, inplace=True)
del df['city']
df
```

|   | name  | age | adress          |
| - | ----- | --- | --------------- |
| 0 | Tom   | 18  | Beijing,China   |
| 1 | Bob   | 30  | ShangHai,China  |
| 2 | Mary  | 25  | GuangZhou,China |
| 3 | James | 40  | ShenZhen,China  |

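Both calls above change `df` in place. If the original frame should stay intact, `drop` can return a new one instead. A small sketch, where `df_demo` is a stand-in frame and `not_a_column` is a deliberately missing label:

```python
import pandas as pd

df_demo = pd.DataFrame({"name": ["Tom"], "age": [18], "city": ["Beijing"]})

# columns= avoids having to remember axis=1; the result is a new DataFrame.
trimmed = df_demo.drop(columns=["city", "age"])

# errors="ignore" skips labels that are not present instead of raising KeyError.
same = df_demo.drop(columns=["not_a_column"], errors="ignore")

print(trimmed.columns.tolist())
print(same.columns.tolist())
```
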
### Select columns <a href="#select-columns" id="select-columns"></a>

```python
df['age']
```

```python
0    18
1    30
2    25
3    40
Name: age, dtype: int64
```

```python
df.name
```

```python
0      Tom
1      Bob
2     Mary
3    James
Name: name, dtype: object
```

```python
df[['age','name']]  
```

|   | age | name  |
| - | --- | ----- |
| 0 | 18  | Tom   |
| 1 | 30  | Bob   |
| 2 | 25  | Mary  |
| 3 | 40  | James |

```python
df.columns
```

```python
Index(['name', 'age', 'adress'], dtype='object')
```

### Rename columns <a href="#rename-columns" id="rename-columns"></a>

```python
df.rename(columns={'age': 'Age', 'name': 'Name', 'adress': 'Adress'}, inplace=True)
df
```

|   | Name  | Age | Adress          |
| - | ----- | --- | --------------- |
| 0 | Tom   | 18  | Beijing,China   |
| 1 | Bob   | 30  | ShangHai,China  |
| 2 | Mary  | 25  | GuangZhou,China |
| 3 | James | 40  | ShenZhen,China  |

```python
df.rename(str.lower, axis='columns', inplace=True)
df
```

|   | name  | age | adress          |
| - | ----- | --- | --------------- |
| 0 | Tom   | 18  | Beijing,China   |
| 1 | Bob   | 30  | ShangHai,China  |
| 2 | Mary  | 25  | GuangZhou,China |
| 3 | James | 40  | ShenZhen,China  |

```python
df.rename(str.capitalize, axis='columns', inplace=True)
df
```

|   | Name  | Age | Adress          |
| - | ----- | --- | --------------- |
| 0 | Tom   | 18  | Beijing,China   |
| 1 | Bob   | 30  | ShangHai,China  |
| 2 | Mary  | 25  | GuangZhou,China |
| 3 | James | 40  | ShenZhen,China  |

### Set column value with conditions <a href="#set-column-value-with-conditions" id="set-column-value-with-conditions"></a>

```python
df['Group'] = 'elderly'
df.loc[df['Age']<=18, 'Group'] = 'young'
df.loc[(df['Age'] >18) & (df['Age'] <= 30), 'Group'] = 'middle_aged'
df
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 0 | Tom   | 18  | Beijing,China   | young        |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

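The same grouping can probably be expressed in one step with `pd.cut`, which bins a numeric column into labelled intervals. This is an alternative sketch, not the approach used above; `df_demo` is a stand-in frame and the bin edges are chosen to match the conditions above.

```python
import pandas as pd

df_demo = pd.DataFrame({"Age": [18, 30, 25, 40]})

# Bins are right-inclusive by default: (-inf, 18], (18, 30], (30, inf).
df_demo["Group"] = pd.cut(
    df_demo["Age"],
    bins=[float("-inf"), 18, 30, float("inf")],
    labels=["young", "middle_aged", "elderly"],
)
print(df_demo)
```

Note that `pd.cut` produces a categorical column rather than plain strings.
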
## Basic row operations <a href="#dui-rows-de-ji-chu-cao-zuo" id="dui-rows-de-ji-chu-cao-zuo"></a>

### Querying with loc <a href="#loc-han-shu-cha-xun" id="loc-han-shu-cha-xun"></a>

```python
df
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 0 | Tom   | 18  | Beijing,China   | young        |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

```python
df.loc[:]
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 0 | Tom   | 18  | Beijing,China   | young        |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

### Conditional queries with loc <a href="#loc-han-shu-tiao-jian-cha-xun" id="loc-han-shu-tiao-jian-cha-xun"></a>

```python
df.loc[df['Age']>20]
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

### Conditional row and column queries with loc <a href="#loc-han-shu-tiao-jian-hang-lie-cha-xun" id="loc-han-shu-tiao-jian-hang-lie-cha-xun"></a>

```python
df.loc[df['Group']=='middle_aged','Name']
```

```python
1     Bob
2    Mary
Name: Name, dtype: object
```
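
`loc` selects by label, whereas `iloc` selects by integer position, and the two treat slice endpoints differently. A short sketch on a stand-in frame (`df_demo` is a hypothetical name):

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"Name": ["Tom", "Bob", "Mary", "James"], "Age": [18, 30, 25, 40]}
)

# loc slices by label and includes both endpoints: rows 1, 2 and 3.
by_label = df_demo.loc[1:3, ["Name", "Age"]]

# iloc slices by position and excludes the end: rows 1 and 2.
by_position = df_demo.iloc[1:3]

print(by_label)
print(by_position)
```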

### Querying with where <a href="#where-cha-xun" id="where-cha-xun"></a>

```python
filter_adult = df['Age']>25
result = df.where(filter_adult)
result
```

|   | Name  | Age  | Adress         | Group        |
| - | ----- | ---- | -------------- | ------------ |
| 0 | NaN   | NaN  | NaN            | NaN          |
| 1 | Bob   | 30.0 | ShangHai,China | middle\_aged |
| 2 | NaN   | NaN  | NaN            | NaN          |
| 3 | James | 40.0 | ShenZhen,China | elderly      |

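`where` keeps the shape of the original frame and fills non-matching rows with NaN, which is why `Age` appears as a float in the output above. Boolean indexing drops those rows instead. A minimal comparison, using a stand-in frame named `df_demo`:

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"Name": ["Tom", "Bob", "Mary", "James"], "Age": [18, 30, 25, 40]}
)
mask = df_demo["Age"] > 25

# where keeps all four rows; rows failing the mask become NaN.
masked = df_demo.where(mask)

# Boolean indexing keeps only the matching rows and the integer dtype.
filtered = df_demo[mask]

print(masked)
print(filtered)
```
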
### Filtering with query <a href="#query-shai-xuan" id="query-shai-xuan"></a>

```python
df
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 0 | Tom   | 18  | Beijing,China   | young        |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

```python
# Both conditions go in one string; 'a' and 'b' in Python evaluates to just 'b'.
df.query('Group=="elderly" and Age>30')
```

|   | Name  | Age | Adress         | Group   |
| - | ----- | --- | -------------- | ------- |
| 3 | James | 40  | ShenZhen,China | elderly |

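A query string can also refer to Python variables by prefixing them with `@`. The sketch below assumes a stand-in frame `df_demo` and a hypothetical threshold `min_age`; neither appears in the original notes.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {
        "Name": ["Tom", "Bob", "Mary", "James"],
        "Age": [18, 30, 25, 40],
        "Group": ["young", "middle_aged", "middle_aged", "elderly"],
    }
)

min_age = 25  # hypothetical threshold

# Both conditions live in one string; @min_age refers to the Python variable.
result = df_demo.query('Group == "middle_aged" and Age >= @min_age')
print(result)
```
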
### Other DataFrame information <a href="#le-jie-dataframe" id="le-jie-dataframe"></a>

```python
df.shape
```

```
(4, 4)
```

```python
df.describe()
```

|       | Age       |
| ----- | --------- |
| count | 4.000000  |
| mean  | 28.250000 |
| std   | 9.251126  |
| min   | 18.000000 |
| 25%   | 23.250000 |
| 50%   | 27.500000 |
| 75%   | 32.500000 |
| max   | 40.000000 |

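```python
df.head(3)  # first three rows (not displayed; a notebook shows only the last expression)
df.tail(3)  # last three rows
```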

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

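`describe()` summarises only numeric columns by default, which is why only `Age` appears above. As a rough sketch on a stand-in frame (`df_demo` is hypothetical), `dtypes` and `describe(include="all")` give a fuller picture:

```python
import pandas as pd

df_demo = pd.DataFrame({"Name": ["Tom", "Bob"], "Age": [18, 30]})

# dtypes lists the storage type of each column.
print(df_demo.dtypes)

# include="all" also summarises non-numeric columns (count, unique, top, freq).
summary = df_demo.describe(include="all")
print(summary)
```
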
## Reading and writing CSV <a href="#du-qu-xie-chu-csv" id="du-qu-xie-chu-csv"></a>

### Export df to CSV without the index <a href="#ba-df-dao-chu-wei-csv-bu-yao-index" id="ba-df-dao-chu-wei-csv-bu-yao-index"></a>

```python
df.to_csv('person.csv', index=False, sep=',')
```

### Read a CSV into a DataFrame <a href="#du-qu-csv-wei-dataframe" id="du-qu-csv-wei-dataframe"></a>

```python
person = pd.read_csv('person.csv')
person
```

|   | Name  | Age | Adress          | Group        |
| - | ----- | --- | --------------- | ------------ |
| 0 | Tom   | 18  | Beijing,China   | young        |
| 1 | Bob   | 30  | ShangHai,China  | middle\_aged |
| 2 | Mary  | 25  | GuangZhou,China | middle\_aged |
| 3 | James | 40  | ShenZhen,China  | elderly      |

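The `Adress` values contain commas, yet they survive the round trip because `to_csv` quotes fields that include the separator. A small in-memory sketch of this, using `io.StringIO` in place of a file and a stand-in frame `df_demo`:

```python
import io

import pandas as pd

df_demo = pd.DataFrame(
    {"Name": ["Tom", "Bob"], "Adress": ["Beijing,China", "ShangHai,China"]}
)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df_demo.to_csv(index=False)
print(csv_text)

# Quoted fields are parsed back into single values.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```
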
