itertools


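count in detail

count(start=0, step=1) takes two parameters, both optional: start defaults to 0 and step defaults to 1. It returns a new iterator that begins at start and yields evenly spaced values step apart. The iterator never ends by itself, so the loop has to break explicitly.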

import itertools
nums = itertools.count()
for i in nums:
    if i > 4:
        break
    print(i)
0
1
2
3
4

nums = itertools.count(10, 2)
for i in nums:
    if i > 14:
        break
    print(i)
10
12
14
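
Because count() never stops by itself, it is usually paired with something finite. A minimal sketch, assuming a hypothetical list called names that is not part of the original text, which numbers items starting from 1 with zip():

```python
import itertools

# Hypothetical sample data, used only for illustration.
names = ["alice", "bob", "carol"]

# zip() stops at the shortest input, so the endless counter is cut off
# as soon as names is exhausted.
numbered = list(zip(itertools.count(1), names))
print(numbered)  # [(1, 'alice'), (2, 'bob'), (3, 'carol')]
```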

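repeat in detail

repeat(object[, times]) creates an iterator that returns object over and over. If times is given, the object is repeated only times times; otherwise it repeats indefinitely.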

ns = itertools.repeat('AB', 3)
for n in ns:
    print(n)
AB
AB
AB
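
Without times, repeat() is often used to feed a constant argument to map(). A small sketch of that pattern (the variable name squares is only illustrative):

```python
import itertools

# repeat(2) supplies the exponent 2 forever; map() stops when range(5) ends.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```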

chain()

chain() links several iterables together into one larger iterator:

x = itertools.chain("abc", "xyz")
print(list(x))

=> ['a', 'b', 'c', 'x', 'y', 'z']
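
When the iterables are already collected in a single container, chain.from_iterable() can be used instead of unpacking them as separate arguments. A short sketch, with a hypothetical nested list as input:

```python
import itertools

# Hypothetical sample data, used only for illustration.
nested = [[1, 2], [3], [4, 5, 6]]

# from_iterable() takes one iterable of iterables and flattens one level.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]
```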

groupby()

groupby() collects adjacent equal elements of an iterable into groups:

for key, group in itertools.groupby('AAABBBCCAAA'):
    print(key, list(group))
    
A ['A', 'A', 'A']
B ['B', 'B', 'B']
C ['C', 'C']
A ['A', 'A', 'A']


def sortBy(score):
    if score > 80:
        return "A"
    elif score >= 60:
        return "B"
    else:
        return "C"

scores = [81, 82, 84, 76, 64, 78, 59, 44, 55, 89]
for m, n in itertools.groupby(scores, key=sortBy):
    print(m, list(n))

A [81, 82, 84]
B [76, 64, 78]
C [59, 44, 55]
A [89]

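As we can see, the function groups the list elements according to our custom key function sortBy. But why is there an extra A group at the end? The reason is that groupby starts a new group every time the return value of the key function changes, so only adjacent elements with the same key end up together. To get one group per key, the list has to be sorted with sortBy as the key beforehand.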


scores = [81, 82, 84, 76, 64, 78, 59, 44, 55, 89]
scores = sorted(scores, key=sortBy)
print(scores)
for m, n in itertools.groupby(scores, key=sortBy):
    print(m, list(n))
    
[81, 82, 84, 89, 76, 64, 78, 59, 44, 55]
A [81, 82, 84, 89]
B [76, 64, 78]
C [59, 44, 55]
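
A common follow-up is to store the groups in a dict. The sketch below repeats the grading rule from the example above so that it runs by itself; the name grouped is only illustrative. Each group is converted to a list right away, because a group iterator is no longer usable once groupby moves on to the next key.

```python
import itertools

def sortBy(score):
    # Same grading rule as the example above.
    if score > 80:
        return "A"
    elif score >= 60:
        return "B"
    return "C"

scores = [81, 82, 84, 76, 64, 78, 59, 44, 55, 89]

# Sort by the grouping key first, then materialize every group.
ordered = sorted(scores, key=sortBy)
grouped = {
    grade: list(members)
    for grade, members in itertools.groupby(ordered, key=sortBy)
}
print(grouped)
# {'A': [81, 82, 84, 89], 'B': [76, 64, 78], 'C': [59, 44, 55]}
```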

compress in detail

compress(data, selectors) is simple: it keeps each element of data whose corresponding element in selectors is truthy and drops the rest.

data = [81, 82, 84, 76, 64, 78]
tf = [1,1,0,1,1,0]
print(list(itertools.compress(data, tf)))

[81, 82, 76, 64]
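
The selectors do not have to be written by hand. A minimal sketch that derives them from a condition (the threshold of 80 is an assumption chosen for illustration):

```python
import itertools

data = [81, 82, 84, 76, 64, 78]

# True where the score is at least 80, False otherwise.
selectors = [score >= 80 for score in data]
passed = list(itertools.compress(data, selectors))
print(passed)  # [81, 82, 84]
```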

filterfalse in detail

filterfalse(predicate, iterable) creates an iterator that returns the elements of iterable for which predicate is false.

x = itertools.filterfalse(lambda x: x < 5, [1,3,5,7,4,2,1])
print(list(x))

[5, 7]
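
filterfalse() is the complement of the built-in filter(), so the two together split a sequence into two parts. A short sketch, with is_small as a hypothetical helper name:

```python
import itertools

values = [1, 3, 5, 7, 4, 2, 1]

def is_small(x):
    # Hypothetical predicate, used only for illustration.
    return x < 5

small = list(filter(is_small, values))                 # predicate is true
large = list(itertools.filterfalse(is_small, values))  # predicate is false
print(small)  # [1, 3, 4, 2, 1]
print(large)  # [5, 7]
```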

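islice in detail

islice(iterable, stop) or islice(iterable, start, stop[, step]) slices an iterable. It starts at index start, ends before index stop, and can skip ahead in strides of step. Passing None as stop slices through to the end.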

print(list(itertools.islice('123456789', 2)))
print(list(itertools.islice('123456789', 2, 4)))
print(list(itertools.islice('123456789', 2, None)))
print(list(itertools.islice('123456789', 0, None, 2)))

['1', '2']
['3', '4']
['3', '4', '5', '6', '7', '8', '9']
['1', '3', '5', '7', '9']
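
Unlike ordinary slicing, islice() works on any iterator, including endless ones, and it does not accept negative values for start, stop, or step. A minimal sketch that takes a fixed number of items from an infinite counter:

```python
import itertools

# Take the first five even numbers from an endless counter.
evens = list(itertools.islice(itertools.count(0, 2), 5))
print(evens)  # [0, 2, 4, 6, 8]
```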

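dropwhile in detail

dropwhile(predicate, iterable) creates an iterator that skips elements while predicate is true, then returns every element from the first one for which predicate is false. After that point the predicate is not checked again.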

x = itertools.dropwhile(lambda x: x < 5, [1,3,5,7,4,2,1])
print(list(x))

[5, 7, 4, 2, 1]
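
One typical use is skipping a leading header. The sketch below uses a hypothetical list of text lines; note that the later comment line is kept, because the predicate only applies to the leading run.

```python
import itertools

# Hypothetical file contents, used only for illustration.
lines = ["# header", "# generated file", "a = 1", "# inline note", "b = 2"]

# Drop only the leading comment lines.
body = list(itertools.dropwhile(lambda line: line.startswith("#"), lines))
print(body)  # ['a = 1', '# inline note', 'b = 2']
```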

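takewhile in detail

takewhile(predicate, iterable) creates an iterator that returns elements as long as predicate is true and stops at the first element for which it is false. It is the counterpart of dropwhile.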

x = itertools.takewhile(lambda x: x < 5, [1,3,5,7,4,2,1])
print(list(x))

[1, 3]
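
takewhile() is also a convenient way to put a bound on an endless iterator. A small sketch, with the limit of 50 chosen only for illustration:

```python
import itertools

# Squares of 1, 2, 3, ... as a lazy, endless generator.
squares = (n * n for n in itertools.count(1))

# Stop at the first square that is not below 50.
small_squares = list(itertools.takewhile(lambda s: s < 50, squares))
print(small_squares)  # [1, 4, 9, 16, 25, 36, 49]
```

The first failing value (64 here) is consumed from the underlying generator and discarded, which matters if the generator is reused afterwards.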

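permutations in detail

permutations(iterable, r=None) returns all permutations of length r of the elements of iterable. If r is omitted or None, it defaults to the length of iterable. Elements are treated as distinct by position, not by value, so equal values at different positions produce tuples that look identical.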

print(list(itertools.permutations("aba", r=2)))
[('a', 'b'), ('a', 'a'), ('b', 'a'), ('b', 'a'), ('a', 'a'), ('a', 'b')]
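
The sketch below illustrates the default r, the expected number of results, and one possible way to drop the look-alike tuples by collecting them in a set (the variable names are only illustrative):

```python
import itertools
import math

# With r omitted, full-length permutations are produced.
full = list(itertools.permutations("abc"))
print(len(full))  # 6

# The number of r-length permutations of n items is n! / (n - r)!.
pairs = list(itertools.permutations("abcd", 2))
expected = math.factorial(4) // math.factorial(4 - 2)
print(len(pairs), expected)  # 12 12

# Equal values at different positions give duplicates; a set removes them.
unique = set(itertools.permutations("aba", 2))
print(len(unique))  # 3
```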

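combinations in detail

combinations(iterable, r) returns all subsequences of length r of the elements of iterable. Unlike permutations, r is required and has no default. Also unlike permutations, each tuple keeps its elements in the order in which they appear in iterable, so ('a', 'b') is produced but ('b', 'a') is not.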

print(list(itertools.combinations("abc", r=2)))
[('a', 'b'), ('a', 'c'), ('b', 'c')]
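
The relationship to permutations can be checked directly. A minimal sketch, relying on the assumption that the input is already sorted and has no repeated elements:

```python
import itertools

items = "abc"
combos = list(itertools.combinations(items, 2))
perms = list(itertools.permutations(items, 2))

# For a sorted input with distinct elements, the combinations are exactly
# the permutations whose elements are in ascending order.
ordered_perms = [p for p in perms if list(p) == sorted(p)]
print(combos == ordered_perms)  # True
```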

combinations_with_replacement

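combinations_with_replacement(iterable, r) returns all subsequences of length r of the elements of iterable, keeping the elements in input order. As with combinations, r is required and has no default. Unlike combinations, an individual element may appear more than once in a tuple.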

print(list(itertools.combinations_with_replacement("abc", r=2)))
[('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]
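
As an illustration that is not from the original text, the function can list the distinct outcomes of rolling two dice when the order of the dice does not matter:

```python
import itertools

faces = range(1, 7)  # a six-sided die

# (1, 2) and (2, 1) count as the same outcome, but doubles like (3, 3) are allowed.
rolls = list(itertools.combinations_with_replacement(faces, 2))
print(len(rolls))  # 21
print(rolls[:3])   # [(1, 1), (1, 2), (1, 3)]
```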
