My Note / Zeliang YAO

Iterator & Iterable

Iterators and Iterables

We could create iterator objects by simply implementing:

  • a __next__ method that returns the next element in the container

  • an __iter__ method that just returns the object itself (the iterator object)
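As a minimal sketch of these two methods, the class below is a hypothetical `Countdown` iterator (the name and behavior are assumptions made for illustration, not part of the original notes):

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        # An iterator simply returns itself from __iter__
        return self

    def __next__(self):
        # Signal the end of iteration once the counter reaches zero
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


countdown = Countdown(3)
first = next(countdown)       # consume one element manually
remaining = list(countdown)   # list() consumes whatever is left
```

Because `__iter__` returns the object itself, `iter(countdown)` hands back the same, partially consumed iterator rather than a fresh one.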

However, we had two outstanding issues/questions:

  • when we looped over the iterator with a for loop (or a comprehension, or any other function that iterates), we saw that __iter__ was always called first.

  • the iterator gets exhausted once we have iterated over it fully, which means we have to create a new iterator every time we want to iterate over the collection again. Can we somehow avoid having to remember to do that every time?
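Both points can be reproduced with a built-in list. The sketch below is only an approximation of what a for loop does internally; the variable names are chosen for illustration:

```python
numbers = [10, 20, 30]
it = iter(numbers)

first_pass = list(it)    # consumes every element
second_pass = list(it)   # the same iterator is now exhausted

# Roughly what a for loop does: call iter() first,
# then call next() repeatedly until StopIteration is raised.
collected = []
loop_iter = iter(numbers)
while True:
    try:
        item = next(loop_iter)
    except StopIteration:
        break
    collected.append(item)
```

This is why `__iter__` is always called first: the loop asks the object for an iterator before it requests any element.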

class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
        
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item

Now we can create the iterator objects without having to recreate the Cities object every time.

But we still have to remember to create a new iterator each time, and we cannot iterate over the cities object itself!

cities = Cities()
iter_1 = CityIterator(cities)

for city in iter_1:
    print(city)

New York
Newark
New Delhi
Newcastle

for city in cities:
    print(city)
    
TypeError                                 Traceback (most recent call last)
<ipython-input-11-5ab6add74170> in <module>
----> 1  for city in cities:
 2     print(city)

TypeError: 'Cities' object is not iterable

Iterables

Now we finally come to how an iterable is defined in Python.

An iterable is an object that:

  • implements the __iter__ method

  • and that method returns an iterator which can be used to iterate over the object
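A minimal sketch of this definition, using a hypothetical `Colors` class that delegates to the iterator of its internal list (a shortcut chosen here for brevity, not the approach used in the rest of this page):

```python
class Colors:
    """Hypothetical iterable used only for illustration."""

    def __init__(self):
        self._colors = ['red', 'green', 'blue']

    def __iter__(self):
        # Return a brand-new iterator on every call
        return iter(self._colors)


colors = Colors()
first_pass = list(colors)    # uses one fresh iterator
second_pass = list(colors)   # uses another fresh iterator
```

Since every call to `__iter__` produces a new iterator, the iterable itself never gets exhausted and can be looped over as many times as needed.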

Now we can put the iterator class inside our Cities class to keep the code self-contained:

class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __iter__(self):
        print('Calling Cities instance __iter__')
        return self.CityIterator(self)
    
    class CityIterator:
        def __init__(self, city_obj):
            # city_obj is an instance of Cities
            print('Calling CityIterator __init__')
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            print('Calling CityIterator instance __iter__')
            return self

        def __next__(self):
            print('Calling __next__')
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item

cities = Cities()
list(enumerate(cities))

Calling Cities instance __iter__
Calling CityIterator __init__
Calling __next__
Calling __next__
Calling __next__
Calling __next__
Calling __next__

Out[25]:

[(0, 'New York'), (1, 'Newark'), (2, 'New Delhi'), (3, 'Newcastle')]

Since Cities also behaves like a sequence, we can additionally implement the __getitem__ method to make it a sequence type:

class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __getitem__(self, s):
        print('getting item...')
        return self._cities[s]
    
    def __iter__(self):
        print('Calling Cities instance __iter__')
        return self.CityIterator(self)
    
    class CityIterator:
        def __init__(self, city_obj):
            # city_obj is an instance of Cities
            print('Calling CityIterator __init__')
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            print('Calling CityIterator instance __iter__')
            return self

        def __next__(self):
            print('Calling __next__')
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item

Cities is now both a sequence and an iterable. When both __iter__ and __getitem__ are defined, the for loop uses __iter__:

cities = Cities()
for city in cities:
    print(city)
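For comparison, an object that defines only `__getitem__` can still be iterated, because `iter()` falls back to calling `__getitem__` with 0, 1, 2, ... until `IndexError` is raised. The `Letters` class below is a hypothetical example written for this sketch:

```python
from collections.abc import Iterable


class Letters:
    """Hypothetical sequence with __getitem__ but no __iter__."""

    def __init__(self):
        self._letters = ['a', 'b', 'c']

    def __len__(self):
        return len(self._letters)

    def __getitem__(self, index):
        # Raises IndexError past the end, which stops the iteration
        return self._letters[index]


letters = Letters()
result = list(letters)                         # works through the __getitem__ fallback
is_abc_iterable = isinstance(letters, Iterable)  # the ABC check only looks for __iter__
```

Note that `isinstance(letters, Iterable)` is False even though iteration works, so calling `iter()` on the object is the more reliable way to find out whether it can be iterated.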

Python Built-In Iterables and Iterators

l = [1, 2, 3]
iter_l = iter(l)

'__next__' in dir(iter_l)
True
'__iter__' in dir(iter_l)
True
# The list itself is iterable but does not implement a `__next__` method:
'__next__' in dir(l)
False
# Of course, since lists are also sequence types, they also implement the `__getitem__` method:
'__getitem__' in dir(l)
True
from collections.abc import Iterable, Iterator
l = [1,2,3]

hasattr(l,'__iter__'),hasattr(l,'__next__')
(True, False)

isinstance(l,Iterable),isinstance(l,Iterator)
(True, False)

new_l = iter(l)

hasattr(new_l,'__iter__'),hasattr(new_l,'__next__')
(True, True)

isinstance(new_l,Iterable),isinstance(new_l,Iterator)
(True, True)
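The same checks can be wrapped in a small helper. The `describe` function below is a hypothetical name introduced for this sketch; it relies only on the `collections.abc` classes used above:

```python
from collections.abc import Iterable, Iterator


def describe(obj):
    """Hypothetical helper: classify obj as 'iterator', 'iterable', or 'neither'."""
    # Check Iterator first, since every iterator is also an iterable
    if isinstance(obj, Iterator):
        return 'iterator'
    if isinstance(obj, Iterable):
        return 'iterable'
    return 'neither'


kinds = {
    'list': describe([1, 2, 3]),
    'list iterator': describe(iter([1, 2, 3])),
    'range': describe(range(3)),
    'zip': describe(zip([1], [2])),
    'generator expression': describe(x for x in [1, 2, 3]),
    'int': describe(42),
}
```

In this sketch, lists and ranges are classified as iterables, while the objects returned by `iter()`, `zip()`, and generator expressions are iterators and therefore can be consumed only once.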

Last updated 3 years ago
