Consuming Iterators Manually

A fairly typical use case for consuming an iterator manually is reading data from a CSV file where you know the first few lines contain information about the data rather than the data itself.

with open('cars.csv') as file:
    for line in file:
        print(line)  

Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin
STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT

Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US
Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US

As we can see, the values are delimited by ; and the first two lines contain the column names and the column types.

The reason for the blank line between each printed line is that each line read from the file ends with a newline, and print() also emits a newline by default. So we'll have to strip those out.
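As a minimal sketch of stripping the newline and splitting on the delimiter: here `io.StringIO` stands in for `open('cars.csv')` so the snippet runs on its own, and the names `sample` and `rows` are only illustrative.

```python
import io

# Stand-in for open('cars.csv'): the first lines of the file shown above.
sample = io.StringIO(
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
    "STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT\n"
    "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n"
)

# Remove the trailing newline from each line, then split on the ';' delimiter.
rows = [line.strip('\n').split(';') for line in sample]

for row in rows:
    print(row)
```

Each row is now a list of strings with no trailing newline, so the extra blank lines disappear from the output.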

Here's what we want to do:

  • read the first line to get the column headers and create a named tuple class

  • read the data types from the second line and store them so we can cast the strings we read to the correct data type

  • read the data rows and parse them into named tuples
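The first step might look roughly like this. It is a sketch that assumes the header line has already been read into a string; `header_line`, `headers` and the class name `Car` are illustrative choices.

```python
from collections import namedtuple

# The first line of the file, as it would come back from the file iterator.
header_line = (
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
)

# Strip the newline and split on ';' to get the column names.
headers = header_line.strip('\n').split(';')

# Build a named tuple class whose fields are the column names.
Car = namedtuple('Car', headers)

print(Car._fields)
```

This works because every column name in this file is a valid Python identifier; files with spaces or punctuation in their headers would need the names cleaned up first.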

We still need to parse the data into strings, integers, floats... First we need to figure out how to cast a value to a data type based on the data type string:

  • STRING --> str

  • DOUBLE --> float

  • INT --> int

  • CAT --> str
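One way to express this mapping is a pair of small helper functions. The names `cast` and `cast_row` are hypothetical, and anything that is not DOUBLE or INT is assumed to stay a string.

```python
def cast(data_type, value):
    """Convert a single string value according to its data type string."""
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    else:
        # STRING and CAT are both kept as plain strings.
        return str(value)


def cast_row(data_types, data_row):
    """Cast every value in a row using the matching entry in data_types."""
    return [cast(data_type, value)
            for data_type, value in zip(data_types, data_row)]


data_types = 'STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT'.split(';')
data_row = 'Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US'.split(';')

print(cast_row(data_types, data_row))
```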

Let's go back and fix up our original code now:
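A minimal sketch of what the fixed-up loop could look like, assuming we track which line we are on with `enumerate`. `io.StringIO` stands in for `open('cars.csv')`, and `SAMPLE`, `cast`, `cast_row` and `cars` are illustrative names.

```python
import io
from collections import namedtuple

SAMPLE = (
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
    "STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT\n"
    "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n"
    "Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US\n"
)


def cast(data_type, value):
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    return str(value)  # STRING and CAT


def cast_row(data_types, data_row):
    return [cast(dt, value) for dt, value in zip(data_types, data_row)]


cars = []
with io.StringIO(SAMPLE) as file:  # stand-in for open('cars.csv')
    for index, line in enumerate(file):
        fields = line.strip('\n').split(';')
        if index == 0:
            # First line: column headers become the named tuple fields.
            Car = namedtuple('Car', fields)
        elif index == 1:
            # Second line: data type strings used for casting.
            data_types = fields
        else:
            # Remaining lines: actual data rows.
            cars.append(Car(*cast_row(data_types, fields)))

print(cars[0])
```

The drawback of this version is that the line index is checked on every iteration, even though only the first two lines are special.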

We can clean up this code by using iterators directly:
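As a sketch of the iterator-based version: we call `next()` twice to consume the two special lines, then loop over whatever remains. As before, `io.StringIO` stands in for `open('cars.csv')` and the helper names are illustrative.

```python
import io
from collections import namedtuple

SAMPLE = (
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
    "STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT\n"
    "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n"
    "Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US\n"
)


def cast(data_type, value):
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    return str(value)  # STRING and CAT


def cast_row(data_types, data_row):
    return [cast(dt, value) for dt, value in zip(data_types, data_row)]


with io.StringIO(SAMPLE) as file:  # stand-in for open('cars.csv')
    file_iter = iter(file)

    # Manually consume the first two lines from the iterator.
    headers = next(file_iter).strip('\n').split(';')
    data_types = next(file_iter).strip('\n').split(';')
    Car = namedtuple('Car', headers)

    # The iterator now starts at the first data row.
    cars = [Car(*cast_row(data_types, line.strip('\n').split(';')))
            for line in file_iter]

for car in cars:
    print(car)
```

Because the iterator remembers its position, the loop never sees the header or type lines, so no index checks are needed.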
