Note: I have edited the original question based on comments and answers.
My question is: if a large quantity of data is input into a Python program, how can that data be made lazy, so that memory does not overflow?
For example, if a list is built by reading in a file and appending each line or portion of a line to a list, is that list lazy? In other words, can a list be appended to and the list be lazy? Is appending to a list reading the entire file into memory?
I understand that if I wanted to walk through that list, I would write a generator function to keep the access lazy.
What is triggering this question is this recent SO post
If this data were coming from a database table with 10M rows, like one of our MySQL daily water meter reads tables, I would not use the MySQLdb fetchall() method without knowing how to make the data lazy. Instead, I would read one row at a time.
But what if I did want the contents of that data in memory as a lazy sequence? How would I do it in Python?
Given that I am not presenting source code with a specific problem, the answer I’m looking for is a pointer or pointers to a place in the Python documentation or somewhere else to solve this problem.
Thanks.
The basic idea of “lazy” code is that the code does not get data until it needs the data.
For example, suppose I am writing a function to copy a text file. It would not be lazy to read the entire file into memory, then write the entire file. It also would not be lazy to use the .readlines() method to build a list out of all the input lines. But it would be lazy to read one line at a time and then write each line after reading.
To help make your code lazy, Python lets you use “generators”. Functions written using the yield statement are generators. For your database example, you could write a generator that would yield up one row at a time from the database, and then you could write code like this:
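A minimal sketch of such a generator, under some assumptions: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for the MySQL table, and the names iter_rows, meter_reads, meter_id and reading are hypothetical, chosen only for illustration. The generator pulls rows from the cursor in small batches with fetchmany() and yields them one at a time, so the caller never holds the whole result set in a list.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time, fetching them from the cursor in small batches."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break  # the cursor is exhausted
        for row in rows:
            yield row


# Hypothetical stand-in for the meter-reads table: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meter_reads (meter_id INTEGER, reading REAL)")
conn.executemany(
    "INSERT INTO meter_reads VALUES (?, ?)",
    ((i, i * 1.5) for i in range(10)),
)

cursor = conn.execute(
    "SELECT meter_id, reading FROM meter_reads ORDER BY meter_id"
)

total = 0.0
for meter_id, reading in iter_rows(cursor, batch_size=3):
    # Only one small batch of rows exists in memory at any moment.
    total += reading
```

The same pattern should carry over to any DB-API cursor, since fetchmany() is part of that interface. One caveat: with MySQLdb the default cursor may still buffer the full result set on the client, so a server-side cursor such as MySQLdb.cursors.SSCursor is probably needed for the reads to be lazy end to end.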