Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently, DataFrame.collect() returns a list of all the buffered RecordBatches. This is often undesirable: a user may, for example, want to write the result out to disk as it is materialized in order to save memory.
Describe the solution you'd like
It would be great to have a to_arrow_batches() method that returns a RecordBatchReader and defers the execution of each batch until it is requested from the RecordBatchReader.
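To make the requested semantics concrete, here is a rough pure-Python sketch of the difference between the eager collect() and a deferred to_arrow_batches(). Everything in it is a stand-in: `FakeDataFrame`, `Batch`, and the `batches_executed` counter are hypothetical names used only for illustration, plain lists take the place of RecordBatches, and a generator takes the place of a RecordBatchReader. It is not the DataFusion API or a proposed implementation.

```python
from typing import Iterator, List

Batch = List[int]  # hypothetical stand-in for a RecordBatch


class FakeDataFrame:
    """Hypothetical stand-in for a DataFrame; not the DataFusion API."""

    def __init__(self, num_batches: int, rows_per_batch: int) -> None:
        self.num_batches = num_batches
        self.rows_per_batch = rows_per_batch
        self.batches_executed = 0  # how many batches have actually been computed

    def _execute(self) -> Iterator[Batch]:
        # Generator: each batch is computed only when the consumer asks for it.
        for i in range(self.num_batches):
            self.batches_executed += 1
            start = i * self.rows_per_batch
            yield list(range(start, start + self.rows_per_batch))

    def collect(self) -> List[Batch]:
        # Eager: every batch is computed and buffered before returning.
        return list(self._execute())

    def to_arrow_batches(self) -> Iterator[Batch]:
        # Deferred: returns a reader-like iterator; nothing runs until it is read.
        return self._execute()


# Eager path: all four batches are held in memory at once.
eager = FakeDataFrame(num_batches=4, rows_per_batch=3)
eager_result = eager.collect()

# Deferred path: work happens one batch at a time, as the reader is consumed.
lazy = FakeDataFrame(num_batches=4, rows_per_batch=3)
reader = lazy.to_arrow_batches()
executed_before_read = lazy.batches_executed
first_batch = next(reader)
executed_after_one_read = lazy.batches_executed
```

With the deferred form, a consumer could write each batch to disk and drop it before requesting the next one, so peak memory would be bounded by roughly one batch rather than the whole result.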
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.