2017-06-29
Created: June 29, 2017 / Updated: November 2, 2024 / Status: in progress / 1 min read (~100 words)
- How should one deal with loading and batching huge amounts of data, particularly in the form of images?
- Loading thousands of images directly from the filesystem is inefficient due to the large number of system calls involved
- It seems straightforward to pack these images into more compact structures, such as NumPy arrays stored in compressed files like npz (see the first sketch after this list)
- However, how does one load all this data at training time such that 10 GB of compressed data does not turn into 20 GB of RAM used throughout training? (see the second sketch after this list)
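
A minimal sketch of the packing step, assuming a directory of images and the Pillow library; `pack_images`, `image_dir`, and the fixed target size are illustrative names and defaults, not anything from the original note:

```python
import os

import numpy as np
from PIL import Image


def pack_images(image_dir, out_path, size=(224, 224)):
    """Resize every image in image_dir and pack them into one compressed .npz archive."""
    arrays, names = [], []
    for fname in sorted(os.listdir(image_dir)):
        if not fname.lower().endswith((".png", ".jpg", ".jpeg")):
            continue
        img = Image.open(os.path.join(image_dir, fname)).convert("RGB").resize(size)
        arrays.append(np.asarray(img, dtype=np.uint8))
        names.append(fname)
    # savez_compressed deflates the stacked array into a single .npz file on disk
    np.savez_compressed(out_path, images=np.stack(arrays), names=np.array(names))
```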
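For the training-time question, one trade-off worth noting is that compressed .npz archives cannot be memory-mapped: `np.load` decompresses an entire array into RAM the moment it is accessed. Saving the packed array uncompressed with `np.save` instead allows `np.load(..., mmap_mode="r")` to page data in from disk on demand, so resident memory scales with the batch size rather than the dataset size. A minimal sketch of that idea (`iter_batches` and its parameters are hypothetical names):

```python
import numpy as np


def iter_batches(npy_path, batch_size=32, shuffle=True, seed=0):
    """Yield training batches from an uncompressed .npy file without loading it all."""
    data = np.load(npy_path, mmap_mode="r")  # a memory map, not a bulk read
    indices = np.arange(len(data))
    if shuffle:
        np.random.default_rng(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # sorted reads within a batch are friendlier to the OS page cache
        batch_idx = np.sort(indices[start:start + batch_size])
        # fancy indexing copies only this batch into RAM
        yield np.asarray(data[batch_idx])
```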