Efficiently copying a file#
TLDR: Use shutil.copyfileobj(fsrc, fdst)
I’m writing a Datasette plugin that handles an uploaded file, borrowing the Starlette mechanism for handling file uploads, documented here.
Starlette uploads result in a SpooledTemporaryFile file-like object. These look very much like a file, with one frustrating limitation: they don’t have a defined, stable path on disk.
I thought that this meant that you couldn’t easily copy it, and ended up coming up with this recipe based on this code I spotted in the BáiZé framework:
1from shutil import COPY_BUFSIZE2
3with open(new_filepath, "wb+") as target_file:4 source_file.seek(0)5 source_read = source_file.read6 target_write = target_file.write7 while True:8 buf = source_read(COPY_BUFSIZE)9 if not buf:10 break11 target_write(buf)COPY_BUFSIZE defined by Python here - it handles the difference in ideal buffer size between Windows and other operating systems:
1COPY_BUFSIZE = 1024 * 1024 if _WINDOWS else 64 * 1024But then I sat down to write this TIL, and stumbled across shutil.copyfileobj(fsrc, fdst) in the standard library which implements the exact same pattern!
1def copyfileobj(fsrc, fdst, length=0):2 """copy data from file-like object fsrc to file-like object fdst"""3 # Localize variable access to minimize overhead.4 if not length:5 length = COPY_BUFSIZE6 fsrc_read = fsrc.read7 fdst_write = fdst.write8 while True:9 buf = fsrc_read(length)10 if not buf:11 break12 fdst_write(buf)So you should use that instead.