python - Is the file read or remembered?
If I am copying a file and then comparing it back:
    import shutil, filecmp

    # dummy file names, they're not important
    infile = "d:\\some\\path\\file.ext"
    copyfile = "d:\\some\\other\\path\\file_copy.ext"

    # copy file
    shutil.copyfile(infile, copyfile)

    # compare 2 files
    if not filecmp.cmp(infile, copyfile, shallow=False):
        print("file not copied correctly")
Why? Seems kind of pointless, doesn't it? After I've copied the file it has to be identical, doesn't it? Wrong! Hard drives have an acceptable error rate that's very small but still present. The only way to be sure is to re-read the file, but since it has just been in memory, how can I be sure the system (Windows 7) has actually read the file from the media and not just returned the page from standby memory?
Let's assume I've got to write 16 TB of data to removable hard disc drives and I have to be sure none of the files on the disc are corrupt, or at least no more corrupt than the live files. In 16 TB of disc space there are going to be a few files that are not identical. Using WinDiff to check the files with a byte-for-byte file comparison utility is slow, but at least I can be reasonably sure it's reading the file from the copied disc, as the page should be long gone.
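If the byte-for-byte pass is too slow at this scale, one alternative (not something the script in this question does) is to hash the source while it is being copied and later re-read only the copy. The sketch below assumes two hypothetical helpers, `copy_with_hash` and `hash_file`. It halves the reads needed for verification, but on its own it does nothing to force the re-read to come from the disc rather than from cached pages.

```python
import hashlib


def copy_with_hash(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks and return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)   # hash what was read from the source
            fout.write(chunk)      # and write the same bytes to the copy
    return digest.hexdigest()


def hash_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

A copy would then count as verified when `hash_file(copyfile)` equals the value returned earlier by `copy_with_hash(infile, copyfile)`.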
Can anyone offer an expert opinion, based on certainties, on which will happen: read or remember?
It is suspicious: if I copy less than the installed memory, the verification process is quicker than the copy. It should be, since reading is quicker than writing, but not that quick. If I copy 3 GB of files (I have 32 GB of installed memory) and it takes a minute, then the verification should take 50 seconds or so and should show 100% disc use in Resource Monitor. It doesn't: the verification takes less than 10 seconds and Resource Monitor doesn't budge. If I copy more than the installed memory, the verification takes as long as the copy and Resource Monitor shows 100%, which is what I'd expect! What's happening here?
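One rough way to put numbers on this suspicion is to time the re-read and compare the throughput with what the drive can physically deliver. The sketch below is only an illustration: `measure_read_throughput`, `looks_cached` and the `plausible_mb_per_s` threshold are hypothetical, and the default of 150 MB/s is an assumed figure that you would replace with your own drive's sustained read rate.

```python
import os
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read the whole file and return (bytes_read, seconds, mb_per_second)."""
    bytes_read = 0
    start = time.time()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            bytes_read += len(chunk)
    elapsed = time.time() - start
    # Guard against a zero interval on very small files or coarse clocks.
    mb_per_second = (bytes_read / 1048576.0) / max(elapsed, 1e-9)
    return bytes_read, elapsed, mb_per_second


def looks_cached(mb_per_second, plausible_mb_per_s=150.0):
    """Assumed heuristic: a rate well above the drive's rating suggests RAM."""
    return mb_per_second > plausible_mb_per_s
```

With the figures from the question, 3 GB in 60 seconds is about 51 MB/s, while 3 GB in under 10 seconds is more than 307 MB/s. Under the assumed threshold, the first rate is plausible for a disc and the second is flagged as probably served from memory.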
For reference, here is the real code with error checking removed:
    import shutil, filecmp, os, sys

    fromfolder = sys.argv[1]
    tofolder = sys.argv[2]

    verifylist = list()
    verifytolist = list()
    bytescopied = 0

    if not os.path.exists(tofolder):
        os.mkdir(tofolder)

    # copy every file, remembering each source/destination pair
    for (path, dirs, files) in os.walk(fromfolder):
        relpath = path[len(fromfolder):len(path)]
        outpath = tofolder + relpath
        if not os.path.exists(outpath):
            os.mkdir(outpath)
        for thisfile in files:
            infile = path + "\\" + thisfile
            copyfile = outpath + "\\" + thisfile
            bytesize = os.path.getsize(infile)
            # float divisors keep the decimal place under Python 2 as well
            if bytesize < 1024:
                repsize = "%d bytes" % bytesize
            elif bytesize < 1048576:
                repsize = "%.1f kb" % (bytesize / 1024.0)
            elif bytesize < 1073741824:
                repsize = "%.1f mb" % (bytesize / 1048576.0)
            else:
                repsize = "%.1f gb" % (bytesize / 1073741824.0)
            print("copy %s > %s " % (repsize, thisfile))
            verifylist.append(infile)
            verifytolist.append(copyfile)
            shutil.copyfile(infile, copyfile)

    # finished copying, verify
    fileindex = range(len(verifylist))
    reverifylist = list()
    reverifytolist = list()

    for thisindex in fileindex:
        infile = verifylist[thisindex]
        copyfile = verifytolist[thisindex]
        thisfile = os.path.basename(infile)
        bytesize = os.path.getsize(infile)
        if bytesize < 1024:
            repsize = "%d bytes" % bytesize
        elif bytesize < 1048576:
            repsize = "%.1f kb" % (bytesize / 1024.0)
        elif bytesize < 1073741824:
            repsize = "%.1f mb" % (bytesize / 1048576.0)
        else:
            repsize = "%.1f gb" % (bytesize / 1073741824.0)
        print("verify %s > %s" % (repsize, thisfile))
        if not filecmp.cmp(infile, copyfile, shallow=False):
            #thisfile = os.path.basename(infile)
            print("file not copied correctly " + thisfile)
            # copy, second chance
            reverifylist.append(infile)
            reverifytolist.append(copyfile)
            shutil.copyfile(infile, copyfile)

    del verifylist
    del verifytolist

    # verify the files that were copied a second time
    if len(reverifylist) > 0:
        fileindex = range(len(reverifylist))
        for thisindex in fileindex:
            infile = reverifylist[thisindex]
            copyfile = reverifytolist[thisindex]
            if not filecmp.cmp(infile, copyfile, shallow=False):
                thisfile = os.path.basename(infile)
                print("file failed 2nd chance " + thisfile)
If you use an external hard drive, you can switch off the write cache for that drive.
But you can never be 100% sure, because modern HDDs (and SSDs) have internal buffers that buffer transparently, and there is no way for the OS to recognize it...
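On the write side, a program can at least ask the operating system to push its own buffers to the device before verification starts. This is a minimal sketch, assuming a hypothetical helper named `copy_and_flush`. `os.fsync` only covers the OS cache, so it does not address the drive's internal buffer described above, and it does not stop a later read from being served from memory.

```python
import os
import shutil


def copy_and_flush(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst, then ask the OS to commit dst to the device."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()             # empty Python's userspace buffer
        os.fsync(fout.fileno())  # ask the OS to write its buffers out
```

In the question's script this would stand in for the `shutil.copyfile(infile, copyfile)` call, at the cost of a slower copy because every file waits for the flush.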