Python - Is the file read or remembered?


If I'm copying a file and comparing it back:

    import shutil, filecmp

    # dummy file names, they're not important
    infile = "d:\\some\\path\\file.ext"
    copyfile = "d:\\some\\other\\path\\file_copy.ext"

    # copy file
    shutil.copyfile(infile, copyfile)

    # compare the 2 files byte-for-byte
    if not filecmp.cmp(infile, copyfile, shallow=False):
        print("file not copied correctly")

Why? It seems kind of pointless, doesn't it? After I've copied the file it has to be identical, doesn't it? Wrong! Hard drives have an acceptable error rate that's very small but still present. The only way to be sure is to re-read the file, but it has just been in memory, so how can I be sure that the system (Windows 7) has actually read the file from the media and not just returned the page from standby memory?

Let's assume I've got to write 16 TB of data to removable hard disc drives and I have to be sure that none of the files on the disc are corrupt - or at least no more corrupt than the live files. In 16 TB of disc space there may be a few files that are not identical. Using WinDiff, a byte-for-byte file comparison utility, to check the files is slow, but at least I can be reasonably sure that it's reading the file copied to the disc, as the page should be long gone.
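The idea of checking the copies later, once the cached pages should be long gone, could be sketched by recording a digest of each live file at copy time and comparing the copies against it in a separate pass. This is only a minimal sketch and not part of the code in question: `file_digest`, `build_manifest` and `find_mismatches` are hypothetical helper names, SHA-256 is an arbitrary choice of digest, and it is an assumption that by the time of the later pass the copies really are read from the media rather than from memory.

```python
import hashlib

def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # read until f.read() returns an empty bytes object (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(paths):
    """Map each source path to its digest (hypothetical helper)."""
    return dict((p, file_digest(p)) for p in paths)

def find_mismatches(manifest, path_map):
    """path_map maps source path -> copy path.
    Returns the copy paths whose digest differs from the recorded one."""
    return [copy for src, copy in path_map.items()
            if file_digest(copy) != manifest[src]]
```

The manifest would be built while copying and `find_mismatches` run in the later pass. It only tells you that a copy differs from what the live file contained at copy time, not which of the two is at fault.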

Can anyone offer an expert opinion, based on certainties, on which will happen: read or remember?

It is suspicious that if I copy less than the installed memory, the verification process is much quicker than the copy. It should be quicker, since reading is quicker than writing, but not that quick. If I copy 3 GB of files (I have 32 GB of installed memory) and it takes a minute, then the verification should take 50 seconds or so and there should be 100% disc use on Resource Monitor. It's not: the verification takes less than 10 seconds and Resource Monitor doesn't budge. If I copy more than the installed memory, the verification takes about as long as the copy and Resource Monitor shows 100% - which is what I'd expect! What's happening here?
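The timing observation above can be turned into a rough check by measuring the effective read rate of each comparison. This is a minimal sketch under stated assumptions: `timed_compare` and `looks_cached` are hypothetical helpers, and the 200 MB/s limit is a made-up figure that would have to be replaced with whatever the actual drive can sustain. A rate above the limit only hints that the data came from memory; it proves nothing either way.

```python
import filecmp
import os
import time

# Assumed upper bound on what the target drive can really deliver (hypothetical).
MAX_PLAUSIBLE_MB_PER_S = 200.0

def timed_compare(infile, copyfile):
    """Compare two files byte-for-byte.
    Returns (same, mb_per_s), where mb_per_s is the size of the copy
    divided by the time the comparison took."""
    size = os.path.getsize(copyfile)
    start = time.time()
    same = filecmp.cmp(infile, copyfile, shallow=False)
    elapsed = max(time.time() - start, 1e-9)  # avoid division by zero
    mb_per_s = size / elapsed / (1024.0 * 1024.0)
    return same, mb_per_s

def looks_cached(mb_per_s, limit=MAX_PLAUSIBLE_MB_PER_S):
    """True if the read rate is faster than the drive could plausibly manage."""
    return mb_per_s > limit
```

In the verify loop, `filecmp.cmp(...)` could be swapped for `timed_compare(...)` and any file for which `looks_cached` returns true queued for another look later. The rate is only meaningful for large files; small ones finish too quickly to measure.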

For reference, the real code with error checking removed:

    import shutil, filecmp, os, sys

    fromfolder = sys.argv[1]
    tofolder   = sys.argv[2]

    verifylist = list()
    verifytolist = list()

    bytescopied = 0

    if not os.path.exists(tofolder):
        os.mkdir(tofolder)

    for (path, dirs, files) in os.walk(fromfolder):
        relpath = path[len(fromfolder):len(path)]
        outpath = tofolder + relpath

        if not os.path.exists(outpath):
            os.mkdir(outpath)

        for thisfile in files:
            infile = path + "\\" + thisfile
            copyfile = outpath + "\\" + thisfile

            # human-readable size for the progress message
            bytesize = os.path.getsize(infile)
            if bytesize < 1024:
                repsize = "%d bytes" % bytesize
            elif bytesize < 1048576:
                repsize = "%.1f kb" % (bytesize / 1024)
            elif bytesize < 1073741824:
                repsize = "%.1f mb" % (bytesize / 1048576)
            else:
                repsize = "%.1f gb" % (bytesize / 1073741824)

            print("copy %s > %s " % (repsize, thisfile))

            verifylist.append(infile)
            verifytolist.append(copyfile)

            shutil.copyfile(infile, copyfile)

    # finished copying, verify
    fileindex = range(len(verifylist))
    reverifylist = list()
    reverifytolist = list()

    for thisindex in fileindex:
        infile = verifylist[thisindex]
        copyfile = verifytolist[thisindex]

        thisfile = os.path.basename(infile)
        bytesize = os.path.getsize(infile)

        if bytesize < 1024:
            repsize = "%d bytes" % bytesize
        elif bytesize < 1048576:
            repsize = "%.1f kb" % (bytesize / 1024)
        elif bytesize < 1073741824:
            repsize = "%.1f mb" % (bytesize / 1048576)
        else:
            repsize = "%.1f gb" % (bytesize / 1073741824)

        print("verify %s > %s" % (repsize, thisfile))

        if not filecmp.cmp(infile, copyfile, shallow=False):
            #thisfile = os.path.basename(infile)
            print("file not copied correctly " + thisfile)
            # copy again, second chance
            reverifylist.append(infile)
            reverifytolist.append(copyfile)
            shutil.copyfile(infile, copyfile)

    del verifylist
    del verifytolist

    # re-check the files that were copied a second time
    if len(reverifylist) > 0:
        fileindex = range(len(reverifylist))
        for thisindex in fileindex:
            infile = reverifylist[thisindex]
            copyfile = reverifytolist[thisindex]

            if not filecmp.cmp(infile, copyfile, shallow=False):
                thisfile = os.path.basename(infile)
                print("file failed 2nd chance " + thisfile)

If you use an external hard drive, you can switch off the write cache for that drive.

But you can never be 100% sure, because modern HDDs have internal buffers (SSDs) that buffer transparently - there is no way for the OS to recognize it...
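On the write side, a script can at least ask the operating system to push its own cached data out to the drive before verifying. The following is a minimal sketch, assuming a hypothetical helper `copy_and_flush` used in place of `shutil.copyfile`. It relies on `os.fsync`, which only asks the OS to write out its buffers for that file. It does not evict the pages from memory, so a later read may still be served from the cache, and it cannot see into a drive's internal buffer, which is the limit described above.

```python
import os
import shutil

def copy_and_flush(infile, copyfile, chunk_size=1024 * 1024):
    """Copy infile to copyfile, then ask the OS to flush the copy to the device."""
    with open(infile, "rb") as src, open(copyfile, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
        dst.flush()             # empty Python's own write buffer
        os.fsync(dst.fileno())  # ask the OS to write its cached data for this file
```

Unlike `shutil.copyfile`, this will be slower for many small files, since every file waits for its flush to complete.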

