Python - Memory issues with big NumPy/SciPy arrays


I have the code snippet below:

data/imat are data matrices of size 100000 x 500, while the matrix s I'm constructing is of order 50000 x 100000. The matrix s is super sparse, with only 1 entry in each column.

    import numpy as np
    from scipy import sparse

    def getsparsecoverr(imat, sketch):
        # relative spectral-norm error between imat^T imat and sketch^T sketch
        ata = np.dot(imat.transpose(), imat)
        btb = sketch.transpose().dot(sketch)
        fn = np.linalg.norm(imat, 'fro') ** 2
        val = np.linalg.norm(ata - btb, 2) / fn
        del ata
        del btb
        return val

    # data, noofsamples, eps and delta are defined elsewhere
    nrows, ncols = data.shape
    samples = noofsamples(ncols, eps, delta)

    cols = np.arange(nrows)
    rows = np.random.random_integers(samples - 1, size = nrows)
    # one random +1/-1 entry per column of s
    diag = []
    for i in range(len(cols)):
        if np.random.random() < 0.5:
            diag.append(1)
        else:
            diag.append(-1)
    s = sparse.csc_matrix((diag, (rows, cols)), shape = (samples, nrows))/np.sqrt(samples)
    q = s.dot(data)
    q = sparse.bsr_matrix(q)

    print getsparsecoverr(data, q)

When I run the above code for the first time, it gives me the output of the print statement. After that, if I run it again, I get the error below:

python: malloc.c:2369: sysmalloc: assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed. 

Then, if I run it once again, I get an error like:

    q = sparse.bsr_matrix(q)
      File "/usr/lib64/python2.7/site-packages/scipy/sparse/bsr.py", line 170, in __init__
        arg1 = coo_matrix(arg1, dtype=dtype).tobsr(blocksize=blocksize)
      File "/usr/lib64/python2.7/site-packages/scipy/sparse/coo.py", line 186, in __init__
        self.data  = m[self.row, self.col]
    IndexError: index -1517041769959067988 is out of bounds for axis 0 with size 178133
    None

It seems to me that the first run is creating some memory issues. How can I debug this, and what are the possible problems and solutions?

Would this work?

    def getsparsecoverr(imat, sketch):
        # ord=2 (spectral norm) kept from the original function
        return np.linalg.norm(np.dot(imat.transpose(), imat) - sketch.transpose().dot(sketch), 2) / (np.linalg.norm(imat, 'fro') ** 2)

    def getq(data, rows, cols, diag, samples, nrows):
        # build s and q inside the function so the temporaries go out of scope on return
        return sparse.bsr_matrix((sparse.csc_matrix((diag, (rows, cols)), shape = (samples, nrows))/np.sqrt(samples)).dot(data))

    print getsparsecoverr(data, getq(data, rows, cols, diag, samples, nrows))

That is, trying to keep as many things out of scope as possible. I might have a parenthesis wrong, since it's hard to test without the rest of your functions.
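Since the scoped version is hard to test without the missing pieces, here is a small-scale sketch that runs on its own. It is not the original code: it assumes Python 3 with a recent NumPy/SciPy, uses made-up sizes (200 x 10 data, 50 samples) in place of the real matrices and of `noofsamples`, and the names `get_sparse_cov_err` and `get_q` are illustrative only. The sign vector is drawn in one call instead of a Python loop.

```python
import numpy as np
from scipy import sparse


def get_sparse_cov_err(imat, sketch):
    # Spectral norm of (imat^T imat - sketch^T sketch),
    # scaled by the squared Frobenius norm of imat.
    ata = np.dot(imat.T, imat)
    btb = sketch.T.dot(sketch)
    if sparse.issparse(btb):
        btb = btb.toarray()  # small (ncols x ncols), safe to densify
    return np.linalg.norm(ata - btb, 2) / (np.linalg.norm(imat, 'fro') ** 2)


def get_q(data, samples, rng):
    # s has exactly one +1/-1 entry per column, in a random row.
    nrows = data.shape[0]
    cols = np.arange(nrows)
    rows = rng.integers(0, samples, size=nrows)
    diag = rng.choice([-1.0, 1.0], size=nrows)
    s = sparse.csc_matrix((diag, (rows, cols)), shape=(samples, nrows)) / np.sqrt(samples)
    # s and the index arrays go out of scope when the function returns.
    return sparse.bsr_matrix(s.dot(data))


# Hypothetical small inputs standing in for the real 100000 x 500 data.
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 10))
err = get_sparse_cov_err(data, get_q(data, samples=50, rng=rng))
print(err)
```

If this runs cleanly in a loop at small sizes but the full-size run still fails, that points at memory pressure rather than at the logic itself.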

If not, I assume that one of the functions is changing state or storing data.

Using your original code, and given that you use IPython, you can do the following:

    In [5]: %%bash
    ps -e -orss=,args= | sort -b -k1,1n | pr -TW$COLUMNS | tail -n 10

to monitor the allocation of memory at each step of the code and nail down the problem.
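Outside IPython, a similar per-step check can be sketched from within the script itself. The following is only one possible approach, not the method above: it assumes a Unix-like system (the standard-library `resource` module is not available on Windows), and the helper names `peak_rss_mb` and `report` are made up for illustration. It reports the peak resident set size of the current process, so the number can only grow between steps.

```python
import resource
import sys


def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return peak / (1024.0 ** 2)
    return peak / 1024.0


def report(label):
    # Print and return the peak memory seen so far, tagged with a step label.
    mb = peak_rss_mb()
    print("%-20s peak RSS: %.1f MB" % (label, mb))
    return mb


before = report("start")
buf = bytearray(50 * 1024 * 1024)  # stand-in for a real step of the computation
after = report("after allocation")
```

Calling `report` after each line of the original script (building `s`, computing `q`, converting to BSR, computing the error) would show which step causes the largest jump.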

