performance - Slow down when executing multiple Racket programs -


I have a long-running Racket program. Executing many instances of the same program helps find the answer faster (it depends on randomness), so I execute 10 instances of the same program from the command line on a 24-core machine. The average throughput when executing 1 instance (on 1 core) is 500 iterations/s. The average throughput when executing 10 instances (on 10 cores) goes down to 100 iterations/s per core. I expected to see similar throughput per core, because each execution does not interact with the others at all. Has anyone else experienced this behavior? What is happening? How can I fix this?

--------------------------- additional information -----------------------------

OS: Ubuntu 13.10, cores: 24

Each instance writes to its own output file. Approximately once per minute, each instance replaces that same output file with an updated result of about 10 lines of text. So I don't think I am hitting an I/O bound.

According to top, each core uses 1.5-2.5% of memory. When running on 10 cores, 16 GB is used and 9 GB is free. When nothing is running, 11 GB is used and 14 GB is free.

There are no network requests.

The following are the values of (current-memory-use) divided by 1,000,000, sampled over 12 minutes on 3 of the 10 cores (in MB):

  • core 3: 313, 48, 73, 154, 292, 242
  • core 4: 56, 245, 261, 106, 229, 190
  • core 6: 55, 238, 66, 229, 275, 207

When I run (current-memory-use) without anything else, it returns 29 MB.

I found the issue. My program indeed used a lot of memory. Therefore, when I'm running multiple instances at the same time, either the data can't fit in the cache (probably L3) or it exceeds the memory bandwidth.

I tried to discover the source of the problem, that is, why my program used so much memory. By putting (current-memory-use) at many places in my program, I found that the issue was arithmetic-shift. Because of this one operation, memory usage somehow doubled immediately.
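To illustrate the kind of checkpointing described above, here is a minimal sketch. The helpers `memory-mb` and `memory-growth-mb` are hypothetical names made up for this example, not part of my program or of Racket; only `current-memory-use` and `collect-garbage` are real Racket functions. The readings are approximate, since the garbage collector can run at any point.

```racket
#lang racket/base

;; Hypothetical helper: current memory use in MB (bytes / 1,000,000),
;; the same unit used for the measurements listed earlier.
(define (memory-mb)
  (quotient (current-memory-use) 1000000))

;; Hypothetical helper: run a thunk and report how much memory use grew.
;; Collecting garbage first makes the "before" reading less noisy.
(define (memory-growth-mb thunk)
  (collect-garbage)
  (define before (memory-mb))
  (define result (thunk))
  (define after (memory-mb))
  (values result (- after before)))

;; Example checkpoint around a suspicious piece of code
;; (here: building a list of one million exact integers).
(define-values (data growth)
  (memory-growth-mb
   (lambda () (build-list 1000000 (lambda (i) (* i i))))))

(printf "built ~a items, memory grew by about ~a MB\n" (length data) growth)
```

Wrapping one suspicious operation at a time in this way is roughly how the search can be narrowed down to a single call.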

The problem occurred when executing (arithmetic-shift x y) when x is big and y is positive. In that case, I believe the result is represented as a "bignum" (boxed, heap-allocated) instead of a "fixnum" (unboxed).
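A small sketch of this effect, using made-up values rather than the ones from my program: in Racket, an exact integer that no longer fits in the fixnum range is stored as a heap-allocated arbitrary-precision integer (a bignum). The exact fixnum width depends on the platform and Racket implementation, so the comments below assume a 64-bit build.

```racket
#lang racket/base

;; Illustrative values only (assumptions, not taken from the real program).
(define small-x 12345)
(define big-x (expt 2 40))
(define y 30)

;; 12345 * 2^30 is below 2^44, which fits in a fixnum on a 64-bit build.
(define small-shifted (arithmetic-shift small-x y))

;; 2^40 * 2^30 = 2^70 is beyond the fixnum range, so it is heap-allocated.
(define big-shifted (arithmetic-shift big-x y))

(printf "small shift is a fixnum? ~a\n" (fixnum? small-shifted))
(printf "big shift is a fixnum?   ~a\n" (fixnum? big-shifted))
```

Every such oversized intermediate result is an allocation, which is consistent with the jump in memory use seen at that call.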

Even though I masked the result to 32 bits later, the large intermediate value prevented Racket from optimizing that (first-order functions). I fixed it by masking x before passing it to arithmetic-shift, such that the result is never greater than a 32-bit number, and that fixed the problem. Now my program uses 80 MB instead of 300 MB, and I get the speed I expect!
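The fix can be sketched as follows. The function names `shift-then-mask` and `mask-then-shift` and the constant `mask32` are hypothetical, chosen for this example; the real code looked different. The sketch assumes 0 <= y <= 32: keeping only the low (32 - y) bits of x before shifting yields the same 32-bit result, but no intermediate value ever exceeds 32 bits.

```racket
#lang racket/base

(define mask32 #xFFFFFFFF)

;; Before: shift first, mask later. The intermediate value can leave the
;; fixnum range, in which case a bignum is allocated and then thrown away.
(define (shift-then-mask x y)
  (bitwise-and (arithmetic-shift x y) mask32))

;; After: mask first. Only the low (32 - y) bits of x can survive the
;; final 32-bit mask, so drop the rest before shifting.
;; Assumes 0 <= y <= 32.
(define (mask-then-shift x y)
  (arithmetic-shift (bitwise-and x (arithmetic-shift mask32 (- y))) y))

;; Illustrative input: 2^40 + 5 shifted left by 30 bits.
(define x (+ (expt 2 40) 5))
(define y 30)

(printf "shift-then-mask: ~a\n" (shift-then-mask x y))
(printf "mask-then-shift: ~a\n" (mask-then-shift x y))
```

Both calls print 1073741824 for this input; the difference is only in the size of the intermediate value.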

