performance - Slow down when executing multiple Racket programs
I have a long-running Racket program. Executing many instances of the same program finds an answer faster (it depends on randomness), so I execute 10 instances of the same program from the command line on a 24-core machine. The average throughput when executing 1 instance (on 1 core) is 500 iterations/s. The average throughput when executing 10 instances (on 10 cores) goes down to 100 iterations/s per core. I expected to see similar throughput per core, because the executions do not interact with each other at all. Has anyone else experienced this behavior? What is happening? How can I fix this?
--------------------------- additional information -----------------------------
OS: Ubuntu 13.10. Cores: 24.
Each instance writes to its own output file. Approximately once per minute, each instance replaces that same output file with an updated result of about 10 lines of text. So I don't think I am hitting an I/O bound.
According to top, each core uses 1.5-2.5% of memory. When running on 10 cores, 16 GB is used and 9 GB is free. When nothing is running, 11 GB is used and 14 GB is free.
There are no network requests.
The following are the values of (current-memory-use) divided by 1,000,000, sampled over 12 minutes on 3 of the 10 cores (in MB).
- core 3: 313, 48, 73, 154, 292, 242
- core 4: 56, 245, 261, 106, 229, 190
- core 6: 55, 238, 66, 229, 275, 207
When I run (current-memory-use) without anything else, it returns 29 MB.
I found the issue. My program indeed used a lot of memory. Therefore, when I'm running multiple instances at the same time, either the data can't fit in cache (probably L3) or it exceeds the memory bandwidth.
I tried to discover the source of the problem, that is, why my program used so much memory. By putting (current-memory-use) at many places in my program, I found that the issue was arithmetic-shift. Because of that one operation, the memory usage somehow doubled immediately.
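The probing approach described above can be sketched roughly as follows. The helper names `memory-use-mb` and `with-memory-probe` are hypothetical (they are not from the original program), and the loop is only a stand-in for the real workload. Readings from (current-memory-use) fluctuate with garbage collection, so a single before/after pair is a hint rather than a measurement.

```racket
#lang racket/base

;; Current memory use in whole megabytes, i.e. bytes divided by
;; 1,000,000 as in the figures quoted in the question.
(define (memory-use-mb)
  (quotient (current-memory-use) 1000000))

;; Hypothetical helper: run `thunk`, print the memory reading taken
;; before and after it to stderr, and return the thunk's result.
(define (with-memory-probe label thunk)
  (define before (memory-use-mb))
  (define result (thunk))
  (define after (memory-use-mb))
  (eprintf "~a: ~a MB -> ~a MB\n" label before after)
  result)

;; Stand-in workload: a loop that shifts and then masks to 32 bits.
(with-memory-probe "shift loop"
  (lambda ()
    (for/fold ([acc 0]) ([i (in-range 100000)])
      (bitwise-and (arithmetic-shift (+ acc i) 3) #xFFFFFFFF))))
```

Wrapping progressively smaller pieces of the program in such a probe is one way to narrow the growth down to a single operation.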
The problem occurred when executing (arithmetic-shift x y) when x is big and y is positive. In that case, I believe the result no longer fits in a "fixnum" (unboxed) and is represented as a heap-allocated "bignum" (boxed) instead.
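Whether a value is still an unboxed fixnum can be checked directly with `fixnum?`. The sketch below uses illustrative values for `x` and `y` (they are assumptions, not taken from the original program): `x` is small enough to be a fixnum on any Racket build, while the shifted value is about 2^69, which is beyond the fixnum range on 64-bit platforms and is therefore heap-allocated.

```racket
#lang racket/base

;; Illustrative operands (assumptions, not from the original program).
(define x #x1FFFFFFF) ; 2^29 - 1
(define y 40)

;; The left shift produces an exact integer of roughly 2^69.
(define shifted (arithmetic-shift x y))

(fixnum? x)        ; #t
(fixnum? shifted)  ; #f: the intermediate result is heap-allocated

;; Masking afterwards yields a small number again, but only after the
;; large intermediate value has already been created.
(fixnum? (bitwise-and shifted #xFFFFFFFF)) ; #t
```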
Even though I masked the result to 32 bits later, that did not help: it prevented Racket from optimizing the operation, because these are first-order functions. I fixed it by masking x before passing it to arithmetic-shift, such that the result is never greater than a 32-bit number, and that fixed the problem. Now my program uses 80 MB instead of 300 MB, and I get the speed I expect!
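A minimal sketch of the mask-before-shift fix, assuming a left shift by `y` with 0 <= y <= 32 and a 32-bit result. The function names are hypothetical, and the original program's exact masking may differ. Both functions return the same value; the second keeps only the low (32 - y) bits of `x` first, so the shifted value never exceeds 32 bits.

```racket
#lang racket/base

(define mask32 #xFFFFFFFF)

;; Shift first, mask afterwards: the intermediate (arithmetic-shift x y)
;; can outgrow the fixnum range when x is large.
(define (shl32/mask-after x y)
  (bitwise-and (arithmetic-shift x y) mask32))

;; Mask first, shift afterwards: keep only the low (32 - y) bits of x,
;; so the shifted result always fits in 32 bits. Assumes 0 <= y <= 32.
(define (shl32/mask-before x y)
  (define keep (sub1 (arithmetic-shift 1 (- 32 y))))
  (arithmetic-shift (bitwise-and x keep) y))

;; Both variants agree on a sample input.
(= (shl32/mask-after #x123456789ABCDEF 12)
   (shl32/mask-before #x123456789ABCDEF 12)) ; #t
```

The two forms are equivalent because the low 32 bits of x shifted left by y depend only on the low (32 - y) bits of x.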