The shared memory programming model on top of a physically distributed memory machine (SDMM) is a promising candidate for easy-to-program general-purpose parallel computation. There are, however, certain open technical problems that must be adequately solved before SDMM can meet these expectations. Among them is the low-level structure of the memory system: most academic studies of the subject assume unrealistically ideal memory properties and completely ignore, for example, the speed difference between processors and memories. In this paper we propose three memory module structures based on low-level interleaving and caching to solve this speed difference problem. We evaluate these structures, along with three reference solutions, by determining the overall cost factor of memory references relative to an ideal SDMM, using both generic traffic and real parallel programs. According to our evaluation, a cost factor of less than two is achieved with an interleaved solution using modest queues, if a proper amount of parallelism is available. Moreover, caching increases the speed of the memory if the caches are placed in the memory modules rather than next to the processors, but it provides much lower throughput than interleaving.
|Title of host publication||Proceedings of the International Conference on Advances in Infrastructure for Electronic Business, Education, Science, and Medicine on the Internet|
|Subtitle of host publication||L'Aquila, Italy, 21-27 January 2002|
|Publisher||Scuola Superiore Guglielmo Reiss Romoli|
|Publication status||Published - 2002|
|MoE publication type||A4 Article in a conference publication|