Exploiting local data in parallel array I/O on a practical network of workstations

Y. Cho*, M. Winslett, M. Subramaniam, Y. Chen, S. Kuo, K. E. Seamons

*Corresponding author for this work

Research output: Contribution to conference > Conference Paper > peer-review

17 Scopus citations

Abstract

A cost-effective way to run a parallel application is to use existing workstations connected by a local area network such as Ethernet or FDDI. In this paper, we present an approach for parallel I/O of multidimensional arrays on small networks of workstations with a shared-media interconnect, using the Panda I/O library. In such an environment, the message-passing throughput per node is lower than the throughput obtainable from a fast disk, and it is not easy for users to determine the configuration that will yield the best I/O performance. We introduce an I/O strategy that exploits local data to reduce the amount of data that must be shipped across the network, present experimental results, analyze them using an analytical performance model, and predict the best choice of I/O parameters. Our experiments show that the new strategy speeds up response time by a factor of 1.2 to 2.1 compared to the Panda version originally developed for the IBM SP2, depending on the array sizes, distributions, and compute and I/O node meshes. Further, the performance model predicts the results within a 13% margin of error.

Original language: English
Pages: 1-13
Number of pages: 13
State: Published - 1997
Externally published: Yes
Event: Proceedings of the 1997 5th Workshop on I/O in Parallel and Distributed Systems - San Jose, CA, USA
Duration: 17 Nov 1997 to 17 Nov 1997

Conference

Conference: Proceedings of the 1997 5th Workshop on I/O in Parallel and Distributed Systems
City: San Jose, CA, USA
Period: 17/11/97 to 17/11/97
