TY - JOUR
T1 - Efficient algorithms for selection and sorting of large distributed files on de bruijn and hypercube structures
AU - Wei, David S.L.
AU - Rajasekaran, Sanguthevar
AU - Naik, Kshirasagar
AU - Kuo, Sy Yen
PY - 2003
Y1 - 2003
N2 - In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we apply sampling techniques in the design of selection algorithms on the hypercube and de Bruijn networks, and show that the message complexity of selecting an item from a set (file) is less sensitive to the cardinality of the set (file). Given a file with n keys, our algorithm performs a selection on a p-node de Bruijn network or hypercube using only O(p log log n) messages and suffering a delay of O(τ log p log log n), with high probability. Our selection scheme outperforms the existing approaches in terms of both message complexity and communication delay. Because of the lesser sensitivity of message complexity and communication delay of our algorithms to the file size, our distributed selection schemes are very attractive in applications where very large database systems are involved. Using our selection algorithms, we also show that both quicksort-based sorting scheme and enumeration sorting scheme can be developed for sorting large distributed files on the hypercube and de Bruijn networks. Both of our sorting algorithms outperform the existing distributed sorting schemes in terms of both message complexity and communication delay.
AB - In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we apply sampling techniques in the design of selection algorithms on the hypercube and de Bruijn networks, and show that the message complexity of selecting an item from a set (file) is less sensitive to the cardinality of the set (file). Given a file with n keys, our algorithm performs a selection on a p-node de Bruijn network or hypercube using only O(p log log n) messages and suffering a delay of O(τ log p log log n), with high probability. Our selection scheme outperforms the existing approaches in terms of both message complexity and communication delay. Because of the lesser sensitivity of message complexity and communication delay of our algorithms to the file size, our distributed selection schemes are very attractive in applications where very large database systems are involved. Using our selection algorithms, we also show that both quicksort-based sorting scheme and enumeration sorting scheme can be developed for sorting large distributed files on the hypercube and de Bruijn networks. Both of our sorting algorithms outperform the existing distributed sorting schemes in terms of both message complexity and communication delay.
KW - de Bruijn network
KW - Distributed selection
KW - distributed sorting
KW - hypercube
KW - large distributed files
UR - http://www.scopus.com/inward/record.url?scp=60749121313&partnerID=8YFLogxK
U2 - 10.1142/S0129054103002229
DO - 10.1142/S0129054103002229
M3 - 文章
AN - SCOPUS:60749121313
SN - 0129-0541
VL - 14
SP - 1129
EP - 1146
JO - International Journal of Foundations of Computer Science
JF - International Journal of Foundations of Computer Science
IS - 6
ER -