Network multicomputing using recoverable distributed shared memory

John B. Carter*, Alan L. Cox, Sandhya Dwarkadas, Elmootazbellah Elnozahy, David B. Johnson, Pete Keleher, Steven Rodriguez, Weimin Yu, Willy Zwaenepoel

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations


A network multicomputer is a multiprocessor in which the processors are connected by general-purpose networking technology, in contrast to current distributed memory multiprocessors where a dedicated special-purpose interconnect is used. The advent of high-speed general-purpose networks provides the impetus for a new look at the network multiprocessor model, by removing the bottleneck of current slow networks. However, major software issues remain unsolved. A convenient machine abstraction must be developed that hides from the application programmer low-level details such as message passing or machine failures. We use distributed shared memory as a programming abstraction, and rollback recovery through consistent checkpointing to provide fault tolerance. Measurements of our implementations of distributed shared memory and consistent checkpointing show that these abstractions can be implemented efficiently.

Original language: English (US)
Title of host publication: 1993 IEEE Compcon Spring
Publisher: Publ by IEEE
Number of pages: 9
ISBN (Print): 0780312945
State: Published - Jan 1 1993
Event: 38th Annual IEEE Computer Society International Computer Conference - COMPCON SPRING '93 - San Francisco, CA, USA
Duration: Feb 22 1993 - Feb 26 1993

Publication series

Name: 1993 IEEE Compcon Spring

Other: 38th Annual IEEE Computer Society International Computer Conference - COMPCON SPRING '93
City: San Francisco, CA, USA

ASJC Scopus subject areas

  • Engineering(all)

