TY - GEN
T1 - C3
T2 - 12th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2015
AU - Suresh, Lalith
AU - Canini, Marco
AU - Schmid, Stefan
AU - Feldmann, Anja
N1 - Publisher Copyright: © 2015 by The USENIX Association. All Rights Reserved.
PY - 2015
Y1 - 2015
AB - Achieving predictable performance is critical for many distributed applications, yet difficult to achieve due to many factors that skew the tail of the latency distribution even in well-provisioned systems. In this paper, we present the fundamental challenges involved in designing a replica selection scheme that is robust in the face of performance fluctuations across servers. We illustrate these challenges through performance evaluations of the Cassandra distributed database on Amazon EC2. We then present the design and implementation of an adaptive replica selection mechanism, C3, that is robust to performance variability in the environment. We demonstrate C3's effectiveness in reducing the latency tail and improving throughput through extensive evaluations on Amazon EC2 and through simulations. Our results show that C3 significantly improves the latencies along the mean, median, and tail (up to 3 times improvement at the 99.9th percentile) and provides higher system throughput.
UR - http://www.scopus.com/inward/record.url?scp=84967020037&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84967020037
T3 - Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2015
SP - 513
EP - 527
BT - Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2015
PB - USENIX
Y2 - 4 May 2015 through 6 May 2015
ER -