System design is one of the most fascinating topics in software engineering. What makes it so interesting is that there are no right or wrong design choices; there are only tradeoffs. For readers interested in the tradeoffs and design choices behind large-scale systems, we have collected links to the original papers for several such systems.
These papers will help you improve your system design skills and give you a peek into how real-world systems that reliably handle data at massive scale are designed.
DynamoDB
DynamoDB is a highly available, fully managed NoSQL database service developed by AWS. It is used by many services within Amazon and by various other companies that handle data at massive scale.
Its original paper can be downloaded here
Consistent Hashing
Consistent hashing is a distributed hashing technique for assigning keys to nodes in a distributed hash table. Its primary benefit is that key placement does not depend directly on the number of servers in the system, so adding or removing a server remaps only a small fraction of the keys.
Its original paper can be downloaded from here
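To make the remapping property of consistent hashing concrete, here is a minimal sketch of a hash ring. It is not taken from the paper: the `HashRing` class, the `vnodes` parameter, and the use of MD5 as the ring hash are all illustrative assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes      # ring points per server (assumed default)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        # Place several points per server so load spreads more evenly.
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

With this sketch, adding a server to an existing ring changes the owner only for keys that now land on the new server's points; every other key keeps its previous owner. A plain `hash(key) % num_servers` scheme, by contrast, would reassign most keys whenever the server count changes.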
Hadoop
Hadoop is an open-source framework for distributed storage and processing. Its storage layer, the Hadoop Distributed File System (HDFS), can handle petabytes of data reliably and economically.
Its original paper can be found here
Gossip Protocol
A gossip protocol (sometimes known as an epidemic protocol) is a peer-to-peer communication protocol used in distributed systems. It ensures that information spreads throughout the group without the need for a central authority or server.
Its original paper can be found here
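The way gossip spreads information without a coordinator can be illustrated with a small simulation. This is only a sketch of simple push-style gossip under assumed parameters; the function name `gossip_rounds`, the `fanout` value, and the synchronous round model are illustrative and not taken from the paper.

```python
import random


def gossip_rounds(num_nodes, fanout=2, seed=0):
    """Simulate push gossip and return the rounds needed to inform every node.

    Node 0 starts with the information. In each round, every informed node
    forwards it to `fanout` peers chosen uniformly at random (assumed model).
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for _node in sorted(informed):
            # A node may pick itself or an already informed peer; that
            # message is simply wasted, as in real gossip.
            peers = rng.sample(range(num_nodes), k=min(fanout, num_nodes))
            newly_informed.update(peers)
        informed |= newly_informed
        rounds += 1
    return rounds
```

Calling `gossip_rounds(1000)` and comparing it with `gossip_rounds(100)` gives a feel for how slowly the round count grows with cluster size, since the informed set can multiply every round.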
Paxos
Paxos is a family of algorithms for reaching consensus among the nodes of a distributed system.
Its paper can be found here
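As a rough illustration of how consensus is reached, the sketch below shows the two phases of single-value Paxos (prepare/promise, then accept). It is heavily simplified: everything runs in one process with direct method calls, there are no message losses or crashes, and the names `Acceptor` and `propose` are hypothetical rather than taken from the paper.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1     # highest proposal number promised so far
        self.accepted = None   # (number, value) of the last accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposer round; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, the proposer must adopt the
    # one with the highest proposal number instead of its own value.
    prior = [accepted for accepted in promises if accepted is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= majority else None
```

In this sketch, once a value has been chosen, a later proposer with a higher number ends up re-proposing that same value, and a proposer with a stale number is rejected. That is the safety property Paxos is built around.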
ZooKeeper
ZooKeeper is a highly available, distributed, hierarchical key-value data store.
The original ZooKeeper paper can be found here