This paper studies the architectural issues of partial replication in Web clusters in terms of storage capacity, reliability, and performance. Our study shows that partial replication offers much higher storage capacity than full replication, while achieving much higher reliability than full distribution. Furthermore, we demonstrate that a partial replication architecture using a small number of replicas per file can provide global reliability equivalent to that of a fully replicated solution. To support this architecture, we have modified a typical Web switch by adding new modules that, among other tasks, keep track of content allocation to nodes, dispatch requests, and synchronize state among several switches. To improve replica allocation in Web clusters, we propose a greedy algorithm based on the relative importance of contents, capable of handling clusters with heterogeneous storage capacity. For request dispatching, we present the PLARD algorithm, which takes into account both the request locality of nodes and the fact that each file is present in only a few nodes. We define a novel storage improvement efficiency metric that measures the effective storage capacity obtained when new storage resources are added to a Web cluster. We also present an analytic model for the reliability of Web clusters. Combining the storage metric and the reliability model, we provide a design criterion for selecting the configuration of a partially replicated Web cluster in terms of replication degree, reliability, and storage capacity. We have conducted a performance evaluation through simulation using widely accepted parameters. Our results show no evidence that our solution imposes a performance penalty, while it offers advantages in reliability and storage capacity.
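As an illustration of importance-driven greedy allocation on nodes with heterogeneous storage, the following minimal sketch places files in decreasing order of importance. It is not the algorithm proposed in this paper: the function name `greedy_allocate`, the fixed replica count per file, and the rule of preferring the nodes with the most free space are all assumptions made for the example.

```python
def greedy_allocate(files, capacities, replicas):
    """Hypothetical greedy replica placement (illustrative only).

    files:      dict mapping file name -> (size, importance)
    capacities: dict mapping node name -> free storage, same unit as size
    replicas:   assumed target number of copies per file
    Returns a dict mapping file name -> list of nodes holding a copy.
    """
    free = dict(capacities)  # work on a copy; the caller's dict is untouched
    placement = {}
    # Visit the most important files first (ties broken by name).
    for name in sorted(files, key=lambda f: (-files[f][1], f)):
        size = files[name][0]
        # Candidate nodes have enough room; prefer those with most free space.
        candidates = sorted(
            (n for n in free if free[n] >= size),
            key=lambda n: (-free[n], n),
        )
        chosen = candidates[:replicas]  # may be fewer than requested, or empty
        for node in chosen:
            free[node] -= size
        placement[name] = chosen
    return placement


# Example with three nodes of different (made-up) capacities.
example_files = {"a": (4, 0.6), "b": (3, 0.3), "c": (5, 0.1)}
example_nodes = {"n1": 10, "n2": 6, "n3": 4}
example_placement = greedy_allocate(example_files, example_nodes, replicas=2)
```

In this sketch, less important files may receive fewer replicas, or none, once storage runs out. This is one possible reading of allocation by relative importance, and other policies are equally plausible.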