The performance and scalability of bus-based shared-memory multiprocessors are limited by the amount of bus traffic, most of which is generated by cache coherence overhead. Part of this overhead is avoidable, since data sharing is exaggerated when multiple processors access different words in the same cache line. This phenomenon is called false sharing: the hardware treats the data as shared even though it is the cache line containing the data, not the data itself, that is shared.
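The distinction between true and false sharing can be illustrated with a minimal sketch. It assumes a 32-byte cache line and 4-byte words (both sizes are illustrative and not taken from this paper), and it assumes that the two accesses come from different processors and that at least one of them is a write. The function name `classify_sharing` is hypothetical.

```python
LINE_SIZE = 32  # bytes per cache line (assumed for illustration)
WORD_SIZE = 4   # bytes per word (assumed for illustration)


def classify_sharing(addr_a: int, addr_b: int) -> str:
    """Classify two byte addresses accessed by two different processors.

    Assumes at least one of the two accesses is a write; otherwise no
    coherence action would be needed in either case.
    """
    same_line = addr_a // LINE_SIZE == addr_b // LINE_SIZE
    same_word = addr_a // WORD_SIZE == addr_b // WORD_SIZE
    if same_word:
        # Both processors touch the same word: the data really is shared.
        return "true sharing"
    if same_line:
        # Different words that happen to sit in the same cache line:
        # a line-granularity protocol still treats them as shared.
        return "false sharing"
    return "no sharing"


if __name__ == "__main__":
    print(classify_sharing(0x100, 0x102))  # same word
    print(classify_sharing(0x100, 0x104))  # same line, different words
    print(classify_sharing(0x100, 0x120))  # different lines
```

Under these assumptions, only the second case generates coherence traffic that word-granularity tracking could avoid.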
In this paper we present a new cache coherence protocol, called the word unit cache protocol, that eliminates false sharing. We use an execution-driven simulation model to study the performance improvement of the new cache scheme, measuring the amount of false sharing and the bus traffic for several coarse-grained parallel applications. The simulation results indicate that the new cache protocol generates less bus traffic than the other write-invalidate protocols.