Cache-coherent non-uniform memory access (CC-NUMA) and cache-only memory architecture (COMA) are the two major alternatives for designing large-scale shared-memory multiprocessors. COMA differs from CC-NUMA in that its caching mechanism extends to main memory. COMA designs are classified as flat COMA (COMA-F) or hierarchical COMA (COMA-H) according to the structure of the directory. COMA-F and CC-NUMA have the same directory structure and can be configured on the same interconnection network. In this thesis we compare the performance of the CC-NUMA and COMA-F architectures by simulation. We first implement models of the two architectures for execution-driven simulation, and then present quantitative simulation results for four parallel applications. The results show that COMA-F outperforms CC-NUMA in all tested cases, although the differences are slight. COMA's potential for performance improvement comes mainly from its ability to serve capacity misses from a node's own local memory rather than from a remote node. The tested cases have a single application occupying several processors; with multiprogramming workloads, we expect the performance gap to be much larger.
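
The capacity-miss effect described above can be illustrated with a toy model. The sketch below is not the execution-driven simulator used in the thesis; it is a minimal illustration under strong assumptions: a single node, a read-only block trace, every block homed on a remote node, LRU replacement, and no coherence traffic. The names `LRUStore` and `count_remote_accesses` and all sizes are hypothetical.

```python
from collections import OrderedDict


class LRUStore:
    """Fixed-capacity block store with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and evict the LRU one."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return False


def count_remote_accesses(trace, cache_blocks, local_memory_blocks=None):
    """Count accesses that must leave the node.

    Assumption: every block is homed remotely. With local_memory_blocks=None
    the node behaves like CC-NUMA (a cache miss always goes remote). Otherwise
    the local memory acts as a second-level cache, as in COMA.
    """
    cache = LRUStore(cache_blocks)
    local_memory = LRUStore(local_memory_blocks) if local_memory_blocks else None
    remote = 0
    for block in trace:
        if cache.access(block):
            continue  # processor cache hit
        if local_memory is not None and local_memory.access(block):
            continue  # cache miss served from local memory
        remote += 1  # miss goes to the remote home node
    return remote


# Working set of 8 blocks swept 4 times through a 4-block cache:
# under LRU every access is a cache miss.
trace = list(range(8)) * 4
numa_remote = count_remote_accesses(trace, cache_blocks=4)
coma_remote = count_remote_accesses(trace, cache_blocks=4, local_memory_blocks=16)
print(numa_remote, coma_remote)  # prints "32 8"
```

In this toy trace the CC-NUMA-like node goes remote on all 32 accesses, while the COMA-like node goes remote only for the 8 cold misses and serves the remaining capacity misses locally. Real systems add coherence, replacement, and latency effects that this sketch ignores.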