We describe the use of reliable multicasts in existing message passing (MPI) programs to reduce collective communication bottlenecks. By broadcasting information through one point-to-multipoint message instead of a series of (tree-structured) point-to-point TCP messages, communication delays can be reduced by factors up to $log_2(N_{nodes})$, e.g., 6x for 64-node clusters. This reduction in the cost of collective communications broadens the class of problems appropriate for Beowulf-style supercomputing. We present raw communication and science application (Swift N-body integrator) speed-up results. In addition to impressive gains in scaling for some classes of existing programs, the broader benefit of these results is to reduce the importance of domain decomposition. The cost of sending a message to all nodes is the same as one node-to-node message. This allows Beowulfs to be applied effectively even to problems requiring every node to know the current state of all other nodes' particles/cells/.... This should significantly increase the range of problems approachable with relatively cheap, Ethernet-based Beowulf clusters. The reduced cost of collective communications can also make it easier to write adequate parallel programs, thereby increasing the pool of potential Beowulf programmers. Finally, we discuss as a future possibility how Internet2's ubiquitous support for multicasts and high reliability make it a promising medium for expanding these results to geographically distributed computing resources. This research is supported by NASA AISRP through grant NAG5-9320.