High Performance Computation with Amazon EC2
We’ve been doing quite a lot of work in computational chemistry, virtual screening and docking to validate a set of compounds identified by a client as potentially binding to a particular receptor. This type of work is very computationally intensive, and a typical run can consume a multi-CPU cluster for days on end. As the project expanded, we simply ran out of CPU cycles. We evaluated the option of expanding the compute server or buying a cluster and rejected it: the expense is hard to justify unless the hardware runs nearly 24/7, which never happens.
As a result, I prepared and deployed Linux instances on Amazon EC2 with the required software to perform the experiments and created management tools to allow quick provisioning and distribution of jobs.
###Results:
- Task: Run tests on multiple compounds, sequentially, on one system.
- Benchmark: A local 2.13 GHz quad-core, 4 GB RAM rack-mounted server running CentOS 5.4.
  - Time to completion: 17.5 hours
- Amazon EC2: High-CPU Instance, 8 cores, 7 GB memory, CentOS 5.4
  - Time to completion: 8.75 hours
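As a rough check on the figures above, the speedup can be computed directly from the two completion times. The snippet below is a minimal sketch. The variable names are illustrative, and the comparison with core counts ignores clock speed and memory differences, which these figures do not let us separate.

```python
# Completion times reported above, in hours.
local_hours = 17.5   # local quad-core server
ec2_hours = 8.75     # EC2 High-CPU instance, 8 cores

# Core counts as listed for each system.
local_cores = 4
ec2_cores = 8

speedup = local_hours / ec2_hours      # how many times faster the EC2 run finished
core_ratio = ec2_cores / local_cores   # how many times more cores the EC2 instance had

print(f"Speedup: {speedup:.2f}x, core ratio: {core_ratio:.2f}x")
# prints "Speedup: 2.00x, core ratio: 2.00x"
```

The speedup matches the ratio of core counts, which is consistent with the workload scaling close to linearly with cores, though a single pair of runs cannot confirm that.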
####Running tests on multiple compounds, on multiple systems
- Background: “Ligand set was 10,000 compounds from Zinc (subset 3) and same receptor. Use EC2 instances for load balancing across the instances and collate results.”
- Time to completion: 32.8 hours
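This post does not describe how the management tools split the work, so the following is only a sketch of one simple approach: deal the ligands out round-robin so each instance gets a near-equal share, then merge the per-instance results and rank them. The function names, the ligand identifiers, the instance count of 4, and the `{ligand: score}` result format are all hypothetical. The assumption that a lower score is better should be checked against the docking program in use.

```python
from typing import Dict, List, Tuple


def partition(ligands: List[str], n_instances: int) -> List[List[str]]:
    """Deal ligands round-robin into n_instances near-equal batches."""
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    return [ligands[i::n_instances] for i in range(n_instances)]


def collate(per_instance: List[Dict[str, float]]) -> List[Tuple[str, float]]:
    """Merge per-instance {ligand: score} results, sorted by score (lowest first)."""
    merged: Dict[str, float] = {}
    for results in per_instance:
        merged.update(results)
    return sorted(merged.items(), key=lambda item: item[1])


# Hypothetical identifiers standing in for the 10,000-compound ligand set.
ligands = [f"ligand_{i:05d}" for i in range(10_000)]

# ASSUMPTION: 4 instances; the post only says "multiple systems".
batches = partition(ligands, n_instances=4)
print([len(batch) for batch in batches])  # prints [2500, 2500, 2500, 2500]
```

Round-robin slicing keeps batch sizes within one ligand of each other without needing to know the set size in advance. It does not account for some compounds taking longer to dock than others, which a work queue would handle better.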
###Summary:
By using Linux images on Amazon EC2, B-Tech consulting was able to conduct experiments while radically reducing the total cost of ownership of IT resources. The first run on EC2 used $7 worth of compute time, while the second run with multiple machines cost less than $40.
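The cost figures can be sanity-checked with simple arithmetic. The sketch below derives the hourly rate implied by the first run ($7 for 8.75 hours) and the upper bound on instance-hours that $40 would buy at that rate. The bound assumes the second run used the same instance type at the same rate, with no other charges and no rounding to whole billing hours. None of that is stated in this post.

```python
# Figures reported in this post.
first_run_cost = 7.0      # dollars
first_run_hours = 8.75    # hours on one EC2 instance
second_run_budget = 40.0  # dollars (the second run cost less than this)

# Hourly rate implied by the first run.
implied_rate = first_run_cost / first_run_hours

# ASSUMPTION: same instance type and hourly rate for the second run.
max_instance_hours = second_run_budget / implied_rate

print(f"Implied rate: ${implied_rate:.2f}/hour")
print(f"Instance-hours available for $40: {max_instance_hours:.1f}")
```

Under those assumptions the implied rate is $0.80 per instance-hour, so a bill under $40 corresponds to fewer than 50 instance-hours in total across all machines.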