Wednesday, July 24, 2013

Sharing the workload with distributed computing

Performance is a major concern for our users. With tight time constraints, fast-moving development cycles and the ever-present threat from competitors, it’s natural that engineers and researchers want to be able to simulate their models as quickly as possible. But how do you speed up your simulation without sacrificing accuracy?

CST STUDIO SUITE® supports several high-performance computing (HPC) methods which can help you push simulation to its limit. Over the coming weeks, we will discuss all of these here in a bit more detail, beginning with distributed computing.

Distributed computing (DC) is a very flexible way to spread out the work when carrying out multiple simulations. Many tasks require numerous independent simulations, although not all of them are obvious at first glance.

It’s clear that a parameter sweep or an optimization, where the model is re-simulated with different parameters, will need a lot of simulation runs. However, the general transient and frequency domain solvers are also excellent candidates for DC – in the time domain, port excitations are calculated independently, while in the frequency domain, each frequency point in a broadband simulation is a separate simulation.
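Because each frequency point (or port excitation) is an independent solve, a broadband sweep parallelizes naturally. The sketch below is purely illustrative — the solver function and round-robin assignment are hypothetical stand-ins, not CST's actual scheduling — but it shows why this class of problem splits so cleanly across machines.

```python
# Hypothetical illustration: a broadband frequency sweep is "embarrassingly
# parallel" because each frequency point is an independent simulation.
def solve_at_frequency(f_ghz):
    # Stand-in for a full frequency-domain field solve at one point.
    return {"f": f_ghz, "s11": 1.0 / (1.0 + f_ghz)}  # dummy result

def split_sweep(frequencies, n_servers):
    # Round-robin assignment of frequency points to solver servers.
    return [frequencies[i::n_servers] for i in range(n_servers)]

frequencies = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
batches = split_sweep(frequencies, n_servers=3)
# Each batch could run on a different machine; results merge afterwards,
# in any order, because no point depends on any other.
results = [solve_at_frequency(f) for batch in batches for f in batch]
```

Since no frequency point depends on another, the merged results are identical however the points are distributed — which is exactly what makes DC accurate as well as fast for this workload.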

In DC, these independent calculations are distributed over a network to a cluster of solver servers. Each server carries out its calculation and returns the results to the main controller; if more jobs are waiting, the next simulation is dispatched from the queue.
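The controller-and-queue pattern just described can be sketched in a few lines. The following is a minimal, hypothetical model using threads as stand-ins for networked solver servers — the real system communicates over a network, but the job-queue logic is the same in spirit.

```python
import queue
import threading

# Shared queue of simulation jobs, drained by worker "solver servers".
# Job and result types here are hypothetical placeholders.
jobs = queue.Queue()
results = []
results_lock = threading.Lock()

def solver_server(name):
    """One solver server: take jobs from the queue until none remain."""
    while True:
        try:
            job = jobs.get_nowait()  # pull the next queued simulation
        except queue.Empty:
            return  # no work left; the server goes idle
        outcome = (name, job, job * 10)  # stand-in for a solver run
        with results_lock:
            results.append(outcome)  # "return" the result to the controller
        jobs.task_done()

for job_id in range(8):  # the controller queues eight simulations
    jobs.put(job_id)

servers = [threading.Thread(target=solver_server, args=(f"server-{i}",))
           for i in range(3)]
for s in servers:
    s.start()
for s in servers:
    s.join()
```

Note that the servers pull work as they finish, so a fast machine naturally processes more jobs than a slow one — no static division of labour is needed.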

This makes DC a very helpful tool for users working in large teams. Many of our users run the most demanding types of simulation only occasionally. While these simulations could be carried out on their own workstations, the complexity of the models makes the calculations very resource-intensive, and it is often not worthwhile to buy a very powerful workstation that is only fully used occasionally.

A more efficient approach is to give everyone on the team a thin client or a simple workstation, and have a central computer cluster that the team has access to. One server acts as the main controller, communicating between workstations which run the frontend, and the solver servers which run the simulations. This way, everyone on the team has access to HPC when they need it, but without the resources going to waste when they don’t.

DC is also well-suited to relatively small-scale projects. Distributed computing can use a variety of different hardware types, and computers in a cluster do not have to be identical. This means it can operate on the ad-hoc clusters often used in universities and small companies. In fact, the main controller can even take the performance of different solver servers into account when distributing jobs.

If, for example, only some computers on the cluster are fitted with GPU cards, the main controller can make sure that only these solvers are used for computationally demanding problems. The distributed computing system is also compatible with third-party job queuing software, so it can be used alongside other programs operating on the same cluster.
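Hardware-aware dispatch of this kind boils down to a filtering step before a job is assigned. The snippet below is a hypothetical sketch (the server records and the `demanding` flag are invented for illustration), not CST's actual dispatch logic.

```python
# Hypothetical sketch of hardware-aware dispatch: demanding jobs are
# routed only to servers that advertise a GPU.
servers = [
    {"name": "node-1", "gpu": True},
    {"name": "node-2", "gpu": False},
    {"name": "node-3", "gpu": True},
]

def eligible_servers(job, servers):
    """Return the servers allowed to run this job."""
    if job["demanding"]:
        return [s for s in servers if s["gpu"]]  # GPU nodes only
    return list(servers)  # any node can take a light job

big_job = {"id": 1, "demanding": True}
small_job = {"id": 2, "demanding": False}
```

A light job can land anywhere, while a demanding one is constrained to the GPU-equipped nodes — the same idea generalizes to memory size, core count, or license availability.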

On the subject of GPU acceleration, stay tuned! Another blog post about HPC, explaining how GPU computing can make certain simulations run much faster, is coming soon.
