Swarm FAQ:Speedups
Q: How can I get my model to run faster?
There are a variety of ways to speed up Swarm models, and what works best will vary among models (depending, e.g., on how much time is spent on computation, communication among objects, etc.). Some of these methods are applicable to non-Swarm models, too.
All models
- Have plenty of RAM in your computer. Even if your model uses much less than all your memory, you can often get a major speed-up by adding another memory module or two.
- Be aware that mathematical functions evaluated by series expansions such as Taylor series (exponential, logarithm, power, and trigonometric functions) are especially slow. Avoid them if possible: to compute diam^2, use "diam * diam" instead of "pow(diam, 2.0)". If you are desperate, you can program your own shorter, faster, less accurate implementation of the Taylor series.
- Use a profiler to find out where the execution time is being spent, then try to make those places more efficient. A profiler runs your model and generates a report on how much time is spent in which pieces of code (including your methods, Swarm's, and the run-time system's). You can learn about profilers by searching the web. There is an item on the JDK profiler for Java Swarm on Paul Johnson's FAQ (see: FAQs). There are instructions for using the gcc profiler for Objective-C Swarm models (or any code compiled by gcc) here on this wiki's faq. Also see the Objective-C section below.
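The advice above on avoiding expensive math functions can be sketched in C as follows. This is only an illustration under stated assumptions: the function names square and exp_taylor are hypothetical, and the truncated series is accurate only for small |x|, so check the error against the range of values your model actually uses before adopting anything like it.

```c
#include <assert.h>

/* Square a value with one multiplication instead of a call to pow(). */
double square(double x)
{
    return x * x;
}

/* Truncated Taylor series for e^x: the sum of x^k / k! for k = 0 .. terms-1.
 * Hypothetical helper: fewer terms means faster but less accurate, and the
 * error grows quickly as |x| increases. */
double exp_taylor(double x, int terms)
{
    assert(terms >= 1);
    double term = 1.0;   /* x^0 / 0! */
    double sum = 1.0;
    for (int k = 1; k < terms; k++) {
        term *= x / k;   /* build x^k / k! from the previous term */
        sum += term;
    }
    return sum;
}
```

With 6 terms, exp_taylor(1.0, 6) differs from e by roughly 0.002; whether that is acceptable depends entirely on the model. Profile before and after, since the library routine may already be fast enough.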
Java Swarm
- Use Java lists instead of Swarm lists. According to Marcus Daniels: "Swarm can accept them in place of the native lists in places where Swarm want lists as arguments. They'll be faster in normal circumstances." (However, be aware that Swarm's QSort tool does not work on Java lists.)
- Try replacing Swarm classes with Java library or homemade code. If the Swarm code is slow (for reasons discussed below under "Objective-C"), then replacing it with Java could speed it up.
Objective-C Swarm
These speed-up methods are specific to Objective-C. None have been tested extensively for how much they might help.
- Reduce the number of method calls. Method calls appear to be computationally expensive because the run-time system must look up the target of each message. In addition, modern CPUs try to predict which code will execute next so that it can be loaded and ready to go. Objective-C messages appear to make this prediction ineffective because the code that will run is not known until the message is dispatched. (Other languages are more restrictive and therefore their execution is easier to predict.) Some ways to reduce method calls include using C functions instead of methods when possible, and combining scheduled actions if possible. As Marcus says, write your models more in C and less in Objective-C.
- Another way to reduce method calls is to avoid "getter" methods: typically, when one object needs a value from another object, it gets the value using a getter method:
myBugsHappiness = [myBug getHappiness];
An alternative that should be faster is to read the value directly through a pointer to the object. If myBug is an instance of the class Bug, which has an instance variable "happiness" declared @public, then the above statement can be replaced by:
myBugsHappiness = ((Bug *)myBug)->happiness;
or (riskier):
myBugsHappiness = myBug->happiness;
You can also set an object's values in the same way:
instead of [myBug setHappiness: 14.3]; you can use myBug->happiness = 14.3;
But there are two cautions for using pointers like this. First, you can accidentally change the value of myBug's "happiness" variable. Second, be aware that using the pointer to get myBug's happiness value just grabs whatever the current value of "happiness" is in myBug; it does not execute any code that might be in the method [myBug getHappiness].
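The same trade-off can be sketched in plain C, which compiles unchanged inside an Objective-C source file. This is an analogy, not Swarm code: the struct Bug and the functions bug_get_happiness and mean_happiness are hypothetical stand-ins for a Bug class and its getter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical agent state; in a Swarm model these would be the
 * instance variables of a Bug class. */
typedef struct {
    double happiness;
} Bug;

/* Accessor-style function, playing the role of [myBug getHappiness]. */
double bug_get_happiness(const Bug *bug)
{
    assert(bug != NULL);
    return bug->happiness;
}

/* Mean happiness over an array of bugs, reading each field directly
 * instead of going through the accessor. */
double mean_happiness(const Bug *bugs, size_t n)
{
    assert(bugs != NULL && n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += bugs[i].happiness;   /* direct field read, no call */
    return total / (double)n;
}
```

Declaring the pointer const, as in both functions here, makes the C compiler reject an accidental write through it, which addresses the first caution. The second caution still applies: any logic that lives in the accessor is skipped by a direct read.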
- (Typing objects more tightly (e.g., declaring a variable by specifying the class object it will contain instead of just as "id") is not expected to improve execution speed.)
- Write parts of your code that are computationally intensive (but not dependent on Objective-C's run-time dynamism) in C++, and compile with the Objective-C++ compiler. C++ has two kinds of methods: non-virtual member functions and virtual functions. Non-virtual member functions are as fast as C functions while still giving an object-oriented way to describe classes. Virtual functions (including pure virtuals) allow for splitting interface from implementation and fall between non-virtual member functions and Objective-C message dispatch in terms of versatility and performance. See the FAQ on C++.
- Create a version of Swarm "tuned" to your model. You can actually use other gcc options to create a compilation of Swarm, and your model, optimized to your model. This technique also works for any other language that gcc compiles. Marcus Daniels provided these (slightly edited) instructions.
Here is another way to use profiling with Swarm. As with -pg, it's best if both Swarm and the simulation are compiled using it. The idea is that the compiler instruments all of the branches, and when the code first runs it collects statistics on the dynamics of the model. Then you take that dataset and feed it back into the compiler, and the instructions generated the second time around have branch hints for the CPU. I tried it with batch heatbugs and witnessed a 30% speed increase on an Athlon 64 (running in 64-bit mode). However, using a pathological batch model (version 16 batch of "StupidModel"), I didn't see much at all. It would depend on what branches (conditionals, while exits, message dispatches, etc.) actually occur and whether they have particular tendencies in one direction.
The procedure begins by adding "-fprofile-generate" on the initial compile (e.g., adding CC="gcc -fprofile-generate" to the configure of Swarm). That will be propagated to the makefiles, and the simulation will also link with it. Upon first run of the model you'll get a large set of .gcda files distributed through your Swarm build tree and in your simulation directory. The model should run long enough that some stable statistics can accumulate (e.g., more than a fraction of a second).
Once you've got that batch run finished without an abnormal ending or interrupt, go back to your Swarm build tree and edit the config.status files in the toplevel, libobjc/ and avcall/ to change "profile-generate" to "profile-use" (e.g., just search and replace in the editor). Now run each of these scripts to have them extract new makefiles, etc. Before running `make' at the toplevel, first run something like this:
find . -name \*.o -or -name \*.lo -or -name \*.a -or -name \*.la | xargs rm
This will remove all the old instrumented object code. When you run `make', the new compile will draw upon all the statistics you collected and generate code that favors the dynamics of the model.
When the build of Swarm is done, install it and then recompile your simulation. If there are branches that are usually taken in your code (and within Swarm itself, in the way your model exercises Swarm), you should see a speedup.
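Before rebuilding all of Swarm, the two-pass procedure above can be tried on a small stand-alone C program. The following is a minimal sketch, not part of Swarm: the file name demo.c, the macro DEMO_MAIN, and the function count_above are all hypothetical, and the speedup, if any, will depend on your compiler version and CPU.

```c
/*
 * Assumed file name: demo.c. Two-pass build with gcc:
 *
 *   gcc -O2 -DDEMO_MAIN -fprofile-generate demo.c -o demo   (instrumented)
 *   ./demo                                   (run once; writes a .gcda file)
 *   gcc -O2 -DDEMO_MAIN -fprofile-use demo.c -o demo        (uses the stats)
 */
#include <assert.h>
#include <stddef.h>

/* Count values above a threshold. The branch inside the loop is the kind
 * of conditional whose observed bias is recorded in the profile. */
size_t count_above(const int *values, size_t n, int threshold)
{
    assert(values != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] > threshold)
            count++;
    }
    return count;
}

#ifdef DEMO_MAIN
#include <stdio.h>

int main(void)
{
    enum { N = 1000 };
    static int values[N];
    for (int i = 0; i < N; i++)
        values[i] = i % 100;                   /* 0..99 repeating */

    size_t total = 0;
    for (int rep = 0; rep < 100000; rep++)
        total += count_above(values, N, 94);   /* branch taken about 5% of the time */

    printf("%zu\n", total);
    return 0;
}
#endif
```

Timing the binary from the first plain build against the one from the profile-use build gives a rough idea of what to expect, although a toy loop like this may be optimized so well either way that the difference is small.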
--SFRailsback 20:36, 23 Feb 2007 (EST); Thanks to Marcus Daniels for most of the ideas and research.