How do I use the gcc profiler?
A "profiler" is a tool that tells you how much execution time your program spends in each part of your code (and in each part of the Swarm libraries and the run-time system). The gcc compiler used for Objective-C Swarm models includes a profiling option that is easy to use and often very helpful when you are trying to make a model run faster, or trying to figure out exactly what it is doing.
Using the gcc profiler requires (a) compiling the code with a special option, (b) running the model, and (c) using a special program that reports profiling results.
- You need to compile and link all parts of the model using the gcc compiler option "-pg". (Do not also use the "-g" option; otherwise the profiler output will not be produced.) You can do this by inserting a couple of lines into your makefile, shown as the last two settings in the following example:
    ...
    APPLICATION=template
    OBJECTS=main.o Counter.o TemplateModelSwarm.o TemplateObserverSwarm.o EcoAverager.o Critter.o
    APPLIBS=
    include $(SWARMHOME)/etc/swarm/Makefile.appl
    CFLAGS+=-pg
    EXTRACPPFLAGS+=-pg
    ...
- Then use "make clean" and "make" to completely recompile your code. This makes what is called an instrumented executable, which includes the extra code to report profiling information.
- Next, run the instrumented executable just as you would normally run your model. When the model exits normally, the profiler writes a file "gmon.out" into the current working directory.
- Finally, use the program "gprof" (part of GNU binutils, normally installed alongside gcc) to interpret gmon.out and produce the profiling report. Just use the command:
    gprof mycode.exe
where "mycode.exe" is the name of your executable. gprof writes a table of profiling results to standard output; it's best to redirect its output to a file:
    gprof mycode.exe > profileoutput.txt
The following example output is from a Swarm implementation of the famous "Boids" model:
    Flat profile:

    Each sample counts as 0.01 seconds.
      %   cumulative   self               self     total
     time   seconds   seconds     calls  us/call  us/call  name
     24.03      0.62     0.62  21624597     0.03     0.03  _i_Vector__getLength
     23.84      1.24     0.61                              _fu40__Member
      9.30      1.48     0.24  13726341     0.02     0.02  _i_Vector__sub_
      6.20      1.64     0.16   3867902     0.04     0.12  _i_Vector__angle_
      5.43      1.77     0.14  35699811     0.00     0.00  _i_Vector__getY
      5.43      1.92     0.14  13782653     0.01     0.02  _i_Vector__init_
      4.84      2.04     0.12                              _fu37___obj_scratchZone
      4.65      2.16     0.12  35744897     0.00     0.00  _i_Vector__getX
      4.26      2.27     0.11                              objc_msg_lookup
      3.10      2.35     0.08  15654110     0.01     0.01  _i_SimObject__getPosition
      2.71      2.42     0.07   4139991     0.02     0.02  _i_Vector__add_
      2.71      2.49     0.07   3867902     0.02     0.03  _i_Vector__dot_
      1.16      2.52     0.03   1972908     0.02     0.02  _i_SimObject__getObjectType
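A note on reading the columns: "self seconds" is the time spent in the function itself, "cumulative seconds" is the running total down the table, "calls" is the number of invocations, and "self us/call" is the average self time per call in microseconds ("total us/call" also counts time in the functions it calls). As a rough check, the per-call figure can be recomputed from the other columns. The small C helper below is only an illustration; the name us_per_call is made up here and is not part of gprof.

```c
/* Average time per call in microseconds, like the "us/call" columns.
   seconds: a value from one of the "seconds" columns of the flat profile.
   calls:   the value from the "calls" column.
   Returns 0.0 when no call count was recorded (blank "calls" field).
   Hypothetical helper for illustration; not part of gprof. */
static double us_per_call(double seconds, long calls)
{
    if (calls <= 0) {
        return 0.0;
    }
    return seconds * 1e6 / (double)calls;
}
```

For "getLength", us_per_call(0.62, 21624597) gives roughly 0.029, which gprof prints as 0.03. Small differences are to be expected, since gprof works from unrounded sample counts.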
You can see that the method using the most time is "getLength" in the model's class "Vector". This is not surprising: the method is called more than 21 million times, and it uses the function "sqrt", which is computationally demanding compared with simple arithmetic. If you wanted to speed the model up, you could replace "sqrt" with your own code to provide a rougher, faster estimate of the square root.
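A different approach from estimating the square root, offered here only as a sketch, is to skip the square root whenever a length is merely compared with a threshold or with another length. Comparing squared values gives the same answer, because lengths are never negative. The example below is plain C (which is also valid Objective-C); the Vec2 type and the function names are hypothetical and are not taken from the Boids model or from Swarm, and it assumes your model uses lengths mostly for such comparisons.

```c
#include <stdbool.h>

/* Hypothetical 2-D vector; the model's real "Vector" class is an
   Objective-C class and may be laid out differently. */
typedef struct {
    double x;
    double y;
} Vec2;

/* Squared length: two multiplications and one addition, no sqrt call. */
static double vec2_length_squared(Vec2 v)
{
    return v.x * v.x + v.y * v.y;
}

/* True when a and b are no farther apart than radius (radius >= 0).
   Both sides of the comparison are squared, so no square root is needed. */
static bool vec2_within(Vec2 a, Vec2 b, double radius)
{
    Vec2 d = { a.x - b.x, a.y - b.y };
    return vec2_length_squared(d) <= radius * radius;
}
```

Whether this helps depends on how the lengths are used, so it is worth running the profiler again after any such change to see whether "getLength" really drops down the table.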
(Almost as much time is attributed to "_fu40__Member". It has no call count, so it was not compiled with profiling, and it is presumably a run-time system function that you can't do anything about.)
--SFRailsback 20:34, 23 Feb 2007 (EST); Thanks to Steve Jackson.