Swarm: IdeasList
Application to Google Summer of Code
Interested in applying to work on Swarm for Google's Summer of Code? Here is a bit of guidance for your application. In general, keep it short and simple. Tell us:
- where you are academically (school, degree you are working on, examples of relevant course work)
- about any relevant experience from work, school projects, for fun, etc. What interesting programs have you written?
- what experience you have (if any) with agent-based modeling software, discrete event simulation, or simulation modeling.
- what experience you have with Objective-C and other varieties of C.
- what other programming languages you are interested in.
- what you are interested in working on.
- the name and email address of approximately two references: collaborators on open-source projects or professors or supervisors who are familiar with your skills.
If you want to contact us with questions, etc., you can join our developers email list "swarm-hackers" or use the contact info for this wiki.
SDG participation in SoC
The Swarm Development Group (SDG) has participated in GSoC twice. Each year Google allocated two student slots to the organization. You can find additional information about the selected projects on the Summer of Code SDG 2007 and SDG 2008 pages.
Ideas List
This page contains ideas for new contributions to Swarm software. It was prepared for Swarm's application to the 2007 Google Summer of Code, but anyone interested in contributing is welcome to contact us about working on these. Some of these ideas are tagged as research projects because, although we expect they would result in useful code, the purpose of that code would be to motivate new projects more than to create an immediate utility gain for Swarm users. Also, some of these projects are a lot of work, and we'd have to discuss a suitable subset that would fit within the SoC.
defobj replacement (research project)
The most fundamental library of Swarm is defobj. Defobj allows methods to come and go during the lifetime of an object. For example, a caterpillar can only crawl but a butterfly can fly. It is currently based on Objective C, but that is really only an implementation detail. One project idea would be to start from a clean slate and design a new module that could create not only evolving objects (assemblies of variables and methods) but also evolving primitives. Some possible starting points could include use of concatenative languages like Cat or Joy, or use of a just-in-time code generation system such as LLVM or CodeDOM. The goal would be to create a new foundation library for ABMs that allows agents to evolve in a completely general way while still using computing resources efficiently. So, for example, if an agent or set of agents evolved a clever way to compute a logarithm that was more efficient than the best known method, the realization of that algorithm ought to be in the ballpark of outperforming the best existing implementation.
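The idea of methods coming and going during an object's lifetime can be illustrated with a minimal Python sketch. It is not how defobj works internally; the `Agent` class and its `learn` and `forget` helpers are hypothetical names chosen for illustration only.

```python
import types


class Agent:
    """Hypothetical agent whose set of methods can change at run time."""

    def __init__(self, name):
        self.name = name

    def learn(self, method_name, func):
        # Bind func to this instance only; other agents are unaffected.
        setattr(self, method_name, types.MethodType(func, self))

    def forget(self, method_name):
        # Remove an instance-level method previously added by learn().
        delattr(self, method_name)


def crawl(self):
    return f"{self.name} crawls"


def fly(self):
    return f"{self.name} flies"


bug = Agent("bug")
bug.learn("crawl", crawl)        # caterpillar stage
print(bug.crawl())
bug.forget("crawl")              # metamorphosis: crawling goes away
bug.learn("fly", fly)            # butterfly stage
print(bug.fly())
print(hasattr(bug, "crawl"))     # the old capability is no longer present
```

A dynamic language makes this easy but pays for it at every call; the project described above is about getting the same generality while keeping native-code efficiency.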
defobj portability across ObjC runtimes
Mentor: Scott Christley
Apple has come out with ObjC 2.0, which significantly changes the runtime APIs. The days of Swarm being able to directly manipulate the underlying data structures in the ObjC runtime are over; instead we need an abstraction layer for all the current runtimes (GNU, Apple 1.0, Apple 2.0). More importantly, such an abstraction layer will allow other technologies, like just-in-time code generation with LLVM, to be explored without requiring changes to the Swarm core functionality.
call frame disassembly
Swarm has some code called mframe that it uses to take apart messages that need to be forwarded (e.g. amongst language runtimes). This is very fragile code, and either new code needs to be written (e.g. for x86_64 on Darwin and Linux) or else some suitable software needs to be put in its place. Any progress that could be made on this problem would be a big win. Swarm is generally pretty portable, and this is the usual obstacle to supporting a new platform, along with support from ffcall or libffi.
more representation schemes for aggregations of agents
Swarm simulations typically involve some kind of aggregation of agents. Currently Swarm provides 2D spaces and a small set of collection interfaces. One project would be to develop more representation schemes. For example, 3D spaces, GIS interfaces (e.g. connect to a popular package like ArcView), or graph representations. There are probably some specialized cases where all custom code is justified, but there are also lots of third party libraries and applications to leverage. A project like this has a good chance of actually getting used in the Swarm community and it's something you can design to have a well-defined scope. It's also something others can elaborate without a big investment.
system thread parallelism for Swarm scheduler
Agents in a Swarm simulation interact not only in space, but also in time. Arguably the most fundamental contribution Swarm has made to ABM is its multilevel scheduler (the activity library), in which concurrency can be identified incrementally at run time. Today, in this age of multicore CPUs, it would be a great benefit to users to spin off this concurrency onto these CPUs in parallel. To be practical, this involves a review of parts of Swarm outside of the scheduler to ensure that they are thread safe, and also a review of models (or at least writing some new ones that are safe) to ensure that they don't have race conditions, etc.
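As a rough illustration of the shape of this work, the following Python sketch runs all actions scheduled for the same time value on a thread pool and waits for them before advancing the clock. It does not use the activity library's API; `Counter`, `run_schedule`, and the dictionary-based schedule are assumptions made for the example. It is only safe because each action touches its own agent's state, which is exactly the property a model review would need to establish.

```python
from concurrent.futures import ThreadPoolExecutor


class Counter:
    """Hypothetical agent that only touches its own state when stepped."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


def run_schedule(schedule, workers=4):
    """Run a {time: [callable, ...]} schedule.

    Actions that share a time value are treated as one concurrent group.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for t in sorted(schedule):
            futures = [pool.submit(action) for action in schedule[t]]
            for f in futures:
                # Barrier: finish time t before advancing.
                # result() also re-raises any exception from the action.
                f.result()


agents = [Counter() for _ in range(8)]
schedule = {t: [a.step for a in agents] for t in range(3)}
run_schedule(schedule)
print([a.steps for a in agents])
```

In standard CPython the global interpreter lock prevents a real speedup for CPU-bound actions, so this shows only the structure (concurrent group, barrier, advance), not the performance a native implementation could reach.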
Cell processor port
One way to achieve parallelism would be with the very lightweight messaging provided by the Cell processor. There are compilers and IDL stubbing facilities provided with the Cell development kit. The idea here would be to introduce the notion of asynchronous messages.
write a new scheduler (research project)
A related project would be to write a new multilevel scheduler or to explore other technologies like X10 that have similar parallelism semantics to Swarm. It was originally anticipated that Swarm would have features like partially ordered sets so that ordering constraints amongst agents would be sufficient information to direct execution. One might also consider extending the existing activity library or writing a new one in some new language to have more of a parallel constraint-fitting feel.
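To make the partially ordered set idea concrete, here is a small Python sketch (requires Python 3.9 or later for `graphlib`) that turns ordering constraints into batches of agents that could run concurrently. The agent names and the `concurrent_batches` helper are hypothetical; this is one possible reading of "ordering constraints direct execution", not a description of any existing Swarm feature.

```python
from graphlib import TopologicalSorter


def concurrent_batches(constraints):
    """Yield batches of agents in an order that respects the constraints.

    constraints maps each agent to the set of agents that must act before it.
    Agents in the same batch have no remaining ordering constraints between
    them, so a scheduler would be free to run them in parallel.
    """
    sorter = TopologicalSorter(constraints)
    sorter.prepare()                     # raises CycleError on circular constraints
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        yield ready
        sorter.done(*ready)              # unlock agents that were waiting on this batch


# Hypothetical constraints: grass grows, then prey eat, then predators and
# scavengers act; the last two have no constraint relative to each other.
constraints = {
    "prey": {"grass"},
    "predator": {"prey"},
    "scavenger": {"prey"},
}
for batch in concurrent_batches(constraints):
    print(batch)
```

The constraints alone determine both the sequencing and the available parallelism; no explicit time stamps are needed.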
separation of model from viewer
To gain qualitative insight into how a model implemented with Swarm behaves under different circumstances, it is useful to be able to look at it, collectively, as it runs. For quantitative understanding, it is necessary to make measurements. Swarm provides infrastructure to visualize and measure simulations, but it is essentially attached to a single display and process. It would be useful to completely abstract away measuring simulations from running them such that 1) it was possible to attach to a running model on one system using any other system, such as using a web browser, and 2) from the simulation's point of view there was no difference between an observer recording time series statistics and one showing, for example, a space of heatbugs. There is a wide variety of technologies to consider for something like this, ranging from XML-RPC/SOAP for messaging to shared memory segments for accessing large, densely populated spaces. Roughly speaking, in this approach, the `analysis', `simtoolsgui' and `gui' modules of Swarm would be removed and put in a standalone application or web browser plugin.
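Point 2 above can be sketched in a few lines of Python: the model publishes plain snapshots to whatever observers are attached, and cannot tell a statistics recorder from a display. All names here (`Model`, `TimeSeries`, `text_view`, the snapshot layout) are hypothetical, and the in-process callback stands in for whatever transport (XML-RPC, shared memory, and so on) a real design would choose.

```python
class Model:
    """Hypothetical model that knows nothing about who is observing it."""

    def __init__(self):
        self.time = 0
        self.heat = [0.0, 0.0, 0.0]
        self._observers = []

    def attach(self, observer):
        # An observer is any callable that accepts a snapshot dictionary.
        self._observers.append(observer)

    def step(self):
        self.time += 1
        self.heat = [h + 1.0 for h in self.heat]
        # Copy the state so observers cannot mutate the model.
        snapshot = {"time": self.time, "heat": list(self.heat)}
        for notify in self._observers:
            notify(snapshot)


class TimeSeries:
    """Observer that records the mean heat at each time step."""

    def __init__(self):
        self.rows = []

    def __call__(self, snapshot):
        heat = snapshot["heat"]
        self.rows.append((snapshot["time"], sum(heat) / len(heat)))


def text_view(snapshot):
    """Observer that renders the space as a line of text."""
    print(snapshot["time"], " ".join(f"{h:.1f}" for h in snapshot["heat"]))


model = Model()
series = TimeSeries()
model.attach(series)
model.attach(text_view)
for _ in range(3):
    model.step()
print(series.rows)
```

Because observers receive only serializable snapshots, the same interface could in principle be carried over a network connection to a viewer on another machine.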
more languages for Swarm
Language integration is an area where Swarm has some finished and unfinished work. A well-bounded, supportable, and useful project would be to leverage Swarm's cross-language integration features for another language. Nearly completed work exists for JavaScript, and also for Python. Somewhat more involved, and arguably more useful, would be a Swarm module for the R statistical package, where agent code could be prototyped in R, and simulations measured directly using R's many built-in statistical and visualization features.
implementations of Swarm-like capabilities in other languages (research project)
Swarm is currently written in C and Objective C and then has language layers around that. Other implementations have been proposed, like a new implementation in C#. One benefit of C# is that it fits very naturally with portable execution on virtual machines; another is that it fits very naturally with development on Windows. Other implementations of some subset of Swarm, or of a slightly different feature set, might be interesting. How might a purely functional language (e.g. Haskell) fit with simulation? Could aggregations all use Monads?
plugins of Swarm and Swarm models for Firefox
Aside from the language integration, there are environment integration features. For example, using the XPCOM support code in Swarm, it ought to be possible to make a Firefox plugin of Swarm, and to augment it with model add-on plugins. This promises higher performance than Java applets, as a model plugin could be precompiled. For packaging models, there could be a new Makefile target such that "make plugin" would package the simulation into a Firefox XPT installable.
browser based visualization features
To be useful, browser embedding would need dedicated visualization interfaces for the browser. Firefox's Canvas and SVG features provide much of the functionality. They would need to be given Swarm-friendly interfaces. Without having to do all of the plugin work described above, it would still be instructive for other Swarm developers to work through these details and make a plugin where C code was packaged as a Firefox XPT package and was able to draw rasters (at native-like speed) to the browser -- and similarly for vector graphics using SVG.
other front ends to Swarm
The rationale for using a browser for model visualization is that a browser is a well-supported, cross-platform, yet fast native-code application, whereas supporting native GUIs is a notorious time sink. However, doing a native GUI (or several) does have the advantage of providing the tightest possible integration with the user on a given platform. Thus the motive for considering work on OpenStep/Cocoa, or Windows Presentation Foundation, or Gnome/Qt for Linux.
Swarm and OpenStep/GNUstep/Cocoa
Mentor: Scott Christley
The original Swarm code uses the tcl/tk/blt libraries for graphical display; one of the primary reasons for getting Swarm to work with the OpenStep API [[1]] (as implemented in GNUstep [[2]] and Cocoa [[3]]) is to provide greater graphical capabilities. OpenStep also provides many other general capabilities that Swarm applications can take advantage of, like high-level OO classes for collections, networking, threads, XML, and much more. The goal of possible Google Summer of Code projects is to take the initial prototype and extend it in a number of useful ways.
- Write new versions of sample applications.
- Integration with platform development tools: Gorm for GNUstep and Xcode for Cocoa. Create a Swarm palette which allows agent classes, spaces, graphics, etc. to be dragged/dropped into an application. Develop a Swarm Application template which gives base menu and controller functionality.
Visualization
Mentor: Scott Christley
While Swarm currently works with OpenStep, not all of the graphic capabilities available with the tcl/tk/blt libraries have been integrated. However, besides just duplicating functionality, new display capability should be provided.
- Integrate with Narrative [[4]] plotting framework.
- Ability to create plots and graphs dynamically.
- 3D graphics?
model search and optimization
There's a class of applications for `model search'. Any library or remote application that can drive a simulation to an optimal point or provide a way to evolve agents toward optimal behavior (see defobj work described in first item), would be useful!
compiling Swarm to CIL
Intermediate representations are becoming more and more prevalent. Among the best known are the CIL of .NET, Java bytecode, and the ActionScript virtual machine (AVM) of Flash 9, which will soon be in Firefox. There exists a GCC backend that generates CIL, but there is little C library support. Pulling enough C routines together for a Swarm batch model would be neat, as it would mean models could be distributed to run over machines throughout the web or over heterogeneous clusters. CIL can be hosted on the JVM and vice versa, and it's reasonable to expect an on-the-fly translator for AVM as well. (Firefox/Adobe folks are working on hosting JavaScript on AVM bytecode.)
ready-made Windows Vista build images
It is hard for the SDG to keep binaries, esp. Windows binaries, up to date. It would be useful to have a VMware image of a new Vista install (that the SDG would pay for) that had all needed compiler tools and build scripts. Then a motivated Swarm developer could just download that, run it on a Linux system (or another Windows system) and have a stable environment for rebuilding Swarm. Also, we don't have any experience running Swarm on 32- or 64-bit Vista, which would be a valuable thing to gain.
improvements to Swarm using CPU profiling
Intel VTune provides detailed CPU-level performance information on program execution. There also exists an open source program called oprofile that provides a subset of these features. It would be instructive to Swarm users (and developers) to take a very badly behaving model like Steve Railsback's model#16/batch and absolutely tear it apart and rebuild it to run fast such that the CPU could be shown to be running at peak efficiency. This would likely take a few weeks of intensive work, as there are many issues to study and alternative remedies to measure. How stable are these changes across processors? For example, do CPUs with indirect branch predictors like Intel Core 2 or Barcelona scale differently than older ones? Documenting this whole review process and its outcome would likely provide useful guidance to people who need fast simulations. This is arguably more of an analysis task than software development, but it could be made a software development project by actually putting Intel processor instrumentation features into Swarm itself or by fixing any problems in Swarm that come up as bottlenecks.
GIS integration
Many agent-based models use spatial data obtained from Geographic Information Systems. Any efforts to automate information transfer between Swarm and GIS could be widely useful. An example is the new GIS extension to NetLogo: http://ccl.northwestern.edu/netlogo/4.0/extensions/gis/ . In the long run, ABM platforms like Swarm need to be able to access GIS functionality as well as data.
Install packages for various Linux distributions
Mentor: Scott Christley
Enhance the Makefile/build system of Swarm to have targets that produce source/binary installation packages for various Linux distributions using the native package manager, e.g. rpm for Red Hat.
Batch mode distribution
Mentor: Scott Christley
Swarm can be compiled with all GUI functionality disabled. This might be particularly useful in a grid computing environment where it's known that simulations will be running without graphics. Enhance the makefile/build system of Swarm to provide a batch-mode deliverable of the Swarm libraries. This would include enhancements to the Swarm makefile and a rework of sample applications to show users how to use the batch-mode version of Swarm, and perhaps some example scripts showing how simulations can be run in a grid computing environment.
Portability and performance of call forwarding
Objective C has potentially fast mechanisms for forwarding messages that rely on the runtime and compiler following the same binary interfaces. Code like mframe (which is in Swarm itself), Apple's NSInvocation, and the libffi and ffcall libraries know about different aspects of deconstructing and constructing call frames. A person effective at low-level detective work would be very helpful in both improving the portability of Swarm and opening new options for high performance messaging. For example, given a quick and reliable way to iterate over the arguments of an incoming call, it becomes feasible to do messaging using highly optimized messaging libraries like OpenMPI. This in turn would have applications to ports of Swarm to hybrid architectures like the Cell broadband engine.
