Talk:Multi-state modeling of biomolecules

From PLoSWiki

Status Update

We thank both reviewers for their helpful comments. We are in the process of attending to them. We will inform you once this has been completed. --Mstefan (talk) 11:21, 8 August 2013 (PDT)

We have gone through and addressed all the reviewer comments. --Mstefan (talk) 18:54, 26 February 2014 (PST)

We have gone through the second round of comments and the "final brushing" --Mstefan (talk) 10:34, 14 May 2014 (PDT)

Wikification

I am glad to see this draft progress well. From the perspective of adaptation to Wikipedia, though, there is still some way to go. Please take a look at the wiki components of the guidelines. Specifically, the introductory section does not yet conform to Wikipedia style, links to other Wikipedia entries are missing entirely, and the only figure in the current draft is supplied as a PNG rather than SVG. If there is anything unclear about the guidelines, please let us know. Thanks! --Daniel Mietchen (talk) 20:08, 28 June 2013 (PDT)

I see that you have now worked on most of the issues raised above, except for the image format. Please address that too. Thank you. --Daniel Mietchen (talk) 06:04, 11 July 2013 (PDT)
Figures 1 and 2 are now in svg format. --Mstefan (talk) 18:59, 26 February 2014 (PST)
Thanks for the SVG versions! The introductory section still does not really meet the opening paragraph guideline, but we are getting there. I just linked "Modeling" to wp:Computer simulation but had trouble finding a good way to link the first occurrence of "state" - wp:Biomolecular structure does not cover it, nor do any of the options listed under wp:state. Later occurrences of a term are not to be linked again - I have thus removed a number of links to wp:protein, and I think there are a few more. Will go through with that in mind once more. --Daniel Mietchen (talk) 20:09, 9 March 2014 (PDT)

Final brushing

Now that content is stable, it's a good time to look at matters of style, some of which are subtle.

  1. As mentioned above, the introductory section should normally start with something like "Multi-state modeling of biomolecules is ". Exceptions are possible, but then at least all components of the article name should be linked to the respective articles to allow for background reading. I am not sure what to link "state" to.
    An introductory sentence of that style has now been added. --Mstefan (talk) 10:29, 12 May 2014 (PDT)
  2. Phrases using "we" are to be avoided. Similarly, phrases like "great overview" do not meet the neutrality requirement.
    We have revised the text accordingly --Mstefan (talk) 10:40, 12 May 2014 (PDT)
  3. Sentences phrased as questions - while didactically useful in instruction contexts - are discouraged on Wikipedia, which aims to inform.
    The page is now question-mark-free. --Mstefan (talk) 10:55, 12 May 2014 (PDT)
  4. "Headings should not normally contain links, especially where only part of a heading is linked."
    We have removed links from all headers. --Mstefan (talk) 10:59, 12 May 2014 (PDT)
  5. There are some instances of citation overkill that need to be addressed.
    We have fixed this. --Mstefan (talk) 10:27, 13 May 2014 (PDT)
  6. Phrases like "(reviewed in [21])" should be simplified to "[21]", since Wikipedia prefers secondary over primary (and tertiary) sources, and there is generally no assumption that a given reference is the sole (or first, etc.) source for a particular statement.
    This has now been fixed. --Mstefan (talk) 10:42, 13 May 2014 (PDT)
  7. I probably missed a few other things (perhaps overly technical language or some formatting issues) and will go through the text again later with that in mind.

--Daniel Mietchen (talk) 02:53, 15 April 2014 (PDT)

I went through the text again and fixed the remaining issues. --Daniel Mietchen (talk) 22:37, 12 July 2014 (PDT)

Reviews

Review by Bill Hlavacek

The comments immediately below pertain to the revised version of the Topic Page: http://topicpages.ploscompbiol.org/w/index.php?title=Multi-state_modeling_of_biomolecules&oldid=3726

The authors have addressed my concerns with the original version of the Topic Page.

The authors should consider a few small edits for correctness or precision:

  • Consider replacing "...for one googol (10^100) distinct..." with "...for more than one googol (10^100) of distinct..."
  • Consider replacing "...simulation using a Monte Carlo algorithm." with "...simulation using a kinetic Monte Carlo algorithm."
  • Consider replacing "...and its capability to generate reactions on-the-fly at each iteration..." with "...and its capability to generate reactions on-the-fly during a stochastic simulation..."
  • Consider replacing "One implementation of the κ-calculus is provided by the KaSim simulator..." with "A simulator compatible with Kappa is KaSim..."
  • Consider replacing "Specification of HPP models is implemented in BioNetGen..." with "Specification of HPP models is supported by BioNetGen..."
We have made all the above edits. --Mstefan (talk) 10:50, 13 May 2014 (PDT)

Here are some suggestions:

  • Consider deleting "Not all tools that enable model specification also provide functionality for model evaluation or computation."
  • Consider replacing "One of the earliest..." with "An early..."
  • Consider clarifying what is meant by "specification of reagents" in the last paragraph of the "Particle-based Rule Evaluation" section.
  • Consider mentioning that RuleMonkey and NFsim implement distinct but related simulation algorithms.
We have made all the above edits. --Mstefan (talk) 11:07, 13 May 2014 (PDT)
  • Consider adding to the list of example models.
The list of example models is, of course, far from complete. What we were aiming to do, however, was to assemble a list that represents the breadth of tools discussed in the text, and we think that the list provided does fulfil this function. --Mstefan (talk) 10:58, 13 May 2014 (PDT)

Finally, there are a few grammatical errors that should be corrected. Errors I noted are near these phrases:

  • "will necessarily populated"
  • "methods is"
  • "user can specifies"
  • "relative small"
All those have now been fixed. --Mstefan (talk) 11:20, 13 May 2014 (PDT)

The authors have written a nice introduction to "multi-state modeling of biomolecules."

My comments about the revised Topic Page end here.

This critique pertains to this version of the Topic Page: http://topicpages.ploscompbiol.org/w/index.php?title=Multi-state_modeling_of_proteins&oldid=2904

The Topic Page is well written and provides a good introduction to the subject of "multi-state modeling of proteins."

The authors may wish to choose a title that is more general, because the issues and modeling approaches discussed apply not only to proteins but also to other biomolecules, such as DNA. For example, see these references:

Marchisio MA, Colaiacovo M, Whitehead E, Stelling J (2013) Modular, rule-based modeling for the design of eukaryotic synthetic gene circuits. BMC Syst Biol 7:42. 

Vilar JM, Saiz L (2013) Reliable prediction of complex phenotypes from a modular design in free energy space: an extensive exploration of the lac operon. ACS Synth Biol doi: 10.1021/sb400013w

This is indeed a good suggestion. I have edited the text to account for this. I do not think I have permission to rename the page, which would amount to moving it elsewhere on the Wiki, but we can certainly change the title to "Multi-state modeling of biomolecules" in the final version --Mstefan (talk) 12:22, 8 August 2013 (PDT)
I have now moved the page to "Multi-state modeling of biomolecules". --Daniel Mietchen (talk) 19:24, 30 September 2013 (PDT)

The coverage of software tools is uneven - some useful tools are not mentioned at all, such as KaSim, VCell, Simmune, SSC, Smoldyn, and SRSim. The Topic Page would benefit from addition of a table to list and summarize (more comprehensively) the software tools available. A recent review identifies nearly 30 available software tools (and references can be found therein):

Chylek LA, Stites EC, Posner RG, Hlavacek WS (2013) Innovations of the rule-based modeling approach. In Systems Biology: Integrative Biology and Simulation Tools, Volume 1 (Prokop A, Csukás B, Editors), Springer.

We have now added references to KaSim, Simmune, SSC, and SRSim at appropriate places in the text, as well as a reference to the Chylek et al. review. We did not include VCell or Smoldyn, because they do not have specific capabilities for specifying multi-state molecules. --Mstefan (talk) 18:22, 22 October 2013 (PDT)
The newly updated figure 1 now incorporates an overview of all the software tools discussed. --Mstefan (talk) 18:51, 26 February 2014 (PST)

Discussion of certain concepts is mixed with discussion of software tools when it might be better to discuss the concepts and tools independently. Some software tools and methods for multi-state modeling are based on theoretical formalisms. For example, the model-specification language used by BioNetGen and its network-generation capabilities are based on a graphical formalism, which involves a single-pushout approach to graph rewriting. This formalism could be used as a basis for developing new software tools (and it has indeed served this purpose). The formalism is independent of BioNetGen. Other tools/methods are based on different formalisms, including labeled state-transition systems, process algebras of different kinds, and finite-state machines. Some tools are not based on any recognizable theoretical foundation - they were designed ad hoc. Formalism is probably more useful for developers of software tools than for users of tools, so I am only suggesting brief coverage of formalism.

We have added a few words about the concept behind each of the rule-based software tools (where that information was previously missing). We are indeed not going into more detail, since this review is aimed at tool users, not developers --Mstefan (talk) 13:49, 12 February 2014 (PST)


Model specification, which is of high concern to users of tools, can probably be more carefully discussed and distinguished. There are various approaches to the "specification problem" (not all of which are tied to particular software tools). Some approaches are ad hoc. An example is the approach that one must use with StochSim. Other approaches are based on what can be considered a domain-specific programming language, a programming language specialized for a particular purpose, here specification of multi-state or rule-based models. Examples of specialized model-specification languages include Kappa (which has received much attention in the computer science literature) and BNGL, which are very closely related. An advantage of a domain-specific language is that the language can be used by many different software tools (and read and understood by a human). For example, besides BioNetGen, BNGL is compatible with DYNSTOC, RuleMonkey, NFsim, and other tools. A third approach to model specification is to use an embedded language, which leverages the power of a general-purpose programming language. This approach is advocated by the developers of PySB, which is based on the Python programming language. Finally, it would perhaps be appropriate to mention the SBML "multi" package, which is in development.

This is indeed an interesting point. From the point of view of the user, the distinction between the approaches to specification (ad hoc, domain-specific or embedded) is interesting insofar as they want to know what tools they need to use or what tools they can use together. We do not, therefore, spend much time on discussing the principle behind the three approaches, but we highlight their practical applications. For instance, in the description of BNGL, we have now added that BNGL can be imported into DYNSTOC, RuleMonkey etc. We have also added a sentence about tools that use ad-hoc specifications vs tools that can read specifications in a format like BNGL at the beginning of the section about particle-based rule evaluation. We have also added a mention of PySB and SBML "multi", as suggested. --Mstefan (talk) 15:23, 16 February 2014 (PST)


A rule-based model is in essence a set of rules. The number of parameters of a rule-based model is more or less proportional to the number of rules. In contrast, the number of parameters of an ODE model (the type of model most commonly used in systems biology) is more or less proportional to the number of equations. Thus, there tends to be some confusion when writers explain that a rule-based model is equivalent to a large number of ODEs or implies a large number of chemical species or molecular states (because an ODE model for chemical kinetics is composed of one equation for each species). The rule-based model seems "complex," which is not attractive. As Hilbert said (1900), "...what is clear and easily comprehended attracts, the complicated repels us." The irony of the false impression is that a rule-based model is usually rather simple when model complexity is assessed by number of parameters, which is a reasonable metric. Thus, I am not sure that some of the discussion in the Topic Page (e.g., the comments about the model that accounts for a mole of chemical species) will make multi-state modeling seem appealing for most modelers (or more generally most scientists or anyone actually) without this discussion being balanced by a discussion of model complexity. It might be helpful to mention that the number of rules (and parameters) in the "mole model" is rather small compared to the number of equations (and parameters) in many ODE models for the same cell signaling system. I note that a "googol model" exists and the report about this model includes a discussion of model complexity:

Creamer MS, Stites EC, Aziz M, Cahill JA, Tan CW, Berens ME, Von Hoff DD, Hlavacek WS, Posner RG (2012) Visualization, annotation and simulation of a large rule-based model for ErbB receptor signaling. BMC Syst Biol 6:107.

We have added a discussion of complexity which we hope will clarify the topic. We are grateful to you for bringing the "googol" model to our attention and have added a reference to it. --Mstefan (talk) 09:36, 17 February 2014 (PST)
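The contrast the reviewer draws between the number of rules and the number of implied species can be illustrated with a toy calculation. The following is only a minimal sketch, assuming a single molecule with independent binary modification sites and one forward and one reverse rule per site; the function names `model_size` and `sites_needed` are hypothetical and do not come from the Topic Page or from any of the tools discussed.

```python
def model_size(n_sites):
    """Sizes of a toy model: one molecule with n independent binary sites.

    Assumption: each site has one 'modify' rule and one 'unmodify' rule,
    and neither rule depends on the state of any other site.
    """
    n_states = 2 ** n_sites                # every combination of site values
    n_rules = 2 * n_sites                  # grows linearly with the sites
    n_reactions = n_sites * 2 ** n_sites   # each state can flip any one site
    return n_states, n_rules, n_reactions


def sites_needed(target_states):
    """Smallest number of binary sites giving more than target_states states."""
    n = 0
    while 2 ** n <= target_states:
        n += 1
    return n


print(model_size(3))             # (8, 6, 24)
print(sites_needed(10 ** 100))   # sites needed to exceed one googol of states
```

In this toy setting the rule count (and hence the parameter count) grows linearly while the state count grows exponentially, which is the sense in which a rule-based model can be simple even when the implied reaction network is very large.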

Using local rules to represent protein interactions is a simplification. Given available knowledge of protein interaction networks, it is far easier to formulate local rules for protein interactions than to identify the chemical species that are populated as a result of these interactions. The number of possible species may be more than the number of molecules in a cell, so clearly not all of the possible species are important. But which are? There is usually no empirical information available to guide the identification of the important species, whereas numerous biological studies have primarily been concerned with discovering the local rules that govern protein interactions (e.g., the SH2 domain in Grb2 binds EGFR when such and such tyrosine residues are phosphorylated).

Indeed. We have added a brief discussion about this in the newly rewritten paragraph about complexity. --Mstefan (talk) 16:07, 16 February 2014 (PST)

In addition to the dichotomy of spatial vs. non-spatial models, the authors may wish to discuss the dichotomy of direct vs. indirect methods of simulation. In a direct method, such as one of the methods implemented in network-free simulators, the rules of a rule-based model are used directly to obtain simulation results. In an indirect method, the rules of a model are used to obtain not simulation results but rather an "equivalent" model that has a traditional form. (I mean "equivalent" in the sense that a convergent geometric series with a finite sum is equivalent to an algebraic expression that gives the same sum.) This form of the model is then simulated using a standard technique, such as one of the ODE solvers available in MATLAB. From this perspective, the equations derived from a set of rules are nothing more than algorithmic oddities.

We touch on this issue in the description of the various tools (e.g. we highlight those that can produce a set of ODEs from a rule-based model). --Mstefan (talk) 16:17, 16 February 2014 (PST)

It should be mentioned that indirect methods, when they can be applied, are usually more efficient than direct methods.

We are not sure that this is always true, because the efficiency of either will depend on a lot of factors including the number of reactions in the equivalent model, the number of particles in the simulation, the implementation, software architecture etc. --Mstefan (talk) 10:07, 17 February 2014 (PST)
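The direct/indirect distinction discussed above can be sketched in a few lines. This is an illustrative toy only, assuming unimolecular rules that flip one binary site; the rule format `(site, old_value, new_value, rate)` and all function names are hypothetical and are not the formats or algorithms of BioNetGen, NFsim, or any other tool mentioned here.

```python
import random
from collections import deque

# A molecule state is a tuple of site values, e.g. (0, 1).
# Hypothetical rule format: (site_index, old_value, new_value, rate).
RULES = [
    (0, 0, 1, 1.0),
    (0, 1, 0, 0.5),
    (1, 0, 1, 2.0),
    (1, 1, 0, 0.1),
]


def apply_rule(state, rule):
    """Return the product state, or None if the rule does not match."""
    site, old, new, _rate = rule
    if state[site] != old:
        return None
    return state[:site] + (new,) + state[site + 1:]


def generate_network(seed, rules):
    """Indirect method, first step: enumerate reachable species and reactions.

    The resulting reaction list could then be handed to a standard ODE or
    stochastic solver (not shown).
    """
    species = {seed}
    reactions = []
    queue = deque([seed])
    while queue:
        state = queue.popleft()
        for rule in rules:
            product = apply_rule(state, rule)
            if product is None:
                continue
            reactions.append((state, product, rule[3]))
            if product not in species:
                species.add(product)
                queue.append(product)
    return species, reactions


def direct_step(particles, rules, rng):
    """Direct method: fire one rule on one particle, without a network.

    Returns the waiting time of the event, or None if no rule can fire.
    """
    matches = [[i for i, p in enumerate(particles) if p[r[0]] == r[1]]
               for r in rules]
    propensities = [r[3] * len(m) for r, m in zip(rules, matches)]
    total = sum(propensities)
    if total == 0:
        return None
    dt = rng.expovariate(total)
    k = rng.choices(range(len(rules)), weights=propensities)[0]
    i = rng.choice(matches[k])
    particles[i] = apply_rule(particles[i], rules[k])
    return dt
```

In this sketch the cost of `generate_network` grows with the number of reachable species, whereas the cost of `direct_step` grows with the number of particles and rules, which is one way to see why neither approach is faster in every case.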


I strongly urge that "particle-based" be used in place of "agent-based." The reason is that the rules in the vast majority of models called "agent-based models" have no connection to physicochemical principles. In physical chemistry, the term "particle-based" is more standard.

We have changed "agent-based" to "particle-based" everywhere in the article --Mstefan (talk) 10:52, 17 February 2014 (PST)


It should be noted somewhere that rule-based models are firmly based on the principles of chemical kinetics. Assumptions are of course involved (as with ODE models). The novel type of assumption made when formulating a rule-based model is that the interaction represented by a rule is modular (affected only by the molecular context explicitly considered in the rule), such that it is appropriate to take all reactions implied by the rule to be governed by the same rate law, which is a type of coarse graining. Such an assumption can break down - in this case, the rule can be replaced with finer rules that appropriately capture the effect of molecular context on an interaction. (In physics, principled coarse graining is usually viewed as elegant, not defective.)

We agree insofar as the rules will preserve the main features of the reactions, i.e. if the reactions are based on chemical kinetics, so will the rules be. (Depending on the modeling scenario, the reactions themselves need not be chemical reactions though, and in that case, nor will the rules). We have added a paragraph explaining this preservation of the main properties, and making explicit the assumption behind modularity. --Mstefan (talk) 11:24, 17 February 2014 (PST)
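The modularity assumption described above (all reactions implied by a rule share one rate law, and a rule can be replaced by finer rules when context matters) can be made concrete with a toy expansion. This is a sketch under stated assumptions: sites are binary, a rule modifies one site from 0 to 1, and the function `implied_reactions` and its `context` argument are hypothetical names introduced here for illustration.

```python
from itertools import product


def implied_reactions(site, context, rate, n_sites):
    """Expand one rule into the explicit reactions it implies.

    The rule flips `site` from 0 to 1 at `rate` in every molecular state
    that agrees with `context`, a dict mapping site index to required value.
    Sites not named in `context` may take either value.
    """
    reactions = []
    for state in product((0, 1), repeat=n_sites):
        if state[site] != 0:
            continue
        if any(state[s] != v for s, v in context.items()):
            continue
        new_state = state[:site] + (1,) + state[site + 1:]
        reactions.append((state, new_state, rate))
    return reactions


N_SITES = 3

# Coarse rule: site 0 is modified at one rate, whatever the other sites do.
coarse = implied_reactions(0, {}, 1.0, N_SITES)

# Finer rules: the rate now depends on the value of site 1.
refined = (implied_reactions(0, {1: 0}, 1.0, N_SITES)
           + implied_reactions(0, {1: 1}, 5.0, N_SITES))
```

Here `coarse` and `refined` cover the same four transitions; the only difference is that the refined rules assign a context-dependent rate, at the price of one extra rule and one extra parameter.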

I disagree with the following statement in the Topic Page: "To date, all modeling packages that solve the computational problem are agent-based." The reason is that a set of rules that implies a very large number of equations (and thus not appropriate for simulation via an indirect method) can be analyzed to obtain a reduced-order equational model, meaning a system of ODEs that is manageable in size. There is a large number of papers in the literature about this methodology, which is a type of coarse graining. One is the following:

Feret J, Danos V, Krivine J, Harmer R, Fontana W (2009) Internal coarse-graining of molecular systems. Proc Natl Acad Sci USA 106, 6453-6458.

If one discusses this methodology, it would be nice to also recognize the work of Nikolai Borisov, Holger Conzelmann, and others.

We have deleted the statement from the page. --Mstefan (talk) 13:23, 17 February 2014 (PST)


Figure 1 is not legible when the Topic Page is printed.

This is because we only provide a thumbnail on the wiki page. The full-size figure becomes accessible when clicking on the thumbnail. --Mstefan (talk) 09:50, 20 February 2014 (PST)

Although several examples are mentioned in the text, it might be useful to include a table listing notable applications of multi-state modeling to obtain biological insights. There are many to choose from.

We have added a table with examples of multi-state modeling, reflecting the breadth of both the models and the methods described in the article. --Mstefan (talk) 17:30, 22 February 2014 (PST)

Review by Adelinde Uhrmacher

I wish first of all to thank the authors for their very thorough revision of the topic page. I have only a few comments referring to the revised version.

"To tackle the computation problem, they have turned to particle-based methods, that are much more computationally efficient than population-based methods based on ODEs, PDEs, or the Gillespie stochastic simulation algorithm." This is not true in general. A multitude of different approaches exist to speed up the execution of models, of which the particle-based method is one, and with regard to the efficiency of computational algorithms there is no silver bullet: the efficiency of a computational method depends largely on the model. So for many models the particle-based method might be a suitable option, for others not. See also your paragraph on HPP. One could write that particle-based approaches have been shown in many cases to be more efficient, and include some references to the corresponding studies that have compared the performance of particle-based approaches with the different population-based approaches.

Thank you for pointing this out. We have now re-written this, as suggested. --Mstefan (talk) 11:42, 13 May 2014 (PDT)

"Simulation algorithms can be classified into two groups, depending on the level of analysis at which the rules are applied: populations vs single particles." Different possibilities exist to classify simulation algorithms, e.g., exact or approximate, sequential or parallel, time-stepped or event-based etc. Distinguishing algorithms depending on whether they group species into populations or treat them individually is only one possibility. Also I would distinguish here three groups, i.e., population-based, individual-based and hybrid (see below).

We have made the wording more precise and included hybrid algorithms. --Mstefan (talk) 11:48, 13 May 2014 (PDT)

"This means that the space of all possible states is very large." Not necessarily, it again depends on the model.

We have changed "is" to "can be". --Mstefan (talk) 12:04, 13 May 2014 (PDT)

Minor: modeling vs. modelling and the colors of figure 1 if downloaded as pdf are too dark - please check.

We have made spelling consistent and uploaded a new version of figure 1 --Mstefan (talk) 10:33, 14 May 2014 (PDT)


The following comments refer to the previous version of the topic page.



Modeling and simulation of multi-state systems is an interesting and important topic. Enclosed you will find my remarks, which are structured according to the paragraphs of the proposed Wiki entry.

Summary:

"To solve the specification problem, modelers have in recent years moved away from explicit specification of all possible states, and towards rule-based formalisms that allow for implicit model specification, including the kappa-calculus,[1] BioNetGen,[2][3][4][5] the Allosteric Network Compiler[6] and others." Please be aware that other modeling approaches (in addition to rule-based ones) also allow a compact description of multi-state models, for example colored/attributed approaches of process calculi and Petri Nets.

This is no doubt true, and in fact some rule-based approaches (such as the kappa calculus) are based on process algebras. However, we intend this review to focus on practical tools available to biologists who wish to model multi-state proteins. The most widely used (and most widely available) tools such as kappa, BioNetGen, etc. use rule-based formalisms, which is why we concentrate on those. --Mstefan (talk) 13:41, 18 February 2014 (PST)


"To solve the computation problem, they have turned to agent-based methods, that are much more computationally efficient than population-based methods based on ODEs, PDEs, or the Gillespie stochastic simulation algorithm. Indeed, agent-based methods are sometimes the only possible option given current computing technology." "To solve the computation problem" is likely to lead the reader astray, as this problem cannot be solved in general (see also comments below). To address the problem of calculating models that comprise proteins with a very large number of different states, object-based methods have been developed where each protein is explicitly represented as an individual software object. As you have stated yourself, agent-based approaches might be less efficient than other methods for specific models, so I would be careful with a general statement like "that are much more computationally efficient". (BTW: I would rather call them particle-based or object-based simulation methods.)

We recognise that the formulation of "solving the computation problem" was potentially misleading. We have rephrased the relevant text passages to say that particle-based approaches are one way of approaching the computation problem. We have also replaced the term "agent-based" with "particle-based". --Mstefan (talk) 16:09, 22 February 2014 (PST)

Specification vs computation:

"some solve the specification problem, while others solve both the specification problem and the computation problem." Again the term "solve" might be confusing. I would rather state some approaches focus on the specification, others on the computation problem (because the above sentence implies that nobody works on simply getting the execution faster, see also remarks below), and yet others address both problems.

We have now changed the text accordingly. --Mstefan (talk) 16:36, 22 February 2014 (PST)

Figure 1 should be changed to indicate that more than just rule-based modeling approaches, or agent-based approaches for executing these models, are aimed at addressing the respective problems. Also, the term "combinatorial explosion" (as it is too general) should be replaced by what this article is about, i.e., modeling and simulating cell-biological multi-state systems.

We have now made it clear both in the text and in the figure legend that it refers only to the rule-based and particle-based approaches discussed in this article. We have also changed "Combinatorial explosion" to "Modeling Multi-State Biomolecules" --Mstefan (talk) 16:00, 18 February 2014 (PST)

"Whether the computation problem can be solved or not depends on the complexity of the model and on the level of analysis (populations or individual agents)." This sentence is likely to lead the reader astray. Developing efficient execution algorithms is not a procedure that completes with a verdict of "solved" or "failed". One algorithm might perform faster than another; however, the efficiency of an algorithm typically depends on the model, the data structures and sub-algorithms used, the infrastructure, etc.

This is an excellent point. We have deleted this sentence. --Mstefan (talk) 16:43, 22 February 2014 (PST)

"depends on the level of analysis (populations or individual agents) ..." Both approaches can produce exact results, i.e. the same analysis can be done based on the results. Or am I wrong? I am not sure what this part of the sentence refers to: maybe it should read like "level of analysis (exact or approximative algorithms)"?

The formulation was indeed unclear. The sentence has now been deleted altogether. --Mstefan (talk) 16:47, 22 February 2014 (PST)

Rule-based model specification:

"As an analogy, if we were to describe a computer chip, an explicit description would specify all the parts of it, their positions and how they are connected, while a rule-based description would specify a set of rules by which a chip can be compiled from a set of components and specifications. In computer science, such concepts were developed in the late 1970s, when the introduction of structured design methodology[26] and silicon compilers[27] allowed for automated assembly of large silicon chips from a simple set of instructions, thus laying the foundation for Very-large-scale integration (VLSI)[26], and modern microprocessors with more than 1 billion transistors." I would skip this, as I do not think it helps understanding the concept of having rule-schemata that are instantiated into reactions.

We ourselves have found that analogy rather helpful when thinking about rule-based modeling. But we can see that it might do more to confuse a reader than to help them, so we have now removed it from the text. --Mstefan (talk) 17:44, 22 February 2014 (PST)

"Many rule-based specification methods exist.[1][7][2][3][4][5][6][8]" [2][3][4][5] refer to one approach, right? Why is [7] missing?

Correct, [2]-[5] all refer to BioNetGen. Reference [7] is not missing, but between [1] and [2] (this reflects where the reference has previously been cited in the article) --Mstefan (talk) 17:27, 22 February 2014 (PST)

Referring to rule-based approaches, the following approaches might be of interest as well:

  • Referring to attributes and constraints, e.g., John M, Lhoussaine C, Niehren J, Versari C: Biochemical Reaction Rules with Constraints. In European Symposium on Programming Languages. Volume 6602 of Lecture Notes in Computer Science. Edited by Barthe G. Springer; 2011:338-357.
  • Referring to multi-level approaches, e.g., Maus C, Rybacki S, Uhrmacher AM: Rule-based multi-level modeling of cell biological systems. BMC Systems Biology 01/2011; and Oury N, Plotkin G: Multi-Level Modelling via Stochastic Multi-Level Multiset Rewriting. Mathematical Structures in Computer Science, Special Issue on DCM, 2013.

This was indeed an oversight on our part. We have now added paragraphs describing these approaches. --Mstefan (talk) 19:36, 23 February 2014 (PST)

"In general, rule-based model specification systems separate the specification of a model from the execution of the simulation."

This is not a specific feature of rule-based model specification systems, but state of the art in modeling and simulation. For example, a simulation system can offer different (parallel, sequential, approximative etc.) execution algorithms for one and the same modeling formalism.

This was indeed misleadingly formulated. What we meant to say was that because specification and simulation/evaluation are separate tasks, there are tools that do one, but not the other (although of course, there are also tools that do both). We have re-phrased the text passage and hope that it is now clearer. --Mstefan (talk) 19:32, 23 February 2014 (PST)

"However, many solutions to the specification problem also contain a method of interpreting the specified model."

I assume all do this, because otherwise the model could not be executed. Please clarify.

Some tools in computational biology are indeed only concerned with model specification, not execution (in non-rule-based modelling, SBMLeditor is one example). We have re-written the paragraph (see above) and hope it is now clearer. --Mstefan (talk) 19:47, 23 February 2014 (PST)


"Thus, by only considering states and features important for a particular reaction, rule-based model specification eliminates the need to explicitly enumerate every possible molecular state that can undergo a similar reaction,"

Please note that also non-rule based approaches allow this. Maybe you can make this more concrete, e.g., rule-based model specification in contrast to simple reaction equations?

We do not claim that rule-based approaches are the only approaches that allow for this. This paragraph is just a bridge back summarising the previous two sections (the one explaining the problem with explicit specification, and the one describing the idea behind the rule-based method and the most important implementations). --Mstefan (talk) 19:59, 23 February 2014 (PST)

" and thereby solves the specification problem. Not all solutions to the specification problem also solve the computation problem, although the converse is, in general, true."

I am not sure what is meant by "solving" the specification problem. This problem is defined earlier as "First, how can such a system be specified; i.e. how can a modeler specify all complexes, all changes those complexes undergo and all parameters and conditions governing those changes in a robust and efficient way?" What does "robust and efficient" mean here? To really assess a modeling language, in addition to an analysis of what can be expressed in comparison to others, dedicated user studies would be needed. So this needs further clarification.

The confusion here arises from different viewpoints about the target group. We aim this review at biologists who want to use existing tools to describe and model multi-state biological systems, and not primarily at researchers concerned with the (albeit very important) areas of tool development and validation. We therefore intended terms like "robust" and "efficient" in the colloquial sense, i.e. how can a model be specified so that the potential for errors is minimised and so that the task can be completed in an amount of time that is practicable for use in research. Once these two conditions are fulfilled, we consider the specification problem solved for our purposes (again, in the colloquial sense of "solved"). We do recognise, however, that readers used to more stringent definitions might be thrown by the word "solve", so we have re-phrased the relevant passages to avoid the word "solve" as much as possible. --Mstefan (talk) 20:25, 23 February 2014 (PST)


Also, I have a problem with the phrase "solving the computation problem", as no problem has been defined that can be solved: "Can the model be stored electronically? And can it be evaluated in a reasonable amount of computing time? We call this problem the computation problem". What is a reasonable amount of computing time? Please also note that the agent-based approach might be more storage-demanding than the population-based approach.

We recognise that the problem is not well-posed enough to speak of a "solution", and have therefore rephrased the relevant passages. --Mstefan (talk) 20:46, 23 February 2014 (PST)

"although the converse is, in general, true." There is a plethora of research on executing models more efficiently (e.g., parallel, approximate, and hybrid execution algorithms). Those approaches often do not consider how easy it is to define the models. So it is not true that all approaches that address the computation problem also address the problem of how to support succinct, compact modeling.

This is true. We have now removed that statement. --Mstefan (talk) 19:53, 23 February 2014 (PST)

Population-based Rule Evaluation:

"Some of the best-known classes of simulation approaches in computational biology belong to the PRE family, including ordinary and partial differential equations (ODE/PDE) and the Gillespie stochastic simulation algorithm."

Please note that ordinary and partial differential equations are not simulation approaches, but mathematical modeling approaches. A model specified with ODEs or PDEs is typically computed by (numerical) simulation methods. So numerical solvers, finite element methods, etc. might belong to the PRE family.

We have now corrected this. --Mstefan (talk) 20:39, 23 February 2014 (PST)
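
To make the distinction raised above concrete, here is a minimal sketch that keeps the model (an ODE) separate from the simulation method (a numerical integrator). The first-order decay model, the forward Euler scheme, and the names `dAdt` and `euler_integrate` are assumptions chosen for illustration; they are not taken from the article.

```python
def dAdt(a, k):
    """The model: an ODE for first-order decay, dA/dt = -k * A."""
    return -k * a


def euler_integrate(rhs, a0, k, dt, n_steps):
    """The simulation method: forward Euler integration of the model above."""
    a = a0
    trajectory = [a]
    for _ in range(n_steps):
        a = a + dt * rhs(a, k)  # one explicit Euler step
        trajectory.append(a)
    return trajectory


# Integrate from t = 0 to t = 2 with a step of 0.01.
traj = euler_integrate(dAdt, a0=100.0, k=0.5, dt=0.01, n_steps=200)
print(traj[-1])  # close to, but not exactly, the analytic value 100 * exp(-1)
```

The same `dAdt` could be handed to a different integrator without being changed, which is why the equation itself is better described as a model than as a simulation approach.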


"This means that the space of all possible states is very large. In general, when using ODE/PDE or the Gillespie stochastic algorithm, all possible pools of molecules and the reactions they undergo are defined at the start of the simulation, even if they are empty."

Again, you mean numerical integration and finite element methods, don't you? ODEs/PDEs are not computation algorithms.

Indeed. This has now been corrected. --Mstefan (talk) 07:59, 26 February 2014 (PST)

Also, Gillespie stochastic algorithms do not require that everything be defined from the beginning. For example, variable or dynamic structure models change their structure during simulation, so the state space is not fixed at the start of the simulation. See, for example, process algebra models that are executed based on SSAs.

This was indeed misleadingly formulated. We have rewritten it to say that the need to enumerate all states is not a general feature of population-based methods, but is only required by some of the implementations (the "generate-first" implementations). Further down, when we talk about on-the-fly generation, we added a sentence saying that these exist both for deterministic and stochastic methods. We also point to a review of the topic that has further details. --Mstefan (talk) 07:59, 26 February 2014 (PST)
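
As a rough illustration of the point discussed here, the following sketch runs a Gillespie-style stochastic simulation in which population pools are created only when they first become populated, instead of being enumerated up front. It is a toy under stated assumptions (independent binary sites, irreversible modification with one shared rate constant `k`) and does not reproduce the algorithm of any particular tool; the function name `ssa_on_the_fly` is hypothetical.

```python
import random


def ssa_on_the_fly(n_sites, n_molecules, k, t_end, seed=0):
    """Population-based SSA whose pools are generated lazily.

    Each molecule has n_sites binary sites; every unmodified site becomes
    modified at rate k. Returns the populated pools and all states ever seen.
    """
    rng = random.Random(seed)
    pools = {(0,) * n_sites: n_molecules}  # only the initially populated pool exists
    seen = set(pools)
    t = 0.0
    while True:
        # Build reactions only for pools that are currently populated.
        events = [(k * count, state, site)
                  for state, count in pools.items()
                  for site in range(n_sites) if state[site] == 0]
        total = sum(rate for rate, _, _ in events)
        if total == 0.0:
            break  # nothing left to react
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose one reaction with probability proportional to its propensity.
        r = rng.uniform(0.0, total)
        acc = 0.0
        for rate, state, site in events:
            acc += rate
            if r <= acc:
                break
        new_state = state[:site] + (1,) + state[site + 1:]
        pools[state] -= 1
        if pools[state] == 0:
            del pools[state]  # drop pools that have become empty
        pools[new_state] = pools.get(new_state, 0) + 1  # create the pool on demand
        seen.add(new_state)
    return pools, seen


pools, seen = ssa_on_the_fly(n_sites=10, n_molecules=5, k=1.0, t_end=0.1)
print(len(seen), "states visited out of", 2 ** 10, "possible")
```

With 5 molecules and 10 sites, at most 51 distinct states can ever be visited, whereas a generate-first implementation would define all 1024 pools before the first step.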

Agent-based Rule Evaluation:

To illuminate how the approach works, a figure would be nice, maybe similar to the one which can be found in the paper about NFSim [13].

This is indeed a great suggestion. We have added a figure (the new figure 2) to illustrate the principle. --Mstefan (talk) 12:14, 26 February 2014 (PST)

"agents rather than populations, it comes at a higher computational cost when simple systems are modeled."

Could you clarify what is meant by 'simple', e.g., few types of species, each present in high numbers?

Yes, this is what we mean. We have clarified it now. --Mstefan (talk) 12:19, 26 February 2014 (PST)

Regarding pros and cons, the following paper might be of interest: Justin S. Hogg, Leonard A. Harris, Lori J. Stover, Niketh S. Nair, James R. Faeder: Exact hybrid particle/population simulation of rule-based models of biochemical systems (http://arxiv.org/abs/1301.6854).

Thank you for alerting us to this paper. We now mention it in the discussion of computational cost, and have added a paragraph describing the HPP approach and its implementation in NFSim. --Mstefan (talk) 14:05, 26 February 2014 (PST)

"This method reduces the complexity of the model not only at the specification stage"

How does the agent-based evaluation (you are now at the level of executing a model) have an influence on how you specify it? This is not clear to me. Please clarify.

This was misleadingly formulated. We have removed the "not only ... but also" part of the phrase, so it no longer refers to specification. --Mstefan (talk) 13:56, 26 February 2014 (PST)

"Reactions (or rules) are specified as applying to types of entities (e.g. "a monomeric CaMKII subunit"). Specific state configurations can modify the rate of a reaction, and conversely, a reaction can affect a molecule's state. Most agent-based modeling systems include mechanisms by which not only the state of a protein undergoing a reaction, but also the states of its neighbors in a holoenzyme or protein complex, can affect a reaction."

This refers to the question of what you can describe (agent-based modeling systems), particularly constraints etc. (see also the rule-based approaches above, which might be of further interest in this respect). Here, the specification problem is mixed with execution. This is rather confusing for the reader.

This was indeed confusing. We have rewritten this section to achieve a better separation between specification (in the upper sections) and execution (in this section). --Mstefan (talk) 14:14, 26 February 2014 (PST)

"To date, all modeling packages that solve the computational problem are agent-based."

What do you mean by "modeling package" - currently you refer to the computation of rule-based models, so do you mean execution packages? Please note: there are other possibilities to speed up the execution of models, such as hybrid approaches, approximate ones (e.g., tau-leaping), exploiting parallelism, architectures like GPUs, or even approaches that "configure" simulators on demand (e.g., by learning) - (at least) some of those have also been applied to rule-based languages.

This statement has now been removed. --Mstefan (talk) 14:12, 26 February 2014 (PST)

Non-spatial agent-based methods:

Spatial agent-based methods:

I am confused by this paragraph. We are still in the context of executing a rule-based model in an agent-based (object-, particle-based) manner, right? Are models defined in a rule-based manner in Meredys and MCell? How is the problem of specifying multi-state systems addressed there? The relation to the paragraphs above is not clear to me. Please note: spatial simulation again poses new challenges, for specification as well as execution.

Yes, indeed, we are talking about particle-based methods that allow for the simulation of multi-state molecules. MCell uses an ad-hoc specification, which we briefly explain in the article (the "slot-and-state" model). We have removed the paragraph about Meredys, because the software seems to be no longer maintained, which does indeed make it difficult to gather information about the input/specification format (among other things). The difference between tools that can read a specification file created with another tool (most often, BioNetGen), and tools that use an ad-hoc way of specifying a model (like StochSim and MCell do) is something the other reviewer also pointed out, and something that we now explicitly mention. --Mstefan (talk) 18:42, 26 February 2014 (PST)


I understood your line of argument: first, to make people aware of the problems associated with modeling and simulating cell-biological multi-state systems; then to present rules / rule-schemata as one (attractive) modeling approach which avoids enumeration, list some approaches and discuss how they differ (this could be made clearer); and then to turn to the problem of executing these rule-based models and possibilities to speed up their execution, where your focus is on agent-based (particle-, object-based) in comparison to population-based approaches. However, at some points I had problems following this line of argument.

We believe that by incorporating the feedback from both our reviewers, this line of argument has now become much clearer, and the article stronger. --Mstefan (talk) 18:48, 26 February 2014 (PST)