May 2005
Nathalie Chabrier-Rivier, François Fages, Sylvain Soliman,
Projet Contraintes, INRIA Rocquencourt, France
http://contraintes.inria.fr/BIOCHAM
Differences with BIOCHAM 2.2
Learning of (boolean) rules.
Export to xppaut's .ode format.
LTL model checking on numerical traces, and learning of numerical parameters.
A new, more versatile parser, replaced the old one, thanks to Daniel de Rauglaudre.
More shortcuts were introduced for NuSMV queries. Fairness (weak) or dynamic reordering BDD's variables can be imposed on the system. NuSMV 2.2.2 and 2.2.3 is supported.
The vim mode that was developped externally is now merged in the distribution, an emacs mode is available too.
Seasonal bugfixes ;)
Differences with BIOCHAM 2.1
BIOCHAM 2.2 brings more flexibility for plotting simulation results. The simulation can use (and that's the default) an implicit method, and thus handles stiff equations.
Any biocham command can now appear in a Biocham file.
SBML (resp. BIOCHAM) parameters are now imported (resp. exported) to BIOCHAM (resp. SBML) parameters.
Lots of minor bugs were fixed.
Differences with BIOCHAM 2.0
BIOCHAM 2.1 implements the adaptive step size method of step doubling for Runge-Kutta integration method.
Bug fixed on the parsing of arithmetic expressions (implies to add
parentheses around complexes given with stoichiometric coefficients).
Differences with BIOCHAM 1
BIOCHAM 2 introduces the ability to use quantitative models, with stoichiometric coefficients and kinetic laws (examples are available in the EXAMPLES/kinetics directory). These models can be numerically simulated, or used as before for model-checking via a boolean abstraction. Parameters and macros can also be defined, used in rate laws and saved in a separate file.
An import/export function to/from SBML has also been introduced.
Some shortcuts for common CTL queries are now available.
Differences with BIOCHAM 0
BIOCHAM 1 introduces a rich pattern language for defining reaction
rules in a concise manner. The complexation operator is now supposed to
be associative and commutative. The operators Ei and Ai have been
introduced in the query language to quantify over the initial states.
BIOCHAM 1.1 adds the possibility to export the interaction map to Graphviz dot format.
1. Installation
BIOCHAM is precompiled for x486 machines under Linux and under
Windows with Cygwin. BIOCHAM can
be installed in any directory. The command biocham
starts the BIOCHAM interpreter.
The commands ./configure
&& make recompile BIOCHAM if necessary. If a
recompilation
is needed on your machine, you need to install first:
See the file INSTALL of the distribution for more details.
BIOCHAM is a free software protected by the GNU General Public License
GPL version 2. This is an Open Source license that allows free
usage
of this software.
2. Syntax of BIOCHAM
Biochemical
objects:
BIOCHAM manipulates formal objects which represent chemical or
biochemical compounds, ranging from ions, to small molecules,
macromolecules and genes. BIOCHAM objects can be used also to represent
control variables and abstract processes. They are written with the
following syntax:
name = word of alphanumerical and '_' characters beginning with a letter.
molecule =
abstract = @name
A BIOCHAM object is either a "molecule" or an abstract object. The syntax of a molecule has five forms. In the first form, the simplest and the most flexible one, a molecule is simply given a name. Lower and upper case letters are distinguished. The words not, in, and, xor, declare, where, A, E, U, AF, EF, AG, EG, AX, EX cannot be used as names as they are logical connectives or operators for temporal queries (see section below).
The second form serves to denote multimolecular
complexes with the linking operator -. This binary operator is assumed to be associative and
commutative, hence the order of the elements in a complex does not
matter. In the cases where one would like to distinguish between
different orders of association, one can denote the different complexes
with specific names.
The third form serves to write modified forms of molecules, like attaching the set of phosphorylated sites with the operator ~. Several sets can be attached. The order of the elements is irrelevant.
The fourth form denotes a localised molecule. This allows to consider differently a molecule localised in several compartments, and thus to take into acount transport phenomena.
Example: cdk1, cdk1-cycB, cdk1~{tyr15,thr161}-cycB and cdk1::cytoplasm are valid BIOCHAM notations for, respectively, the cyclin dependent kinase one, the complex cdk1 with cyclin B, the phosphorylated form at sites tyrosine 15 and threonine 161 of cdk1 in the complex cdk1-cycB, and the cycline dependent kinase localised in the cytoplasm. (cdk1-cycB)~{tyr15,thr161} is another notation for the same phosphorylated form of the complex without making precise the constituent which is phosphorylated. Note that in this syntax, the complex is considered as formally different.
The fifth syntactical form is used to denote genes or gene
promotors, with a name beginning with #. These objects are assumed to
be unique, which has a consequence on the way reactions involving such
objects are interpreted by BIOCHAM (see below).
Example: DMP1-#p19ARF
can be used to denote the binding of protein DMP1 on the promotor of
the
gene producing protein p19ARF noted #p19ARF.
The same assumption of uniqueness is made on abstract objects that
are noted with a '@'. Abstract objects can be used to represent
particular phases of a process, complete subsystems or abstract
processes.
Example: @UbPro can
be used to denote the Ubiquitine/Proteasome system and write this
formal
object as a catalyst in degradation reaction rules.
BIOCHAM reaction rules are used primarily to represent biochemical
reactions. They can be used also to represent state transitions
involving control variables or abstract processes, or to represent the
main effects of complete subsystems such as protein synthesis by DNA
transcription without introducing RNAs in the model. They are written
with the following syntax:
A solution is thus a sum of objects, the character _ denotes the empty solution. The order and multiplicity of molecules in a solution (1 if not explicitly written) will be ignored for all qualitative operations (like model-checking), only the presence or absence of objects will then be considered.
In such a boolean abstraction of stoichiometric models, a reaction
transforms one solution matching the left-hand side of the rule, into
another solution in which the objects of the right-hand side have been
added. The molecules in the left-hand side of the rule which do not
appear in the right-hand side may be non-deterministically present or
consumed in the resulting solution. This convention reflects the
capability of BIOCHAM to reason about all possible behaviors of the
system with unknown kinetics parameters [1, 3].
Following the uniqueness assumption, molecules marked as "genes" with the '#' notation, or any compound built on such a molecule (such as DMP1- #p19ARF for instance) are not multiplied. These objects remain unique and are deterministically consumed in the form in which they appear in the left-hand side of the rule. The same goes for control variables, noted with a '@', which are deterministically consumed.
Reactions involving a catalyst molecule (i.e. a molecule appearing
in both the left and right hand sides of the rule) are written with the
catalyst molecule between square brackets in the arrow. Similarly a
catalyst reaction can be written between square brackets in the arrow
of
the rule.
The kinetic rates will be only used in numerical operations (like simulation), if no rate is provided zero will be assumed. Pairs of kinetic rates are given for reversible reactions. For floats, the exponential notation 1.0e6 or 7.2E-4 is accepted, but the part before the exponent must then contain a dot (.).
When a name is given in a law, a corresponding parameter or macro will be looked for. Conditions, expressed as conjunctions of simple linear (in)equations, can be used to constrain the use of some law.
Example: cdk1 + cycB =>
cdk1-cycB is a complexation rule. cdk1-cycB
=[Myt1]=> cdk1~thr14-cycB is a phosphorylation rule with
catalyst Mytosine 1. This rule is equivalent to cdk1-cycB + Myt1
=> Myt1 + cdk1~{thr14}-cycB. (1*[RAF]*[RAFK],0.4*[RAF-RAFK])
for RAF + RAFK <=> RAF-RAFK. is a reversible complexation
rule with two different kinetic rates: the product of the
concentrations for the complexation, 0.4 times the complex
concentration for the decomplexation.
Temporal
properties:
The temporal properties of a BIOCHAM model (or of its boolean
abstraction if the original model is quantitative) can be queried in
Computation Tree Logic (CTL). CTL is an extension of propositional
logic
with two path quantifiers for non-determinism:
E, A, and several operators for time:
F, G, X, U. The path quantifier E expresses the existence
of a path, A means for all paths, F means at some time
point (on the path), G means at all time points, X
means
at the next time point, U is a binary operator meaning that a
formula is true until a second formula becomes true.
Furthermore, since it is possible to have an initial state only partially
defined (some molecules are present, some others are absent, but
some are left unspecified), a BIOCHAM query prepends an operator about
initial state to a CTL query. There are two such operators: Ei
and Ai, meaning respectively, "there exists an initial state
such that..." and "for all initial states it is true that...".
query = | ||
object | ||
| (query) | ||
| EF(query) | ||
| AF(query) | ||
| EG(query) | ||
| AG(query) | ||
| E((query)U(query)) | ||
| A((query)U(query)) | ||
| EX(query) | ||
| AX(query) | ||
| !(query) | negation | |
| query & query | conjunction | |
| query '|' query | disjunction | |
| query xor query | exclusive or | |
| query -> query | implication | |
| query <-> query | equivalence | |
| reachable(query) | same as EF(query), i.e. a state where query is true is reachable | |
| stable(query) | same as AG(query), i.e. query is always true | |
| steady(query) | same as EG(query), i.e. query can remain always true | |
| checkpoint(query,query) | same as !(E(!(query1) U query2)), i.e. there is no path leading to the second query without making true the first | |
| loop(query,query) | same as AG((query1 -> EF(query2)) & (query2 -> EF(query1))), i.e. it is always true that if query1 is true query2 can become true and if query2 is true query1 can become true, the system thus oscillates between query1 and query2 | |
| oscil(query) | same as loop(query,!(query)), i.e. the system oscillates between states where query is true and false |
Example: Ei(EF(cycB)) expresses the existence of an initial state such that there exists a path on which at some time point cyclin B is present. Ai(AF(cycB)) expresses that for all initial states and on all paths, cycB is finally present at some time point. Ai(!(E(!(cdc25) U cdk1-cycB))) checks that cdc25 is a checkpoint for the complexation of cdk1 and cycB for all initial states, that is there does not exist a path on which cdc25 is always absent until the complex cdk1-cycB is present.
A query partially describes the state of the system. For instance: cdk1 and MPF and !(cdk7-cycH) represent states of the system where cdk1 and MPF are present and cdk7-cycH is absent, the rest being undetermined.
Patterns are used in the commands described in the next section for searching molecules or reaction
rules of a particular kind in a model. They can be used also with some
restrictions to define the
initial state or reaction rules by writing sets of similar reaction
rules in a concise manner with a single reaction rule pattern. Finally they
are used to specify a range of useful rules during the learning process.
Patterns introduce the special character ? and variables noted with a name beginning with '$', to denote unspecified parts of a molecule. The parts of the molecules matched by ? or a variable can be empty.
object_pattern = molecule_pattern |
abstract
variable = ? | $name
simple_pattern = name | variable
Example: cdk1~? is a pattern representing cdk1 itself and any phosphorylated form of it. cdk1~{tyr15,?} or equivalently cdk1~{tyr15}~? represents any form of cdk1 phosphorylated on site tyrosine 15 at least. cdk1-? represents cdk1 itself and any complex containing cdk1. cdk1~?-? is a pattern representing all forms of cdk1, phosphorylated and/or complexed.
Reaction patterns have the same syntax as reaction rules except that
object patterns are allowed in place of objects and there is an
optional where
construct
which is used to constrain the variables of the pattern. In a solution
pattern, the character ? can match an empty solution.
When used to define the initial state or reaction rules, like in a file or in the command add_rule (see below), variables that appear in a pattern have to be given a range of possibles. This is done through the use of constraints in the rule and of declarations.
The in constraint has its natural meaning. In the constraints the values all and all_simple refer to all the molecules already known by the system (i.e. used before), limited to non-complexed and non-phosphorylated forms in the second case. This means that using these constructs makes the semantics of rule dependant on the context (order of rules, models parts loaded before, etc.), and is thus recommended only to advanced users.
The diff and not in have the opposite meaning (the variable cannot take the value or a value in the set. The second case of not in corresponds to the absence of some phosphorylation site (the name) in a set of phosphorylated sites (the variable).
The phos_form constraint, forces the variable and the object_pattern to differ only by some phosphorylations. more_phos_than forces the variable to be more phosphorylated than the object_pattern.
The submol constraint forces the variable to be a sub-molecule of the pattern, the has_simple_mol_in_common constraint imposing a common sub-molecule.
Note:
When using constraints relating a variable to an object_pattern, variables
appearing in that object_pattern have to be constrained beforehand.
Moreover, to define a variable range (for instance to add a
rule), one has to use for each variable at least one positive
constraint: in or sub_mol, and if the adequate declaraion
exists, phos_form or more_phos_than.
Example: (cdk1~?-? + ?
=> ?) is a pattern for all rules reacting with any form
(phosphorylated or complexed) of cdk1. (? =[Myt1]=> ?)
is a pattern for all rules involving the catalyst Myt1. This pattern
will match all the rules having Myt1 in their left and right-hand
sides,
even if they were not written with the catalyst notation.
This pattern cannot be used to define reaction rules as it can match
unspecified (unconstrained) molecules.
Example:
cdk46~$P +
$cycD => cdk46~$P-$cycD
where
$cycD in {DMP1~?-cycD~?, cycD~?}.
is a reaction pattern which will match all complexation rules of all
phosphorylated forms of cdk46 with all phosphorylated forms of cycD or
DMP1-cycD.
This pattern can be used to define reaction rules. All possible
phosphorylated forms of molecules cdk46, DMP1 and cycD have to be
declared however (see section below) in order to constrain the variable
$P and the different occurrences of ?.
Note that if the three molecules in this pattern had three
phosphorylation sites each, they would have 2³=8 forms each, thus
the pattern would specify 8x(8x8+8)=576 reaction rules! Reaction
patterns must thus be used with care for specifying reaction rules and
only the relevant phosphorylation sites should be declared for a
molecule.
Example:
cdk1~$P-$cyc
=[Wee1]=> cdk1~$P~p2-$cyc
where p2 not
in $P
and $cycA in {cycA, cks1-cycA}
and $cyc in { $cycA, cycB, cycB-cks1}.
is a reaction pattern which specifies the phosphorylation on site p2 of
several cdk1 complexes not already phosphorylated on p2.
This pattern can be used to define reaction. If cdk1 is declared with 3
phosphorylation sites {p1,p2,p3}, there are 4 forms not containing p2
to
combine with the 4 possibilities for the variable $cyc. This reaction
pattern thus expands into 16 reaction rules.
The set of all possible phosphorylated forms of a given molecule can
be declared by associating to the molecule the set of sets of sites
which can be phosphorylated. In these declarations, the function
parts_of can be used to denote all subsets of a set.
Example: the
declaration
declare
cdk2~{{},{p1},{p2},{p1,p2}}.
specifies that cdk2 has two phosphorylation sites p1, p2 and that all
combinations are possible. This declaration is equivalent to
declare
cdk2~parts_of({p1,p2}).
The purpose of declarations is to constrain implicitly the domain of
phosphorylation variables which appear in molecule patterns. Only
phosphorylation sites can be declared. In a reaction pattern used for
defining reaction rules, the other variables, such as complexation
variables, must thus be explicitly
constrained in the where part of the pattern.
Declarations can be entered also at top-level with the command declare,
can be listed with the command declarations
and can be cleared with the command clear_rules.
Note:
Since the aim of declarations is to define the possible phosphorylation sites,
any rule where a molecule appears in a form contradictory with its declaration
will be rejected. Since most molecules appear also in non-phosphorylated form,
a warning will be given when a declaration does not include the empty set (but
you can safely ignore it if you know what you are doing).
The set of all possible localisations for a given molecule can be declare by
associating to the molecule the set of its possible locations or its single
possible localisation.
Example: The
location declarations localise p53::nucleus. and
localise p53::[nucleus]. denote the same localised molecule.
localise Mdm2::[cytoplasm,nucleus]. denotes that the Mdm2 molecule can be localised in the cytoplasm or in the nucleus, but in no others specified locations. This molecule could appear in the rules in the localised forms
Mdm2::cytoplasm or
Mdm2::nucleus or in the non-localised form
Mdm2.
To use a localised form of a molecule in a rule, this form must have been
previously defined with the command
localise. Once the locations of a molecule have been specified, you can add new possible locations with the command
extend_localise , witch is used with the same nomenclatur as localise .
All the localised molecules can be listed with the command list_locations
and cleared with the command clear_rules.
When several cell compartments are considered, this compartments may have different volumes. In this very common case, transport rules occure between compartments whose volume differ eatch other and volume ratio must be taken into account in all kinetic models.
Example: The volume ratio between nucleus and cytoplasm (nuclear volume/cytoplasmic volume) is currently above 1/15. It can defined with the command volume_ratio (15,nucleus),(1,cytoplasm). The volume ratio will automaticaly be considered during numerical simulations for every rules recognized as tranport rules, i.e. every reactions whose at least one reactant and one product are localised in different compartements. If no volume ratio are defined, the default ratio is set to unity (that means all the compartments have the same volume) and a warning is signaled.
All the volume ratio can be listed with the command list_volumes.
3. BIOCHAM files
The name of a BIOCHAM file must be suffixed by .bc. BIOCHAM files may
contain:
4. BIOCHAM commands at top-level
The command biocham
starts the BIOCHAM interpreter. Some BIOCHAM files can be passed as
arguments to the command to be loaded immediately.
Under the BIOCHAM prompt, the following
commands are available, they must be terminated by a dot. Several
commands can be executed by separating them with a ','. Some commands
take a list as argument. Lists are noted between square brackets, e.g.
[a,b,c]. The previous typed commands can be retrieved by pressing
ctrl-p. The commands can be automatically completed by pressing on the
tabulation key. A command can be interrupted by ctrl-c. To quit the
interpreter type quit.
Loading
and exporting files:
load_biocham(file).
the reaction rules of the BIOCHAM file are loaded and taken as current set of rules. The initial state is initialized according to the declarations in the file.
export_biocham(file).
saves the current BIOCHAM set of rules and initial state in a BIOCHAM file.
expand_biocham(file).
saves the current BIOCHAM set of rule instances and initial state in a
BIOCHAM file (all reaction rule patterns are expanded).
export_param(file).
saves the current initial state, macros and the value of parameters in a BIOCHAM file.
export_dot(file).
exports the current BIOCHAM set of rules and initial state into a .dot file. That file can then be used to generate pictures of the interaction map, with for instance tools of the Graphviz suite.
For instance typing the command
dot
-Tps file.dot > file.ps
in a terminal, generates a PNG image of the map, like the one shown
below (for EXAMPLES/MAPK/mapk.bc):
export_dot(file,
list_of_options).
exports the current BIOCHAM set of rules and initial state into a .dot file (see above). The list of options can contain: init_up, to force the molecules present in the initial state to be shown on the top of the image (see example above); mod_double, to show enzymes catalyzing a reaction or modifiers of some kind (i.e. both input and output) with a double arrow (input/output) instead of a simple dashed arrow; col_path, to color in red the latest pathway returned by a CTL query; double_size, to produce A3 sized graphs instead of the default A4.
Examples of invocation:
export_dot(test,[init_up,mod_double,col_path]).
to use all of the options, and save the result in the file test.dot;
export_dot(test,[init_up]).
to obtain an image like that of the above example (after using dot).
export_nusmv(file).
exports the current BIOCHAM set of rules
and initial state in a SMV file. Notations for molecules are translated
by replacing the characters _ ( ) } , and ~{ by respectively __ _L _R
_r
_l _c. Furthermore the names B, F, G, H, O, S, T, V, W, Y, Z are
prefixed by _.
export_lotos(file).
export_sbml(file).
load_sbml(file),
add_sbml(file).
export_ode(file).
About
reaction rules:
list_rules.
list_rules(reaction_pattern).
lists the current set of rules matching
the given pattern.
Example: list_rules(?
=>
cycA~?-? + ?) will list all the rules containing any form of
cyclin A in the right hand side.
expand_rules.
expand_rules(reaction_pattern).
lists all the instances of the current set of rules matching a given pattern.
add_rule(reaction_pattern), add_rule(set_of_reaction_patterns).
adds reaction rules to the current set of rules.
delete_rules(reaction_pattern), delete_rules(set_of_reaction_patterns).
deletes all reaction rules matching one pattern from the current set of rules.
clear_rules.
rule(integer), rule(name).
shows the n-th rule (after expansion), or
the
rule(s) with the corresponding name (i.e. of the form name
:...).
pathway(list_of_integers), pathway.
shows the rules of corresponding number (after expansion). If no list is given and the trace corresponding to a query has been generated, then use that trace as list of rules.
show_kinetics.
list all kinetic information (initial
concentration and evolution from rules) for each molecule (one line per
molecule).
About the molecules contained in
the rules:
declare(molecule_declaration).
declares the set of
all phosphorylated forms of a molecule.
declarations.
lists all the declarations of
phosphorylated forms of molecules.
list_molecules.
lists the molecules contained in all
instances of the current set of rules.
list_molecules(object_pattern), list_molecules(set_of_object_patterns).
lists the objects (appearing in the
instances of the current set of rules) and matching one of the given
object patterns.
check_molecules.
About locations and volumes:
localise(localised_molecule).
defines (for the first time) the set of possible locations of a molecule.
extend_localise(localised_molecule).
defines the set of possible locations of a molecule witch have already been localised.
list_locations.
lists all the localised molecules.
volume_ratio(list_of_volume_ratio).
defines the volume ratio . The list_of_volume_ratio is written as couples (integer,location_name), witch are separed with a comma. If no volume ratio is defined for one or several compartments, the default corresponding ratio is set to unity and a warning is signaled.
list_volumes.
lists all the previously defined volume ratio.
About the initial state:
The initial state can be partially defined by giving the list of
objects which are present in the initial state and the list of objects
which are absent from the initial state. The other objects can be
present or absent. When nothing is precised, present objects are given
a
default concentration of 1.
initial_state.
clear_initial_state.
present(object_pattern),
present(set_of_object_patterns).
all objects (appearing in the instances of the current set of rules) and matching one of the given object patterns are made present in the initial state.
Example: present({cycA,
cdk1, cycE~?}).
present(object_pattern,float),
present(object_pattern,name).
gives to the given object the given initial concentration, which might be a parameter name (see below).
Example: present(MAPK,0.3).
absent(object_pattern),
absent(set_of_object_patterns).
all objects (appearing in the instances of the current set of rules) and matching one of the given object patterns are made absent from the initial state.
undefined(object_pattern),
undefined(set_of_object_patterns).
all objects (appearing in the instances
of the current set of rules) and matching one of the given object
patterns are made possibly present or absent in the initial state.
make_present_not_absent.
make_absent_not_present.
parameter(name,float).
parameter(name).
list_parameters.
macro(name,simple_kinetics).
macro(name).
list_macros.
About the simulators
For the boolean simulator, the initial state is completely defined as the set of objects declared as present. All the other objects are considered as absent from the initial state. The simulator prints the objects present in the successive states of the simulation. The set of objects which are printed can be specified with patterns. By default all objects are printed.
boolean_simulation,
boolean_simulation(integer).
performs a random simulation up to a given number of transitions (default is 100).
boolean_enumeration.
numerical_simulation, numerical_simulation(number).
performs a numerical simulation (using one of the methods detailed below) up to a given number of time units (default is 20).
simulation_method, simulation_method(name).
Shows the current simulation method and the corresponding details (error, step_size, etc.). If a parameter is given, sets the desired simulation method. The currently available choices are: rk and stiff. The first one is a fourth-order Runge-Kutta method, with or without step doubling (see below), the second one a Rosenbrock method (implicit), with variable step size (this is the default choice).
step_size,
step_size(float).
step_doubling, step_doubling(float), no_step_doubling.
export_plot(file_template).
show_molecules(object_pattern),
show_molecules(set_of_object_patterns).
hide_molecules(object_pattern),
hide_molecules(set_of_object_patterns).
show_macros.
hide_macros.
show_hide.
keep_plot.
set_color(object, integer), set_color(name, integer).
test_plot.
set_xmin(float), set_xmax(float), set_ymin(float),
set_ymax(float).
Note that because BIOCHAM boolean models are highly
non-deterministic, random simulation is often impractical and not
representative of all possible behaviors. It is therefore much more
interesting to query the temporal properties of BIOCHAM models w.r.t.
all possible behaviors in CTL logic.
About
temporal queries
Formula in Computation Tree Logic (CTL) express the temporal properties of a model w.r.t. all possible runs. BIOCHAM queries are evaluated with a model-checker.
nusmv(biocham_query).
evaluates a temporal query using the
model checker NuSMV. The first call to this command compiles the rules
and may take a while.
Examples:
nusmv(Ei(EF(cdk1))).
explains the result of the last query by
producing a witness pathway when this is possible.
like nusmv(biocham_query),why. except that if the query is false, the model checker does not compute the query twice.
fairness_path, no_fairness_path.
nusmv_dynamic_reordering, nusmv_disable_dynamic_reordering
check_reachable(query).
Tests for the reachability of the
corresponding the state query. This is the same as asking: nusmv(Ei(EF(query))).
check_checkpoint(query1,query2).
Tests if the
first state query1 is a necessary step for obtaining the second state
query. This
is the same as asking: nusmv(Ai(!(E(!(query1)
U query2)))).
check_stable(query).
Tests if the state query is stable. This is the same as asking: nusmv(Ai(AG(query))).
check_steady(query).Tests if the state query is steady. This is the same as asking: nusmv(Ei(EG(query))).
check_loop(query1,query2).
check_oscil(query).
This is the same as loop(query,!(query)), i.e. nusmv(Ai(AG((query -> EF(!(query))) & (!(query)->EF(query))))).
Learning rules from a CTL specification
Since the rule language and property language have both been formalized, one
can use Machine Learning techniques to try and find completions or
modifications of a model such that a given set of CTL properties (the
specification) representing the expected behavior of the model (as
observed for instance by wet lab experiments) is true.
This is the very place where reaction_shortcuts are useful to define
the reaction patterns of interest.
add_spec(biocham_query). add_specs(set_of_biocham_queries).
Model-checking numerical models with LTL and constraints over reals
For models with kinetic information, it is also possible to query a trace in LTL. LTL formulae have the following syntax:
ltl_query = | ||
condition | (*) |
|
| (ltl_query) | ||
| F(query) | finally | |
| G(query) | globally | |
| X(query) | next | |
| (ltl_query)U(ltl_query) | until | |
| !(ltl_query) | negation | |
| ltl_query & ltl_query | conjunction | |
| ltl_query '|' ltl_query | disjunction | |
| ltl_query xor ltl_query | exclusive or | |
| ltl_query -> ltl_query | implication | |
| ltl_query <-> ltl_query | equivalence | |
| oscil(molecule, int) | see below | |
| cross(molecule, molecule, int) | see below |
(*): For LTL queries, derivatives of concentrations can also appear in the usual conditions with the syntax: d([Molecule])/dt.
The two special queries oscil(M,n) and cross(M1,M2,n) are abbreviations for: n successive repetitions of, in the first case alternance between positive and negative in the sign of d([X])/dt, in the second case crossings between [M1] and [M2].
Example: G(([RAF-RAFK] >= [RAF~{p1}]) U (d([RAF])/dti < 0.3)) expresses that all along the trace, the concentration of the RAF-RAFK complex is greater than that of phosphorylated RAF, until the derivative of the concentration of RAF becomes lower than 0.3.
Example: oscil(cycB,3) expresses the fact that, along the trace, the concentration of cycB goes up and down thrice.
Note that the notion of next state refers to the following
state as
computed by the (variable step size) simulation, and thus does not
necessarily
imply real-time neighborhood.
A trace being seen as a linear succession of states, it is possible to
use the
following command:
trace_check(ltl_query).
evaluates an LTL query on the latest trace produced (if none exists, one will be generated by numerical_simulation).
Note that since only the states effectively computed during the
simulation
are checked, a true notion of "equality" would not be meaningful
(always
false), an approximation is thus used but it is an under-approximation
(A = B implies that the two values are really equal at some point, maybe
between
two simulation points, but the converse does not hold).
The notion of next state, using the X operator can also
be a
little surprising for the same reasons: it relates to the next
calculated state.
In the same spirit as what is done for rule learning w.r.t. CTL specifications, one can use the above LTL model-checking techniques to automatize the search for parameter values satisfying some LTL specifications.
trace_get(list_of_name,list_of_pairs_of_floats,float,ltl_query,number).
Tries to find a value for the parameters of given names, between the min and max values given, with the given number of iterations (for each parameter, i.e. if 10 is the given numer 10 to the power of the numer of parameters iterations might happen!), such that the ltl_query representing a specification of the system is true for siulations up to the given time.
Example: trace_get([k],[(0,10)],100,F(([X]
> 0.3) & F(d([X])/dt =< 0)),20).
This command will search for a
value of parameter 'k' between 0 and 10 such that, before time 20, [X]
gets greater than 0.3 and later [X] decreases. The total number of iterations
will not be more than 100 to the power of 1, i.e. 100.
Since initial concentrations can be defined as parameters, it is possible to combine kinetic parameter search and initial concentration search.
Some secondary commands are also supplied to get information out of numerical traces.
get_max_from_trace(molecule),
get_min_from_trace(molecule).
Prints out the maximal/minimal value of the concentration of the given molecule for the current trace, and the corresponding time value.
set_init_from_trace(float).
Takes the closest calculated time point in the current simulation trace, and sets the initial state according to values at that time point. This can actually change some parameter values if certain molecules have their initial concentration set to a parameter.
Bibliography
[1] Modeling and querying biomolecular interaction networks by
Nathalie Chabrier, Marc Chiaverini, Vincent Danos, François
Fages
and Vincent Schächter.
To appear in Theoretical Computer Science, 2004. Available as pdf.
[2] The Biochemical Abstract Machine BIOCHAM by Nathalie
Chabrier, François Fages and Sylvain Soliman.
Proceedings of the Second International Workshop on Computational
Methods in Systems Biology CMSB'04, Springer-Verlag, Paris, France, May
2004. Available as ps.gz.
[3] Symbolic model checking of biochemical networks by
Nathalie Chabrier and François Fages.
Proceedings of the First International Workshop on Computational
Methods in Systems Biology CMSB'03, Springer-Verlag LNCS 2602,
pp.149-162, Rovereto, Italy, March 2003. Available as ps.
[4] Symbolic model-checking for biochemical systems: logic programming steps towards formal
biology (invited tutorial) by François Fages.
International Conference on Logic Programming ICLP'03, Mumbai, India.
December 2003. MIT Press. Slides available as ppt.
[5] Process Calculi and Biology of Molecular Networks. Cooperative
Research Action ARC CPBIO. http://contraintes.inria.fr/cpbio/
[6] CMBSlib: a
library for comparing formalisms and models of biological systems.
by Sylvain Soliman and François Fages.
Proceedings of the Second International Workshop on Computational
Methods in Systems Biology CMSB'04, Springer-Verlag, Paris, France, May
2004. Available as ps.gz.