Difference between revisions of "Metro Data Model"

From Funtoo Linux
(I placed this in the Metro category.)
 
=== Commenting out Multi-Line Element References ===
 
 
If you want to disable a multi-line reference, simply prefix it with some text, such as <tt>#</tt>:
 
 
<pre>myscript: [
#!/bin/bash
# disabling this for now: $[[steps/setup]]
echo Hi There :)
]</pre>
 
This comes in handy when you are commenting out a portion of code. If the multi-line reference has any non-whitespace text on the same line, then it will not be expanded and will be passed to the output as literal text.
 
 
In contrast, single line element references will always be expanded wherever they are found in the multi-line element, even if they are &quot;commented out&quot; in the script.
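
The rule above can be illustrated with a short Python sketch. This is not Metro's actual implementation; the function name <tt>expand_blocks</tt> and the <tt>blocks</tt> dictionary are hypothetical, and the only behavior modeled is that a multi-line reference is expanded when nothing but whitespace shares its line.

```python
import re

# A multi-line reference counts only when it is alone on its line
# (surrounding whitespace is allowed). Hypothetical sketch, not Metro code.
BLOCK_REF = re.compile(r"^\s*\$\[\[([^\[\]]+)\]\]\s*$")


def expand_blocks(lines, blocks):
    """Expand $[[name]] lines using `blocks`, a dict of name -> list of lines."""
    out = []
    for line in lines:
        match = BLOCK_REF.match(line)
        if match:
            # The reference stands alone, so splice in the referenced lines.
            out.extend(blocks[match.group(1)])
        else:
            # Any other text on the line keeps the reference literal.
            out.append(line)
    return out
```

With this sketch, a line such as <tt># disabling this for now: $[[steps/setup]]</tt> passes through unchanged, while a bare <tt>$[[steps/setup]]</tt> line is replaced by the lines of <tt>steps/setup</tt>.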
 
 
 
=== Multi-Line Element as Single-Line References ===
 
 
As of Metro 1.3.1, it is possible to reference a multi-line element using the standard &quot;<tt>$[ ]</tt>&quot; reference. When this is done, the multi-line element is transformed into a single-line element by concatenating its lines together, separated by whitespace. This allows multi-line elements to be used to define single-line data, as follows:
 
 
<pre>packages: [
    sys-apps/portage
    sys-devel/gcc
    dev-util/git
]

myscript: [
    emerge -C $[packages]
]</pre>
 
In the above example, the <tt>$[packages]</tt> reference will expand to the single-line value <tt>sys-apps/portage sys-devel/gcc dev-util/git</tt>.
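
The flattening step can be sketched in Python as follows. The function name <tt>flatten</tt> is hypothetical, and the choices to strip each line and to skip blank lines are assumptions of this sketch rather than documented Metro behavior.

```python
def flatten(lines):
    """Join the lines of a multi-line element into one single-line value.

    Assumption of this sketch: each line is stripped, blank lines are
    skipped, and the remaining lines are separated by one space.
    """
    return " ".join(line.strip() for line in lines if line.strip())
```

Applied to the three lines of <tt>packages</tt> above, this yields the same single-line value that the text describes.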
 
 
== Special Variable Expansion ==
 
 
As of Metro 1.1, Metro's template engine has a few new features related to variable expansion. The first is the ability to expand a single-line variable using a <tt>?</tt>, as follows:
 
 
<pre>foo: [
#!/bin/bash
if [ &quot;$[path/mirror?]&quot; = &quot;yes&quot; ]
then
    echo &quot;path/mirror is defined&quot;
fi
]</pre>
 
With a trailing <tt>?</tt>, a single-line variable will evaluate to a literal <tt>yes</tt> if it is defined, and a literal <tt>no</tt> if it is not defined.
 
 
Another special variable expansion uses a trailing <tt>:zap</tt> suffix for any single-line variable, and is used as follows:
 
 
<pre>foo: [
CFLAGS=$[portage/CFLAGS:zap]
]</pre>
 
Above, if <tt>portage/CFLAGS</tt> is defined, then the variable will expand to the value of <tt>portage/CFLAGS</tt>. However, if <tt>portage/CFLAGS</tt> is not defined, then the ''entire line'' will be omitted from the output.
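
As a rough illustration of the two expansions described in this section, here is a Python sketch. It is not taken from Metro's source; <tt>expand_line</tt> and <tt>render</tt> are hypothetical names, and the sketch does not expand references nested inside element values.

```python
import re

# Matches $[name], $[name?] and $[name:zap]. Hypothetical sketch only.
REF = re.compile(r"\$\[([^\[\]:?]+)(\?|:zap)?\]")


def expand_line(line, elements):
    """Expand one template line; return None if :zap removes the line."""
    zapped = False

    def repl(match):
        nonlocal zapped
        name, suffix = match.group(1), match.group(2)
        if suffix == "?":
            # Trailing "?" reports whether the element is defined.
            return "yes" if name in elements else "no"
        if name in elements:
            return elements[name]
        if suffix == ":zap":
            # Undefined element with :zap: drop the entire line.
            zapped = True
            return ""
        raise KeyError(f"undefined element: {name}")

    expanded = REF.sub(repl, line)
    return None if zapped else expanded


def render(lines, elements):
    """Expand every line and omit the ones removed by :zap."""
    results = (expand_line(line, elements) for line in lines)
    return [line for line in results if line is not None]
```

For example, <tt>render(["CFLAGS=$[portage/CFLAGS:zap]"], {})</tt> returns an empty list, because the only line references an undefined element with <tt>:zap</tt>.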
 
 
== Controlling Parser Behavior ==
 
 
Normally, the following code would cause the parser to throw an exception if <tt>path/mirror</tt> were undefined:
 
 
<pre>foo: [
#!/bin/bash
if [ &quot;$[path/mirror?]&quot; = &quot;yes&quot; ]
then
    echo &quot;$[path/mirror]&quot;
fi
]</pre>
 
The first variable expansion, <tt>$[path/mirror?]</tt>, would not throw an exception if <tt>path/mirror</tt> was not defined - instead, the variable would expand to <tt>no</tt>. However, on the second expansion, <tt>$[path/mirror]</tt>, Metro will abort with an error because the variable has not been defined. In the above code, this is probably not what you want, so we can temporarily put Metro's parser into &quot;lax&quot; mode as follows:
 
 
<pre>[option parse/lax]

foo: [
#!/bin/bash
if [ &quot;$[path/mirror?]&quot; = &quot;yes&quot; ]
then
    echo &quot;$[path/mirror]&quot;
fi
]

[option parse/strict]</pre>
 
The <tt>[option parse/lax]</tt> and <tt>[option parse/strict]</tt> annotations will temporarily enable &quot;lax&quot; parsing mode when expanding the variables defined inside <tt>foo</tt>. When Metro is in &quot;lax&quot; mode and a single-line variable is not defined, it is replaced with a dummy string, which will look something like a literal <tt>[BLANK var=varname]</tt>.
 
 
== Metro 1.2+ Extensions ==
 
 
Metro 1.2 has an extension which allows one to avoid using the coarse-grained <tt>[option parse/lax]</tt> annotation. Rather than using the <tt>[option parse/lax]</tt> and <tt>[option parse/strict]</tt> annotations to delimit variables that contain optionally undefined data, one can use the <tt>:lax</tt> suffix instead, to indicate that the parser should apply &quot;lax&quot; parsing rules when expanding that particular element. For example, the code in the section above could be written as follows in Metro 1.2:
 
 
<pre>foo: [
#!/bin/bash
if [ &quot;$[path/mirror?]&quot; = &quot;yes&quot; ]
then
    echo &quot;$[path/mirror:lax]&quot;
fi
]</pre>
 
This code will not throw an exception, even if <tt>path/mirror</tt> is not defined. In the case that <tt>path/mirror</tt> is not defined, <tt>$[path/mirror:lax]</tt> will be replaced with a dummy value.
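
The difference between strict expansion, block-wide lax mode, and the per-reference <tt>:lax</tt> suffix can be sketched in Python. The function name <tt>expand</tt> and its <tt>lax</tt> parameter are hypothetical, and the placeholder text only approximates the dummy string described earlier; the exact string Metro emits may differ.

```python
import re

# Matches $[name] and $[name:lax]. Hypothetical sketch, not Metro code.
REF = re.compile(r"\$\[([^\[\]:?]+)(:lax)?\]")


def expand(text, elements, lax=False):
    """Expand references in `text`.

    `lax=True` models [option parse/lax] for the whole text; a ":lax"
    suffix relaxes only the reference that carries it.
    """
    def repl(match):
        name, suffix = match.group(1), match.group(2)
        if name in elements:
            return elements[name]
        if lax or suffix == ":lax":
            # Approximation of the dummy string; the real format may differ.
            return f"[BLANK var={name}]"
        # Strict mode: an undefined element is an error.
        raise KeyError(f"undefined element: {name}")

    return REF.sub(repl, text)
```

The per-reference form keeps every other reference in the same element under strict checking, which is why it is the finer-grained of the two options.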
 
 
=== Using Strict Parsing ===
 
 
In general, it's recommended that you use strict parsing behavior whenever possible, as it will allow Metro to detect potential error conditions. Use <tt>[option parse/lax]</tt> and (with Metro 1.2+) the finer-grained <tt>:lax</tt> suffix to enable lax parsing behavior only when necessary.
 
 
[[Category:Metro]]
 

Revision as of 22:36, 17 January 2011


Metro Data Model

Goals

The Metro Data Model has been designed to provide you with an optimal way to organize build data.

Here are the primary goals for the data model:

  1. Provide useful ways to organize data
  2. Use mechanisms and syntax that maximize maintainability of the data over time
  3. Reduce and (ideally) eliminate side-effects at every opportunity

To attain these goals, I've used a functional data model, where an element (variable) can be defined only once, and cannot be redefined.

By default, the Metro parser operates in "strict" mode, which means that it will throw an error if a variable has been referenced that has not been defined. This "strict" mode is actually very useful in catching errors that might otherwise go unnoticed and result in broken builds.

In addition, the Metro parser was designed so that the order in which data elements are defined is not important, even if they reference one another. This was done to eliminate side-effects related to data ordering, where changing the order in which things are defined in a file can change the behavior of or break your code.

Versions of Metro prior to 1.4 contained limited support for conditional logic. After some experimentation, I've decided that the conditional support is not necessary, and it is not used by Metro 1.4. Support for conditionals still exists in the parser, but it will be removed when the parser is rewritten.


First Look

Here is some sample Metro data:

path: /usr/bin

Above, we have defined the element path to have the value /usr/bin. path is a single-line element, and the Metro parser takes care of trimming any trailing whitespace that may be on the line. You can also define single-line elements that have values that consist of multiple whitespace-separated values:

options: ccache replace

Sometimes, you need to define an element but leave it blank. To do this, don't specify any values after the colon:

options:

In Metro, the / character is used to delineate various classes of elements, as follows:

path/mirror: /home/mirror/linux
path/mirror/snapshot: /home/mirror/linux/snapshots
path/metro: /usr/lib/metro

Above, we see the proper Metro convention for specifying paths. Each path has a prefix of path/. We have a path/mirror element but also have a path/mirror/snapshot element. The / is used to organize our data into logical groups. This is not enforced by Metro but is presented here as a best practice.

The data above could also be represented using a section annotation, as follows:

[section path]

mirror: /home/mirror/linux
mirror/snapshot: /home/mirror/linux/snapshots
metro: /usr/lib/metro

Above, the [section path] line is a section annotation, and it tells the Metro parser that the path/ prefix should be applied to all following data elements. A section annotation is in effect until another section annotation is encountered by the parser.
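
To make the prefixing rule concrete, here is a minimal Python sketch of a reader for this kind of data. It is not Metro's parser; the function name parse_elements is hypothetical, it understands only section annotations and single-line "name: value" definitions, and it rejects redefinition to mirror the define-once rule from the Goals section.

```python
def parse_elements(text):
    """Parse 'name: value' lines, applying the current [section ...] prefix."""
    elements = {}
    prefix = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[section ") and line.endswith("]"):
            # A section annotation stays in effect until the next one.
            prefix = line[len("[section "):-1].strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not an element definition: {line!r}")
        # Join the section prefix and the element name with "/".
        full_name = "/".join(part for part in (prefix, name.strip()) if part)
        if full_name in elements:
            # An element can be defined only once.
            raise ValueError(f"element {full_name!r} is already defined")
        elements[full_name] = value.strip()
    return elements
```

Feeding the [section path] example above to this function produces the keys path/mirror, path/mirror/snapshot and path/metro, the same names as in the unsectioned version.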

While our data above is getting more organized, there is some redundancy in our data, which generally isn't a good thing. Here's an example of how to make our data a bit more compact:

[section path]

mirror: /home/mirror/linux
mirror/snapshot: $[path/mirror]/snapshots
metro: /usr/lib/metro

Above, we have used an element reference of $[path/mirror] to reference our path/mirror element. What this means is that path/mirror/snapshot will have a value of /home/mirror/linux/snapshots.

Also, it's worth pointing out that we could just as well have written:


[section path]

mirror/snapshot: $[path/mirror]/snapshots
mirror: /home/mirror/linux
metro: /usr/lib/metro

In other words, it's perfectly OK to use the element reference of $[path/mirror] on a line before the actual definition of path/mirror. Metro doesn't care about the order in which data is defined.
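
One way to get this order independence is to store every definition first and resolve references only when a value is requested. The Python sketch below illustrates that idea under stated assumptions: the function name expand is hypothetical, the circular-reference check is an addition of this sketch, and Metro's real parser may work differently.

```python
import re

# Matches a single-line reference such as $[path/mirror].
REF = re.compile(r"\$\[([^\[\]]+)\]")


def expand(name, elements, _stack=()):
    """Return the value of `name` with all references resolved.

    Resolution happens on demand, so the order in which the elements
    were defined does not matter.
    """
    if name not in elements:
        # Strict behavior: referencing an undefined element is an error.
        raise KeyError(f"undefined element: {name}")
    if name in _stack:
        raise ValueError(f"circular reference involving: {name}")
    return REF.sub(
        lambda match: expand(match.group(1), elements, _stack + (name,)),
        elements[name],
    )
```

Because nothing is resolved while the definitions are being read, path/mirror/snapshot can mention $[path/mirror] before path/mirror itself appears.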

Metro provides another way to organize your data efficiently. If you have a lot of path/mirror-related data, it can be useful to organize it as follows:

[section path]

metro: /usr/lib/metro

[section path/mirror]

: /home/mirror/linux
snapshot: $[]/snapshots
source: $[]/$[source/subarch]/funtoo-$[source/subarch]-$[source/version]/$[source/name].tar.bz2

Above, we have used two new parser features. Inside [section path/mirror], we can define the path/mirror element itself by using a blank element name, followed by a :. The second feature is that we can use $[] to reference the value of the path/mirror element. $[] will always reference the value of the element specified in the section annotation. Also note that as of Metro 1.1, $[:] can be used as an alternate form of $[]. In addition, as of Metro 1.2.4, $[:foo] can be used as an alternate form of $[section-name/foo].
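
The section-relative forms can be thought of as shorthand that is rewritten into fully qualified references. The Python sketch below shows that rewriting; the function name qualify_references is hypothetical and this is an illustration of the rules just described, not code from Metro.

```python
import re

# Matches $[...] with any (possibly empty) name between the brackets.
SHORT_REF = re.compile(r"\$\[([^\[\]]*)\]")


def qualify_references(value, section):
    """Rewrite $[], $[:] and $[:foo] in `value` relative to `section`."""
    def repl(match):
        inner = match.group(1)
        if inner in ("", ":"):
            # $[] and $[:] both mean the section element itself.
            return f"$[{section}]"
        if inner.startswith(":"):
            # $[:foo] means $[section-name/foo].
            return f"$[{section}/{inner[1:]}]"
        # Fully qualified references are left untouched.
        return match.group(0)

    return SHORT_REF.sub(repl, value)
```

For instance, inside [section path/mirror] the value $[]/$[source/subarch] would become $[path/mirror]/$[source/subarch].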

Collect Annotations

Many scripting languages have the notion of an "include" file, or "importing" additional data from a remote file. Metro has this concept as well, but it is implemented in a somewhat different way. You can tell Metro to include data from another file by using a collect annotation.

A collect annotation looks like this:

[collect $[path/metro]/myfile.txt]

Now, we called these things "collect annotations" for a reason: in Metro, they work slightly differently from the way most languages implement include and import. The main difference is that in Metro, a collect annotation does not happen right away. Instead, Metro will add the file to be collected (in this case, that would be the file /usr/lib/metro/myfile.txt, or whatever $[path/metro]/myfile.txt evaluates to) to a collection queue. This means that Metro will read in the contents of the file at some point during parsing, and the data in the file will be available to you by the time the parsing is complete. But because Metro doesn't care about the order in which data is defined, it doesn't have the same concept of "read in the data - right now!" that an include or import statement does in other languages.
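
A collection queue of this kind can be sketched in Python as follows. This is only an illustration: collect_all and the load callable are hypothetical names, collect targets are treated as already-expanded paths, and skipping a file that was already read is an assumption of the sketch rather than documented Metro behavior.

```python
from collections import deque


def collect_all(first_path, load):
    """Gather elements from `first_path` and every file it collects.

    `load(path)` is a hypothetical callable returning a tuple of
    (elements defined in the file, list of paths the file collects).
    """
    elements = {}
    queue = deque([first_path])
    seen = set()
    while queue:
        path = queue.popleft()
        if path in seen:
            # Assumption of this sketch: each file is read at most once.
            continue
        seen.add(path)
        new_elements, collects = load(path)
        for name, value in new_elements.items():
            if name in elements:
                raise ValueError(f"element {name!r} is already defined")
            elements[name] = value
        # Collected files are queued, not read on the spot.
        queue.extend(collects)
    return elements
```

Nothing is expanded while the queue is being drained, so a collected file can reference elements from any other file that ends up in the final set.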

Conditional Collect Annotations

Metro no longer officially supports conditional collect annotations; however, simple collect annotations can be used to make conditional decisions in Metro, as follows:

[collect ./snapshots/$[snapshot/type]]

Above, Metro will collect from a file based on the value of the $[snapshot/type] element. This allows for varying definitions of elements to exist dependent on the value of $[snapshot/type].

In this form, Metro will raise an exception if $[snapshot/type] is undefined or has a value that does not map to a file on disk. If it is possible that $[snapshot/type] may not be defined, use the following format:

[collect ./snapshots/$[snapshot/type:zap]]

Using the :zap modifier, the entire collect argument will be replaced with the empty string if $[snapshot/type] is undefined. If Metro is asked to collect an empty string, it will not throw an exception. So this is a handy way to conditionally disable collection of a file. But please note that for all non-null values of $[snapshot/type], a corresponding file must exist on disk in ./snapshots/ or Metro will throw an exception. :zap is explained in more detail in the "Special Variable Expansion" section, below.

Multi-line elements

Metro supports multi-line elements and they are the foundation of Metro's template engine. A multi-line element can be defined as follows, by using square brackets to delimit multi-line data:

myscript: [
#!/bin/bash
echo $*
]

The terminating closing square bracket should be on a line all by itself.

One of the very useful things about multi-line elements is that they support Metro element references:

myscript: [
#!/bin/bash
echo Metro's path/metro setting is $[path/metro].
]

In the above multi-line element, the $[path/metro] reference will be expanded to contain the appropriate value of the element. It is possible to expand single-line elements inside multi-line elements simply by referencing them using a dollar sign and square brackets.

Metro also allows you to expand multi-line elements inside other multi-line elements. Here's an example of how that works:

myscript: [
#!/bin/bash
$[[steps/setup]]
echo Hi There :)
]