Difference between pages "Funtoo Filesystem Guide, Part 1" and "ConsoleOutput MediaWiki Extension"

From Funtoo
{{Article
The ConsoleOutput MediaWiki extension was created by Daniel Robbins to provide highlighting of user input for interactive terminal session blocks. To use it, surround the terminal session text with <tt>&lt;console&gt;</tt> opening and closing tags. This tag works similarly to a <tt>&lt;pre&gt;</tt> tag, and preserves output formatting in the quoted text.
|Subtitle=Journaling and ReiserFS
|Author=Drobbins
|Next in Series=Funtoo Filesystem Guide, Part 2
}}
=== What's in Store ===
The purpose of this series is to give you a solid, practical introduction to Linux's various new filesystems, including ReiserFS, XFS, JFS, GFS, ext3 and others. I want to equip you with the necessary practical knowledge you need to actually start using these filesystems. My goal is to help you avoid as many potential pitfalls as possible; this means that we're going to take a careful look at filesystem stability, performance issues (both good and bad), any negative application interactions that you should be aware of, the best kernel/patch combinations, and more. Consider this series an "insider's guide" to these next-generation filesystems.


So, that's what's in store. But to begin this series, I'm going to diverge from this plan for just one article and prepare you for the journey ahead. I'll cover two topics very important to the Linux development community -- journaling, and the design vision behind ReiserFS. Journaling is very important because it's a technology that we've been anticipating for a long time, and it's finally here. It's used in ReiserFS, XFS, JFS, ext3 and GFS. It's important to understand exactly what journaling does and why Linux needs it. Even if you have a good grasp of journaling, I hope that my journaling intro will serve as a good model for explaining the technology to others, something that'll be common practice as departments and organizations worldwide begin transitioning to these new journaling filesystems. Often, this process begins with a "Linux guy/gal" such as yourself convincing others that it's the right thing to do.
To highlight text typed by a user, rather than program output, put a <tt>##i##</tt> input code immediately before user input on each line. This will cause all text from the <tt>##i##</tt> to the end of the line to be highlighted in orange to set it apart from the prompt and other program output.


In the second half of this article, we're going to take a look at the design vision behind ReiserFS. By doing so, we're going to get a good grasp on the fact that these new filesystems aren't just about doing the same old thing a bit faster. They also allow us to do things in ways that simply weren't possible before. Developers, keep this in mind as you read this series. The capabilities of these new filesystems will likely affect how you code your future Linux software development projects.
* {{c|##i##}} - Tag all following text on this line as user input.
* {{c|##b##}} - Highlight the rest of the line in bold.
* {{c|##b##text here##!b##}} - Highlight the text between both markers in bold.
* {{c|##i##text here##!i##}} - Highlight the text between both markers as user input.
* {{c|##g##}} - Green
* {{c|##y##}} - Yellow
* {{c|##bl##}} - Blue
* {{c|##r##}} - Red


=== Understanding Journaling: Meta-data ===
=== Examples ===
As you well know, filesystems exist to allow you to store, retrieve and manipulate data. And, in order to do this, a filesystem needs to maintain an internal data structure that keeps all your data organized and readily accessible. This internal data structure (literally, "the data about the data") is called meta-data. It is the structure of this meta-data that gives a filesystem its particular identity and performance characteristics.
Here are a few examples of the ConsoleOutput extension. First, this is how you might typically display {{c|ls}} output, with a particular directory highlighted:


Normally, we don't interact with a filesystem's meta-data directly. Instead, a specific Linux filesystem driver takes care of that job for us. A Linux filesystem driver is specially written to manipulate this maze of meta-data. However, in order for the filesystem driver to work properly, it has one important requirement; it expects to find the meta-data in some kind of reasonable, consistent, non-corrupted state. Otherwise, the filesystem driver won't be able to understand or manipulate the meta-data, and you won't be able to access your files.
<console>
www@www-smw ~/public_html $ ##i##ls
COPYING  LocalSettings.php    api.php  ##b##extensions##!b##  index.php  maintenance          redirect.php    skins              thumb_handler.php5
CREDITS  README                api.php5  images        index.php5  mw-config            redirect.php5  tests              wiki.phtml
FAQ      RELEASE-NOTES-1.19    bin      img_auth.php  languages  opensearch_desc.php  redirect.phtml  thumb.php
HISTORY  StartProfiler.sample  cache    img_auth.php5  load.php    opensearch_desc.php5  resources      thumb.php5
INSTALL  UPGRADE              docs      includes      load.php5  profileinfo.php      serialized      thumb_handler.php
www@www-smw ~/public_html $ ##i##cd extensions/
</console>


=== Understanding Journaling: fsck ===
And here is how you might display a more detailed example of console output, using colors:
This is where <span style="color:green">fsck</span> comes in. When a Linux system boots, <span style="color:green">fsck</span> starts up and scans all local filesystems listed in the system's '''/etc/fstab''' file. <span style="color:green">fsck</span>'s job is to ensure that the to-be-mounted filesystems' meta-data is in a usable state. Most of the time, it is. When Linux shuts down, it carefully flushes all cached data to disk and ensures that the filesystem is cleanly unmounted, so that it's ready for use when the system starts up again. Typically, <span style="color:green">fsck</span> scans the to-be-mounted filesystems and finds that they were cleanly unmounted, and makes the reasonable assumption that all meta-data is OK.


However, we all know that every now and then, something atypical happens, such as an unexpected power failure or system lock-up. When these unfortunate situations occur, Linux doesn't have the opportunity to cleanly unmount the filesystem. When the system is rebooted and <span style="color:green">fsck</span> starts its scan, it detects that these filesystems were not cleanly unmounted and makes a reasonable assumption that the filesystems probably aren't ready to be seen by the Linux filesystem drivers. It's very likely that the meta-data is messed up in some way.
{{console|body=
 
# ##i##bluetoothctl
So, to fix this situation, <span style="color:green">fsck</span> will begin an exhaustive scan and sanity check on the meta-data, correcting any errors that it finds along the way. Once fsck is complete, the filesystem is ready for use. Although some recently-modified data may have been lost due to the unexpected power failure or system lockup, since the meta-data is now consistent, the filesystem is ready to be mounted and be put to use.
[##g##NEW##!g##] Controller 00:02:72:C9:62:65 antec [default]
 
##bl##[bluetooth]##!bl### ##i##power on
=== The Problem With fsck ===
Changing power on succeeded
So far, this may not sound like a bad approach to ensuring filesystem consistency, but the solution isn't optimal. Problems arise from the fact that <span style="color:green">fsck</span> must scan a filesystem's entire meta-data in order to ensure filesystem consistency. Doing a complete consistency check on all meta-data is a time-consuming task in itself, normally taking at least several minutes to complete. Even worse, the bigger the filesystem, the longer this exhaustive scan takes. This is a big problem, because while <span style="color:green">fsck</span> is doing its thing, your Linux system is effectively offline, and if you have a large amount of filesystem storage, your system could be <span style="color:green">fsck</span>-ing for half an hour or more. Of course, standard <span style="color:green">fsck</span> behavior can have devastating results in mission-critical datacenter environments where system uptime is extremely important. Fortunately, there's a better solution.
##bl##[bluetooth]##!bl### ##i##agent on
 
Agent registered
=== The Journal ===
##bl##[bluetooth]##!bl### ##i##scan on
Journaling filesystems solve this <span style="color:green">fsck</span> problem by adding a new data structure, called a journal, to the mix. This journal is an on-disk structure. Before the filesystem driver makes any changes to the meta-data, it writes an entry to the journal that describes what it's about to do. Then, it goes ahead and modifies the meta-data. By doing so, a journaling filesystem maintains a log of recent meta-data modifications, and this comes in handy when it comes time to check the consistency of a filesystem that wasn't cleanly unmounted.
Discovery started
 
##bl##[bluetooth]##!bl### ##i##devices
Think of journaling filesystems this way -- in addition to storing data (your stuff) and meta-data (the data about the stuff), they also have a journal, which you could call meta-meta-data (the data about the data about the stuff).
Device 00:1F:20:3D:1E:75 Logitech K760
 
##bl##[bluetooth]##!bl### ##i##pair 00:1F:20:3D:1E:75
=== Journaling in Action ===
Attempting to pair with 00:1F:20:3D:1E:75
So, what does <span style="color:green">fsck</span> do with a journaling filesystem? Normally, it does nothing: it simply ignores the filesystem and allows it to be mounted. The real magic behind quickly restoring the filesystem to a consistent state is found in the Linux filesystem driver. When the filesystem is mounted, the Linux filesystem driver checks to see whether the filesystem is OK. If for some reason it isn't, then the meta-data needs to be fixed, but instead of performing an exhaustive meta-data scan (like <span style="color:green">fsck</span>), the driver takes a look at the journal. Since the journal contains a chronological log of all recent meta-data changes, it simply inspects those portions of the meta-data that have been recently modified. Thus, it is able to bring the filesystem back to a consistent state in a matter of seconds. And unlike the more traditional approach that <span style="color:green">fsck</span> takes, this journal replaying process does not take longer on larger filesystems. Thanks to the journal, hundreds of gigabytes of filesystem meta-data can be brought to a consistent state almost instantaneously.
[##y##CHG##!y##] Device 00:1F:20:3D:1E:75 Connected: yes
##r##[agent]##!r## Passkey: 454358
##r##[agent]##!r## Passkey: ##i##4##!i##54358
##r##[agent]##!r## Passkey: ##i##45##!i##4358
##r##[agent]##!r## Passkey: ##i##454##!i##358
##r##[agent]##!r## Passkey: ##i##4543##!i##58
##r##[agent]##!r## Passkey: ##i##45435##!i##8
##r##[agent]##!r## Passkey: ##i##454358##!i##
[##y##CHG##!y##] Device 00:1F:20:3D:1E:75 Paired: yes
Pairing successful
[##y##CHG##!y##] Device 00:1F:20:3D:1E:75 Connected: no
##bl##[bluetooth]##!bl### ##i##connect 00:1F:20:3D:1E:75
Attempting to connect to 00:1F:20:3D:1E:75
[##y##CHG##!y##] Device 00:1F:20:3D:1E:75 Connected: yes
Connection successful
##bl##[bluetooth]##!bl### ##i##quit
[##r##DEL##!r##] Controller 00:02:72:C9:62:65 antec [default]
#
}}
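
The journaling scheme described under "The Journal" and "Journaling in Action" can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class <tt>ToyJournaledMetadata</tt> and its methods are hypothetical names, the "disk" is a pair of in-memory arrays, and recovery simply re-applies the logged changes. Real journaling filesystems are far more involved.

```php
<?php
// Toy model of meta-data journaling. Everything here is hypothetical and
// in-memory; it only illustrates "log the intent first, then modify".
final class ToyJournaledMetadata
{
    /** @var array<string, string> stands in for on-disk meta-data */
    public array $metadata = [];

    /** @var array<int, array{key: string, value: string}> stands in for the on-disk journal */
    public array $journal = [];

    // Write a journal entry describing the change, then apply the change.
    // $crashBeforeApply simulates losing power between the two steps.
    public function update(string $key, string $value, bool $crashBeforeApply = false): void
    {
        $this->journal[] = ['key' => $key, 'value' => $value];
        if ($crashBeforeApply) {
            return; // the journal entry exists, but the meta-data was never touched
        }
        $this->metadata[$key] = $value;
    }

    // After the meta-data is known to be safely written, old entries can be dropped.
    public function checkpoint(): void
    {
        $this->journal = [];
    }

    // Recovery looks only at the journal, not at the whole meta-data set,
    // so its cost depends on recent activity rather than filesystem size.
    public function recover(): int
    {
        $replayed = 0;
        foreach ($this->journal as $entry) {
            $this->metadata[$entry['key']] = $entry['value'];
            $replayed++;
        }
        $this->journal = [];
        return $replayed;
    }
}
```

In this model, a store holding thousands of checkpointed entries and one interrupted update only needs to replay that single journal entry on recovery, which mirrors why journal replay does not slow down as the filesystem grows.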


=== ReiserFS ===
To install, make the following modifications to your skin:
Now, we come to ReiserFS, the first of several journaling filesystems we're going to be investigating. ReiserFS 3.6.x (the version included as part of Linux 2.4+) is designed and developed by Hans Reiser and his team of developers at Namesys. Hans and his team share the philosophy that the best filesystems are those that help create a single shared environment, or namespace, where applications can interact more directly, efficiently and powerfully. To do this, a filesystem should meet the performance and feature needs of its users. That way, users can continue using the filesystem directly rather than building special-purpose layers that run on top of the filesystem, such as databases and the like.


=== Small File Performance ===
<syntaxhighlight lang="css">
So, how does one go about making the filesystem more accommodating? Namesys has decided to focus on one aspect of the filesystem, at least initially -- small file performance. In general, filesystems like ext2 and ufs don't do very well in this area, often forcing developers to turn to databases or special organizational hacks to get the kind of performance they need. Over time, this kind of "I'll code around the problem" approach encourages code bloat and lots of incompatible special-purpose APIs, which isn't a good thing.
--- mediawiki-1.19.1/skins/vector/screen.css    2012-06-13 18:22:39.000000000 +0000
+++ public_html/skins/vector/screen.css 2012-08-27 04:34:47.507912892 +0000
@@ -683,10 +683,47 @@
        list-style-image: url(images/bullet-icon.png);
}
-pre {
-       line-height: 1.3em;
+/* ConsoleOutput.php start */
+
+.shell, pre, code, tt, div.mw-geshi {
+        font-size: 12px;
+        font-family: Consolas, 'andale mono','lucida console', monospace;
+}
+
+.shell, pre, div.mw-geshi {
+        background-color: #F8F8FF;
+        line-height: 15px;
+        padding: 10px;
+        border: none;
+        border-top: 2px solid #C6C9E0;
+        border-bottom: 2px solid #C6C9E0;
+        margin: 0;
+        overflow-x: auto;
+        overflow-y: hidden;
+}
+
+.code {
+        color: #666;
+}
+
+.code_input {
+        color: #000;
}
+.code_red {
+        color: #f00;
+}
+
+.code_blue {
+        color: #00f;
+}
+
+.shell_green {
+        color: #080;
+}
+
+/* ConsoleOutput.php end */
+
/* Site Notice (includes notices from CentralNotice extension) */
#siteNotice {
        font-size: 0.8em;
</syntaxhighlight>


Here's an example of how ext2 can tend to encourage this kind of programming. ext2 is good at storing lots of twenty-plus k files, but isn't an ideal technology for storing 2,000 50-byte files. Not only does performance drop significantly when ext2 has to deal with extremely small files, but storage efficiency drops as well, since ext2 allocates space in either one or four k chunks (configurable when the filesystem is created).
Then install the following code in your <tt>extensions</tt> directory and include it with a <tt>require_once( "$IP/extensions/ConsoleOutput.php" );</tt> in <tt>LocalSettings.php</tt>:


Now, conventional wisdom would say that you aren't supposed to store that many ridiculously small files on a filesystem. Instead, they should be stored in some kind of database that runs above the filesystem. In reply, Hans Reiser would point out that whenever you need to build a layer on top of the filesystem, it means that the filesystem isn't meeting your needs. If the filesystem met your needs, then you could avoid using a special-purpose solution in the first place. You would thus save development time and eliminate the code bloat that you would have created by hand-rolling your own proprietary storage or caching mechanism, interfacing with a database library, etc.
<syntaxhighlight lang="php">
<?php
$wgExtensionCredits['parserhook'][] = array(
    'name' => 'ConsoleOutput',
    'author' => 'Daniel Robbins',
    'url' => 'https://github.com/danielrobbins/mediawiki-consoleoutput',
    'description' => 'This extension allows you to display colorized console output in mediawiki'
);


Well, that's the theory. But how good is ReiserFS' small file performance in practice? Amazingly good. In fact, ReiserFS is around eight to fifteen times faster than ext2 when handling files smaller than one k in size! Even better, these performance improvements don't come at the expense of performance for other file types. In general, ReiserFS outperforms ext2 in nearly every area, but really shines when it comes to handling small files.
if ( defined( 'MW_SUPPORTS_PARSERFIRSTCALLINIT' ) ) {
        $wgHooks['ParserFirstCallInit'][] = 'consoleOutputSetup';
} else {
        $wgExtensionFunctions[] = 'consoleOutputSetup';
}


=== ReiserFS Technology ===
function consoleOutputSetup( $data )
So how does ReiserFS go about offering such excellent small file performance? ReiserFS uses a specially optimized b* balanced tree (one per filesystem) to organize all filesystem data. This in itself offers a nice performance boost, as well as easing artificial restrictions on filesystem layouts. It's now possible to have a directory that contains 100,000 other directories, for example. Another benefit of using a b*tree is that ReiserFS, like most other next-generation filesystems, dynamically allocates inodes as needed rather than creating a fixed set of inodes at filesystem creation time. This helps the filesystem to be more flexible to the various storage requirements that may be thrown at it, while at the same time allowing for some additional space-efficiency.
{
    global $wgParser;
    $wgParser->setHook('console', 'consoleRender');
    return true;
}


ReiserFS also has a host of features aimed specifically at improving small file performance. Unlike ext2, ReiserFS doesn't allocate storage space in fixed one k or four k blocks. Instead, it can allocate the exact size it needs. And ReiserFS also includes some special optimizations centered around tails, a name for files and end portions of files that are smaller than a filesystem block. In order to increase performance, ReiserFS is able to store files inside the b*tree leaf nodes themselves, rather than storing the data somewhere else on the disk and pointing to it.
function consoleRender($input, $args, $parser)
{
    if (count($args))
    {
        return "<strong class='error'>" .
              "ConsoleOutput: arguments not supported" .
              "</strong>";
    }


This does two things. First, it dramatically increases small file performance. Since the file data and the stat_data (inode) information are stored right next to each other, they can normally be read with a single disk IO operation. Second, ReiserFS is able to pack the tails together, saving a lot of space. In fact, a ReiserFS filesystem with tail packing enabled (the default) can store six percent more data than the equivalent ext2 filesystem, which is amazing in itself.
    # Display < and > as literals, so escape them:


However, tail packing does cause a slight performance hit since it forces ReiserFS to repack data as files are modified. For this reason, ReiserFS tail packing can be turned off, allowing the administrator to choose between good speed and space efficiency, or opt for even more speed at the cost of some storage capacity.
    $input = preg_replace('/>/','&gt;', $input);
    $input = preg_replace('/</','&lt;', $input);


ReiserFS truly is an excellent filesystem. In my next article, I'll guide you through the process of setting up ReiserFS under Linux. We'll also take a close look at performance tuning, application interactions (and how to work around them), the best kernels to use, and more.
    # http://www.perlmonks.org/?node_id=518444
    # See "Matching a pattern that doesn't include another pattern".


== Resources ==
    $input = preg_replace('/##i##((?:(?!##!i##).)*)##!i##/','<span class="code_input">$1</span>', $input);
Be sure to check out the other articles in this series:
    $input = preg_replace('/##i##(.*)/','<span class="code_input">$1</span>', $input);
* [[Funtoo Filesystem Guide, Part 1|Part 1]]: Journaling and ReiserFS
    $input = preg_replace('/##b##((?:(?!##!b##).)*)##!b##/','<b>$1</b>', $input);
* [[Funtoo Filesystem Guide, Part 2|Part 2]]: Using ReiserFS and Linux
    $input = preg_replace('/##b##(.*)/','<b>$1</b>', $input);
* [[Funtoo Filesystem Guide, Part 3|Part 3]]: Tmpfs and bind mounts
    return "<pre class=\"code\">" . $input . "</pre>";
* [[Funtoo Filesystem Guide, Part 4|Part 4]]: Introducing Ext3
}
* [[Funtoo Filesystem Guide, Part 5|Part 5]]: Ext3 in action
?>
</syntaxhighlight>


[[Category:Filesystem Guides]]
[[Category:MediaWiki Hacks]]
[[Category:Articles]]
{{ArticleFooter}}

Revision as of 01:11, January 12, 2015

The ConsoleOutput MediaWiki extension was created by Daniel Robbins to provide highlighting of user input for interactive terminal session blocks. To use it, surround the terminal session text with <console> opening and closing tags. This tag works similarly to a <pre> tag, and preserves output formatting in the quoted text.

To highlight text typed by a user, rather than program output, put a ##i## input code immediately before user input on each line. This will cause all text from the ##i## to the end of the line to be highlighted in orange to set it apart from the prompt and other program output.

  • ##i## - Tag all following text on this line as user input.
  • ##b## - Highlight the rest of the line in bold.
  • ##b##text here##!b## - Highlight the text between both markers in bold.
  • ##i##text here##!i## - Highlight the text between both markers as user input.
  • ##g## - Green
  • ##y## - Yellow
  • ##bl## - Blue
  • ##r## - Red
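
The marker forms listed above can be handled with one pair of regular expressions per marker: a paired form that stops at the closing marker, and an open-ended form that runs to the end of the line. The following is a minimal sketch, not the extension's actual code. The function name renderConsoleMarkers and the marker table are hypothetical, the mapping of color markers to CSS classes is an assumption, and code_yellow is an assumed class that the skin stylesheet would need to define.

```php
<?php
// Hypothetical marker table: marker code => [opening tag, closing tag].
// The class names for the color markers are assumptions, not confirmed behavior.
const CONSOLE_MARKERS = [
    'i'  => ['<span class="code_input">', '</span>'],
    'b'  => ['<b>', '</b>'],
    'g'  => ['<span class="shell_green">', '</span>'],
    'y'  => ['<span class="code_yellow">', '</span>'],
    'bl' => ['<span class="code_blue">', '</span>'],
    'r'  => ['<span class="code_red">', '</span>'],
];

function renderConsoleMarkers(string $input): string
{
    // Show < and > literally, as the extension does.
    $out = str_replace(['>', '<'], ['&gt;', '&lt;'], $input);

    foreach (CONSOLE_MARKERS as $code => [$open, $close]) {
        $start = preg_quote("##{$code}##", '/');
        $end   = preg_quote("##!{$code}##", '/');

        // Paired form: match up to, but not across, the closing marker.
        $out = preg_replace(
            "/{$start}((?:(?!{$end}).)*){$end}/",
            $open . '$1' . $close,
            $out
        );

        // Open-ended form: "." does not match a newline, so this
        // highlights from the marker to the end of the current line.
        $out = preg_replace("/{$start}(.*)/", $open . '$1' . $close, $out);
    }

    return $out;
}
```

The paired form has to run before the open-ended form for each marker; otherwise the open-ended pattern would swallow the closing marker along with the rest of the line.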

Examples

Here are a few examples of the ConsoleOutput extension. First, this is how you might typically display ls output, with a particular directory highlighted:

www@www-smw ~/public_html $ ls
COPYING  LocalSettings.php     api.php   extensions   index.php   maintenance           redirect.php    skins              thumb_handler.php5
CREDITS  README                api.php5  images         index.php5  mw-config             redirect.php5   tests              wiki.phtml
FAQ      RELEASE-NOTES-1.19    bin       img_auth.php   languages   opensearch_desc.php   redirect.phtml  thumb.php
HISTORY  StartProfiler.sample  cache     img_auth.php5  load.php    opensearch_desc.php5  resources       thumb.php5
INSTALL  UPGRADE               docs      includes       load.php5   profileinfo.php       serialized      thumb_handler.php
www@www-smw ~/public_html $ cd extensions/

And here is how you might display a more detailed example of console output, using colors:

root # bluetoothctl 
[NEW] Controller 00:02:72:C9:62:65 antec [default]
root ##bl##[bluetooth]##!bl### power on
Changing power on succeeded
root ##bl##[bluetooth]##!bl### agent on
Agent registered
root ##bl##[bluetooth]##!bl### scan on
Discovery started
root ##bl##[bluetooth]##!bl### devices
Device 00:1F:20:3D:1E:75 Logitech K760
root ##bl##[bluetooth]##!bl### pair 00:1F:20:3D:1E:75
Attempting to pair with 00:1F:20:3D:1E:75
[CHG] Device 00:1F:20:3D:1E:75 Connected: yes
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
root ##r##[agent]##!r## Passkey: 454358
[CHG] Device 00:1F:20:3D:1E:75 Paired: yes
Pairing successful
[CHG] Device 00:1F:20:3D:1E:75 Connected: no
root ##bl##[bluetooth]##!bl### connect 00:1F:20:3D:1E:75
Attempting to connect to 00:1F:20:3D:1E:75
[CHG] Device 00:1F:20:3D:1E:75 Connected: yes
Connection successful
root ##bl##[bluetooth]##!bl### quit
[DEL] Controller 00:02:72:C9:62:65 antec [default]
root #

To install, make the following modifications to your skin:

--- mediawiki-1.19.1/skins/vector/screen.css    2012-06-13 18:22:39.000000000 +0000
+++ public_html/skins/vector/screen.css 2012-08-27 04:34:47.507912892 +0000
@@ -683,10 +683,47 @@
        list-style-image: url(images/bullet-icon.png);
 }
 
-pre {
-       line-height: 1.3em;
+/* ConsoleOutput.php start */
+
+.shell, pre, code, tt, div.mw-geshi {
+        font-size: 12px;
+        font-family: Consolas, 'andale mono','lucida console', monospace;
+}
+
+.shell, pre, div.mw-geshi {
+        background-color: #F8F8FF;
+        line-height: 15px;
+        padding: 10px;
+        border: none;
+        border-top: 2px solid #C6C9E0;
+        border-bottom: 2px solid #C6C9E0;
+        margin: 0;
+        overflow-x: auto;
+        overflow-y: hidden;
+}
+
+.code {
+        color: #666;
+}
+
+.code_input {
+        color: #000;
 }
 
+.code_red {
+        color: #f00;
+}
+
+.code_blue {
+        color: #00f;
+}
+
+.shell_green {
+        color: #080;
+}
+
+/* ConsoleOutput.php end */
+
 /* Site Notice (includes notices from CentralNotice extension) */
 #siteNotice {
        font-size: 0.8em;

Then install the following code in your extensions directory and include it with a require_once( "$IP/extensions/ConsoleOutput.php" ); in LocalSettings.php:

<?php
$wgExtensionCredits['parserhook'][] = array(
    'name' => 'ConsoleOutput',
    'author' => 'Daniel Robbins',
    'url' => 'https://github.com/danielrobbins/mediawiki-consoleoutput',
    'description' => 'This extension allows you to display colorized console output in mediawiki'
);

if ( defined( 'MW_SUPPORTS_PARSERFIRSTCALLINIT' ) ) {
        $wgHooks['ParserFirstCallInit'][] = 'consoleOutputSetup';
} else {
        $wgExtensionFunctions[] = 'consoleOutputSetup';
}

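function consoleOutputSetup( $parser = null )
{
    # ParserFirstCallInit passes the parser as an argument;
    # $wgExtensionFunctions calls this function with no arguments.
    if ( $parser === null ) {
        global $wgParser;
        $parser = $wgParser;
    }
    $parser->setHook('console', 'consoleRender');
    return true;
}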

function consoleRender($input, $args, $parser)
{
    if (count($args))
    {
        return "<strong class='error'>" .
               "ConsoleOutput: arguments not supported" .
               "</strong>";
    }

    # Display < and > as literals, so escape them:

    $input = preg_replace('/>/','&gt;', $input);
    $input = preg_replace('/</','&lt;', $input);

    # http://www.perlmonks.org/?node_id=518444
    # See "Matching a pattern that doesn't include another pattern".

    $input = preg_replace('/##i##((?:(?!##!i##).)*)##!i##/','<span class="code_input">$1</span>', $input);
    $input = preg_replace('/##i##(.*)/','<span class="code_input">$1</span>', $input);
    $input = preg_replace('/##b##((?:(?!##!b##).)*)##!b##/','<b>$1</b>', $input);
    $input = preg_replace('/##b##(.*)/','<b>$1</b>', $input);
    return "<pre class=\"code\">" . $input . "</pre>";
}
?>