Nearly a year ago I was working on The Imageboard Search Engine v2 and needed a simple, easy-to-use formatting script that would make user-submitted text look neat without much processing overhead and without requiring anyone to learn a markup language beforehand.
That's when I came up with this script, and I called it LiteComment.
Long after creating it I learned about Markdown and was very surprised to discover that its formatting is extremely similar to that of LiteComment. However, the two are still slightly different.
If you need a full-fledged formatting framework for serious processing, take a look at UverseWiki (a modelling text processor), which among other places powers this blog and the i-Tools.org project.
Download LiteComment PHP script. You can check the LiteComment's sandbox here.
A few examples of the markup: links (www.google.com ⇒ [www.google.com]), `code blocks` and e-mail masking (my@email.org).
== Heading of level 1 ==
=== Heading of level 2 ===
... up till:
======= The smallest heading of level 6 =======
You can specify any number of "="s on the right or omit them at once:
======= Like this ==
======= ...this =
======= ...or this!
*bold text*, `monospaced` (code)
`
Multiline code
(unformatted text)
`
"
Multiline quotation
(blockquote)
"
>>> inline quote
>> more recent saying
>most recent (spaces after > are optional)
Citations: "Long time ago when the world was ruled by Nevermore..."
Or using single apostrophes: 'Long time ago'...
Links - http://google.com
Or this www.embarcadero.com
Up to 2 preceding words are used for the caption: www.ya.ru
[- http://google.com | Caption of the link -]
E-mail masking: e@mail.em.com
Pictures: icon.png, [- logo.png | Image-link -]
With a thumbnail: [- logo.png | icon.png -]
Rules - 4 types (CSS classes); they are lines consisting of 3 or more identical symbols:
---
~~~~
=====
++++++
Source:
Dashes: short (en) - dash, long (em) -- dash, the ---- same
Ellipsis: 2.. (one removed) or more... dots......
Symbols: (c) (r) (p) (tm)
Numbers: #1 #33 1st 55th #100th
Matrix: 10*10 10x10 10X10 10 x 10
Division: 10/10 10:10 10 / 10 (spaces are fine)
Plus-minus: +- +-5 +- 5
Arrows (number of "=" doesn't matter if it's >= 2): <=> <= => <==== ====> <=====>
Flood protection against repeated chars: OMG!!! WTF?!? Erm????
Result:
Dashes: short (en) – dash, long (em) — dash, the — same
Ellipsis: 2. (one removed) or more… dots…
Symbols: © ® § ™
Numbers: №1 №33 1st 55th 100th
Matrix: 10×10 10×10 10×10 10×10
Division: 10÷10 10÷10 10÷10 (spaces are fine)
Plus-minus: ± ±5 ± 5
Arrows (number of «=» doesn't matter if it's >= 2): ⇔ ⇐ ⇒ ⇦ ➯ ⇔
Flood protection against repeated chars: OMG‼ WTF?? Erm??
The simplest way of formatting your text is to call the static Format method:
require 'litecomment.php';
$str = htmlspecialchars('*bold* <link>');
echo LiteComment::Format($str); // => <strong>bold</strong> &lt;link&gt;
$str can also be an array (see the method's description).
This approach leaves you no chance to customize things, but it's simple. In fact, there's not much room for tweaking a class instance anyway, since most settings are static. Moreover, there should rarely be a reason to construct the class and then operate on it, because an instance is constructed once per document and should not be used thereafter.
Note: before calling format methods you need to escape HTML on your own. You'd normally do this with htmlspecialchars like in the example above.
static Format($text)
$text can be a string or an array of strings (just like in preg_replace); if it's an array, all of its items are formatted.
SetAntiSpamMode($mode)
SetFileExtensions($extstr)
As most properties are static and represent various settings see the Configuration section for their description.
The following static properties of LiteComment class can be of interest:
my@e.mail.net.
The following instance properties of LiteComment class can be of interest:
If true, simple format characters are kept: *bold* ⇒ *bold* (shown in bold with the asterisks kept). If false, they're removed: bold.
true (the same as setting all array elements to true) or false. If false, nested formatting isn't processed, which is extremely quick (1 regexp run per source) but "a *b* c" will result in «a *b* c» with *b* left unformatted, while if this is true the nested *b* is formatted as well.
Although LiteComment is intentionally limited in features, it nevertheless has a few tricks in its pockets (and it has like 65 pockets) that allow you to extend the markup in different ways.
The most brute-force approach is, obviously, changing the regexp LiteComment uses, but you'll also need to shift all pocket offsets after inserting new capturing brackets. LiteComment->MatchesFrom() makes this easier, but it still requires some effort.
Besides, I've left a few «extensible air holes» just for this occasion.
This is the most common and flexible way to add your own markup or features. It uses what I call "special insets". The syntax is:
[- arg | arg | name=value | ... -]
If an argument doesn't have a name (no =... part) it'll be assigned an index. Spaces after [-, before -] and around | are optional.
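To make the argument rules concrete, here is a minimal standalone sketch of how such an inset string could be split into settings. The function name parseSpecialInset is hypothetical, and this is not the code of LiteComment's own SpecialStrToSettings(), which may behave differently in the details.

```php
<?php
// Hypothetical stand-in for illustration only; LiteComment's real
// SpecialStrToSettings() is not reproduced here.
function parseSpecialInset($contents) {
    $settings = array();
    foreach (explode('|', $contents) as $part) {
        $part = trim($part);              // spaces around "|" are optional
        if ($part === '') { continue; }
        $eq = strpos($part, '=');
        if ($eq === false || $eq === 0) {
            $settings[] = $part;          // unnamed argument: gets the next index
        } else {
            $name = trim(substr($part, 0, $eq));
            $settings[$name] = trim(substr($part, $eq + 1));
        }
    }
    return $settings;
}

print_r(parseSpecialInset(' logo.png | Image-link | title=My logo '));
```

Note that with this naive splitting a URL that contains "=" is treated as a named argument, so the caller has to glue it back together.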
How to add your own handler? Two methods of LiteComment class deal with special insets:
FormatSpecialInsetInHTML(&$contents)
$contents is what was found inside [-...-].
SpecialStrToSettings(&$str)
So to add your command you need to go to FormatSpecialInsetInHTML and examine its code:
$settings = $this->SpecialStrToSettings($contents);

$urlKey = array_shift(array_keys($settings));
$url = array_shift($settings);
if (!is_int($urlKey)) {
    $url = "$urlKey=$url";   // URL contains '=', reinsert it.
}

$title = &$settings['title'];
$title or $title = &$settings[0];
return $this->MakeHTMLLink($url, $title);
Now your actions depend on how you want to extend the special inset.
$url will contain the name of the command (which is the first argument).
Let's say we want to be able to embed YouTube videos using LiteComment formatting. The syntax can really be anything within the [-...-] construction (special inset), so I've chosen this one: [- youtube | myCyJJdhhDk -].
First, let's locate the FormatSpecialInsetInHTML method and make changes there. Having the code of this function before our eyes (in the previous section), we'll add our code after the 2nd block:
if (strtolower($url) === 'youtube') {
    $id = $settings[0];
    $html = '<iframe title="YouTube video player" width="480" height="390"
             src="http://www.youtube.com/embed/'.$id.'?rel=0"
             frameborder="0" allowfullscreen></iframe>';
    return $html;
}

// original code follows:
$title = &$settings['title'];
...
That's all! Now we can use this syntax to insert a YouTube video:
Here's a tutorial on getting started in well-known ModPlug Tracker: [- youtube | myCyJJdhhDk -]
To demonstrate other ways of extending special insets, let's say that I also want to support this shorter syntax: [- youtube myCyJJdhhDk -].
The difference here is that the video's ID («myCyJJdhhDk») is no longer a separate argument, so we need to do the parsing on our own.
That's not difficult at all, though: simply insert the following code right at the beginning of the FormatSpecialInsetInHTML function:
$trimmed = trim($contents);   // spaces after [- and before -] are optional
if (stripos($trimmed, 'youtube ') === 0) {
    $id = trim(substr($trimmed, strlen('youtube ')));
    // the rest is the same as in the previous example with [-youtube | ID-]:
    $html = '<iframe title="YouTube video player" width="480" height="390"
             src="http://www.youtube.com/embed/'.$id.'?rel=0"
             frameborder="0" allowfullscreen></iframe>';
    return $html;
}

// original code follows:
$settings = $this->SpecialStrToSettings($contents);
...
Possibilities for extending special insets are quite endless.
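Since whatever the user typed ends up inside the src attribute, it may be worth checking the ID before building the HTML. Below is a small sketch of that idea. The youtubeEmbedHtml helper is hypothetical (it isn't part of LiteComment), and the rule that IDs consist only of letters, digits, "-" and "_" is my assumption.

```php
<?php
// Hypothetical helper for illustration; not part of LiteComment.
function youtubeEmbedHtml($id) {
    $id = trim($id);
    // Assumption: video IDs contain only letters, digits, "-" and "_".
    if (!preg_match('/^[A-Za-z0-9_-]+$/', $id)) {
        return null;   // let the caller fall back to the default link handling
    }
    return '<iframe title="YouTube video player" width="480" height="390" '
         . 'src="http://www.youtube.com/embed/'.$id.'?rel=0" '
         . 'frameborder="0" allowfullscreen></iframe>';
}

var_dump(youtubeEmbedHtml('not a valid id') === null);
```

Inside the inset handler you would return this HTML when it isn't null and otherwise continue with the original code.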
By default, when you format a URL (like http://google.com or [- goo.com -]) you'll get a link with a text caption. However, this is boring, and sometimes a text caption just isn't good enough: we may want to display a thumbnail pointing to the full image instead.
That's what you can do: make plain URLs look different.
By default LiteComment already includes handlers that will show URLs with picture extensions (such as .png) as images (<img />) so that when you write: picture.jpg or [-http://my-home | thumb.png-] you'll see an image linking to the actual page.
You can extend this by adding your own handlers (handlers are triggered based on file (URL) extension).
Adding a custom handler is as simple as adding a method named HTMLByExt_<EXTENSION> to the LiteComment class. For example, if we want to provide a download counter for archives and also show a neat icon before the link, we can add this method for ZIP archives:
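function HTMLByExt_ZIP($url) {
    // Load saved counters; start with an empty list if the file is missing or broken.
    $data = is_file('dl-counter.txt') ? file_get_contents('dl-counter.txt') : '';
    $counts = $data === '' ? array() : @unserialize($data);
    if (!is_array($counts)) { $counts = array(); }

    // The value before this call, so a new URL starts at 0 instead of an undefined index.
    $thisCount = isset($counts[$url]) ? $counts[$url] : 0;
    $counts[$url] = $thisCount + 1;
    file_put_contents('dl-counter.txt', serialize($counts), LOCK_EX);

    return '<img src="images/zip-link.png" />'.basename($url).
           " ($thisCount downloads so far)";
}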
Remember that link (<a>) will be added automatically. Also, since this method is called when formatting texts the counter won't update if you're caching formatted HTML (not that it's necessary with the kind of speed LiteComment has).
Now the following snippet will be neatly formatted:
In this archive: litecomment.zip you'll find the software with all necessary instructions.
Note that LiteComment will only recognize extensions in-text that were registered using SetFileExtensions().
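If you're wondering how a method can be picked by name from an extension, here is a rough standalone sketch of the idea. The ExtensionDispatchDemo class and its handlerFor method are hypothetical; I'm only illustrating the naming convention, not quoting LiteComment's actual lookup code.

```php
<?php
// Hypothetical class for illustration only; not LiteComment's actual code.
class ExtensionDispatchDemo {
    // Returns the handler method name for a URL, or null if none is defined.
    function handlerFor($url) {
        $path = parse_url($url, PHP_URL_PATH);
        $ext = strtoupper(pathinfo((string) $path, PATHINFO_EXTENSION));
        $method = "HTMLByExt_$ext";
        return ($ext !== '' && method_exists($this, $method)) ? $method : null;
    }

    function HTMLByExt_ZIP($url) {
        return '[zip] '.basename($url);
    }
}

$demo = new ExtensionDispatchDemo();
$method = $demo->handlerFor('litecomment.zip');
echo $method ? $demo->$method('litecomment.zip') : 'no handler', "\n";
```

URLs whose extension has no matching method simply fall through to whatever the default formatting is.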
Erm, well, when I said that handlers are triggered based on file extension I tricked
you a little :) They're actually triggered based on the whole URL and «extension trigger»
is just the simplest method of adding a new handler.
What do I mean?
Let's say we want to warn users about links to a particular site. Such URLs don't
have to be of one extension – just linking to the same resource. Say, spammer.org.
We'll start off by locating the GetHTMLForURL method and examining its code:
if ($methodName = $this->GetHTMLFileMethodFor($url)) {
    return $this->$methodName($url);
}
Pretty straightforward, eh? This function accepts the $url argument, which holds the entire URL that was passed to a special inset ([-url|caption-]) or was found in the text (like www.site.ru/file). Now we know what to do: add an extra condition to the beginning of this method:
if (stripos($url, 'spammer.org') !== false) {
    return '<em>This site might harm your system.</em>';
} elseif ($methodName = $this->GetHTMLFileMethodFor($url)) {
    // original code follows.
Now each link to spammer.org will have that text included: «Visit spammer.org» → Visit <a href="..."><em>This site might harm your system.</em></a>.
A «simple format» is a text placed between 2 identical strings. For example, default
simple formats are bold (*bold*) and preformatted (`code`) texts. As you can see
they are created by * and ` symbols correspondingly. Note that a simple format doesn't
have to use a single symbol – it can be a string (e.g. ##).
You can easily add a new simple format if it follows this rule.
Let's say we want to underline text. It'll have this syntax: _underlined_ and it will be using <ins> tag.
Side note: <u> isn't HTML5-compliant, while <ins> is displayed underlined in all browsers as far as I have tested. It's a similar story with <del> versus the <s> and <strike> tags: the first is semantic and is displayed struck through by default.
Add the new tag to the static $tagAliases property. Example: static $tagAliases = array('code' => 'code', 'strong' => 'strong', ...
becomes
static $tagAliases = array('ins' => 'ins', 'code' => 'code', ...
Locate this code in the HTMLReplaceCallback() method:
} elseif ($formatChar = &$matches[30]) {
    static $charToClass = array('*' => array('strong', 'emphasis'), '`' => array('tt', 'monotype'));
    ...
Simply add an item to $charToClass:
static $charToClass = array('_' => array('ins', 'underlined'), ...
The first item ('ins') is our tag, the second ('underlined') is the CSS class to assign to it. It can also be an array to specify several classes, for example:
array('_' => array('ins', array('underlined', 'inserted')), ...
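To show what such a marker-to-tag map boils down to, here is a tiny standalone sketch. The applySimpleFormats function is hypothetical and much cruder than LiteComment, which handles everything in one big regexp; it only demonstrates the «text between 2 identical strings» rule with the same array layout as $charToClass.

```php
<?php
// Hypothetical helper for illustration; LiteComment itself works differently.
function applySimpleFormats($text, array $charToClass) {
    foreach ($charToClass as $marker => $spec) {
        list($tag, $classes) = $spec;
        $class = implode(' ', (array) $classes);   // one class or several
        $m = preg_quote($marker, '/');
        // Text between two identical markers, on a single line.
        $text = preg_replace(
            '/'.$m.'(.+?)'.$m.'/',
            '<'.$tag.' class="'.$class.'">$1</'.$tag.'>',
            $text
        );
    }
    return $text;
}

echo applySimpleFormats('an _underlined_ word',
                        array('_' => array('ins', 'underlined'))), "\n";
```

A real implementation would also need to leave things like snake_case_names alone; this sketch doesn't try to.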
A «multiline format», or a block, spans multiple lines (obviously). Default multiline formats are code and blockquotes:
`
code goes here
and isn't processed
`
"
this is a blockquote
with many lines
"
Result:
code goes here
and isn't processed
this is a blockquote
with many lines
Similarly to simple formats multiline formats are created by identical strings (1 or more characters) placed on separate lines.
Let's say we want to add some «attention box» that will be expressed via <div> tag with CSS class set to attention. It will be created with this markup:
!!
Please use the forum search function before asking your questions!
!!
To implement this we need 3 things, just like with a simple format:
Add the tag (div) to the static $tagAliases property. Example: static $tagAliases = array('code' => 'code', 'strong' => 'strong', ...
Locate this line in the HTMLReplaceCallback() method: static $multilineToTag = array('`' => array('code', 'code'),
We need to change this line to: ...array('`' => array('code', 'code'), '!!' => array('div', 'attention'), ...
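For the curious, a block format like this can be sketched in a few lines of standalone PHP. The applyBlockFormats function below is hypothetical and is not how LiteComment does it internally; it just shows the «same marker alone on the opening and closing line» rule using a map shaped like $multilineToTag.

```php
<?php
// Hypothetical helper for illustration; not LiteComment's actual code.
function applyBlockFormats($text, array $multilineToTag) {
    foreach ($multilineToTag as $marker => $spec) {
        list($tag, $class) = $spec;
        $m = preg_quote($marker, '/');
        // The marker alone on a line opens the block; the same marker alone
        // on a later line closes it.
        $text = preg_replace(
            '/^'.$m.'[ \t]*\n(.*?)\n'.$m.'[ \t]*$/ms',
            '<'.$tag.' class="'.$class.'">$1</'.$tag.'>',
            $text
        );
    }
    return $text;
}

$src = "Hi\n!!\nPlease search first!\n!!\nBye";
echo applyBlockFormats($src, array('!!' => array('div', 'attention'))), "\n";
```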
LiteComment is able to replace plain e-mail addresses in texts with obfuscated alternatives. By default it has 3 spam-protection methods (obfuscating, JavaScript protection and JavaScript protection with obfuscating for those who have JavaScript turned off) but you can always add more. How?
Add a case for the new mode to SetAntiSpamMode();
Add a MakeEmail_<mode> method that returns the HTML for an address.
Let's say we want to display e-mails as images. For this, let's go to the SetAntiSpamMode() function and add a new case 'our_method' there. Example:
    ...
    case 'jsonly':
    case 'image':     // <- added
        $this->antiSpamMethod = "MakeEmail_$mode";
}
Now let's create the MakeEmail_image method (it needs the GD library for PHP):
function MakeEmail_image($account, $domain, $zone) {
    $img = imagecreatetruecolor(100, 30);
    // A new truecolor image is black, so fill it with white to keep the black text readable.
    imagefill($img, 0, 0, imagecolorallocate($img, 255, 255, 255));
    $color = imagecolorallocate($img, 0, 0, 0);
    imagestring($img, 3, 0, 0, "$account@$domain.$zone", $color);

    // Capture the PNG output and embed it as a data: URI.
    ob_start();
    imagepng($img);
    $data = ob_get_clean();

    $base64 = base64_encode($data);
    $html = '<img src="data:image/png;base64,'.$base64.'" alt="E-mail" />';
    return $html;
}
The code working with GD is purely for demonstration and has some limitations (e.g. we could determine the width of e-mail string before creating the image). However, I think it demonstrates well how things generally work.
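As for determining the width first, the arithmetic for a fixed-width font is simple enough to sketch. The textImageSize helper is hypothetical, and the 7×13 glyph size passed to it is just an assumed value so the sketch runs without GD; with GD loaded you'd take the real numbers from imagefontwidth() and imagefontheight() for the font you pass to imagestring().

```php
<?php
// Hypothetical helper: canvas size for a string drawn in a fixed-width font.
function textImageSize($text, $charWidth, $charHeight, $padding = 2) {
    return array(
        strlen($text) * $charWidth + 2 * $padding,   // width
        $charHeight + 2 * $padding,                  // height
    );
}

// Assumed glyph size (7x13) so that this runs without GD; with GD use
// imagefontwidth(3) and imagefontheight(3) instead.
list($w, $h) = textImageSize('my@email.org', 7, 13);
echo "$w x $h\n";
```

The resulting width and height would then replace the hard-coded 100 and 30 in imagecreatetruecolor(), with the text drawn at the padding offset.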
Download LiteComment. You can check its sandbox here.
Please drop a comment below if you have questions or if you're using LiteComment in your project!
Rendered output of the markup sample at the top of this page (links, pictures and rules are not reproduced in plain text):
Heading of level 1
Heading of level 2
… up till:
The smallest heading of level 6
You can specify any number of "="s on the right or omit them at once:
Like this
…this
…or this!
*bold text*, `monospaced` (code)
Multiline code
(unformatted text)
>>> inline quote
>> more recent saying
> most recent (spaces after > are optional)
Citations: «Long time ago when the world was ruled by Nevermore…»
Or using single apostrophes: «Long time ago»...
Up to 2 preceding words are used for the caption
Caption of the link
E-mail masking: e~@~mail.em~com