This script was written a long while ago, around April 2011, when I was launching i-Tools.org (in particular, it powers that site's Text processor, which has more than 130 functions).
Download the class here: php-scripter-20121204.zip.
It has no dependencies except for mbstring, which lets it support Unicode scenarios. Released into the public domain, so you’re free to do whatever you want with it.
require_once 'scripter.php';
$scr = new Scripter('say @myVar');
$res = $scr->Run();
echo 'Result of last executed command is: ';
var_dump($res);
Note that by default Scripter provides a very bare environment: no variables (though they are easily added) and virtually no built-in commands (which are also easy to add). See also Extending Scripter.
Script example (snippet taken from i-Tools.org):
# In CLF the length of response is the second number between 2 spaces:
ACTCOPY / to "Page sizes (HTTP 200)"
PMATCH 1 ' 200 (\d+) '
SORT num
# ...just as you'd guess after sorting we get ordered lines where:
SET min = `LINES 1
# ...and max is:
SET max = `LINES -1
SET avg = `3MATH `AVG / 1024
SAY "Average page size is @avg KB, max - `3MATH @max / 1024 KB and min - @min B."
SAY "Total outgoing traffic - `3MATH `SUM / 1048576 MB."
Each line is a command. Empty lines and lines starting with a hash (#) or semicolon (;) are ignored; a hash symbol (#), unless doubled (##), also starts an inline comment that lasts until the end of the line. A backtick (`), whether it starts a line or appears inside one (within another command or function), marks a function call. All other lines are command calls.
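The comment rules above can be sketched as follows. This is only an illustration, not code from scripter.php: sketchStripComment is a hypothetical helper, and it assumes that leading whitespace before a whole-line comment is allowed and that a line beginning with a doubled hash still counts as a comment line.

```php
<?php
// Hypothetical helper (not part of scripter.php): returns the executable part
// of one script line, or null when the whole line should be ignored.
function sketchStripComment($line) {
    $trimmed = ltrim($line);
    // Empty lines and lines starting with # or ; are ignored entirely.
    // Assumption: leading whitespace before the marker is tolerated.
    if ($trimmed === '' || $trimmed[0] === '#' || $trimmed[0] === ';') {
        return null;
    }
    $out = '';
    $len = strlen($line);
    for ($i = 0; $i < $len; $i++) {
        if ($line[$i] === '#') {
            if ($i + 1 < $len && $line[$i + 1] === '#') {
                $out .= '#';   // a doubled hash is one literal hash
                $i++;          // skip the second hash
                continue;
            }
            break;             // a single hash starts an inline comment
        }
        $out .= $line[$i];
    }
    return rtrim($out);
}
```

For instance, a line such as `say issue ##5 # greet` would be reduced to `say issue #5` before being executed.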
This makes scripts look like Windows batch files (.bat) and simplified *nix shell scripts.
Constructions used in the docs:
Notation | Meaning |
---|---|
(a|b|c) | Mandatory element, one of the 3 listed: «a», «b» or «c». |
(ab?|c) | A question mark marks the previous symbol as optional, thus matching 3 values: «ab», «a» and «c». |
[a|b|c] | Optional element (one or none). |
[opt] | An optional string that might appear or not. Can be nested: [arg1 [arg2 ...]] |
<what> | User-supplied input. |

Placeholder | Definition |
---|---|
<name> | path (\|/) [path ...] |
<value> | ("str"|'str'|1|1.0|1,0), where «1» is any int or float. |
<expr> | As <value> but only allows numbers, spaces and the following operators: * / + - ( ) % ^ (power). |
Generally, syntax for a command is:
<command> [<arg_1> [<arg_2> [...]]]
where <arg> (an argument) is either a bare string without spaces or a string wrapped in quotes. A doubled quote ("" or '') produces a single literal quote. Quoted strings can also span multiple lines.
Command name is case-insensitive.
Examples: a_string (no spaces), "a string" or 'a string' (spaces but still a single argument). Quoted example: 'it''s the same as yesterday' (string «it’s the same as yesterday»).
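The quoting rules above can be sketched as a small tokenizer. This is a minimal illustration and not the parser from scripter.php: sketchSplitArgs is a hypothetical name, and the sketch deliberately ignores backtick function calls and treats only spaces and tabs as separators.

```php
<?php
// Hypothetical tokenizer (not part of scripter.php): splits a command line
// into its name and arguments, honouring quotes and doubled quotes.
function sketchSplitArgs($line) {
    $args = array();
    $len = strlen($line);
    $i = 0;
    while ($i < $len) {
        $ch = $line[$i];
        if ($ch === ' ' || $ch === "\t") {   // skip separators
            $i++;
            continue;
        }
        if ($ch === '"' || $ch === "'") {    // quoted argument
            $quote = $ch;
            $arg = '';
            $i++;
            while ($i < $len) {
                if ($line[$i] === $quote) {
                    if ($i + 1 < $len && $line[$i + 1] === $quote) {
                        $arg .= $quote;      // doubled quote: literal quote
                        $i += 2;
                        continue;
                    }
                    $i++;                    // closing quote
                    break;
                }
                $arg .= $line[$i];           // newlines are kept as well
                $i++;
            }
            $args[] = $arg;
        } else {                             // bare argument without spaces
            $start = $i;
            while ($i < $len && $line[$i] !== ' ' && $line[$i] !== "\t") {
                $i++;
            }
            $args[] = substr($line, $start, $i - $start);
        }
    }
    return $args;
}
```

Called on `say 'it''s the same as yesterday'`, the sketch yields two items: the command name `say` and the single argument «it's the same as yesterday».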
Remember that a backtick (`) is always treated as a function call, regardless of whether it’s inside a quoted string or not: `get "piece length" (single function), say Length: `get "piece length" (function inside a command’s argument). The last example could also look like this (nothing changes):
say "Length: `get "piece length""
Note that it’s not necessary to double quotes belonging to the nested function call.
A function can be thought of as an inline command that may return result (actually, commands can return it too but you can’t use it unless you invoke it as a function).
Any command can be called as a function, so this separation is mostly conventional: functions usually don’t make sense when called as commands (i.e. without using the returned value). For example, calling count an_array is valid, but since count only returns the number of members an array has, executing it as a command will have no visible effect.
Its syntax is similar to that of a command except that it’s prefixed with a backtick (`) and has an optional parameter (n):
`[n]<function> [<arg_1> [<arg_2> [...]]]
…where n explicitly states the number of arguments this function must be passed (it might not use all of them). If n is omitted, the function decides by itself how many arguments it needs and the rest go to the parent command or function.
n can only be a one-digit number and it cannot be followed by a space (if it is, it’s treated as a function name instead). Unless n is explicitly stated, functions accept the least number of arguments they can.
Function name is case-insensitive. If backtick is doubled (``) it’s treated as a single backtick and not a function call.
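The rules for the optional n prefix can be illustrated with a short sketch. It is an assumption-laden example rather than Scripter's real parser: sketchParseFuncHead is a hypothetical name, it expects the text between the backtick and the first space, and it lowercases the name simply to model case-insensitivity.

```php
<?php
// Hypothetical helper (not part of scripter.php): splits the head of a
// function call, i.e. the text right after the backtick up to the first
// space, into the optional argument count n and the function name.
function sketchParseFuncHead($head) {
    // One digit followed by at least one more non-space character: n + name.
    if (preg_match('/^(\d)(\S+)$/', $head, $m)) {
        return array('n' => (int) $m[1], 'name' => strtolower($m[2]));
    }
    // Otherwise the whole head is the name, even if it is a lone digit
    // (a digit directly followed by a space is a function name, not n).
    return array('n' => null, 'name' => strtolower($head));
}
```

With this sketch `3MATH` is read as the function math taking exactly 3 arguments, while `AVG` carries no explicit count and decides by itself how many arguments to take.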
Functions can be nested: say Last file in this torrent: `join info/files/`count info/files/name.
Functions can even appear in place of a command name (the same goes for @variables): `func cmd_arg (the result of calling `func is treated as a command name which is then called with the argument cmd_arg).
This command (or function) is just like any other command except that it’s slightly more complex and can be more conveniently described in its own section.
if/unless commands are identical except that unless inverts the condition; for this reason they are both called «if-command» here.
The if command can be called using 3 different forms (which are really one). This one returns "0" if the condition didn’t match and "1" if it did:
(if|unless) <name> [<operator> <expr>]
This one executes one or more commands if condition held (and returns the result of last command executed):
(if|unless) <name> [<operator> <expr>]
  <cmd>[ <arg> ...]
  ...
Finally, this one executes different command(s) based on whether condition held or not:
(if|unless) <name> [<operator> <expr>]
  <cmd>[ <arg> ...]
  ...
else
  <cmd>[ <arg> ...]
  ...
The first form is pretty simple and will be explained below; the others, however, are slightly more complex.
The key point about THEN (has no visible keyword) and ELSE (case-insensitive) blocks is that indentation matters (just like in Python): each nested block must be indented with exactly 2 spaces. Thanks to this if blocks can be nested as long as proper indentation is maintained. Two or more nested if or unless having the same level of indentation produce an error.
THEN/ELSE block ends when its indentation level changes: either it becomes too small (too few leading spaces) or too big (too many).
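The indentation rule can be sketched as a function that collects the THEN and ELSE blocks following an if line. This is a rough illustration, not Scripter's parser: sketchSplitIfBlocks is a hypothetical name, and it assumes that lines indented deeper than the block level belong to nested blocks and are kept with their relative indentation.

```php
<?php
// Hypothetical helper (not part of scripter.php): given the lines that follow
// an if/unless line indented by $baseIndent spaces, collects the THEN block
// and the optional ELSE block. Each block is indented by 2 extra spaces.
function sketchSplitIfBlocks(array $lines, $baseIndent = 0) {
    $blocks = array('then' => array(), 'else' => array());
    $target = 'then';
    $inner = $baseIndent + 2;
    foreach ($lines as $line) {
        $indent = strlen($line) - strlen(ltrim($line, ' '));
        // ELSE is case-insensitive and sits at the same level as the if line.
        if ($target === 'then' && $indent === $baseIndent
                && strtolower(trim($line)) === 'else') {
            $target = 'else';
            continue;
        }
        if ($indent < $inner) {
            break;  // too few leading spaces: the block has ended
        }
        // Assumption: deeper lines are nested blocks, kept relative to $inner.
        $blocks[$target][] = substr($line, $inner);
    }
    return $blocks;
}
```

Given the lines `  say Many files.`, `else`, `  say Single file.` and `say done`, the sketch puts the first command into the THEN block, the third line into the ELSE block, and stops at the unindented last line.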
As demonstrated above, the condition is specified like this:
if <name> [<operator> <expr>]
If the condition (the contents of the square brackets) is omitted, <name> is checked for boolean true (which is anything but "", 0 and "0"). If the condition is passed, it’s broken into two parts (two separate arguments): the operator and the expression.
An unsuitable operator or expression, as well as supplying an operator but omitting the expression, will result in an error.
Operators and expressions can be extended.
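The simplest form, without an operator, can be sketched in a few lines. This is an illustration only: sketchIfResult is a hypothetical name, and it assumes the check matches PHP's own loose boolean cast, under which "", 0 and "0" are false and any other string is true.

```php
<?php
// Hypothetical helper (not part of scripter.php): models the first form of
// the if-command, which returns "1" when the condition holds and "0" when
// it does not. $negate = true models unless.
function sketchIfResult($value, $negate = false) {
    // Assumption: PHP's loose boolean cast; "", 0 and "0" are false.
    $held = (bool) $value;
    if ($negate) {
        $held = !$held;
    }
    return $held ? '1' : '0';
}
```

Under these assumptions a non-empty value such as "text" gives "1" for if and "0" for unless, while "" and "0" give the opposite.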
Both if and unless commands can be called in function form, retaining the same format. They return either "0" or "1" for the loose boolean value of the operand (if there are no THEN/ELSE blocks) or the result of the last command called in one of those blocks (or "0" if the condition didn’t hold and there was no ELSE block).
You still need to maintain proper indentation.
say `ret Flag is Boolean `if flag
  True.
else
  False.
As you can see, no other arguments for parent command can follow if/unless block – there’s no space for them. If you want to pass multiple arguments with one or more `if calls among them use variables like this:
set @x `if flag_1
set @y `if flag_2
say X is: `get @x and Y is: `get @y
See also function form examples.
if info/files?
  say Many files (`count info/files).
else
  say Single file.
As Scripter’s commands (functions) are separate classes, adding a new command means creating a new class named:
ScriptCmd_[your_cluster_name_]command_name
For example: ScriptCmd_mycommand or ScriptCmd_mycluster_mycommand.
class ScriptCmd_mycluster_count extends BaseScriptCmd {
    function Exec() {
        return count($this->NextArg(true));
    }
}
class ScriptCmd_mycluster_if extends ScriptCmd_if {
    function __construct($cmd) {
        parent::__construct($cmd);
        // Register '~' as an additional condition operator.
        $this->conds[] = '~';
    }

    function MatchCond($value, $cond = null, $against = null) {
        if (strtolower($cond) === '~') {
            // '~' matches $value against the regular expression in $against.
            return preg_match($against, $value);
        } else {
            return parent::MatchCond($value, $cond, $against);
        }
    }
}
There’s no separate concept of «variables» in Scripter – similar to Windows command line shell they’re operated via regular commands or functions. This includes reading variables (done with get varName) and writing them (set varName value).
The difference is that in Scripter you can specify command aliases that don’t have to be proper identifiers like GET. By convention Scripter-derived interpreters use @ symbol to refer to variables similar to LESS for CSS:
say @myVar
The above is complete equivalent of the following except that GET is called indirectly and there’s no explicit backtick operator (`):
say `get myVar
To accomplish this, the Scripter class is extended to set up the aliases and maintain a variable pool (since Scripter doesn’t have one by default):
class MyVarAwareScripter extends Scripter {
    public $vars = array();

    function __construct($script) {
        parent::__construct($script);
        // see comments inside scripter.php for these fields.
        $this->cmdAliases['@'] = 'get';
        $this->cmdSymbols['@'] = true;
        $this->backticks['@'] = 'get';
    }
}
Now let’s implement GET and SET commands:
// Accepts one or more variable names, looks each of them up and returns
// their values joined together.
class ScriptCmd_get extends BaseScriptCmd {
    function Exec() {
        $result = '';
        foreach ($this->GetAllArgsAtLeast(1) as $var) {
            $result .= @$this->script->vars[$var];
        }
        return $result;
    }
}
// Sets the variable whose name is given as the first mandatory argument to the value
// given as the second argument (also mandatory).
class ScriptCmd_set extends ScriptCmd_Text_copy {
    public $toWord = '=';

    function Process($doc, $newText) {
        $var = $this->NextArg(true);
        $this->script->vars[$var] = $this->NextArg(true);
    }
}
The main Scripter object (MyVarAwareScripter here), which any executing command/function can access via its $script property, is a common place to store global script instance data or to put shared API methods.
After we’ve done the above we can assign and read variables:
set myVar "There's a good day `today"
say "myVar is @myVar"
# or the equivalent but longer:
say "myVar is `get myVar"