Strus query evaluation configuration source

Language grammar

The following grammar (in EBNF) defines the language for describing a query evaluation scheme, as used by the strus utilities (strusUtilities).


Comments

Comments start with # and extend to the end of the line. A # can be used as part of a symbol if it appears inside a single or double quoted string.
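The comment rule can be illustrated with a minimal Python sketch. The function name strip_comment is hypothetical and not part of strus; the sketch assumes backslash escaping inside quoted strings, as stated for STRING in the grammar.

```python
def strip_comment(line: str) -> str:
    """Return `line` without a trailing # comment, ignoring # inside quotes."""
    quote = None                  # currently open quote character, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\":        # backslash escapes the next character
                i += 1
            elif ch == quote:     # closing quote found
                quote = None
        elif ch in ("'", '"'):    # opening quote
            quote = ch
        elif ch == "#":           # comment start outside of any string
            return line[:i]
        i += 1
    return line
```

Applied to the line `TERM sel "a#b" word; # note`, the sketch keeps the # inside the quoted string and drops only the trailing comment.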

Handling of spaces

Spaces, control characters and line breaks have no meaning in the language.

Case sensitivity/insensitivity

Parameter names (keys) of the query evaluation scheme are case insensitive. Keywords and identifiers referring to elements in the storage are case insensitive.
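A reader implementing a parser might handle case insensitivity as in the following Python sketch. The helper names and the choice to compare via upper() and casefold() are assumptions for illustration; the keyword set is taken from the grammar below.

```python
# Keywords of the language as they appear in the grammar.
KEYWORDS = {"EVAL", "FORMULA", "SUMMARIZE", "SELECT", "WEIGHT", "RESTRICT", "TERM"}

def is_keyword(word: str) -> bool:
    """True if `word` is a keyword, regardless of letter case."""
    return word.upper() in KEYWORDS

def same_name(a: str, b: str) -> bool:
    """True if two parameter names or identifiers are equal ignoring case."""
    return a.casefold() == b.casefold()
```

Under this sketch `select` and `SELECT` are the same keyword, and `AvgDocLen` names the same parameter as `avgdoclen`.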


IDENTIFIER     : [A-Za-z][A-Za-z0-9_]*
STRING         : <single or double quoted string with backslash escaping>
NUMBER         : <integer or floating point number in non exponential notation>

config         = { statement ";" }
statement      = evalexpr | scalarexpr | selectexpr | weightexpr | restrictexpr | termdef | summarizeexpr
evalexpr       = "EVAL" [ NUMBER "*" ] functionname "(" parameterlist ")"
scalarexpr     = "FORMULA" STRING
selectexpr     = "SELECT" featureset
weightexpr     = "WEIGHT" featureset
restrictexpr   = "RESTRICT" featureset
termdef        = "TERM" featureset termvalue termtype
summarizeexpr  = "SUMMARIZE" functionname "(" parameterlist ")"
functionname   = IDENTIFIER
featureset     = IDENTIFIER
termtype       = IDENTIFIER
termvalue      = IDENTIFIER | STRING
parameterlist  = parameter { "," parameter }
parameter      = parametername "=" parametervalue
parametername  = [ "." ] IDENTIFIER
parametervalue = IDENTIFIER | STRING | NUMBER
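The lexical rules above (IDENTIFIER, STRING, NUMBER, comments and insignificant whitespace) can be sketched as a small tokenizer in Python. This is an illustration only, not the strus implementation: the token type names, the PUNCT class and the restriction to unsigned numbers are assumptions.

```python
import re

# One alternative per token class; spaces and comments are matched but dropped.
TOKEN_RE = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<COMMENT>\#[^\n]*)
  | (?P<STRING>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<IDENTIFIER>[A-Za-z][A-Za-z0-9_]*)
  | (?P<PUNCT>[;(),=.*])
""", re.VERBOSE)

def tokenize(source: str):
    """Split `source` into (type, text) pairs, skipping spaces and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup not in ("SPACE", "COMMENT"):
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

For the input `EVAL bm25( k1=0.75 );` the sketch yields the keyword and function name as IDENTIFIER tokens, `0.75` as a NUMBER token, and the brackets, `=` and `;` as PUNCT tokens. A parser for the grammar rules would then consume this token list.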


Meaning of the grammar elements


functionname

Name of the weighting or summarization function as provided by the query processor.


parametername

Name of a parameter passed to the weighting or summarization function. A parameter name prefixed with a dot '.' declares a feature parameter. The parameter names accepted by a weighting or summarization function depend on its implementation.
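The distinction between feature parameters and other parameters could be handled as in the following Python sketch. The function split_parameters and its input format (a list of name/value pairs as written in the configuration) are hypothetical; lowercasing the names is one possible way to implement the case insensitivity of parameter names.

```python
def split_parameters(pairs):
    """Separate feature parameters (dot-prefixed names) from plain parameters.

    `pairs` is a list of (name, value) tuples in source order.
    Returns two dicts: (feature_parameters, plain_parameters).
    """
    features, values = {}, {}
    for name, value in pairs:
        if name.startswith("."):
            features[name[1:].lower()] = value   # drop the dot prefix
        else:
            values[name.lower()] = value
    return features, values
```

For a parameter list such as `k1=0.75, b=2.1, avgdoclen=1000, .match=docfeat`, only `match` ends up among the feature parameters; the other three are plain parameters.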

EVAL function

Defines a query evaluation function used for weighting.

FORMULA scalar-function

Defines a scalar function (with _0, _1, ... referring to the query evaluation function results in order of their definition) used to combine the query evaluation function results into one result. If no formula is specified, the results are simply added up.
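The combination of results can be illustrated with a Python sketch. It is not the strus formula interpreter: the function combine is hypothetical, it takes the formula text without the surrounding quotes, and it assumes that only numbers, the placeholders _0, _1, ... and the operators + - * / occur in the formula.

```python
import ast
import operator
import re

# Binary operators the sketch accepts in a formula.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _eval(node, results):
    """Evaluate a parsed arithmetic expression over the list `results`."""
    if isinstance(node, ast.Expression):
        return _eval(node.body, results)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and re.fullmatch(r"_\d+", node.id):
        return results[int(node.id[1:])]      # _i refers to the i-th result
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, results),
                                   _eval(node.right, results))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, results)
    raise ValueError("unsupported formula element")

def combine(results, formula=None):
    """Combine weighting results: plain sum if no formula is given."""
    if formula is None:
        return sum(results)
    return _eval(ast.parse(formula.strip(), mode="eval"), results)
```

With two results 2.0 and 0.5, combine without a formula adds them up, while the formula `0.7 * _1 * _0 + 0.3 * _0` computes 0.7 * 0.5 * 2.0 + 0.3 * 2.0.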

SUMMARIZE function

Defines a summarizer function used for building the results.

SELECT featureset

Defines the feature set used for selecting the documents to weight.

WEIGHT featureset

Defines the feature set used for weighting.

RESTRICT featureset

Defines the feature set used as a restriction.


Example

The following example declares the feature set 'selfeat' as the selection set: all documents containing a feature of 'selfeat' are selected for ranking.
The weight of a document is computed by a formula that combines the 'bm25' weight of the document (_0) with the value of the meta data element called 'pageweight' (_1): 0.7 * _1 * _0 + 0.3 * _0.
For presenting the result we use two summarizers: one extracting the title attribute and one collecting the content elements of the best matching phrases.

SELECT selfeat;

EVAL bm25( k1=0.75, b=2.1, avgdoclen=1000, .match=docfeat );
EVAL metadata( name=pageweight );
FORMULA "0.7 * _1 * _0 + 0.3 * _0";

SUMMARIZE title = attribute( name=title );
SUMMARIZE content = matchphrase(
                        type=orig, nof=4, len=60,
                        structseek=40, .struct=sentence, .match=docfeat );