- // Copyright (C) 2015-2016 Internet Systems Consortium, Inc. ("ISC")
- //
- // This Source Code Form is subject to the terms of the Mozilla Public
- // License, v. 2.0. If a copy of the MPL was not distributed with this
- // file, You can obtain one at http://mozilla.org/MPL/2.0/.
- /**
- @page dhcpEval libeval - Expression evaluation and client classification
- @section dhcpEvalIntroduction Introduction
- The core of the libeval library is a parser that is able to parse an
- expression (e.g. option[123].text == 'APC'). It is currently used for
- client classification, but may also be used for other applications
- in the future.
- The external interface to the library is the @ref isc::eval::EvalContext
- class. Its main method is
- @ref isc::eval::EvalContext::parseString, which parses the specified
- string. The parsed expression is converted to a collection of
- tokens that are stored in Reverse Polish Notation in
- EvalContext::expression.
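To illustrate why Reverse Polish Notation is convenient for evaluation, here is a minimal sketch of a stack-based evaluator. The names RpnToken and evaluateRpn are hypothetical and are not part of libeval; the sketch only assumes that a value token pushes its value and that an operator token pops its operands and pushes the result.

```cpp
#include <cassert>
#include <stack>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified token; this is not the libeval Token class.
struct RpnToken {
    enum Kind { VALUE, EQUAL };
    Kind kind;
    std::string value;  // used only when kind == VALUE
};

// Evaluates tokens stored in Reverse Polish Notation and returns the
// single value left on the stack ("true" or "false" after a comparison).
std::string evaluateRpn(const std::vector<RpnToken>& expression) {
    std::stack<std::string> values;
    for (const RpnToken& token : expression) {
        if (token.kind == RpnToken::VALUE) {
            values.push(token.value);
            continue;
        }
        assert(token.kind == RpnToken::EQUAL);  // only two kinds in this sketch
        if (values.size() < 2) {
            throw std::runtime_error("EQUAL needs two operands");
        }
        // Operands come off the stack in reverse order.
        const std::string rhs = values.top();
        values.pop();
        const std::string lhs = values.top();
        values.pop();
        values.push(lhs == rhs ? "true" : "false");
    }
    if (values.size() != 1) {
        throw std::runtime_error("malformed expression");
    }
    return values.top();
}
```

In this sketch the infix expression 'APC' == 'APC' corresponds to the sequence VALUE(APC), VALUE(APC), EQUAL. No parentheses or precedence rules are needed at evaluation time, because the order of the tokens already encodes them.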
- Internally, the parser code is generated by flex and bison. These two
- tools convert lexer.ll and parser.yy files into a number of .cc and .hh files.
- To avoid making a Kea build depend on the presence of flex and bison, the
- generated files are checked into the GitHub repository and are
- distributed in the tarballs.
- @section dhcpEvalLexer Lexer generation using flex
- Flex is used to generate the lexer, a piece of code that converts input
- data into a series of tokens. The lexer definition, lexer.ll, contains a
- small number of directives, but the majority of it consists of token
- definitions. These definitions are regular expressions that describe the
- various tokens, e.g. strings, numbers, parentheses, etc. When a regular
- expression matches, the associated action is executed. In the majority of
- cases the action calls a generator method from @ref isc::eval::EvalParser,
- which returns a newly created bison token. The purpose of the lexer is to
- generate a stream of tokens that are consumed by the parser.
- lexer.cc and lexer.hh must not be edited. If there is a need
- to introduce changes, lexer.ll must be updated and the .cc and .hh files
- regenerated.
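The job of the lexer can be illustrated with a small hand-written tokenizer. This is only a sketch of the idea and not the flex-generated code: LexToken and tokenize are hypothetical names, token types are plain strings rather than bison tokens, and hex strings are left out for brevity.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token representation; the real lexer returns bison tokens.
struct LexToken {
    std::string type;  // e.g. "OPTION", "INTEGER", "STRING", "EQUAL"
    std::string text;  // matched text (without quotes for strings)
};

// Returns true if the character is classified by the given <cctype> test.
static bool is(int (*test)(int), char c) {
    return test(static_cast<unsigned char>(c)) != 0;
}

// Converts an expression such as option[123].text == 'APC' into tokens.
std::vector<LexToken> tokenize(const std::string& input) {
    std::vector<LexToken> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        const char c = input[i];
        if (is(std::isspace, c)) {
            ++i;  // whitespace separates tokens but produces none
        } else if (c == '\'') {
            const std::size_t end = input.find('\'', i + 1);
            if (end == std::string::npos) {
                throw std::runtime_error("unterminated string");
            }
            tokens.push_back({"STRING", input.substr(i + 1, end - i - 1)});
            i = end + 1;
        } else if (is(std::isdigit, c)) {
            std::size_t end = i;
            while (end < input.size() && is(std::isdigit, input[end])) {
                ++end;
            }
            tokens.push_back({"INTEGER", input.substr(i, end - i)});
            i = end;
        } else if (is(std::isalpha, c)) {
            std::size_t end = i;
            while (end < input.size() && is(std::isalpha, input[end])) {
                ++end;
            }
            assert(end > i);  // a keyword has at least one letter
            const std::string word = input.substr(i, end - i);
            if (word == "option") {
                tokens.push_back({"OPTION", word});
            } else if (word == "text") {
                tokens.push_back({"TEXT", word});
            } else if (word == "hex") {
                tokens.push_back({"HEX", word});
            } else {
                throw std::runtime_error("unknown keyword: " + word);
            }
            i = end;
        } else if (input.compare(i, 2, "==") == 0) {
            tokens.push_back({"EQUAL", "=="});
            i += 2;
        } else if (c == '[' || c == ']') {
            tokens.push_back({input.substr(i, 1), input.substr(i, 1)});
            ++i;
        } else if (c == '.') {
            tokens.push_back({"DOT", "."});
            ++i;
        } else {
            throw std::runtime_error("unexpected character");
        }
    }
    return tokens;
}
```

With this sketch, the input option[123].text == 'APC' yields the token types OPTION, [, INTEGER, ], DOT, TEXT, EQUAL and STRING, in that order.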
- @section dhcpEvalParser Parser generation using bison
- Bison is used to generate the parser, a piece of code that consumes a
- stream of tokens and attempts to match it against a defined grammar.
- The bison parser is created from parser.yy. This file contains
- a number of directives, but its two most important sections are
- the list of tokens (for each token defined there, bison generates a
- make_NAMEOFTOKEN method in the @ref isc::eval::EvalParser class) and
- the grammar. The grammar is a tree-like structure with possible loops.
- Here is an over-simplified version of the grammar:
- @code
- 01. %start expression;
- 02.
- 03. expression : token EQUAL token
- 04. | token
- 05. ;
- 06.
- 07. token : STRING
- 08. {
- 09. TokenPtr str(new TokenString($1));
- 10. ctx.expression.push_back(str);
- 11. }
- 12. | HEXSTRING
- 13. {
- 14. TokenPtr hex(new TokenHexString($1));
- 15. ctx.expression.push_back(hex);
- 16. }
- 17. | OPTION '[' INTEGER ']' DOT TEXT
- 18. {
- 19. TokenPtr opt(new TokenOption($3, TokenOption::TEXTUAL));
- 20. ctx.expression.push_back(opt);
- 21. }
- 22. | OPTION '[' INTEGER ']' DOT HEX
- 23. {
- 24. TokenPtr opt(new TokenOption($3, TokenOption::HEXADECIMAL));
- 25. ctx.expression.push_back(opt);
- 26. }
- 27. ;
- @endcode
- This code specifies that the grammar starts from expression (line 1).
- The definition of expression (lines 3-5) may be either an expression
- "token == token" (EQUAL has been defined as "==" elsewhere) or a single
- token. Token is further defined in lines 7-27: it may be a string
- (lines 7-11), a hex string (lines 12-16), an option in the textual format
- (lines 17-21) or an option in the hexadecimal format (lines 22-26).
- Once the matching alternative is determined, the respective C++ action
- is executed. For example, if the token is a string, the TokenString class is
- instantiated with the appropriate value and put onto the expression vector.
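The way grammar rules and their actions build the expression vector can be sketched with a small hand-written recursive-descent parser for the simplified grammar. This is not how bison works internally (bison generates a table-driven parser); it only mirrors the rules. InToken and SketchParser are hypothetical names, the output vector holds descriptive strings instead of TokenPtr objects, and the sketch assumes that the "token EQUAL token" rule appends an equality token after its two operands, an action the simplified listing does not show.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical input token, i.e. what a lexer would hand to the parser.
struct InToken {
    std::string type;  // "STRING", "HEXSTRING", "OPTION", "INTEGER", ...
    std::string text;
};

class SketchParser {
public:
    explicit SketchParser(const std::vector<InToken>& input)
        : input_(input), pos_(0) {}

    // expression : token EQUAL token | token
    // Meant to be called once per parser object.
    std::vector<std::string> parseExpression() {
        assert(expression_.empty());
        parseToken();
        if (pos_ < input_.size() && input_[pos_].type == "EQUAL") {
            ++pos_;
            parseToken();
            expression_.push_back("EQUAL");  // operator follows its operands
        }
        if (pos_ != input_.size()) {
            throw std::runtime_error("unexpected trailing input");
        }
        return expression_;
    }

private:
    // Consumes the next input token, which must have the expected type.
    const InToken& next(const std::string& expected) {
        if (pos_ >= input_.size() || input_[pos_].type != expected) {
            throw std::runtime_error("expected " + expected);
        }
        return input_[pos_++];
    }

    // token : STRING | HEXSTRING | OPTION '[' INTEGER ']' DOT (TEXT | HEX)
    void parseToken() {
        if (pos_ >= input_.size()) {
            throw std::runtime_error("unexpected end of input");
        }
        const std::string type = input_[pos_].type;
        if (type == "STRING" || type == "HEXSTRING") {
            expression_.push_back(type + "(" + input_[pos_].text + ")");
            ++pos_;
            return;
        }
        next("OPTION");
        next("[");
        const std::string code = next("INTEGER").text;
        next("]");
        next("DOT");
        if (pos_ >= input_.size() ||
            (input_[pos_].type != "TEXT" && input_[pos_].type != "HEX")) {
            throw std::runtime_error("expected TEXT or HEX");
        }
        expression_.push_back("OPTION(" + code + "," + input_[pos_].type + ")");
        ++pos_;
    }

    std::vector<InToken> input_;
    std::size_t pos_;
    std::vector<std::string> expression_;
};
```

For the token sequence of option[123].text == 'APC', this sketch produces OPTION(123,TEXT), STRING(APC), EQUAL, which is the Reverse Polish Notation form of the comparison.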
- @section dhcpEvalMakefile Generating parser files
- In the general case, we want to avoid generating parser files, so that
- a user interested only in compiling Kea does not need flex or
- bison. Therefore the generated files are included in the
- git repository and in the tarball releases.
- However, developers who modify lexer.ll or parser.yy need to regenerate
- the code. For this purpose, two makefile targets are defined:
- @code
- make parser
- @endcode
- will generate the parsers and
- @code
- make parser-clean
- @endcode
- will remove the generated files. Their removal is also hooked
- into the maintainer-clean target.
- @section dhcpEvalConfigure Configure options
- Since the flex and bison tools are not necessary for a regular compilation,
- the configure script checks for them, but their absence does not stop
- the process. There is a flag (--enable-generate-parser) that tells the
- configure script that the parser will be generated. With this flag, the
- checks for flex and bison are mandatory. If either tool is missing or its
- version is too old, the configure process terminates with an error.
- @section dhcpEvalToken Supported tokens
- There are a number of tokens implemented. Each token is derived from the
- isc::dhcp::Token class and represents a certain expression primitive.
- Currently supported tokens are:
- - isc::dhcp::TokenString -- represents a constant string, e.g. "MSFT";
- - isc::dhcp::TokenHexString -- represents a constant string, encoded as
- hex string, e.g. 0x666f6f which is actually "foo";
- isc::dhcp::TokenIpAddress -- represents a constant IP address, encoded as
- a 4 or 16 byte binary string, e.g. 10.0.0.1 is 0x0a000001.
- - isc::dhcp::TokenOption -- represents an option in a packet, e.g.
- option[123].text;
- - isc::dhcp::TokenRelay4Option -- represents a sub-option inserted by the
- DHCPv4 relay, e.g. relay[123].text or relay[123].hex
- - isc::dhcp::TokenRelay6Option -- represents a sub-option inserted by
- a DHCPv6 relay
- isc::dhcp::TokenPkt -- represents DHCP packet metadata (incoming
- interface name, source/remote or destination/local IP address, length).
- - isc::dhcp::TokenPkt4 -- represents a DHCPv4 packet field.
- - isc::dhcp::TokenPkt6 -- represents a DHCPv6 packet field (message type
- or transaction id).
- - isc::dhcp::TokenRelay6Field -- represents a DHCPv6 relay information field.
- - isc::dhcp::TokenEqual -- represents the equal (==) operator;
- - isc::dhcp::TokenSubstring -- represents the substring(text, start, length) operator;
- isc::dhcp::TokenConcat -- represents the concat operator, which
- concatenates two other tokens.
- - isc::dhcp::TokenNot -- the logical not operator.
- - isc::dhcp::TokenAnd -- the logical and (strict) operator.
- - isc::dhcp::TokenOr -- the logical or (strict) operator (strict means
- it always evaluates its operands).
- - isc::dhcp::TokenVendor -- represents vendor information option's existence,
- enterprise-id field and possible sub-options. (e.g. vendor[1234].exists,
- vendor[*].enterprise-id, vendor[1234].option[1].exists, vendor[1234].option[1].hex)
- isc::dhcp::TokenVendorClass -- represents vendor class option's existence,
- enterprise-id and included data chunks. (e.g. vendor-class[1234].exists,
- vendor-class[*].enterprise-id, vendor-class[*].data[3])
- More operators are expected to be implemented in upcoming releases.
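The encodings of the constant tokens listed above can be checked with a short sketch. The helper names decodeHexString and encodeIPv4 are hypothetical and are not taken from libeval. As a simplification, the hex decoder requires an even number of digits and the address encoder handles only dotted-quad IPv4 input.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <stdexcept>
#include <string>

// Decodes a hex string constant such as "0x666f6f" into its bytes ("foo").
std::string decodeHexString(const std::string& hex) {
    if (hex.size() < 4 || hex.compare(0, 2, "0x") != 0 || hex.size() % 2 != 0) {
        throw std::invalid_argument("expected 0x and an even number of digits");
    }
    std::string bytes;
    for (std::size_t i = 2; i < hex.size(); i += 2) {
        const std::string pair = hex.substr(i, 2);
        if (!std::isxdigit(static_cast<unsigned char>(pair[0])) ||
            !std::isxdigit(static_cast<unsigned char>(pair[1]))) {
            throw std::invalid_argument("not a hex digit");
        }
        const int value = std::stoi(pair, nullptr, 16);
        bytes.push_back(static_cast<char>(static_cast<unsigned char>(value)));
    }
    assert(bytes.size() == (hex.size() - 2) / 2);  // two digits per byte
    return bytes;
}

// Encodes a dotted-quad IPv4 address as a 4 byte binary string,
// e.g. "10.0.0.1" becomes the bytes 0a 00 00 01.
std::string encodeIPv4(const std::string& dotted) {
    std::istringstream in(dotted);
    std::string bytes;
    for (int i = 0; i < 4; ++i) {
        char dot = '.';
        int octet = -1;
        if (i > 0) {
            in >> dot;  // separator before the 2nd, 3rd and 4th octet
        }
        in >> octet;
        if (!in || dot != '.' || octet < 0 || octet > 255) {
            throw std::invalid_argument("malformed IPv4 address");
        }
        bytes.push_back(static_cast<char>(static_cast<unsigned char>(octet)));
    }
    if (in.peek() != std::char_traits<char>::eof()) {
        throw std::invalid_argument("trailing characters after address");
    }
    return bytes;
}
```

The first octet of 10.0.0.1 is decimal 10, which is 0x0a in hexadecimal, so the encoded address starts with the byte 0a rather than 10.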
- */