The Boost C++ Libraries

Grammar

If you want to parse complex formats and need to define multiple rules that refer to each other, you can group them with boost::spirit::qi::grammar.

Example 11.15. Grouping rules in a grammar
#include <boost/spirit/include/qi.hpp>
#include <boost/variant.hpp>
#include <string>
#include <vector>
#include <iostream>

using namespace boost::spirit;

template <typename Iterator, typename Skipper>
struct my_grammar : qi::grammar<Iterator,
  std::vector<boost::variant<int, bool>>(), Skipper>
{
  my_grammar() : my_grammar::base_type{values}
  {
    value = qi::int_ | qi::bool_;
    values = value % ',';
  }

  qi::rule<Iterator, boost::variant<int, bool>(), Skipper> value;
  qi::rule<Iterator, std::vector<boost::variant<int, bool>>(), Skipper>
    values;
};

struct print : public boost::static_visitor<>
{
  template <typename T>
  void operator()(T t) const
  {
    std::cout << std::boolalpha << t << ';';
  }
};

int main()
{
  std::string s;
  std::getline(std::cin, s);
  auto it = s.begin();
  my_grammar<std::string::iterator, ascii::space_type> g;
  std::vector<boost::variant<int, bool>> v;
  if (qi::phrase_parse(it, s.end(), g, ascii::space, v))
  {
    for (const auto &elem : v)
      boost::apply_visitor(print{}, elem);
  }
}

Example 11.15 works like Example 11.14: you can enter integers and boolean values in any order, delimited by commas. They will be written to the standard output stream in the same order, but delimited by semicolons. The example uses the same rules – value and values – as the previous one. However, this time the rules are grouped in a grammar. The grammar is defined in a class called my_grammar, which is derived from boost::spirit::qi::grammar.

Both my_grammar and boost::spirit::qi::grammar are class templates. boost::spirit::qi::grammar expects the same template parameters as boost::spirit::qi::rule. The iterator type of the string to be parsed must always be passed. Optionally, you can also pass a function signature, which defines the attribute type, and the type of the skipper.

In my_grammar, boost::spirit::qi::rule is used to define the rules value and values. The rules are defined as member variables and are initialized in the constructor.

Please note that the outermost rule has to be passed to the constructor of the base class, which is available under the name base_type. This way, Boost.Spirit knows which rule is the entry point of the grammar.

Once a grammar is defined, it can be used like a parser. In Example 11.15, my_grammar is instantiated in main() to create g. g is then passed to boost::spirit::qi::phrase_parse().

Example 11.16. Storing parsed values in structures
#include <boost/spirit/include/qi.hpp>
#include <boost/variant.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <string>
#include <vector>
#include <iostream>

using namespace boost::spirit;

typedef boost::variant<int, bool> int_or_bool;

struct int_or_bool_values
{
  int_or_bool first;
  std::vector<int_or_bool> others;
};

BOOST_FUSION_ADAPT_STRUCT(
  int_or_bool_values,
  (int_or_bool, first)
  (std::vector<int_or_bool>, others)
)

template <typename Iterator, typename Skipper>
struct my_grammar : qi::grammar<Iterator, int_or_bool_values(), Skipper>
{
  my_grammar() : my_grammar::base_type{values}
  {
    value = qi::int_ | qi::bool_;
    values = value >> ',' >> value % ',';
  }

  qi::rule<Iterator, int_or_bool(), Skipper> value;
  qi::rule<Iterator, int_or_bool_values(), Skipper> values;
};

struct print : public boost::static_visitor<>
{
  template <typename T>
  void operator()(T t) const
  {
    std::cout << std::boolalpha << t << ';';
  }
};

int main()
{
  std::string s;
  std::getline(std::cin, s);
  auto it = s.begin();
  my_grammar<std::string::iterator, ascii::space_type> g;
  int_or_bool_values v;
  if (qi::phrase_parse(it, s.end(), g, ascii::space, v))
  {
    print p;
    boost::apply_visitor(p, v.first);
    for (const auto &elem : v.others)
      boost::apply_visitor(p, elem);
  }
}

Example 11.16 is based on the previous example, but expects at least two values. The rule values is defined as value >> ',' >> value % ','.

The first component in values is value, and the second one is value % ','. The value parsed by the first component has to be stored in an object of type boost::variant. The values parsed by the second component have to be stored in a container. With int_or_bool_values, the example provides a structure to store values parsed by both components of the rule values.

To use int_or_bool_values with Boost.Spirit, the structure must be adapted with the macro BOOST_FUSION_ADAPT_STRUCT, which is provided by Boost.Fusion. The macro makes it possible to treat int_or_bool_values like a tuple with two values of the types int_or_bool and std::vector<int_or_bool>. Because this tuple has the right number of values with the right types, it is possible to define values with the signature int_or_bool_values(). values will store the first parsed value in first and all other parsed values in others.

An object of type int_or_bool_values is passed to boost::spirit::qi::phrase_parse() as an attribute. If you start the example and enter at least two integers or boolean values delimited by commas, they are all stored in the attribute and written to the standard output stream.

Note

The parser differs from the one used in the previous example. If values were defined as value % ',', int_or_bool_values would need only one member variable: a vector that stores all parsed values, as in the previous example. int_or_bool_values would then be like a tuple with only one value, which Boost.Spirit doesn’t support. Structures with only one member variable cause a compiler error. There are various workarounds for that problem.

Exercises

  1. Create a parser that can add and subtract integers. The parser should be able to process input like 1+2-5+8 and write the result – here 6 – to standard output.

  2. Extend your parser: It should now also support floating-point numbers. Furthermore, it should be possible to use fractions. The new parser should be able to process input like 1.2+6/5-0.9 and should write the result – here 1.5 – to standard output.