The Boost C++ Libraries

API

Boost.Spirit provides two functions for parsing input: boost::spirit::qi::parse() and boost::spirit::qi::phrase_parse().

Example 11.1. Using boost::spirit::qi::parse()
#include <boost/spirit/include/qi.hpp>
#include <string>
#include <iostream>

using namespace boost::spirit;

int main()
{
  std::string s;
  std::getline(std::cin, s);
  auto it = s.begin();
  bool match = qi::parse(it, s.end(), ascii::digit);
  std::cout << std::boolalpha << match << '\n';
  if (it != s.end())
    std::cout << std::string{it, s.end()} << '\n';
}

Example 11.1 introduces boost::spirit::qi::parse(). This function expects two iterators that delimit the string being parsed, followed by a parser. The example uses the parser boost::spirit::ascii::digit, which is provided by Boost.Spirit. It is one of several character classification parsers, which test whether a character belongs to a certain class. boost::spirit::ascii::digit tests whether a character is a digit between 0 and 9.

The example passes iterators of a string that is read from std::cin. Note that the begin iterator isn’t passed directly to boost::spirit::qi::parse(). It is stored in the variable it, which is then passed to boost::spirit::qi::parse(). This is necessary because boost::spirit::qi::parse() takes its first parameter by reference and may modify the iterator.

If you type a digit and press Enter, the example displays true. If you type two digits and press Enter, the output is true followed by the second digit. If you type a letter and press Enter, the output is false followed by the letter.

The parser boost::spirit::ascii::digit, as used in Example 11.1, tests exactly one character to see whether it’s a digit. If the first character is a digit, boost::spirit::qi::parse() returns true; otherwise, it returns false. The return value of boost::spirit::qi::parse() indicates whether the parser succeeded.

boost::spirit::qi::parse() also returns true if you enter multiple digits. Because the parser boost::spirit::ascii::digit only tests the first character, it will succeed on such a string. All digits after the first will be ignored.

To let you determine how much of the string could be parsed successfully, boost::spirit::qi::parse() changes the iterator it. After a call to boost::spirit::qi::parse(), it refers to the character after the last one parsed successfully. If you enter multiple digits, it refers to the second digit. If you enter exactly one digit, it equals the end iterator of s. If you enter a letter, it refers to that letter.

boost::spirit::qi::parse() does not ignore spaces. If you run Example 11.1 and enter a space, false is displayed. boost::spirit::qi::parse() tests the first entered character, even if that character is a space. If you want to ignore spaces, use boost::spirit::qi::phrase_parse() instead of boost::spirit::qi::parse().

Example 11.2. Using boost::spirit::qi::phrase_parse()
#include <boost/spirit/include/qi.hpp>
#include <string>
#include <iostream>

using namespace boost::spirit;

int main()
{
  std::string s;
  std::getline(std::cin, s);
  auto it = s.begin();
  bool match = qi::phrase_parse(it, s.end(), ascii::digit, ascii::space);
  std::cout << std::boolalpha << match << '\n';
  if (it != s.end())
    std::cout << std::string{it, s.end()} << '\n';
}

boost::spirit::qi::phrase_parse() works like boost::spirit::qi::parse() but expects an additional parameter, the skipper. The skipper is a parser for characters that should be ignored. Example 11.2 uses boost::spirit::ascii::space, a character classification parser that matches whitespace such as spaces and tabs, as the skipper.

The skipper boost::spirit::ascii::space treats spaces as delimiters and discards them. If you start the example and enter a space followed by a digit, it displays true. Unlike in Example 11.1, the parser boost::spirit::ascii::digit is not applied to the space, but to the first character that isn’t a space.

Note that this example ignores any number of spaces. Thus, boost::spirit::qi::phrase_parse() returns true if you enter multiple spaces followed by a digit.

Like boost::spirit::qi::parse(), boost::spirit::qi::phrase_parse() modifies the iterator passed as the first parameter. That way, you know how much of the string was parsed successfully. Example 11.2 also skips spaces that occur after successfully parsed characters. If you enter a digit followed by a space followed by a letter, the iterator will refer to the letter, not the space in front of it. If you want the iterator to refer to the space, pass boost::spirit::qi::skip_flag::dont_postskip as another parameter to boost::spirit::qi::phrase_parse().

Example 11.3. phrase_parse() with boost::spirit::qi::skip_flag::dont_postskip
#include <boost/spirit/include/qi.hpp>
#include <string>
#include <iostream>

using namespace boost::spirit;

int main()
{
  std::string s;
  std::getline(std::cin, s);
  auto it = s.begin();
  bool match = qi::phrase_parse(it, s.end(), ascii::digit, ascii::space,
    qi::skip_flag::dont_postskip);
  std::cout << std::boolalpha << match << '\n';
  if (it != s.end())
    std::cout << std::string{it, s.end()} << '\n';
}

Example 11.3 passes boost::spirit::qi::skip_flag::dont_postskip to boost::spirit::qi::phrase_parse() to tell the parser not to skip spaces that occur after a successfully parsed digit, but before the first unsuccessfully parsed character. If you enter a digit followed by a space followed by a letter, the iterator it refers to the space after the call to boost::spirit::qi::phrase_parse().

The flag boost::spirit::qi::skip_flag::postskip is the default value. It is used if no flag is passed to boost::spirit::qi::phrase_parse().

Example 11.4. boost::spirit::qi::phrase_parse() with wide strings
#include <boost/spirit/include/qi.hpp>
#include <string>
#include <iostream>

using namespace boost::spirit;

int main()
{
  std::wstring s;
  std::getline(std::wcin, s);
  auto it = s.begin();
  bool match = qi::phrase_parse(it, s.end(), ascii::digit, ascii::space,
    qi::skip_flag::dont_postskip);
  std::wcout << std::boolalpha << match << '\n';
  if (it != s.end())
    std::wcout << std::wstring{it, s.end()} << '\n';
}

boost::spirit::qi::parse() and boost::spirit::qi::phrase_parse() also accept iterators to a wide string. Example 11.4 works like Example 11.3, except that wide strings are used.

Boost.Spirit also supports the string types std::u16string and std::u32string from the C++11 standard library.