Exercise 06.06.20

The text for this exercise reads:

(*2.5) Write a program that strips comments out of a C++ program. That is, read from cin, remove both // comments and /* */ comments, and write the result to cout. Do not worry about making the output look nice (that would be another, and much harder, exercise). Do not worry about incorrect programs. Beware of //, /*, and */ in comments, strings, and character constants.

You can see the complete source, but the important code is the process function. You just give it the source a line at a time and it processes it.

string process(string tmp)
{
 static bool in_c_comment = false;

 if (!in_c_comment)
    {
     if (tmp.find("/") == string::npos)
        return tmp;
    }

 if (in_c_comment)
    {
     string::size_type where = tmp.find("*/");

     if (where == string::npos)
        return string();   // the whole line is still inside the comment

     in_c_comment = false;
     where += 2; // skip past the closing "*/"

     if (where == tmp.size()) return string();

     string str(tmp, where, string::npos);
     return process(str);
    }


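 char quote = 0;   // quote character that opened the current literal, 0 when outside one

 for (string::size_type position = 0; position < tmp.size(); position++)
     {
      if (quote)
         {
          if (tmp[position] == '\\')
              position++;        // skip the escaped character; beware of: "\" still in string"
          else if (tmp[position] == quote)
              quote = 0;         // closing quote ends the literal
          continue;
         }
      if (tmp[position] == '\"' || tmp[position] == '\'')
         {
          quote = tmp[position]; // opening quote of a string or character constant
          continue;
         }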

     if (tmp[position] == '/')
        {
         if (tmp[position +1] == '/') // C++ type comment
            {
             return string(tmp, 0, position);
            }

         if (tmp[position +1] == '*') // C type comment
            {
             in_c_comment = true;

             string str(tmp, 0, position);
             string str2(tmp, position +2, string::npos);
             return str + process(str2);
            }
        }
   }

 return tmp;
}

It is pretty simple code. It uses a static variable in_c_comment to handle comments that span more than one line. If it is not in a comment, it scans the string for comment openers, taking care to skip those inside strings or character constants. Once it has removed a comment, it returns the new string, sometimes recursing to process the part of the line it has not yet examined.
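For comparison, the same bookkeeping can be written as an explicit state machine that walks the whole program one character at a time instead of one line at a time. This is an alternative sketch, not the code above: the function name strip_comments and the state names are made up for illustration, and it assumes the entire input is already in one string. It does not handle raw string literals, digit separators, or backslash line continuations.

```cpp
#include <string>

// Alternative sketch: strips // and /* */ comments in a single pass.
std::string strip_comments(const std::string& src)
{
    enum State { code, line_comment, block_comment, literal };
    State state = code;
    char quote = 0;              // quote character that opened the current literal
    std::string out;

    for (std::string::size_type i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';

        switch (state) {
        case code:
            if (c == '/' && next == '/') { state = line_comment; ++i; }
            else if (c == '/' && next == '*') { state = block_comment; ++i; }
            else {
                if (c == '"' || c == '\'') { state = literal; quote = c; }
                out += c;
            }
            break;
        case line_comment:
            if (c == '\n') { state = code; out += c; }   // keep the line break
            break;
        case block_comment:
            if (c == '*' && next == '/') { state = code; ++i; }
            break;
        case literal:
            out += c;
            if (c == '\\' && i + 1 < src.size()) { out += src[i + 1]; ++i; }  // copy the escaped character
            else if (c == quote) state = code;                                // closing quote
            break;
        }
    }
    return out;
}
```

The state variable plays the role of in_c_comment and the string tracking together. Because the state lives in a local variable and not in a static one, the function can be called on several inputs without one call leaking into the next.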
