B.Eckel - Thinking in C++, Vol.2, 2nd edition.pdf

// Test StreamTokenizer
#include "StreamTokenizer.h"
#include "../require.h"
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  StreamTokenizer words(in);
  set<string> wordlist;
  string word;
  while((word = words.next()).size() != 0)
    wordlist.insert(word);
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
    ostream_iterator<string>(cout, "\n"));
} ///:~

Now the tool is more reusable than before, but it’s still inflexible, because it can only work with an istream. This isn’t as bad as it first seems, since a string can be turned into an istream via an istringstream. But in the next section we’ll come up with the most general, reusable tokenizing tool, and this should give you a feeling of what “reusable” really means, and the effort necessary to create truly reusable code.

A completely reusable tokenizer

Since the STL containers and algorithms all revolve around iterators, the most flexible solution will itself be an iterator. You could think of the TokenIterator as an iterator that wraps itself around any other iterator that can produce characters. Because it is designed as an input iterator (the most primitive type of iterator), it can be used with any STL algorithm that requires only input iterators. Not only is it a useful tool in itself, the TokenIterator is also a good example of how you can design your own iterators.[18]

The TokenIterator is doubly flexible. First, you can choose the type of iterator that will produce the char input. Second, instead of just saying which characters represent the delimiters, TokenIterator uses a predicate, which is a function object whose operator() takes a char and decides whether it should be in the token. Although the two examples given here have a static concept of what characters belong in a token, you could easily design your own function object to change its state as the characters are read, producing a more sophisticated parser.

[18] This is another example coached by Nathan Myers.

Chapter 15: Multiple Inheritance

The following header file contains the two basic predicates Isalpha and Delimiters, along with the template for TokenIterator:

//: C04:TokenIterator.h
#ifndef TOKENITERATOR_H
#define TOKENITERATOR_H
#include <string>
#include <iterator>
#include <algorithm>
#include <cctype>
#include <cstddef>

struct Isalpha {
  bool operator()(char c) {
    // Cast first: passing a negative char to isalpha() is undefined
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
  }
};

class Delimiters {
  std::string exclude;
public:
  Delimiters() {}
  Delimiters(const std::string& excl) : exclude(excl) {}
  bool operator()(char c) {
    return exclude.find(c) == std::string::npos;
  }
};

template <class InputIter, class Pred = Isalpha>
class TokenIterator : public std::iterator<
    std::input_iterator_tag, std::string, std::ptrdiff_t> {
  InputIter first;
  InputIter last;
  std::string word;
  Pred predicate;
public:
  TokenIterator(InputIter begin, InputIter end,
    Pred pred = Pred())
    : first(begin), last(end), predicate(pred) {
    ++*this;
  }
  TokenIterator() {} // End sentinel
  // Prefix increment:
  TokenIterator& operator++() {
    word.resize(0);
    first = std::find_if(first, last, predicate);
    while (first != last && predicate(*first))
      word += *first++;
    return *this;
  }
  // Postfix increment:
  class Proxy {
    std::string word;
  public:
    Proxy(const std::string& w) : word(w) {}
    std::string operator*() { return word; }
  };
  Proxy operator++(int) {
    Proxy d(word);
    ++*this;
    return d;
  }
  // Produce the actual value:
  std::string operator*() const { return word; }
  // Point at the stored word, not at a temporary copy:
  const std::string* operator->() const { return &word; }
  // Compare iterators:
  bool operator==(const TokenIterator&) const {
    return word.size() == 0 && first == last;
  }
  bool operator!=(const TokenIterator& rv) const {
    return !(*this == rv);
  }
};
#endif // TOKENITERATOR_H ///:~

TokenIterator is inherited from the std::iterator template. It might appear that there’s some kind of functionality that comes with std::iterator, but it is purely a way of tagging an iterator so that a container or algorithm that uses it knows what it’s capable of. Here, you can see input_iterator_tag as a template argument; this tells anyone who asks that a TokenIterator only has the capabilities of an input iterator, and cannot be used with algorithms requiring more sophisticated iterators. Apart from the tagging, std::iterator doesn’t do anything else, which means you must implement all the other functionality yourself.

TokenIterator may look a little strange at first, because the first constructor requires both a “begin” and “end” iterator as arguments, along with the predicate. Remember that this is a “wrapper” iterator that has no idea of how to tell whether it’s at the end of its input source, so the ending iterator is necessary in the first constructor. The reason for the second (default) constructor is that the STL algorithms (and any algorithms you write) need a TokenIterator sentinel to be the past-the-end value. Since all the information necessary to see if the TokenIterator has reached the end of its input is collected in the first constructor, this second constructor creates a TokenIterator that is merely used as a placeholder in algorithms.

The core of the behavior happens in operator++. This erases the current value of word using string::resize(), then finds the first character that satisfies the predicate (thus discovering the beginning of the new token) using find_if() (from the STL algorithms, discussed in the following chapter). The resulting iterator is assigned to first, thus moving first forward to the beginning of the token. Then, as long as the end of the input is not reached and the predicate is satisfied, characters are copied into the word from the input. Finally, the TokenIterator object is returned, and must be dereferenced to access the new token.

The postfix increment requires a proxy object to hold the value before the increment, so it can be returned (see the operator overloading chapter for more details of this). Producing the actual value is a straightforward operator*. The only other functions that must be defined for an input iterator are the operator== and operator!= to indicate whether the TokenIterator has reached the end of its input. You can see that the argument for operator== is ignored; it only cares about whether it has reached its internal last iterator. Notice that operator!= is defined in terms of operator==.

A good test of TokenIterator includes a number of different sources of input characters, including an istreambuf_iterator, a char*, and a deque<char>::iterator. Finally, the original Wordlist.cpp problem is solved:

//: C04:TokenIteratorTest.cpp
#include "TokenIterator.h"
#include "../require.h"
#include <cstring>
#include <fstream>
#include <iostream>
#include <vector>
#include <deque>
#include <set>
#include <string>
using namespace std;

int main() {
  ifstream in("TokenIteratorTest.cpp");
  assure(in, "TokenIteratorTest.cpp");
  ostream_iterator<string> out(cout, "\n");
  typedef istreambuf_iterator<char> IsbIt;
  IsbIt begin(in), isbEnd;
  Delimiters
    delimiters(" \t\n~;()\"<>:{}[]+-=&*#.,/\\");
  TokenIterator<IsbIt, Delimiters>
    wordIter(begin, isbEnd, delimiters), end;
  vector<string> wordlist;
  copy(wordIter, end, back_inserter(wordlist));
  // Output results:
  copy(wordlist.begin(), wordlist.end(), out);
  *out++ = "-----------------------------------";
  // Use a char array as the source
  // (const, because a string literal is read-only):
  const char* cp =
    "typedef std::istreambuf_iterator<char> It";
  TokenIterator<const char*, Delimiters>
    charIter(cp, cp + strlen(cp), delimiters), end2;
  vector<string> wordlist2;
  copy(charIter, end2, back_inserter(wordlist2));
  copy(wordlist2.begin(), wordlist2.end(), out);
  *out++ = "-----------------------------------";
  // Use a deque<char> as the source:
  ifstream in2("TokenIteratorTest.cpp");
  deque<char> dc;
  copy(IsbIt(in2), IsbIt(), back_inserter(dc));
  TokenIterator<deque<char>::iterator, Delimiters>
    dcIter(dc.begin(), dc.end(), delimiters), end3;
  vector<string> wordlist3;
  copy(dcIter, end3, back_inserter(wordlist3));
  copy(wordlist3.begin(), wordlist3.end(), out);
  *out++ = "-----------------------------------";
  // Reproduce the Wordlist.cpp example:
  ifstream in3("TokenIteratorTest.cpp");
  TokenIterator<IsbIt, Delimiters>
    wordIter2(IsbIt(in3), isbEnd, delimiters);
  set<string> wordlist4;
  while(wordIter2 != end)
    wordlist4.insert(*wordIter2++);
  copy(wordlist4.begin(), wordlist4.end(), out);
} ///:~
