- •Thinking in C++ 2nd edition Volume 2: Standard Libraries & Advanced Topics
- •Preface
- •What’s new in the second edition
- •What’s in Volume 2 of this book
- •How to get Volume 2
- •Prerequisites
- •Learning C++
- •Goals
- •Chapters
- •Exercises
- •Exercise solutions
- •Source code
- •Language standards
- •Language support
- •The book’s CD ROM
- •Seminars, CD Roms & consulting
- •Errors
- •Acknowledgements
- •Library overview
- •1: Strings
- •What’s in a string
- •Creating and initializing C++ strings
- •Initialization limitations
- •Operating on strings
- •Appending, inserting and concatenating strings
- •Replacing string characters
- •Concatenation using non-member overloaded operators
- •Searching in strings
- •Finding in reverse
- •Finding first/last of a set
- •Removing characters from strings
- •Stripping HTML tags
- •Comparing strings
- •Using iterators
- •Iterating in reverse
- •Strings and character traits
- •A string application
- •Summary
- •Exercises
- •2: Iostreams
- •Why iostreams?
- •True wrapping
- •Iostreams to the rescue
- •Sneak preview of operator overloading
- •Inserters and extractors
- •Manipulators
- •Common usage
- •Line-oriented input
- •Overloaded versions of get( )
- •Reading raw bytes
- •Error handling
- •File iostreams
- •Open modes
- •Iostream buffering
- •Seeking in iostreams
- •Creating read/write files
- •User-allocated storage
- •Output strstreams
- •Automatic storage allocation
- •Proving movement
- •A better way
- •Output stream formatting
- •Internal formatting data
- •Format fields
- •Width, fill and precision
- •An exhaustive example
- •Formatting manipulators
- •Manipulators with arguments
- •Creating manipulators
- •Effectors
- •Iostream examples
- •Code generation
- •Maintaining class library source
- •Detecting compiler errors
- •A simple datalogger
- •Generating test data
- •Verifying & viewing the data
- •Counting editor
- •Breaking up big files
- •Summary
- •Exercises
- •3: Templates in depth
- •Nontype template arguments
- •Typedefing a typename
- •Using typename instead of class
- •Function templates
- •A string conversion system
- •A memory allocation system
- •Type induction in function templates
- •Taking the address of a generated function template
- •Local classes in templates
- •Applying a function to an STL sequence
- •Template-templates
- •Member function templates
- •Why virtual member template functions are disallowed
- •Nested template classes
- •Template specializations
- •A practical example
- •Pointer specialization
- •Partial ordering of function templates
- •Design & efficiency
- •Preventing template bloat
- •Explicit instantiation
- •Explicit specification of template functions
- •Controlling template instantiation
- •Template programming idioms
- •Summary
- •Containers and iterators
- •STL reference documentation
- •The Standard Template Library
- •The basic concepts
- •Containers of strings
- •Inheriting from STL containers
- •A plethora of iterators
- •Iterators in reversible containers
- •Iterator categories
- •Input: read-only, one pass
- •Output: write-only, one pass
- •Forward: multiple read/write
- •Bidirectional: operator--
- •Random-access: like a pointer
- •Is this really important?
- •Predefined iterators
- •IO stream iterators
- •Manipulating raw storage
- •Basic sequences: vector, list & deque
- •Basic sequence operations
- •vector
- •Cost of overflowing allocated storage
- •Inserting and erasing elements
- •deque
- •Converting between sequences
- •Cost of overflowing allocated storage
- •Checked random-access
- •list
- •Special list operations
- •list vs. set
- •Swapping all basic sequences
- •Robustness of lists
- •Performance comparison
- •A completely reusable tokenizer
- •stack
- •queue
- •Priority queues
- •Holding bits
- •bitset<n>
- •vector<bool>
- •Associative containers
- •Generators and fillers for associative containers
- •The magic of maps
- •A command-line argument tool
- •Multimaps and duplicate keys
- •Multisets
- •Combining STL containers
- •Creating your own containers
- •Summary
- •Exercises
- •5: STL Algorithms
- •Function objects
- •Classification of function objects
- •Automatic creation of function objects
- •Binders
- •Function pointer adapters
- •SGI extensions
- •A catalog of STL algorithms
- •Support tools for example creation
- •Filling & generating
- •Example
- •Counting
- •Example
- •Manipulating sequences
- •Example
- •Searching & replacing
- •Example
- •Comparing ranges
- •Example
- •Removing elements
- •Example
- •Sorting and operations on sorted ranges
- •Sorting
- •Example
- •Locating elements in sorted ranges
- •Example
- •Merging sorted ranges
- •Example
- •Set operations on sorted ranges
- •Example
- •Heap operations
- •Applying an operation to each element in a range
- •Examples
- •Numeric algorithms
- •Example
- •General utilities
- •Creating your own STL-style algorithms
- •Summary
- •Exercises
- •Perspective
- •Duplicate subobjects
- •Ambiguous upcasting
- •virtual base classes
- •The "most derived" class and virtual base initialization
- •"Tying off" virtual bases with a default constructor
- •Overhead
- •Upcasting
- •Persistence
- •MI-based persistence
- •Improved persistence
- •Avoiding MI
- •Mixin types
- •Repairing an interface
- •Summary
- •Exercises
- •7: Exception handling
- •Error handling in C
- •Throwing an exception
- •Catching an exception
- •The try block
- •Exception handlers
- •Termination vs. resumption
- •The exception specification
- •Better exception specifications?
- •Catching any exception
- •Rethrowing an exception
- •Uncaught exceptions
- •Function-level try blocks
- •Cleaning up
- •Constructors
- •Making everything an object
- •Exception matching
- •Standard exceptions
- •Programming with exceptions
- •When to avoid exceptions
- •Not for asynchronous events
- •Not for ordinary error conditions
- •Not for flow-of-control
- •You’re not forced to use exceptions
- •New exceptions, old code
- •Typical uses of exceptions
- •Always use exception specifications
- •Start with standard exceptions
- •Nest your own exceptions
- •Use exception hierarchies
- •Multiple inheritance
- •Catch by reference, not by value
- •Throw exceptions in constructors
- •Don’t cause exceptions in destructors
- •Avoid naked pointers
- •Overhead
- •Summary
- •Exercises
- •8: Run-time type identification
- •The “Shape” example
- •What is RTTI?
- •Two syntaxes for RTTI
- •Syntax specifics
- •Producing the proper type name
- •Nonpolymorphic types
- •Casting to intermediate levels
- •void pointers
- •Using RTTI with templates
- •References
- •Exceptions
- •Multiple inheritance
- •Sensible uses for RTTI
- •Revisiting the trash recycler
- •Mechanism & overhead of RTTI
- •Creating your own RTTI
- •Explicit cast syntax
- •Summary
- •Exercises
- •9: Building stable systems
- •Shared objects & reference counting
- •Reference-counted class hierarchies
- •Finding memory leaks
- •An extended canonical form
- •Exercises
- •10: Design patterns
- •The pattern concept
- •The singleton
- •Variations on singleton
- •Classifying patterns
- •Features, idioms, patterns
- •Basic complexity hiding
- •Factories: encapsulating object creation
- •Polymorphic factories
- •Abstract factories
- •Virtual constructors
- •Destructor operation
- •Callbacks
- •Observer
- •The “interface” idiom
- •The “inner class” idiom
- •The observer example
- •Multiple dispatching
- •Visitor, a type of multiple dispatching
- •Efficiency
- •Flyweight
- •The composite
- •Evolving a design: the trash recycler
- •Improving the design
- •“Make more objects”
- •A pattern for prototyping creation
- •Trash subclasses
- •Parsing Trash from an external file
- •Recycling with prototyping
- •Abstracting usage
- •Applying double dispatching
- •Implementing the double dispatch
- •Applying the visitor pattern
- •More coupling?
- •RTTI considered harmful?
- •Summary
- •Exercises
- •11: Tools & topics
- •The code extractor
- •Debugging
- •Trace macros
- •Trace file
- •Abstract base class for debugging
- •Tracking new/delete & malloc/free
- •CGI programming in C++
- •Encoding data for CGI
- •The CGI parser
- •Testing the CGI parser
- •Using POST
- •Handling mailing lists
- •Maintaining your list
- •Mailing to your list
- •A general information-extraction CGI program
- •Parsing the data files
- •Summary
- •Exercises
- •General C++
- •My own list of books
- •Depth & dark corners
- •Design Patterns
- •Index
o << nl << "[{[" << name << "]}]" << nl
<<"[([" << nl << value << nl << "])]"
<<nl;
//Delimiters were added to aid parsing of
//the resulting text file.
}
} ///:~
The program is designed to be as generic as possible, but if you want to change something it is most likely the way that the data is stored in a file (for example, you may want to store it in a comma-separated ASCII format so that you can easily read it into a spreadsheet). You can make changes to the storage format by modifying store( ), and to the way the data is displayed by modifying show( ).
main( ) begins using the same three lines you’ll start with for any POST program. The rest of the program is similar to mlm.cpp because it looks at the “test-field” and “email-address” (checking it for correctness). The file name combines the user’s email address and the current date and time in hex – notice that sprintf( ) is used because it has a convenient way to convert a value to a hex representation. The entire file and path information is stored in the file, along with all the data from the form, which is tagged as it is stored so that it’s easy to parse (you’ll see a program to parse the files a bit later). All the information is also sent back to the user as a simply-formatted HTML page, along with the reminder, if there is one. If “mail-copy” exists and is not “no,” then the names in the “mail-copy” value are parsed and an email is sent to each one containing the tagged data. Finally, if there is a “confirmation” field, the value selects the type of confirmation (there’s only one type implemented here, but you can easily add others) and the command is built that passes the generated data file to the program (called ProcessApplication.exe). That program will be created in the next section.
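The file-naming step described above can be sketched in isolation. This is only an illustration under stated assumptions: the helper name makeDataFileName is hypothetical (it does not appear in ExtractInfo.cpp), and it uses snprintf( ) rather than the sprintf( ) mentioned in the text so that the buffer size is checked.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper, not taken from ExtractInfo.cpp: builds a name
// such as "user@example.com35B589A0.txt" from an email address and a
// timestamp rendered in hexadecimal.
std::string makeDataFileName(const std::string& email, std::time_t when) {
  // Two hex digits per byte of unsigned long, plus the terminating zero.
  char hex[2 * sizeof(unsigned long) + 1];
  // %lX formats an unsigned long as uppercase hexadecimal.
  std::snprintf(hex, sizeof hex, "%lX", static_cast<unsigned long>(when));
  return email + hex + ".txt";
}
```

In a CGI program you would typically pass std::time(0) as the second argument, so that two submissions from the same address at different times produce different file names.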
Parsing the data files
You now have a lot of data files accumulating on your Web site, as people sign up for whatever you’re offering. Here’s what one of them might look like:
//:! C07:TestData.txt
///{/home/eckel/super-cplusplus-workshop-registration/Bruce@EckelObjects.com35B589A0.txt
From[Bruce@EckelObjects.com]
[{[subject-field]}]
[([
super-cplusplus-workshop-registration
])]
[{[Date-of-event]}]
[([
Appendix B: Programming Guidelines
566
Sept 2-4
])]
[{[name]}]
[([
Bruce Eckel
])]
[{[street]}]
[([
20 Sunnyside Ave, Suite A129
])]
[{[city]}]
[([
Mill Valley
])]
[{[state]}]
[([
CA
])]
[{[country]}]
[([
USA
])]
[{[zip]}]
[([
94941
])]
[{[busphone]}]
[([
415-555-1212
])]
///:~
This is a brief example, but there are as many fields as you have on your HTML form. Now, if your event is compelling, you’ll have a whole lot of these files, and you’ll want to extract the information from them automatically and put it in whatever format you like.
For example, the ProcessApplication.exe program mentioned above will use the data in an email confirmation message. You’ll also probably want to put the data in a form that can be easily brought into a spreadsheet. So it makes sense to start by creating a general-purpose tool that will automatically parse any file that is created by ExtractInfo.cpp:
//: C10:FormData.h
#include <string>
#include <iostream>
#include <fstream>
#include <utility> // pair
#include <vector>
using namespace std;

class DataPair : public pair<string, string> {
public:
  DataPair() {}
  DataPair(istream& in) { get(in); }
  DataPair& get(istream& in);
  // True if a key was read; false for an empty pair:
  operator bool() const {
    return first.length() != 0;
  }
};

class FormData : public vector<DataPair> {
public:
  string filePath, email;
  // Parse the data from a file:
  FormData(char* fileName);
  void dump(ostream& os = cout);
  string operator[](const string& key);
}; ///:~
The DataPair class looks a bit like the CGIpair class, but it’s simpler. When you create a DataPair from an istream, the constructor calls get( ) to extract the next pair from the input stream. The operator bool returns false for an empty DataPair (one whose key has zero length), which usually signals the end of an input stream.
FormData contains the path where the original file was placed (this path information is stored within the file), the email address of the user, and a vector<DataPair> to hold the information. The operator[ ] allows you to perform a map-like lookup, just as in CGImap.
Here are the definitions:
//: C10:FormData.cpp {O}
#include "FormData.h"
#include "../require.h"
#include <cstring> // strlen()

DataPair& DataPair::get(istream& in) {
  first.erase(); second.erase();
  string ln;
  getline(in,ln);
  while(ln.find("[{[") == string::npos)
    if(!getline(in, ln)) return *this; // End
  first = ln.substr(3, ln.find("]}]") - 3);
  getline(in, ln); // Throw away [([
  while(getline(in, ln))
    if(ln.find("])]") == string::npos)
      second += ln + string(" ");
    else
      return *this;
  return *this; // Input ended before "])]"
}

FormData::FormData(char* fileName) {
  ifstream in(fileName);
  assure(in, fileName);
  require(!getline(in, filePath).fail());
  // Should be start of first line:
  require(filePath.find("///{") == 0);
  filePath = filePath.substr(strlen("///{"));
  require(!getline(in, email).fail());
  // Should be start of 2nd line:
  require(email.find("From[") == 0);
  int begin = strlen("From[");
  int end = email.find("]");
  int length = end - begin;
  email = email.substr(begin, length);
  // Get the rest of the data:
  DataPair dp(in);
  while(dp) {
    push_back(dp);
    dp.get(in);
  }
}

string FormData::operator[](const string& key) {
  iterator i = begin();
  while(i != end()) {
    if((*i).first == key)
      return (*i).second;
    i++;
  }
  return string(); // Empty string == not found
}

void FormData::dump(ostream& os) {
  os << "filePath = " << filePath << endl;
  os << "email = " << email << endl;
  for(iterator i = begin(); i != end(); i++)
    os << (*i).first << " = " << (*i).second << endl;
} ///:~
The DataPair::get( ) function assumes you are using the same DataPair over and over (which is the case, in FormData::FormData( )) so it first calls erase( ) for its first and second strings. Then it begins parsing the lines for the key (which is on a single line and is denoted by the “[{[” and “]}]”) and the value (which may be on multiple lines and is denoted by a begin-marker of “[([” and an end-marker of “])]”) which it places in the first and second members, respectively.
The FormData constructor is given a file name to open and read. The FormData object always expects there to be a file path and an email address, so it reads those itself before getting the rest of the data as DataPairs.
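To experiment with the tagged format without a data file on disk, the same key/value scan can be written as a free function over any istream. This is a minimal sketch, not the book’s DataPair: the names Tagged and readTagged are hypothetical, and like DataPair::get( ) it appends a space after each value line, so a one-line value comes back with a trailing space.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> Tagged; // hypothetical name

// Reads the next "[{[key]}]" / "[([" ... "])]" record from in.
// Returns false when no further non-empty key can be found.
bool readTagged(std::istream& in, Tagged& out) {
  out.first.clear();
  out.second.clear();
  std::string ln;
  // Skip forward to the next line that holds a complete key tag.
  while (std::getline(in, ln)) {
    std::string::size_type open = ln.find("[{[");
    std::string::size_type close = ln.find("]}]");
    if (open != std::string::npos && close != std::string::npos &&
        close > open) {
      out.first = ln.substr(open + 3, close - (open + 3));
      break;
    }
  }
  if (out.first.empty()) return false;
  std::getline(in, ln); // Discard the "[([" begin-marker line.
  // Gather value lines until the "])]" end-marker (or end of input).
  while (std::getline(in, ln) && ln.find("])]") == std::string::npos)
    out.second += ln + " ";
  return true;
}
```

Feeding it a std::istringstream that holds a few records lets you check the parsing logic in a unit test before pointing FormData at real files.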
With these tools in hand, extracting the data becomes quite easy:
//: C10:FormDump.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FormData fd(argv[1]);
  fd.dump();
} ///:~
The only reason that ProcessApplication.cpp is busier is that it is building the email reply. Other than that, it just relies on FormData:
//: C10:ProcessApplication.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"
#include <cstdio>  // tmpnam(), remove()
#include <cstdlib> // system()
using namespace std;

const string from("Bruce@EckelObjects.com");
const string replyto("Bruce@EckelObjects.com");
const string basepath("/home/eckel");

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  FormData fd(argv[1]);
  char tfname[L_tmpnam];
  tmpnam(tfname); // Create a temporary file name
  string tempfile(basepath + tfname + fd.email);
  ofstream reply(tempfile.c_str());
  assure(reply, tempfile.c_str());
  reply << "This message is to verify that you "
    "have been added to the list for the "
    << fd["subject-field"] << ". Your signup "
    "form included the following data; please "
    "ensure it is correct. You will receive "
    "further updates via email. Thanks for your "
    "interest in the class!" << endl;
  FormData::iterator i;
  for(i = fd.begin(); i != fd.end(); i++)
    reply << (*i).first << " = "
      << (*i).second << endl;
  reply.close();
  // "fastmail" only available on Linux/Unix:
  string command("fastmail -F " + from +
    " -r " + replyto + " -s \"" +
    fd["subject-field"] + "\" " +
    tempfile + " " + fd.email);
  system(command.c_str()); // Wait to finish
  remove(tempfile.c_str()); // Erase the file
} ///:~
This program first creates a temporary file to build the email message in. Although it uses the Standard C library function tmpnam( ) to create a temporary file name, this program takes the paranoid step of assuming that, since there can be many instances of this program running at once, it’s possible that a temporary name in one instance of the program could collide with the temporary name in another instance. So to be extra careful, the email address is appended onto the end of the temporary file name.
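On Linux/Unix there is another way to avoid the name collision described above: the POSIX function mkstemp( ) picks a unique name and creates the file in a single step. The sketch below is an alternative technique, not what ProcessApplication.cpp does; the helper name makeTempFile and the "reply" prefix are assumptions, and mkstemp( ) is POSIX rather than Standard C++, so it will not be available on every platform.

```cpp
#include <cstdio>     // std::remove (used by callers to delete the file)
#include <stdlib.h>   // mkstemp (POSIX, not Standard C++)
#include <string>
#include <unistd.h>   // close (POSIX)
#include <vector>

// Hypothetical helper: creates an empty, uniquely named file in dir and
// returns its name, or an empty string if the file cannot be created.
std::string makeTempFile(const std::string& dir) {
  std::string pattern = dir + "/replyXXXXXX";
  // mkstemp() rewrites the template in place, so it needs a writable,
  // zero-terminated buffer.
  std::vector<char> buf(pattern.begin(), pattern.end());
  buf.push_back('\0');
  int fd = mkstemp(&buf[0]); // Replaces XXXXXX and creates the file
  if (fd == -1) return std::string();
  close(fd); // The caller reopens the file by name, e.g. with ofstream
  return std::string(&buf[0]);
}
```

Because the file already exists when the function returns, a second instance of the program calling makeTempFile( ) at the same moment receives a different name.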
The message is built, the DataPairs are added to the end of the message, and once again the Linux/Unix fastmail command is built to send the information. An interesting note: if, in Linux/Unix, you add an ampersand (&) to the end of the command before giving it to system( ), then this command will be spawned as a background process and system( ) will immediately return (the same effect can be achieved in Win32 with start). Here, no ampersand is used, so system( ) does not return until the command is finished – which is a good thing, since the next operation is to delete the temporary file which is used in the command.
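Since the command line is assembled from form fields, it can help to see the assembly as a separate function. The following is a sketch under stated assumptions: shellQuote and mailCommand are hypothetical helpers, the single-quote wrapping is a common POSIX shell convention that the original program does not use (it wraps only the subject in double quotes), and only the fastmail flags shown in the listing above (-F, -r, -s) appear here.

```cpp
#include <string>

// Wraps s in single quotes for a POSIX shell; an embedded single
// quote is written as '\'' (close quote, escaped quote, reopen quote).
std::string shellQuote(const std::string& s) {
  std::string out("'");
  for (std::string::size_type i = 0; i < s.size(); ++i) {
    if (s[i] == '\'') out += "'\\''";
    else out += s[i];
  }
  out += "'";
  return out;
}

// Hypothetical helper: builds the fastmail command line. When
// background is true, a trailing ampersand asks the shell to run the
// command as a background process, so system() returns immediately.
std::string mailCommand(const std::string& from,
                        const std::string& replyto,
                        const std::string& subject,
                        const std::string& file,
                        const std::string& to,
                        bool background) {
  std::string cmd = "fastmail -F " + shellQuote(from) +
    " -r " + shellQuote(replyto) +
    " -s " + shellQuote(subject) +
    " " + shellQuote(file) +
    " " + shellQuote(to);
  if (background) cmd += " &";
  return cmd;
}
```

If the program deletes the message file right after calling system( ), as ProcessApplication.cpp does, pass false for background so the file still exists while the mail command reads it.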
The final operation in this project is to extract the data into an easily-usable form. A spreadsheet is a useful way to handle this kind of information, so this program will put the data into a form that’s easily readable by a spreadsheet program:
//: C10:DataToSpreadsheet.cpp
//{L} FormData
#include "FormData.h"
#include "../require.h"
#include <string>
using namespace std;

string delimiter("\t");

int main(int argc, char* argv[]) {
  for(int i = 1; i < argc; i++) {
    FormData fd(argv[i]);
    cout << fd.email << delimiter;
    // Named "it" so it does not redeclare the loop counter i:
    FormData::iterator it;
    for(it = fd.begin(); it != fd.end(); it++)
      if((*it).first != "workshop-suggestions")
        cout << (*it).second << delimiter;
    cout << endl;
  }
} ///:~
Common data interchange formats use various delimiters to separate fields of information. Here, a tab is used but you can easily change it to something else. Also note that I have checked for the “workshop-suggestions” field and specifically excluded that, because it tends to be too long for the information I want in a spreadsheet. You can make another version of this program that only extracts the “workshop-suggestions” field.
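If you switch the delimiter to a comma, values such as a street address that themselves contain commas need quoting, or the spreadsheet will split them into extra columns. Here is a minimal sketch of one common convention (wrap the field in double quotes and double any embedded quotes); the helper names quoteField and joinRow are hypothetical, and unlike the listing above this version places the delimiter between fields rather than after each one.

```cpp
#include <string>
#include <vector>

// Quotes a field CSV-style if it contains the delimiter, a double
// quote, or a line break; embedded double quotes are doubled.
std::string quoteField(const std::string& field, char delim) {
  if (field.find(delim) == std::string::npos &&
      field.find_first_of("\"\r\n") == std::string::npos)
    return field; // Nothing special: emit as-is
  std::string out("\"");
  for (std::string::size_type i = 0; i < field.size(); ++i) {
    if (field[i] == '"') out += "\"\"";
    else out += field[i];
  }
  out += '"';
  return out;
}

// Joins the fields into one row, with delim between adjacent fields.
std::string joinRow(const std::vector<std::string>& fields, char delim) {
  std::string row;
  for (std::vector<std::string>::size_type i = 0; i < fields.size(); ++i) {
    if (i != 0) row += delim;
    row += quoteField(fields[i], delim);
  }
  return row;
}
```

With a tab delimiter, typical form data passes through unchanged, which is one reason a tab is a convenient default.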
This program assumes that all the file names are expanded on the command line. Using it under Linux/Unix is easy since file-name global expansion (“globbing”) is handled for you. So you say:
DataToSpreadsheet *.txt >> spread.out
In Win32 (at a DOS prompt) it’s a bit more involved, since you must do the “globbing” yourself:
For %f in (*.txt) do DataToSpreadsheet %f >> spread.out
This technique is generally useful for writing Win32/DOS command lines.