- The program takes no input from the user. Input comes from a file named
words.dat, which contains up to 500 words of text.
A sample words.dat is given below. The data file used must be
named words.dat.
- Your program should use the header file get_word.h, which you can copy from my
directory by typing
cp ~loftin/cs101/get_word.h .
at the pegasus prompt. This will put a copy of
get_word.h into your current directory. This header file
must be in the same directory as your program file. Your program file
should contain the header line
#include "get_word.h"
- The file get_word.h uses the global variable
WORD_LENGTH, which should be defined in your program file.
So you need to define this global constant by using the line
const int WORD_LENGTH=50;
in your program file.
- You should also have a constant global variable
NUM_WORDS, equal to 500, which is the maximum number of words
your program can handle. All relevant loops should use
NUM_WORDS instead of 500.
- The output of your program should consist of the list of words
contained in words.dat in
alphabetical order, with each word preceded by the number of times it
appears in words.dat. Each word should be output only once,
and each word should be in lowercase letters.
How to do it (using C strings)
- The basic loop to read in words from a text file is contained in
words.cpp. In your
program, you should not use the standard input line
infile >> w[n];
Instead, you should use the function get_word, which is
defined in the header file get_word.h. You should use the
line
get_word (w[n], infile);
instead. get_word behaves very similarly to the standard
input, except that the standard input uses only white space to mark the
boundary between words, while get_word uses both white space
and punctuation marks. For example, where the
standard input would read in earth. as a word, get_word
reads in earth (without the period).
- The declaration for get_word is this:
void get_word (char[], ifstream&);
get_word skips over any non-letter character from the input
file stream, and then reads in a word into the char array.
After a call to get_word, infile.eof() behaves
in much the same way as it does after the standard input. The
input loop used in words.cpp
correctly deals with this issue.
- If the first letter of a word is uppercase, your program should convert
that letter to lowercase before storing the word. For example, the word
Four should be stored as four. The process is
described in string.cpp. It is probably
easiest to do this inside the loop in which the words are being read in.
- To sort the words alphabetically, you should use either an
exchange or a bubble sort. To compare two C strings alphabetically, you'll
need the function strcmp, which is found in the header file cstring.
If a and b
are two C strings of type char[], strcmp(a,b) is
>0 if a is after b
alphabetically, equal to 0 if a and
b are identical C strings, and <0
if a is before b alphabetically.
Also, to copy one C string to another, as required in the exchange sort, use
the function strcpy found in cstring, as we
discussed in class. (This replaces the assignment operator =.)
- The program sort_count.cpp is a model for printing out the sorted
list, with each word preceded by the number of times it
appears.
Note that to test two C strings for equality, you'll need to use
strcmp.
How to do it (using the C++ string class)
- You may use the C++ string class for your program instead of C
strings, if you'd like.
- In the header file get_word.h, there is a version of the
function get_word which deals with the C++ string class
instead of C strings. The declaration for this function is
void get_word (string &, ifstream &);
Its behavior is otherwise the
same as the version of get_word described above. In
particular, you still need to declare a global variable
const int WORD_LENGTH=50;
in order for get_word to
work properly.
- Remember that for objects of the C++ string class, you can use
the operators > and < to compare two strings
alphabetically. Also, to test if two strings are equal, you can use
the operator ==. Thus no special function such as
strcmp is needed. Also, the assignment operator =
may be used, so strcpy is not needed.
- You still need to modify the first letter of each word, if it is
uppercase, as above.
- To implement the loop as in words.cpp, test whether a
string is empty by testing for
(w[n] == "")
instead of
(w[n][0] == '\0')
This is because a C++ string object keeps track of its own length,
so you should not rely on finding a terminating '\0' character at w[n][0].
- Otherwise, use much the same outline as above in the section on C
strings.
Sample run of the program
If the file words.dat is
I slit a sheet.
A sheet I slit.
Upon a slitted sheet I sit.
then the output should look like this:
3 a
3 i
3 sheet
1 sit
2 slit
1 slitted
1 upon
To run my solution, you can type
cp ~loftin/cs101/solution6 .
at the pegasus
prompt. This will put a copy of the executable file solution6
in your current directory. Then you can test the solution by writing
your own data file words.dat. Then type solution6
to run my solution. (You can also copy a sample
words.dat from my ~loftin/cs101/ directory as above.)
Due date: Monday, May 3, 2004