KAL
Processing an SGML-marked-up document presumes that a document parser exists that understands the markup tags and their relationships to each other (i.e., the logical structure of the document).
In this lab, we will use a pre-created DTD to (1) create a document parser, and (2) process a pre-written document using the generated document parser to validate that our document is marked up correctly. The document parser will also insert any missing tags.
NOTE: These directions were written to be executed at WPI. Mail
for advice on how to perform these steps outside of the WPI CS department.
Step 1: Let's look at the DTD. Can you see what the tags are and what the document's logical structure is? Ask if you can't!
Step 2: Let's use SGML to create a Document Parser that will recognize a document of this type:
(a) First, make sure you're on the host owl or sequoia. You can find out which machine you are on by typing hostname.
(b) Next, copy the document type definition (DTD) and the pre-written document to be parsed to your current directory.
cp /usr/local/lib/asp/dtd/article.dtd .
cp /usr/local/lib/asp/dtd/article.doc .
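Steps (a) and (b) can be sketched as a pair of small shell functions. This is only an illustration: the names on_lab_host and fetch_lab_files are hypothetical helpers, not part of the lab software, and the sketch assumes that hostname prints either the bare machine name or the machine name followed by a domain.

```shell
#!/bin/sh
# Hypothetical helpers; only the host names and file names come from the directions.

# Succeeds if the host name in $1 is owl or sequoia,
# optionally followed by a domain (an assumption about hostname's output).
on_lab_host() {
    case "$1" in
        owl|owl.*|sequoia|sequoia.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Copies article.dtd and article.doc from directory $1 into directory $2.
fetch_lab_files() {
    src=$1
    dest=$2
    cp "$src/article.dtd" "$src/article.doc" "$dest"
}
```

On a lab machine these might be combined as: on_lab_host "$(hostname)" && fetch_lab_files /usr/local/lib/asp/dtd .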
(c) Now, invoke the parser generator, aspgen. If you're running low on disk quota (you need about 300 KB for the parser executable), you may invoke aspgen with the -min flag. This instructs aspgen to leave the generated parser in "temp space" rather than in your current directory.
aspgen article.dtd
- or -
aspgen -min article.dtd
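If you are unsure which form to use, the choice can be sketched in shell. This is a rough aid under stated assumptions: choose_aspgen_command and free_kb_in are hypothetical helpers, the 300 KB threshold is the approximate figure quoted above, and df reports free space on the file system, which may be more than what remains of your personal quota.

```shell
#!/bin/sh
# Hypothetical helpers, not part of aspgen.

# Prints the aspgen command from step (c) that fits the free space in $1 (kilobytes).
# The 300 KB threshold is the approximate parser size quoted in the directions.
choose_aspgen_command() {
    free_kb=$1
    if [ "$free_kb" -ge 300 ]; then
        echo "aspgen article.dtd"
    else
        echo "aspgen -min article.dtd"
    fi
}

# Prints the free space, in kilobytes, on the file system holding directory $1.
# Note: this is file-system space, not your remaining disk quota.
free_kb_in() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example, choose_aspgen_command "$(free_kb_in .)" prints the command to type for the current directory.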
Step 3: aspgen has now created the document parser. Execute the parser on the marked-up document, article.doc. (You will have to cd /tmp if you invoked aspgen with the -min flag. Look at the output from aspgen.)
Type:
article.asp article.doc > article.out
This sends the results of parsing to the file article.out.
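To see what the parser changed, for instance which missing tags it inserted, one option is to compare the input with the output. The sketch below is not part of the lab software: parse_and_compare is a hypothetical helper, and it assumes the parser writes the processed document to standard output in a form that can be compared line by line with the input.

```shell
#!/bin/sh
# Hypothetical helper, not part of the lab software.
# Runs parser $1 on document $2, saves the result in file $3, and then
# lists the lines that differ between the input and the output.
# Assumption: the parser writes the processed document to standard output.
parse_and_compare() {
    parser=$1
    doc=$2
    out=$3
    "$parser" "$doc" > "$out" || return 1
    # diff exits with status 1 when the files differ; that is expected here.
    # Any higher status means diff itself failed.
    diff "$doc" "$out" || [ $? -eq 1 ]
}
```

In this lab it might be called as: parse_and_compare article.asp article.doc article.out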
Questions (Please link these to your class page)
Send questions and comments to: Karen Lemone