Electronic Documents, Summer 1997

KAL

Electronic Documents Lab #9

Processing SGML

Processing an SGML marked-up document presumes that a document parser exists that understands the markup tags and their relationships to each other (i.e., the logical structure of the document).

In this lab, we will use a pre-created DTD to (1) create a document parser, and (2) process a pre-written document using the generated document parser to validate that our document is marked up correctly. The document parser will also insert any missing tags.
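To get a feel for what "insert any missing tags" means, here is a minimal Python sketch. It is not how aspgen or the generated parser works internally. The ALLOWED_CHILDREN table is a made-up content model (not the lab's article.dtd), and normalize is a hypothetical helper. It only checks which element may contain which; a real DTD also constrains order and number of occurrences.

```python
import re

# Hypothetical content model, NOT the lab's article.dtd:
# each element maps to the set of elements it may directly contain.
ALLOWED_CHILDREN = {
    "article": {"title", "para"},
    "title": set(),
    "para": set(),
}

# Matches a start tag, an end tag, or a run of character data.
TOKEN = re.compile(r"<(/?)(\w+)>|([^<]+)")


def normalize(text, root="article"):
    """Return text with omitted end tags made explicit.

    Raises ValueError when the markup breaks the toy content model.
    """
    out = []    # pieces of the normalized document
    stack = []  # names of the elements that are currently open
    for match in TOKEN.finditer(text):
        closing, name, data = match.groups()
        if data is not None:
            if data.strip():
                if not stack:
                    raise ValueError("text outside the root element")
                out.append(data.strip())
            continue
        if name not in ALLOWED_CHILDREN:
            raise ValueError(f"unknown element <{name}>")
        if closing:
            if name not in stack:
                raise ValueError(f"</{name}> has no matching start tag")
            # Close any elements still open inside this one, then the element.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            out.append(f"</{stack.pop()}>")
        else:
            if not stack:
                # Only one root element is allowed, and it must come first.
                if name != root or out:
                    raise ValueError(f"<{name}> is not allowed here")
            else:
                # Close open elements until one of them may contain this tag.
                while stack and name not in ALLOWED_CHILDREN[stack[-1]]:
                    out.append(f"</{stack.pop()}>")
                if not stack:
                    raise ValueError(f"<{name}> is not allowed here")
            stack.append(name)
            out.append(f"<{name}>")
    # End of input: close whatever is still open.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(normalize("<article><title>Lab 9<para>First<para>Second"))
```

The sample input leaves out every end tag. The sketch supplies them because, in this toy model, a title or para cannot contain a para, so the open element must end before the next one starts.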


NOTE: These directions were written to be executed at WPI. Mail for advice on how to perform these steps outside of the WPI CS department.


Step 1: Let's look at the DTD. Can you see what the tags are and what the document's logical structure is? Ask if you can't!

Step 2: Let's use aspgen to create a document parser that will recognize a document of this type.

Type:

aspgen article.dtd

Step 3: aspgen has now created the document parser. Execute the parser on the marked-up document, article.doc. (You will have to cd /tmp if you invoked aspgen with the -min flag. Look at the output from aspgen.)

Type:

 
article.asp article.doc > article.out

This sends the results of parsing to the file article.out.
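One way to compare the input with the parser's output is a line-by-line diff. The Python sketch below uses the standard difflib module. The helper names diff_lines and diff_files are hypothetical, and the sample strings are invented for illustration; they are not real output from article.asp.

```python
import difflib


def diff_lines(original_text, parsed_text):
    """Return unified-diff lines describing how parsed_text differs from original_text."""
    return list(
        difflib.unified_diff(
            original_text.splitlines(),
            parsed_text.splitlines(),
            fromfile="article.doc",
            tofile="article.out",
            lineterm="",
        )
    )


def diff_files(original_path, parsed_path):
    """Read both files and return their unified diff as a list of lines."""
    with open(original_path) as original, open(parsed_path) as parsed:
        return diff_lines(original.read(), parsed.read())


# Invented sample text, standing in for the two files.
before = "<article>\n<title>Lab 9\n"
after = "<article>\n<title>Lab 9</title>\n</article>\n"
for line in diff_lines(before, after):
    print(line)
```

After running Step 3, calling diff_files("article.doc", "article.out") from the directory that holds both files should list the lines the parser changed or added.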

Questions (Please link these to your class page)

  1. What kind of a program is LLgen? (E.g., it's not an editor, a debugger, or an OS.)

  2. What does the 1st command aspgen article.dtd actually do?

  3. What does the 2nd command article.asp article.doc > article.out actually do?

  4. article.doc and article.out look similar. What is the difference; that is, what did article.asp actually do?

Amsterdam SGML Parser (ASP) Documentation


Send questions and comments to: Karen Lemone
