XML Structure: Elements, Attributes, and Syntax

<?xml version="1.0" encoding="UTF-8" standalone="no"?>    XML Declaration
White space characters (space, carriage return, line feed, tab, etc.)
<!-- … -->    Comments
<?xml-stylesheet type="text/css" href="your_documents_css.css"?>    External style sheet for browsing your XML document
<?word document="test.doc" ?>    Processing Instruction
<!DOCTYPE root_element [ <!ELEMENT root_element (#PCDATA)> ]>    Internal DOCTYPE Declaration, and/or
<!DOCTYPE root_element SYSTEM "your_documents_dtd.dtd">    External DOCTYPE Declaration
<root_element>    Open tag of "root_element"
  <subElement>    Open tag of "subElement"
    …text…    Data
    <subSubElement attr_name="attr_value">    Open tag of "subSubElement" with attribute "attr_name" equal to "attr_value"
      <![CDATA[…any characters (including markup)…]]>    CDATA Section
    </subSubElement>    Close tag of "subSubElement"
  </subElement>    Close tag of "subElement"
  <emptyElement/>    Tag of empty element
  <anotherSubElement xmlns="http://www.xml.su/shema1">    Open tag of "anotherSubElement" with namespace "http://www.xml.su/shema1" as default
    <element1>    Open tag of "element1" from namespace "http://www.xml.su/shema1"
    </element1>    Close tag of "element1"
    <element2 xmlns:sh="http://www.xml.su/shema2">    Open tag of "element2" with namespace "http://www.xml.su/shema2" for the sh prefix
      <sh:emptyElement/>    Empty tag of "sh:emptyElement" from namespace "http://www.xml.su/shema2"
    </element2>    Close tag of "element2"
  </anotherSubElement>    Close tag of "anotherSubElement"
  …    Any combination of correctly nested elements, empty elements, text data, CDATA, comments and white space
</root_element>    Close tag of "root_element"

Well-Formed Document (well formedness)
  • Has a single root element
  • All other elements are correctly nested:
    – all other elements are children of the root element
    – all elements are correctly paired
    – the element names in a start-tag and an end-tag are exactly the same
    – attribute names are used only once within the same element

Valid XML document
  • Abides by the constraints placed on each element's position in the document
  • Abides by the constraints placed on the attributes of each element
  • Requires a Document Type Definition or XML Schema to specify the constraints

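The well-formedness conditions can be checked with any conforming parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are illustrative and not part of the original page. Note that it only tests well-formedness, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per well-formedness condition.
samples = {
    "single root, correctly nested":
        "<root_element><subElement>text</subElement></root_element>",
    "two root elements": "<root_element/><root_element/>",
    "overlapping elements": "<a><b></a></b>",
    "start-tag and end-tag names differ": "<a><b></B></a>",
    "attribute name used twice": '<a attr_name="1" attr_name="2"/>',
}

for label, document in samples.items():
    print(f"{label}: {is_well_formed(document)}")
```

Only the first sample satisfies every condition; each of the others breaks exactly one of them.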
XML Declaration

<?xml
version="version_number"
encoding="encoding_declaration"
standalone="standalone_status"
?>

The XML declaration is written like a processing instruction and identifies the document as being XML. All XML documents should begin with an XML declaration.

Rules:
  • If the XML declaration is included, it must be situated at the first position of the first line in the XML document (well-formedness constraint).
  • If the XML declaration is included, it must contain the version number attribute (well-formedness constraint).
  • If all of the attributes are declared in an XML declaration, they must be placed in the order shown above (well-formedness constraint).
  • If any elements, attributes, or entities are used in the XML document that are referenced or defined in an external DTD, standalone="no" must be included (validity constraint).
  • The XML declaration must be in lower case, except for the encoding declarations (well-formedness constraint).
Note:
  • The XML declaration has no closing tag (well-formedness constraint).
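
The well-formedness rules for the declaration can be observed directly in a parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents, each exercising one rule for the declaration.
declarations = {
    "complete, attributes in order":
        '<?xml version="1.0" encoding="UTF-8" standalone="no"?><root_element/>',
    "not at the first position": ' <?xml version="1.0"?><root_element/>',
    "version missing": '<?xml encoding="UTF-8"?><root_element/>',
    "attributes out of order":
        '<?xml encoding="UTF-8" version="1.0"?><root_element/>',
    "upper-case keyword": '<?XML version="1.0"?><root_element/>',
}

for label, document in declarations.items():
    print(f"{label}: {parses(document)}")
```

The standalone rule is a validity constraint, so a non-validating parser such as this one does not enforce it.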

Comments

<!-- comment text -->

Comments are used to hide text from the end user when the output document is displayed. They are useful as notes to the author of the document, or to other authors who may modify the document.

Rules:
  • Comments may be placed anywhere after the XML declaration.
  • Comments may not be placed inside a tag, but may be placed inside a document type declaration.
  • Double dashes '--' are not allowed inside a comment (well-formedness constraint).
  • Markup may be used inside a comment.
  • Nested comments are not allowed.
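
The comment rules can be tried out as follows. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parses and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents, each exercising one comment rule.
comment_cases = {
    "comment in element content": "<a><!-- note --></a>",
    "comment before the XML declaration": '<!-- c --><?xml version="1.0"?><a/>',
    "comment inside a tag": "<a <!-- no --> />",
    "double dash inside a comment": "<a><!-- bad -- comment --></a>",
    "nested comment": "<a><!-- outer <!-- inner --> --></a>",
    "markup inside a comment": "<a><!-- <b>ignored</b> --></a>",
}

for label, document in comment_cases.items():
    print(f"{label}: {parses(document)}")

# Markup inside a comment is not interpreted, so <a> gets no child elements.
root = ET.fromstring(comment_cases["markup inside a comment"])
print(len(root))
```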
Processing Instruction

Processing instructions are used to embed information intended for proprietary applications. The XML declaration has the same form as a processing instruction. Processing instruction targets beginning with ‘xml’ or ‘XML’ have been reserved for standardization in the XML Version 1.0 specification and onwards.

<?PI-target ?>

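PI-target: any name other than the letters ‘X’ or ‘x’, ‘M’ or ‘m’, and ‘L’ or ‘l’ in that order (that is, ‘xml’ in any combination of upper and lower case). Names that begin with those letters are reserved.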

Rules:
  • The string ‘?>’ cannot be placed within a processing instruction; therefore, nested processing instructions are not allowed.
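
An application reads a processing instruction as a target plus a data string. The following is a minimal sketch, assuming Python's standard xml.dom.minidom module; the helper name reserved_target_rejected is illustrative, and the sample instruction is the one used in the structure overview.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Parse a document whose first node is a processing instruction.
doc = minidom.parseString('<?word document="test.doc" ?><root_element/>')
pi = doc.firstChild
is_pi = pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE
print(is_pi, pi.target, pi.data)


def reserved_target_rejected(target: str) -> bool:
    """Return True if the parser refuses `target` as a PI target."""
    try:
        minidom.parseString(f"<root_element><?{target} data?></root_element>")
        return False
    except ExpatError:
        return True


for target in ("word", "xml-stylesheet", "xml", "XML"):
    print(target, reserved_target_rejected(target))
```

A target that merely starts with the reserved letters, such as xml-stylesheet, is accepted by the parser; only the bare name in any letter case is refused.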

XML declaration attributes:

Attribute Name: version
Possible Attribute Value: 1.0
Attribute Description: Specifies the version of the XML standard that the XML document conforms to. The version attribute must be included if the XML declaration is declared.

Attribute Name: encoding
Possible Attribute Value: UTF-8, UTF-16, ISO-10646-UCS-2, ISO-10646-UCS-4, ISO-8859-1 to ISO-8859-9, ISO-2022-JP, Shift_JIS, EUC-JP
Attribute Description: These are the encoding names of the most common character sets in use today. For a full list of encodings check the IANA's website.

Attribute Name: standalone
Possible Attribute Value: yes, no
Attribute Description: Use ‘yes’ if the XML document has an internal DTD. Use ‘no’ if the XML document is linked to an external DTD, or any external entity references (validity constraint).

CDATA Section
CDATA sections are used to display markup without the XML processor trying to interpret that markup. They are particularly useful when you want to display sections of XML code.


<![CDATA[…any characters (including markup)…]]>

Rules:
  • The string ‘]]>’ cannot be placed within a CDATA section; therefore, nested CDATA sections are not allowed (well-formedness constraint).
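
The effect of a CDATA section is easiest to see from the parser's side: the enclosed markup arrives as plain text. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the element content is illustrative only.

```python
import xml.etree.ElementTree as ET

# The markup and the bare ampersand inside the CDATA section are not interpreted.
source = "<subElement><![CDATA[<b>not markup</b> & more]]></subElement>"
root = ET.fromstring(source)
print(root.text)   # the literal characters between <![CDATA[ and ]]>
print(len(root))   # number of child elements created from the section


def parses(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The first ']]>' closes the section, so the trailing ']]>' is left over
# in ordinary content, where it is not allowed.
nested_attempt = "<a><![CDATA[ outer <![CDATA[ inner ]]> ]]></a>"
print(parses(nested_attempt))
```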
DOCTYPE Declaration & DTDs

The document type (DOCTYPE) declaration either contains an internal Document Type Definition (DTD) or references an external one. It can also combine both internal and external DTDs. The DTD defines the constraints on the structure of an XML document. It declares all of the document’s element types, their child element types, and the order and number of each element type. It also declares any attributes, entities, and notations, and may contain processing instructions, comments, and parameter-entity (PE) references.

A document can use both internal and external DTD subsets. The internal DTD subset is specified between the square brackets of the DOCTYPE declaration. The declaration for the external DTD subset is placed before the square brackets immediately after the SYSTEM keyword.

The Internal DTD:

<!DOCTYPE root_element [
Document Type Definition (DTD):
elements/attributes/entities/notations/processing instructions/comments/PE references
]>
Rules:
  • The document type declaration must be placed between the XML declaration and the first element (root element) in the document (well-formedness constraint).
  • The keyword DOCTYPE must be followed by the name of the root element in the XML document (validity constraint).
  • The keyword DOCTYPE must be in upper case (well-formedness constraint).
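
A small internal DTD can be exercised with a non-validating parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree and xml.dom.minidom modules; the entity name greeting and the helper parses are hypothetical. The parser reads the internal subset and expands the entity, but it does not validate, so the root-name rule is checked by hand.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Internal DTD with one element declaration and one (hypothetical) entity.
document = """<?xml version="1.0"?>
<!DOCTYPE root_element [
  <!ELEMENT root_element (#PCDATA)>
  <!ENTITY greeting "hello">
]>
<root_element>&greeting;</root_element>"""

root = ET.fromstring(document)
print(root.text)  # the entity reference replaced by its declared value

# Validity rule: the name after DOCTYPE should match the root element.
dom = minidom.parseString(
    "<!DOCTYPE root_element [<!ELEMENT root_element (#PCDATA)>]><root_element/>"
)
names_match = dom.doctype.name == dom.documentElement.tagName
print(names_match)


def parses(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses("<root_element/><!DOCTYPE root_element []>"))  # DOCTYPE after root
print(parses("<!doctype root_element []><root_element/>"))  # lower-case keyword
```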

“Private” External DTDs:

<!DOCTYPE root_element SYSTEM "DTD_location">

“Public” External DTDs:

<!DOCTYPE root_element PUBLIC "DTD_name" "DTD_location">

External DTDs are useful for creating a common DTD that can be shared between multiple documents. Any change made to the external DTD automatically applies to all the documents that reference it. There are two types of external DTDs: private and public.

Private external DTDs are identified by the keyword SYSTEM, and are intended for use by a single author or group of authors.

Public external DTDs are identified by the keyword PUBLIC and are intended for broad use. The “DTD_location” is used to find the public DTD if it cannot be located by the “DTD_name”.

Rules:
  • If any elements, attributes, or entities are used in the XML document that are referenced or defined in an external DTD, standalone="no" must be included in the XML declaration (validity constraint).
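
The two identifiers of an external DTD can be read back from a parsed document. This is a minimal sketch, assuming Python's standard xml.dom.minidom module; the public identifier "-//Example Org//DTD Example//EN" and the location "example.dtd" are hypothetical. The parser records both values without fetching the DTD.

```python
from xml.dom import minidom

# "Public" external DTD: DTD_name followed by DTD_location (both hypothetical).
public_doc = minidom.parseString(
    '<?xml version="1.0" standalone="no"?>'
    '<!DOCTYPE root_element PUBLIC "-//Example Org//DTD Example//EN" "example.dtd">'
    "<root_element/>"
)
print(public_doc.doctype.publicId)  # DTD_name
print(public_doc.doctype.systemId)  # DTD_location

# "Private" external DTD: only a DTD_location after SYSTEM.
private_doc = minidom.parseString(
    '<?xml version="1.0" standalone="no"?>'
    '<!DOCTYPE root_element SYSTEM "example.dtd">'
    "<root_element/>"
)
print(private_doc.doctype.publicId)  # no public name is present
print(private_doc.doctype.systemId)
```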

DTD_location: relative or absolute URL
DTD_name follows the syntax:
"prefix//owner_of_the_DTD//description_of_the_DTD//ISO 639_language_identifier"

Prefix:    Definition:
ISO        The DTD is an ISO standard. All ISO standards are approved.
+          The DTD is an approved non-ISO standard.
-          The DTD is an unapproved non-ISO standard.
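
The four-field DTD_name syntax can be taken apart with a simple split. This is a minimal sketch in Python that follows only the syntax and prefix table given above; the names PublicId, parse_dtd_name and PREFIX_MEANING are hypothetical. The sample value is the public identifier of the XHTML 1.0 Strict DTD.

```python
from typing import NamedTuple


class PublicId(NamedTuple):
    prefix: str
    owner: str
    description: str
    language: str


# Meaning of each prefix, as listed in the table above.
PREFIX_MEANING = {
    "ISO": "ISO standard",
    "+": "approved non-ISO standard",
    "-": "unapproved non-ISO standard",
}


def parse_dtd_name(dtd_name: str) -> PublicId:
    """Split a DTD_name of the form prefix//owner//description//language."""
    parts = dtd_name.split("//")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields separated by '//', got {len(parts)}")
    return PublicId(*parts)


pid = parse_dtd_name("-//W3C//DTD XHTML 1.0 Strict//EN")
print(pid)
print(PREFIX_MEANING[pid.prefix])
```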