Document Type Definition (DTD) is a set of markup declarations that define a document type for SGML-family markup languages (SGML, XML, HTML). A DTD is a kind of XML schema.
DTDs use a terse formal syntax that declares precisely which elements and references may appear where in the document of the particular type, and what the elements’ contents and attributes are. DTDs also declare entities which may be used in the instance document.
XML uses a subset of SGML DTD.
As of 2009, newer XML Namespace-aware schema languages (such as W3C XML Schema and ISO RELAX NG) have largely superseded DTDs. A namespace-aware version of DTDs is being developed as Part 9 of ISO DSDL.
DTDs persist in applications which need special publishing characters such as the XML and HTML Character Entity References, which were derived from the larger sets defined as part of the ISO SGML standard effort.
Associating DTDs with documents
A Document Type Declaration associates a DTD with an XML document. Document Type Declarations appear in the syntactic fragment doctypedecl near the start of an XML document.[1] The declaration establishes that the document is an instance of the type defined by the referenced DTD.
The declarations of a DTD may be supplied in two places:
- an internal subset
- an external subset
The declarations in the internal subset form part of the Document Type Declaration in the document itself. The declarations in the external subset are located in a separate text file. The external subset may be referenced via a public identifier and/or a system identifier. Programs for reading documents may not be required to read the external subset.
Example
The following example of a Document Type Declaration contains both public and system identifiers:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
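All HTML 4.01 documents conform to one of three SGML DTDs. The public identifiers of these DTDs are constant and are as follows:
- -//W3C//DTD HTML 4.01//EN
- -//W3C//DTD HTML 4.01 Transitional//EN
- -//W3C//DTD HTML 4.01 Frameset//EN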
The system identifiers of these DTDs, if present in the Document Type Declaration, will be URI references. System identifiers can vary, but usually point to a specific set of declarations in a resolvable location. SGML allows for public identifiers to be mapped to system identifiers in catalogs that are optionally made available to the URI resolvers used by document parsing software.
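As a rough illustration of how software sees these identifiers, the following Python sketch reads the root element name, public identifier and system identifier from the XHTML declaration shown earlier. It uses only the standard library's xml.dom.minidom; the one-element document body and the helper name doctype_identifiers are assumptions made for the example. The parser used here does not fetch the external subset.

```python
from xml.dom import minidom

# The XHTML 1.0 Transitional declaration from the example, followed by a
# minimal (hypothetical) root element so that the document is well-formed.
DOCUMENT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    "<html/>"
)


def doctype_identifiers(xml_text):
    """Return (root element name, public identifier, system identifier)."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.name, doctype.publicId, doctype.systemId


name, public_id, system_id = doctype_identifiers(DOCUMENT)
print(name)
print(public_id)
print(system_id)
```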
Markup Declarations
DTDs describe the structure of a class of documents via element and attribute-list declarations. Element declarations name the allowable set of elements within the document, and specify whether and how declared elements and runs of character data may be contained within each element. Attribute-list declarations name the allowable set of attributes for each declared element, including the type of each attribute value, if not an explicit set of valid value(s).
DTD markup declarations declare which element types, attribute lists, entities and notations are allowed in the structure of the corresponding class of XML documents.[2]
Element Type Declarations
An Element Type Declaration defines an element and its possible content. A valid XML document contains only elements that are defined in the DTD.
Various keywords and characters specify an element’s content:
- EMPTY for no content
- ANY for any content
- , for sequences (the items must appear in the order listed)
- | for alternatives ("either...or")
- ( ) for groups
- * for any number (zero or more)
- + for at least once (one or more)
- ? for optional (zero or one)
- If there is no *, + or ?, the element must occur exactly one time
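Because these operators closely resemble regular-expression syntax, an element-only content model can be checked with an ordinary regular expression. The Python sketch below is only an illustration under simplifying assumptions: it handles parenthesised models made of element names, and does not handle EMPTY, ANY or #PCDATA. The function names content_model_to_regex and children_match are invented for the example.

```python
import re


def content_model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex.

    Every element name becomes the token "name " (the name plus one space),
    so a list of child element names can be matched as a single string.
    """
    pattern = re.sub(r"\s+", "", model)      # whitespace is not significant
    pattern = pattern.replace("(", "(?:")    # ( ) remain groups
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",                # each element name ...
        lambda m: "(?:" + re.escape(m.group()) + " )",  # ... becomes a token
        pattern,
    )
    pattern = pattern.replace(",", "")       # "," is plain sequencing
    return re.compile(pattern)               # | * + ? mean the same in a regex


def children_match(model, child_names):
    """Check whether a sequence of child element names satisfies the model."""
    children = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(children) is not None


print(children_match("(head, body)", ["head", "body"]))
print(children_match("(head, body)", ["body", "head"]))
print(children_match("(item | note)+", ["item", "note", "item"]))
```

Terminating each name with a space keeps one element name from being mistaken for the prefix of another.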
Examples:
<!ELEMENT html (head, body)>
<!ELEMENT p (#PCDATA | p | ul | dl | table | h1 | h2 | h3)*>
Attribute List Declarations
An Attribute List specifies the name, data type and default value of each attribute associated with a given element type,[3] for example:
<!ATTLIST img
   id     ID       #IMPLIED
   src    CDATA    #REQUIRED
   sort   CDATA    #FIXED "true"
   print  (yes|no) "yes"
>
The following attribute types are available:
- CDATA (character data)
- ID
- IDREF and IDREFS
- NMTOKEN and NMTOKENS
- ENTITY and ENTITIES
- NOTATION
- Enumerations and NOTATION enumerations
A default declaration defines whether an attribute must occur (#REQUIRED) or not (#IMPLIED), whether it has a fixed value (#FIXED), and which value should be used as a default value ("…") when the attribute is omitted from an element's start tag.
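The effect of default and fixed values can be observed with a non-validating parser. The following Python sketch places the img attribute list in an internal subset and parses an element that specifies only src; the file name logo.gif is a made-up value. It relies on the Expat-based parser behind the standard library's xml.etree.ElementTree, which supplies attribute defaults declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# The attribute list for img, declared in an internal subset.
# Only src is written in the start tag; "logo.gif" is a hypothetical value.
DOCUMENT = """<!DOCTYPE img [
  <!ATTLIST img
    id    ID       #IMPLIED
    src   CDATA    #REQUIRED
    sort  CDATA    #FIXED "true"
    print (yes|no) "yes">
]>
<img src="logo.gif"/>"""

img = ET.fromstring(DOCUMENT)

print(img.get("src"))    # specified in the document itself
print(img.get("sort"))   # supplied from the #FIXED declaration
print(img.get("print"))  # supplied from the declared default
print(img.get("id"))     # #IMPLIED: no value is supplied
```

Note that the parser does not complain if a #REQUIRED attribute is missing; enforcing that rule is the job of a validating parser.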
Entity Declarations
Entities are named pieces of replacement text that work like abbreviations; a typical use is to give user-readable names to special characters.[4] Entities thus help to avoid repetition and make editing easier. There are two basic types:
- Internal (Parsed) Entities define entity references in order to replace certain strings by a replacement text. The content of the entity is given in the declaration.
- External (Parsed) Entities refer to external storage objects.
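A minimal Python sketch of an internal parsed entity follows. The entity name publisher and its replacement text are invented for the example, and the sketch assumes the standard library's xml.etree.ElementTree, which expands internal entities declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# An internal parsed entity: its replacement text is given in the declaration.
# The entity name and text are hypothetical.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY publisher "Example Press">
]>
<note>Printed by &publisher;.</note>"""

note = ET.fromstring(DOCUMENT)
print(note.text)
```

The reference &publisher; is replaced by the declared text before the application sees the character data, so changing the declaration updates every place the entity is used.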
Notation Declarations
Notations identify the format of unparsed external entities, so that non-XML data can be included in an XML document. For example, for a GIF image:
<!NOTATION GIF SYSTEM "image/gif">
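As a sketch of how a parser reports such a declaration to an application, the Python code below registers a notation handler with the standard library's xml.parsers.expat module. The surrounding doc element and the helper name declared_notations are assumptions made for the example; the keyword SYSTEM is written in upper case, as the XML syntax requires.

```python
from xml.parsers import expat

# A notation declared in the internal subset of a minimal document.
DOCUMENT = """<!DOCTYPE doc [
  <!NOTATION GIF SYSTEM "image/gif">
]>
<doc/>"""


def declared_notations(xml_text):
    """Map each declared notation name to (public identifier, system identifier)."""
    notations = {}

    def on_notation(name, base, system_id, public_id):
        # Called by Expat once per <!NOTATION ...> declaration.
        notations[name] = (public_id, system_id)

    parser = expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.Parse(xml_text, True)
    return notations


print(declared_notations(DOCUMENT))
```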
XML DTDs and schema validation
The XML DTD syntax is one of several XML schema languages.
A common misconception holds that non-validating XML parsers do not have to read DTDs, when in fact, the DTD must still be scanned for correct syntax as well as for declarations of entities and default attributes. A non-validating parser may, however, elect not to read external entities, including the external subset of the DTD. If the XML document depends on declarations found only in external entities, it should assert standalone="no" in its XML declaration. Identification of the validating DTD may be performed by the use of XML Catalogs.
XML DTD Example
An example of a very simple XML DTD to describe a list of persons might consist of:
<!ELEMENT people_list (person*)>
<!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT birthdate (#PCDATA)>
<!ELEMENT gender (#PCDATA)>
<!ELEMENT socialsecuritynumber (#PCDATA)>
Taking this line by line:
- people_list is a valid element name, and an instance of such an element contains any number of person elements. The * denotes there can be 0 or more person elements within the people_list element.
- person is a valid element name, and an instance of such an element contains one element named name, followed by one named birthdate (optional), then gender (also optional) and socialsecuritynumber (also optional). The ? indicates that an element is optional. The reference to the name element name has no ?, so a person element must contain a name element.
- name is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
- birthdate is a valid element name, and an instance of such an element contains parsed character data.
- gender is a valid element name, and an instance of such an element contains parsed character data.
- socialsecuritynumber is a valid element name, and an instance of such an element contains parsed character data.
An example of an XML file which makes use of and conforms to this DTD follows. It assumes that we can identify the DTD with the relative URI reference "example.dtd"; the "people_list" after "!DOCTYPE" tells us that the root element of the document is called "people_list":
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE people_list SYSTEM "example.dtd">
<people_list>
  <person>
    <name>Fred Bloggs</name>
    <birthdate>27/11/2008</birthdate>
    <gender>Male</gender>
  </person>
</people_list>
One can render this in an XML-enabled browser (such as Internet Explorer or Mozilla Firefox) by pasting and saving the DTD component above to a text file named example.dtd and the XML file to a differently-named text file, and opening the XML file with the browser. The files should both be saved in the same directory. However, many browsers do not check that an XML document conforms to the rules in the DTD; they are only required to check that the DTD is syntactically correct. For security reasons, they may also choose not to read the external DTD.
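The same distinction between checking DTD syntax and checking conformance can be seen outside the browser. In the Python sketch below, part of the people_list DTD is embedded as an internal subset and the document deliberately breaks its rules; the nickname element is invented for the example. The standard library's xml.etree.ElementTree is non-validating, so it accepts the nonconforming document but still rejects a DTD that is syntactically broken.

```python
import xml.etree.ElementTree as ET

# Part of the people_list DTD, embedded as an internal subset. The document
# breaks its rules: <person> has no <name> and contains an undeclared
# (hypothetical) <nickname> element.
DOCUMENT = """<!DOCTYPE people_list [
  <!ELEMENT people_list (person*)>
  <!ELEMENT person (name, birthdate?, gender?, socialsecuritynumber?)>
  <!ELEMENT name (#PCDATA)>
]>
<people_list>
  <person>
    <nickname>Fred</nickname>
  </person>
</people_list>"""

# A non-validating parser accepts the document despite the violations.
root = ET.fromstring(DOCUMENT)
tags = [element.tag for element in root.iter()]
print(tags)

# The DTD itself must still be syntactically correct: dropping a closing
# parenthesis from a declaration makes the whole document unparseable.
BROKEN = DOCUMENT.replace("(person*)", "(person*")
try:
    ET.fromstring(BROKEN)
    dtd_syntax_checked = False
except ET.ParseError:
    dtd_syntax_checked = True
print(dtd_syntax_checked)
```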
Alternatives to DTDs are available:
- XML Schema, also referred to as XML Schema Definition (XSD), has achieved Recommendation status within the W3C, and is popular for "data oriented" (that is, transactional non-publishing) XML use because of its stronger typing and easier round-tripping to Java declarations. Most of the publishing world has found that the added complexity of XSD would not bring them any particular benefits, so DTDs are still far more popular there. An XML Schema Definition is itself an XML document while a DTD is not.
- RELAX NG, which is also a part of DSDL, is an ISO international standard. It is more expressive than XSD, while providing a simpler syntax, but commercial software support has been slow in coming.
See also
- Document Type Declaration
- Semantic Web
- XML Schema Language Comparison - Comparison to other XML Schema languages.
References
- [1] "doctypedecl". Extensible Markup Language (XML) 1.1. W3C. http://www.w3.org/TR/2004/REC-xml11-20040204/#NT-doctypedecl.
- [2] Watt, Andrew H. Sams Teach Yourself XML in 10 Minutes.
- [3] http://www.stylusstudio.com/w3c/xml11/attdecls.htm#attdecls
- [4] http://www.w3schools.com/dtd/dtd_entities.asp
External links
- Definition of the XML document type declaration from Extensible Markup Language (XML) 1.0 (Fourth Edition) on W3.org
- XML DTD Quick Reference
- The XML FAQ has some DTD-specific entries
- DTD Tutorial from W3schools
- Zvon DTD Tutorial - in 7 languages
- Interactive DTD tutorial from XMLzoo
- Different doctypes for HTML
- XMLPatterns.com - Design Patterns for developing DTDs
- dtd2xs Converts a DTD to an XML Schema
- PlainXML Converts a DTD to POJO objects
- DTD Statistics
