What is XML?
- XML stands for eXtensible Markup Language
- XML is a markup language much like HTML
- XML was designed to store and transport data
- XML was designed to be self-descriptive
- XML is a W3C Recommendation
The Difference Between XML and HTML
XML and HTML were designed with different goals:
- XML was designed to carry data - with focus on what data is
- HTML was designed to display data - with focus on how data looks
- XML tags are not predefined like HTML tags are.
For example:
<?xml version="1.0"?>
<contact-info>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</contact-info>
XML Documents Must Have a Root Element
XML documents must contain one root element
that is the parent of all other elements:
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
In this example <note> is
the root element:
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
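The single-root rule can be checked with any XML parser. The sketch below assumes Python's standard `xml.etree.ElementTree` module; the helper name `root_tag` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

# Two top-level elements and no common parent: not a well-formed document.
TWO_ROOTS_XML = "<to>Tove</to><from>Jani</from>"


def root_tag(xml_text):
    """Return the root element's tag name, or None if parsing fails."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError:
        return None


print(root_tag(NOTE_XML))       # note
print(root_tag(TWO_ROOTS_XML))  # None
```

The second document is rejected because nothing encloses `<to>` and `<from>`; wrapping them in one parent element makes it parse.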
The XML Prolog
This line is called the XML prolog:
<?xml version="1.0" encoding="UTF-8"?>
The XML prolog is optional. If it exists, it
must come first in the document.
XML documents can contain international
characters, like Norwegian øæå or French êèé.
To avoid errors, you should specify the encoding
used, or save your XML files as UTF-8.
UTF-8 is the default character encoding for XML
documents.
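To see how the prolog and the encoding interact, here is a small sketch that assumes Python's standard `xml.etree.ElementTree` parser. The sample text and the helper `parse_text` are invented for the illustration.

```python
import xml.etree.ElementTree as ET

TEXT = "øæå êèé"

# Latin-1 bytes with a prolog that names the encoding.
declared_latin1 = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><word>' + TEXT + "</word>"
).encode("iso-8859-1")

# UTF-8 bytes with no prolog: the parser falls back to UTF-8.
plain_utf8 = ("<word>" + TEXT + "</word>").encode("utf-8")

# Latin-1 bytes with no prolog: they are read as UTF-8 and are invalid.
undeclared_latin1 = ("<word>" + TEXT + "</word>").encode("iso-8859-1")

# A prolog that does not come first in the document (a newline precedes it).
late_prolog = b'\n<?xml version="1.0" encoding="UTF-8"?><word>ok</word>'


def parse_text(data):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


for sample in (declared_latin1, plain_utf8, undeclared_latin1, late_prolog):
    print(parse_text(sample))
```

The first two inputs should yield the same text; the last two should be rejected, one for an undeclared encoding and one for a misplaced prolog.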
XML Tags are Case Sensitive:
XML tags are case sensitive. The tag
<Letter> is different from the tag <letter>.
Opening and closing tags must be written with
the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
"Opening and closing tags" are often
referred to as "Start and end tags". Use whatever you prefer. It is
exactly the same thing.
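A parser enforces case sensitivity strictly. The following sketch assumes Python's standard `xml.etree.ElementTree`; `is_well_formed` is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Start and end tags differ only in case, so the parser sees a mismatch.
print(is_well_formed("<Message>This is incorrect</message>"))  # False
print(is_well_formed("<message>This is correct</message>"))    # True

# <Letter> and <letter> are two different element names.
print(ET.fromstring("<Letter/>").tag == ET.fromstring("<letter/>").tag)  # False
```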
XML Attribute Values Must be Quoted
XML elements can have attributes in name/value
pairs just like in HTML.
In XML, the attribute values must always be
quoted.
INCORRECT:
<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>
CORRECT:
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date
attribute in the note element is not quoted.
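The two documents above can be fed to a parser to confirm the difference. This is a minimal sketch assuming Python's standard `xml.etree.ElementTree`; the function name `read_date` is invented here.

```python
import xml.etree.ElementTree as ET

UNQUOTED = "<note date=12/11/2007><to>Tove</to><from>Jani</from></note>"
QUOTED = '<note date="12/11/2007"><to>Tove</to><from>Jani</from></note>'


def read_date(xml_text):
    """Return the note's date attribute, or a message if parsing fails."""
    try:
        return ET.fromstring(xml_text).get("date")
    except ET.ParseError as err:
        return f"parse error: {err}"


print(read_date(QUOTED))    # 12/11/2007
print(read_date(UNQUOTED))  # a parse error message from the parser
```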
Entity References
Some characters have a special meaning in XML.
If you place a character like "<"
inside an XML element, it will generate an error because the parser interprets
it as the start of a new element.
This will generate an XML error:
<message>salary < 1000</message>
To avoid this error, replace the
"<" character with an entity reference:
<message>salary &lt; 1000</message>
There are 5 pre-defined entity references in
XML:
&lt; | < | less than
&gt; | > | greater than
&amp; | & | ampersand
&apos; | ' | apostrophe
&quot; | " | quotation mark
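In practice the escaping is usually left to a library. The sketch below assumes Python's standard library: `xml.sax.saxutils.escape` replaces `<`, `>` and `&` with entity references, and `xml.etree.ElementTree` turns the references back into characters when parsing. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "salary < 1000"

# escape() substitutes entity references for the characters XML reserves.
escaped = f"<message>{escape(raw_text)}</message>"
print(escaped)  # <message>salary &lt; 1000</message>

# The parser converts the reference back into the original character.
print(ET.fromstring(escaped).text)  # salary < 1000

# All five predefined references are understood without any declaration.
decoded = ET.fromstring("<t>&lt;&gt;&amp;&apos;&quot;</t>").text
print(decoded)
```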
Comments in XML:
The syntax for writing comments in XML is
similar to that of HTML.
<!-- This is a comment -->
Two dashes in the middle of a comment are not
allowed.
Not allowed:
<!-- This is a --
comment -->
Strange, but allowed:
<!-- This is a - -
comment -->
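The comment rule can be confirmed with a parser. This sketch assumes Python's standard `xml.etree.ElementTree`, and wraps each comment in a `<note>` element (an addition for the test) so that the input is a complete document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


plain = "<note><!-- This is a comment --></note>"
double_dash = "<note><!-- This is a -- comment --></note>"
spaced_dashes = "<note><!-- This is a - - comment --></note>"

print(is_well_formed(plain))          # True
print(is_well_formed(double_dash))    # False
print(is_well_formed(spaced_dashes))  # True
```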
What is an XML Element?
An XML element is everything from (including)
the element's start tag to (including) the element's end tag.
<price>29.99</price>
An element can contain:
- text
- attributes
- other elements
- or a mix of the above
<bookstore>
<book category="children">
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
In the example above:
<title>, <author>, <year>, and <price>
have text content because they contain
text (like 29.99).
<bookstore> and <book> have element contents, because they contain elements.
<book> has an attribute (category="children").
Attributes are designed to contain data related
to a specific element.
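The three kinds of content (text, attributes, child elements) map directly onto what a parser exposes. The following is a sketch assuming Python's standard `xml.etree.ElementTree`, using a shortened copy of the bookstore example.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book category="children">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

bookstore = ET.fromstring(BOOKSTORE_XML)
first_book = bookstore[0]  # child elements are reached by index or iteration

# Attributes are exposed as a dictionary on the element.
print(first_book.attrib)  # {'category': 'children'}

# <book> has element content: its children are themselves elements.
child_tags = [child.tag for child in first_book]
print(child_tags)  # ['title', 'author', 'year', 'price']

# <price> has text content.
print(first_book.find("price").text)  # 29.99
```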
XML Attributes Must be Quoted
Attribute values must always be quoted. Either
single or double quotes can be used.
For a person's gender, the <person>
element can be written like this:
<person gender="female">
or like this:
<person gender='female'>
If the attribute value itself contains double
quotes you can use single quotes, like in this example:
<gangster name='George "Shotgun" Ziegler'>
or you can use character entities:
<gangster name="George &quot;Shotgun&quot; Ziegler">
XML Elements vs. Attributes
Take a look at these examples:
<person gender="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
<person>
<gender>female</gender>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
In the first example gender is an attribute. In
the last, gender is an element. Both examples provide the same information.
There are no rules about when to use attributes
or when to use elements in XML.
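Because both forms carry the same information, code that reads such documents sometimes accepts either one. A minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the helper `gender_of` is hypothetical.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = """<person gender="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>"""

AS_ELEMENT = """<person>
  <gender>female</gender>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>"""


def gender_of(person):
    """Look for gender as an attribute first, then as a child element."""
    value = person.get("gender")
    if value is None:
        value = person.findtext("gender")
    return value


print(gender_of(ET.fromstring(AS_ATTRIBUTE)))  # female
print(gender_of(ET.fromstring(AS_ELEMENT)))    # female
```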
My Favorite Way
The following three XML documents contain
exactly the same information:
A date attribute is used in the first example:
<note date="2008-01-10">
<to>Tove</to>
<from>Jani</from>
</note>
A <date> element is used in the second
example:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
</note>
An expanded <date> element is used in the
third example (this is my favorite):
<note>
<date>
<year>2008</year>
<month>01</month>
<day>10</day>
</date>
<to>Tove</to>
<from>Jani</from>
</note>
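One way to see that the three documents hold the same information is to read each into the same value. The sketch below assumes Python's standard `xml.etree.ElementTree` and `datetime` modules; `note_date` is a made-up helper that tries each of the three layouts in turn.

```python
import xml.etree.ElementTree as ET
from datetime import date

ATTRIBUTE_FORM = '<note date="2008-01-10"><to>Tove</to><from>Jani</from></note>'
ELEMENT_FORM = "<note><date>2008-01-10</date><to>Tove</to><from>Jani</from></note>"
EXPANDED_FORM = (
    "<note><date><year>2008</year><month>01</month><day>10</day></date>"
    "<to>Tove</to><from>Jani</from></note>"
)


def note_date(xml_text):
    """Return the note's date regardless of which of the three forms is used."""
    note = ET.fromstring(xml_text)
    attr = note.get("date")
    if attr is not None:
        return date.fromisoformat(attr)
    date_el = note.find("date")
    if date_el is None:
        return None
    if len(date_el) == 0:  # no child elements, so the date is plain text
        return date.fromisoformat(date_el.text.strip())
    return date(
        int(date_el.findtext("year")),
        int(date_el.findtext("month")),
        int(date_el.findtext("day")),
    )


for form in (ATTRIBUTE_FORM, ELEMENT_FORM, EXPANDED_FORM):
    print(note_date(form))  # 2008-01-10 each time
```

The expanded form is the only one where the parts of the date can be read without parsing a string, which is one argument in its favor.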
XML Attributes for Metadata:
Sometimes ID references are assigned to
elements. These IDs can be used to identify XML elements in much the same way
as the id attribute in HTML. This example demonstrates this:
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
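An id attribute makes it easy to pick out one element among several. A brief sketch, assuming Python's standard `xml.etree.ElementTree` and its limited XPath support; the helper `find_note` is invented for this example.

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not</body>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES_XML)


def find_note(root, note_id):
    """Return the <note> child whose id attribute matches, or None."""
    return root.find(f"note[@id='{note_id}']")


reply = find_note(messages, "502")
print(reply.findtext("heading"))  # Re: Reminder
print(reply.findtext("body"))     # I will not
```

The id here is metadata about the note, not part of the message itself, which is why it sits in an attribute.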
An Example XML Document
The image above represents books in this XML:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
XML Tree Structure
XML documents are formed as element
trees.
An XML tree starts at a root element and
branches from the root to child elements.
All elements can have sub elements (child
elements):
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to
describe the relationships between elements.
Parents have children. Children have parents.
Siblings are children on the same level (brothers and sisters).
All elements can have text content (Harry
Potter) and attributes (category="cooking").
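The parent, child, and sibling relationships can be explored in code. The sketch below assumes Python's standard `xml.etree.ElementTree`, which does not store parent links, so a lookup table named `parent_of` (a name chosen for this example) is built first.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Map every child element to its parent by walking the whole tree once.
parent_of = {child: parent for parent in root.iter() for child in parent}

title = root.find("book/title")
book = parent_of[title]

print(book.tag)             # book
print(parent_of[book].tag)  # bookstore

# Siblings are the other children of the same parent.
siblings = [el.tag for el in book if el is not title]
print(siblings)  # ['author', 'year', 'price']
```

The root element is the only one with no entry in the table, since it has no parent.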
Self-Describing Syntax
XML uses a largely self-describing syntax.
A prolog defines the XML version and the
character encoding:
<?xml version="1.0" encoding="UTF-8"?>
The next line is the root element of
the document:
<bookstore>
The next line starts a <book> element:
<book category="cooking">
Each <book> element has 4 child
elements: <title>, <author>, <year>, and <price>.
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
The next line ends the book element:
</book>
You can assume, from this example, that the XML
document contains information about books in a bookstore.