Introduction to XML

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

 

Example:

<?xml version="1.0" encoding="UTF-8"?>
<Employee>
  <id>001</id>
  <name>Jacob</name>
  <age>31</age>
  <salary>1000000</salary>
</Employee>
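To see the machine-readable side of this example, here is a minimal sketch that reads the Employee document with Python's standard xml.etree.ElementTree module. The helper name read_employee is only illustrative and is not part of any library.

```python
import xml.etree.ElementTree as ET

# The Employee document from the example above, kept as bytes so the
# encoding declaration in the first line is honoured by the parser.
EMPLOYEE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Employee>
  <id>001</id>
  <name>Jacob</name>
  <age>31</age>
  <salary>1000000</salary>
</Employee>
"""


def read_employee(xml_bytes):
    """Return the children of the root element as a dict of tag -> text."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


employee = read_employee(EMPLOYEE_XML)
print(employee["name"], employee["age"])
```

Note that every value comes back as a string (for example "001" and "31"); XML itself does not assign data types to content.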

 

Important points about XML

  1. XML is both human-readable and machine-readable.

  2. XML was designed to store and transport data.

  3. XML is a W3C Recommendation.

  4. XML resembles HTML, but the two differ in important ways:

    • XML focuses on the structure of data, while HTML focuses on displaying data.

    • HTML has predefined tags, but XML has no predefined tags.

    • XML has a strict syntax compared to HTML.

  5. An XML document is a string of characters. Almost every legal Unicode character may appear in an XML document.

    • Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.

    • Unicode 8.0 (2015) already contained a repertoire of more than 120,000 characters, and later versions have added more.

    • The standard consists of a set of code charts for visual reference, an encoding method and set of standard character encodings, a set of reference data files, and a number of related items.

    • UTF-8 is the default character encoding for XML documents.
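The point about Unicode and the UTF-8 default can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree module; the greeting element and the sample text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text mixing Latin, Devanagari and Japanese characters.
TEXT = "Grüße, नमस्ते, こんにちは"

# There is no encoding declaration here, so the parser falls back to UTF-8.
utf8_doc = f"<greeting>{TEXT}</greeting>".encode("utf-8")
parsed = ET.fromstring(utf8_doc)

# ElementTree's default serialisation is ASCII-only: characters outside
# ASCII are written as numeric character references such as &#252;.
ascii_doc = ET.tostring(parsed)

print(parsed.text == TEXT)
print(ascii_doc.isascii())
print(ET.fromstring(ascii_doc).text == TEXT)
```

Both forms carry the same characters: the UTF-8 bytes store them directly, while the ASCII form spells them out as character references that the parser turns back into the original text.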

 

Syntax summary

  • The characters making up an XML document are divided into markup and content.

    • Generally, strings that constitute markup either begin with the character < and end with a >, or they begin with the character & and end with a ;.

    • Strings of characters that are not markup are content. 

    • E.g. in the example above, <id> and </id> are markup, while 001 is content.

  • A tag is a markup construct that begins with < and ends with >.

    • Tags come in three flavors:

      • start-tags

        • e.g. <section>

      • end-tags

        • e.g. </section>

      • empty-element tags

        • e.g. <line-break />

  • An element is a logical document component that either begins with a start-tag and ends with a matching end-tag, or consists only of an empty-element tag.

  • A start-tag or an empty-element tag may contain key/value pairs called attributes.

    • E.g. <Employee id="001">

  • An XML document may begin with a prolog that describes the document itself, such as its XML version and character encoding.

    • E.g. <?xml version="1.0" encoding="UTF-8"?>

    • If it exists, it must come first in the document.

    • UTF-8 is the default character encoding for XML documents.

  • XML has a strict set of rules:

    • All XML elements must have a closing tag.

    • XML tags are case sensitive.

    • XML elements must be properly nested.

    • XML attribute values must be quoted.

    • Some characters have a special meaning in XML (e.g. <). When they appear in content, they should be replaced with an entity reference.

      • E.g. You can replace the "<" character with an entity reference: &lt;
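The syntax rules above can be checked with a small sketch. It assumes Python's standard xml.etree.ElementTree parser and the escape helper from xml.sax.saxutils; the function is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One hypothetical sample per rule in the list above.
samples = {
    "closed and nested": "<a><b>text</b></a>",
    "missing closing tag": "<a><b>text</a>",
    "case mismatch": "<Name>Jacob</name>",
    "improper nesting": "<a><b>text</a></b>",
    "unquoted attribute": "<Employee id=001/>",
    "raw < in content": "<expr>1 < 2</expr>",
    "escaped < in content": "<expr>1 &lt; 2</expr>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")

# Elements, attributes and an empty-element tag, as described above.
root = ET.fromstring('<Employee id="001"><line-break /></Employee>')
print(root.tag, root.attrib, root[0].tag)

# escape() replaces &, < and > with entity references before text is
# placed inside a document; the parser turns them back into characters.
escaped = escape("1 < 2 & 3")
print(escaped)
print(ET.fromstring(f"<expr>{escaped}</expr>").text)
```

Only the first and last samples are accepted; each of the others breaks exactly one rule, so the parser rejects the whole document rather than guessing at the intent, which is the main practical difference from the more forgiving HTML parsers.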

 

Please go through the attached reference links to learn more about XML.
