XML
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

XML Structure
XML Declaration
XML documents may begin by declaring some information about themselves, as in the following example:
<?xml version="1.0" encoding="UTF-8" ?> 
Tag
A markup construct that begins with < and ends with >. Tags come in three flavors: 
start-tags; for example: <section>
end-tags; for example: </section>
empty-element tags; for example: <line-break />
Element
A logical document component that either begins with a start-tag and ends with a matching end-tag or consists only of an empty-element tag.
The characters between the start- and end-tags, if any, are the element's content, and may contain markup, including other elements, which are called child elements.
<Greeting>Hello, world.</Greeting>
Attribute
A markup construct consisting of a name/value pair that exists within a start-tag or empty-element tag.
<step number="3">

XML Parser
SAX (Simple API for XML) is an event-based sequential access parser API.
<?xml version="1.0" encoding="UTF-8"?>
<RootElement param="value">
    <FirstElement>
        Some Text
    </FirstElement>
    <SecondElement param2="something">
        Pre-Text <Inline>Inlined text</Inline> Post-text.
    </SecondElement>
</RootElement>
Events
Processing Instruction event
XML Element start
XML Text node
XML Element end
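The event sequence listed above can be observed with the SAX parser that ships with the JDK. The following is a minimal sketch, not a reference implementation: the class name SaxEventsDemo and the log format are illustrative assumptions, and text is trimmed so that whitespace-only nodes are skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the events a SAX parser pushes to a handler.
class SaxEventsDemo {
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text between elements.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<RootElement param=\"value\"><FirstElement>Some Text</FirstElement></RootElement>";
        events(xml).forEach(System.out::println);
    }
}
```

The parser drives the control flow here: the application only reacts to callbacks, so any context it needs (for example, which element is currently open) has to be tracked in the handler itself.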
DOM (Document Object Model) is a convention for representing and interacting with objects in XML documents. The parser builds the whole document as a tree in memory, and objects in the DOM tree may be addressed and manipulated by calling methods on them.
|-> Document
    |-> Node (element)
        |-> Node (text)
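As a rough sketch of addressing nodes in such a tree, the following uses the DOM parser bundled with the JDK. The class name DomDemo and the output format are illustrative assumptions; the element and attribute names are taken from the sample document shown earlier.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: loads the full tree, then reads an attribute and a text node.
class DomDemo {
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            // Look up a descendant element by tag name and read its text content.
            Element first = (Element) root.getElementsByTagName("FirstElement").item(0);
            return root.getTagName() + "[param=" + root.getAttribute("param") + "] -> "
                    + first.getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<RootElement param=\"value\"><FirstElement>Some Text</FirstElement></RootElement>";
        System.out.println(describe(xml));
    }
}
```

Because the entire document is held in memory, nodes can be visited in any order, at the cost of memory proportional to the document size.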
StAX (Streaming API for XML) was designed as a middle ground between these two opposites. In the StAX metaphor, the programmatic entry point is a cursor that represents a point within the document. The application moves the cursor forward, 'pulling' information from the parser as it needs it. This differs from an event-based API such as SAX, which 'pushes' data to the application and requires the application to maintain state between events to keep track of its location within the document.
while (reader.hasNext()) {
    reader.next();
}
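A fuller version of this loop, using the cursor-style XMLStreamReader from the JDK, might look like the sketch below. The class name StaxDemo and the choice to collect element names are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: pulls events from the parser and keeps only element starts.
class StaxDemo {
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // The application decides when to advance the cursor.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<RootElement><FirstElement>Some Text</FirstElement></RootElement>"));
    }
}
```

Since the application owns the loop, it can stop reading as soon as it has found what it needs, which is harder to arrange with a push-style handler.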

JSON
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.

JSON Structure
object
    {}
    { members }
members
    pair
    pair , members
pair
    string : value
array
    []
    [ elements ]
elements
    value
    value , elements
value
    string
    number
    object
    array
    true
    false
    null

Number (double-precision floating point in JavaScript; in other languages the representation depends on the implementation)
String (double-quoted Unicode (UTF-8 by default), with backslash escaping)
Boolean (true or false)
Array (an ordered sequence of values, comma-separated and enclosed in square brackets; the values do not need to be of the same type)
Object (an unordered collection of key:value pairs with the ':' character separating the key and the value, comma-separated and enclosed in curly braces; the keys must be strings and should be distinct from each other)
null (empty)

JSON Parsing
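A parser (deserializer) reads the JSON string and builds an in-memory tree of objects, arrays, and values.
Each nested object, array, or value is then retrieved by giving its key to the object that contains it, or its index to the array that contains it.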
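To make the link between the grammar and parsing concrete, here is a deliberately small recursive-descent parser with one method per grammar rule. It is a teaching sketch under stated assumptions, not a substitute for a library such as Jackson: the class name MiniJson is hypothetical, all numbers become Double, and escape handling is simplified (an escaped character is kept literally, so \n and \uXXXX are not decoded).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative recursive-descent parser; not a complete JSON implementation.
class MiniJson {
    private final String src;
    private int pos = 0;

    private MiniJson(String src) { this.src = src; }

    // Returns a Map for objects, a List for arrays, or a String, Double, Boolean, or null.
    static Object parse(String json) {
        MiniJson p = new MiniJson(json);
        Object result = p.value();
        p.skipSpace();
        if (p.pos != json.length()) throw new IllegalArgumentException("trailing characters at " + p.pos);
        return result;
    }

    private Object value() {
        skipSpace();
        char c = peek();
        if (c == '{') return object();
        if (c == '[') return array();
        if (c == '"') return string();
        if (src.startsWith("true", pos)) { pos += 4; return Boolean.TRUE; }
        if (src.startsWith("false", pos)) { pos += 5; return Boolean.FALSE; }
        if (src.startsWith("null", pos)) { pos += 4; return null; }
        return number();
    }

    private Map<String, Object> object() {
        Map<String, Object> members = new LinkedHashMap<>();
        pos++; // consume '{'
        skipSpace();
        if (peek() == '}') { pos++; return members; }
        do {
            skipSpace();
            String key = string();
            skipSpace();
            expect(':');
            members.put(key, value());
            skipSpace();
        } while (consumeIf(','));
        expect('}');
        return members;
    }

    private List<Object> array() {
        List<Object> elements = new ArrayList<>();
        pos++; // consume '['
        skipSpace();
        if (peek() == ']') { pos++; return elements; }
        do {
            elements.add(value());
            skipSpace();
        } while (consumeIf(','));
        expect(']');
        return elements;
    }

    private String string() {
        expect('"');
        StringBuilder sb = new StringBuilder();
        while (peek() != '"') {
            char c = next();
            if (c == '\\') c = next(); // simplified: keep the escaped character as-is
            sb.append(c);
        }
        pos++; // consume closing quote
        return sb.toString();
    }

    private Double number() {
        int start = pos;
        while (pos < src.length() && "+-0123456789.eE".indexOf(src.charAt(pos)) >= 0) pos++;
        if (start == pos) throw new IllegalArgumentException("unexpected character at " + pos);
        return Double.valueOf(src.substring(start, pos));
    }

    private char peek() {
        if (pos >= src.length()) throw new IllegalArgumentException("unexpected end of input");
        return src.charAt(pos);
    }

    private char next() { char c = peek(); pos++; return c; }

    private void expect(char c) {
        if (next() != c) throw new IllegalArgumentException("expected '" + c + "' at " + (pos - 1));
    }

    private boolean consumeIf(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void skipSpace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

Retrieval then follows the containment chain: cast the result of MiniJson.parse to a Map, call get with a key to reach a nested array or object, and repeat until the wanted value is reached.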



Protocol Buffer
Protocol Buffers are a serialization format with an interface description language developed by Google.

Protocol Buffers are serialized into a binary wire format.
Compact
Forwards-compatible, backwards-compatible
Not self-describing: there is no way to tell the names, meaning, or full data types of fields without an external specification, and there is no defined way to include or refer to such a schema within a Protocol Buffer file.

Protocol Buffer Parsing
Define message formats in a .proto file.

Use the protocol buffer compiler.
$  protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/addressbook.proto 
Use the Java protocol buffer API to write and read messages. 
byte[] toByteArray();: serializes the message and returns a byte array containing its raw bytes. 
static Person parseFrom(byte[] data);: parses a message from the given byte array. 
void writeTo(OutputStream output);: serializes the message and writes it to an OutputStream. 
static Person parseFrom(InputStream input);: reads and parses a message from an InputStream. 

Comparison
Why?
XML, JSON, and Protocol Buffers are all used for the same purpose: transmitting data.
Basis of Comparison
Size of data
Complexity of code
Parsing speed

Comparison based on Size of data
Data to be converted
Employee name
Employee DOB
Employee designation
Sample converted data
XML
<?xml version="1.0" encoding="UTF-8"?>
<employees>
    <employee>
        <id>1</id>
        <name>Uday</name>
        <dob>13-05-1984</dob>
        <role>SE</role>
    </employee>
</employees>
JSON
{"employees":[{"id":1,"name":"Uday","dob":"13-05-1984","role":"SE"}]}
Protocol Buffer
Uday13-05-1984"SE
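The Protocol Buffer sample looks truncated because most of its framing bytes are not printable. The sketch below hand-encodes a single employee in the protobuf wire format to show where the bytes go. It rests on assumptions that the source does not state: the field numbers (id = 1, name = 2, dob = 3, role = 4) and the class name WireFormatDemo are hypothetical, and real code would use classes generated by protoc rather than writing bytes by hand.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: manual encoding of one message in the protobuf wire format.
class WireFormatDemo {
    private static final int VARINT = 0;           // wire type for integer fields
    private static final int LENGTH_DELIMITED = 2; // wire type for strings

    // Writes an unsigned integer as a base-128 varint, least significant group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // A field tag is the field number shifted left by three bits, combined with the wire type.
    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, ((long) fieldNumber << 3) | wireType);
    }

    // Strings are written as tag, byte length, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        writeTag(out, fieldNumber, LENGTH_DELIMITED);
        writeVarint(out, bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    // Field numbers 1 to 4 are assumed for illustration only.
    static byte[] encodeEmployee(int id, String name, String dob, String role) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeTag(out, 1, VARINT);
        writeVarint(out, id);
        writeString(out, 2, name);
        writeString(out, 3, dob);
        writeString(out, 4, role);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeEmployee(1, "Uday", "13-05-1984", "SE");
        System.out.println(bytes.length + " bytes");
    }
}
```

Under this assumed numbering the record takes 24 bytes, because field names are never written, only one-byte tags and lengths. The tag for field 4 with the length-delimited wire type is 0x22, which is the ASCII double quote, and that would be consistent with the stray quote before SE in the sample above.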


Comparison - Complexity of Code
XML Parsing
SAX

StAX

JSON Parser (Jackson)

Protocol Buffers

Comparison - Parsing Speed





Further
Formats
Java Serialization (Binary)
Smile Data Format (Binary)
YAML (YAML Ain't Markup Language)
APIs
Apache Thrift (native)
Apache Avro (native)
Protostuff (protostuff, protobuf, json, smile, xml, yaml)
Wobly (native)
Kryo (native)
Jackson (JSON)