Understanding Interprocess Communication, Sockets, and XML-RPC
Interprocess Communication Characteristics
Interprocess communication (IPC) is message passing between a pair of processes. It is supported by two operations, send and receive, each defined in terms of destinations and messages. One process sends a message (a sequence of bytes) to a destination, and another process at that destination receives it.
Synchronization
In synchronous communication, sending and receiving processes synchronize at each message exchange. Both send and receive are blocking operations.
In asynchronous communication, the send operation is non-blocking. The receive operation can be blocking or non-blocking.
Message Destination
Internet protocols send messages to (Internet address, local port) pairs. Processes can use multiple ports to receive messages.
Reliability and Ordering
Communication is reliable if messages are guaranteed to be delivered despite packet loss. Some applications require messages to be delivered in the order they were sent.
Sockets
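Interprocess communication consists of transmitting a message between a socket in one process and a socket in another process. A process can use multiple ports to receive messages but cannot share a port with other processes on the same computer (processes using IP multicast are an exception). Any number of processes may send messages to the same port.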
UDP Datagram Communication
To send or receive messages, a process must create a socket bound to an Internet address of the local host and a local port. A server binds its socket to a server port.
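As a minimal sketch of the binding step described above, the following Python code creates a server socket bound to a local address and port, then sends it one datagram from a client socket. The helper names (make_udp_server, send_datagram, demo), the loopback address, and the use of port 0 (which asks the operating system for any free port) are assumptions made for illustration; a real server would bind to its published server port.

```python
import socket


def make_udp_server(host="127.0.0.1", port=0):
    """Create a datagram socket bound to a local Internet address and port.

    port=0 lets the OS choose a free port (an assumption for this sketch).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def send_datagram(message: bytes, destination):
    """Send one datagram; the OS binds the client socket to a free local port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.sendto(message, destination)


def demo():
    server = make_udp_server()
    server.settimeout(5.0)  # demo only: do not wait forever if the datagram is lost
    try:
        send_datagram(b"hello", server.getsockname())
        data, _sender = server.recvfrom(1024)
    finally:
        server.close()
    return data
```

The client never calls bind explicitly: sending from an unbound datagram socket causes the operating system to pick a local port for it.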
Blocking and Timeouts
Sockets typically provide non-blocking sends and blocking receives for datagram communication. Non-blocking receive is an option in some implementations.
A receive operation that blocks indefinitely is suitable for servers waiting for client requests. However, it’s inappropriate for a process to wait indefinitely if the sending process has crashed or the message is lost.
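One common way to avoid waiting indefinitely is to set a timeout on the socket. The sketch below, using a hypothetical helper named receive_with_timeout, returns None when no datagram arrives within the given time; the timeout values are arbitrary choices for the demonstration.

```python
import socket


def receive_with_timeout(sock, timeout_seconds, bufsize=1024):
    """Return (data, sender_address), or None if nothing arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recvfrom(bufsize)
    except socket.timeout:
        return None  # the sender may have crashed, or the message was lost


def demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))
        nothing = receive_with_timeout(sock, 0.1)   # no sender yet: times out
        sock.sendto(b"ping", sock.getsockname())    # send a datagram to ourselves
        something = receive_with_timeout(sock, 5.0)
    return nothing, something[0]
```

A suitable timeout should be fairly large compared with the time needed to transmit a message, so that slow but healthy senders are not mistaken for failed ones.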
Receive from Any
The receive method doesn’t specify a message origin. It gets a message addressed to its socket from any origin and returns the sender’s Internet address and local port, allowing the recipient to check the source.
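In Python's socket API this behaviour corresponds to recvfrom, which returns both the payload and the sender's (address, port) pair. The sketch below accepts a datagram from any origin and then checks the source against a set of trusted hosts; the helper name receive_from_any and the trusted-host set are assumptions for illustration.

```python
import socket


def receive_from_any(sock, trusted_hosts, bufsize=1024):
    """Receive one datagram from any origin and report whether it is trusted."""
    data, (sender_ip, sender_port) = sock.recvfrom(bufsize)
    trusted = sender_ip in trusted_hosts
    return data, (sender_ip, sender_port), trusted


def demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        server.settimeout(5.0)  # demo only
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as a, \
                socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as b:
            a.sendto(b"from a", server.getsockname())
            b.sendto(b"from b", server.getsockname())
            first = receive_from_any(server, {"127.0.0.1"})
            second = receive_from_any(server, {"127.0.0.1"})
    return first, second
```

Both datagrams arrive on the same server socket, and the two senders are distinguished only by the address and port that recvfrom reports.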
Failure Model
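A failure model for communication channels defines reliable communication in terms of two properties, integrity and validity. Integrity requires that messages are not corrupted or duplicated; validity requires that every message sent is eventually delivered. UDP datagrams can suffer from the following failures: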
- Omission failures: Messages may be dropped due to checksum errors or lack of buffer space.
- Ordering: Messages can be delivered out of order.
TCP Protocol API
The TCP protocol API, which originates from BSD 4.x UNIX, provides the abstraction of a stream of bytes to which data can be written and from which data can be read.
Stream Communication
Stream communication assumes one process is the client and the other is the server when establishing a connection.
Message Sizes, Lost Messages, and Flow Control
Applications choose how much data to write to or read from a stream.
The TCP protocol uses an acknowledgement scheme. If the sender doesn’t receive an acknowledgement within a timeout, it retransmits the message.
The TCP protocol attempts to match the speeds of reading and writing processes. If the writer is too fast, it’s blocked until the reader consumes enough data.
Message Duplication, Ordering, and Destinations
Message identifiers are associated with each IP packet, enabling the recipient to detect and reject duplicates or reorder messages that arrive out of order.
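The idea of using identifiers to reject duplicates and restore order can be illustrated with a toy receiver. This is a simplified sketch, not TCP's actual implementation: it assumes one sequence number per packet starting at 0 (TCP numbers bytes, not packets), and the class name InOrderReceiver is hypothetical.

```python
class InOrderReceiver:
    """Deliver payloads in sequence order, dropping duplicates."""

    def __init__(self):
        self.next_expected = 0  # lowest sequence number not yet delivered
        self.pending = {}       # seq -> payload for packets that arrived early

    def receive(self, seq, payload):
        """Accept one packet; return the payloads that can now be delivered."""
        if seq < self.next_expected or seq in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        delivered = []
        # Release every buffered packet that is now contiguous.
        while self.next_expected in self.pending:
            delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        return delivered
```

For example, if packet 1 arrives before packet 0, the receiver holds it back and releases both once packet 0 arrives; a later copy of packet 1 is ignored.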
Communicating processes establish a connection before communicating over a stream. Once connected, they read from and write to the stream without needing Internet addresses and ports. Establishing a connection involves a connect request-accept sequence.
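The connect and accept sequence can be sketched with Python's socket module. In this hypothetical example the server upper-cases whatever it reads; the function names, the loopback address, and the 4-byte read size are assumptions chosen to show that the reader's chunk size need not match the writer's.

```python
import socket
import threading


def echo_upper_once(listener):
    """Server side: accept one connection and echo its bytes in upper case."""
    conn, _client_address = listener.accept()
    with conn:
        while True:
            chunk = conn.recv(4)  # the reader chooses its own chunk size
            if not chunk:         # empty read: the client closed its side
                break
            conn.sendall(chunk.upper())


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen(1)
    server = threading.Thread(target=echo_upper_once, args=(listener,))
    server.start()

    # Client side: connect, then use the stream without naming addresses again.
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b"hello ")   # two writes of different sizes...
        client.sendall(b"stream")
        client.shutdown(socket.SHUT_WR)  # signal end of data to the server
        received = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            received += chunk       # ...arrive as one undivided byte stream

    server.join()
    listener.close()
    return received
```

After the connection is established, neither side passes an address or port to its read and write calls; message boundaries from the two writes are not preserved on the stream.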
Failure Model
To satisfy integrity, TCP streams use checksums to detect and reject corrupt packets and sequence numbers to detect and reject duplicate packets.
Data Representation: Marshalling, Unmarshalling, and XML
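Marshalling is the process of assembling a collection of data items into a form suitable for transmission in a message. Unmarshalling is the reverse process: disassembling the message on arrival to produce an equivalent collection of data items at the destination.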
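As an illustration, the sketch below marshals a small record into bytes and unmarshals it again using Python's struct module. The record layout (two length-prefixed UTF-8 strings followed by a 32-bit unsigned integer, all in network byte order) and the function names are assumptions for this example, not a standard format.

```python
import struct


def marshal_person(name: str, place: str, year: int) -> bytes:
    """Flatten (name, place, year) into a byte sequence for transmission."""
    name_bytes = name.encode("utf-8")
    place_bytes = place.encode("utf-8")
    return (
        struct.pack("!I", len(name_bytes)) + name_bytes      # length, then string
        + struct.pack("!I", len(place_bytes)) + place_bytes
        + struct.pack("!I", year)                            # 32-bit unsigned int
    )


def unmarshal_person(message: bytes):
    """Rebuild (name, place, year) from the byte sequence."""
    offset = 0
    strings = []
    for _ in range(2):
        (length,) = struct.unpack_from("!I", message, offset)
        offset += 4
        strings.append(message[offset:offset + length].decode("utf-8"))
        offset += length
    (year,) = struct.unpack_from("!I", message, offset)
    return strings[0], strings[1], year
```

Both sides must agree on the layout in advance, since nothing in the byte sequence itself describes the types of the items it contains.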
XML (Extensible Markup Language)
XML defines a textual format for representing structured data. Originally for documents with self-describing structured data (e.g., web documents), it’s now used for messages exchanged by clients and servers in web services.
Other Data Representations
CORBA’s common data representation handles external representation for structured and primitive types passed as arguments and results of remote method invocations in CORBA.
Java’s object serialization handles flattening and external data representation of single objects or object trees for message transmission or disk storage.
XML Details
XML is a markup language defined by the World Wide Web Consortium (W3C) for general web use.
HTML is for defining web page appearance, while XML is for writing structured documents for the web.
XML data items are tagged with ‘markup’ strings. The tags describe the logical structure of the data and associate attribute-value pairs with that structure.
XML is used for client-web service communication and for defining web service interfaces and properties.
Attributes and Names
A start tag can include attribute name-value pairs, such as id="123456789".
Tag and attribute names in XML generally start with a letter, but can also start with an underscore or a colon, although the colon is conventionally reserved for namespace prefixes.
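To make the tag and attribute structure concrete, the sketch below parses a small document with Python's xml.etree.ElementTree. The person element and its child elements are hypothetical sample data; only the id attribute value is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the id attribute sits in the start tag.
DOCUMENT = """<person id="123456789">
  <name>Smith</name>
  <place>London</place>
  <year>1984</year>
</person>"""


def describe(xml_text):
    """Return the root tag, its attributes, and the text of each child element."""
    root = ET.fromstring(xml_text)
    children = {child.tag: child.text for child in root}
    return root.tag, dict(root.attrib), children
```

Note that every value comes back as a string, including the year: XML itself carries character data, and interpreting it as a number is left to the application or to a schema.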
Binary Data and Namespaces
All information in XML elements must be expressed as character data, so binary data has to be encoded as characters (for example in base64) before it can be included.
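One widely used way to carry binary data as characters is base64 encoding. The sketch below wraps raw bytes in an XML element and recovers them again; the element name and helper functions are assumptions for illustration.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(tag: str, payload: bytes) -> str:
    """Encode raw bytes as base64 text inside an XML element."""
    element = ET.Element(tag)
    element.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def unwrap_binary(xml_text: str) -> bytes:
    """Recover the original bytes from the element's character data."""
    return base64.b64decode(ET.fromstring(xml_text).text)
```

Base64 represents every 3 bytes as 4 characters, so the encoded form is about one third larger than the original data.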
An XML namespace is a set of names for element types and attributes, referenced by a URL.
An element declares the namespace it uses with an attribute called xmlns, whose value is a URL referring to the file containing the namespace definitions.
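The effect of an xmlns declaration can be seen by parsing a namespaced document. In the sketch below, the namespace URL http://example.org/person and the pers prefix are made-up placeholders; ElementTree expands each prefixed name into the form {URL}name.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the xmlns:pers attribute binds the prefix "pers" to a URL.
DOCUMENT = """<person pers:id="123456789" xmlns:pers="http://example.org/person">
  <pers:name>Smith</pers:name>
</person>"""

NAMESPACES = {"p": "http://example.org/person"}  # local prefix chosen by the reader


def qualified_names(xml_text):
    """Return every element tag and the root's attributes with namespaces expanded."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()], dict(root.attrib)


def find_name(xml_text):
    """Look up the namespaced name element using our own prefix mapping."""
    return ET.fromstring(xml_text).find("p:name", NAMESPACES).text
```

The prefix used when reading (p) differs from the one in the document (pers); only the URL identifies the namespace.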
XML Schema
An XML schema defines the elements and attributes in a document, their nesting, order, number, and whether an element is empty or includes text.
Remote Procedure Call (RPC) and XML-RPC
Request-Reply Protocol
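The request-reply protocol is based on three communication primitives: doOperation, getRequest, and sendReply. A client uses doOperation to invoke a remote operation and wait for the result; a server uses getRequest to acquire a request and sendReply to return the reply to the client.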
Message Identifiers
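Any scheme that manages messages to provide extra properties, such as reliable delivery or request-reply communication, requires each message to have a unique identifier. Such an identifier typically combines a request number chosen by the sender with an identifier for the sending process, such as its address and port.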
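The three primitives and the role of the request identifier can be sketched over UDP as follows. This is an illustrative simplification under stated assumptions: the 4-byte identifier header, the timeout and retry values, and the Python function names (do_operation, get_request, send_reply) are all choices made for this example.

```python
import itertools
import socket
import struct
import threading

HEADER = struct.Struct("!I")       # assumed layout: 32-bit request identifier
_request_ids = itertools.count(1)  # each new request gets the next identifier


def do_operation(sock, server_address, arguments: bytes, timeout=1.0, retries=3):
    """Client: send a request, then block until the matching reply arrives."""
    request_id = next(_request_ids)
    request = HEADER.pack(request_id) + arguments
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(request, server_address)
        try:
            while True:
                reply, _ = sock.recvfrom(65535)
                (reply_id,) = HEADER.unpack_from(reply)
                if reply_id == request_id:
                    return reply[HEADER.size:]
                # A reply to an earlier request: ignore it and keep waiting.
        except socket.timeout:
            continue  # no reply in time: retransmit the same request
    raise TimeoutError("no reply from server")


def get_request(sock):
    """Server: wait for a request; return (request_id, arguments, client_address)."""
    message, client_address = sock.recvfrom(65535)
    (request_id,) = HEADER.unpack_from(message)
    return request_id, message[HEADER.size:], client_address


def send_reply(sock, request_id, result: bytes, client_address):
    """Server: return the result, tagged with the request's identifier."""
    sock.sendto(HEADER.pack(request_id) + result, client_address)


def demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))

        def serve_one():
            request_id, arguments, address = get_request(server)
            send_reply(server, request_id, arguments[::-1], address)  # reverse bytes

        worker = threading.Thread(target=serve_one)
        worker.start()
        result = do_operation(client, server.getsockname(), b"abc")
        worker.join()
    return result
```

Because the reply echoes the request identifier, the client can discard stale replies and safely retransmit a request whose reply was lost. A production server would also need to recognise retransmitted requests, which this sketch does not attempt.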
RPC Overview
RPC is a mechanism to call a procedure or function on a remote computer.
RPC is an older technology than the Web, providing developers with a mechanism for defining interfaces callable over a network.
XML-RPC
XML-RPC allows programs to make function or procedure calls across a network.
XML-RPC uses the HTTP protocol to pass information from a client to a server.
Requests and Responses
XML-RPC requests combine XML content and HTTP headers. The XML content uses data typing to pass parameters and identifies the procedure being called, while HTTP headers wrap the request for web transmission.
XML-RPC responses can contain only one parameter, which can be an array or struct, allowing multiple values to be returned.
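The XML part of a request and a response can be inspected with Python's standard xmlrpc.client module, without sending anything over the network. In this sketch the method name sample.add and the returned struct are made-up examples; the HTTP headers that would wrap these payloads are not shown.

```python
import xmlrpc.client


def build_request(method_name, *params):
    """Return the XML body of a request: method name plus typed parameters."""
    return xmlrpc.client.dumps(params, methodname=method_name)


def build_response(value):
    """Return the XML body of a response, which carries exactly one parameter."""
    return xmlrpc.client.dumps((value,), methodresponse=True)


def parse(xml_text):
    """Decode either kind of body into (params, method_name)."""
    return xmlrpc.client.loads(xml_text)


# A struct lets a single response parameter carry several values.
request_xml = build_request("sample.add", 2, 3)
response_xml = build_response({"sum": 5, "items": [1, 2]})
```

Printing request_xml shows a methodCall element containing a methodName and a params list; response_xml is a methodResponse whose single param holds a struct.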
XML-RPC Components
XML-RPC consists of three parts:
- XML-RPC data model: A set of types for passing parameters, return values, and faults (error messages).
- XML-RPC request structures: An HTTP POST request containing method and parameter information.
- XML-RPC response structures: An HTTP response containing return values or fault information.
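The three parts can be seen working together in a small end-to-end sketch using Python's standard xmlrpc.server and xmlrpc.client modules. The method name sample.add, the loopback address, and the deliberately missing method used to provoke a fault are assumptions for illustration.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure exposed to remote callers."""
    return a + b


def start_server():
    """Start an XML-RPC server on a free local port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "sample.add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    server = start_server()
    host, port = server.server_address
    try:
        with xmlrpc.client.ServerProxy(f"http://{host}:{port}/") as proxy:
            total = proxy.sample.add(2, 3)  # sent as an HTTP POST with an XML body
            try:
                proxy.sample.missing()      # not registered: server returns a fault
                fault_code = None
            except xmlrpc.client.Fault as fault:
                fault_code = fault.faultCode
    finally:
        server.shutdown()
        server.server_close()
    return total, fault_code
```

The successful call returns a value through the response structure, while the unknown method comes back as a fault, which the client library raises as an exception carrying a fault code and message.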