NodeJS Streaming HTTP Chunked Responses

NodeJS is extremely stream-oriented, and that includes HTTP responses. Since HTTP is a first-class protocol in Node, making an HTTP response streamable is very convenient.

HTTP chunked encoding allows a server to keep sending data to the client without ever sending the body size. Unless we specify a "Content-Length" header, the Node HTTP server sends the header

Transfer-Encoding: chunked

to the client, which makes it wait for a final chunk of length 0 before treating the response as terminated.

This can be useful for streaming data (text, audio, video, or anything else) to the HTTP client.

Streaming Example

Here we are going to code a server that pipes the output of a child process into the client:

var spawn = require('child_process').spawn;

require('http').createServer(function(req, res) {
    var child = spawn('tail', ['-f', '/var/log/system.log']);
    child.stdout.pipe(res);
    res.on('close', function() {
        child.kill();
    });
}).listen(4000);

Here we are creating an HTTP server and binding it to port 4000.

When there is a new request, we launch a new child process by executing the command "tail -f /var/log/system.log", whose output is piped into the response.

When the response ends (because the browser window was closed, the network connection was severed, etc.), we kill the child process so it does not hang around indefinitely.

The post NodeJS Streaming HTTP Chunked Responses appeared first on Xathrya.ID.

NodeJS Child Processes

Child process is a process created by another process (the parent process). The child inherits most of its parent’s attributes, such as file descriptors.

On Node, we can spawn child processes, which can be other Node processes or any program we can launch from the command line. We have to provide the command and the arguments to execute it. We can either spawn the process and live alongside it (spawn), or wait until it exits (exec).

Executing Command

We can launch another process and wait for it to finish like this:

var exec = require('child_process').exec;

exec('cat *.js | wc -l', function(err, stdout, stderr) {
    if (err) {
        console.log('child process exited with error code ' + err.code);
        return;
    }
    console.log(stdout);
});

Here we are executing a command, represented as a string just like what we would type on a terminal, as the first argument of exec(). Our command is "cat *.js | wc -l", which pipes two commands together: the first prints the content of every file with a .js extension, and the second counts the total number of lines coming through the pipe. The second argument of exec() is a callback which will be invoked once the command has finished.

If the child process returned an error code, the first argument of the callback will contain an instance of Error, with the code property set to the child exit code.

If not, the output of stdout and stderr will be collected and be offered to us as strings.

We can also pass an optional options argument between the command and the callback function:

var options = { timeout: 10000 };
exec('cat *.js | wc -l', options, function (err, stdout, stderr) { ... });

The available options are:

  • encoding: the expected encoding for the child output. Defaults to ‘utf8’.
  • timeout: the timeout in milliseconds for the execution of the command. Defaults to 0, which means no timeout.
  • maxBuffer: the maximum size of the output allowed on stdout or stderr. If exceeded, the child is killed. Defaults to 200 * 1024.
  • killSignal: the signal to be sent to the child if it times out or exceeds the output buffers. Identified as a string.
  • cwd: the working directory the command will operate in.
  • env: environment variables to be passed to the child process. Defaults to null.

On the killSignal option, we can pass a string identifying the name of the signal we wish to send to the target process. Signals are identified in node as strings.

Spawning Processes

In the previous section we saw how to execute a command and wait for it to finish. In Node we can also spawn a new child process with the child_process.spawn function:

var spawn = require('child_process').spawn;

var child = spawn('tail', ['-f', 'file.txt']);
child.stdout.on('data', function(data) {
    console.log('stdout: ' + data);
});
child.stderr.on('data', function(data) {
    console.log('stderr: ' + data);
});

Here we are spawning a child process to run the "tail" command. Since tail needs arguments, we pass an array of strings as the second argument of spawn(). Here tail receives the arguments "-f" and "file.txt", which make it monitor the file "file.txt" (if it exists) and output every new chunk of data appended to it to stdout.

We also listen to the child's stdout and print its output. So, in this case, we are piping the changes to the "file.txt" file into our Node application. We also print out the stderr stream.

Killing Process

We can (and should) eventually kill a child process by calling the kill method on the child object. Otherwise, it will become a zombie process.

var spawn = require('child_process').spawn;

var child = spawn('tail', ['-f', 'file.txt']);
child.stdout.on('data', function(data) {
    console.log('stdout: ' + data);
    child.kill();
});

In UNIX, this sends a SIGTERM signal to the child process.

We can also send another signal to the child process by specifying it inside the kill call like this:

child.kill('SIGKILL');


NodeJS Datagrams (UDP)

UDP is a connectionless protocol that does not provide the delivery guarantees that TCP does. When sending UDP packets, there is no guarantee of the order of packets and no guarantee that all packets will arrive: packets may arrive out of order, be duplicated, or not arrive at all.

On the other hand, UDP can be quite useful in certain cases, like when we want to broadcast data, when we don't need strict delivery and ordering guarantees, or even when we don't know the addresses of our peers.

NodeJS has the ‘dgram’ module to support datagram transmission.

Datagram Server

A server listening on a UDP port can look like this:

var dgram = require('dgram');

var server = dgram.createSocket('udp4');
server.on('message', function(message, rinfo) {
    console.log('server got message: ' + message +
                ' from ' + rinfo.address + ':' + rinfo.port);
});

server.on('listening', function() {
    var address = server.address();
    console.log('server listening on ' + address.address +
                ':' + address.port);
});

server.bind(4002);

The createSocket function accepts the socket type as the first argument, which can be either "udp4" (UDP over IPv4), "udp6" (UDP over IPv6) or "unix_dgram" (datagrams over a UNIX domain socket).

When you run the script, the server prints its address and port, and then waits for messages.

We can test it using a tool like netcat:

echo 'hello' | netcat -c -u -w 1 localhost 4002

This sends a UDP packet containing "hello" to localhost port 4002, which our program listens on. You should then see the server print something like:

server got message: hello
from 127.0.0.1:54950

Datagram Client

To create a UDP client that sends UDP packets, we can do something like:

var dgram = require('dgram');

var client = dgram.createSocket('udp4');

var message = new Buffer('this is a message');
client.send(message, 0, message.length, 4002, 'localhost');
client.close();

Here we are creating a client using the same createSocket function we used to create the server, with the difference that we don't bind.

You have to be careful not to change the buffer you pass to client.send before the message has been sent. If you need to know when your message has been flushed to the kernel, you should pass a callback; when it is invoked, the buffer may be reused.

client.send(message, 0, message.length, 4002, 'localhost', function() {
    // buffer can be reused now
});

Since we are not binding, the message is sent from a random UDP port. If we want to send from a specific port, we use client.bind(port).

var dgram = require('dgram');

var client = dgram.createSocket('udp4');

var message = new Buffer('this is a message');
client.bind(4001);
client.send(message, 0, message.length, 4002, 'localhost');
client.close();

Binding a port on the client blurs the line between what a server and a client are, but it can be useful for maintaining conversations like this:

var dgram = require('dgram');

var client = dgram.createSocket('udp4');

var message = new Buffer('this is a message');
client.bind(4001);
client.send(message, 0, message.length, 4002, 'localhost');
client.on('message', function(message, rinfo) {
    console.log('and got the response: ' + message);
    client.close();
});

Here we are sending a message and also listening to messages. When we receive one message we close the client.

Don’t forget that UDP is unreliable. Whatever protocol we devise on top of it, messages may be lost or arrive out of order.
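A common remedy is to number the datagrams ourselves so the receiver can at least detect loss and reordering. The framing below ("&lt;sequence&gt;:&lt;payload&gt;") is purely hypothetical, made up for illustration:

```javascript
// Sender side: prefix every payload with a sequence number.
function encode(seq, payload) {
  return Buffer.from(seq + ':' + payload);
}

// Receiver side: split the sequence number back out of the datagram.
function decode(buffer) {
  var text = buffer.toString();
  var sep = text.indexOf(':');
  return {
    seq: parseInt(text.slice(0, sep), 10),
    payload: text.slice(sep + 1)
  };
}

var message = decode(encode(7, 'hello'));
console.log(message.seq);       // 7
console.log(message.payload);   // hello

// The receiver tracks the next expected sequence number; a mismatch
// means a datagram was lost, duplicated, or delivered out of order.
var expected = 7;
console.log(message.seq === expected);   // true
```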

Datagram Multicast

One of the interesting uses of UDP is to distribute a message to several nodes using only one network message: multicast. Multicasting can be useful when we don't want to know the addresses of all our peers. Peers just have to "tune in" and listen to the multicast channel.

Nodes can report their interest in listening to certain multicast channels by "tuning" into those channels. In IP addressing there is a space reserved for multicast addresses. In IPv4 the range is 224.0.0.0 to 239.255.255.255, but some of these are reserved: 224.0.0.0 through 224.0.0.255 is reserved for local purposes (such as administrative and maintenance tasks), and the range 239.0.0.0 to 239.255.255.255 has also been reserved for "administrative scoping".

Receiving Multicast Message

To join a multicast address, for example 230.1.2.3, we can do something like this:

var server = require('dgram').createSocket('udp4');

server.on('message', function(message, rinfo) {
    console.log('server got message: ' + message + ' from ' + 
                 rinfo.address + ':' + rinfo.port);
});

server.addMembership('230.1.2.3');
server.bind(4002);

Here we tell the kernel that this UDP socket should receive multicast messages for the multicast address 230.1.2.3. When calling addMembership, we can pass the listening interface as an optional second argument. If omitted, Node will try to listen on every public interface.

Then we can test the server using netcat like this:

echo 'hello' | netcat -c -u -w 1 230.1.2.3 4002

Sending Multicast Message

To send a multicast message we simply have to specify the multicast address:

var dgram = require('dgram');

var client = dgram.createSocket('udp4');

var message = new Buffer('this is a message');
client.setMulticastTTL(10);
client.send(message, 0, message.length, 4002, '230.1.2.3');
client.close();

Here, besides sending the message, we first set the multicast time-to-live to 10 (an arbitrary value here). The TTL tells the network how many hops (routers) the packet can travel through before it is discarded. Every time a UDP packet travels through a hop, the TTL counter is decremented; once it reaches 0, the packet is discarded.


NodeJS UNIX Sockets

A UNIX socket, or UNIX domain socket, also known as an IPC socket (inter-process communication socket), is a data communications endpoint for exchanging data between processes executing within the same host operating system. It offers functionality similar to named pipes, but UNIX domain sockets may be created as connection-mode or connectionless.

UNIX domain sockets use the file system as their address name space: they are referenced by processes as inodes in the file system. This allows two processes to open the same socket in order to communicate. However, communication occurs entirely within the operating system kernel.

Server

Node’s net.Server class not only supports TCP sockets, but also UNIX domain sockets.

To create a UNIX socket server, we create a normal net.Server but then make it listen on a file path instead of a port.

var net = require('net');

var server = net.createServer(function(socket) {
    // got a client connection here
});
server.listen('/path/to/socket');

UNIX domain socket servers present the exact same API as TCP servers.

If you are doing inter-process communication that is local to a host, consider using UNIX domain sockets instead of TCP sockets, as they should perform much better. For instance, when connecting Node to a front-end web server that sits on the same machine, UNIX domain sockets are generally preferable.

Client

Connecting to a UNIX socket server can be done by using net.createConnection, just as when connecting to a TCP server. The difference is in the argument: a socket path is passed in instead of a port.

var net = require('net');
var conn = net.createConnection('/path/to/socket');
conn.on('connect', function() {
    console.log('connected to unix socket server');
});

Passing File Descriptors Around

UNIX sockets have an interesting feature that allows us to pass file descriptors from one process into another. In UNIX, a file descriptor can be a pointer to an open file or a network connection, so this technique can be used to share files and network connections between processes.

For instance, to grab the file descriptor from a file read stream, we would use the fd attribute like this:

var fs = require('fs');
var readStream = fs.createReadStream('/etc/passwd', {flags: 'r'});
var fileDescriptor = readStream.fd;

and then we can pass it into a UNIX socket using the second or third argument of socket.write like this:

var socket = ...
// assuming it is UTF-8
socket.write('some string', fileDescriptor);

// specifying the encoding
socket.write('453d9ea499aa8247a54c951', 'base64', fileDescriptor);

On the other end, we can receive a file descriptor by listening to the “fd” event like this:

var socket = ...
socket.on('fd', function(fileDescriptor) {
    // now I have a file descriptor
});

We can then perform various Node API operations on it, depending on the type of file descriptor.

Read or Write into File

If it’s a file-system file descriptor, we can use the Node low-level “fs” module API to read or write data.

var fs = require('fs');
var socket = ...
socket.on('fd', function(fileDescriptor) {
    // write some
    var writeBuffer = new Buffer("here is my string");
    fs.write(fileDescriptor, writeBuffer, 0, writeBuffer.length);

    // read some
    var readBuffer = new Buffer(1024);
    fs.read(fileDescriptor, readBuffer, 0, readBuffer.length, 0,
    function(err, bytesRead) {
        if (err) {console.log(err); return; }
        console.log('read ' + bytesRead + ' bytes:');
        console.log(readBuffer.slice(0, bytesRead));
    });
});

We should be careful about the file open mode, though. If the file is opened with the "r" flag, no write operation can be performed.

Listen to the Server Socket

As another example of sharing a file descriptor between processes: if the file descriptor that was passed in is a server socket, we can create a server on the receiving end and associate it with the new file descriptor by using the server.listenFD method, like this:

var server = require('http').createServer(function(req, res) {
    res.end('Hello World!');
});

var socket = ...
socket.on('fd', function(fileDescriptor) {
    server.listenFD(fileDescriptor);
});

We can use listenFD() on an “http” or “net” server. In fact, on anything that descends from net.Server.


NodeJS TCP

TCP, or Transmission Control Protocol, is a connection-oriented protocol for data communication and transmission over a network. It provides reliability for transmitted data, so the packets sent or received are guaranteed to be in the correct format and order.

Node has a first-class HTTP module implementation, but it descends from the "bare-bones" TCP module. Being so, everything described here also applies to every class descending from the net module.

TCP Server

We can create a TCP server and client using the "net" module.

Here is how we create a TCP server:

require('net').createServer(function(socket) {
    // new connection
    socket.on('data', function(data) {
        // got data
    });

    socket.on('end', function() {
        // connection closed
    });

    socket.write('Some string');
}).listen(4001);

Here our server is created using the "net" module and listens on port 4001 (to distinguish it from our HTTP server on 4000). Our callback is invoked every time a new connection arrives, which is indicated by the "connection" event.

On this socket object, we can then listen for "data" events when we get a packet of data, and the "end" event when that connection is closed.

Listening

As we saw, after the server is created, we can bind it to a specific TCP port.

var port = 4001;
var host = '0.0.0.0';
server.listen(port, host);

The second argument (host) is optional. If omitted, the server will accept connections directed to any IP address.

This method is asynchronous. To be notified when the server is really bound we have to pass a callback.

//-- With host specified
server.listen(port, host, function() {
    console.log('server listening on port ' + port);
});

//-- Without host specified
server.listen(port, function() {
    console.log('server listening on port ' + port);
});

Write Data

We can pass in a string or buffer to be sent through the socket. If a string is passed in, we can specify an encoding as a second argument; if no encoding is specified, Node will assume UTF-8. The operation is much like in the HTTP module.

var flush = socket.write('453d9ea499aa8247a54c951', 'base64');

The socket object is an instance of net.Socket, which is a writeStream, so the write method returns a boolean, saying whether it flushed to the kernel or not.

We can also pass in a callback. This callback will be invoked when data is finally written out.

// with encoding specified
var flush = socket.write('453d9ea499aa8247a54c951', 'base64', function(){
    // flushed
});

// Assuming UTF-8
var flush = socket.write('Heihoo!', function(){
    // flushed
});

.end()

The .end() method is used to end the connection. It sends the TCP FIN packet, notifying the other end that this side wants to close the connection.

But we can still get "data" events after we have issued this: there still might be some data in transit, or the other end might insist on sending us some more data.

In this method, we can also pass in some final data to be sent:

socket.end('Bye bye!');

Other Methods

The socket object is an instance of net.Socket, and it implements the WriteStream and ReadStream interfaces, so all of those methods, like pause() and resume(), are available. We can also bind to "drain" events like any other stream object.

Idle Sockets

A socket can be idle for some time; for example, no data may have been received for a while. When this condition happens, we can be notified by calling setTimeout():

var timeout = 60000;    // 1 minute
socket.setTimeout(timeout);
socket.on('timeout', function() {
    socket.write('idle timeout, disconnecting, bye!');
    socket.end();
});

or in shorter form:

socket.setTimeout(60000, function() {
    socket.end('idle timeout, disconnecting, bye!');
});

Keep-Alive

Keep-alive is a mechanism for detecting dead peers and preventing an idle connection from being dropped. The concept is very simple: when we set up a TCP connection, we associate a set of timers with it, some of which deal with the keep-alive procedure. When the keep-alive timer reaches zero, we send our peer a keep-alive probe packet with no data in it and the ACK flag turned on.

In Node, all this functionality has been simplified. We can enable keep-alive by invoking:

socket.setKeepAlive(true);

We can also specify the delay between the last packet received and the next keep-alive probe as a second argument to the call:

socket.setKeepAlive(true, 10000);    // 10 seconds

Delay or No Delay

When sending off TCP packets, the kernel buffers data before sending it off and uses the Nagle algorithm to determine when to send the data. If you wish to turn this off and demand that data gets sent immediately after write commands, use:

socket.setNoDelay(true);

Of course, we can re-enable the buffering behavior by simply invoking it with a false value.

Connection Close

The server.close() method closes the server, preventing it from accepting new connections. This function is asynchronous, and the server will emit the "close" event when it has actually closed:

var server = ...
server.close();
server.on('close', function() {
    console.log('server closed!');
});

TCP Client

We can create a TCP client which connects to a TCP server using the "net" module.

var net = require('net');
var port = 4001;
var host = 'www.google.com';
var conn = net.createConnection(port, host);

Here, if we omit the host when creating the connection, it defaults to localhost.

Then we can listen for data.

conn.on('data', function(data) {
    console.log('some data has arrived');
});

or send some data.

conn.write('I send you some string');

or end it.

conn.end();

and also listen for the "close" event (emitted when the connection is closed, whether by us or by the peer):

conn.on('close', function(data) {
    console.log('connection closed');
});

The socket conforms to the ReadStream and WriteStream interfaces, so we can use all of the previously described methods on it.

Error Handling

When handling a socket on the client or the server, we can (and should) handle the errors by listening to the “error” event.

Here is a simple template for how to do it:

require('net').createServer(function(socket) {
    socket.on('error', function(error) {
        // do something
    });
});

If we don’t catch an error, Node will treat it as an uncaught exception and terminate the current process. Unless that’s what we want, we should handle the errors.


NodeJS Streams, Pump, and Pipe

A stream is an object that represents a generic sequence of bytes. Any type of data can be stored as a sequence of bytes, so the details of writing and reading data can be abstracted.

Node has a useful abstraction for streams; more specifically, two abstractions: Read Streams and Write Streams. They are implemented throughout several Node objects, and they represent inbound (ReadStream) or outbound (WriteStream) flows of data.

Though reading from and writing to streams is not special in any programming language, Node has a unique way of doing it.

ReadStream

A ReadStream is like a faucet of data. The way a stream is created depends on the type of stream itself. After you have created one, you can: wait for data, know when it ends, pause it, and resume it.

Wait for data

By binding to the "data" event we can be notified every time a chunk is delivered by that stream. It can be delivered as a buffer or as a string.

If we use stream.setEncoding(encoding), the "data" events pass in strings. If we don't set an encoding, the "data" events pass in buffers.

var readStream = ...
readStream.on('data', function(data) {
    // data is a buffer
});

var readStream = ...
readStream.setEncoding('utf8');
readStream.on('data', function(data) {
    // data is utf8-encoded string
});

So the data passed in on the first example is a buffer, and on the second one it is a string, because we specified the UTF-8 encoding.

The size of each chunk may vary; it may depend on the buffer size or on the amount of available data, so it can be unpredictable.

Know when it ends

A stream can end, and we can know when that happens by binding to the "end" event.

var readStream = ...
readStream.on('end', function() {
    console.log('the stream has ended');
});

Pause

A read stream is like a faucet, and we can keep the data from coming in by pausing it.

readStream.pause();

Resume

If a stream is paused, we can resume it and the stream will start flowing again.

readStream.resume();

WriteStream

A WriteStream is an abstraction for somewhere you can send data to. It can be a file, a network connection, or even an object that outputs transformed data.

When we have a WriteStream object, we can do two operations: write and wait for it to drain.

Write

We can write data to a stream. The data can be in string format or buffer format.

By default, the write operation will treat a string as a UTF-8 string unless it is told otherwise.

var writeStream = ...;

writeStream.write('this is an utf-8 string');
writeStream.write('7e3e4acde5ad240a8ef5e731e644fbd1', 'base64');

For writing a buffer, we can slightly modify it to

var writeStream = ...;
var buffer = new Buffer('this is a buffer with some string');
writeStream.write(buffer);

Wait for it to drain

Node does not block on I/O operations, so it does not block on read or write commands. On write commands, if Node is not able to flush the data into the kernel buffers, it will buffer that data, storing it in our process memory. Because of this, writeStream.write() returns a boolean: if write() manages to flush all the data to the kernel buffer, it returns true; if not, it returns false.

When a writeStream manages to flush the data into the kernel buffers, it emits a "drain" event, so we can listen for it like this:

var writeStream = ...;
writeStream.on('drain', function() { console.log('drain emitted'); });

Stream by Example

FileSystem stream

We can create a read stream for a file path.

var fs = require('fs');

var rs = fs.createReadStream('/path/to/file');

We can also pass some options to .createReadStream(), for example: the start and end positions in the file, the encoding, the flags, and the buffer size. Below are the default option values:

{
    flags: 'r',
    encoding: null,
    fd: null,
    mode: 0666,
    bufferSize: 64*1024
}

We can also create a write stream

var fs = require('fs');
var ws = fs.createWriteStream('/path/to/file', options);

which also accepts a second argument with an options object. Below are the default option values:

{
    flags: 'w',
    encoding: null,
    mode: 0666
}

We can also override just a single option if necessary.

var fs = require('fs');
var ws = fs.createWriteStream('/path/to/file', {encoding: 'utf8'});

Case Study: Slow Client Problem

As said before, Node does not block on writes, and it buffers the data if the write cannot be flushed into the kernel buffers. Now suppose we are pumping data into a write stream (like a TCP connection to a browser) and our source of data is a read stream (like a file ReadStream):

var fs = require('fs');

require('http').createServer(function(req, res) {
   var rs = fs.createReadStream('/path/to/big/file');
   rs.on('data', function(data) {
      res.write(data);
   });
   rs.on('end', function() {
      res.end();
   });
});

If the file is local, the read stream should be fast. Now, if the connection to the client is slow, the writeStream will be slow. So the readStream "data" events will happen quickly, the data will be sent to the writeStream, but eventually Node will have to start buffering the data because the kernel buffers are full.

What will happen then is that the /path/to/big/file file will be buffered in memory for each request, and if we have many concurrent requests, Node memory consumption will inevitably increase, which may lead to other problems, like swapping, thrashing and memory exhaustion.

To address this problem we have to make use of pause and resume on the read stream, pacing it alongside the write stream so our memory does not fill up:

var fs = require('fs');

require('http').createServer(function(req, res) {
   var rs = fs.createReadStream('/path/to/big/file');
   rs.on('data', function(data) {
      if(!res.write(data)) {
         rs.pause();
      }
   });
   res.on('drain', function() {
      rs.resume();
   });
   rs.on('end', function() {
      res.end();
   });
});

We pause the readStream if the write cannot be flushed to the kernel, and we resume it when the writeStream is drained.

Pump

What we just described is a recurring pattern, and instead of this complicated chain of events we can simply use util.pump(), which does exactly that:

var util = require('util');
var fs = require('fs');

require('http').createServer(function(req, res) {
    var rs = fs.createReadStream('/path/to/big/file');
    util.pump(rs, res, function() {
        res.end();
    });
});

util.pump() accepts 3 arguments: the readable stream, the writable stream, and a callback invoked when the read stream ends.

Pipe

There is another approach we can use: pipe. A ReadStream can be piped into a WriteStream in the same fashion, simply by calling pipe(destination).

var fs = require('fs');

require('http').createServer(function(req, res) {
    var rs = fs.createReadStream('/path/to/big/file');
    rs.pipe(res);
});

By default, end() is called on the destination when the read stream ends. We can prevent that behavior by passing in end: false on the second argument options object like this:

var fs = require('fs');

require('http').createServer(function(req, res) {
    var rs = fs.createReadStream('/path/to/big/file');
    rs.pipe(res, {end: false});
    rs.on('end', function() {
        res.end("And that's all folks!");
    });
});

Creating Own Read and Write Streams

We can implement our own read and write streams.

ReadStream

When creating a readable stream, we have to implement the following methods:

  • setEncoding(encoding)
  • pause()
  • resume()
  • destroy()

and emit the following events:

  • “data”
  • “end”
  • “error”
  • “close”
  • “fd” (not mandatory)

We should also implement the pipe() method, but we can lend some help from Node by inheriting from Stream.

var MyClass = ...
var util = require('util'),
    Stream = require('stream').Stream;
util.inherits(MyClass, Stream);

This will make the pipe method available at no extra cost.

WriteStream

To implement our own WriteStream-like pseudo-class, we should provide the following methods:

  • write(string, encoding='utf8', [fd])
  • write(buffer)
  • end()
  • end(string, encoding)
  • end(buffer)
  • destroy()

and emit the following events:

  • “drain”
  • “error”
  • “close”


NodeJS HTTP

HTTP, or Hyper Text Transfer Protocol, is an application protocol for distributed, collaborative, hypermedia information systems. It is the protocol that forms the foundation of data communication for the World Wide Web.

HTTP in most cases uses a client-server architecture: one (or more) server serves several clients.

NodeJS abstracts HTTP behavior in a way that lets us build scalable applications.

HTTP Server

Using Node, we can easily create an HTTP server. For example:

var http = require('http');

var server = http.createServer();
server.on('request', function(req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.write('Hello World!');
    res.end();
});
server.listen(4000);

We use the ‘http’ module, which is a further encapsulation of what the ‘net’ module does, specialized for the HTTP protocol.

Every server must bind to and listen on a port. In our example, our server listens on port 4000. The server handles the "request" event, which is triggered when a client requests or connects to our server. We set a callback which has two arguments: request and response.

Our callback receives two objects, in our example req (request) and res (response). The request is an object which encapsulates all the request data sent to our server. The response is an object which we send back to the client. Here, when we receive a request from a client, we write a response. That's what HTTP does, as simple as that.

A response is composed of two parts: header and body. We write the header with a content-type that indicates plain text. In the body, we have the string ‘Hello World!’.

If you run this script on node, you can then point your browser to http://localhost:4000 and you should see the “Hello World!” string on it.

We can shorten the example to be:

require('http').createServer(function(req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World!');
}).listen(4000);

Here we give up the intermediary variables for storing the http module (since we only need to call it once) and the server (since we only need it to listen on port 4000). Also, as a shortcut, http.createServer accepts a callback function that will be invoked on every request.

Request Object

The request object (the first argument of the callback) is an instance of the http.ServerRequest class. It has several important properties.

.url

This is the URL of the request. It does not contain schema, hostname, or port, but it contains everything after that.

We can try to analyze the url by:

require('http').createServer(function(req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end(req.url);
}).listen(4000);

The URL is the resource requested by the client. For example:

  • http://localhost:4000/ means the URL requested is /
  • http://localhost:4000/index.html means the URL requested is /index.html
  • http://localhost:4000/controller/index.js means the URL requested is /controller/index.js
  • etc

.method

This contains the HTTP method used on the request. It can be ‘GET’, ‘POST’, ‘DELETE’, or any valid HTTP request method.

.headers

This contains an object with a property for every HTTP header on the request.

We can analyze the headers by:

var util = require('util');

require('http').createServer(function(req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end(util.inspect(req.headers));
}).listen(4000);

req.headers properties are lower-cased. For instance, if the browser sent a “Cache-Control: max-age=0” header, req.headers will have a property named “cache-control” with the value “max-age=0” (the value itself is untouched).

Response Object

The response object is the second argument of the callback. It is used to reply to the client and is an instance of the http.ServerResponse class.

Write Header

A response, like a request, is composed of header and body. The header contains a property for every header we want to send. We can use res.writeHead(status, headers) to write the header.

For example:

require('http').createServer(function(req, res) {
    res.writeHead(200, {
        'Content-Type': 'text/plain',
        'Cache-Control': 'max-age=3600'
    });
    res.end('Hello World!');
}).listen(4000);

On this example, we set 2 headers: one with “Content-Type: text/plain” and another with “Cache-Control: max-age=3600”.

Change or Set a Header

We can change a header which is already set, or set a new one, by using:

res.setHeader(name, value);

This will only work if we haven’t already sent a piece of the body by using res.write().

Remove a Header

We can also remove a header we have already set by using:

res.removeHeader(name);

This will only work if we haven’t already sent a piece of the body by using res.write().

Write Response Body

To write a response, we can use:

// write a simple string
res.write('Here is a string');

// write an existing buffer
var buf = new Buffer('Here is a buffer');
buf[0] = 45;
res.write(buf);

This method can, as expected, be used to reply with dynamically generated strings or binary files.

HTTP Client

Creating an HTTP client using Node is also possible and easy. The same ‘http’ module can be used to create an HTTP client as well as a server.

.get()

HTTP GET is a simple request for a URL.

In this example, we send an HTTP GET request to the url http://www.google.com:80/index.html.

var http = require('http');

var options = {
  host: 'www.google.com',
  port: 80,
  path: '/index.html'
};

http.get(options, function(res) {
  console.log('got response: ' + res.statusCode);
}).on('error', function(err) {
  console.log('got error: ' + err.message)
});

.request()

Using http.request, we can make any type of HTTP request (not limited to HTTP GET only).

http.request(options, callback);

The options argument is an object which describes the host we want to connect to. It is composed of:

  • host: a domain name or IP address of the server to issue the request to
  • port: Port of remote server
  • method: a string specifying the HTTP request method. Possible values: GET, POST, PUT, DELETE
  • path: Request path. It should include the query string and fragments if any, e.g. ‘/index.html?page=12’
  • headers: an object containing request headers.

For example:

var options = {
    host: 'www.google.com',
    port: 80,
    path: '/upload',
    method: 'POST'
};

var req = require('http').request(options, function(res) {
    console.log('STATUS: ' + res.statusCode);
    console.log('HEADERS: ' + JSON.stringify(res.headers));
    res.setEncoding('utf8');
    res.on('data', function (chunk) {
      console.log('BODY: ' + chunk);
    });
});

// write data to request body
req.write('data ');
req.write('data ');
req.end();

We write the HTTP request body data (two “data ” strings, via req.write()) and end the request immediately. Only then does the server reply and the response callback get invoked.

We wait for the response. When it comes, a ‘response’ event fires, which we listen to with the callback function. At that point we only have the HTTP status and headers ready, which we print.

Then we bind to ‘data’ events. These fire whenever we receive a chunk of the response body.

This mechanism can be used to stream data from a server. As long as the server keeps sending body chunks, we keep receiving them.

The post NodeJS HTTP appeared first on Xathrya.ID.

NodeJS Low Level File System Operation

Node has a nice streaming API for dealing with files in an abstract way, as if they were network streams. But sometimes we might need to go down a level and deal with the filesystem itself. Node facilitates this by providing low-level file system operations via the fs module.

Get File Metainfo

Metainfo is information about a file or directory. In the POSIX API, we use the stat() and fstat() functions for this. Node, which is inspired by POSIX, takes this approach too: stat() and fstat() are encapsulated in the fs module.

var fs = require('fs');

fs.stat('file.txt', function(err, stats) {
    if (err) { console.log (err.message); return }
    console.log(stats);
});

Here we require the fs module and call stat(). A callback is set with two arguments: the first holds the error if one occurred, and the second holds the stat information if the call succeeded.

If it succeeded, the callback function might print something like this (result taken on Cygwin64 on Windows 8):

{ dev: 0,
  mode: 33206,
  nlink: 1,
  uid: 0,
  gid: 0,
  rdev: 0,
  ino: 0,
  size: 2592,
  atime: Thu Sep 12 2013 19:25:05 GMT+0700 (SE Asia Standard Time),
  mtime: Thu Sep 12 2013 19:27:57 GMT+0700 (SE Asia Standard Time),
  ctime: Thu Sep 12 2013 19:25:05 GMT+0700 (SE Asia Standard Time) }

stats is a Stats instance, an object on which we can call some methods:

stats.isFile();
stats.isDirectory();
stats.isBlockDevice();
stats.isCharacterDevice();
stats.isSymbolicLink();
stats.isFIFO();
stats.isSocket();

If we have a plain file descriptor, we can use fs.fstat(fileDescriptor, callback) instead.

When using the low-level filesystem API in Node, we get file descriptors as a way to represent files. These file descriptors are plain integers given by the kernel that represent a file in the Node process, much like in the C POSIX API.

Open and Close a File

Opening a file is a simple matter of using fs.open():

var fs = require('fs');
fs.open('path/to/file', 'r', function(err, fd) {
    // got file descriptor (fd)
});

It’s like the C open() function, if you are familiar with it.

The first argument of fs.open is the file path. The second argument is the flags, which indicate the mode in which the file is to be opened. The valid flags are ‘r’, ‘r+’, ‘w’, ‘w+’, ‘a’, or ‘a+’.

  • r = open text file for reading. The stream is positioned at the beginning of the file.
  • r+ = open for reading and writing. The stream is positioned at the beginning of the file.
  • w = truncate file to zero length or create text file for writing. The stream is positioned at the beginning of the file.
  • w+ = open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.
  • a = open for writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the current end of file.
  • a+ = open for reading and writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the current end of file.

In the callback function, we get the file descriptor (fd) as the second argument. It is a handle for reading from and writing to the file that fs.open() opened.

After the operation, it is recommended to close the opened file using fs.close(fd).

Read From a File

Once it’s open, we can read from a file, but we have to make sure the mode we set allows us to read it.

var fs = require('fs');
fs.open('file.txt', 'r', function(err, fd) {
   if (err) { throw err; }
   var readBuffer = new Buffer(1024),
      bufferOffset = 0,
      bufferLength = readBuffer.length,
      filePosition = 100;

   fs.read(fd, readBuffer, bufferOffset, bufferLength, filePosition,
      function(err, readBytes) {
         if (err) { throw err; }
         console.log('just read ' + readBytes + ' bytes');
         if (readBytes > 0) {
            console.log(readBuffer.slice(0, readBytes));
         }
   });
});

Here we open the file, and when it’s opened we are asking to read a chunk of 1024 bytes from it, starting at position 100 (so basically, we read data from bytes 100 to 1124).

A callback is called when one of the following three happens:

  • there is an error
  • something has been read
  • nothing could be read

If there is an error, the first argument of callback – err – will be set. Otherwise, it is null.

The second argument of the callback – readBytes – is the number of bytes read into the buffer. If readBytes is zero, we have reached the end of the file.

Write Into a File

Once the file is open, we can write into it, but we have to make sure the mode we set allows us to write to it.

var fs = require('fs');
fs.open('file.txt', 'a', function(err, fd) {
   if (err) { throw err; }
   var writeBuffer = new Buffer('Writing this string'),
      bufferOffset = 0,
      bufferLength = writeBuffer.length,
      filePosition = null;

   fs.write(fd, writeBuffer, bufferOffset, bufferLength, filePosition,
      function(err, written) {
         if (err) { throw err; }
         console.log('wrote ' + written + ' bytes');
   });
});

Here we open the file in append mode (‘a’) and write into it. We pass in the buffer with the data we want written, an offset inside the buffer where we want to start writing from, the length of what we want to write, the file position, and a callback.

In this case we pass a file position of null, which says we write at the current file position. As noted before, we opened the file in append mode, so the file cursor is positioned at the end of the file.

Case on Appending

If you are using these low-level file-system functions to append to a file and concurrent writes will be happening, opening it in append mode will not be enough to ensure there is no overlap. Instead, you should keep track of the last written position before you write, doing something like this:

var fs = require('fs');

var startAppender = function(fd, startPos) {
   var pos = startPos;
   return {
      append: function(buffer, callback) {
         var oldPos = pos;
         pos += buffer.length;
         fs.write(fd, buffer, 0, buffer.length, oldPos, callback);
      }
   };
}

Here we declare a function stored in a variable named “startAppender”. This function initializes the appender state (position and file descriptor) and then returns an object with an append function.

To use the Appender:

fs.open('file.txt', 'w', function(err, fd) {
   if (err) { throw err; }
   var appender = startAppender(fd, 0);
   appender.append(new Buffer('append this!'), function(err) {
      console.log('appended');
   });
});

And here we are using the appender to safely append into a file.

The returned append function keeps track of the last position and increments it by the length of the buffer passed in, so concurrent appends do not overlap.

Actually, there is a problem: fs.write() may not write all the data we asked it to, so we need to modify it a bit.

var fs = require('fs');

var startAppender = function(fd, startPos) {
    var pos = startPos;
    return {
        append: function(buffer, callback) {
            var written = 0;
            var oldPos = pos;
            pos += buffer.length;
            (function tryWriting() {
                if (written < buffer.length) {
                    fs.write(fd, buffer, written, buffer.length - written,
                             oldPos + written, 
                        function(err, bytesWritten) {
                            if (err) { callback(err); return; }
                            written += bytesWritten;
                            tryWriting();
                        }
                    );
                } else {
                   // we have finished
                   callback(null);
                }
            })();
        }
    }
};

Here we use a function named “tryWriting” that tries to write by calling fs.write, calculates how many bytes have been written, and calls itself if needed. When it detects it has finished (written == buffer.length), it calls the callback to notify the caller, ending the loop.

Also, the appending client opened the file with mode “w”, which truncates the file, and told the appender to start appending at position 0. This will overwrite the file if it has content. So, a wiser version of the appender client would be:

fs.open('file.txt', 'a', function(err, fd) {
   if (err) { throw err; }
   fs.fstat(fd, function(err, stats) {
      if (err) { throw err; }
      console.log(stats);
      var appender = startAppender(fd, stats.size);
      appender.append(new Buffer('append this!'), function(err) {
         console.log('appended');
      });
   })
});

The post NodeJS Low Level File System Operation appeared first on Xathrya.ID.

NodeJS Timers

A timer is a specialized type of clock for measuring time intervals. It is used for deploying routine action.

Node implements the timers API which also found in web browsers.

setTimeout

setTimeout let us to schedule an arbitrary function to be executed in the future. For example:

var timeout = 2000;    // 2 seconds
setTimeout(function() {
    console.log('time out!');
}, timeout);

The code above registers a function to be called when the timeout expires. As anywhere in JavaScript, we can pass an inline function, the name of a function, or a variable whose value is a function.

If we set the timeout to 0 (zero), the function we pass gets executed some time after the stack clears, but with no waiting. This can be used to, for instance, schedule a function that does not need to be executed immediately. This was a trick sometimes used in browser JavaScript. Another alternative is process.nextTick(), which is more efficient.

clearTimeout

A timer can be disabled after it is scheduled. To clear it, we need the timeout handle returned by setTimeout.

var timeoutHandle = setTimeout(function() { 
    console.log('Groaaarrrr!!!');
}, 1000);
clearTimeout(timeoutHandle);

If you look carefully, you’ll see the timeout will never execute, because we clear it just after we set it.

Another example:

var timeoutA = setTimeout(function() {
    console.log('timeout A');
}, 2000);

var timeoutB = setTimeout(function() {
    console.log('timeout B');
    clearTimeout(timeoutA);
}, 1000);

Here timeoutA will never be executed.

There are two timers above: A with a 2 second timeout and B with a 1 second timeout. timeoutB (which fires first) unschedules timeoutA, so timeoutA never executes, and the program exits right after timeoutB runs.

setInterval

setInterval is similar to setTimeout, but schedules a given function to run repeatedly, every X milliseconds.

var period = 1000; // 1 second
var interval = setInterval(function() {
  console.log('tick');
}, period);

That code will indefinitely keep the console logging ‘tick’ unless we terminate Node.

clearInterval

To cancel a schedule set by setInterval, the procedure is similar to that for setTimeout. We need the interval handle returned by setInterval and do it like this:

var interval = setInterval(...);
clearInterval(interval);

process.nextTick

A callback function can also be scheduled to run on the next run of the event loop. To do so, we use:

process.nextTick(function() {
    // This runs on the next event loop
    console.log('yay!');
});

This method is preferred to setTimeout(fn, 0) because it is more efficient.

On each loop, the event loop executes the queued I/O events sequentially by calling the associated callbacks. If any of the callbacks takes too long, the event loop can’t process other pending I/O events in the meantime (blocking). This can lead to waiting customers or tasks. When executing something that may take too long, we can delay execution until the next event loop iteration, so waiting events will be processed in the meantime. It’s like going to the back of a waiting line.

To escape the current event loop, we can use process.nextTick() like this:

process.nextTick(function() {
    // do something
});

This will delay processing that is not necessary to do immediately to the next event loop.

For instance, we need to remove a file, but perhaps we don’t need to do it before replying to the client. So we could do something like this:

stream.on('data', function(data) {
    stream.end('my response');
    process.nextTick(function() {
        fs.unlink('path/to/file');
    });
});

Let’s say we want to schedule a function that does some I/O (like parsing a log file) to execute periodically, and we want to guarantee that no two of those functions execute at the same time. The best way is not to use setInterval, since we don’t have that guarantee: the interval will fire whether or not the function has finished its duty.

Supposing there is an asynchronous function called “async” that performs some I/O and that gets a callback to be invoked when finished, and we want to call it every second:

var interval = 1000;
setInterval(function() {
    async(function() {
        console.log('async is done');
    });
}, interval);

If no two async() calls may overlap, we are better off using tail recursion like this:

var interval = 1000;
(function schedule() {
    setTimeout(function() {
        async(function() {
            console.log('async is done!');
            schedule();
        });
    }, interval);
})();

Here we declare schedule() and invoke it immediately after declaring it.

This function schedules another function to execute within one second. That function then calls async(), and only when async is done do we schedule a new run by calling schedule() again, this time inside the schedule function. This way we can be sure that no two calls to async execute simultaneously in this context.

The difference is that we probably won’t have async called every second (unless async takes no time to execute); instead, it is called 1 second after the last run finished.

The post NodeJS Timers appeared first on Xathrya.ID.

NodeJS Event Emitter

Many objects in NodeJS can emit events. For instance, a TCP server emits a ‘connect’ event every time a client connects, and a file read stream emits a ‘data’ event.

Connecting an Event

One can listen for events. If you are familiar with other event-driven programming environments, you will know there is usually a function or method named “addListener” or similar, to which you pass a callback. Every time the event is triggered (for example, the ‘data’ event is triggered every time there is some data available to read), your callback is called.

In NodeJS, here is how we can achieve that:

var fs = require('fs');      // get the fs module
var readStream = fs.createReadStream('file.txt');
readStream.on('data', function(data) {
    console.log(data);
});
readStream.on('end', function(data) {
    console.log('file ended');
});

Here we bind two events on the readStream object: ‘data’ and ‘end’. We pass a callback function to handle each of these cases; each one handles its own event from the readStream object.

We can either pass in an anonymous function (as we are doing here), or a function name for a function available on the current scope, or even a variable containing a function.

Only Connect Once

There are cases where we only want to handle an event once and then stop listening. That is, we are only interested in the first occurrence of the event: we want to listen for it exactly once, no more, no less.

There are two ways to do it: using the .once() method, or making sure we remove the callback once it has been called.

The first one is the simplest way. We use .once() to tell NodeJS that we are only interested in handling the first occurrence of the event:

object.once('event', function() {
    // Callback body
});

The other way is:

function evtListener() {
    // Function body
    object.removeListener('event', evtListener);
}
object.on('event', evtListener);

Here we use removeListener(), which will be discussed more in the next section.

In the two samples above, make sure you pass an appropriate callback, i.e. one with the appropriate number of arguments. The event name must also be specified.

Removing Callback from Certain Event

Though we used it in the previous section, we will discuss it again here.

To remove a callback, we need the object we want to remove the callback from, and the event name. Note that these come as a pair: we can’t provide only one of them.

function evtListener() {
    // Function body
    object.removeListener('event', evtListener);
}
object.on('event', evtListener);

removeListener belongs to the EventEmitter pattern. It accepts the event name and the function it should remove.

Removing All Callback from Certain Event

If you ever need to, removing all listeners for an event from an event emitter is possible. We can use:

object.removeAllListeners('event');

Creating Self-Defined Event

One can use this event-emitter pattern throughout an application. The way we do it is by creating a pseudo-class and making it inherit from EventEmitter.

var EventEmitter = require('events').EventEmitter,
    util         = require('util');

// Here is the MyClass constructor
var MyClass = function(option1, option2) {
    this.option1 = option1;
    this.option2 = option2;
}

util.inherits(MyClass, EventEmitter);

util.inherits() sets up the prototype chain so that the EventEmitter prototype methods are available on MyClass instances.

That way, instances of MyClass can emit events:

MyClass.prototype.someMethod = function() {
    this.emit('custom event', 'some arguments');
}

This emits an event named ‘custom event’, also sending some data (in this case, the string “some arguments”).

A client of a MyClass instance can listen for the ‘custom event’ event like this:

var myInstance = new MyClass(1,2);
myInstance.on('custom event', function() {
    console.log('got a custom event!');
});

The post NodeJS Event Emitter appeared first on Xathrya.ID.