Building a cross-platform & cross-language IDE [Part III]

Welcome to Part III of this series, which is also the last part focusing on the back-end component of our application.

We have a lot to cover in this part:

  • Moving from the pure HTTP protocol (REST API) to WebSockets for our run method (offering us real-time output streams for our running code).
  • Adding a REPL feature, allowing the client to evaluate code statements interactively for supported languages. The REPL will make use of WebSockets.
  • Adding Hello World code samples for each of the languages.
  • Packaging our Docker-based back-end (inception style, using DIND) to allow easy distribution.

So without further ado, let's get started!

Transitioning from HTTP to WebSockets

Remember that in the last article we mentioned that if we were to run some code that produces long output, because we were using an HTTP request/response for this purpose, we would have to wait a long time before seeing our program output. Indeed, our code waited for the output to be complete (program exit) before sending it in the response.

While this was easy to implement, it is clearly not a good solution. Nobody wants to wait minutes or hours before seeing some output (or even worse, wait forever in case the program never ends).

We could still use the HTTP protocol to meet our needs, by using a chunked response, which allows streaming chunks of data in real time.
Instead of using this method, we will turn to WebSockets, which allow clients to create a direct bidirectional connection (tunnel) to our server. Moreover, using the socket.io package for this purpose makes the code as easy to write as if we were using HTTP requests.

While the use of WebSockets over chunked HTTP is debatable for our run method, it makes much more sense for the REPL feature that we will implement later on, considering all the back-and-forth communication with the server during a REPL session.

That being said, let's dive into the code!

First, let's add a dependency on socket.io to our Node application. It's done by adding the following line to the dependencies section of our package.json file:

"socket.io": "1.4.4"

Then let's load socket.io, to be able to use it, by adding the following import statement at the top of our index.js source:

var io = require('socket.io')(http);  
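Note that this assumes http refers to the Node HTTP server created in the previous parts. As a reminder, a minimal Express setup along those lines could look like this (a sketch only; variable names are assumed to match the earlier articles):

var express = require('express');
var app = express();
var http = require('http').Server(app);
var io = require('socket.io')(http);

http.listen(8889);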

Now we are good to go.

We'll first add a new run method, equivalent to our existing one, with the difference that the output will be sent on the client socket instead of in an HTTP response. This new method is unoriginally called runOverWebSocket.
The code is strictly the same as for our run method, except that the response parameter is replaced by a socket parameter, and we use this socket to send the output.
Here is, in fact, the only section of the code that differs from our run method:

// Attach to both streams: a program can produce stdout and stderr output
if (result.stdout != null) {
  result.stdout.on('data', function(chunk) {
    clientSocket.emit('stdout_stream', chunk);
  });
}
if (result.stderr != null) {
  result.stderr.on('data', function(chunk) {
    clientSocket.emit('stderr_stream', chunk);
  });
}

As you can see here, we are "attaching" our stdout and stderr streams to the client socket: each time we receive data, we emit it as-is on the socket.
The strings stdout_stream and stderr_stream correspond to the event names, allowing the client to differentiate between the two output types.

We can now handle a run request over a WebSocket and stream the output in real time over the socket. But we haven't yet discussed how the connection is made in the first place, and how the client can ask our back-end to run code over the WebSocket.
Well, here is some code that will shed some light on this matter:

io.on('connection', function(clientSocket) {
  clientSocket.on('run', function(msg) {
    runOverWebSocket(JSON.parse(msg), clientSocket);
  });
});

Here it is, the complete setup. Instead of declaring an HTTP handler with a verb and a path, we just wait for a client connection to our back-end app over a WebSocket. Once this connection is established, the client can send data to the server (and likewise, as we saw in the code above, the server can send data to the client). When the client emits the event named run along with the data associated with this event (the exact same JSON object as the one we used in our HTTP request), we call our function runOverWebSocket, which will in turn stream the output on the socket (sending the output back to the client).
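To make this more concrete, here is a minimal sketch of what a client could do, using the socket.io-client package. The payload fields below are assumptions for illustration; in practice you would send whatever JSON object the existing HTTP run endpoint already expects:

var clientSocket = require('socket.io-client')('http://localhost:8889');

clientSocket.on('connect', function() {
  // Same kind of JSON object as the one we sent to the HTTP run endpoint
  clientSocket.emit('run', JSON.stringify({
    languageId: 'python',
    base64encodedcode: new Buffer('print("Hello World")').toString('base64')
  }));
});

// Print the streamed output as it arrives
clientSocket.on('stdout_stream', function(chunk) {
  process.stdout.write(chunk.toString());
});
clientSocket.on('stderr_stream', function(chunk) {
  process.stderr.write(chunk.toString());
});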

There is not much more to it! As simple as our HTTP REST endpoint, and much more efficient in the context of our application! Neat :)

Unfortunately, WebSockets are a bit tougher to debug (Postman does not support them), so we won't go into the details here; let's just assume it works (and in the next article we are going to test this with our IDE anyway ;)).

Adding the REPL feature

Onward with the REPL feature !

Our REPL feature should allow a client to send code statements for evaluation and get immediate output corresponding to the result of this evaluation. Not all languages support a REPL, so we will only add the REPL feature to the languages that support it.

Let's start by introducing support for a new language in our back-end: Haskell.

As you know from the previous articles, adding a language is as easy as adding a new language entry in our config.json file. Here is the new entry for Haskell:

{
  "id": "haskell",
  "name": "Haskell",
  "extension": "hs",
  "imagename": "haskell",
  "compile": "ghc -o %BASENAME% %BASENAME%.hs",
  "run": "./%BASENAME%",
  "repl": "ghci",
  "replPrefix": "Prelude> "
}

If you followed the previous articles, you'll notice that two new attributes have been added: repl and replPrefix:

  • repl corresponds to the name of the executable that launches the REPL (or "interactive session") for this language. In the case of Haskell this executable is ghci.
  • replPrefix corresponds to the prefix string that is displayed when the REPL waits for the user to enter an expression to evaluate. In the case of the Haskell REPL this is Prelude>, as the short session below illustrates.
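For reference, here is what a short ghci session looks like, with user input entered after the Prelude> prompt:

Prelude> 1 + 1
2
Prelude> putStrLn "Hello"
Hello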

Now that our configuration supports REPL, let's add the code to handle REPL in our app.

First of all, we have a function startRepl which just checks whether the image corresponding to the language for which the REPL needs to be launched is already stored locally. If it is, the function directly calls launchRepl; otherwise it pulls the image before calling launchRepl. I won't paste the code of this function, it's pretty straightforward (you can have a look at it on GitHub).
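That said, here is a rough sketch of what it could look like (languageById is assumed to be a lookup map built from config.json, and docker images -q is just one way of testing whether an image is stored locally):

function startRepl(languageId, callback) {
  var language = languageById[languageId];

  // 'docker images -q <image>' prints the image id only if the image is stored locally
  var check = child_process.spawnSync('docker', ['images', '-q', language.imagename]);
  if (check.stdout.toString().trim() !== '') {
    launchRepl(language, callback);
  } else {
    // Pull the image first, then launch the REPL once the pull completes
    var pull = child_process.spawn('docker', ['pull', language.imagename]);
    pull.on('close', function() {
      launchRepl(language, callback);
    });
  }
}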

The launchRepl function is more interesting as it actually launches the REPL session. Here is its source:

function launchRepl(language, callback) {
  var imageName = language.imagename;
  var replCommand = language.repl;
  var containerName = language.id + '_repl';

  // Remove any leftover REPL container for this language (no-op if none is running)
  child_process.spawnSync('docker', ['rm', '-f', containerName]);
  // Launch the REPL executable in a new interactive container
  var replContainer = child_process.spawn('docker', ['run', '--name', containerName, '-i', imageName, replCommand]);

  callback(replContainer.stdout, replContainer.stderr, replContainer.stdin);
}

This code retrieves the REPL executable name from the config. It then makes sure that no other container is already running a REPL for this language, by removing such a container (if none is running, the remove is a no-op). It then launches the REPL container and executes the repl command inside of it. Please note that we use a naming convention for our REPL containers: languageId + _repl. It merely serves to identify such running containers so we can remove them in the previous instruction (this is bad because it does not support multiple users on our back-end, but we are keeping it simple).

The stdout and stderr streams (output) and the stdin stream (input) are then provided to a callback so we can make use of them.

Now here is the code that invokes startRepl. As you may have guessed, it is located inside of the io.on('connection', function(clientSocket){ code block.

clientSocket.on('startRepl', function(msg) {
  var obj = JSON.parse(msg);
  startRepl(obj.languageId, function cb(stdout, stderr, stdin) {
    // Forward expressions sent by the client to the REPL's stdin
    clientSocket.on('repl_in', function(msg) {
      stdin.write(msg, 'utf8');
    });
    if (null != stdout) {
      stdout.on('data', function(chunk) {
        clientSocket.emit('repl_out', chunk);
      });
    }
    if (null != stderr) {
      stderr.on('data', function(chunk) {
        clientSocket.emit('repl_err', chunk);
      });
    }
  });
});

When the server receives a startRepl event from the client, it calls the startRepl function. The client can then issue one or more repl_in events. The data for such an event is simply the expression string entered by the client, to be evaluated by the REPL. It is forwarded to the stdin stream of the REPL (just as if the client were typing the expression directly into the REPL running inside the container). The stdout and stderr outputs are then emitted as two different event types, repl_out and repl_err respectively, which the client can interpret accordingly.
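To illustrate the protocol from the client side, here is a hedged sketch, reusing a socket.io-client connection like the one shown earlier (the languageId field mirrors the one used for the run event):

// Start a Haskell REPL session
clientSocket.emit('startRepl', JSON.stringify({ languageId: 'haskell' }));

// Display whatever the REPL prints (including the Prelude> prompt)
clientSocket.on('repl_out', function(chunk) {
  process.stdout.write(chunk.toString());
});
clientSocket.on('repl_err', function(chunk) {
  process.stderr.write(chunk.toString());
});

// Evaluate an expression (the trailing newline matters: it is as if the user pressed Enter)
clientSocket.emit('repl_in', '1 + 1\n');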

Now we have support for a fully-fledged, functional REPL! We're getting there :)

Adding code samples

Wouldn't it be nice for the client to be able to see some sample code for each of the languages? Considering that we will theoretically support many languages, chances are high that the client will have no idea what some languages look like. Moreover, it will allow us to automatically preload some code into our IDE, which is easier for testing and more pleasant to the eye than a boring blank code window.

For now we will just offer a single code sample per language. The most popular introductory code sample in any language is undoubtedly Hello World. So let's add Hello World source code for each of our supported languages!

As with the runtime folder, we will create a codesamples folder which will contain all of our code samples, and reference it via a variable in our app:

var relativeCodeSamplesPath = 'codesamples';  

Next we will add three Hello World source files, one for each of the languages we currently support (C#, Python and Haskell). I won't paste the code here (take a look at GitHub if you wish), but in the end we should have three source files in our codesamples folder: HelloWorld.hs, HelloWorld.cs and HelloWorld.py.

The setup is done; let's now see the supporting code:

First we'll add a new attribute named codeSampleFileName to the language objects in our config.json file. Its value is just the name of the sample file to use for this language. Here is the attribute we use for the C# language, for example:

"codeSampleFileName": "HelloWorld.cs"

Then we'll just add an HTTP route (we'll provide code samples through our REST API):

app.get('/codesample/:languageId', getCodeSample);  

A client can then get the code sample for, let's say, Python by hitting the path /codesample/python.

Let's see the handler method getCodeSample:

function getCodeSample(request, response) {
  var lang = _.find(config.languages, function(language) {
    return language.id == request.params.languageId;
  });

  if (lang) {
    fs.readFile('./' + relativeCodeSamplesPath + '/' + lang.codeSampleFileName, function(err, data) {
      if (err) {
        // The file referenced in config.json could not be read
        response.status(500).send({"error": "could not read code sample"});
        return;
      }
      response.send({"base64encodedsample" : data.toString('base64')});
    });
  } else {
    // Unknown language id: answer with a 404 instead of leaving the request hanging
    response.status(404).send({"error": "unknown language"});
  }
}

We just retrieve from config.json the code sample filename associated with the language, and we send the content of the file as-is (well... encoded in base64, of course) in our response.
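On the client side, decoding the sample back to plain text is a one-liner. Here is a sketch in Node, where body is assumed to be the parsed JSON response:

var code = new Buffer(body.base64encodedsample, 'base64').toString('utf8');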

Packaging the back-end as a Docker image

We now have a fully-fledged, functional Node.js back-end!

Last but not least, we need to package our back-end as a Docker image, to allow easy distribution and make it simple to run anywhere.

As you may remember from the previous articles, our packaged Docker image will be based on the Docker-in-Docker (dind) image. When the image is run (as a Docker container), what we want is for the "virtual machine" to contain a Docker environment, but also our Node.js service up and running. This way, our Node.js application can pull images and run containers for languages from within the container itself (inception style).

I won't go into too much detail regarding the configuration of the Dockerfile. What I did was retrieve the Dockerfile and its associated shell script from the official Docker repo for dind (HERE). I then modified the Dockerfile slightly in order to add the Linux packages needed to run Node. I also copy all the source code files to a directory inside the container, and I modified the dind shell script to launch Docker in the background and then execute node index.js to launch our Node.js back-end in the foreground. You can take a closer look at the files on GitHub if you want to learn more.
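To give you an idea of its shape, here is a rough, hypothetical sketch of such a Dockerfile. The base image, package installation and start.sh script are assumptions for illustration; the real files are in the repository:

FROM ubuntu:14.04

# ... dind setup from the official Dockerfile (Docker packages, wrapdocker script) ...

# Install the packages needed to run Node (exact packages depend on the base image)
RUN apt-get update && apt-get install -y nodejs npm

# Copy our back-end sources into the image and install its dependencies
COPY . /app
WORKDIR /app
RUN npm install

EXPOSE 8889

# start.sh launches the Docker daemon in the background, then runs 'node index.js'
CMD ["/app/start.sh"]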

You now have two ways of retrieving the Docker image of our back-end: either grab the full project from the GitHub repository and build the Docker image yourself (by running docker build -t belemaire/polyglot . at the root of the project), or retrieve it directly from my Docker Hub repository where I have stored it (by running docker pull belemaire/polyglot).

Now that you have the image ready, all you have to do is to run it in a container.

The command for running the image is:

docker run --privileged -d -p 8889/tcp belemaire/polyglot

If you are on OSX or Windows, remember to launch this command from within the Docker Quickstart Terminal, otherwise it won't work.

The --privileged flag is necessary to be able to run a Docker-in-Docker image. The -d flag just indicates that the container should run as a daemon (otherwise it will run in your foreground terminal), and the -p flag indicates that we want to publish port 8889 to the host.

Now to access the back-end (from Postman for example, to test some REST endpoints), you'll need the IP and the port to send your requests to.

If you are on OSX or Windows, the IP will be the one of the virtual machine (VirtualBox) running your Docker. The IP is displayed when you launch the Docker Quickstart Terminal. In my case, Docker is configured to use the default machine with IP 192.168.99.100.

If you are on Linux, then the IP is directly the IP of the container, which can be obtained by running docker inspect [containerId] | grep IPAddress.

To view the port to hit (8889 is only used internally, for machine and container-to-container communication), just run docker port [containerId]. In my case, here is the result: 8889/tcp -> 0.0.0.0:32769. It means the port to hit is 32769. Traffic will then be automatically forwarded to port 8889 (corresponding to our Node application).

So for me, as I am running Docker on OSX, given my configuration, I'll have to hit http://192.168.99.100:32769 to reach the polyglot Node.js back-end.

In the next article we will start working on the IDE itself, which will be built using ReactJS and nicely packaged as a desktop app using Electron.

'Til next time!

PS: The updated project is available on my GitHub repository (https://github.com/belemaire/polyglot-server).