Nagle’s algorithm

Imagine I want to send one byte to your server. TCP/IP needs 40 bytes of headers in order to work (20 bytes for TCP and 20 bytes for IPv4). I just wanted to send one byte but I must send 41 bytes! This is a huge overhead.

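Meet Nagle’s algorithm. It holds back small writes and merges them, so they go out together in one packet instead of many tiny ones.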

This may sound awesome but.. all that glitters is not gold. Using such a technique has some drawbacks. I am gonna describe two.

Realtime!

Using Nagle’s algorithm with games or other realtime stuff is not recommended. It really depends on the app itself: if the game is turn based we don’t care, but in FPS-like games I would not recommend using this algorithm. Why?

Nagle’s algorithm basically merges small packets and sends them when there’s a reasonable amount to send. This sounds good, but the waiting adds delay, so if you build a low latency app you should turn this algorithm off (at least for the low latency operations).
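Here is a minimal sketch of turning the algorithm off in node.js. It uses socket.setNoDelay(true) from the net module; the helper name createLowLatencyServer and the echo behaviour are just assumptions for the example.

```javascript
var net = require('net');

// Hypothetical helper: a TCP echo server that disables Nagle's algorithm
// on every incoming connection.
function createLowLatencyServer() {
  return net.createServer(function (socket) {
    // true = do not hold back small writes, send them right away
    socket.setNoDelay(true);
    // Echo every chunk back to the sender
    socket.on('data', function (chunk) {
      socket.write(chunk);
    });
  });
}
```

You would start it with something like createLowLatencyServer().listen(4000, '127.0.0.1'), where the port is whatever suits your app.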

Go away, clients!

Do you build an app where you expect a high number of concurrent connections? I would say that you will try to handle responses as quickly as possible in order to handle more. For example, disabling Nagle’s algorithm for long-poll connections could save you some milliseconds.

I would recommend this article: Scaling node.js to 100k concurrent connections!

Approximately 1 of 6 Amazon S3 buckets is potentially vulnerable

The Metasploit team tested 12,328 Amazon S3 buckets.

The result:

  • public: 1,951
  • private: 10,377

From those 1,951 public buckets they indexed over 126 billion files!

Buckets with public-read have directory indexing enabled as an XML listing.

It means that literally anyone can access your files if the bucket is public.

If you use a public bucket for a good reason, you can ignore this. But if you use an S3 bucket to back up files, handle uploads etc., you should check your bucket.
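A rough way to check a bucket is to request its root without any credentials and look at what comes back. The sketch below assumes the virtual-hosted URL style (bucket name as a subdomain of s3.amazonaws.com); the helper names classifyBucketResponse and checkBucket are made up for this example. It only detects an open XML listing, it says nothing about individual files that may be public.

```javascript
var https = require('https');

// Hypothetical helper: decides what an anonymous GET on the bucket root tells us.
function classifyBucketResponse(statusCode, body) {
  if (statusCode === 200 && body.indexOf('<ListBucketResult') !== -1) {
    return 'public-listing'; // anyone can list the files
  }
  if (statusCode === 403) {
    return 'private'; // listing is denied for anonymous users
  }
  if (statusCode === 404) {
    return 'not-found'; // no such bucket
  }
  return 'unknown'; // redirects and anything else need a closer look
}

// Hypothetical helper: fetches the bucket root without credentials.
// Assumes the URL style https://<bucket>.s3.amazonaws.com/
function checkBucket(bucketName, callback) {
  https.get('https://' + bucketName + '.s3.amazonaws.com/', function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      callback(null, classifyBucketResponse(res.statusCode, body));
    });
  }).on('error', callback);
}
```

Call it like checkBucket('my-bucket-name', function (err, result) { console.log(err || result); }) where my-bucket-name is a placeholder for your own bucket.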

If you want to find out more, check this blog post.

The files included:

  • Personal photos from a medium-sized social media service
  • Sales records and account information for a large car dealership
  • Affiliate tracking data, click-through rates, and account information for an ad company’s clients
  • Employee personal information and member lists across various spreadsheets
  • Unprotected database backups containing site data and encrypted passwords
  • Video game source code and development tools for a mobile gaming firm
  • PHP source code including configuration files, which contain usernames and passwords
  • Sales “battlecards” for a large software vendor

What the fuq is WebRTC?

It would be very nice if two browsers could communicate to each other without a server.

Oh wait, I just found WebRTC!

Indeed, WebRTC allows you to create peer-to-peer real-time communication between two clients right in your browser.

Do you create a video chat, file sharing or realtime messaging platform?

WebRTC makes the server resources needed for video chats, audio tools and realtime messaging trivial. It literally allows you to re-program Skype in the browser.

Video itself is very resource heavy. It’s just so big: normally a client would need to stream a video to your server and the server would then need to stream this video to the other client. If you use WebRTC you can cut the server out because the caller sends the video stream right to the callee.

At a low level it’s not trivial at all. For example, what about clients that are hidden behind a router so they don’t have a publicly visible IP? A client could also have (and in the real world they do have) firewalls etc.

So how the fuq does it work? This is where ICE comes on the scene. ICE can literally overcome the complexities of real-world networking.

ICE tries to find the best path to connect peers. It tries all possibilities in parallel and chooses the most efficient option that works. ICE first tries to make a connection using the host address obtained from a device’s operating system and network card; if that fails (which it will for devices behind NATs) ICE obtains an external address using a STUN server, and if that fails, traffic is routed via a TURN relay server.

In other words:

  • A STUN server is used to get an external network address.
  • TURN servers are used to relay traffic if direct (peer to peer) connection fails.
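In code, the STUN and TURN servers are handed to the browser as the iceServers list of an RTCPeerConnection. The sketch below only builds that configuration object; the helper name buildIceConfig, the example.org addresses and the credentials are all placeholders, not real servers.

```javascript
// Hypothetical helper: builds the configuration object for RTCPeerConnection.
function buildIceConfig(stunUrl, turnUrl, turnUser, turnPass) {
  return {
    iceServers: [
      // STUN: lets the client discover its external address
      { urls: stunUrl },
      // TURN: relays the traffic when a direct connection fails
      { urls: turnUrl, username: turnUser, credential: turnPass }
    ]
  };
}

// Placeholder servers and credentials, replace them with your own
var config = buildIceConfig(
  'stun:stun.example.org:3478',
  'turn:turn.example.org:3478',
  'demo-user',
  'demo-secret'
);

// In a browser you would then create the connection like this:
// var pc = new RTCPeerConnection(config);
```

ICE then gathers the candidates by itself; your job is only to pass them to the other peer through your signaling channel.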

If you need to know more I really recommend reading the blog post WebRTC in the real world: STUN, TURN and signaling.

Lastly I share a small video conversation app done with no more than 20 lines of code. Bravo!

Stream your response with node.js

Did you know that an HTTP response doesn’t have to be one big chunk of data? In this article I will talk a bit about HTTP streams.

HTTP streams allow you to send chunked data as a response. Why do I need it? It simply allows you to not wait for the whole response and send just what is ready to be sent.

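var http = require('http');

http.createServer(function (req, res) {
    // No Content-Length is set, so the body goes out chunk by chunk
    res.writeHead(200, {
        'Content-Type': 'text/html', 'Transfer-Encoding': 'chunked'
    });
    // The first chunk is sent right away
    res.write('Hello');
    // The last chunk is sent 3 seconds later and finishes the response
    setTimeout(function () {
        res.end(' world');
    }, 3000);
}).listen(3000, '127.0.0.1');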

Run this code with node.js and then run curl localhost:3000 --raw. As you can see, Hello was sent immediately and world waited 3 seconds to be sent.

This is probably the simplest use case. But why did we use curl over a browser? The problem with browsers (my Chrome) is that they buffer the response, so even though the browser already has the Hello text it only renders the whole response, and you will see your page loading for 3 seconds.

It doesn’t mean that something is wrong with the code above. Imagine that as a response we send a file. You would read the file content and then you would send it as a normal response. Nothing wrong with that unless you get tons of requests.

The whole file content must be held in a buffer in order to send it. So as the number of simultaneous requests increases, your server could run out of memory.

That’s where a stream pipe comes in handy.

In other words the data from the file is saved in the buffer as chunks, so the buffer doesn’t hold the whole file. You should then send those chunks as the response right away, as there would be NO advantage if you didn’t.

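var http = require('http');
var fs = require('fs');

var server = http.createServer(function (req, res) {
  // Read the file chunk by chunk instead of loading it whole into memory
  var stream = fs.createReadStream('a_file.txt');
  // Forward every chunk to the response as soon as it is read
  stream.pipe(res);
});
server.listen(3000, '127.0.0.1');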

We just created a file read stream and piped it to the response. So now your memory should be just fine and everyone wins.
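One thing a bare pipe does not cover is a read error, for example when the file is missing. A possible sketch, assuming you want to answer 404 for a missing file and 500 for anything else (the helper name createFileHandler is made up for this example):

```javascript
var http = require('http');
var fs = require('fs');

// Hypothetical helper: returns a request handler that streams the given file.
function createFileHandler(filePath) {
  return function (req, res) {
    var file = fs.createReadStream(filePath);

    file.on('error', function (err) {
      if (!res.headersSent) {
        // Nothing was sent yet, so we can still answer with an error status
        res.statusCode = err.code === 'ENOENT' ? 404 : 500;
        res.end('Could not read the file');
      } else {
        // Part of the file is already on the wire, just drop the connection
        res.destroy();
      }
    });

    // Send the chunks to the client as they are read
    file.pipe(res);
  };
}
```

It plugs into the server the same way as before: http.createServer(createFileHandler('a_file.txt')).listen(3000, '127.0.0.1').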

In Ruby on Rails you can use streams from version 4. Read this blog post.

Something resource heavy in your web app? Consider background-processing

Imagine you have a web app where a user can upload an image, video, raw data.. whatever.

Mostly you handle those files somehow. It may include image processing, video processing etc. Should your web app handle this? I would say no.

In my opinion a web app should only serve files, manage users, authorize them etc. All data processing should be outsourced to another process.

Background processing helps you remove the lag between a successful request and the rendered response. Imagine YouTube. When you upload a video it tells you that your video is being processed. Would you rather wait and watch a loading image in your browser for 1-2 hours? I don’t think so. They process your video in the background.

This of course doesn’t mean that you can only background-process files. You can background-process everything: database queries, some code, anything.

There are some libraries that help you achieve background processing. The most known are probably Resque, Delayed Job and Sidekiq.

So do you have some resource heavy processing? Make your web server lazy and outsource it to another process.
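The libraries above are for Ruby and come with a persistent queue. Just to show the bare idea in node.js, here is a sketch that hands a job to a separate node process with child_process.execFile, so the web process stays free. The names HEAVY_JOB and processInBackground are made up, and the loop is only a stand-in for real work such as video processing.

```javascript
var childProcess = require('child_process');

// Stand-in for some heavy work; it runs in the child process
var HEAVY_JOB = 'var s = 0; for (var i = 0; i < 1e6; i++) { s += i; } console.log(s);';

// Hypothetical helper: runs the job source in a separate node process
// and reports its output when the process finishes.
function processInBackground(jobSource, onDone) {
  var child = childProcess.execFile(process.execPath, ['-e', jobSource], function (err, stdout) {
    onDone(err, stdout.trim());
  });
  return child.pid;
}

// The call returns immediately, the result arrives later
processInBackground(HEAVY_JOB, function (err, result) {
  console.log(err ? 'job failed' : 'job finished: ' + result);
});
```

In a real app you would rather push the job into a queue that survives a restart, which is exactly what the libraries mentioned above do.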

Security:

Outsourcing file processing to a background job could help you maintain file security a bit more, as the one responsible for the files is not your web server user but another user who works with those files (this requires creating another user account on your machine). In most cases this user will just run the background jobs and be responsible for the files. So disallow executing files for that user and no script-kiddie should be able to upload a stinky self-running file to your file system.

Discussion: What do you think about TDD?

TDD is a term you should know already. For those who don’t, see the Wikipedia article on test-driven development.

I should say that I am a huge fan of TDD. In my opinion every production app should be fully tested and your test coverage should be as high as possible. If you don’t test your application you are probably doing something wrong, as everyone, again EVERYONE, codes bugs (unless you’re a code superman). But sometimes even I don’t write tests (in personal projects).

This topic is not black or white, it’s grey. TDD can help you a lot; on the other hand it takes some of your resources. TDD is time expensive.

My problem is that when I start to work on a personal project, I never finish it. I am not sure if I am a perfectionist, lazy, or I just decide that it’s not what I expected and it’s not worth it.

I should mention that the only projects I don’t finish are the personal ones I work on in my spare time.

Why am I saying this? Because of TDD. Basically I am trying to force myself to do TDD as much as possible but the decision at the start of a project is hard. Shall I use TDD from scratch or not?

I don’t like it when I must write a spec after a feature is finished. I would say it’s a bad practice too. The first reason is that when the feature is already done, you basically don’t write tests that cover all cases. The second reason is that writing a spec which never fails is probably the wrong practice.

So here is my list of pros and cons.

Pros:

  1. It helps you think about a feature.
  2. You will have almost 100% test coverage.
  3. Fewer bugs
  4. Code optimization is much easier (in case you write good tests)

Cons:

  1. It takes time
  2. Sometimes you really don’t know how to do something so it’s hard to write a test first (as you don’t know how it will work)

My workflow when I do TDD looks like:

  1. Think about the feature
  2. Make an integration spec
  3. Think about the code
  4. Make a unit test
  5. Program the feature by fixing the specs, one by one
  6. Optimize the code

But as I said, when I start a project I really wonder whether the first thing should be writing a spec that checks the app runs. Because sometimes I just write a test app where I am just trying something. The thing is that when I decide to continue with the project I mostly cry that I did not write any tests first, so I start from scratch.

So what about your experience with TDD?