Monday, January 13, 2014

OSV - operating system for the cloud

Cloud development requires a design oriented towards an environment with reduced performance. With many virtual servers occupying one physical machine, a lot of processing power is consumed by context switching, which reduces the overall performance of all virtual instances.

OSV is a new operating system designed specifically for instances running in the cloud, written in C++ (which, as a side note, proves that Linus Torvalds was wrong when he claimed that C++ is not suitable for writing a system kernel). OSV removes a lot of the overhead found in traditional operating systems, such as kernel memory protection. If you have ever worked with an operating system that does not use such protection (like AROS or MorphOS), especially on hardware with x86 architecture, you must have observed a huge performance gain on those systems. Of course, the biggest drawback of this approach is that a badly written application can crash the whole machine. However, since OSV does not run directly on hardware but on top of a hypervisor (such as Xen or KVM), such a crash affects only a single virtual cloud instance, not the whole server. Moreover, OSV runs a standard Java Virtual Machine, which provides automatic memory management and the necessary level of protection by itself, so no extra effort from the operating system is needed to ensure software stability.

Will OSV become successful? It's hard to say at the moment, but it surely shows a new trend, where not only hardware and applications but also operating systems evolve to fit the new reality that cloud computing creates. If you are interested in trying out OSV yourself, here are some instructions on how to run an Amazon EC2 instance with OSV on board.

Saturday, January 11, 2014

Waiting for the seamless cloud

You may not have heard about the memristor, the long-sought fundamental circuit element (alongside the resistor, capacitor and inductor), but this invention may be the greatest revolution in the computer industry since the introduction of the transistor. Memristors have a simpler construction than transistors and don't require an external power source to retain information. Last year Crossbar Inc. unveiled their RRAM non-volatile memory technology with shockingly impressive characteristics: 1 terabyte of data on a single chip with 20 times faster access than offered by traditional NAND flash.

But this is just the beginning. As the technology evolves, it may be able to replace not only flash drives but also DRAM, which will have a huge impact on the whole computer industry. Imagine a device with only one memory space, used both as data storage and for running applications. Moreover, applications are loaded into memory only once, and they stay there even when you turn the power off. You can take the battery out of a smartphone and replace it with a new one, and after turning the power on you instantly get your device back just as you left it.

Sounds familiar? If you have ever worked with Smalltalk, you already know the whole idea. If you haven't, I strongly encourage you to try out Squeak or Pharo. They are both open source Smalltalk implementations, and they provide a complete, object-oriented working environment. What is unique about Smalltalk is that you can start applications, create new workspaces, build classes and objects, and when you quit, the whole environment is saved in an image and restored in the next session. At first sight it looks like simple hibernation, but it isn't. In Smalltalk you can build applications without writing the source code in the traditional way. You just create an object, dynamically add properties and methods to it, and it becomes a part of the current environment. When you want to distribute the application, you just distribute the image, which contains all the application code, libraries, and necessary tools.

RRAM is a technology which will eventually make it possible to implement the old Smalltalk idea at the hardware level and finally separate virtual machines from the underlying hardware. Currently, every time a virtual machine crashes, a new instance needs to be started from scratch and configured. Although this process can be automated with tools like Puppet, it still takes time and forces applications to start from scratch. With an RRAM-based cloud the problem will no longer exist: you will be able to save the state of a virtual machine at any moment and then spawn it or move it to another physical location (server, rack unit, or even datacenter) within seconds. More importantly, the state of all the applications will be preserved, which means that, for example, you will not lose user sessions even if you move a virtual machine to another location.

In my opinion it will be the next big step in the evolution of cloud computing. Today cloud instances are mostly stateless and require external storage or a cache to keep user data. With the new memory chips they will be able to preserve their state even while moving across different hardware, and this process will become absolutely seamless for the end user - just the way GSM networks work today, keeping your call running even when you switch between base transceiver stations while talking.

Moving to the cloud

Cloud computing has developed rapidly during the last few years. According to a Gartner study, the cloud market grew by 18% in 2013 and is now worth 131 billion (or thousand million) US dollars. By 2015 it is expected to reach 180 billion. Despite some well-known security and privacy concerns regarding public cloud storage, and even despite the global PRISM scandal, a future in which most (if not all) of our data is stored in the cloud seems inevitable. This requires a substantial shift in the approach to software architecture and creates new challenges for software developers.

First, you need to abandon huge, complicated, monolithic applications in favour of platforms built from many small, interconnected services. Such services require far fewer resources and can be easily replicated throughout the cloud, increasing the overall stability and safety of the applications. One of the most famous examples of a successful SOA revolution is Amazon, which radically changed its software platform.

Second, you need to change the underlying technologies. For example, Java with its extremely resource-hungry virtual machine, bloated frameworks and heavy threads seems a rather bad choice. On the other hand, Node.js with its small memory footprint and single-threaded, event-driven, non-blocking I/O model fits the cloud perfectly.
If you have ever had doubts about JavaScript being the right choice for an enterprise application, then you should read about PayPal moving their backend from Java to JavaScript. Apart from huge cost savings (PayPal has been extensively using very expensive Java solutions like Terracotta BigMemory to scale its backend vertically) and increased productivity (engineers claim to be able to develop software twice as fast as in Java), PayPal also benefits from the new skills of its software developers, who can now work on both frontend and backend using the same programming language.
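
To make the single-threaded, event-driven, non-blocking model mentioned above a bit more concrete, here is a minimal sketch. It is written in Python with asyncio rather than in Node.js, purely as an illustration of the same idea; the names handle_request and serve_all are made up for this example, and asyncio.sleep stands in for a real database or network call.

```python
import asyncio


async def handle_request(request_id, io_delay):
    """Simulate a request that spends most of its time waiting on I/O."""
    # Hypothetical stand-in for a database or network call: awaiting it
    # hands control back to the event loop instead of blocking the thread.
    await asyncio.sleep(io_delay)
    return f"request {request_id} done"


async def serve_all(request_count, io_delay=0.1):
    """Handle all requests concurrently on one thread."""
    pending = [handle_request(i, io_delay) for i in range(request_count)]
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*pending)


if __name__ == "__main__":
    results = asyncio.run(serve_all(100))
    print(len(results), "requests handled")
```

All the waiting overlaps, so a hundred simulated requests finish in roughly the time of one, without creating a thread per request. That is the property which keeps the memory footprint small on a tiny cloud instance.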

Third, you may need to redesign the application flow and the algorithms used. You may often get better results processing huge amounts of data in small chunks on many small cloud instances than all at once on a few huge mainframes. This means not only using MapReduce to speed up your tasks, but also avoiding complex sequential algorithms. For example, in typical situations mergesort may perform worse than heapsort, but it is much simpler to parallelize and, with enough cloud instances, will sort your data much faster.
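
The chunked sorting idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: a local thread pool stands in for separate cloud instances, Python's built-in sorted() does the per-chunk sorting, and the function names split_into_chunks and chunked_merge_sort are invented for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_count):
    """Split data into consecutive chunks of roughly equal size."""
    size = max(1, -(-len(data) // chunk_count))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunked_merge_sort(data, chunk_count=4):
    """Sort every chunk independently, then merge the sorted chunks."""
    chunks = split_into_chunks(data, chunk_count)
    # Assumption: each sorted() call represents work that a separate
    # cloud instance would do; here a thread pool plays that role.
    with ThreadPoolExecutor(max_workers=chunk_count) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # heapq.merge does a k-way merge of inputs that are already sorted,
    # which is the merge phase of mergesort.
    return list(heapq.merge(*sorted_chunks))


if __name__ == "__main__":
    print(chunked_merge_sort([5, 3, 8, 1, 9, 2, 7], chunk_count=3))
```

The chunks never need to talk to each other until the final merge, which is what makes this approach so easy to spread across many small instances.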
But there is more to it than that. AMD has just announced a new line of 64-bit ARM server CPUs, which is widely regarded as the end of its race against Intel in high performance computing. ARM processors are less complicated and more power-efficient than their Intel counterparts, but they are also slower - which means that they will be better suited to clouds with many small instances.

Finally, there is a whole new class of problems in cloud computing which are either not present or long solved in traditional applications, such as data consistency or dealing with transactions. Some of them cannot be solved by software and require a change in your business model: for example, you may have to trade real-time consistency for the amount of traffic your application can handle. Also, most cloud installations suffer from poor I/O performance, because there are usually many virtual instances simultaneously trying to access one physical device, such as a hard drive or a network interface.