Friday, May 2, 2014

RISC design principles for project management

When David A. Patterson and Carlo H. Séquin designed the first RISC processors in the early 1980s, they soon realized that designing instructions that could be issued (started) quickly was the key to good performance. While the initial emphasis was on simple instructions that could be executed quickly, they learned that how long an instruction actually took mattered less than how many instructions could be started per second.
Over the last few years I have learned that the same principle applies to good project management. Thanks, RISC designers!

Tuesday, October 1, 2013

Mindroid - Android everywhere

Mindroid is an application framework that lets you create applications using a set of reusable components - just like Android. The name Mindroid has two meanings: on the one hand, Mindroid is a minimal set of core Android classes; on the other hand, these classes also form Android's mind (at least in my opinion). There are three versions of Mindroid, each focusing on a different domain.

The most feature-rich version is Mindroid.java. It is a complete application framework intended for machine-to-machine (M2M) communication projects and the Internet of Things. It is almost completely Android API-compliant, with minor exceptions, since Mindroid even runs on Java v1.4 (Java SE and Java ME CDC) platforms like IBM's J9 Java virtual machine on Qualcomm's modem chipsets running an L4 microkernel OS. Mindroid.java also includes application framework documentation. For the future, I plan to make Mindroid run on Java ME 8.

Another version of Mindroid is Mindroid.cpp. It is written in C++ using reference counting and other nice gimmicks. Currently, Mindroid.cpp is not a complete application framework but focuses on Android's messaging and concurrency features. Nevertheless, it provides all the messaging abstractions needed to implement a lot of applications - the Android Transporter Player is one such example. At the moment Mindroid.cpp runs on POSIX operating systems like Linux, Android, or the QNX Neutrino RTOS, and it uses dynamic memory allocation.

The third version is Mindroid.ecpp. This deeply embedded version of Mindroid is similar to Mindroid.cpp, but it runs on tiny embedded operating systems like CMSIS-RTOS for ARM Cortex-M processors as well as on AUTOSAR OS for various embedded processors. Such deeply embedded environments often lack dynamic memory allocation, which Mindroid.ecpp therefore does not require.

Why?

One of the main reasons for building Mindroid is the fantastic Actor implementation of Google's Android operating system. This actor model is the core building block of Android's whole software architecture. In all versions of Mindroid (and Android) an Actor is implemented by the Looper and Handler classes. A Looper has a thread and a message queue. The thread blocks on the message queue waiting for something to do. To put something into a message queue there is the Handler class. Each Handler also has a handleMessage method for processing messages. An example of this scenario is given in the documentation of the Looper class. Each Handler is attached to exactly one message queue (and thus to one Looper) at creation time. A message object always refers to the Handler that will process it later. Now think of software components in terms of Handlers. Software components (Handlers) talk to each other by passing messages. For example, you can run each Handler on its own Looper (thread) if you like. But you are also allowed to put multiple Handlers on the same Looper. (Android is even able to put Handlers into different operating system processes that communicate with each other using the Binder IPC.) This behavior (software architecture) is changeable at will without any need to change the inner workings of a software component.

How?

Here are "Hello World" examples (using two Handlers) for all versions of Mindroid. One Handler prints "Hello " and the other one "World!" once in a second resulting in "Hello World!" outputs.

Mindroid.java
public class HelloWorld extends Service {
  private static final String LOG_TAG = "HelloWorld";
  private static final int SAY_HELLO = 0;
  private static final int SAY_WORLD = 1;
 
  private final Handler mHelloHandler = new Handler() {
    public void handleMessage(Message message) {
      switch (message.what) {
      case SAY_HELLO:
        System.out.print("Hello ");
        mWorldHandler.obtainMessage(SAY_WORLD)
          .sendToTarget();
        break;
      }
    }
  };
 
  private final Handler mWorldHandler = new Handler() {
    public void handleMessage(Message message) {
      switch (message.what) {
      case SAY_WORLD:
        System.out.println("World!");
        Message hello = mHelloHandler
          .obtainMessage(SAY_HELLO);
        mHelloHandler.sendMessageDelayed(hello, 1000);
        break;
      }
    }
  };

  public void onCreate() {
    mHelloHandler.obtainMessage(SAY_HELLO)
      .sendToTarget();
  }

  public void onDestroy() {
    mHelloHandler.removeMessages(SAY_HELLO);
    mWorldHandler.removeMessages(SAY_WORLD);
  }

  public IBinder onBind(Intent intent) {
    return null;
  }
}
Mindroid.cpp
static const int SAY_HELLO = 0;
static const int SAY_WORLD = 1;

class HelloHandler : public Handler {
public:
  HelloHandler(const sp<Handler>& worldHandler) :
      mWorldHandler(worldHandler) {
  }

  virtual void handleMessage(const sp<Message>& msg) {
    switch (msg->what) {
    case SAY_HELLO:
      printf("Hello ");
      sp<Message> message = mWorldHandler
        ->obtainMessage(SAY_WORLD);
      message->metaData()->putObject("Handler", this);
      message->sendToTarget();
      break;
    }
  }

private:
  sp<Handler> mWorldHandler;
};

class WorldHandler : public Handler {
public:
  WorldHandler() {
  }

  virtual void handleMessage(const sp<Message>& msg) {
    switch (msg->what) {
    case SAY_WORLD:
      printf("World!\n");
      sp<Handler> helloHandler = msg->metaData()
        ->getObject<Handler>("Handler");
      sp<Message> message = helloHandler
        ->obtainMessage(SAY_HELLO);
      helloHandler->sendMessageDelayed(message, 1000);
      break;
    }
  }
};

int main() {
  Looper::prepare();
  sp<Handler> worldHandler = new WorldHandler();
  sp<Handler> helloHandler = 
    new HelloHandler(worldHandler);
  helloHandler->obtainMessage(SAY_HELLO)
    ->sendToTarget();
  Looper::loop();

  return 0;
}
Mindroid.ecpp
static const int SAY_HELLO = 0;
static const int SAY_WORLD = 1;

class HelloHandler : public Handler {
public:
  HelloHandler(Handler& worldHandler) :
      mWorldHandler(worldHandler) {
  }

  virtual void handleMessage(const Message& msg) {
    switch (msg.what) {
    case SAY_HELLO:
      printf("Hello ");
      mWorldHandler.obtainMessage(mMessage, SAY_WORLD);
      mMessage.obj = this;
      mMessage.sendToTarget();
      break;
    }
  }

private:
  Handler& mWorldHandler;
  Message mMessage;
};

class WorldHandler : public Handler {
public:
  WorldHandler() {
  }

  virtual void handleMessage(const Message& msg) {
    switch (msg.what) {
    case SAY_WORLD:
      printf("World!\n");
      Handler* helloHandler = (Handler*) msg.obj;
      helloHandler->obtainMessage(mMessage, SAY_HELLO);
      helloHandler->sendMessageDelayed(mMessage, 1000);
      break;
    }
  }

private:
  Message mMessage;
};

static uint8_t sHelloHandler[sizeof(HelloHandler)];
static uint8_t sWorldHandler[sizeof(WorldHandler)];

int main() {
  Message message;

  Looper::prepare();
  // Place each handler into its own statically
  // allocated buffer (no heap needed).
  Handler* worldHandler =
    new (sWorldHandler) WorldHandler();
  Handler* helloHandler =
    new (sHelloHandler) HelloHandler(*worldHandler);
  helloHandler->obtainMessage(message, SAY_HELLO);
  message.sendToTarget();
  Looper::loop();

  return 0;
}

For more information on Mindroid please consult the Mindroid.java documentation and the examples provided with the application frameworks on GitHub.

Thursday, April 18, 2013

Concurrency: A single human body cell versus the internet

This week I asked Alan Kay (the inventor of OOP, Smalltalk, the Dynabook, and much more) whether there are more concurrent activities going on inside a single human body cell than on the whole Internet. Since he is a computer scientist as well as a molecular biologist, he seemed to be the right person to answer this question. Here is a summary of the short conversation:

Hi Daniel

Good question -- and we have to work on the question some more in order to get a reasonable comparison.

Since matter is concurrently active at the subatomic level, the more mass we have the more concurrent activities -- so the Internet wins here because it has much more mass.

Things get a lot more tricky at the "cell mechanics" level. Typical human body cells have about 80 billion large molecules of about 10,000 types (the other 70% is water and other small molecules), and these are concurrently active. 
There are several billion computers on the Internet (plus many nuts and bolts computers such as routers) and we can guess that most of these have a few hundred processes each (and each with some number of threads) -- all of which are pseudo-concurrent (it's really how many CPU type things are operating).

So at the hardware level with real CPU-like things, there are likely ~10 in each computer, and now more with multiple cores, so that gives us an estimate of around 20 billion real concurrent processes, not counting routers and other infrastructure.

If we count pseudo-processes (which are logically concurrent), then I think in the Internet we've exceeded a single cell at the active large molecule as a process level of concurrency.

We have about 10 Trillion cells in our body that have our DNA (and about 90 Trillion that are microorganisms of many different kinds).

I think we are safe to say that the concurrent activities in one human body completely dominate the Internet.

However, I have not looked below the CPU level to processes at the hardware level .... perhaps you'd like to take a shot at estimating that!

Cheers,
Alan


Hi Alan,

thanks for your detailed answer. I really appreciate that you took the time to think about this. I came up with this question when thinking about concurrency (maybe also parallelism and message passing) as a measurement unit for complex systems. There is no measurement unit for concurrency, I think. But it's quite hard to define one - just like comparing the incomparable :-).
...
I also tried to google the answer to this question, but since I am a software engineer and only a hobbyist in biology, I wasn't quite sure whether I can count all proteins as concurrently active entities or whether I should only count ribosomes, motor proteins, etc. as concurrently active. On the Internet I found that there are about 15,000 ribosomes in one single cell. Counting only ribosomes (which I think is not correct) would reduce the concurrency within a human body cell enormously. Hard to compare...

Thank you very much and best regards,
Daniel


Hi Daniel

You have to at least count proteins that are acting as enzymes -- they are continuously active. Also, you will be interested (and amazed) if you look at the dynamics of biochemistry (i.e. why does it work? How do the molecules "find" each other). These processes are happening concurrently.

I think it is hard to compare -- but mainly for "level" not for estimating how many entities are actually "in process" concurrently.

Cheers,
Alan


Really impressive, isn't it? And we software engineers think that today's multi-core computers provide concurrency and parallelism. |-O

Sunday, November 4, 2012

Concurrent Hello World in Go, Erlang and C++

This week I started taking a deeper look at the Go programming language, since I am somewhat addicted to scalability, reliability, concurrency, and message passing :-). The first thing I always do when playing around with a new software platform is to write a concurrent "Hello World" program. The program works as follows: one active entity (e.g. thread, Erlang process, Goroutine) has to print "Hello " and another one "World!\n", with the two active entities synchronizing with each other so that the output always is "Hello World!\n". Here is the concurrent Hello World program in Go, Erlang, and in C++ using the Mindroid framework.

Go
package main

import "fmt"

func main() {
    sayHello := make(chan string)
    sayWorld := make(chan string)
    quitter := make(chan string)

    go func() {
        for i := 0; i < 1000; i++ {
            fmt.Print("Hello ")
            sayWorld <- "Do it"
            <-sayHello
        }
        sayWorld <- "Quit"
    }()

    go func() {
        for {
            reply := <-sayWorld
            if reply == "Quit" {
                break
            }
            fmt.Print("World!\n")
            sayHello <- "Do it"
        }
        quitter <- "Done"
    }()
 
    <-quitter
}

Erlang
-module(hello_world).

-export([start/0, say_hello/2, say_world/0]).

say_hello(0, WorldPid) ->
    WorldPid ! quit;

say_hello(N, WorldPid) ->
    io:format("Hello ", []),
    WorldPid ! {sayWorld, self()},
    receive
        sayHello ->
            say_hello(N - 1, WorldPid)
    end.

say_world() ->
    receive
        quit ->
            quit;
        {sayWorld, HelloPid} ->
            io:format("World!~n", []),
            HelloPid ! sayHello,
            say_world()
    end.

start() ->
    WorldPid = spawn(hello_world, say_world, []),
    spawn(hello_world, say_hello, [1000, WorldPid]).

C++ (Mindroid)
#include <stdio.h>
#include "mindroid/os/Message.h"
#include "mindroid/os/Handler.h"
#include "mindroid/os/LooperThread.h"

using namespace mindroid;

static const int SAY_HELLO = 0;
static const int SAY_WORLD = 1;
static const int QUIT = 2;

class HelloHandler : public Handler
{
public:
  HelloHandler(const sp<Handler>& worldHandler) : 
    mWorldHandler(worldHandler), mCounter(0) {}

  virtual void handleMessage(const sp<Message>& msg) {
    switch (msg->what) {
      case SAY_HELLO:
        mCounter++;
        if (mCounter <= 1000) {
          printf("Hello ");
          sp<Message> sayWorld =
            mWorldHandler->obtainMessage(SAY_WORLD);
          sayWorld->metaData()->putObject("SayHello",
            obtainMessage(SAY_HELLO));
          sayWorld->sendToTarget();
        } else {
          mWorldHandler->obtainMessage(QUIT)
            ->sendToTarget();
          Looper::myLooper()->quit();
        }
        break;
    }
  }

private:
  sp<Handler> mWorldHandler;
  int mCounter;
};

class WorldHandler : public Handler
{
public:
  virtual void handleMessage(const sp<Message>& msg) {
    switch (msg->what) {
      case SAY_WORLD:
        printf("World!\n");
        msg->metaData()
          ->getObject<Message>("SayHello")
          ->sendToTarget();
        break;
      case QUIT:
        Looper::myLooper()->quit();
        break;
    }
  }
};

int main() {
  sp< LooperThread<WorldHandler> > wlt =
    new LooperThread<WorldHandler>();
  wlt->start();

  Looper::prepare();
  sp<Handler> hh = new HelloHandler(wlt->getHandler());
  hh->obtainMessage(SAY_HELLO)->sendToTarget();
  Looper::loop();

  return 0;
}

I think I really like Go (although Erlang is a bit more crafty :-)). Go eases the development of scalable and reliable distributed systems while remaining a very readable language. Since one machine can handle many more concurrent activities using Goroutines (compared to plain operating system threads), it even helps to save energy.

Thursday, June 21, 2012

Android Transporter on the Raspberry Pi

The Android Transporter, which we have developed at E.S.R.Labs, now also runs on the Raspberry Pi. It allows you to easily share and display the contents of an Android gadget wirelessly on a television set or a projector.

The demo video shows how the Android Transporter, together with the Raspberry Pi, turns your smartphone or tablet into a powerful media hub to watch movies, play games, or do slideshows.
This way, the Raspberry Pi becomes the cheapest gaming console or set-top box :-).

The Android Transporter for the Raspberry Pi is based on my Android messaging and concurrency (a.k.a. Mindroid) C++ library. The source code of this library is available on GitHub.

Tuesday, May 8, 2012

Android Transporter WiFi Display



The video above shows the Android Transporter, a realtime WiFi display for Android which some colleagues and I implemented during the last few days. We have done this for our new software startup called E.S.R.Labs, which is short for Embedded Software Research Labs.
The WiFi display mirror mode beams the display contents of your smartphone or tablet onto other gadgets, TVs, projectors, and other devices. Soon there will also be a dual-screen mode with an accompanying SDK.
I think WiFi displays will find a firm place in interconnecting devices to make the world around you a bit smarter :-).

Saturday, March 31, 2012

Android's C++ reference counting

#include <stdio.h>
#include "android/os/Ref.h"

using namespace android::os;

class RefTest : public Ref {
public:
    RefTest(int32_t id) : mID(id) {
        printf("RefTest ctor: %d\n", mID);
    }

    virtual ~RefTest() {
        printf("RefTest dtor: %d\n", mID);
    }

    int32_t id() const {
        return mID;
    }

private:
    int32_t mID;
};

int main() {
    sp<RefTest> ref1 = new RefTest(1);

    {
    sp<RefTest> ref2 = new RefTest(2);
    }

    wp<RefTest> ref3 = ref1;
    sp<RefTest> ref4 = ref3.toStrongRef();
    if (ref4 == NULL) {
        printf("RefTest object is destroyed\n");
    } else {
        printf("RefTest object %d is still around\n",
            ref4->id());
    }
    ref4 = NULL;

    ref1 = NULL;
    ref4 = ref3.toStrongRef();
    if (ref4 == NULL) {
        printf("RefTest object is destroyed\n");
    } else {
        printf("RefTest object %d is still around\n",
            ref4->id());
    }

    return 0;
}

To use this reference counting mechanism in (non-Android) C++ projects, check out the Ref class of my native Android messaging and concurrency framework. It contains Google's C++ reference counting class from the Android project, along with some fixes and refactorings for code readability.

To get started, you inherit virtually and publicly from the Ref base class, so that each object of a class deriving from Ref contains the reference counters and the management methods needed to increment and decrement them.
Afterwards you are able to create instances of types like RefTest, as in the example above, and assign them to strong (smart) pointers of type sp<RefTest>. There is also a wp<> template class for weak pointers when you have to break cycles of strong references.

Each Ref object has a WeakRefImpl object that keeps two reference counters, one for weak references and one for strong references. Whenever the strong reference counter is incremented or decremented, the weak reference counter is incremented or decremented as well. But when you create a wp<> to some object, only the object's weak reference counter is incremented. Therefore, a managed object lives as long as there is some strong pointer referring to it. However, the managed object's WeakRefImpl object may live longer than the object itself, because it lives as long as there is some weak pointer referring to the object. If the object itself is already destroyed, the weak pointer's toStrongRef method returns NULL to indicate that the object is no longer available.

The Ref base class is thread-safe without needing expensive locks; it synchronizes all access to the reference counters using atomic operations. The sp<> and wp<> template classes themselves are not thread-safe, so always make sure to copy the strong and weak pointers when you share managed objects between different threads.

The C++0x shared_ptr class is similar to Android's C++ reference counting class, but there are also some differences. E.g., while C++0x keeps the reference counter outside of the managed object in the shared_ptr class, Android's reference counting mechanism keeps the counter in the object itself (intrusive reference counting).

Saturday, February 4, 2012

The Erlang Design Pattern

Over the last couple of weeks I did an OO programming experiment that I call the Erlang design pattern. It is based on the Actor model but goes a few steps further. At its core, just like in the Actor model, there are active entities (objects) that have a thread and a message queue, with the thread waiting on the message queue for something to do.

The Erlang design pattern extends the Actor model by first dividing the software program into active objects (actors, which have their own thread) and passive objects (whose methods are called by actors).
Second, I enforced that all methods of an active or passive object are always called from the same thread context (i.e. by the same actor). So for each object there is no state shared between different thread contexts anymore. Two different objects of the same class can, of course, be used by two different actors.
Third, if two or more actors want to share data they have to do so explicitly via message passing. To avoid a lot of copying, I shared data using smart pointers in C++ and references in Java and C#. When one actor sends such a data structure to another actor, it drops its own reference to that data structure to make sure the data is only owned by one actor at a time (see also Microsoft Singularity's exchange heap), as sketched below.
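
To make the handover idea concrete, here is a minimal sketch in plain Java; the BlockingQueue merely stands in for the Handler/Looper machinery of the frameworks mentioned below, and all names are made up for this example:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class OwnershipTransfer {
    // The message queue stands in for an actor's Handler/Looper.
    private static final BlockingQueue<List<Integer>> sQueue =
        new LinkedBlockingQueue<List<Integer>>();

    public static void main(String[] args) throws InterruptedException {
        Thread consumer = new Thread(() -> {
            try {
                // The receiving actor is now the sole owner of the data.
                List<Integer> data = sQueue.take();
                System.out.println("Received " + data.size() + " items");
            } catch (InterruptedException ignored) {
            }
        });
        consumer.start();

        List<Integer> data = new ArrayList<Integer>();
        data.add(42);
        sQueue.put(data);
        data = null; // Hand over ownership: the sender drops its reference.

        consumer.join();
    }
}

Since Java cannot statically enforce the handover, nulling the sender's reference is merely a convention here - unlike Singularity's exchange heap, where ownership transfer is checked.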

Building software systems using the Erlang design pattern was very interesting because it helped to write concurrent, modular, scalable, and readable code :-). Maybe you will try it at your next code retreat.
Of course there were also some quirks. E.g., I couldn't use my C++ closures (which I really like) anymore, because they delegate methods of an object from one thread context to another.
To sum up, the Erlang design pattern was an interesting experiment which I will most probably not use every day in its pure form. But it helps to better understand how ideas from functional programming result in modular, scalable, and readable (OO) software systems.

For C++, I used my Android messaging and concurrency (a.k.a. Mindroid) framework to play around with the Erlang design pattern and for Java I just used the Android SDK.

Of Joe Armstrong's Erlang design principles below, the Actor model "solves" the first and third point. The Erlang design pattern also "solves" the second point and goes a step further on the third one.
  • The world is concurrent.
  • Things in the world don't share data.
  • Things communicate with messages.
  • Things fail.
To achieve the same degree of concurrency as Erlang does with its lightweight processes, one definitely needs to build one's own virtual machine, OS thread abstractions, and libraries.
But for smartphone and tablet software development one will most probably not need that degree of concurrency and can stick with OS threads :-).

Sunday, January 1, 2012

How do you read source code?

If software is eating the world, as Marc Andreessen and I think, how do you [r]ea(d|t) source code?
Well, let's first answer why you should be good at reading source code at all.
First, it's always great fun to figure out how things work. By reading source code one is doing exactly that, learning about interesting software systems and projects.
Another reason for reading source code may be to get better (and faster) at reading and writing software by learning from others, and sometimes also from their mistakes.
If you join a new software company or an open source project, you are probably going to work on a huge existing codebase, and therefore you should be able to get into it quickly, e.g. to implement tests and features or to fix bugs.
The primary goal of reading source code is always to be able to think and reason about all aspects of a software system's codebase. In this article I put together some advice and patterns for reading source code which have made my life as a software engineer a lot easier :-).

Now the big question is: How do you read source code?
Before you begin to dive deep into the source code of a software project you should make sure to have enough domain specific knowledge to understand the particular piece of software. Hence, you should start to get the big picture by reading documentation and computer science literature about the software platform/product or field of computer science (e.g. Windows apps, Mac OS X and iOS apps, Android apps, operating systems, computer networks, browsers, search engines, databases, etc.).
You don't have to know everything about the topic, but you have to understand the core abstractions and the basic building blocks of the software platform/product. E.g. you should know what processes, threads, semaphores, etc. are before you write your own scheduling algorithm for Linux (see Modern Operating Systems by Andrew S. Tanenbaum). You should also know about Linux specific process management before doing this (see Linux Kernel Development by Robert Love and Linux Kernel Architecture by Wolfgang Mauerer).
But most probably you have already done this before investigating a particular piece of software. So let's get started...

For starters, all software systems or at least all subsystems of huge software systems have some basic building blocks and core abstractions that you will notice all over the place. These components (e.g. classes, modules, actors, data structures, etc.) are also known as hubs. The hubs are simultaneously part of various aspects or subsystems of the whole codebase. Therefore the hubs interlink the subsystems and yet make huge codebases look like small worlds.
Hubs form the contexts around which software engineers build the software architecture. They also implement quite a lot of the core features and functionality. As software systems grow, more and more other components will depend on the hubs. Therefore look for the hubs first and learn about their responsibilities. Usually even huge software systems only have a relatively small number of hubs. Hence, you don't have to fear millions of lines of source code because the hubs will guide you through the codebase. E.g. if we take a look at Google's Android OS, I would say that the following classes (active objects and processes) are the hubs: Zygote, ActivityManagerService, WindowManagerService, PackageManagerService, ConnectivityService and the SurfaceFlinger. You see, just 6 components :-).
You can also repeat the game at a smaller scale, e.g. for Android's widget framework where the View, ViewGroup and ViewRoot classes are the hubs upon which a lot of other UI components build.
This reductionist approach also works for other software systems such as operating systems, filesystems, networking stacks, web backend platforms, etc.
For more details on hubs and network theory I suggest Albert-Laszlo Barabasi's book Linked.

Next, after identifying the hubs you should try to understand the interaction patterns between the hubs. The interactions may rely on different mechanisms like pure API calls or message passing (e.g. message queues or IPC calls). To get the idea of how hubs depend on each other I suggest to just draw some pictures of the hubs, their dependencies and their interactions.
As an example just take a look at one of my previous blog posts about Android architecture patterns.
On the 7th slide there is a picture about how Android starts activities, services and content providers within their own Linux OS processes. It does so by several interactions between the ActivityManagerService, the Zygote process and the app's process.

As you see, getting the big picture is done by identifying the hubs and understanding their interactions using a top-down approach. To dig deep into specific parts or aspects of a software system, we have to change our source code reading patterns and switch to a bottom-up approach to inspect modules, classes, data structures, methods, functions, etc. Later we are able to combine both code reading approaches. This strategy of summarizing the findings of the top-down and the bottom-up approach is called downward causation.

I think the bottom-up approach works best by starting with the active entities (threads, actors, processes) that breathe life into the hubs. This is because to be able to think and reason about some piece of source code you really need to understand the environment in which the hubs run.
So always make sure which active entities run which parts of a system's source code and try to understand how and when they interact with each other. This will help you to achieve the primary goal of reading source code, that is to be able to think and reason about all aspects of a software system's codebase (solely with your brain and without the help of external tools like a debugger :-)).
Getting into the details of some piece of source code always starts with trying things out. I do that by adding some logging code or by making assumptions about the code's behavior which I then verify with tests. Another method is to modify the source code just to check how it behaves under the new circumstances. Breaking or damaging the code may also help you to learn about it ;-).
While reading source code always ask yourself: "How does it work?" and "Why have the developers done it that way?". This will most probably cause you some sleepless nights but it will also make you a better software engineer and software architect.
Everything you do to get better at thinking and reasoning about the source code will help you to develop stronger debugging and analytical skills which in turn enable you to implement new features, fix bugs or do refactorings.

By thinking and reflecting on the source code you are reading, you will learn a lot about how to write software systems and platforms. Besides, from bad software you will also learn what to avoid when developing software systems.
Furthermore, there are two great resources about how to write great source code and software systems: Rich Hickey's talk "Simple Made Easy" at InfoQ and Erlang's programming rules and conventions. These two guides are outstanding no matter which programming language you use.
So, reading code really is fun. Maybe next time, instead of reading another software engineering book, just read some source code. (GitHub is really great for that.)

Since you need some staying power to get into a huge codebase, I suggest picking a software project that provides some fun and purpose along the way :-). Maybe the list below contains an interesting software project for you...

Software projects

Saturday, November 5, 2011

Apple's Siri and the Semantic Web




Maybe it will not be Google that starts to build the semantic web, as I thought in one of my previous blog posts about "The Internet of Things", but Apple will do so with Siri.

Some interesting links:
Siri at semanticweb.com
Siri points the way to next level of semantic web apps
New research helps people to get engaged with the semantic web
Will Siri and her offspring bring the semantic web to life?
Siri - the do engine
Siri internals

Siri was born out of SRI’s CALO Project, the largest AI project in U.S. history:

Siri is “Search” as it should be: Ask a question, receive an answer or get things done.
...great, Siri is what I wanted Quickdroid to be :-).

Wednesday, November 2, 2011

Algorithms and Data Structures

Approaches and strategies:
The divide-and-conquer algorithm design paradigm divides a problem into a number of subproblems. It then conquers the subproblems by solving them recursively. Finally, it combines the solutions to the subproblems into a solution for the original problem.
This algorithm design method is used, e.g., by Quicksort and Mergesort.

Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same (self-similar) problem.
A classic example of recursion is the definition of the factorial function, given here in Java:
public static int factorial(int n) {
    if (n <= 1) {
        return 1;
    } else {
        return n * factorial(n - 1);
    }
}

Search algorithms:
Binary search finds the position of a specified value within a sorted array.
Worst case performance: O(log n)

Java source code:
public static int search(final char character,
  final char[] alphabet) {
  int leftIndex = 0;
  int rightIndex = alphabet.length - 1;        
 
  while (leftIndex <= rightIndex) {
    final int middleIndex = leftIndex + 
      ((rightIndex - leftIndex) / 2);
    if (alphabet[middleIndex] < character) {
      leftIndex = middleIndex + 1;
    } else if (alphabet[middleIndex] > character) {
      rightIndex = middleIndex - 1;
    } else {
      return middleIndex;
    }
  }
 
  return -1;
}

Sorting algorithms:
Quicksort is a divide-and-conquer sorting algorithm developed by Tony Hoare that, on average, makes O(n log n) comparisons to sort n items. In the worst case, it makes O(n^2) comparisons, though this behavior is rare. Quicksort is often faster in practice than other O(n log n) algorithms. Quicksort can be implemented as an in-place sort, requiring only O(log n) additional space on average. The recursion call stack has height n in the worst case and height log n in the best case.

Quicksort selects a pivot element from the n items to sort (here the pivot element is always the rightmost element of the pile). Then Quicksort partitions the n - 1 other elements into two piles: a low pile containing all elements that come before the pivot element and a high pile containing all elements that come after it. By doing this recursively for smaller and smaller piles, the whole pile gets sorted.

Java source code:
public static void quicksort(char[] string,
    int leftIndex, int rightIndex) {
    if (leftIndex < rightIndex) {
        int pivotIndex = partition(string,
            leftIndex, rightIndex);
        quicksort(string, leftIndex, pivotIndex - 1);
        quicksort(string, pivotIndex + 1, rightIndex);
    }
}
 
static int partition(char[] string, int leftIndex,
    int rightIndex) {
    int pivotIndex = rightIndex;
    // Divider index for the pivot element.
    int storageIndex = leftIndex;
  
    for (int i = leftIndex; i < rightIndex; i++) {
        if (string[i] < string[pivotIndex]) {
            swap(string, i, storageIndex);
            storageIndex++;
        }
    }
    swap(string, pivotIndex, storageIndex);
  
    return storageIndex;
}

static void swap(char[] string, int i, int j) {
    char c = string[i];
    string[i] = string[j];
    string[j] = c;
}

Heapsort is a comparison-based sorting algorithm. Although somewhat slower in practice on most machines than a well implemented Quicksort, it has the advantage of a more favorable worst-case O(n log n) runtime. Heapsort is an in-place algorithm, but is not a stable sort. The Java source code for Heapsort is available here.

Introsort or introspective sort is a sorting algorithm designed by David Musser in 1997. It begins with Quicksort and switches to Heapsort when the recursion depth exceeds a level based on (the logarithm of) the number of elements being sorted. It is the best of both worlds, with a worst-case O(n log n) runtime and a practical performance comparable to Quicksort on typical data sets. Introsort is used by the GCC C++ STL and the SGI C++ STL.
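
To illustrate the idea (just a sketch under my own assumptions, not Musser's actual implementation), here is a depth-limited Quicksort that falls back to a heap-based sort; java.util.PriorityQueue stands in for a hand-written Heapsort:

import java.util.PriorityQueue;

public class Introsort {
    public static void introsort(char[] a) {
        if (a.length < 2) {
            return;
        }
        // Depth limit of roughly 2 * log2(n), as in Musser's paper.
        int depthLimit = 2 * (int) (Math.log(a.length) / Math.log(2));
        introsort(a, 0, a.length - 1, depthLimit);
    }

    static void introsort(char[] a, int lo, int hi, int depth) {
        if (lo >= hi) {
            return;
        }
        if (depth == 0) {
            heapsort(a, lo, hi); // Recursion too deep: fall back.
            return;
        }
        int p = partition(a, lo, hi); // Same scheme as the Quicksort above.
        introsort(a, lo, p - 1, depth - 1);
        introsort(a, p + 1, hi, depth - 1);
    }

    // A heap-based sort of a[lo..hi]; PriorityQueue stands in
    // for a hand-written binary heap.
    static void heapsort(char[] a, int lo, int hi) {
        PriorityQueue<Character> heap = new PriorityQueue<Character>();
        for (int i = lo; i <= hi; i++) {
            heap.add(a[i]);
        }
        for (int i = lo; i <= hi; i++) {
            a[i] = heap.poll();
        }
    }

    static int partition(char[] a, int lo, int hi) {
        int storageIndex = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < a[hi]) {
                swap(a, i, storageIndex++);
            }
        }
        swap(a, hi, storageIndex);
        return storageIndex;
    }

    static void swap(char[] a, int i, int j) {
        char c = a[i];
        a[i] = a[j];
        a[j] = c;
    }
}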

Mergesort is an O(n log n) divide-and-conquer sorting algorithm. Most implementations produce a stable sort, meaning that the implementation preserves the input order of equal elements in the sorted output. Mergesort was invented by John von Neumann in 1945.
Mergesort does not sort in place, but it has the advantage that it can be distributed across multiple machines to sort parts of really huge data sets. Therefore Mergesort is used a lot by web search engines. Mergesort is a typical recursive algorithm that reduces large problems into smaller ones. The recursion call stack always has height log n.

Java source code:
import java.util.LinkedList;
import java.util.Queue;

public static void mergesort(char[] string,
  int leftIndex, int rightIndex) {
  if (leftIndex < rightIndex) {
    int middleIndex = (leftIndex + rightIndex) / 2;
    mergesort(string, leftIndex, middleIndex);
    mergesort(string, middleIndex + 1, rightIndex);
    merge(string, leftIndex, middleIndex,
      rightIndex);
  }
}

static void merge(char[] string, int leftIndex,
  int middleIndex, int rightIndex) {
  Queue<Character> string1 = 
    new LinkedList<Character>();
  Queue<Character> string2 = 
    new LinkedList<Character>();
 
  for (int i=leftIndex; i<=middleIndex; i++) {
    string1.add(string[i]);
  }  
  for (int i=middleIndex+1; i<=rightIndex; i++) {
    string2.add(string[i]);
  }
 
  int i = leftIndex;
  while (!string1.isEmpty() && !string2.isEmpty()) {
    if (string1.peek() <= string2.peek()) {
      string[i++] = string1.poll();
    } else {
      string[i++] = string2.poll();
    }
  }
 
  while (!string1.isEmpty()) {
    string[i++] = string1.poll();
  }
  while (!string2.isEmpty()) {
    string[i++] = string2.poll();
  }
}

Random numbers
A simple chaotic random number generator is the logistic map:
x(t+1) = R * x(t) * (1 - x(t))
With the initial values R = 4.0 and x(0) = 0.2 you get a chaotic trajectory; the values of x(t+1) are chaotic and therefore resemble random numbers.
Dynamical systems theory characterizes the behavior of the above equation as a chaotic attractor.
For more details see chapter 2 of "Complexity: A Guided Tour" by Melanie Mitchell.
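
A few lines of Java are enough to reproduce such a trajectory with the parameters given above:

public class LogisticMap {
    public static void main(String[] args) {
        final double R = 4.0;
        double x = 0.2; // x(0)
        for (int t = 0; t <= 10; t++) {
            System.out.printf("x(%d) = %f%n", t, x);
            x = R * x * (1 - x); // x(t+1) = R * x(t) * (1 - x(t))
        }
    }
}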

Data Structures
Array
Indexing performance: O(1)
Adding or deleting a value at a given index: O(1) (inserting or removing with shifting of elements costs O(n))
Search performance (unsorted array): O(n)

Linked list
Indexing performance: O(n)
Adding or deleting items: O(1)
Search performance: O(n)

Java source code (singly-linked list):
public class LinkedList<T> {
    // Appends an item at the tail of the list.
    public void put(T item) {    
        Node node = new Node(item);
        Node curNode = mHeadNode;
        if (curNode == null) {
            node.nextNode = curNode;
            mHeadNode = node;             
        } else {
            Node prevNode = null;
            while (curNode != null) {
                prevNode = curNode;
                curNode = curNode.nextNode;
            }
            node.nextNode = prevNode.nextNode;
            prevNode.nextNode = node;            
        }     
    }

    // Removes and returns the head item, or null if empty.
    public T take() {
        Node node = getHeadNode();
        if (node != null) {
            T item = node.item;                    
            return item;
        } else {
            return null;
        }  
    }
    
    class Node
    {
        public T item;
        public Node nextNode;
        public Node(T t) { item = t; }
    };

    Node getHeadNode() {
        Node node = mHeadNode;
        if (node != null) {
            mHeadNode = node.nextNode;
            return node;
        }
        return null;
    }

    private Node mHeadNode;
}

Stacks (LIFO) and queues (FIFO)
Stacks support retrieval of data items by last-in, first-out (LIFO) order.
Queues support retrieval of data items in first-in, first out (FIFO) order.
Both stack and queue implementations are normally based on arrays or linked lists.
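
For example (a small sketch), java.util.ArrayDeque can serve as both a stack and a queue:

import java.util.ArrayDeque;
import java.util.Deque;

public class StackAndQueue {
    public static void main(String[] args) {
        // Stack (LIFO): push and pop at the same end.
        Deque<Integer> stack = new ArrayDeque<Integer>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // 2

        // Queue (FIFO): add at the tail, remove at the head.
        Deque<Integer> queue = new ArrayDeque<Integer>();
        queue.addLast(1);
        queue.addLast(2);
        System.out.println(queue.pollFirst()); // 1
    }
}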

Hashtable / Dictionary
In computer science, a hash table or dictionary is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought. For further details also see hashing with chaining, hashing with open addressing, and hash functions.
The best-case performance for adding or deleting items into or from hash tables is O(1); the search (indexing) performance is also O(1). You get the best-case performance, e.g., if the hash table is implemented using an array and there are no collisions when inserting items.
If there are collisions, the performance depends on the data structure used for managing a bucket's data items. E.g., if a self-balancing tree is used to manage them, you get a worst-case performance of O(log n) for adding, deleting, and searching items.
For more details about which underlying data structure should be used to implement hash tables see Skiena's "The Algorithm Design Manual" page 368.

Java source code (simple Hashtable with open addressing):
public class Hashtable {
    private static final int SIZE = 16;
    private Object mKeys[] = new Object[SIZE];
    private Object mObjects[] = new Object[SIZE];
 
    public void add(Object key, Object object)
        throws Exception {
        int i = 0;
        int index;
        do {
            index = hashCode(key, i);
            if (mKeys[index] == null) {
                mKeys[index] = key;
                mObjects[index] = object;
                return;
            } else {
                i++;
            }            
        } while (i < SIZE); 
        throw new Exception("hash table overflow");
    }
 
    public Object get(Object key) {
        int i = 0;
        int index;
        do {
            index = hashCode(key, i);
            if (mKeys[index] == null) {
                // Empty slot: the key is not present.
                return null;
            }
            if (mKeys[index].equals(key)) {
                return mObjects[index];
            }
            i++;
        } while (i < SIZE); 
        return null;
    }
 
    int hashCode(Object key, int i) {
        // Double hashing: the probe sequence is derived
        // from two hash codes.
        int k = Math.abs(key.hashCode());
        int hashCode1 = k % 701;
        int hashCode2 = 1 + (k % 700);
        int hashCode =
            (hashCode1 + i * hashCode2) % 701;
        return hashCode % SIZE;
    }
}

Binary search tree
A binary search tree is a node-based binary tree data structure which has the following properties:
  • The left subtree of a node contains only nodes with keys less than the node's key.
  • The right subtree of a node contains only nodes with keys greater than the node's key.
  • Both the left and right subtrees must also be binary search trees.
Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.
The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient.
Binary search trees are a fundamental data structure used to construct more abstract data structures such as sets, multisets, and associative arrays.

Adding or deleting items: O(h), where h is the height of the tree.
Search performance: O(h), where h is the height of the tree.

Java source code (see also "Introduction to Algorithms", chapter 12):
public class BinarySearchTree {
    class Node {
        public int mKey;
        public Object mObject;
        public Node mParentNode;
        public Node mLeftNode;
        public Node mRightNode;
    }
 
    private Node mRootNode;
 
    // Recursive add method.
    public void add(Node node, Node parentNode,
        int key, Object object) {
        if (node == null) {
            Node newNode = new Node();
            newNode.mKey = key;
            newNode.mObject = object;
            newNode.mParentNode = parentNode;
            if (parentNode != null) {
                if (key < parentNode.mKey) {
                    parentNode.mLeftNode = newNode;
                } else {
                    parentNode.mRightNode = newNode;
                }
            } else {
                mRootNode = newNode;
            }
            return;
        }
  
        if (key < node.mKey) {
            add(node.mLeftNode, node, key, object);
        } else {
            add(node.mRightNode, node, key, object);
        }
    }
 
    public void add(int key, Object object) {
        add(mRootNode, null, key, object);
    }
 
    // Iterative add method.
    public void add2(Node node, int key,
        Object object) {
        Node prevNode = null;  
        while (node != null) {
            prevNode = node;
            if (key < node.mKey) {
                node = node.mLeftNode;
            } else {
                node = node.mRightNode;
            }
        }
        Node newNode = new Node();
        newNode.mKey = key;
        newNode.mObject = object;
        newNode.mParentNode = prevNode;
        if (prevNode == null) {
            mRootNode = newNode;
        } else {
            if (key < prevNode.mKey) {
                prevNode.mLeftNode = newNode;
            } else {
                prevNode.mRightNode = newNode;
            }
        }
    }
 
    public void add2(int key, Object object) {
        add2(mRootNode, key, object);
    } 
 
    // Recursive search method.
    public Object search(Node node, int key) {
        if(node == null) {
            return null;
        }
        if (node.mKey == key) {
            return node.mObject;
        }
  
        if (key < node.mKey) {
            return search(node.mLeftNode, key);
        } else {
            return search(node.mRightNode, key);
        }
    }
 
    public Object search(int key) {
        return search(mRootNode, key);
    }
 
    // Iterative search method.
    public Object search2(Node node, int key) {
        if(node == null) {
            return null;
        }
  
        while (node != null && node.mKey != key) {
            if (key < node.mKey) {
                node = node.mLeftNode;
            } else {
                node = node.mRightNode;
            }
        }
        if (node != null) {
            return node.mObject;
        } else {
            return null;
        }
    }
 
    public Object search2(int key) {
        return search2(mRootNode, key);
    }
 
    // Inorder walk over the tree.
    String printBST(Node node) {
        String string = "";
        if (node != null) {
            string += printBST(node.mLeftNode);
            string += node.mObject + ", ";
            string += printBST(node.mRightNode);
        }
        return string;
    }
 
    public String toString() {
        return printBST(mRootNode);
    }
}

Red-black tree is a type of self-balancing binary search tree, a data structure that is typically used to implement associative arrays. The original structure was invented in 1972 by Rudolf Bayer and named "symmetric binary B-tree," but acquired its modern name in a paper in 1978 by Leonidas J. Guibas and Robert Sedgewick. It is complex, but has good worst-case running time for its operations and is efficient in practice: it can search, insert, and delete in O(log n) time, where n is the total number of elements in the tree. Put very simply, a red–black tree is a binary search tree that inserts and deletes in such a way that the tree is always reasonably balanced. For further details on red-black trees I suggest chapter 13 of the "Introduction to Algorithms" book.

AVL tree is a self-balancing binary search tree, and it was the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node differ by at most one. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.
The AVL tree is named after its two Soviet inventors, G.M. Adelson-Velskii and E.M. Landis, who published it in their 1962 paper "An algorithm for the organization of information."
AVL trees are often compared with red-black trees because they support the same set of operations and because red-black trees also take O(log n) time for the basic operations. Because AVL trees are more rigidly balanced, they are faster than red-black trees for lookup intensive applications. However, red-black trees are faster for insertion and removal.
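
To give an idea of what such a tree rotation does, here is a minimal sketch (my own, not a full AVL or red-black implementation); a left rotation lifts a node's right child into its place while preserving the in-order sequence of keys:

public class Rotation {
    static class Node {
        int key;
        Node left;
        Node right;
        Node(int key) { this.key = key; }
    }

    // Left rotation around node x (assumes x has a right child):
    // x's right child y takes x's place and x becomes y's left
    // child. The in-order sequence of keys stays the same.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        return y; // y is the new subtree root.
    }
}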

B-tree
A B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time [O(log n)]. The B-tree is a generalization of a binary search tree in that a node can have more than two children. Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. It is commonly used in databases and filesystems.
In B-trees, nodes can have a variable number of keys (elements) and children. The keys of a node are stored in non-decreasing order. Each node either is a leaf node or it has some associated children that are the root nodes of subtrees. The left child node of a node's element contains all nodes (elements) with keys less than or equal to the node element's key but greater than the preceding node element's key. When data is inserted to or removed from a node, its number of keys (elements) or child nodes changes. In order to maintain the pre-defined range, nodes may be joined or split. Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full. The lower and upper bounds on the number of child nodes are typically fixed for a particular implementation.
Each node of a B-tree contains a number of keys. Usually, the number of keys is chosen to vary between t-1 and 2t-1. In practice, the keys (elements) take up most of the space in a node. The factor of 2 guarantees that nodes can be split or combined: if a node has 2t-1 keys, then adding a key to that node can be accomplished by splitting the 2t-1 key node into two t-1 key nodes and adding the median (middle) key of the original node to the parent node. Each of the two split nodes has the required minimum number of keys.
When a new element is added to a B-tree, it is always inserted into a leaf node, while the median key of the original node is shifted up into the parent node whenever the original node is already full.
A B-tree is kept balanced by requiring that all leaf nodes are at the same depth. This depth increases slowly as elements are added to the tree, but an increase in the overall depth is infrequent and results in all leaf nodes being one more node farther away from the root.
B-trees have substantial advantages over alternative implementations when node access times far exceed access times within nodes, because then the cost of accessing the node may be amortized over multiple operations within the node. This usually occurs when the nodes are in secondary storage such as disk drives. By maximizing the number of child nodes within each internal node, the height of the tree decreases and the number of expensive node accesses is reduced. In addition, rebalancing the tree occurs less often. The maximum number of child nodes depends on the information that must be stored for each child node and the size of a full disk block or an analogous size in secondary storage. Practical B-trees using secondary storage want a large number of child nodes to improve performance.

Java source code (including delete) (see also "Introduction to Algorithms", chapter 18):
public class BTree {
  class Node {
    public int mNumKeys = 0;
    public int[] mKeys = new int[2*T-1];
    public Object[] mObjects = new Object[2*T-1];
    public Node[] mChildNodes = new Node[2*T];
    public boolean mIsLeafNode;
  }

  private Node mRootNode;
  private static final int T = 4;

  public BTree() {
    mRootNode = new Node();
    mRootNode.mIsLeafNode = true;
  }

  public void add(int key, Object object) {
    Node rootNode = mRootNode;
    if (rootNode.mNumKeys == (2 * T - 1)) {
      Node newRootNode = new Node();
      mRootNode = newRootNode;
      newRootNode.mIsLeafNode = false;
      mRootNode.mChildNodes[0] = rootNode;
      // Split rootNode and move its median
      // key up into newRootNode.
      splitChildNode(newRootNode, 0, rootNode);
      // Insert the key into the B-Tree
      // with root newRootNode.
      insertIntoNonFullNode(newRootNode, key,
        object);
    } else {
      // Insert the key into the B-Tree
      // with root rootNode.
      insertIntoNonFullNode(rootNode, key, object);
    }
  }

  // Split the node, node, of a B-Tree into
  // two nodes that both contain T-1 elements
  // and move node's median key up
  // to the parentNode. This method will
  // only be called if node is full; node is the
  // i-th child of parentNode.
  void splitChildNode(Node parentNode, int i,
    Node node) {
    Node newNode = new Node();
    newNode.mIsLeafNode = node.mIsLeafNode;
    newNode.mNumKeys = T - 1;
    // Copy the last T-1 elements of node
    // into newNode.
    for (int j = 0; j < T - 1; j++) {
      newNode.mKeys[j] = node.mKeys[j + T];
      newNode.mObjects[j] = node.mObjects[j + T];
    }
    if (!newNode.mIsLeafNode) {
      // Copy the last T pointers of node
      // into newNode.
      for (int j = 0; j < T; j++) {
        newNode.mChildNodes[j] =
          node.mChildNodes[j + T];
      }
      for (int j = T; j <= node.mNumKeys; j++) {
        node.mChildNodes[j] = null;
      }
    }
    for (int j = T; j < node.mNumKeys; j++) {
      node.mKeys[j] = 0;
      node.mObjects[j] = null;
    }
    node.mNumKeys = T - 1;

    // Insert a (child) pointer to node newNode
    // into the parentNode, moving other keys
    // and pointers as necessary.
    for (int j = parentNode.mNumKeys; j >= i + 1;
      j--) {
      parentNode.mChildNodes[j + 1] =
        parentNode.mChildNodes[j];
    }
    parentNode.mChildNodes[i + 1] = newNode;
    for (int j = parentNode.mNumKeys - 1; j >= i;
        j--) {
      parentNode.mKeys[j + 1] =
        parentNode.mKeys[j];
      parentNode.mObjects[j + 1] =
        parentNode.mObjects[j];
    }
    parentNode.mKeys[i] = node.mKeys[T - 1];
    parentNode.mObjects[i] = node.mObjects[T - 1];
    node.mKeys[T - 1] = 0;
    node.mObjects[T - 1] = null;
    parentNode.mNumKeys++;
  }

  // Insert an element into a B-Tree. (The element
  // will ultimately be inserted into a leaf node).
  void insertIntoNonFullNode(Node node, int key,
    Object object) {
    int i = node.mNumKeys - 1;
    if (node.mIsLeafNode) {
      // Since node is not a full node insert the
      // new element into its proper place
      // within node.
      while (i >= 0 && key < node.mKeys[i]) {
        node.mKeys[i + 1] = node.mKeys[i];
        node.mObjects[i + 1] = node.mObjects[i];
        i--;
      }
      i++;
      node.mKeys[i] = key;
      node.mObjects[i] = object;
      node.mNumKeys++;
    } else {
      // Move back from the last key of node until
      // we find the child pointer to the node
      // that is the root node of the subtree
      // where the new element should be placed.
      while (i >= 0 && key < node.mKeys[i]) {
        i--;
      }
      i++;
      if (node.mChildNodes[i].mNumKeys ==
        (2 * T - 1)) {
        splitChildNode(node, i,
          node.mChildNodes[i]);
        if (key > node.mKeys[i]) {
          i++;
        }
      }
      insertIntoNonFullNode(node.mChildNodes[i],
        key, object);
    }
  }

  // Recursive search method.
  public Object search(Node node, int key) {
    int i = 0;
    while (i < node.mNumKeys && key >
      node.mKeys[i]) {
      i++;
    }
    if (i < node.mNumKeys && key ==
      node.mKeys[i]) {
      return node.mObjects[i];
    }
    if (node.mIsLeafNode) {
      return null;
    } else {
      return search(node.mChildNodes[i], key);
    }
  }

  public Object search(int key) {
    return search(mRootNode, key);
  }

  // Iterative search method.
  public Object search2(Node node, int key) {
    while (node != null) {
      int i = 0;
      while (i < node.mNumKeys && key >
        node.mKeys[i]) {
        i++;
      }
      if (i < node.mNumKeys && key ==
        node.mKeys[i]) {
        return node.mObjects[i];
      }
      if (node.mIsLeafNode) {
        return null;
      } else {
        node = node.mChildNodes[i];
      }
    }
    return null;
  }

  public Object search2(int key) {
    return search2(mRootNode, key);
  }
}

B+ tree or B plus tree is a type of balanced tree which represents sorted data in a way that allows for efficient insertion, retrieval and removal of records, each of which is identified by a key. It is a dynamic, multilevel index, with maximum and minimum bounds on the number of keys in each index segment (usually called a "block" or "node"). In a B+ tree, in contrast to a B-tree, all records are stored at the leaf level of the tree; only keys are stored in interior nodes.
The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context—in particular, filesystems. This is primarily because unlike binary search trees, B+ trees have a very high fanout (typically on the order of 100 or more), which reduces the number of I/O operations required to find an element in the tree.
The NTFS, XFS, and JFS filesystems all use this type of tree for metadata indexing. Relational database management systems such as IBM DB2, Microsoft SQL Server, MySQL and SQLite support this type of tree for table indices. Key-value database management systems such as CouchDB support this type of tree for data access.
The leaves of the B+ tree are often linked to one another in a linked list; this makes range queries or an (ordered) iteration through the blocks simpler and more efficient.

Java source code (for the differences from the B-tree data structure take a look at the splitChildNode method):
public class BPlusTree {
  class Node {
    public int mNumKeys = 0;
    public int[] mKeys = new int[2*T-1];
    public Object[] mObjects = new Object[2*T-1];
    public Node[] mChildNodes = new Node[2*T];
    public boolean mIsLeafNode;
    public Node mNextNode;
  }

  private Node mRootNode;
  private static final int T = 4;

  public BPlusTree() {
    mRootNode = new Node();
    mRootNode.mIsLeafNode = true;
  }

  public void add(int key, Object object) {
    Node rootNode = mRootNode;
    if (rootNode.mNumKeys == (2 * T - 1)) {
      Node newRootNode = new Node();
      mRootNode = newRootNode;
      newRootNode.mIsLeafNode = false;
      mRootNode.mChildNodes[0] = rootNode;
      // Split rootNode and move its median
      // key up into newRootNode.
      splitChildNode(newRootNode, 0, rootNode);
      // Insert the key into the B+-Tree
      // with root newRootNode.
      insertIntoNonFullNode(newRootNode, key,
        object);
    } else {
      // Insert the key into the B+-Tree
      // with root rootNode.
      insertIntoNonFullNode(rootNode, key, object);
    }
  }

  // Split the full node node of a B+-Tree into
  // two nodes that contain T-1 and T elements
  // respectively, and move a copy of node's
  // median key up to the parentNode.
  // This method will only be called if node is
  // full; node is the i-th child of parentNode.
  // All internal keys (elements) have duplicates
  // within the leaf nodes.
  void splitChildNode(Node parentNode, int i,
    Node node) {
    Node newNode = new Node();
    newNode.mIsLeafNode = node.mIsLeafNode;
    newNode.mNumKeys = T;
    // Copy the last T elements of node
    // into newNode. Keep the median key
    // as duplicate in the first key of newNode.
    for (int j = 0; j < T; j++) {
      newNode.mKeys[j] = node.mKeys[j + T - 1];
      newNode.mObjects[j] = node.mObjects[j + T - 1];
    }
    if (!newNode.mIsLeafNode) {
      // Copy the last T + 1 pointers of node
      // into newNode.
      for (int j = 0; j < T + 1; j++) {
        newNode.mChildNodes[j] =
          node.mChildNodes[j + T - 1];
      }
      for (int j = T; j <= node.mNumKeys; j++) {
        node.mChildNodes[j] = null;
      }
    } else {
      // Manage the linked list that is used e.g.
      // for doing fast range queries.
      newNode.mNextNode = node.mNextNode;
      node.mNextNode = newNode;
    }
    for (int j = T - 1; j < node.mNumKeys; j++) {
      node.mKeys[j] = 0;
      node.mObjects[j] = null;
    }
    node.mNumKeys = T - 1;

    // Insert a (child) pointer to node newNode
    // into the parentNode, moving other keys
    // and pointers as necessary.
    for (int j = parentNode.mNumKeys; j >= i + 1;
      j--) {
      parentNode.mChildNodes[j + 1] =
        parentNode.mChildNodes[j];
    }
    parentNode.mChildNodes[i + 1] = newNode;
    for (int j = parentNode.mNumKeys - 1; j >= i;
        j--) {
      parentNode.mKeys[j + 1] =
        parentNode.mKeys[j];
      parentNode.mObjects[j + 1] =
        parentNode.mObjects[j];
    }
    parentNode.mKeys[i] = newNode.mKeys[0];
    parentNode.mObjects[i] = newNode.mObjects[0];
    parentNode.mNumKeys++;
  }

  // Insert an element into a B+-Tree. (The element
  // will ultimately be inserted into a leaf node).
  void insertIntoNonFullNode(Node node, int key,
    Object object) {
    int i = node.mNumKeys - 1;
    if (node.mIsLeafNode) {
      // Since node is not a full node insert the
      // new element into its proper place
      // within node.
      while (i >= 0 && key < node.mKeys[i]) {
        node.mKeys[i + 1] = node.mKeys[i];
        node.mObjects[i + 1] = node.mObjects[i];
        i--;
      }
      i++;
      node.mKeys[i] = key;
      node.mObjects[i] = object;
      node.mNumKeys++;
    } else {
      // Move back from the last key of node until
      // we find the child pointer to the node
      // that is the root node of the subtree
      // where the new element should be placed.
      while (i >= 0 && key < node.mKeys[i]) {
        i--;
      }
      i++;
      if (node.mChildNodes[i].mNumKeys ==
        (2 * T - 1)) {
        splitChildNode(node, i,
          node.mChildNodes[i]);
        if (key > node.mKeys[i]) {
          i++;
        }
      }
      insertIntoNonFullNode(node.mChildNodes[i],
        key, object);
    }
  }

  // Recursive search method.
  public Object search(Node node, int key) {
    int i = 0;
    while (i < node.mNumKeys && key >
      node.mKeys[i]) {
      i++;
    }
    if (i < node.mNumKeys && key ==
      node.mKeys[i]) {
      return node.mObjects[i];
    }
    if (node.mIsLeafNode) {
      return null;
    } else {
      return search(node.mChildNodes[i], key);
    }
  }

  public Object search(int key) {
    return search(mRootNode, key);
  }

  // Iterative search method.
  public Object search2(Node node, int key) {
    while (node != null) {
      int i = 0;
      while (i < node.mNumKeys && key >
        node.mKeys[i]) {
        i++;
      }
      if (i < node.mNumKeys && key ==
        node.mKeys[i]) {
        return node.mObjects[i];
      }
      if (node.mIsLeafNode) {
        return null;
      } else {
        node = node.mChildNodes[i];
      }
    }
    return null;
  }

  public Object search2(int key) {
    return search2(mRootNode, key);
  }

  // In-order walk over the tree via the
  // linked list of leaf nodes.
  public String toString() {
    String string = "";
    Node node = mRootNode;  
    while (!node.mIsLeafNode) {   
      node = node.mChildNodes[0];
    }  
    while (node != null) {
      for (int i = 0; i < node.mNumKeys; i++) {
        string += node.mObjects[i] + ", ";
      }
      node = node.mNextNode;
    }
    return string;
  }
 
  // In-order walk over parts of the tree, starting
  // at the leaf node that contains fromKey and
  // stopping after toKey has been printed. (The walk
  // begins at the start of that leaf, so a few keys
  // smaller than fromKey may be included.)
  public String toString(int fromKey, int toKey) {
    String string = "";
    Node node = getLeafNodeForKey(fromKey);
    while (node != null) {
      for (int j = 0; j < node.mNumKeys; j++) {
        string += node.mObjects[j] + ", ";
        if (node.mKeys[j] == toKey) {
          return string;
        }
      }
      node = node.mNextNode;
    }
    return string;
  }
 
  Node getLeafNodeForKey(int key) {
    Node node = mRootNode;
    while (node != null) {
      int i = 0;
      while (i < node.mNumKeys &&
        key > node.mKeys[i]) {
        i++;
      }
      if (i < node.mNumKeys &&
        key == node.mKeys[i]) {
        // If the match happens in a leaf node we are
        // done; otherwise descend to the leftmost
        // leaf of the right subtree (internal keys
        // are duplicated within the leaf nodes).
        if (node.mIsLeafNode) {
          return node;
        }
        node = node.mChildNodes[i + 1];
        while (!node.mIsLeafNode) {
          node = node.mChildNodes[0];
        }
        return node;
      }
      if (node.mIsLeafNode) {
        return null;
      } else {
        node = node.mChildNodes[i];
      }
    }
    return null;
  }
}
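
A quick usage sketch (my own example, not from the original listing) that inserts a few keys, searches one of them and runs a range query over the linked leaf nodes:
BPlusTree tree = new BPlusTree();
for (int key = 1; key <= 20; key++) {
  tree.add(key, "value" + key);
}
// Point query.
System.out.println(tree.search(7)); // value7
// Range query over the linked leaves. Note that the
// output may start a few keys before fromKey because
// the walk begins at the start of the leaf node that
// contains fromKey.
System.out.println(tree.toString(5, 9));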

The source code of the examples above can be found here in the blog repository.

A lot of the text has been taken from Wikipedia, the free encyclopedia.

Tuesday, November 1, 2011

Algorithms and Data Structures - Part 2

Topics: heaps, graphs, dynamic programming, greedy algorithms, and string matching algorithms.

Data Structures
A heap is a tree data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max-heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min-heap.) There is no restriction as to how many children each node has in a heap, although in practice each node has at most two; the common binary heap is a complete binary tree in which all levels are fully populated except possibly the last. The heap is one maximally-efficient implementation of an abstract data type called a priority queue, with a worst-case performance of O(log n) for adding and removing elements and O(1) for getting the maximum or minimum element. Heaps are crucial in several efficient graph algorithms such as Dijkstra's algorithm, and in the sorting algorithm Heapsort.
A heap data structure should not be confused with the heap which is a common name for dynamically allocated memory. The term was originally used only for the data structure. Some early popular languages such as LISP provided dynamic memory allocation using heap data structures, which gave the memory area its name.

The binary heap in the following Java source code organizes its tree as an array (see also "Introduction to Algorithms", chapter 6):
public class BinaryHeap {
    private Integer[] mArray;
    private int mHeapSize;
 
    public BinaryHeap(int maxSize) {
        mArray = new Integer[maxSize];
        mHeapSize = 0;
    }
 
    int parentIndex(int i) {
        return (i + 1) / 2 - 1;
    }
 
    int leftChildIndex(int i) {
        return 2 * i + 1;
    }
 
    int rightChildIndex(int i) {
        return 2 * i + 2;
    }
 
    // When maxHeapify is called, it is assumed that
    // the binary trees rooted at leftChildIndex(i)
    // and rightChildIndex(i) are max-heaps.
    // Worst-case performance: O(log n).
    void maxHeapify(int i) {
        int leftChildIndex = leftChildIndex(i);
        int rightChildIndex = rightChildIndex(i);
        int largestElementIndex;
        if (leftChildIndex < mHeapSize &&
            mArray[leftChildIndex] > mArray[i]) {
            largestElementIndex = leftChildIndex;
        } else {
            largestElementIndex = i;
        }
        if (rightChildIndex < mHeapSize &&
            mArray[rightChildIndex] >
            mArray[largestElementIndex]) {
            largestElementIndex = rightChildIndex;
        }
        if (largestElementIndex != i) {
            int tmpValue = mArray[i];
            mArray[i] = mArray[largestElementIndex];
            mArray[largestElementIndex] = tmpValue;
            maxHeapify(largestElementIndex);
        }
    }
 
    // Build a max-heap from an unordered array.
    // Assumes mHeapSize has been set to the number
    // of valid elements.
    // Worst-case performance: O(n).
    void buildMaxHeap() {
        for (int i = mHeapSize / 2 - 1; i >= 0; i--) {
            maxHeapify(i);
        }
    }
 
    public int max() {
        return mArray[0];
    }
 
    // Worst-case performance: O(log n).
    public int extractMax() {
        int max = mArray[0];
        mArray[0] = mArray[mHeapSize - 1];
        mArray[mHeapSize - 1] = null;
        mHeapSize--;
        maxHeapify(0);
        return max;
    }
 
    // Worst-case performance: O(log n).
    void increaseKey(int i, int newValue)
        throws Exception {
        if (newValue < mArray[i]) {
            throw new Exception("New value is " +
                "smaller than current value");
        }
        mArray[i] = newValue;
        while (i > 0 && mArray[parentIndex(i)] <
            mArray[i]) {
            int tmpValue = mArray[parentIndex(i)];
            mArray[parentIndex(i)] = mArray[i];
            mArray[i] = tmpValue;
            i = parentIndex(i);
        }  
    }
 
    // Worst-case performance: O(log n).
    public boolean insert(int value) {
        if (mHeapSize < mArray.length) {
            mHeapSize++;
            mArray[mHeapSize - 1] = value;
            try {
                increaseKey(mHeapSize - 1, value);
            } catch (Exception e) {
                return false;
            }
            return true;
        } else {
            return false;
        }
    }
 
    public int size() {
        return mHeapSize;
    }
}
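
A short usage sketch of the heap as a priority queue (my own example):
BinaryHeap heap = new BinaryHeap(16);
heap.insert(3);
heap.insert(17);
heap.insert(8);
System.out.println(heap.max());        // 17
System.out.println(heap.extractMax()); // 17
System.out.println(heap.extractMax()); // 8
System.out.println(heap.size());       // 1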

Graphs
A graph data structure consists of a finite (and possibly mutable) set of ordered pairs, called edges, arcs or links, of certain entities called nodes or vertices. As in mathematics, an edge (x,y) is said to point or go from x to y. The nodes may be part of the graph structure, or may be external entities represented by integer indices or references. A graph data structure may also associate to each edge some edge value, such as a symbolic label or a numeric attribute (cost, capacity, length, etc.).
Graph algorithms are a significant field of interest within computer science. Typical higher-level operations associated with graphs are finding a path between two nodes (like depth-first search and breadth-first search) and finding the shortest path from one node to another (like Dijkstra's algorithm).
Graphs are normally represented as adjacency lists or by an adjacency matrix.
In an adjacency-list representation (which I used in my examples), vertices are stored as records or objects, and every vertex stores a list of adjacent vertices. This data structure allows the storage of additional data on the vertices.
An adjacency matrix is a two-dimensional matrix, in which the rows represent source vertices and columns represent destination vertices. Data on edges and vertices must be stored externally. Only the cost for one edge can be stored between each pair of vertices.
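
The search methods below are shown without their surrounding class. The following is a minimal sketch (my own, not from the original post) of the adjacency-list classes they rely on; the names Node, NodeAttributes, mNodes, mTime and mTopologicalOrdering match their usage in those methods:
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Dictionary;
import java.util.Hashtable;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class Graph {
  public static class Node {
    private List<Node> mLinks = new ArrayList<Node>();

    public void addLink(Node node) {
      mLinks.add(node);
    }

    // The adjacent vertices of this vertex.
    public List<Node> getLinks() {
      return mLinks;
    }
  }

  // Per-node bookkeeping for BFS and DFS.
  public static class NodeAttributes {
    public static final int WHITE = 0; // undiscovered
    public static final int GRAY = 1;  // discovered
    public static final int BLACK = 2; // finished
    public int color = WHITE;
    public int distance;
    public int startTime;
    public int finishTime;
    public Node predecessor;
  }

  private List<Node> mNodes = new ArrayList<Node>();
  private int mTime;
  private LinkedList<Node> mTopologicalOrdering =
    new LinkedList<Node>();

  public void addNode(Node node) {
    mNodes.add(node);
  }

  public List<Node> getTopologicalOrdering() {
    return mTopologicalOrdering;
  }

  // The breadthFirstSearch, depthFirstSearch and
  // dfsVisit methods below belong in this class.
}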

Java source code for breadth-first search [O(|V| + |E|)] (see also "Introduction to Algorithms", chapter 22):
public Dictionary<Node, NodeAttributes>
  breadthFirstSearch(Node s) {
  Hashtable<Node, NodeAttributes> nodeAttributes =
    new Hashtable<Node, NodeAttributes>();
  
  for (Node u : mNodes) {   
    NodeAttributes attributes = new NodeAttributes();
    attributes.color = NodeAttributes.WHITE;
    attributes.distance = Integer.MAX_VALUE;
    attributes.predecessor = null;
    nodeAttributes.put(u, attributes);
  }
  NodeAttributes sAttributes = nodeAttributes.get(s);
  sAttributes.color = NodeAttributes.GRAY;
  sAttributes.distance = 0;
  sAttributes.predecessor = null;
  Queue<Node> queue = new ArrayDeque<Node>();
  queue.add(s);
  while (!queue.isEmpty()) {
    Node u = queue.poll();
    NodeAttributes uAttributes =
      nodeAttributes.get(u);
    for (Node v : u.getLinks()) {
      NodeAttributes vAttributes =
        nodeAttributes.get(v);
      if (vAttributes.color == NodeAttributes.WHITE) {
        vAttributes.color = NodeAttributes.GRAY;
        vAttributes.distance =
          uAttributes.distance + 1;
        vAttributes.predecessor = u;
        queue.add(v);
      }
    }
    uAttributes.color = NodeAttributes.BLACK;
  }
  
  return nodeAttributes;
}

Java source code for depth-first search [O(|V| + |E|)]. For more details see also "Introduction to Algorithms", chapter 22.
public Dictionary<Node, NodeAttributes>
  depthFirstSearch() {
  Hashtable<Node, NodeAttributes> nodeAttributes =
    new Hashtable<Node, NodeAttributes>();
  
  for (Node u : mNodes) {   
    NodeAttributes attributes = new NodeAttributes();
    attributes.color = NodeAttributes.WHITE;
    attributes.predecessor = null;
    nodeAttributes.put(u, attributes);
  }
  mTime = 0;
  for (Node u : mNodes) {
    if (nodeAttributes.get(u).color ==
      NodeAttributes.WHITE) {
      dfsVisit(u, nodeAttributes);
    }
  }
  return nodeAttributes;
}
 
private void dfsVisit(Node u,
  Dictionary<Node, NodeAttributes> nodeAttributes) {
  NodeAttributes uAttributes = nodeAttributes.get(u);
  uAttributes.color = NodeAttributes.GRAY; 
  mTime++;
  uAttributes.startTime = mTime;
  
  for (Node v : u.getLinks()) {
    NodeAttributes vAttributes =
      nodeAttributes.get(v);
    if (vAttributes.color == NodeAttributes.WHITE) {
      vAttributes.predecessor = u;
      dfsVisit(v, nodeAttributes);
    }
  }
  
  uAttributes.color = NodeAttributes.BLACK;
  mTime++;
  uAttributes.finishTime = mTime;
  mTopologicalOrdering.addFirst(u);
}

What is also very interesting is the topological ordering of a directed graph. A topological sort produces a linear ordering of the vertices of a directed acyclic graph such that every vertex comes before all vertices it points to. That's very useful e.g. if you want to implement your own dependency injection container that automatically resolves dependencies between objects to create them in the right order.
The topological sort is just a depth-first search within a graph that prepends each vertex to a linked list as the vertex is finished (see mTopologicalOrdering in dfsVisit above). The Java source code containing topological sort is available here in my blog repository. The Java source code also provides an example of Dijkstra's algorithm.
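
Assuming the Graph sketch from above, a tiny (made-up) usage example:
// a -> b means "a must be created before b".
Graph graph = new Graph();
Graph.Node a = new Graph.Node();
Graph.Node b = new Graph.Node();
Graph.Node c = new Graph.Node();
a.addLink(b);
b.addLink(c);
graph.addNode(a);
graph.addNode(b);
graph.addNode(c);
graph.depthFirstSearch();
// getTopologicalOrdering() now yields a, b, c.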

Algorithms
Dynamic programming
Dynamic programming, like the divide-and-conquer method, solves problems by combining the solutions to subproblems. Divide-and-conquer algorithms partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem. In contrast, dynamic programming is applicable when the subproblems are not independent, that is, when subproblems share subsubproblems.
A dynamic programming algorithm solves every subproblem just once and then saves its answers in a table, thereby avoiding the work of recomputing the answer every time the subproblem is encountered.
Dynamic programming is typically applied to optimization problems. In such problems there can be many possible solutions. Each solution has a value, and we wish to find a solution with the optimal (minimum or maximum) value.
Top-down dynamic programming (memoization) solves the problem recursively and stores the result of each subproblem, so that each subproblem is computed only once. Bottom-up dynamic programming solves the smallest subproblems first and combines their stored solutions into solutions for ever larger subproblems until the original problem is solved.
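
As a small illustration of the two flavors (my own example, not from the original post), here is the classic Fibonacci computation once top-down with a cache and once bottom-up:
import java.util.HashMap;
import java.util.Map;

public class Fibonacci {
  private Map<Integer, Long> mCache =
    new HashMap<Integer, Long>();

  // Top-down: recurse and cache each subproblem so
  // that it is computed only once.
  public long fibTopDown(int n) {
    if (n <= 1) {
      return n;
    }
    Long cached = mCache.get(n);
    if (cached != null) {
      return cached;
    }
    long result = fibTopDown(n - 1) +
      fibTopDown(n - 2);
    mCache.put(n, result);
    return result;
  }

  // Bottom-up: solve the smallest subproblems first
  // and combine them into ever larger ones.
  public long fibBottomUp(int n) {
    if (n <= 1) {
      return n;
    }
    long previous = 0;
    long current = 1;
    for (int i = 2; i <= n; i++) {
      long next = previous + current;
      previous = current;
      current = next;
    }
    return current;
  }
}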

One of the computer science areas where dynamic programming is used a lot is Bioinformatics.
An example is the longest common subsequence algorithm that is used to compare DNA strands. A DNA strand consists of a string of molecules called bases, where the possible bases are adenine, guanine, cytosine and thymine. The goal of comparing two DNA strands is to determine how similar the two strands are, as some measure of how closely related the two organisms are. So this is not an exact substring match but a comparison that allows for gaps between the matching bases of the two input DNA strands.

Java source code for the longest common subsequence problem (see also "Introduction to Algorithms", chapter 15):
public class LongestCommonSubsequence {
  char[] mSeqA;
  char[] mSeqB;
  int[][] mC;
  int[][] mB;
  private static final int UP = 0;
  private static final int DIAG = 1;
  private static final int LEFT = 2;
 
  public LongestCommonSubsequence(char[] seqA,
    char[] seqB) {
    mSeqA = new char[seqA.length + 1];
    for (int i = 0; i < seqA.length; i++) {
      mSeqA[i+1] = seqA[i];
    }  
    mSeqB = new char[seqB.length + 1];
    for (int i = 0; i < seqB.length; i++) {
      mSeqB[i+1] = seqB[i]; 
    }
    mC = new int[mSeqA.length][mSeqB.length];
    mB = new int[mSeqA.length][mSeqB.length];
    for (int i = 0; i < mSeqA.length; i++) {
      mC[i][0] = 0;
    }  
    for (int j = 0; j < mSeqB.length; j++) {
      mC[0][j] = 0;
    }
  }
 
  // O(m * n) -> O(mSeqA.length * mSeqB.length)
  public void lcsLength() {
    for (int i = 1; i < mSeqA.length; i++) {
      for (int j = 1; j < mSeqB.length; j++) {
        if (mSeqA[i] == mSeqB[j]) {
          mC[i][j] = mC[i-1][j-1] + 1;
          mB[i][j] = DIAG;
        } else if (mC[i-1][j] >= mC[i][j-1]) {
          mC[i][j] = mC[i-1][j];
          mB[i][j] = UP;
        } else {
          mC[i][j] = mC[i][j-1];
          mB[i][j] = LEFT;
        }
      }
    }
  }
 
  public void backtrack() {
    backtrack(mSeqA.length - 1, mSeqB.length - 1);
  }
 
  void backtrack(int i, int j) {
    if (i == 0 || j == 0) {
      return;
    }
    if (mB[i][j] == DIAG) {
      backtrack(i-1, j-1);
      System.out.print(mSeqA[i] + ", ");
    } else if (mB[i][j] == UP) {
      backtrack(i-1, j);   
    } else {
      backtrack(i, j-1);
    }
  }
}
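
A quick usage sketch with two made-up DNA fragments:
// Prints the longest common subsequence of the two
// fragments as comma-separated bases,
// here: G, T, C, A, T,
LongestCommonSubsequence lcs =
  new LongestCommonSubsequence(
    "GTCATG".toCharArray(),
    "AGTCACTA".toCharArray());
lcs.lcsLength();
lcs.backtrack();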

There are also protein sequence comparison methods in Bioinformatics that are more sophisticated than the longest common subsequence, like the Needleman-Wunsch algorithm, the Smith-Waterman algorithm and HHsearch. My blog repository currently contains implementations of the Needleman-Wunsch algorithm and the Smith-Waterman algorithm.

Memoization is a variation of dynamic programming. A memoized recursive algorithm maintains an entry in a table for the solution to each subproblem. When a subproblem is first encountered during the execution of the recursive algorithm, its solution is computed and then stored in a table. Each subsequent time that the subproblem is encountered, the value stored in the table is simply looked up and returned.

Greedy algorithms always make the choice that looks best at the moment. That is, they make a locally optimal choice in the hope that it will lead to a globally optimal solution.
For example, applying the greedy strategy to the traveling salesman problem yields the following algorithm: "At each stage visit the unvisited city nearest to the current city".
For many optimization problems, using dynamic programming to determine the best choice is overkill; often a simpler, more efficient greedy algorithm will do. Greedy algorithms do not always yield optimal solutions, but for many problems they do.
The following problem is about activity scheduling where several competing activities require exclusive use of a common resource. The goal is to select the maximum number of mutually compatible activities. The greedy algorithm solves the problem by sorting the activities by their finish time and then going once through the sorted list, selecting each activity whose start time is not earlier than the finish time of the last selected activity.

Java source code for the activity selector problem (see also "Introduction to Algorithms", chapter 16):
public class ActivitySelector {
  private List<Activity> mActivities =
    new ArrayList<Activity>();
 
  public static class Activity implements 
    Comparable<Activity> {
    private int mStartTime;
    private int mFinishTime;
  
    public Activity(int startTime, int finishTime) {
      mStartTime = startTime;
      mFinishTime = finishTime;
    }
  
    public int getStartTime() {
      return mStartTime;
    }
  
    public int getFinishTime() {
      return mFinishTime;
    }

    @Override
    public int compareTo(Activity activity) {
      if (this.mFinishTime == activity.mFinishTime) {
        return 0;
      } else if (this.mFinishTime >
        activity.mFinishTime) {
        return 1;
      } else {
        return -1;
      }
    }
  }

  public void addActivity(Activity activity) {
    mActivities.add(activity);
  }
 
  public List<Activity> greedyActivitySelector() {
    List<Activity> activities =
      new ArrayList<Activity>();
    if (mActivities.isEmpty()) {
      return activities;
    }
    Collections.sort(mActivities);
    activities.add(mActivities.get(0));
    int i = 0;
    for (int m = 1; m < mActivities.size(); m++) {
      if (mActivities.get(m).getStartTime() >=
        mActivities.get(i).getFinishTime()) {
        activities.add(mActivities.get(m));
        i = m;
      }
    }
    return activities;
  }
}
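
A short usage sketch (my own example):
// Three activities; (1,4) and (3,5) overlap, so the
// selector keeps (1,4) and then (4,7).
ActivitySelector selector = new ActivitySelector();
selector.addActivity(
  new ActivitySelector.Activity(1, 4));
selector.addActivity(
  new ActivitySelector.Activity(3, 5));
selector.addActivity(
  new ActivitySelector.Activity(4, 7));
List<ActivitySelector.Activity> activities =
  selector.greedyActivitySelector();
// activities contains the (1,4) and (4,7) activities.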

Greedy algorithms versus dynamic programming
The 0-1 knapsack problem is the best way to compare the greedy strategy and dynamic programming.
A thief robbing a store finds n items; the i-th item is worth v_i dollars and weighs w_i pounds, where v_i and w_i are integers. He wants to take as valuable a load as possible, but he can carry at most W pounds in his knapsack for some integer W. Which items should he take? (This is called the 0-1 knapsack problem because each item must either be taken or left behind; the thief cannot take a fractional amount of an item.)
While the 0-1 knapsack problem can be solved optimally with dynamic programming but not with a greedy algorithm, the fractional knapsack problem can also be solved with a greedy strategy.
To solve the fractional knapsack problem, you first have to compute the value per pound for each item. The thief then begins by taking as much as possible of the item with the greatest value per pound. If the supply of that item is exhausted, the thief takes as much as possible of the item with the next greatest value per pound and so forth.
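
A minimal sketch of that greedy strategy for the fractional knapsack problem (the class and method names are my own; this is not claimed to be part of the blog repository):
import java.util.Arrays;
import java.util.Comparator;

public class FractionalKnapsack {
  public static class Item {
    public final int value;  // v_i in dollars
    public final int weight; // w_i in pounds

    public Item(int value, int weight) {
      this.value = value;
      this.weight = weight;
    }
  }

  // Greedily take as much as possible of the item
  // with the greatest value per pound, then of the
  // next best one, and so forth.
  public static double maxLoot(Item[] items,
    int capacity) {
    Arrays.sort(items, new Comparator<Item>() {
      public int compare(Item a, Item b) {
        double perPoundA =
          (double) a.value / a.weight;
        double perPoundB =
          (double) b.value / b.weight;
        return Double.compare(perPoundB, perPoundA);
      }
    });
    double loot = 0.0;
    int remaining = capacity;
    for (Item item : items) {
      if (remaining <= 0) {
        break;
      }
      int taken = Math.min(item.weight, remaining);
      loot += (double) item.value * taken /
        item.weight;
      remaining -= taken;
    }
    return loot;
  }

  public static void main(String[] args) {
    Item[] items = {
      new Item(60, 10),
      new Item(100, 20),
      new Item(120, 30)
    };
    // Prints 240.0: both of the first two items
    // plus two thirds of the third one.
    System.out.println(maxLoot(items, 50));
  }
}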

String matching algorithms
See this tutorial with great explanations and examples by Christian Charras and Thierry Lecroq.

Further reading
A lot of the text has been taken from Wikipedia, the free encyclopedia.