Virtual Mako (VMako): Aggregating Mako Servers

A VMako is a single Mako server that acts as a front end to multiple remote Mako instances. The VMako maps its (virtual) collections onto a user-defined set of remote Mako collections. This reduces complexity for the client application, which sees a single virtualized interface rather than a number of distinct, federated Makos. The primary purpose of a Virtual Mako is to enable distributed query execution. In a Virtual Mako, requests are broken down into sub-queries and sent to the appropriate remote Makos. The responses are then aggregated at the VMako and returned to the client. Additionally, collections added to a VMako can be distributed across the remote Makos, which reduces the load carried by any single instance. The default "ingestor" class uses a round-robin algorithm to distribute load across the federated Makos.
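The two behaviours described above, round-robin distribution on ingest and fan-out with aggregation on query, can be illustrated with a minimal sketch. This is not the Mako API: the RemoteMako and VirtualCollection classes are hypothetical in-memory stand-ins, a Python predicate takes the place of an XPath query, and every remote receives the same sub-query, which is a simplification of how a VMako decides where to send each one.

```python
from itertools import cycle


class RemoteMako:
    """In-memory stand-in for a remote Mako (hypothetical, not the real API)."""

    def __init__(self, service_id):
        self.service_id = service_id
        self.documents = []

    def store(self, document):
        self.documents.append(document)

    def query(self, predicate):
        # A real Mako would evaluate an XPath expression; a Python
        # predicate keeps this sketch self-contained.
        return [doc for doc in self.documents if predicate(doc)]


class VirtualCollection:
    """Hypothetical front end that spreads writes and fans out reads."""

    def __init__(self, remotes):
        self.remotes = list(remotes)
        self._next_remote = cycle(self.remotes)  # round-robin iterator

    def ingest(self, document):
        # Round-robin: each new document goes to the next remote in turn.
        target = next(self._next_remote)
        target.store(document)
        return target.service_id

    def query(self, predicate):
        # Send the same sub-query to every remote, then merge the answers.
        results = []
        for remote in self.remotes:
            results.extend(remote.query(predicate))
        return results


remotes = [RemoteMako("MAKO://dc01"), RemoteMako("MAKO://dc02")]
collection = VirtualCollection(remotes)
placements = [collection.ingest({"id": i}) for i in range(4)]
matches = collection.query(lambda doc: doc["id"] >= 2)
print(placements)
print(sorted(doc["id"] for doc in matches))
```

The client only ever talks to the VirtualCollection object, which mirrors the point made above: the federation behind it stays hidden.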

As in the other QuickStarts, before instantiating a VMako, you will want to edit the configuration file. There is a localhost-vmako-config.xml file in the /conf directory. It contains the configuration to stand up a basic VMako instance. In the <MobiusNetworkServiceDescriptor ...> tag, edit the "hostname" attribute as appropriate. Also set the "serviceId" in the <serviceIdentifier ...> tag, immediately below it.

A VMako aggregates other Mako services, so it needs no MySQL configuration. You can instantiate it in one of two ways. Either start it alone and then use the Mako Viewer or the command line utilities to manually aggregate other Makos "within" it, or start it so that it reads a cache file and automatically "ingests" collections from Makos that are already running.

In the config file, the resource element whose "name" attribute is set to "vmakoConfig" contains a sub-element called <vmako-storage ...> which takes only a "file" attribute. In the distribution's localhost-vmako-config.xml file, that attribute is set to "vmako-storage.xml".

Let's start the VMako:
./startMako.bat ../conf/localhost-vmako-config.xml

If you have left the <vmako-storage ...> element with its default "file" value, then any collections you add (via the Mako Viewer or the command line) will be recorded in the vmako-storage.xml file. The file will be created in the /scripts directory by default (since that is where the startMako script is). But you can also specify a path when setting the value for "file". The collections you add to your VMako during the session will be recorded and, the next time you start the VMako, the file will be read in and the collections ingested.

In the /conf directory, there is an example VMako cache file called vmako-cache-example.xml. To eliminate the need to manually add collections, we can load the remote Makos and collections specified in this file when we start our VMako.

You'll notice that the vmako-cache-example.xml file has two child elements below the root <vmako> element. The first child is <xml-data-services>. It contains tags that reference the Makos that you desire to aggregate.

The second child (<virtual-collections>) contains two sub-elements. The first (<virtual-collection ...>) specifies the name of the virtual collection against which you will run queries. Queries run against that virtual collection are executed on the aggregated collections specified in the other sub-element, <xml-collection-handle ...>. Each of these collection handle elements takes the service ID of a Mako specified as an "xml-data-service" (the "serviceId" attribute) and the name of the collection on the host where it resides (the "collectionName" attribute).

The value of the "collectionName" attribute must specify a real collection on the host described by the "serviceId". Collection names do not have to be identical to the name of the virtual collection. In fact, the names will usually be different. For example, you might set up a VMako to aggregate data from a number of differently purposed Makos: you want to query, simultaneously, references to a specific allele from Makos that hold data from cancer genetics, proteomics, and SNP studies. You craft a single XPath query so that it returns the appropriate set of attributes from the multiple, separate, virtually aggregated stores.
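The cache file layout described above can be sketched by generating one with Python's standard library. Treat this as an illustration under assumptions rather than the definitive schema: the element and attribute names come from the description above, but placing <xml-collection-handle> inside <virtual-collection> is a guess at the nesting, and the collection names are placeholders. Compare the output with vmako-cache-example.xml before relying on it.

```python
import xml.etree.ElementTree as ET


def build_cache(virtual_name, handles):
    """Build a <vmako> tree from (service_id, collection_name) pairs."""
    root = ET.Element("vmako")

    # First child: one <xml-data-service> per distinct remote Mako.
    services = ET.SubElement(root, "xml-data-services")
    for service_id in dict.fromkeys(sid for sid, _ in handles):
        ET.SubElement(services, "xml-data-service", serviceId=service_id)

    # Second child: the virtual collection and the real collections behind it.
    # Nesting the handles inside <virtual-collection> is an assumption.
    collections = ET.SubElement(root, "virtual-collections")
    virtual = ET.SubElement(collections, "virtual-collection", name=virtual_name)
    for service_id, collection_name in handles:
        ET.SubElement(
            virtual,
            "xml-collection-handle",
            serviceId=service_id,
            collectionName=collection_name,
        )
    return root


cache = build_cache(
    "TestVirtualCollection",
    [
        ("MAKO://dc01", "/db/cancer-genetics"),  # placeholder collection names
        ("MAKO://dc02", "/db/proteomics"),
    ],
)
ET.indent(cache)  # pretty-printing only; requires Python 3.9 or later
print(ET.tostring(cache, encoding="unicode"))
```

Note that the virtual collection name and the real collection names differ, as the paragraph above says they usually will.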

Let's set up our example. In the localhost-vmako-config.xml file, change the <vmako-storage ...> element's "file" attribute so that it points at your cache file. For example, set it to ../conf/vmako-cache-example.xml.
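If you prefer to script this change rather than edit the file by hand, a sketch along the following lines may help. It works on a small stand-in string, not the real configuration file: the <config> root and the <resource> tag name are assumptions, and only the "vmakoConfig" name, the <vmako-storage> element, and its "file" attribute are taken from the description above.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in for the relevant part of localhost-vmako-config.xml.
# The <config> root and the <resource> tag name are assumptions.
SAMPLE_CONFIG = """
<config>
  <resource name="vmakoConfig">
    <vmako-storage file="vmako-storage.xml"/>
  </resource>
</config>
"""


def set_vmako_storage(config_xml, new_file):
    """Return config_xml with the vmako-storage "file" attribute replaced."""
    root = ET.fromstring(config_xml)
    # Look through every element so the nesting depth does not matter.
    for element in root.iter():
        if element.get("name") != "vmakoConfig":
            continue
        storage = element.find("vmako-storage")
        if storage is not None:
            storage.set("file", new_file)
            return ET.tostring(root, encoding="unicode")
    raise ValueError("no vmakoConfig resource with a vmako-storage child")


updated = set_vmako_storage(SAMPLE_CONFIG, "../conf/vmako-cache-example.xml")
print(updated)
```

The function raises an error instead of silently returning the input when the expected element is missing, so a mismatch with the real file's structure shows up immediately.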

Edit the vmako-cache-example.xml file and point the "serviceId" attribute within the <xml-data-service ...> tag at a remote Mako service. In the <xml-collection-handle ...> tag, repeat the service ID and set the "collectionName" attribute to the XPath path of a collection on that server that you would like to aggregate under your VMako. In the <virtual-collection ...> tag, set the "name" attribute to a virtual collection name of your choosing (below we have used "TestVirtualCollection").

Restart the VMako, as above. You will see output similar to the following:


INFO - Nov 5, 2004 1:02:27 PM  -- Adding service MAKO://dc01
INFO - Nov 5, 2004 1:02:27 PM  -- Adding service MAKO://dc02
INFO - Nov 5, 2004 1:02:28 PM  -- Adding the virtual collection TestVirtualCollection
INFO - Nov 5, 2004 1:02:28 PM  -- Server started on localhost at Fri Nov 05 13:02:28 EST 2004
INFO - Nov 5, 2004 1:02:28 PM  -- Setting Service Identifier to localhost
INFO - Nov 5, 2004 1:02:28 PM  -- Listening using the protocol TCP on the host 0.0.0.0, port 3940

Now, any XPath queries performed via the command line or the viewer against the TestVirtualCollection collection will, in fact, be run against the collections on the remote nodes you configured in the vmako-cache-example.xml file (in the example above, dc01 and dc02).
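To confirm from the startup output that the expected services and virtual collection were ingested, the log lines can be checked with a short script. This is a sketch that assumes the message wording matches the sample output shown above; the summarize_startup helper is hypothetical and not part of the Mako distribution.

```python
import re

# Lines copied from the sample startup output above.
STARTUP_LOG = """\
INFO - Nov 5, 2004 1:02:27 PM  -- Adding service MAKO://dc01
INFO - Nov 5, 2004 1:02:27 PM  -- Adding service MAKO://dc02
INFO - Nov 5, 2004 1:02:28 PM  -- Adding the virtual collection TestVirtualCollection
"""

SERVICE = re.compile(r"-- Adding service (\S+)")
COLLECTION = re.compile(r"-- Adding the virtual collection (\S+)")


def summarize_startup(log_text):
    """Return the services and virtual collections reported at startup."""
    return {
        "services": SERVICE.findall(log_text),
        "collections": COLLECTION.findall(log_text),
    }


summary = summarize_startup(STARTUP_LOG)
print(summary)
```

If a service listed in the cache file is missing from the "services" list, check that the corresponding remote Mako is running and that its service ID is spelled the same way in both places.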