Using GridGain’s Topology SPI

On a GridGain cluster, you sometimes want to execute your jobs on only a subset of the available nodes: those meeting a given condition. Let’s say some nodes run an expensive piece of third-party software that is (fortunately) only needed for a couple of tasks. At another time, the jobs should be executed on a different subset, or maybe on all nodes of the cluster.

There are several ways to do this. Filtering out nodes in your map() method would work, but you usually don’t want that if you’re already using split(). Or you could put the nodes in different multicast groups, thus filtering via DiscoverySpi. In that case, it’s more difficult to run a task’s jobs on all nodes: you can no longer use the default discovery service provider; you’d have to pick a different one or even implement your own.

Luckily, there’s the TopologySpi. The framework uses it to filter the set of nodes returned by discovery. Any filtering strategy can be implemented, which makes it a perfect hook for a criteria-based filter. For example, the service provider shipped with GridGain, GridBasicTopologySpi, makes it possible to execute jobs only on the local node or only on remote nodes.

To solve the problem described above, I developed a simple group concept based on user attributes that can be assigned to each grid node in its configuration. A node may be part of one or more groups. This is added to the configuration file of the worker grid nodes like this:

  <property name="userAttributes">
    <map>
      <entry key="grid.groups">
        <set>
          <value>foo</value>
          <value>bar</value>
        </set>
      </entry>
    </map>
  </property>

A custom topology provider is used on the node where my tasks are deployed. It’s derived from GridBasicTopologySpi and additionally lets you specify which groups a node has to be part of to qualify for the task. The provider is called GroupTopologySpi (the “i” is used consistently in GridGain, although it’s counter-intuitive) and is activated in the master’s configuration:

  <property name="topologySpi">
    <bean class="de.mafr.grid.GroupTopologySpi">
      <property name="localNode" value="true"/>
      <property name="remoteNodes" value="true"/>
      <property name="requiredGroups">
        <set>
          <value>foo</value>
        </set>
      </property>
    </bean>
  </property>

If the requiredGroups property isn’t given or if it contains the empty set, the service provider works exactly like GridBasicTopologySpi, so it can be used as a drop-in replacement. In the example, only grid nodes in the group foo will take part in the task.

The implementation is simple and consists of one short class and an interface: the actual topology provider, GroupTopologySpi, and its MBean interface.
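
As a rough illustration of the filtering idea (not the actual GroupTopologySpi code), the group check can be sketched in plain Java without any GridGain classes. The Node class, the node() helper and filterByGroups() are hypothetical names made up for this sketch, and it assumes that a node must belong to all required groups to qualify. Only the attribute key grid.groups and the "empty set means no filtering" rule are taken from the configuration described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GroupFilterSketch {
    /** Attribute key used in the worker node configuration. */
    static final String GROUPS_ATTRIBUTE = "grid.groups";

    /** Hypothetical stand-in for a grid node: an id plus its user attributes. */
    static final class Node {
        final String id;
        final Map<String, Object> attributes;

        Node(String id, Map<String, Object> attributes) {
            this.id = id;
            this.attributes = attributes;
        }
    }

    /** Hypothetical helper that builds a node belonging to the given groups. */
    static Node node(String id, String... groups) {
        Map<String, Object> attributes = new HashMap<>();
        if (groups.length > 0) {
            attributes.put(GROUPS_ATTRIBUTE, new HashSet<>(Arrays.asList(groups)));
        }
        return new Node(id, attributes);
    }

    /**
     * Keeps only the nodes that are members of every required group.
     * Assumption: "all groups" semantics. A null or empty requiredGroups
     * set disables group filtering and returns all candidates.
     */
    static List<Node> filterByGroups(Collection<Node> candidates, Set<String> requiredGroups) {
        if (requiredGroups == null || requiredGroups.isEmpty()) {
            return new ArrayList<>(candidates);
        }
        List<Node> result = new ArrayList<>();
        for (Node candidate : candidates) {
            Object value = candidate.attributes.get(GROUPS_ATTRIBUTE);
            // Nodes without the attribute (or with a non-collection value) never qualify.
            if (value instanceof Collection
                    && ((Collection<?>) value).containsAll(requiredGroups)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                node("a", "foo", "bar"), node("b", "bar"), node("c"));
        // Print the ids of all nodes that are in group "foo".
        for (Node selected : filterByGroups(nodes, Collections.singleton("foo"))) {
            System.out.println(selected.id);
        }
    }
}
```

In a real provider, a check like this would presumably be applied to the node collection that the basic topology provider has already narrowed down to local and/or remote nodes, using whatever attribute accessor the installed GridGain version offers.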


4 Responses to Using GridGain’s Topology SPI

  1. Hi Matthias,
    Great job! What I am missing is what do you mean by:

    “the “i” is used consistently in GridGain, although it’s counter-intuitive”

    What do you mean by “i” and what is counter-intuitive?

    Thanks!
    Nikita.

  2. mafr says:

    In GridGain, you have service provider interfaces (SPIs), so the names of those java interfaces are DiscoverySpi, TopologySpi etc. However, the classes that implement those interfaces (like GridBasicTopologySpi) are service providers and not interfaces, so I’d expect a different naming convention: GridBasicTopologySp, or maybe GridBasicTopologyProvider, without the ‘i’ at the end :)

    That’s just a minor glitch, and I follow it in my own service providers for consistency.

    Cheers,
    Matthias

  3. BRET says:

    Hi Matthias,

    I’m new to GridGain and I was trying to figure out your example.

    I’m getting an error trying to compile your code in the community version 3.6.0c

    Multiple markers at this line
    – The method getTopology(GridTaskSession, Collection) of type GroupTopologySpi must override or implement a supertype method
    – Name clash: The method getTopology(GridTaskSession, Collection) of type GroupTopologySpi has the same erasure as
    getTopology(GridTaskSession, Collection) of type GridBasicTopologySpi but does not override it
    – Name clash: The method getTopology(GridTaskSession, Collection) of type GroupTopologySpi has the same erasure as
    getTopology(GridTaskSession, Collection) of type GridTopologySpi but does not override it

    And

    The method getAttribute(String) is undefined for the type
    GridNode

    I fixed the getAttribute by changing it to attribute()
