Apache Ignite.NET Documentation

The Apache Ignite .NET Developer Hub

Welcome to the Apache Ignite .NET developer hub. You'll find comprehensive guides and documentation to help you start working with Apache Ignite.NET as quickly as possible, as well as support if you get stuck. Let's jump right in!

Get Started    

Fault Tolerance

Automatically fail-over jobs to other nodes in case of a crash.

Ignite .NET supports automatic job failover. In case of a node crash, jobs are automatically transferred to other available nodes for re-execution. However, in Ignite you can also treat any job result as a failure as well. The worker node can still be alive, but it may be running low on CPU, I/O, disk space, etc. There are many conditions that may result in a failure within your application and you can trigger a failover. Moreover, you have the ability to choose to which node a job should be failed over to, as it could be different for different applications or different computations within the same application.

The FailoverSpi is responsible for handling the selection of a new node for the execution of a failed job. FailoverSpi inspects the failed job and the list of all available grid nodes on which the job execution can be retried. It ensures that the job is not re-mapped to the same node it had failed on. Failover is triggered when the method IComputeTask.OnResult(...) returns the ComputeJobResultPolicy.Failover policy. Ignite comes with a number of built-in customizable Failover SPI implementations.

At Least Once Guarantee

As long as there is at least one node standing, no job will ever be lost.

By default, Ignite will failover all jobs from stopped or crashed nodes automatically. For custom failover behavior, you should implement IComputeTask.OnResult() method. The example below triggers failover whenever a job throws any Exception:

class MyTask : IComputeTask<string, int, int>
{
    ...
      
    public ComputeJobResultPolicy OnResult(IComputeJobResult<int> res, IList<IComputeJobResult<int>> rcvd)
    {
        if (res.Exception != null)
            return ComputeJobResultPolicy.Failover;

        // If there is no exception, wait for all job results.
        return ComputeJobResultPolicy.Wait;
    }

    ...
}

Closure Failover

Closure failover is by default governed by ComputeTaskAdapter, which is triggered if a remote node either crashes or rejects closure execution. This default behavior may be overridden by using ICompute.WithNoFailover() method, which creates an instance of ICompute with a no-failover flag set on it . Here is an example:

ICompute compute = ignite.GetCompute().WithNoFailover();

compute.Apply(..., "Some argument");

AlwaysFailOverSpi

AlwaysFailoverSpi always reroutes a failed job to another node. Note, that at first an attempt will be made to reroute the failed job to a node that the task was not executed on. If no such nodes are available, then an attempt will be made to reroute the failed job to the nodes that may be running other jobs from the same task. If none of the above attempts succeeded, then the job will not be failed over and null will be returned.

The following configuration parameters can be used to configure AlwaysFailoverSpi.

PropertyDescriptionDefault
maximumFailoverAttempts(int)Sets the maximum number of attempts to fail-over a failed job to other nodes.5
<bean id="grid.custom.cfg" class="org.apache.ignite.IgniteConfiguration" singleton="true">
  ...
  <bean class="org.apache.ignite.spi.failover.always.AlwaysFailoverSpi">
    <property name="maximumFailoverAttempts" value="5"/>
  </bean>
  ...
</bean>

Updated less than a minute ago

Fault Tolerance


Automatically fail-over jobs to other nodes in case of a crash.

Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.