public class GRpcJobKillServiceImpl extends com.netflix.genie.proto.JobKillServiceGrpc.JobKillServiceImplBase implements JobKillService
A JobKillService implementation which uses parked gRPC requests to tell the agent to shut down via a user kill request if the job is in an active state.

| Constructor and Description |
|---|
| GRpcJobKillServiceImpl(DataServices dataServices, AgentRoutingService agentRoutingService, RequestForwardingService requestForwardingService): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| void | cleanupOrphanedObservers(): Remove orphaned kill observers from local map. |
| void | killJob(java.lang.String jobId, java.lang.String reason, javax.servlet.http.HttpServletRequest request): Kill the job with the given id if possible. |
| void | registerForKillNotification(com.netflix.genie.proto.JobKillRegistrationRequest request, io.grpc.stub.StreamObserver<com.netflix.genie.proto.JobKillRegistrationResponse> responseObserver): Register to be notified when a kill request for the job is received. |

public GRpcJobKillServiceImpl(DataServices dataServices, AgentRoutingService agentRoutingService, RequestForwardingService requestForwardingService)

Parameters:
dataServices - The DataServices instance to use
agentRoutingService - The AgentRoutingService instance to use to find where agents are connected
requestForwardingService - The service to use to forward requests to other Genie nodes

public void registerForKillNotification(com.netflix.genie.proto.JobKillRegistrationRequest request,
io.grpc.stub.StreamObserver<com.netflix.genie.proto.JobKillRegistrationResponse> responseObserver)

Overrides:
registerForKillNotification in class com.netflix.genie.proto.JobKillServiceGrpc.JobKillServiceImplBase
Parameters:
request - Request to register for getting notified when server gets a job kill request.
responseObserver - The response observer

@Retryable(value={com.netflix.genie.common.internal.exceptions.unchecked.GenieInvalidStatusException.class,com.netflix.genie.common.exceptions.GenieServerException.class},
backoff=@Backoff(delay=1000L))
public void killJob(java.lang.String jobId,
java.lang.String reason,
@Nullable javax.servlet.http.HttpServletRequest request)
throws com.netflix.genie.common.internal.exceptions.unchecked.GenieJobNotFoundException,
com.netflix.genie.common.exceptions.GenieServerException

Specified by:
killJob in interface JobKillService
Parameters:
jobId - id of job to kill
reason - brief reason for requesting the job be killed
request - The optional HttpServletRequest information if the request needs to be forwarded
Throws:
com.netflix.genie.common.internal.exceptions.unchecked.GenieJobNotFoundException - When a job identified by jobId can't be found in the system
com.netflix.genie.common.exceptions.GenieServerException - if there is an unrecoverable error in the internal state of the Genie cluster

@Scheduled(fixedDelay=30000L, initialDelay=30000L)
public void cleanupOrphanedObservers()

As currently implemented, once the handshake is complete the Agent opens a connection to the server, which parks a response observer in the map stored in this implementation. Upon receiving a kill request for the matching job, this class uses the observer to send the "response" to the agent, which then begins its shutdown process. The issue is that if the agent disconnects from this server, the server never realizes it is gone, and these observers would build up in the in-memory map forever. This method periodically goes through the map, determines whether each observer is still valid, and removes any that are not.
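
The park, notify, and clean-up cycle described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Genie implementation: the class ParkedKillObservers, its method names, and the stillConnected check are all hypothetical, and a Consumer<String> stands in for the gRPC StreamObserver. How the real class decides that an observer is no longer valid is not stated on this page.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical stand-in for the in-memory map of parked kill observers. */
final class ParkedKillObservers {

    /** One parked observer: a way to notify the agent plus a liveness check. */
    static final class Parked {
        final Consumer<String> notifier;       // assumption: stands in for the gRPC response observer
        final BooleanSupplier stillConnected;  // assumption: stands in for the real validity check

        Parked(Consumer<String> notifier, BooleanSupplier stillConnected) {
            this.notifier = notifier;
            this.stillConnected = stillConnected;
        }
    }

    private final Map<String, Parked> parked = new ConcurrentHashMap<>();

    /** Parks an observer for the given job id, replacing any earlier one. */
    void register(String jobId, Consumer<String> notifier, BooleanSupplier stillConnected) {
        parked.put(jobId, new Parked(notifier, stillConnected));
    }

    /** Sends the kill "response" to the parked observer, if any. Returns true if one was notified. */
    boolean kill(String jobId, String reason) {
        Parked p = parked.remove(jobId);
        if (p == null) {
            return false;
        }
        p.notifier.accept(reason);
        return true;
    }

    /** Removes observers whose agent is no longer connected. Returns the number removed. */
    int cleanupOrphaned() {
        int removed = 0;
        for (Map.Entry<String, Parked> e : parked.entrySet()) {
            // remove(key, value) only succeeds if the entry was not replaced in the meantime
            if (!e.getValue().stillConnected.getAsBoolean() && parked.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return parked.size();
    }

    public static void main(String[] args) {
        ParkedKillObservers registry = new ParkedKillObservers();
        registry.register("job-1", reason -> System.out.println("job-1 told to shut down: " + reason), () -> true);
        registry.register("job-2", reason -> { }, () -> false); // this agent has already gone away
        System.out.println("orphans removed: " + registry.cleanupOrphaned());
        System.out.println("job-1 notified: " + registry.kill("job-1", "user request"));
    }
}
```

Outside of Spring, a java.util.concurrent.ScheduledExecutorService could approximate the schedule shown in the annotation, for example scheduleWithFixedDelay(registry::cleanupOrphaned, 30, 30, TimeUnit.SECONDS). The sketch leaves scheduling out so that the clean-up logic can be exercised directly.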