I needed preemption enabled for services in my Nomad cluster, so it could shuffle around some existing services and place a new, larger multi-service job. Specifically, the services in my job used the distinct_hosts constraint, and there were no nodes immediately available to allocate them to.
Annoyingly, if you don't enable preemption during server bootstrapping you need to change it via the API directly: the default_scheduler_config server stanza is only applied at bootstrap, not re-read afterwards, and the CLI doesn't seem to support changing scheduler settings.
From a machine that can reach a Nomad server, you can fetch the current scheduler config with the following (this is the default scheduler config for Nomad 1.1.6):
-> % curl -s http://localhost:4646/v1/operator/scheduler/configuration | jq .
{
"SchedulerConfig": {
"SchedulerAlgorithm": "binpack",
"PreemptionConfig": {
"SystemSchedulerEnabled": true,
"BatchSchedulerEnabled": false,
"ServiceSchedulerEnabled": false
},
"MemoryOversubscriptionEnabled": false,
"CreateIndex": 5,
"ModifyIndex": 5
},
"Index": 5,
"LastContact": 0,
"KnownLeader": true
}
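If you only care about the preemption flags, you can narrow the output with a jq filter. This sketch stubs the API response with the JSON shown above so it runs anywhere; against a live cluster you would pipe curl -s http://localhost:4646/v1/operator/scheduler/configuration into the same filter.

```shell
# Stub of the scheduler configuration response shown above.
cat > response.json <<'EOF'
{"SchedulerConfig":{"SchedulerAlgorithm":"binpack","PreemptionConfig":{"SystemSchedulerEnabled":true,"BatchSchedulerEnabled":false,"ServiceSchedulerEnabled":false}}}
EOF

# Extract just the flag we want to flip; prints "false" before the change.
jq -r '.SchedulerConfig.PreemptionConfig.ServiceSchedulerEnabled' response.json
```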
Specify your desired config in schedulerUpdate.json. I wanted to switch to the spread scheduler and enable preemption for services, so my config looked like:
{
"SchedulerAlgorithm": "spread",
"MemoryOversubscriptionEnabled": false,
"PreemptionConfig": {
"SystemSchedulerEnabled": true,
"BatchSchedulerEnabled": false,
"ServiceSchedulerEnabled": true
}
}
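A malformed payload is an easy way to get a confusing API error, so it's worth confirming the file parses before sending it. A minimal check (recreating the same schedulerUpdate.json so the snippet is self-contained):

```shell
# Write the update payload from above.
cat > schedulerUpdate.json <<'EOF'
{
  "SchedulerAlgorithm": "spread",
  "MemoryOversubscriptionEnabled": false,
  "PreemptionConfig": {
    "SystemSchedulerEnabled": true,
    "BatchSchedulerEnabled": false,
    "ServiceSchedulerEnabled": true
  }
}
EOF

# jq -e exits non-zero if the file isn't valid JSON.
jq -e . schedulerUpdate.json > /dev/null && echo "payload OK"
```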
Then apply it with:
-> % curl -X PUT http://localhost:4646/v1/operator/scheduler/configuration \
--data @schedulerUpdate.json
{"Updated":true,"Index":38465}
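If other operators might be touching the same config, this endpoint also accepts a cas (check-and-set) query parameter: pass the ModifyIndex from your earlier read, and a write that races with someone else's change comes back with Updated:false instead of silently overwriting it. A sketch of building that request (the curl itself is commented out since it needs a live server; 5 is the ModifyIndex from the GET above):

```shell
# ModifyIndex observed in the earlier GET; the write only applies if the
# config hasn't changed since that index.
CAS_INDEX=5
URL="http://localhost:4646/v1/operator/scheduler/configuration?cas=${CAS_INDEX}"
echo "$URL"
# curl -X PUT "$URL" --data @schedulerUpdate.json
```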
Now if you check the config, it has been updated:
-> % curl -s http://localhost:4646/v1/operator/scheduler/configuration | jq .
{
"SchedulerConfig": {
"SchedulerAlgorithm": "spread",
"PreemptionConfig": {
"SystemSchedulerEnabled": true,
"BatchSchedulerEnabled": false,
"ServiceSchedulerEnabled": true
},
"MemoryOversubscriptionEnabled": false,
"CreateIndex": 5,
"ModifyIndex": 38465
},
"Index": 38465,
"LastContact": 0,
"KnownLeader": true
}
After applying this change, I restarted the job that had been failing to allocate; Nomad then preempted some lower-priority jobs and was able to place all of my services.