Topic on Project:Support desk

Cirrus search ElasticaWrite job failed

Summary last edited by Ernstkm 15:22, 4 June 2024

Elasticsearch was out of disk space.

That is what I thought. After I provided enough space, the watermark notices disappeared, but runJobs.php still reports the error message

ElasticaWrite job failed: Requeued

and an increasing number of jobs are delayed/requeued.

This time, however, nothing in the Elasticsearch log hints at the cause.

The last step is to tell Elasticsearch to allow writing to the indexes again after they were frozen due to hitting the high watermark:

curl -X PUT -H "Content-Type: application/json" \
     http://localhost:9200/_all/_settings \
     -d '{"index.blocks.read_only_allow_delete": null}'
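Before and after sending the request above, it can help to check which indexes still carry the block. The following is only a sketch: the function names and the ES_URL variable are made up for illustration, and it assumes Elasticsearch answers on the same localhost:9200 address used above.

```shell
# Hypothetical helpers, not part of MediaWiki or CirrusSearch.
# ES_URL is an assumed variable; it defaults to the address used in this thread.
ES_URL="${ES_URL:-http://localhost:9200}"

# Ask Elasticsearch for the read-only block setting of every index.
show_write_blocks() {
  curl -s "$ES_URL/_all/_settings/index.blocks.read_only_allow_delete?pretty"
}

# Clear the block on every index (the same request as the curl command above).
clear_write_blocks() {
  curl -s -X PUT -H "Content-Type: application/json" \
    "$ES_URL/_all/_settings" \
    -d '{"index.blocks.read_only_allow_delete": null}'
}
```

Run show_write_blocks first; an index that is still blocked should list the setting with the value "true". After clear_write_blocks, the setting should no longer show up as "true" for any index.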

Now runJobs.php processes new updates and edits without issue. The remaining question is how to get the delayed jobs back into the queue.

Pspviwki (talk | contribs)

Elasticsearch has been working fine since installation. Two days ago, delayed jobs suddenly started to appear, and now more seem to appear each time a page is added or updated.

cirrusSearchElasticaWrite: 0 queued; 0 claimed (0 active, 0 abandoned); 32 delayed
cirrusSearchIncomingLinkCount: 0 queued; 0 claimed (0 active, 0 abandoned); 28 delayed

Error messages also started to appear in the runJobs log:

2022-09-09 17:38:21 cirrusSearchElasticaWrite Special: method=sendData arguments=["content",[{"data":{"version":37144,"wiki":"....","namespace":0,"namespace_text":"","title":"....","timestamp":"....","create_timestamp":"2022-09-07T20:04:46Z","redirect":[],"incoming_links":0},"params":{"_id":"6309","_type":"","_index":"","_cirrus_hints":{"BuildDocument_flags":0,"noop":{"version":"documentVersion","incoming_links":"within 20%"}}},"upsert":true}]] cluster=default createdAt=1662745101 errorCount=0 retryCount=0 requestId=bfca08d3df0945870a8e9f4c namespace=-1 title= (uuid=fae3faef720547c5afce97a7f131619b,timestamp=1662745101) STARTING
2022-09-09 17:38:21 cirrusSearchElasticaWrite Special: method=sendData arguments=["content",[{"data":{"version":37144,"wiki":"....","namespace":0,"namespace_text":"","title":"....","timestamp":"2022-09-07T20:04:46Z","create_timestamp":"2022-09-07T20:04:46Z","redirect":[],"incoming_links":0},"params":{"_id":"6309","_type":"","_index":"","_cirrus_hints":{"BuildDocument_flags":0,"noop":{"version":"documentVersion","incoming_links":"within 20%"}}},"upsert":true}]] cluster=default createdAt=1662745101 errorCount=0 retryCount=0 requestId=bfca08d3df0945870a8e9f4c namespace=-1 title= (uuid=fae3faef720547c5afce97a7f131619b,timestamp=1662745101) t=41 error=ElasticaWrite job failed: Requeued

No error messages appear anywhere else, so what might be the reason for this sudden change?
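To see whether the backlog keeps growing, the per-type counts quoted earlier in this post can be summed up. This is a minimal sketch that assumes the report has exactly the "N delayed" format shown there (it looks like the output of showJobs.php --group); the function name delayed_total is made up.

```shell
# Hypothetical helper: sum the "N delayed" figures of a job queue report.
# Reads the report on stdin and prints a single number.
delayed_total() {
  awk '
    # The count is the field directly before the word "delayed".
    { for (i = 2; i <= NF; i++) if ($i == "delayed") total += $(i - 1) }
    END { print total + 0 }
  '
}
```

Something like `php maintenance/showJobs.php --group | delayed_total`, run every few minutes, would show whether the number is going up or down. The path to showJobs.php depends on the installation.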

Environment: MediaWiki 1.38.2, PHP 7.4.30 (fpm-fcgi), MariaDB 10.8.3-MariaDB, ICU 71.1, LilyPond 2.22.2, Elasticsearch 6.8.23. Nothing in the environment has changed since CirrusSearch was deployed.

For anyone encountering a similar problem: check elasticsearch.log, wherever it may reside. Even if there is still free space on the partition, Elasticsearch checks its own high disk watermark and freezes indexing once it is exceeded. Solution: move the indexes to a partition with more space. Resolved.
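A rough way to see how close a partition is to the watermarks, as a sketch only. It assumes the stock Elasticsearch thresholds (low 85%, high 90%, flood stage 95% of disk used); a cluster with custom cluster.routing.allocation.disk.watermark.* settings needs different numbers. As far as I know, it is the flood-stage threshold that actually sets the read-only block in 6.x. Both function names are made up.

```shell
# Hypothetical helper: print the used percentage of the filesystem holding a path.
disk_used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: classify a used-disk percentage against the assumed
# default watermarks. Reaching a threshold is treated as crossing it.
watermark_state() {
  used=$1
  if [ "$used" -ge 95 ]; then
    echo flood_stage   # indexes on the node get the read-only block
  elif [ "$used" -ge 90 ]; then
    echo high          # Elasticsearch tries to move shards off the node
  elif [ "$used" -ge 85 ]; then
    echo low           # no new shards are allocated to the node
  else
    echo ok
  fi
}
```

For example, `watermark_state "$(disk_used_percent /var/lib/elasticsearch)"`, where the data path is an assumption and has to match the actual installation. Elasticsearch's own view of disk usage per node is available from `curl "http://localhost:9200/_cat/allocation?v"`.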

Ernstkm (talk | contribs)

Thanks for this. I had a sinking feeling in my stomach when I noticed my CirrusSearch indexes were not being updated anymore, because I lacked the time to troubleshoot properly. However, your solution came up ranked near the top of web search results, and—fortunately—I had seen and dealt with this problem before in the context of some different software that uses Elasticsearch (GitLab). Cheers. --Ernstkm (talk) 13:09, 4 June 2024 (UTC)
