Bug #878

Multi Node crashes on Fits

Added by Austin Trask almost 11 years ago. Updated almost 9 years ago.

Status: Verified
Start date:
Priority: High
Due date:
Assignee: Joseph Perry
% Done: 0%
Category: -
Target version: Release 0.7
Google Code Legacy ID: archivematica-223
Pull Request:
Sponsored:
Requires documentation:

Description

SIPs are getting stuck in .currentlyProcessing in a multi-node setup.

SIPs should move to the failed directory if processing of the SIP ends.

This occurs when a SIP is placed in acquireSIP and the MCP passes jobs to the remote client. Many modules on the remote client appear to succeed, but in this instance the client appears to have lost track of a file that FITS was looking for. Logs are attached.

Maybe this is an XML configuration I'm missing?

localClientLog (30.3 KB) Austin Trask, 11/30/2012 04:48 PM

mcpLog (58.4 KB) Austin Trask, 11/30/2012 04:48 PM

remoteClientLog (24.6 KB) Austin Trask, 11/30/2012 04:48 PM

History

#1 Updated by Joseph Perry almost 11 years ago

Reviewed the log.
It appears to be a file-not-found error. I notice this is the first instance of FITS running.

I believe the code should wait for the OS to report that the SIP move to the processing folder has completed before tasks are assigned.
Could there be a delay before the NFS mount sees that the files exist? That feels unlikely.

This is strange behaviour.
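As a rough illustration of the "wait until the move is visible" idea, here is a minimal polling sketch. The function name `wait_for_path` and its `timeout` and `interval` parameters are hypothetical and not part of the Archivematica code; it only assumes the client can test the shared path with `os.path.exists`.

```python
import os
import time


def wait_for_path(path, timeout=10.0, interval=0.25):
    """Poll until `path` is visible or `timeout` seconds pass.

    Hypothetical helper: returns True if the path was seen, else False.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Give a slow shared mount a moment before checking again.
        time.sleep(interval)
```

A caller could run this check on the SIP directory before assigning tasks and treat a `False` result as a failure instead of starting work on files that are not yet visible.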

Client log excerpt:

wrote: taskCompleted<!&\delimiter/&!>1470a359-3cb2-4573-b1e7-f6ff4ee7b6b5<!&\delimiter/&!>0
processing: /usr/bin/clientScripts/assignFileUUIDs.sh "/home/demo/sharedFolders/.currentlyProcessing/44c2cd84-ce95-4ded-a932-eab283639561/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/objects/Archivematica_architecture_7May2010.png" "/home/demo/sharedFolders/.currentlyProcessing/44c2cd84-ce95-4ded-a932-eab283639561/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/objects/"
returned: 0
('4a01a5b5-9efb-47a3-a6db-6c9b23a73083 -> objects/Archivematica_architecture_7May2010.png\n', '')
writing to: /home/demo/sharedFolders/.currentlyProcessing/44c2cd84-ce95-4ded-a932-eab283639561/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FileUUIDs.log
No output or file specified
processing completed
wrote: taskCompleted<!&\delimiter/&!>6860a6c1-2238-4183-957f-1d0d0f90581b<!&\delimiter/&!>0
processing: fits.sh -i "/home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/objects/G31DS.TIF" -o "/home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FITS-3d2cbabb-2245-4235-9dfd-31ac69382de0.xml"
processing: fits.sh -i "/home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/objects/LAND2.BMP" -o "/home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FITS-UUID not found for: objects/LAND2.BMP.xml"
returned: 1
('', 'Exception in thread "main" java.io.FileNotFoundException: /home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FITS-UUID not found for: objects/LAND2.BMP.xml (No such file or directory)\n\tat java.io.FileOutputStream.open(Native Method)\n\tat java.io.FileOutputStream.<init>(FileOutputStream.java:209)\n\tat java.io.FileOutputStream.<init>(FileOutputStream.java:99)\n\tat edu.harvard.hul.ois.fits.Fits.main(Fits.java:177)\n')
No output or file specified
writing to: /home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FITS-UUID not found for: objects/LAND2.BMPerror.txt
Exception in thread Thread-23:
Traceback (most recent call last):
File "/usr/lib/python2.6/threading.py", line 532, in _bootstrap_inner
self.run()
File "/usr/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/bin/archivematicaClient.py", line 156, in performTask
ret = executeCommand(command[1], command[2], command[3], command[4], command[5], command[6], command[7], self)
File "/usr/bin/archivematicaClient.py", line 92, in executeCommand
writeToFile(output[1], sError)
File "/usr/bin/archivematicaClient.py", line 42, in writeToFile
f = open(fileName, 'a')
IOError: [Errno 2] No such file or directory: '/home/demo/sharedFolders/.currentlyProcessing/8dae3dbb-93b5-4fa7-b669-37ff15f05f0d/5th_copy_of_ImagesSIP-5c7f732f-4b08-4210-ae0c-76a4327979f3/logs/FITS-UUID not found for: objects/LAND2.BMPerror.txt'

#2 Updated by Joseph Perry almost 11 years ago

  • Status changed from New to Verified

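The UUID wasn't found for "objects/LAND2.BMP", so the lookup's error text ("UUID not found for: objects/LAND2.BMP") was substituted into the FITS output file name in place of a UUID.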
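In the client log, the text "UUID not found for: objects/LAND2.BMP" ends up inside the FITS output file name where a UUID belongs. One way to catch this early is to validate the lookup result before building the path. The sketch below is only an illustration: `fits_output_path` and its parameters are hypothetical names, not taken from the Archivematica scripts.

```python
import os
import uuid


def fits_output_path(logs_dir, lookup_result):
    """Build logs/FITS-<uuid>.xml, refusing anything that is not a UUID.

    Hypothetical helper: `lookup_result` is whatever the UUID lookup printed.
    """
    try:
        # uuid.UUID raises ValueError for strings that are not UUIDs.
        file_uuid = str(uuid.UUID(lookup_result.strip()))
    except ValueError:
        raise ValueError("no UUID for file: %r" % lookup_result)
    return os.path.join(logs_dir, "FITS-%s.xml" % file_uuid)
```

With a check like this, the task would fail at the lookup step with a clear message, and the error string would never be used as part of a path.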

Problem 1:
Writing to the log failed, and the failure was not reported as a failed task.
Created an issue for this:
http://code.google.com/p/archivematica/issues/detail?id=224
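For Problem 1, the traceback shows the `open(fileName, 'a')` call raising an IOError that kills the worker thread, so no result is reported. A minimal sketch of a write helper that turns the exception into a return code follows. The name `write_to_file` and the 0/1 return convention are assumptions for illustration, not the actual fix made in the linked issue.

```python
import sys


def write_to_file(file_name, text):
    """Append `text` to `file_name`; return 0 on success, 1 on failure.

    Hypothetical variant of the client's writeToFile step.
    """
    try:
        with open(file_name, "a") as f:
            f.write(text)
    except (IOError, OSError) as err:
        # Report the problem instead of letting the thread die silently.
        sys.stderr.write("could not write %s: %s\n" % (file_name, err))
        return 1
    return 0
```

The caller could then combine this return code with the command's exit code, so a failed log write marks the task as failed.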

Problem 2:
I suspect that, because mcpModuleConfig/assignUUID.xml does not have <requiresUserLock> set to yes, the attempted write of this file's UUID failed because another thread was writing to the file at the same time. I don't see any error in the logs to confirm this and would need to look at the /logs/FileUUIDs.log file to check.
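To illustrate the kind of serialization a lock would provide, here is a small sketch that guards appends to a shared log with `threading.Lock`. This is an in-process analogy only: the `<requiresUserLock>` setting is handled by the MCP, and `append_uuid_line` and `_log_lock` are hypothetical names. The line format mirrors the "uuid -> path" output seen in the client log.

```python
import threading

# One lock shared by all threads in this process (assumption: a single
# client process does the writing; separate nodes would need another scheme).
_log_lock = threading.Lock()


def append_uuid_line(log_path, file_uuid, relative_path):
    """Append one 'uuid -> path' line while holding the lock."""
    line = "%s -> %s\n" % (file_uuid, relative_path)
    with _log_lock:
        with open(log_path, "a") as f:
            f.write(line)
```

Because each append happens while the lock is held, concurrent tasks cannot interleave partial lines in the log.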
