Bug #743

XML output from sanitizing script

Added by Evelyn McLellan over 10 years ago. Updated over 7 years ago.

Status: Verified
Start date:
Priority: Critical
Due date:
Assignee: Joseph Perry
% Done: 0%
Category: -
Target version: Release 0.6
Google Code Legacy ID: archivematica-88
Pull Request:
Sponsored:
Requires documentation:

Description

Eg:

<!XML>
<sipname>
  <filename>
   <previous>blah blah</previous>
   <clean>blah-blah</clean>
   <UUID>4270678076k437etc</UUID>
  </filename>
</sipname>

Add this to the ingestLogs folder.

[g] Legacy categories: Ingest

History

#1 Updated by Evelyn McLellan over 10 years ago

  • Priority changed from High to Critical

Note that a UUID is to be created for each object rather than each SIP.

[g] Labels added: Priority-Critical
[g] Labels removed: Priority-High
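
As a rough illustration of the per-object requirement, the sketch below gives every object path its own identifier instead of sharing one across the SIP. The function name `assign_object_uuids` and the use of random (version 4) UUIDs are assumptions; the ticket does not say which UUID variant is wanted.

```python
import uuid


def assign_object_uuids(paths):
    """Map each object path to its own freshly generated UUID string.

    Hypothetical helper: uuid4 (random) is an assumption, not something
    the ticket specifies.
    """
    return {path: str(uuid.uuid4()) for path in paths}


if __name__ == "__main__":
    # Two objects in the same SIP receive two different UUIDs.
    for path, object_uuid in assign_object_uuids(
        ["Lake_Chelan.JPG", "Basic_search.odt"]
    ).items():
        print(path, object_uuid)
```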

#2 Updated by Austin Trask over 10 years ago

We now have a detox.log that is created during ingest and keeps track of the filename changes. A
separate Python script should be created to parse this into clean XML.

#3 Updated by Austin Trask over 10 years ago

example of detox.log:

Scanning: /tmp/bfcd0f4a-1cd5-11df-939b-525400123456
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/inkscape_wallpaper___blue_by_ryanlerch.svg -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/inkscape_wallpaper_blue_by_ryanlerch.svg
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Lake Chelan.JPG -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Lake_Chelan.JPG
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/inkscape_wallpaper___blue_by_ryanlerch (copy).svg -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/inkscape_wallpaper_blue_by_ryanlerch-copy.svg
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Supported standards.doc -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Supported_standards.doc
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Suported standards.rtf -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Suported_standards.rtf
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Ica-atom-technical-architecture-2008-06 (copy).jpg -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Ica-atom-technical-architecture-2008-06-copy.jpg
/tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Basic search.odt -> /tmp/bfcd0f4a-1cd5-11df-939b-525400123456/Basic_search.odt
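
A minimal sketch of how such a log might be parsed into XML, assuming each rename sits on a single line in the form `old -> new` and that `Scanning:` lines can be skipped. This is not the script that was eventually written for this ticket: the function names and the `sip` root element with its `name` attribute are hypothetical, while the `filename`, `previous` and `clean` element names follow the example in the description.

```python
import xml.etree.ElementTree as ET


def parse_detox_log(text):
    """Return a list of (old_path, new_path) pairs found in detox log text."""
    renames = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "Scanning: <dir>" header lines.
        if not line or line.startswith("Scanning:"):
            continue
        # Assumption: the first " -> " separates old and new paths, so a
        # filename that itself contains " -> " would be split incorrectly.
        old, sep, new = line.partition(" -> ")
        if sep:
            renames.append((old, new))
    return renames


def renames_to_xml(sip_name, renames):
    """Build an element tree with one <filename> entry per rename."""
    root = ET.Element("sip", {"name": sip_name})  # hypothetical root element
    for old, new in renames:
        entry = ET.SubElement(root, "filename")
        ET.SubElement(entry, "previous").text = old
        ET.SubElement(entry, "clean").text = new
    return root


if __name__ == "__main__":
    sample = (
        "Scanning: /tmp/example\n"
        "/tmp/example/Lake Chelan.JPG -> /tmp/example/Lake_Chelan.JPG\n"
    )
    tree = renames_to_xml("example", parse_detox_log(sample))
    print(ET.tostring(tree, encoding="unicode"))
```

Building the output with ElementTree rather than string concatenation means characters such as `&` or `<` in filenames are escaped automatically.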

#4 Updated by Austin Trask about 10 years ago

Transferring ownership to berwin22; I believe this issue is closed.

[g] New owner: berwin22

#5 Updated by Joseph Perry about 10 years ago

  • Status changed from New to Verified

/includes/archivematica/SIPxmlModifiers/addDetoxLogToSIP.py

The above script parses the detox log into the SIP.xml file.
