CLI task digital object path check throws error if absolute paths used
|Status:|Code Review|Start date:|12/02/2021|
|Google Code Legacy ID:| |Tested version:|2.6, 2.7|
First reported in the user forum on 2021-10-31, with a proposed solution on 2021-11-04: https://groups.google.com/g/ica-atom-users/c/_2fctwQMtc4/m/blX7IQjlAQAJ
Reported again in the forum by another user on 2021-12-02: https://groups.google.com/g/ica-atom-users/c/phE2_pIiXns/m/KkexXY-cBgAJ
The CLI task csv:digital-object-path-check expects users to provide two inputs: a path to a directory of images and a path to a CSV. However, the AtoM CSV import works best when the CSV contains absolute paths. When the path-check task runs, it combines the user-supplied directory path with the absolute path from the CSV, so the comparison fails. From a forum user:
In atom/lib/task/import/csvDigitalObjectPathsCheckTask.class.php, the getCsvColumnValues() function returns values from the CSV file. If these values are full path names, though, they won't match the output of getImageFiles(), which consists of filenames only, not pathnames.
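The mismatch described above can be illustrated with a small standalone sketch. This is not the task's actual code: the arrays below are hypothetical stand-ins for what getImageFiles() and getCsvColumnValues() are described as returning, and the file paths are invented for illustration.

```php
<?php
// Hypothetical stand-in for the output of getImageFiles():
// filenames only, as described in the forum post.
$imageFiles = ['photo1.jpg', 'photo2.jpg'];

// Hypothetical stand-in for the output of getCsvColumnValues()
// when the digitalObjectPath column holds absolute paths.
$csvValues = [
    '/data/import/images/photo1.jpg',
    '/data/import/images/photo2.jpg',
];

// Files on disk that no CSV row appears to reference.
$unusedFiles = array_diff($imageFiles, $csvValues);

// CSV values that appear to have no matching file on disk.
$missingFiles = array_diff($csvValues, $imageFiles);

// Plain string comparison never matches a filename against a full path,
// so every file is reported as unused and every CSV value as missing.
print_r($unusedFiles);
print_r($missingFiles);
```

Because the two lists are compared as plain strings, even a CSV whose paths are all valid produces a report in which nothing matches.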
To reproduce
- Prepare an archival description CSV that includes absolute path values in the digitalObjectPath column
- Add the directory of images and the CSV to a location accessible by AtoM
- Run the path check task
Resulting error
Despite the paths being correct, all objects are reported as unused. The path values found by the task do not match those in the CSV.
Expected result
The path check task successfully checks absolute paths in the CSV.
Suggested fix from the forum
In getCsvColumnValues(), replace the existing line that adds the CSV value to $values with:
    // Remove absolute path leading to image file.
    $relativeFilePath = basename($row[$imageColumnIndex]);
    array_push($values, $relativeFilePath);
When I created a modified version of csvDigitalObjectPathsCheckTask.class.php that included this change, I got expected results.
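As a rough sketch of what the basename() approach does, the standalone example below reduces each CSV value to its filename before comparing it with the files on disk. The helper name normalizeCsvPaths() and the sample data are hypothetical and are not part of AtoM. Only the basename() call mirrors the suggested fix.

```php
<?php
/**
 * Hypothetical helper (not part of AtoM): reduce each CSV path value
 * to its final path component so it can be compared with bare filenames.
 */
function normalizeCsvPaths(array $csvValues): array
{
    $values = [];
    foreach ($csvValues as $value) {
        // basename() drops any leading directories and leaves
        // values that are already bare filenames unchanged.
        $values[] = basename(trim($value));
    }

    return $values;
}

// Hypothetical sample data: one absolute path, one bare filename.
$imageFiles = ['photo1.jpg', 'photo2.jpg'];
$csvValues = ['/data/import/images/photo1.jpg', 'photo2.jpg'];

$normalized = normalizeCsvPaths($csvValues);

// With normalized values, no file on disk is left unmatched.
$unusedFiles = array_diff($imageFiles, $normalized);
var_dump(count($unusedFiles));
```

One trade-off to keep in mind: because basename() discards the directory portion, two CSV rows pointing at files with the same name in different subdirectories become indistinguishable after normalization.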