Feature #10692

Script to compare CSV digital object path data (for imports) with corresponding files

Added by Mike Cantelon over 3 years ago. Updated 4 months ago.

Status: Verified
Start date: 12/16/2016
Priority: Medium
Due date:
Assignee: Mike Cantelon
% Done: 100%
Category: CLI tools
Target version: Release 2.4.0
Google Code Legacy ID:
Tested version: 2.4, 2.6
Sponsored: No
Requires documentation:

Description

Create script with logic that will compare CSV data to corresponding files to determine which files are unused, which are referenced in CSV data but missing, and which files are used more than once.

History

#2 Updated by Mike Cantelon over 3 years ago

  • Status changed from In progress to Code Review
  • Assignee changed from Mike Cantelon to Nick Wilkinson

#3 Updated by Nick Wilkinson over 3 years ago

  • Assignee changed from Nick Wilkinson to José Raddaoui Marín

#4 Updated by José Raddaoui Marín over 3 years ago

  • Category changed from CSV import to CLI tools
  • Status changed from Code Review to Feedback
  • Assignee changed from José Raddaoui Marín to Mike Cantelon

Looks great!

#5 Updated by Mike Cantelon over 3 years ago

  • Status changed from Feedback to Document
  • Assignee changed from Mike Cantelon to Dan Gillean

Skipping QA as this is a dev-centric feature.

#6 Updated by Dan Gillean over 3 years ago

  • Assignee changed from Dan Gillean to Mike Cantelon

Hey Mike,

I'll still have to test this to be able to document it. Can you give me a bit of context (how to invoke the task, what it does, etc.) that will help me test and document this? Thanks!

#7 Updated by Mike Cantelon over 3 years ago

  • Status changed from Document to Feedback
  • Assignee changed from Mike Cantelon to Dan Gillean

Hi Dan,

This is a CLI task to help with imports that involve digital objects. It tells you whether any of the digital object filenames specified in the CSV have issues, and whether there are files in a filesystem directory that aren't included in your CSV data.

You point it to a CSV file and a directory in the filesystem, and it runs through the values in the CSV file's digitalObjectPath column. Once it has run, it reports on the following:

  • Which files in the filesystem directory aren't referenced in the CSV data
  • Which files are referenced in CSV data but missing on the filesystem
  • Which files are referenced more than once in the CSV data

Let me know if that makes sense!
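
For anyone testing or documenting this, the three checks described above can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not the task's actual code (the task itself is a PHP symfony task). The function name check_digital_object_paths and the assumption that CSV values are paths relative to the given directory are hypothetical; only the digitalObjectPath column name comes from this ticket.

```python
import csv
from collections import Counter
from pathlib import Path


def check_digital_object_paths(csv_path, directory, column="digitalObjectPath"):
    """Compare one CSV column of file paths against the files under a directory.

    Hypothetical helper: assumes the CSV values are paths relative to
    `directory`, which may not match how the real task resolves paths.
    """
    referenced = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            value = (row.get(column) or "").strip()
            if value:  # skip rows with no digital object path
                referenced.append(value)

    root = Path(directory)
    # Every regular file under the directory, as a POSIX-style relative path
    on_disk = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}

    counts = Counter(referenced)
    referenced_set = set(counts)
    return {
        "unused": sorted(on_disk - referenced_set),      # on disk, not in the CSV
        "missing": sorted(referenced_set - on_disk),     # in the CSV, not on disk
        "duplicated": sorted(name for name, n in counts.items() if n > 1),
    }
```

Calling check_digital_object_paths("import.csv", "objects/") returns a dictionary with one sorted list per check, which makes it easy to compare against what the task reports for the same inputs. The file and directory names here are placeholders.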

#8 Updated by Dan Gillean over 3 years ago

  • Status changed from Feedback to QA/Review

To see the task and its options:

php symfony help csv:digital-object-path-check

#9 Updated by Dan Gillean about 3 years ago

  • Assignee deleted (Dan Gillean)

#10 Updated by Nick Wilkinson almost 3 years ago

  • Assignee set to Mike Cantelon

Hi Mike, further to the email I sent out, assigning this to you.

#11 Updated by Mike Cantelon almost 3 years ago

  • Status changed from QA/Review to Code Review
  • Assignee changed from Mike Cantelon to Nick Wilkinson

Found a small issue with the task and fixed it.

PR for CR: https://github.com/artefactual/atom/pull/595

#12 Updated by José Raddaoui Marín almost 3 years ago

  • Status changed from Code Review to Feedback
  • Assignee changed from Nick Wilkinson to Mike Cantelon

Looks great!

#13 Updated by Mike Cantelon almost 3 years ago

Thanks Radda!

#14 Updated by Mike Cantelon almost 3 years ago

  • Status changed from Feedback to QA/Review

I'm going to give this another quick test then will mark it verified if I find no issues.

#15 Updated by Mike Cantelon almost 3 years ago

  • Status changed from QA/Review to Verified

#16 Updated by Dan Gillean 4 months ago

  • % Done changed from 0 to 100
  • Requires documentation deleted (Yes)
  • Tested version 2.4, 2.6 added
