add support for consuming fixity reports from DuraCloud
|Category:||-||Estimated time:||8.00 hours|
The upcoming release of DuraCloud has some new REST API endpoints.
Documentation about the manifest REST API endpoint can be found here: https://wiki.duraspace.org/display/DURACLOUDDOC/DuraCloud+REST+API#DuraCloudRESTAPI-GetManifest
There is a test instance available (credentials in LastPass as DuraCloud QA Server).
The new endpoint we want to use is called GetManifest. It returns either a tab-separated file or a BagIt-style manifest listing all the files in a given DuraCloud space, together with their md5 checksums.
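As a rough sketch of how the tab-separated output might be consumed: the function below assumes a header row followed by three columns per row (space-id, content-id, MD5). That column layout and the name `parse_manifest_tsv` are assumptions for illustration and should be checked against the linked documentation and the QA server.

```python
import csv
import io


def parse_manifest_tsv(text):
    """Parse a tab-separated manifest into {content_id: md5}.

    Assumed layout (not confirmed): one header row, then rows of
    space-id <TAB> content-id <TAB> MD5.
    """
    # QUOTE_NONE: treat quote characters in filenames as ordinary text.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = iter(reader)
    next(rows, None)  # skip the assumed header row
    checksums = {}
    for line_no, row in enumerate(rows, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            # A filename containing a tab would end up here.
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(row)}")
        _space_id, content_id, md5 = row
        checksums[content_id] = md5.lower()
    return checksums
```

With this approach a content-id containing a literal tab produces a row with more than three fields and is rejected, so the sketch fails loudly rather than guessing how such names are escaped.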
Possible questions to ask DuraCloud include:
Can checksums other than md5 be supplied?
Can checksums be supplied for a single file?
#2 Updated by Holly Becker almost 7 years ago
- Can non-md5 checksums be supplied?
  - md5 has a decent likelihood of collisions; use sha256 (also what AM uses)
- Will all checksums be the same format?
- Is the checksum the same as returned in the Content-MD5 header of HEAD on the file?
- Can we get checksums for a subset of files? E.g., all files that start with a prefix?
- How are filenames with a tab in them handled in the TSV?
- Are checksums generated on demand? Can we see when the checksum was generated? Can we force a new checksum? Times are preferred in ISO 8601.
- Why return the space-id if it's part of the URL? Can multiple space-ids be specified? How?
- Is this paginated? How many files can be returned at once? How do I get the next page?
- What encoding are the filenames returned in? UTF-8?
  - More generally, how does DuraCloud handle encodings of filenames?
- Suggestion: Use JSON as a format
  - More flexible for future changes
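
To make the checksum questions above concrete, here is a minimal sketch of the fixity comparison a consumer would run once it has the manifest's checksums. It assumes the manifest has already been reduced to a mapping of content-id to md5 hex digest and that local copies of the files exist. The names `file_md5` and `check_fixity` are hypothetical, and md5 is used only because that is what the manifest is described as providing.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the md5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(expected, local_paths):
    """Compare expected checksums against local files.

    expected:    {content_id: md5 hex digest} taken from the manifest
    local_paths: {content_id: path to the local copy}
    Returns {content_id: "ok" | "mismatch" | "missing"}.
    """
    results = {}
    for content_id, expected_md5 in expected.items():
        path = local_paths.get(content_id)
        if path is None:
            results[content_id] = "missing"
        elif file_md5(path) == expected_md5.lower():
            results[content_id] = "ok"
        else:
            results[content_id] = "mismatch"
    return results
```

If DuraCloud can supply sha256, swapping `hashlib.md5` for `hashlib.sha256` would be the only change needed in this sketch.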