Feature #11082

Add CLI task for exporting terms, in a given taxonomy, that have been associated with one or more information objects

Added by Mike Cantelon about 3 years ago. Updated about 1 year ago.

Status: Verified
Start date: 04/18/2017
Priority: Medium
Due date:
Assignee: Mike Cantelon
% Done: 0%
Category: CLI tools
Target version: Release 2.4.0
Google Code Legacy ID:
Tested version: 2.4
Sponsored: Yes
Requires documentation:

Description

This task, when run against a specific taxonomy, generates a CSV listing the terms that are linked to one or more archival descriptions (information objects). The task gives a count of how many times each term is used (i.e. a count of direct links to information objects; inherited links from a hierarchy are not counted). It does not list terms that are in the taxonomy but currently unused.

The CSV output for the task includes the following columns:
  • id: the internal object ID of the term
  • parentId: the object ID of the parent to which the term is linked. Even in a taxonomy that is not organized hierarchically, terms are linked to a root term object. If the terms are organized hierarchically, then the parentId value will be the object ID of the parent term.
  • taxonomy: the ID of the taxonomy to which the terms belong. In AtoM, typically the Subjects taxonomy ID is 35; Places is 42, etc.
  • name: the authorized/preferred form of name for the term
  • sourceCulture: the culture in which the term was created - generally a 2 letter ISO language code value (e.g. en, fr, es, etc)
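  • culture: the culture of the term name shown in the row. In testing, this matched the default culture of the installation rather than the culture in which the term was created (see notes #12 and #13 below).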
  • use_count: a simple count of the number of times the term has been directly linked to an information object (archival description). Inherited relationships are not counted - e.g. in a hierarchy of Canada > Ontario > Toronto, when Toronto is linked to an information object, Canada and Ontario do not also receive a count.
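
The direct-link counting rule for use_count can be illustrated with a small Python sketch. This is not the task's code: the link data, the hierarchy, and the function name direct_use_counts are all hypothetical, chosen only to mirror the Canada > Ontario > Toronto example above.

```python
from collections import Counter

# Hypothetical direct links: (information object ID, term name).
direct_links = [
    (101, "Toronto"),
    (102, "Toronto"),
    (103, "Ontario"),
]

# Hypothetical hierarchy (child -> parent). It is deliberately NOT consulted
# when counting, because inherited links do not contribute to use_count.
parents = {"Toronto": "Ontario", "Ontario": "Canada"}


def direct_use_counts(links):
    """Count only direct term-to-description links; ancestors receive nothing."""
    return Counter(term for _, term in links)


counts = direct_use_counts(direct_links)
print(counts["Toronto"], counts["Ontario"], counts["Canada"])  # prints: 2 1 0
```

In this sketch Canada has a count of 0, so, per the description above, it would not appear in the export at all.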

To see the help for the task:

php symfony help csv:export-term-usage

Sample command to return terms currently used in the Subjects taxonomy in English:

php symfony csv:export-term-usage --taxonomy-name="Subjects" --taxonomy-name-culture=en /vagrant/my-subjects.csv
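
Once the file has been written, it can be post-processed with any CSV tool. The following is a minimal Python sketch that ranks terms by use_count. It assumes the header row uses the column names listed above, in that order; the sample rows and the helper name most_used_terms are illustrative and not real export data.

```python
import csv
import io

# Illustrative rows only. The header is assumed to match the column list above.
SAMPLE = """id,parentId,taxonomy,name,sourceCulture,culture,use_count
501,110,35,Agriculture,en,en,12
502,110,35,Forestry,en,en,3
503,110,35,Pêche,fr,en,7
"""


def most_used_terms(csv_file, limit=10):
    """Return (name, use_count) pairs, most frequently used terms first."""
    rows = list(csv.DictReader(csv_file))
    rows.sort(key=lambda row: int(row["use_count"]), reverse=True)
    return [(row["name"], int(row["use_count"])) for row in rows[:limit]]


top = most_used_terms(io.StringIO(SAMPLE), limit=2)
print(top)  # [('Agriculture', 12), ('Pêche', 7)]
```

For a real export, replace io.StringIO(SAMPLE) with a file opened via open("/vagrant/my-subjects.csv", newline="", encoding="utf-8").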

cli-terms-used-task-help.png - CLI help output for the task (17.2 KB) Dan Gillean, 04/18/2017 04:22 PM

cli-terms-used-sample-csv-output.png - Sample output of resulting CSV (24.5 KB) Dan Gillean, 04/21/2017 10:09 AM

translation-tests.png (20.4 KB) Dan Gillean, 05/31/2017 03:02 PM


Related issues

Related to Access to Memory (AtoM) - Bug #12696: CSV term export task's culture parameter does not work as... New 01/10/2019

History

#2 Updated by Dan Gillean about 3 years ago

  • File cli-terms-used-sample-csv-output.png added
  • File cli-terms-used-task-help.png added
  • Description updated (diff)
  • Category set to CLI tools
  • Status changed from New to Feedback

Mike, FYI, I tried the following:

  • Flip UI to French
  • Create a French term (default installation culture: en)
  • Link new French term to a translated French description
  • Run task (without specifying any target culture)

The resulting output had no name listed for the French term - looks like without a translation, there is no culture fallback?

Additionally, what is the "code" field? Is it the same code field from the UI? If so, we can exclude this - it is used to add decimal-based lat and long coordinates, and will generate a static Google map based on those. I don't see how that is useful to the output, and the column only really ever has data in the Places taxonomy, if that.

Nice to have: a count of number of information objects to which each term is linked

#3 Updated by Mike Cantelon about 3 years ago

  • Status changed from Feedback to QA/Review
  • Assignee changed from Mike Cantelon to Dan Gillean

Hi Dan. I got rid of the code field, added term use count, and added culture fallback. Mike G reviewed it and it's now merged into qa/2.4.x.

#4 Updated by Dan Gillean about 3 years ago

  • Description updated (diff)
  • Status changed from QA/Review to Feedback
  • Assignee changed from Dan Gillean to Mike Cantelon
  • Sponsored changed from No to Yes
  • Requires documentation set to Yes
  • Tested version 2.4 added

Hi Mike,

The changes look good! And I saw the fr term in my export without specifying culture, which is great.

However, I tried to run the task using --taxonomy-name-culture=fr (also tried --taxonomy-name-culture="fr") and would always get an error:

Invalid taxonomy-name and/or taxonomy-name-culture.

Otherwise, lookin' good.

#5 Updated by Dan Gillean about 3 years ago

Also, can you explain the difference between the sourceCulture and the culture columns?

For example, I have a default-en installation. I flipped the UI to french and created a term, and in the resulting output, sourceCulture was fr and culture was en. Is culture the default culture of the installation, or what? Thanks.

#6 Updated by Dan Gillean about 3 years ago

  • File deleted (cli-terms-used-sample-csv-output.png)

#7 Updated by Dan Gillean about 3 years ago

Added updated screenshot of CSV output

#8 Updated by Dan Gillean about 3 years ago

update: had a thought, and it turned out to be true, but still not seeing what I expected.

It seems that if I use --taxonomy-name-culture=fr, then for the task to run successfully, I must also give the name of the taxonomy in French.

However, when I tried, what was returned was not in fact just a list of French terms. In my instance there are many terms first made in English but translated into French, and I created one new term in French and linked it. As output, I got a mix of both en and fr results; in fact, the exact same output I get if I run it without the --taxonomy-name-culture option. So.... that doesn't quite seem to be working.

#9 Updated by Mike Cantelon about 3 years ago

  • Assignee changed from Mike Cantelon to Dan Gillean

Hi Dan... the --taxonomy-name-culture option currently just exists to specify the culture used for the --taxonomy-name option's lookup (i.e. look for the taxonomy with this French name). It doesn't filter the export by culture. I think we talked about the culture handling in chat and the chat log for that's no longer available in Slack, so let me know if I misunderstood.
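
Since the export itself is not filtered by culture, one possible workaround is to filter the resulting CSV afterwards. The Python sketch below keeps only rows whose sourceCulture matches a given value. It assumes the header names match the column list in the description; the sample rows and the function name rows_with_source_culture are hypothetical.

```python
import csv
import io

# Illustrative rows only; the header is assumed to match the documented columns.
EXPORT = """id,parentId,taxonomy,name,sourceCulture,culture,use_count
501,110,35,Agriculture,en,en,12
503,110,35,Pêche,fr,en,7
504,110,35,Visserij,nl,en,1
"""


def rows_with_source_culture(csv_file, culture):
    """Keep only the rows for terms that were created in the given culture."""
    return [row for row in csv.DictReader(csv_file) if row["sourceCulture"] == culture]


french_rows = rows_with_source_culture(io.StringIO(EXPORT), "fr")
print([row["name"] for row in french_rows])  # ['Pêche']
```

Note that this selects terms by the culture they were created in, not by the languages they have been translated into, which the export does not report.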

#10 Updated by Dan Gillean about 3 years ago

  • Assignee changed from Dan Gillean to Mike Cantelon

Ahhhh I see now. Wow, that is not intuitive. Thanks for clarifying.

Can you also clarify, for my documentation, the difference between the culture column and the sourceCulture column? It seems that sourceCulture has to do with what culture was used to create the term, but what is the culture column then? For example, I made a term in French, in an installation that had en as the default installation culture. In the sourceCulture column of my export the French term listed fr, but in the culture column all terms listed "en". Is this just the culture of the installation?

If so, I might recommend we drop this from the export, as it doesn't seem very useful. Curious to know more about particular behaviors in multilingual AtoM instances that might make this useful to keep, though.

#11 Updated by Mike Cantelon about 3 years ago

Yeah it's definitely a little odd.

I'll look into the culture/sourceCulture thing.

#12 Updated by Mike Cantelon about 3 years ago

Yeah, source_culture seems to be set to the language selected when the term's created. And culture is the language of the individual term name.

I tried switching to French and creating a term: source culture was "fr" and culture was "fr". I switched to English, edited the term and added an English version of the name: source culture was "fr" and culture was "en".

#13 Updated by Dan Gillean about 3 years ago

Welp, I'm still seeing something a bit different than you, but I understand enough now to document.

See the attached image. In my tests, I created a number of terms in different cultures, added translations for some of them, linked them all to descriptions, and exported when my installation culture was a) set to English, and b) set to French.

I found, as shown in the image, that the culture column always seemed merely to show the installation culture - it didn't seem to matter what culture the term was made in. See for example the Dutch term.

If a term has translations, the export will not necessarily show the original term string - it will display the translation on export if that is what matches your installation culture. Again, see the French vs English terms in the attached image.

Otherwise, task works well. Marking verified.
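
The name selection observed in these tests can be modelled roughly as follows. This Python sketch is only an illustration of the behaviour described in this note (prefer the name in the installation culture, otherwise fall back to the name in the term's source culture). It is not AtoM's implementation, and the function name export_name and the sample terms are hypothetical.

```python
def export_name(names_by_culture, source_culture, installation_culture):
    """Pick the name to export: installation culture first, else source culture."""
    if installation_culture in names_by_culture:
        return names_by_culture[installation_culture]
    return names_by_culture[source_culture]


# A term created in French and later translated into English.
translated = {"fr": "Pêche", "en": "Fishing"}
# A term created in Dutch with no translations.
untranslated = {"nl": "Visserij"}

print(export_name(translated, "fr", "en"))    # Fishing
print(export_name(translated, "fr", "fr"))    # Pêche
print(export_name(untranslated, "nl", "en"))  # Visserij
```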

#14 Updated by Dan Gillean about 3 years ago

  • Requires documentation deleted (Yes)

#15 Updated by Mike Cantelon almost 3 years ago

I've merged the check for an existing export file to qa/2.4.x.

#16 Updated by Dan Gillean over 2 years ago

  • Requires documentation set to Yes

#17 Updated by Dan Gillean over 1 year ago

  • Related to Bug #12696: CSV term export task's culture parameter does not work as expected added

#18 Updated by Dan Gillean about 1 year ago

  • Requires documentation deleted (Yes)
