XML import not converting <p> and <br /> tags
Assignee: José Raddaoui Marín
Target version: Release 1.4.0
Google Code Legacy ID: atom-719
Google user: rada...@gmail.com
<p> tags embedded in some EAD tags (eg <bioghist> <scopecontent>) not
recognized. Can be problematic in case of long admin history.
[g] Legacy categories: Import/Export, EAD
#3 Updated by Peter Van Garderen over 12 years ago
- Subject set to XML import not converting <p> and <br /> tags
will have to convert <p> to "\n\n" and <br /> to "\n" and </p> to "" on import. Can
write a simple QubitHelper to do this.
// Match both the literal tag and its entity-escaped form.
$replace1 = array("<br />", "&lt;br /&gt;");
$string1 = str_replace($replace1, "\r\n", $string);
$replace2 = array("<p>", "&lt;p&gt;");
$string2 = str_replace($replace2, "\r\n\r\n", $string1);
// Closing paragraph tags are simply dropped.
$replace3 = array("</p>", "&lt;/p&gt;");
return str_replace($replace3, "", $string2);
However, this helper cannot be included in the current XML import configuration. Will
split up and simplify XML import per standard in release 1.1 and include this
capability at that time.
As a workaround for release 1.0.8, will use nl2br to include <br /> tags to preserve
linebreaks and paragraphs in the content data.
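A minimal sketch of what the nl2br workaround does to a field value. The sample string is made up for illustration; only the built-in nl2br() call is taken from the comment above.

```php
<?php
// Hypothetical field value with a line break and a paragraph break.
$history = "First line\nSecond line\n\nNew paragraph";

// nl2br() inserts "<br />" before each newline; the newline itself is kept.
$withBreaks = nl2br($history);

echo $withBreaks;
```

The resulting tags end up inside the stored text, so they still have to be escaped when the value is written into XML.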
As of 1.0.8 <p> and <br /> will (re)import into Qubit literally as long as the '<'
and '>' reserved characters are escaped properly ('&lt;' and '&gt;'). The Symfony
output escaping respects and displays these properly in the view and show templates.
This just means that tags will be embedded within the regular text content that is
stored in the dbase (not ideal).
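A small sketch of the escaping described above, assuming the field content is a plain PHP string. The sample content is hypothetical, and htmlspecialchars_decode() only stands in for the entity decoding an XML parser performs on import; it is not the Qubit import code.

```php
<?php
// Hypothetical field content containing literal HTML tags.
$content = "<p>First paragraph.</p>Line one<br />Line two";

// Escape the reserved characters so the tags survive as text in the XML.
$escaped = htmlspecialchars($content, ENT_XML1, 'UTF-8');

// Stand-in for the XML parser: decoding the entities restores the tags.
$imported = htmlspecialchars_decode($escaped, ENT_XML1);

echo $escaped, "\n", $imported, "\n";
```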
#7 Updated by Anonymous about 11 years ago
- Priority changed from Low to High
- File ICA-AtoMDemoScreenGrab.jpg added
- File QubitTrunkScreenGrab.jpg added
- File USaskScreenGrab.jpg added
"Import EAD.XML removes line break Formatting" discovered by Jessica Bushey while testing Wu's patch for /p/qubit-toolkit/issues/detail?id=1132.
See Screenshots attached:
*ICA-AtoM Demo ScreenGrab shows accurate presentation of Context Area formatting. This screengrab was taken prior to deleting the townley.ead.xml file. After deletion I imported the townley.ead.xml file and the presentation was no longer accurate. See following screengrabs for example of inaccurate formatting.
*Qubit Trunk ScreenGrab shows presentation after importing townley.ead.xml file. Note: inaccurate presentation of Context Area formatting.
*USaskScreenGrab shows presentation after importing townley.ead.xml file. Note: inaccurate presentation of Context Area formatting.
[g] Labels added: Priority-High
[g] Labels removed: Priority-Low
[g] New owner: Jessica Bushey
#8 Updated by Anonymous about 11 years ago
Qubit Development Thread - http://groups.google.com/group/qubit-dev/browse_thread/thread/563379fb9ad113a1?hl=en related to this Issue.
#9 Updated by Tim Hutchinson about 11 years ago
A few things:
- the development thread indicates that export works as expected. This does not seem to be the case, since paragraphs and line breaks are not retained on export (testing in the demo site with the same record)
- the relevant EAD element for linebreak is <lb/>, not <br/>
- in USask testing, we made a change to retain <p>'s on import, so we may be able to contribute a patch later. However, this was not done with a QubitHelper as suggested above, so we'll need to review whether it's been done correctly. In any case, based on Peter's comments above, it doesn't seem too complicated.
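For illustration, a sketch of an import-side conversion that uses the EAD <lb/> element rather than <br />. The function name eadMarkupToNewlines is hypothetical; this is neither the USask change nor an existing QubitHelper.

```php
<?php
/**
 * Hypothetical helper (not part of Qubit): turn EAD <lb/> and <p>
 * markup inside a text field into plain newlines.
 */
function eadMarkupToNewlines($string)
{
    // <lb/>, <lb /> and <lb></lb> each become a single newline.
    $string = preg_replace('#<lb\s*/>|<lb\s*>\s*</lb>#', "\n", $string);
    // A closing </p> followed by an opening <p> marks a paragraph boundary.
    $string = preg_replace('#</p>\s*<p>#', "\n\n", $string);
    // Remaining <p> and </p> wrappers are dropped.
    $string = preg_replace('#</?p>#', '', $string);

    return trim($string);
}

echo eadMarkupToNewlines("<p>One<lb/>Two</p>\n<p>Three</p>");
```

Handling the paragraph boundary before stripping the remaining tags keeps adjacent paragraphs separated by exactly one blank line.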
#14 Updated by Anonymous over 10 years ago
I'm trying to generate XML for import to ICA-AtoM which includes multiple values for description extents. According to instructions displayed when entering them manually, they should be separated with a linebreak.
I've tried separating the values in XML with <lb/>, <br/>, /n, and /n/n, and also tried using separate extent tags for each value - all to no avail: the values (including the text values of the separators) are all concatenated on the first line.
Have I missed something, or has anyone found a way round this?
#16 Updated by Anonymous about 10 years ago
- Target version changed from Release 1.3 to Release 2.1.0
When exporting to EAD, ICA-AtoM does not translate line breaks in a field as <p> or <br> tags within a parent element; when importing, it does not translate <p> or <br> tags within a parent element as line breaks in field data.
Assessment: this can be a problem when dealing with long data fields (e.g. administrative histories); it will require the user to manually go in and add line breaks after importing a finding aid.
Recommendation: Implement line breaks in EAD import/export.
[g] Labels added: Milestone-Release-2.0
[g] Labels removed: Milestone-Release-1.3
#21 Updated by Dan Gillean over 9 years ago
Thought: what about using EAD tags to solve this problem? Encoding the AtoM fields so that every carriage return adds an <lb> tag? Not sure about the feasibility of this from a dev point of view, but it is worth noting that the EAD Tag Library includes a line break tag (<lb>).
#24 Updated by José Raddaoui Marín about 9 years ago
As Stephen says in update #16 EAD export does not translate line breaks in a field as <p> or <br>. So the export looks like this:
<note><p>actor_history actor_history - a - b - c</p></note>
As Dan has recommended, I've replaced every '\n' with '<lb/>' in all fields in EAD export, so now the export looks like this:
<note><p>actor_history<lb/><lb/>actor_history<lb/><lb/>- a<lb/>- b<lb/>- c</p></note>
The tricky part was the import, replacing it back and trying not to mess with other imports. But I think finally I got it.
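A rough sketch of the round trip described in this update, assuming field values are plain strings. The helper names newlinesToLb and lbToNewlines are hypothetical, and the actual AtoM change is not shown in this issue.

```php
<?php
// Hypothetical export helper: newlines in a field become <lb/> elements.
function newlinesToLb($text)
{
    // Normalise Windows and old Mac line endings first.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);
    // Escape reserved XML characters before adding markup.
    $text = htmlspecialchars($text, ENT_XML1, 'UTF-8');

    return str_replace("\n", '<lb/>', $text);
}

// Hypothetical import helper: <lb/> elements become newlines again.
function lbToNewlines($markup)
{
    $text = str_replace(array('<lb/>', '<lb />'), "\n", $markup);

    return htmlspecialchars_decode($text, ENT_XML1);
}

$field = "actor_history\n\nactor_history\n\n- a\n- b\n- c";
$exported = '<note><p>' . newlinesToLb($field) . '</p></note>';

echo $exported;
```

Escaping before inserting the <lb/> elements means user-entered angle brackets cannot be mistaken for markup when the value is read back.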
#26 Updated by Dan Gillean about 9 years ago
- File Notes_Area_Example_CarriageReturns.png added
- File ScopeContent_Example_CarriageReturns.png added
- File ScopeContent_Example_CarriageReturns_EAD.png added
- Status changed from QA/Review to Feedback
This is a broad error which must apply to every field in AtoM. Testing on a sample EAD record did not show any carriage returns being preserved when roundtripping. As it is, when you enter a number of carriage returns in the AtoM template, only one line of separation is preserved on the showscreen regardless of how many breaks you insert when editing.
I've attached some screenshots, but I couldn't see line breaks being preserved anywhere during roundtripping. Interestingly, the line breaks were preserved in the EAD file (see screenshot), but as they were wrapped in a single <p> tag, they were not preserved when imported again.
#30 Updated by Dan Gillean about 9 years ago
- Status changed from QA/Review to Verified
EAD Linebreak (<lb>) tags have been successfully introduced, and roundtrip without issue. When a roundtrip fonds is exported again, the linebreaks are preserved in the EAD as well.
NOTE: The AtoM display interface will not display multiple linebreaks in a saved record: any number of carriage returns is represented as a single line break. However, when a user edits the record, line breaks and spacing are preserved in the edit template. This means that a user can easily delete <lb> tags from the EAD through the GUI, in the edit template. AtoM's display will not represent multiple carriage returns, though paragraph separations (i.e. 1-2 linebreaks) will appear. If a user really wants several linebreaks to appear in the display, we allow logged-in users to use the HTML <br /> tag for added linebreaks, and this will show up in the display as well.