Type: Sub-task
Resolution: Fixed
Fix Version/s: None
Affects Version/s: None
Component/s: VisualReview / VisualTranslation

Urgency: Medium
Checklist: Empty

If the zip-based import is used, the layout PDFs have to be provided in the subfolder 'visualReview', located at the same level as the proofRead folder.
If no PDF files are provided, 'visualReview' is not the default view mode in the editor.

Conception

Listen for the "beforeDirectoryParsing" event from "editor_Models_Import_Worker_Import".
- when this event is triggered
  - check whether the needed PDF files are in the temporary import folder
  - if the files exist
    - copy the files to "data/editorImportTask/[taskGuid]/[folder name where the pdfs will be stored]"
After the files are copied, the 'PdfToHtml' worker is started.
- the 'PdfToHtml' worker sends the PDF file to the parser. The parser returns the HTML converted from the PDF. This converted HTML file is loaded in the layout area
- the worker checks whether the PDF files exist in "data/editorImportTask/[taskGuid]/[folder name where the pdfs will be stored]"
From this point on, the 'PdfToHtml' and 'Import' workers run in parallel.
After both the 'PdfToHtml' and the 'Import' worker have finished, the 'Segmentation' worker is started.
- the job of this worker is to map the segments and add the needed JavaScript to the HTML file
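
The copy step described above could be sketched as follows. This is only an illustration in Python, not the project's actual code: the function name `copy_review_pdfs` and its parameters are hypothetical, and the target folder name is passed in by the caller because the ticket leaves it open.

```python
import shutil
from pathlib import Path


def copy_review_pdfs(import_dir: Path, task_dir: Path, target_name: str) -> list[Path]:
    """Copy layout PDFs from <import_dir>/visualReview to <task_dir>/<target_name>.

    Hypothetical helper: 'visualReview' comes from the ticket, while
    target_name stands in for the folder name the ticket does not specify.
    Returns the copied file paths; an empty list means no PDFs were provided.
    """
    source = import_dir / "visualReview"
    if not source.is_dir():
        return []
    # Only regular files with a .pdf extension count as layout files.
    pdfs = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    if not pdfs:
        return []
    target = task_dir / target_name
    target.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(pdf, target / pdf.name)) for pdf in pdfs]
```

An empty result would be the signal not to start the 'PdfToHtml' worker and not to use 'visualReview' as the default view mode.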

The PdfToHtml worker has no dependencies.
The Segmentation worker depends on:
- Import worker
- PdfToHtml worker
The Segmentation worker is started only after both the PdfToHtml and the Import worker have finished.
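
A minimal sketch of this dependency rule, written in Python for illustration only. The `DEPENDENCIES` table and the `runnable_workers` function are hypothetical names, and the sketch assumes that the Import worker itself has no dependencies, which the ticket does not state.

```python
# Worker name -> workers that must be finished before it may start.
# Assumption: Import has no dependencies (not stated in the ticket).
DEPENDENCIES: dict[str, set[str]] = {
    "Import": set(),
    "PdfToHtml": set(),
    "Segmentation": {"Import", "PdfToHtml"},
}


def runnable_workers(finished: set[str], started: set[str]) -> list[str]:
    """Return the workers that are not started yet and whose dependencies are all finished."""
    return sorted(
        name
        for name, deps in DEPENDENCIES.items()
        if name not in started and deps <= finished
    )
```

With nothing finished, only 'Import' and 'PdfToHtml' are returned; 'Segmentation' appears once both of them are in the finished set.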

Pseudo example of a PDF file converted to HTML, with the task mapping included:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"/>
</head>
<body>
	<div id="page-container">
		<div id="pf4" class="pf w2 h0" data-page-no="4">
			<div class="pc pc4 w2 h0">
				<span class="segment" data-segment-id="1000"><!-- START OF SEGMENT 1 -->
				<div class="t m0 x7 h9 y1c ff2 fs5 fc2 sc0 ls5 ws1">Industriepionier Ludwig von Hartmann, die Spinner<span class="_ _0"></span>ei Meebold.
				</div>
				</span><!-- END OF SEGMENT 1 -->
				
				<span class="segment" data-segment-id="1001"><!-- START OF SEGMENT 2 -->
				<div class="t m0 x7 h9 y1c ff2 fs5 fc2 sc0 ls5 ws1">Hallo, ich bin Aleksandar<span class="_ _0"></span>Mitrev.
				</div>
				</span><!-- END OF SEGMENT 2 -->
				
			</div>
		</div>
	</div>

</body>
</html>

Using the class name or the 'data-segment-id' attribute, we can easily recognize which text belongs to which segment.
On each segment span we should be able to append custom HTML:
- icons for comments
- tooltips
- text highlighting
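
To illustrate how the 'data-segment-id' attribute makes the mapping recoverable, here is a small Python sketch that collects the text of every segment span using only the standard library. The class `SegmentTextCollector` and the function `segment_texts` are hypothetical names; the real editor would do this in the browser, so this is just a way to check the markup.

```python
from html.parser import HTMLParser


class SegmentTextCollector(HTMLParser):
    """Collect the text inside each <span class="segment" data-segment-id="...">."""

    def __init__(self) -> None:
        super().__init__()
        self.segments: dict[str, str] = {}
        self._current: str | None = None  # id of the segment being read
        self._depth = 0                   # open <span> tags inside that segment

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._current is not None:
            self._depth += 1  # nested span, e.g. <span class="_ _0">
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if "segment" in classes and attr.get("data-segment-id"):
            self._current = attr["data-segment-id"]
            self._depth = 1
            self.segments[self._current] = ""

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                self._current = None  # the segment span itself was closed

    def handle_data(self, data):
        if self._current is not None:
            self.segments[self._current] += data


def segment_texts(html: str) -> dict[str, str]:
    """Map segment id -> whitespace-normalized text of that segment."""
    parser = SegmentTextCollector()
    parser.feed(html)
    parser.close()
    return {sid: " ".join(text.split()) for sid, text in parser.segments.items()}
```

The depth counter is needed because the converted HTML contains nested spans inside a segment, so the first closing span tag does not necessarily end the segment.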

In this case (zip import), there can be more than one PDF file.
Because of that, we need to show visually where one PDF layout ends and where the next one starts.
The best way to do this is to put borders between the PDF layouts that tell the user where each PDF layout file starts and where it ends.

The border can be, for example, a <div> block with top and bottom margin.
The combined layout should have the following content:

Start of file (filename 1)

(Pdf layout content 1)

End of file (filename 1)

Start of file (filename 2)

(Pdf layout content 2)

End of file (filename 2)
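
The structure above could be assembled roughly like this. It is a sketch in Python under stated assumptions: the function name `join_layouts`, the CSS class names, and the margin value are all hypothetical and only illustrate the start and end borders.

```python
from html import escape


def join_layouts(layouts: list[tuple[str, str]]) -> str:
    """Concatenate converted PDF layouts, wrapping each in start/end border blocks.

    layouts is a list of (filename, html_body) pairs in display order.
    Class names and the margin value are placeholders, not project conventions.
    """
    parts = []
    for filename, body in layouts:
        name = escape(filename)  # filenames may contain characters like & or <
        parts.append(
            f'<div class="file-border file-start" style="margin: 1em 0">'
            f"Start of file ({name})</div>"
        )
        parts.append(body)
        parts.append(
            f'<div class="file-border file-end" style="margin: 1em 0">'
            f"End of file ({name})</div>"
        )
    return "\n".join(parts)
```

For two files this yields six blocks in the order start, content, end, start, content, end, matching the listing above.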

Assignee: Aleksandar Mitrev
Reporter: Aleksandar Mitrev
Votes: 0
Watchers: 2

Created: 19/May/2017 06:03
Updated: 18/May/2023 10:29
Resolved: 26/Jun/2017 05:09
