Appearance
Parse Actions
Parse actions extract data from filenames or text files using regular expressions and populate DICOM tags with the extracted values.
Parse Types
DICOM Printer 2 supports two parse action types:
- ParseJobFileName - Extract data from the print job filename
- ParseJobTextFile - Extract data from a text file associated with the print job
ParseJobFileName
Extracts data from the filename of the print job using regular expressions.
Basic Syntax
xml
<ParseJobFileName name="ActionName">
<Pattern>regular_expression</Pattern>
<DcmTag tag="GGGG,EEEE" group="1"/>
<!-- Additional tag mappings -->
</ParseJobFileName>Note: Error handling (onError) is configured on <Perform> nodes in the workflow, not on the action definition. See Actions Overview for details.
Elements
<Pattern>
The regular expression pattern to match against the filename. Use capturing groups () to extract values.
<DcmTag>
Maps a captured group to a DICOM tag.
Attributes:
tag- The DICOM tag to populate (format:GGGG,EEEE)group- The regex capture group number (1-based)mandatory(optional) - Whether this tag must be successfully extracted- Valid Values:
true,false - Default:
false(tag is optional) - When
true, the parse action fails if the tag cannot be extracted - When
false, processing continues even if the tag is not found
- Valid Values:
replacePattern(optional) - Pattern to construct the final tag value from captured groups- Uses backreferences:
\1,\2,\3, etc. for captured groups - Default:
\1(first captured group) - Allows reformatting and combining multiple captured groups
- Uses backreferences:
Example: Extract Patient ID from Filename
xml
<ParseJobFileName name="ExtractPatientID">
<!-- Filename pattern: PatientID_12345.pdf -->
<Pattern>PatientID_(\d+)\.pdf</Pattern>
<DcmTag tag="0010,0020" group="1"/>
</ParseJobFileName>Use in workflow with error handling:
xml
<Perform action="ExtractPatientID" onError="Discard"/>If the filename is PatientID_12345.pdf, this extracts 12345 and sets tag (0010,0020) to that value.
Example: Extract Multiple Fields
xml
<ParseJobFileName name="ExtractPatientData">
<!-- Filename pattern: John_Doe_19800515_M.pdf -->
<Pattern>(\w+)_(\w+)_(\d{8})_([MF])\.pdf</Pattern>
<DcmTag tag="0010,0010" group="1,2"/> <!-- Patient Name: last^first -->
<DcmTag tag="0010,0030" group="3"/> <!-- Birth Date -->
<DcmTag tag="0010,0040" group="4"/> <!-- Sex -->
</ParseJobFileName>For filename John_Doe_19800515_M.pdf:
- Patient Name (0010,0010) =
Doe^John - Birth Date (0010,0030) =
19800515 - Sex (0010,0040) =
M
Example: Extract Study Information
xml
<ParseJobFileName name="ExtractStudyInfo">
<!-- Filename pattern: PATID12345_CT_20240315_ACC9876.pdf -->
<Pattern>PATID(\d+)_([A-Z]+)_(\d{8})_ACC(\d+)\.pdf</Pattern>
<DcmTag tag="0010,0020" group="1"/> <!-- Patient ID -->
<DcmTag tag="0008,0060" group="2"/> <!-- Modality -->
<DcmTag tag="0008,0020" group="3"/> <!-- Study Date -->
<DcmTag tag="0008,0050" group="4"/> <!-- Accession Number -->
</ParseJobFileName>Example: Mandatory vs Optional Tags
Use the mandatory attribute to control parse failure behavior:
xml
<ParseJobFileName name="ExtractWithValidation">
<!-- Filename pattern: ID12345_optional_data.pdf -->
<Pattern>ID(\d+)(?:_([A-Z]+))?\.pdf</Pattern>
<!-- Patient ID is mandatory - action fails if not found -->
<DcmTag tag="0010,0020" group="1" mandatory="true"/>
<!-- Modality is optional - action succeeds even if not found -->
<DcmTag tag="0008,0060" group="2" mandatory="false"/>
</ParseJobFileName>Use in workflow with error handling:
xml
<Perform action="ExtractWithValidation" onError="Discard"/>Behavior:
- Filename
ID12345.pdf: Success - Patient ID extracted, Modality empty - Filename
ID12345_CT.pdf: Success - Both extracted - Filename
report.pdf: Failure - Patient ID (mandatory) not found, job discarded
Example: Using replacePattern to Format Data
The replacePattern attribute allows you to reformat captured data:
xml
<ParseJobFileName name="FormatPatientName">
<!-- Filename pattern: FirstName_LastName_12345.pdf -->
<Pattern>(\w+)_(\w+)_(\d+)\.pdf</Pattern>
<!-- Format as DICOM PN: Last^First -->
<DcmTag tag="0010,0010" replacePattern="\2^\1"/>
<!-- Patient ID from third group -->
<DcmTag tag="0010,0020" replacePattern="\3"/>
</ParseJobFileName>For filename John_Doe_12345.pdf:
- Patient Name (0010,0010) =
Doe^John(reformatted) - Patient ID (0010,0020) =
12345
Advanced Formatting:
xml
<ParseJobFileName name="FormatComplexData">
<Pattern>(\d{4})(\d{2})(\d{2})_ID(\d+)\.pdf</Pattern>
<!-- Reformat date from YYYYMMDD to YYYY-MM-DD format (or keep as YYYYMMDD for DICOM) -->
<DcmTag tag="0008,0020" replacePattern="\1\2\3"/>
<!-- Add prefix to ID -->
<DcmTag tag="0010,0020" replacePattern="PAT\4"/>
</ParseJobFileName>For filename 20240315_ID00123.pdf:
- Study Date (0008,0020) =
20240315 - Patient ID (0010,0020) =
PAT00123
ParseJobTextFile
Extracts data from a text file using regular expressions.
Basic Syntax
xml
<ParseJobTextFile name="ActionName">
<Pattern>regular_expression</Pattern>
<DcmTag tag="GGGG,EEEE" group="1"/>
<!-- Additional tag mappings -->
</ParseJobTextFile>Note: Error handling (onError) is configured on <Perform> nodes in the workflow, not on the action definition.
The text file must have the same name as the print job with a .txt extension.
Example: Parse Text File
For a print job file report_12345.pdf with associated report_12345.txt:
Text file content (report_12345.txt):
Patient: John Doe
ID: 12345
Date: 2024-03-15
Accession: ACC9876Parse configuration:
xml
<ParseJobTextFile name="ParseReportData">
<Pattern>Patient:\s*(.+)</Pattern>
<DcmTag tag="0010,0010" group="1"/>
</ParseJobTextFile>
<ParseJobTextFile name="ParsePatientID">
<Pattern>ID:\s*(\d+)</Pattern>
<DcmTag tag="0010,0020" group="1"/>
</ParseJobTextFile>
<ParseJobTextFile name="ParseStudyDate">
<Pattern>Date:\s*(\d{4}-\d{2}-\d{2})</Pattern>
<DcmTag tag="0008,0020" group="1"/>
</ParseJobTextFile>Multiline Patterns
Use the (?s) flag to enable multiline matching:
xml
<ParseJobTextFile name="ParseMultiline">
<Pattern>(?s)Patient Information:.*?Name:\s*(.+?)\s*ID:\s*(\d+)</Pattern>
<DcmTag tag="0010,0010" group="1"/>
<DcmTag tag="0010,0020" group="2"/>
</ParseJobTextFile>Regular Expression Syntax
DICOM Printer 2 uses standard regular expression syntax:
Common Patterns
\d- Digit (0-9)\w- Word character (a-z, A-Z, 0-9, _)\s- Whitespace.- Any character+- One or more*- Zero or more?- Zero or one()- Capturing group[]- Character class|- Alternation (OR)
Examples
regex
\d{5} # Exactly 5 digits
[A-Z]{2,4} # 2 to 4 uppercase letters
\w+@\w+\.\w+ # Simple email pattern
\d{4}-\d{2}-\d{2} # Date: YYYY-MM-DD
[MF] # M or F (sex)Combining Multiple Parse Actions
xml
<Actions>
<!-- Parse patient ID from filename -->
<ParseJobFileName name="GetPatientID">
<Pattern>(\d{6})_\w+\.pdf</Pattern>
<DcmTag tag="0010,0020" group="1"/>
</ParseJobFileName>
<!-- Parse additional data from text file -->
<ParseJobTextFile name="GetPatientName">
<Pattern>Name:\s*(.+)</Pattern>
<DcmTag tag="0010,0010" group="1"/>
</ParseJobTextFile>
<ParseJobTextFile name="GetAccession">
<Pattern>Accession:\s*(\w+)</Pattern>
<DcmTag tag="0008,0050" group="1"/>
</ParseJobTextFile>
</Actions>
<Workflow>
<Perform action="GetPatientID"/>
<Perform action="GetPatientName"/>
<Perform action="GetAccession"/>
<Perform action="QueryWorklist"/>
</Workflow>Error Handling
Parse actions can fail in two ways:
- Pattern doesn't match - The regular expression doesn't match the filename/text
- Mandatory tag not extracted - A tag marked
mandatory="true"was not found
Configure error handling in the workflow using the onError attribute on <Perform> nodes:
onError="Discard"- Job is discarded on failureonError="Ignore"- Processing continues with empty tag valuesonError="Hold"- Job is held and retried lateronError="Suspend"- Job is suspended for manual review
Example: Mandatory Tags with Error Handling
Action definition:
xml
<ParseJobFileName name="ExtractRequiredData">
<Pattern>(\d+)_(.+)\.pdf</Pattern>
<!-- Patient ID is mandatory -->
<DcmTag tag="0010,0020" group="1" mandatory="true"/>
<!-- Patient Name is optional -->
<DcmTag tag="0010,0010" group="2" mandatory="false"/>
</ParseJobFileName>Workflow with discard on failure:
xml
<Perform action="ExtractRequiredData" onError="Discard"/>Behavior:
- If Patient ID cannot be extracted, the action fails and job is discarded
- If only Patient Name is missing, the action succeeds with empty Patient Name
Example: All Tags Optional
Action definition:
xml
<ParseJobFileName name="OptionalParsing">
<Pattern>(?:ACC)?(\w+)?</Pattern>
<!-- All tags are optional (default behavior) -->
<DcmTag tag="0008,0050" group="1"/>
</ParseJobFileName>Workflow:
xml
<Perform action="OptionalParsing" onError="Ignore"/>Processing continues even if the pattern doesn't match or no data is extracted.
Complete Example
xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DicomPrinter SYSTEM "config.dtd">
<DicomPrinter>
<Actions>
<!-- Parse patient ID and accession from filename -->
<!-- Expected format: 12345_ACC9876_Report.pdf -->
<ParseJobFileName name="ParseFilename">
<Pattern>(\d+)_ACC(\w+)_\w+\.pdf</Pattern>
<DcmTag tag="0010,0020" group="1"/> <!-- Patient ID -->
<DcmTag tag="0008,0050" group="2"/> <!-- Accession Number -->
</ParseJobFileName>
<!-- Parse patient name from text file -->
<ParseJobTextFile name="ParsePatientName">
<Pattern>Patient Name:\s*(.+)</Pattern>
<DcmTag tag="0010,0010" group="1"/>
</ParseJobTextFile>
<!-- Query worklist with parsed data -->
<Query name="FindWorklist" type="Worklist">
<ConnectionParameters>
<PeerAETitle>RIS</PeerAETitle>
<MyAETitle>PRINTER</MyAETitle>
<Host>192.168.1.200</Host>
<Port>104</Port>
</ConnectionParameters>
<DcmTag tag="0010,0020">#{PatientID}</DcmTag>
<DcmTag tag="0008,0050">#{AccessionNumber}</DcmTag>
</Query>
<!-- Store to PACS -->
<Store name="SendToPACS">
<ConnectionParameters>
<PeerAETitle>PACS</PeerAETitle>
<MyAETitle>PRINTER</MyAETitle>
<Host>192.168.1.100</Host>
<Port>104</Port>
</ConnectionParameters>
</Store>
</Actions>
<Workflow>
<!-- Parse data first -->
<Perform action="ParseFilename"/>
<Perform action="ParsePatientName"/>
<!-- Then query and store -->
<Perform action="FindWorklist"/>
<If field="QUERY_FOUND" value="true">
<Perform action="SendToPACS"/>
</If>
</Workflow>
</DicomPrinter>Best Practices
- Test patterns thoroughly with representative filenames
- Use specific patterns to avoid false matches
- Use
mandatory="true"for critical tags (e.g., Patient ID, Accession Number) to ensure data quality - Mark optional tags with
mandatory="false"explicitly for clarity in configuration - Use
replacePatternto format data according to DICOM standards (e.g., Patient Name asLast^First) - Validate extracted data using workflow conditionals
- Document expected filename formats in comments
- Use
onError="Discard"for invalid formats to avoid processing corrupt data - Combine with Query actions to validate extracted patient data against authoritative sources