
file_merger

1. Description

The File Merger program reads multiple input files in TXT, CSV, or CF format and merges their data according to user-defined field mappings. It combines data from the different sources as specified by those mappings and generates a single unified output file in TXT or CF format.

2. Program location

<installation_path>/file_merger

3. Screen Configuration

3.1 Process Sub Type

groupby_config_image

Select file_merger_txt to generate the output file in .TXT format.

groupby_config_image

Select the CF process sub type shown above to generate the output file in .CF format.

3.2 Process Config Fields

groupby_config_image

Provide all the input files to be merged: one file as the Base Input File and the others as Additional Input Files.

3.3 Process Derivations

Select the derivation type for each output field. Depending on the derivation type selected, a Required Fields section appears, which must be filled in with the values needed for that derivation.

3.3.1 Constant

The Constant Derivation Type is used to populate a specified constant value in place of a field from the input file.

  • Screen Configuration

constant_image

  • Required Fields

    • name: Output Field Name.
    • data_type: Output Field Type.
    • derivation
      • derivation_type: The derivation type used is "Constant".
      • req_fields
        • Value: The constant value to be assigned to the output field.
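
As an illustration of the Required Fields above, the following is a minimal sketch of how a Constant derivation could be represented and applied. The key nesting mirrors the list above, but the exact layout of the process config file is an assumption, and the field name, data type label, constant value, and the helper `apply_constant` are all hypothetical.

```python
# Hypothetical output field definition. The keys follow the Required Fields
# list above; the real process config layout may differ (assumption).
constant_field = {
    "name": "source_system",      # hypothetical output field name
    "data_type": "String",        # hypothetical data type label
    "derivation": {
        "derivation_type": "Constant",
        "req_fields": {"Value": "CORE"},  # hypothetical constant value
    },
}


def apply_constant(record: dict, field: dict) -> dict:
    """Return a copy of `record` with the constant output field populated.

    Illustrative helper only; it is not part of the File Merger program.
    """
    derivation = field["derivation"]
    if derivation["derivation_type"] != "Constant":
        raise ValueError("not a Constant derivation")
    out = dict(record)
    # The same constant value is written to every record.
    out[field["name"]] = derivation["req_fields"]["Value"]
    return out


merged = apply_constant({"account_id": "A001"}, constant_field)
```

In this sketch the constant does not depend on any input column, so every output record receives the same value.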

3.3.2 ColumnMapping

The Column Mapping Derivation Type is used to map values to an output column using fields from one or more input files.

For every output column that needs to be generated, a Column Mapping derivation must be configured. This allows you to specify which fields from different input files should be mapped to that particular output column.

  • Screen Configuration

constant_image

  • Required Fields

    • name: Output Field Name.
    • data_type: Output Field Type.
    • derivation
      • derivation_type: The derivation type used is "ColumnMapping".
      • req_fields
        • File 1 Field: The column from File 1 to be mapped to the output column, or a previously derived constant field.
        • File 2 Field: The column from File 2 to be mapped to the output column, or a previously derived constant field.
        • File 3 Field: The column from File 3 to be mapped to the output column, or a previously derived constant field.
        • File 4 Field: The column from File 4 to be mapped to the output column, or a previously derived constant field.
        • File 5 Field: The column from File 5 to be mapped to the output column, or a previously derived constant field.
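
The lookup described by the Required Fields above can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: it assumes that a record read from File N takes its value from the column named in "File N Field", and that a name not present in the record is looked up among previously derived constant fields. The column names and the helper `map_column` are hypothetical.

```python
# Hypothetical ColumnMapping definition for one output column. The keys follow
# the Required Fields list above; the real config layout may differ (assumption).
column_mapping_field = {
    "name": "customer_id",
    "data_type": "String",
    "derivation": {
        "derivation_type": "ColumnMapping",
        "req_fields": {
            "File 1 Field": "cust_id",        # hypothetical column in File 1
            "File 2 Field": "customer_no",    # hypothetical column in File 2
            "File 3 Field": "source_system",  # a previously derived constant field
        },
    },
}


def map_column(file_no: int, row: dict, field: dict, constants: dict):
    """Resolve the output value for a record that came from input file `file_no`.

    Illustrative helper only. Returns None when no mapping is configured
    for that file or the mapped name cannot be resolved.
    """
    req_fields = field["derivation"]["req_fields"]
    source = req_fields.get(f"File {file_no} Field")
    if source is None:
        return None
    if source in row:
        return row[source]
    # Fall back to a previously derived constant field of the same name.
    return constants.get(source)


value_from_file_1 = map_column(1, {"cust_id": "C1"}, column_mapping_field, {})
value_from_file_3 = map_column(3, {}, column_mapping_field, {"source_system": "CORE"})
```

One such definition would be needed per output column, matching the requirement that every output column has its own Column Mapping derivation.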

3.4 Process Arguments

groupby_pa_image

Enter the process arguments to be passed to the program. The mandatory and non-mandatory parameters are listed below.

3.4.1 Mandatory Parameters

| # | Parameter | Description | Example |
|---|-----------|-------------|---------|
| 1 | process-config | Path to the process config file that needs to be processed. | /path/to/process_config.json |
| 2 | as-on-date | The date for which the program has to run (format: DD-MM-YYYY). | 28-11-2024 |
| 3 | log-file | Contains the detailed logs of the program execution. | log.txt |
| 4 | diagnostics-log-file | Contains concise diagnostics logs including performance details. | diag-log.txt |

3.4.2 Non Mandatory Parameters

| # | Parameter | Description | Default value | Example |
|---|-----------|-------------|---------------|---------|
| 1 | log-level | Level of diagnostics written to the log file. | INFO | error/warn/info/debug/trace/none |
| 2 | diagnostics-flag | The flag that decides whether performance diagnostics will be written to the diagnostics log file. | false | true or false |
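
To check a set of arguments before entering them, the parameters listed above can be modelled with a small validator. The sketch below is an assumption-laden illustration using Python's argparse, not the program's own argument parser: the `--` prefix, the case-insensitive handling of log-level, and the helper names are hypothetical. Only the parameter names, defaults, and the DD-MM-YYYY date format come from the tables.

```python
import argparse
from datetime import date, datetime


def parse_as_on_date(text: str) -> date:
    """Parse a DD-MM-YYYY string; raises ValueError on any other format."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def parse_bool(text: str) -> bool:
    """Accept only 'true' or 'false' (case-insensitive)."""
    lowered = text.lower()
    if lowered not in ("true", "false"):
        raise argparse.ArgumentTypeError("expected true or false")
    return lowered == "true"


def build_parser() -> argparse.ArgumentParser:
    # The "--" prefix is an assumption; the documentation lists bare names.
    parser = argparse.ArgumentParser(prog="file_merger")
    # Mandatory parameters
    parser.add_argument("--process-config", required=True)
    parser.add_argument("--as-on-date", required=True, type=parse_as_on_date)
    parser.add_argument("--log-file", required=True)
    parser.add_argument("--diagnostics-log-file", required=True)
    # Non-mandatory parameters with their documented defaults
    parser.add_argument(
        "--log-level",
        type=str.lower,
        default="info",
        choices=["error", "warn", "info", "debug", "trace", "none"],
    )
    parser.add_argument("--diagnostics-flag", type=parse_bool, default=False)
    return parser


args = build_parser().parse_args([
    "--process-config", "/path/to/process_config.json",
    "--as-on-date", "28-11-2024",
    "--log-file", "log.txt",
    "--diagnostics-log-file", "diag-log.txt",
])
```

With the example values from the tables, `args.as_on_date` holds 28 November 2024 and the two optional parameters fall back to their defaults.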


5. Output

This program generates a unified output file in TXT or CF format by merging data from multiple input files based on the configured mappings.

Each output record is formed by combining fields from different input files as per the defined Column Mapping. The output structure and field values are entirely driven by the process configuration.

In addition to the output file, the program generates a detailed log file that captures the execution flow and any issues encountered during processing. A diagnostics log file is also generated, which provides a concise summary of execution details when the diagnostics flag is enabled.