
group_by_aggr

1. Description

The Group By Aggregation program reads input files in TXT, CSV, or CF format and aggregates structured data. It groups records by user-defined keys, computes aggregate functions (SUM, AVG, COUNT, MIN, MAX, and Weighted Average), and writes the aggregated output to a TXT file.

2. Program location

<installation_path>/group_by_aggr

3. Screen Configuration

3.1 Process Config Fields

groupby_config_image

Provide the details of the input file, exchange rate file, and rule file here.

3.2 Required Fields

groupby_rf_image

Provide the fields that contain the Account Number, Currency, and Exchange Rate, respectively.

3.3 Process Derivations

Select the derivation types here. For each derivation type selected, a required-fields section appears, which must be filled in with the values that derivation needs.

3.3.1 Derivation Descriptions

| # | Parameter | Description | Derivation Type |
|---|---|---|---|
| 1 | GroupBy | Defines the key fields used for grouping records; multiple fields can be combined to form a composite key. | GroupBy |
| 2 | SUM | Computes the sum of values for the specified expression after grouping. | SUM |
| 3 | AVG | Calculates the average of values for the specified expression within each group. | AVG |
| 4 | COUNT | Counts the number of records within each group. | COUNT |
| 5 | MIN | Finds the minimum value of the specified expression within each group. | MIN |
| 6 | MAX | Finds the maximum value of the specified expression within each group. | MAX |
| 7 | WtdAvg | Computes the weighted average based on value and weight expressions within each group. | WtdAvg |
| 8 | Expression | Computes derived values using expressions based on previously generated output fields. | Expression |
| 9 | Constant | Assigns a constant value to an output field for all grouped records. | Constant |
| 10 | ConfigParams | Populates configuration-based values such as AS_ON_DATE into output fields. | ConfigParams |
| 11 | DERIVE_LLG | Populates LLG values into the output field based on rule logic. | DERIVE_LLG |
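
The grouping and aggregation derivations listed above can be illustrated with a minimal Python sketch. It is not the program's implementation: the function name `group_by_aggregate`, the record layout, and the field names `ccy`, `llg`, `amount`, and `weight` are all hypothetical, and returning 0.0 when the weights sum to zero is an assumption made only for this example.

```python
from collections import defaultdict


def group_by_aggregate(records, key_fields, value_field, weight_field):
    """Group records by a composite key and compute the basic aggregates."""
    groups = defaultdict(list)
    for rec in records:
        # GroupBy: several fields are combined into one composite key.
        key = tuple(rec[field] for field in key_fields)
        groups[key].append(rec)

    results = {}
    for key, rows in groups.items():
        values = [row[value_field] for row in rows]
        weights = [row[weight_field] for row in rows]
        total_weight = sum(weights)
        results[key] = {
            "SUM": sum(values),
            "AVG": sum(values) / len(values),
            "COUNT": len(rows),
            "MIN": min(values),
            "MAX": max(values),
            # WtdAvg: sum(value * weight) / sum(weight).
            # Falling back to 0.0 for a zero total weight is an assumption.
            "WtdAvg": (
                sum(v * w for v, w in zip(values, weights)) / total_weight
                if total_weight
                else 0.0
            ),
        }
    return results


# Hypothetical sample records, grouped on the composite key (ccy, llg).
records = [
    {"ccy": "INR", "llg": "1001", "amount": 100.0, "weight": 1.0},
    {"ccy": "INR", "llg": "1001", "amount": 300.0, "weight": 3.0},
    {"ccy": "USD", "llg": "1002", "amount": 50.0, "weight": 2.0},
]
summary = group_by_aggregate(records, ["ccy", "llg"], "amount", "weight")
```

Here the two INR records fall into one group and the USD record into another, so `summary` holds one entry per unique key, mirroring one output record per group.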

3.4 Process Arguments

groupby_pa_image

Enter the process arguments to pass to the program here. The mandatory and non-mandatory parameters are listed below.

3.4.1 Mandatory Parameters

| # | Parameter | Description | Example |
|---|---|---|---|
| 1 | process-config | Path to process config file that needs to be processed. | /path/to/process_config.json |
| 2 | as-on-date | The date for which the program has to run (format: DD-MM-YYYY). | 28-11-2024 |
| 3 | log-file | Contains the detailed logs of the program execution. | log.txt |
| 4 | diagnostics-log-file | Contains concise diagnostics logs including performance and summary details. | diag-log.txt |
| 5 | decimal-places | Number of digits to retain after the decimal point (values are rounded). | 2 |
| 6 | default-llg | Default LLG value used when no rules are matched. | 8888 |
| 7 | src-ccy | Source currency of the input data. | INR |
| 8 | disp-ccy | Display currency for output representation. | RUP |
| 9 | consol-ccy | Consolidation currency used for aggregation. | USD |

3.4.2 Non Mandatory Parameters

| # | Parameter | Description | Default value | Example |
|---|---|---|---|---|
| 1 | log-level | Level of diagnostics written to the log file. | INFO | error/warn/info/debug/trace/none |
| 2 | diagnostics-flag | The flag that decides whether performance diagnostics will be written to the diagnostics log file. | false | true or false |
| 3 | negative-llgs | Comma-separated LLGs for which negative transformation is applied; if empty, applied to all LLGs. | "" | 1001,1002 |
| 4 | absolute-llgs | Comma-separated LLGs for which absolute transformation is applied; if empty, applied to all LLGs. | "" | 1003,1004 |
| 5 | is-consol | Flag to indicate whether amounts are already consolidated. | false | true or false |
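
To make the split between mandatory and non-mandatory parameters concrete, the following Python sketch models the arguments above with `argparse`. It is an illustration only and not the program's actual command-line parser: the `--` prefix on each parameter name, the helper functions, and the parsed types are assumptions. The parameter names, defaults, and example values are taken from the tables above.

```python
import argparse
from datetime import datetime


def parse_as_on_date(text):
    """Parse a DD-MM-YYYY string into a date; raises ValueError if malformed."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def parse_bool(text):
    """Accept only 'true' or 'false' (case-insensitive)."""
    lowered = text.lower()
    if lowered not in ("true", "false"):
        raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")
    return lowered == "true"


def parse_llg_list(text):
    """Split a comma-separated LLG list; an empty string yields an empty list."""
    return [item.strip() for item in text.split(",") if item.strip()]


def build_parser():
    # The "--name" flag syntax is an assumption made for this sketch.
    parser = argparse.ArgumentParser(prog="group_by_aggr")
    # Mandatory parameters (section 3.4.1).
    parser.add_argument("--process-config", required=True)
    parser.add_argument("--as-on-date", required=True, type=parse_as_on_date)
    parser.add_argument("--log-file", required=True)
    parser.add_argument("--diagnostics-log-file", required=True)
    parser.add_argument("--decimal-places", required=True, type=int)
    parser.add_argument("--default-llg", required=True)
    parser.add_argument("--src-ccy", required=True)
    parser.add_argument("--disp-ccy", required=True)
    parser.add_argument("--consol-ccy", required=True)
    # Non-mandatory parameters, with the defaults from section 3.4.2.
    parser.add_argument("--log-level", default="INFO")
    parser.add_argument("--diagnostics-flag", type=parse_bool, default=False)
    parser.add_argument("--negative-llgs", type=parse_llg_list, default=[])
    parser.add_argument("--absolute-llgs", type=parse_llg_list, default=[])
    parser.add_argument("--is-consol", type=parse_bool, default=False)
    return parser


# Example values copied from the parameter tables.
args = build_parser().parse_args([
    "--process-config", "/path/to/process_config.json",
    "--as-on-date", "28-11-2024",
    "--log-file", "log.txt",
    "--diagnostics-log-file", "diag-log.txt",
    "--decimal-places", "2",
    "--default-llg", "8888",
    "--src-ccy", "INR",
    "--disp-ccy", "RUP",
    "--consol-ccy", "USD",
    "--negative-llgs", "1001,1002",
])
```

Omitting any mandatory parameter, or passing a date that is not in DD-MM-YYYY format, makes this sketch exit with a usage error, while the non-mandatory parameters fall back to their listed defaults.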

Click ⬇️ to download the sample data.

5. Output

This program generates an output file in TXT format containing the aggregated results after performing the Group By operations. The output consists of grouped records based on the specified keys, along with the computed aggregation values such as SUM, AVG, COUNT, MIN, MAX, and Weighted Average.

Each output record represents a unique group formed using the configured Group By fields, and includes all derived fields as defined in the process configuration. All numeric values in the output are formatted according to the specified decimal precision, and all date fields are represented in DD-MM-YYYY format.
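
The formatting rules described above (fixed decimal precision and DD-MM-YYYY dates) can be sketched in Python as follows. This is an assumption-laden illustration, not the program's code: the helper names, the `|` field delimiter, and the half-up rounding mode are hypothetical, since the output delimiter and the exact rounding rule are not specified here.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def format_number(value, decimal_places):
    """Round a number to a fixed count of decimal places and render it as text."""
    quantum = Decimal(10) ** -decimal_places  # e.g. Decimal("0.01") for 2 places
    # ROUND_HALF_UP is an assumed rounding mode for this sketch.
    return str(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


def format_date(value):
    """Render a date as DD-MM-YYYY."""
    return value.strftime("%d-%m-%Y")


def format_record(fields, decimal_places, delimiter="|"):
    """Join one output record's fields; the '|' delimiter is hypothetical."""
    parts = []
    for field in fields:
        if isinstance(field, date):
            parts.append(format_date(field))
        elif isinstance(field, (int, float)):
            parts.append(format_number(field, decimal_places))
        else:
            parts.append(str(field))
    return delimiter.join(parts)


# One hypothetical grouped record: key fields, an aggregate, and the as-on date.
line = format_record(["INR", "1001", 400.0, date(2024, 11, 28)], 2)
```

With `decimal-places` set to 2, the aggregate 400.0 is written as `400.00` and the date as `28-11-2024` in this sketch.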

In addition to the output file, the program generates a detailed log file that captures the execution flow and any issues encountered during processing. A diagnostics log file is also generated, which provides a concise summary of execution details when the diagnostics flag is enabled.