Skip to main content

extractor

1. Description

Extractor program extract the output of a query from a database and write it in a specified output file. It takes one query at a time and extracts its output.

2. Program location

<installation_path>/extractor

3. Input

3.1 Mandatory Parameters

#ParametersDescriptionExample
1db_nameDatabase to be used.ORACLE/MSSQL
2querystore_connection_stringConnection credentials to connect with the database containing the querystore table.Oracle (Username|Password|IP:PORT/Servicename) and Mssql (Server|Database|UserName|Password)
3extraction_connection_stringConnection credentials to connect with the database from which the data is to be extracted.Oracle (Username|Password|IP:PORT/Servicename) and Mssql (Server|Database|UserName|Password)
4querystore_schema_nameSchema name in the database from where query to be fetcheddbo
5output_file_pathFile path where the extracted data from database is to be finally written.test-bed/extractor.txt
6log_filePath to write logstest-bed/extractor_log.txt
7diagnostics_log_filePath to write diag logstest-bed/extractor_diag_log.txt
8qidQueryId of the query to be fetched from the database.1,2

3.2 Non Mandatory Parameters

#ParametersDescriptionDefault valueExample
1as_on_dateDate to be replaced in the fetched queryEmpty string07-06-2024
2threadsNumber of parallel processing to be used for fetching data.Only in case of ORACLE.44
3log_levelLevel of diagnostics written to the log file.infoerror/warn/info/debug/trace/none
4diagnostic-flagFlag used to enable detailed loggingtruetrue/false
5arraysizeNumber of rows to be fethced from the database in a single round-trip.Only in case of ORACLE.300100
6optionAdditional parameters to be used for sqlcmd,sqlldr & sqlplus.The option parameters should be pipe seperated,containing the identification key and its value.""m,2|l,10
7intermediary_file_pathFile path where the extracted data from database is to be temporarily written.test-bed/intermediary.txt

4. Output

4.1 Intermediary File

If the path to an intermediary file path is specified then the output of the query execution is first written into the intermediary file. Then the data from intermediary file is written into the output file specified.

Note: After writing from intermediary file to output file, the intermediary file gets deleted.

4.2 Output File

If intermediary file path is not specified, the output is directly written into the output file path.

4.3 Health report

The health check report gives summary about the number of rows extracted and not extracted from the database.🔗

Output derivation logic:

  • Fetch Query :
    The program uses querystore_connection_string to connect with the database containing the QueryStore table and extract the query using the queryid(qid) provided by the user.

    It will replace {AsOnDate} if present in the extracted query with the as_on_date.


  • Extraction :
    Runs the fetched query and extracts the output of the query from the database and write it into the intermediary file.
    From intermediary file it writes the data into the final output file.

    Note: The rows in output are seperated by pipe "|" in case of MSSQL.
    For ORACLE, separator will be given by the user in the query itself.
    Example for ORACLE: SELECT PP_TD.ACCNo || '|' || PP_TD.Branchcode || '|' || PP_TD.CustNo from tablename.
    In this case the separator is "|".

5. Process configuration in streamdef table:

  .........
{
"processName": "extractor",
"processId": "1",
"processBinary": "{PROGRAMS}/IND/extractor",
"processArguments": [
"--db-name",
"{DB}",
"--querystore-connection-string",
"{RC_CONN_STR}",
"--extraction-connection-string",
"{EXTR_CONN_STR}",
"--querystore-schema-name",
"{RC_SCHEMA}",
"--qid",
"1",
"--output-file-path",
"{SH_PREPROCESS}/IND/BALM/{ddmmyyyy}/output.txt",
"--intermediary-file-path",
"{SH_PREPROCESS}/IND/BALM/{ddmmyyyy}/intermediary.txt",
"--log-file",
"{SH_LOGS}/IND/BALM/{ddmmyyyy}/extractor_log.txt",
"--diagnostics-log-file",
"{SH_LOGS}/IND/BALM/{ddmmyyyy}/extractor_diag_log.txt"
],
"processDependencies": [
""
],
"processReport": "{SH_PREPROCESS}/IND/BALM/{ddmmyyyy}/output-health-check-report.json",
"processType": "TOOLS",
"infoLog": "{SH_LOGS}/IND/BALM/{ddmmyyyy}/extractor_log.txt",
"errorLog": ""
}
.........