F-HARMONIZE Module

The F-HARMONIZE module harmonizes radiomics features to correct batch effects caused by variables such as scanner, center, or other batch-related factors. Harmonization can be performed:

  • Internally, using radiomics and batch data from input Excel files (radiomics_filename and batch_filename).

  • Externally, using both the above files and additional radiomics and batch data from previously built models (specified using radiomics_from_model_filename and batch_from_model_filename).

[Figure: f-harmonize_folders.jpg]

Modes of Harmonization

  • Internal Mode (default): Harmonization occurs solely on the current radiomics file.

  • External Mode: New data is harmonized with a reference batch from an external radiomics model.

When mode is set to External, the new data is merged with the data in the file given by radiomics_from_model_filename before harmonization. If ref_batch is not specified, the data from radiomics_from_model_filename is used as the reference batch. Otherwise, the specified ref_batch (e.g., the name of a center) is used as the reference batch.
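The reference-batch rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the module's actual code: the function name, the MODEL_BATCH_LABEL placeholder, and the handling of Internal mode without ref_batch are all assumptions.

```python
from typing import Optional

# Hypothetical label for rows coming from radiomics_from_model_filename;
# the real module may label them differently.
MODEL_BATCH_LABEL = "model_data"


def resolve_reference_batch(mode: str, ref_batch: Optional[str] = None) -> Optional[str]:
    """Pick the reference batch following the rule described in the text."""
    if ref_batch:
        # An explicitly specified ref_batch (e.g. a center name) is used as is.
        return ref_batch
    if mode.lower() == "external":
        # No ref_batch given: the external model data acts as the reference.
        return MODEL_BATCH_LABEL
    # Internal mode without ref_batch: assumed here to mean "no reference batch".
    return None


print(resolve_reference_batch("External"))
print(resolve_reference_batch("External", "center_1"))
print(resolve_reference_batch("Internal"))
```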

The module saves results in the output folder (outputFolder); if outputFolder is not specified, results are saved in the input folder. In External mode, a modelFolder must also be provided to store the files used for external data harmonization.
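As a rough sketch of these folder rules, the following Python helper resolves the folders from a dictionary of options. The helper name, the dictionary representation, and the choice of raising ValueError are assumptions made for illustration; only the option names come from this page.

```python
from typing import Dict, Optional


def resolve_folders(options: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Apply the folder defaults described in the text (illustrative only)."""
    input_folder = options["inputFolder"]
    # outputFolder falls back to the input folder when it is not specified.
    output_folder = options.get("outputFolder") or input_folder
    mode = options.get("mode", "Internal")  # Internal is the documented default
    model_folder = options.get("modelFolder")
    if mode.lower() == "external" and not model_folder:
        # How the real module reports this is not documented; an exception is assumed.
        raise ValueError("modelFolder must be provided when mode is External")
    return {
        "inputFolder": input_folder,
        "outputFolder": output_folder,
        "modelFolder": model_folder,
    }


folders = resolve_folders({"inputFolder": "/path/to/radiomics_results"})
print(folders["outputFolder"])
```

With only inputFolder set, the resolved output folder is the input folder itself.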

Options

  • verbose: Enable or disable verbose mode.

  • timer: Enable or disable the timer to record execution time.

  • log: Path to save logs.

  • new_log_file: Overwrite log file if it already exists.

  • inputFolder: Path to the input folder.

  • outputFolder: Path to the output folder.

  • modelFolder: Path to data from a previously built model (used with mode External).

  • radiomics_filename: Name of the Excel file with radiomics results.

  • batch_filename: Name of the Excel file with batch information.

  • harmonized_radiomics_filename: Name of the Excel file to store harmonized results (default: harmonized_radiomics.xlsx).

  • radiomics_from_model_filename: External radiomics file used in building a model (for mode External).

  • batch_from_model_filename: Batch information for radiomics_from_model_filename (for mode External).

  • estimates_filename (DEPRECATED): Save neuroCombat estimates (works only in mode Internal).

  • ref_batch: Name of the batch to serve as reference.

  • covars: Names of additional covariates to use (must be present in batch_filename and batch_from_model_filename).

  • mode: Specify harmonization mode: Internal (default) or External.

Batch File Structure

Batch Excel files should include the following columns:

  • patientID: Patient IDs matching those in the radiomics file.

  • sub_Analysis: Sub-analysis column aligning with IDs in the radiomics file.

  • batch: Batch information for harmonization (e.g., center or scanner).

  • Other covariates: Optional columns for additional covariates (e.g., gender, age group, etc.).

[Figure: batch.jpg]
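Before running the module, it can help to check that a batch table has the structure listed above. The sketch below works on rows already loaded as dictionaries (for example from the Excel file) and is not part of F-HARMONIZE. The function name is hypothetical, and it assumes the radiomics rows expose the same patientID and sub_Analysis column names.

```python
from typing import Dict, Iterable, List, Sequence

REQUIRED_COLUMNS = ("patientID", "sub_Analysis", "batch")


def check_batch_rows(
    batch_rows: Iterable[Dict[str, str]],
    radiomics_rows: Iterable[Dict[str, str]],
    covars: Sequence[str] = (),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    batch_rows = list(batch_rows)
    problems: List[str] = []
    if not batch_rows:
        return ["batch table is empty"]
    # Every required column, plus any requested covariate, must be in each row.
    for column in list(REQUIRED_COLUMNS) + list(covars):
        if any(column not in row for row in batch_rows):
            problems.append(f"missing column: {column}")
    # Each radiomics row needs a batch entry with the same patient and sub-analysis.
    batch_keys = {(row.get("patientID"), row.get("sub_Analysis")) for row in batch_rows}
    for row in radiomics_rows:
        key = (row.get("patientID"), row.get("sub_Analysis"))
        if key not in batch_keys:
            problems.append(f"no batch entry for {key}")
    return problems


batch = [{"patientID": "P1", "sub_Analysis": "A", "batch": "center_1", "gender": "F"}]
radiomics = [{"patientID": "P1", "sub_Analysis": "A"}]
print(check_batch_rows(batch, radiomics, covars=["gender"]))
```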

Example Usage

The following example uses F-HARMONIZE in External mode to harmonize new data with external radiomics data, using center_1 as the reference batch:

F-HARMONIZE
{
    inputFolder: /path/to/radiomics_results
    # outputFolder not specified: output is saved in the input folder
    modelFolder: /path/to/radiomics_model
    radiomics_from_model_filename: radiomics_train.xlsx
    batch_from_model_filename: batch_train.xlsx
    radiomics_filename: radiomics.xlsx
    batch_filename: batch.xlsx
    harmonized_radiomics_filename: harmonized_radiomics.xlsx
    mode: external
    covars: gender
    ref_batch: center_1
    log: /path/to/logs/fharmonize.log
}
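To inspect or sanity-check a block like the one above from a script, a minimal reader could look as follows. This is not the parser used by the software: the function name is hypothetical, and the handling of `#` comments and `key: value` lines is inferred only from the example.

```python
from typing import Dict, Tuple

EXAMPLE = """
F-HARMONIZE
{
    inputFolder: /path/to/radiomics_results
    # outputFolder not specified: output is saved in the input folder
    modelFolder: /path/to/radiomics_model
    mode: external
    ref_batch: center_1
}
"""


def parse_module_block(text: str) -> Tuple[str, Dict[str, str]]:
    """Read one 'NAME { key: value ... }' block into (name, options)."""
    name = ""
    options: Dict[str, str] = {}
    inside = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line == "{":
            inside = True
        elif line == "}":
            inside = False
        elif inside:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            options[key.strip()] = value.strip()
        elif not name:
            name = line  # first bare line before the braces is the module name
    return name, options


module_name, module_options = parse_module_block(EXAMPLE)
print(module_name, module_options["ref_batch"])
```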
feature_harmonization.main(argv)