# Firefly Import Preprocessor

**Version:** 1.0.0
**Date:** 03 May 2026
**Status:** Production Ready

🌐 [Deutsch](README.de.md)

---

## Table of Contents

1. [Overview](#overview)
2. [Installation & Setup](#installation--setup)
3. [Quick Start](#quick-start)
4. [Configuration](#configuration)
5. [Transformation Types](#transformation-types)
6. [CLI Reference](#cli-reference)
7. [Debug Mode](#debug-mode)
8. [Firefly III Integration](#firefly-iii-integration)
9. [Architecture](#architecture)
10. [Error Handling](#error-handling)

---

## Overview

The **Firefly Import Preprocessor** is a production-ready PHP preprocessor for bank CSV export files. It transforms bank data into a standardised format and can optionally import it into Firefly III.

### Core Features

- ✅ **Full CSV transformation** with complex pipelines
- ✅ **Metadata extraction** via regex (IBAN, currency, account name)
- ✅ **14 transformation types** for flexible data processing
- ✅ **Firefly III integration**: CLI, Docker, and HTTP upload
- ✅ **Debug mode** for full processing transparency
- ✅ **Production ready** with complete error handling
- ✅ **Zero dependencies** for core functionality

### Workflow

```text
Input CSV
    ↓
Extract metadata (regex)
    ↓
Transform data rows (pipeline)
    ↓
Write output CSV
    ↓
[Optional] Import into Firefly III
```

---

## Installation & Setup

### Requirements

- PHP 8.1+
- Composer (recommended)
- [Optional] Docker for Firefly III integration

### Installation

```bash
# 1. Clone / copy the repository
cd ff-imp-preprocessor

# 2. Install dependencies (optional, dev tools only)
composer install

# 3. Create configuration
cp config/config.example.json config/config.json
# Edit config/config.json with your settings

# 4. Create directories
mkdir -p config/import/{source,output,archive,error}
chmod 755 config/import/{source,output,archive,error}

# 5. Run a test
php bin/transformer.php validate config/config.json input.csv
```

---

## Quick Start

### 1. Adjust configuration

Edit `config/config.json` and make sure the extraction rules match your CSV format:

```json
{
  "metadata": {
    "extractionRules": [
      {
        "name": "account_iban",
        "lineNumber": 2,
        "regex": "IBAN:\\s*([A-Z0-9 ]+)",
        "captureGroup": 1
      }
    ]
  },
  "csvStructure": {
    "headerLine": 5,
    "delimiter": ";",
    "encoding": "UTF-8"
  }
}
```

### 2. Validate CSV

```bash
php bin/transformer.php validate config/config.json input.csv
```

### 3. Run transformation

```bash
php bin/transformer.php transform input.csv config/config.json

# With debug mode for troubleshooting
php bin/transformer.php transform input.csv config/config.json --debug
```

### 4. Inspect output

```bash
php bin/transformer.php test input.csv config/config.json --debug
# Shows up to 10 transformed rows and debug logs
```

---

## Configuration

### config.json structure

#### `metadata`: Metadata extraction

```json
{
  "metadata": {
    "extractionRules": [
      {
        "name": "account_iban",
        "lineNumber": 2,
        "regex": "IBAN:\\s*([A-Z0-9 ]+)",
        "captureGroup": 1
      },
      {
        "name": "currency_code",
        "lineNumber": 3,
        "regex": "Currency:\\s*([A-Z]{3})",
        "captureGroup": 1
      }
    ]
  }
}
```

| Field | Type | Description |
| ------ | ----- | ----------- |
| `name` | string | Name of the metadata variable (used in `constantvalue`) |
| `lineNumber` | int | Line number in CSV (1-based, human-readable) |
| `regex` | string | Regex pattern for extraction (without delimiters) |
| `captureGroup` | int | Capture group index (0 = full match, 1 = first group, etc.) |

**Regex example:**

- Pattern: `IBAN:\s*([A-Z0-9 ]+)`
- Input: `IBAN: CH93 0077 2020 6262 5252 7`
- Capture group 1: `CH93 0077 2020 6262 5252 7`

#### `csvStructure`: CSV format

```json
{
  "csvStructure": {
    "headerLine": 5,
    "delimiter": ";",
    "encoding": "UTF-8",
    "hasBom": false
  }
}
```

| Field | Type | Default | Description |
| ------ | ----- | ------- | ----------- |
| `headerLine` | int | 5 | Line number of the header row (1-based) |
| `delimiter` | string | `;` | CSV delimiter |
| `encoding` | string | `UTF-8` | Character encoding (UTF-8, ISO-8859-1, CP1252) |
| `hasBom` | bool | false | Whether the file has a BOM (Byte Order Mark) |

#### `columnTransformations`: Column transformations

```json
{
  "columnTransformations": [
    {
      "sourceColumn": "BookingDate",
      "transformations": [
        { "type": "dateformat", "fromFormat": "d.m.Y", "toFormat": "Y-m-d" }
      ],
      "outputColumn": "date",
      "outputAction": "overwrite"
    }
  ]
}
```

**outputAction:**

| Value | Behaviour |
|---|---|
| `overwrite` | Replace the target column with the transformation result (default) |
| `create` | Write the result into a new output column |
| `append` | Concatenate the result to the end of the existing column value. Add `"appendDelimiter": " "` (any string) to insert a separator between the existing and new value; the delimiter is omitted when the target column is still empty |
| `append-if-not-empty` | Same as `append` (including optional `appendDelimiter`) but skips entirely when the transformation result is empty, which makes it safe for optional values such as tags or notes lines |
| `append-line` | Same as `append` but the separator is always a newline `\n`; no leading newline when the target is empty |
| `overwrite-if-empty` | Only write the result if the target column is currently empty |
| `overwrite-if-not-empty` | Only write the result if the transformation result is not empty |

#### `directories`: File system

```json
{
  "directories": {
    "source": "/opt/ff-imp-preprocessor/import/source",
    "output": "/opt/ff-imp-preprocessor/import/output",
    "archive": "/opt/ff-imp-preprocessor/import/archive",
    "error": "/opt/ff-imp-preprocessor/import/error"
  }
}
```

| Field | Description |
| ------ | ----------- |
| `source` | Input directory |
| `output` | Output directory |
| `archive` | Archive for processed files |
| `error` | Error directory for invalid files |

#### `fireflyImport`: Firefly III integration

Optional. When present, passing `--do-import` to the `transform` command (or using `auto-import`) will call the Firefly III Data Importer after the output CSV is written.

See [Firefly III Integration](#firefly-iii-integration) for the full field reference and mode-specific examples.

---

## Transformation Types

There are **14 supported transformation types** that can be combined as a pipeline:

### 1. **trim**: Remove whitespace

Removes leading and trailing whitespace.

```json
{ "type": "trim" }
```

- Input: ` Coop Pronto ` → Output: `Coop Pronto`

---

### 2. **lowercase**: Convert to lowercase

Converts to lowercase (UTF-8 safe).

```json
{ "type": "lowercase" }
```

- Input: `COOP PRONTO CHUR` → Output: `coop pronto chur`

---

### 3. **uppercase**: Convert to uppercase

Converts to uppercase (UTF-8 safe).

```json
{ "type": "uppercase" }
```

- Input: `Coop Pronto Chur` → Output: `COOP PRONTO CHUR`

---

### 4. **ucwordsfirst**: Capitalise after word separators

Capitalises the first letter after each word separator.

```json
{ "type": "ucwordsfirst" }
```

- `COOP PRONTO CHUR` → `Coop Pronto Chur`
- `migros-rail city` → `Migros-Rail City`
- `O'NEILL STORE` → `O'Neill Store`
- `SAINT-JEAN-DE-MAURIENNE` → `Saint-Jean-De-Maurienne`

Separators: space, hyphen, apostrophe, slash, period, comma, semicolon, colon, parentheses.

> **Guard:** If the input already contains *both* uppercase and lowercase letters (mixed-case), it is returned unchanged. This prevents accidentally re-casing intentionally formatted strings such as `"Coop pronto chur"`. Fully uppercase or fully lowercase inputs are always processed.

---

### 5. **replace**: String replacement

Replaces a substring with another string (case-sensitive).

```json
{ "type": "replace", "search": "  ", "replace": " " }
```

- Input: `Coop  Pronto` (two spaces) → Output: `Coop Pronto` (one space)

---

### 6. **split**: Split column

Splits a value at a delimiter and keeps a defined part.

```json
{ "type": "split", "delimiter": ";", "part": 0 }
```

- Input: `Coop Pronto Chur;7007 Chur` → Output: `Coop Pronto Chur`

---

### 7. **regex**: Regex replacement

Replaces parts of a string using a regular expression. Uses PHP `preg_replace`.

```json
{ "type": "regex", "pattern": "^(.*?);.*$", "replace": "$1" }
```

**No match → original value is passed through unchanged** (pipeline-safe).

Use capture groups as `$1`, `$2`, … in the `replace` field. A pattern without `^`/`$` anchors replaces only the matched portion, not the whole value.

---

### 8. **regexextract**: Regex extraction

Extracts a capture group and returns **only that**. Uses PHP `preg_match`.
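The difference between `regex` and `regexextract` on a non-matching value can be sketched in plain PHP. This is a minimal illustration, not the project's implementation: the function names `applyRegex` and `applyRegexExtract` are hypothetical, the patterns are written with delimiters already in place, and the sketch assumes that `regexextract` returns the first capture group when the pattern defines one.

```php
<?php
declare(strict_types=1);

/**
 * Sketch of a `regex` step: replace via preg_replace.
 * A value that does not match comes back unchanged.
 */
function applyRegex(string $value, string $pattern, string $replace): string
{
    $result = preg_replace($pattern, $replace, $value);
    // preg_replace returns null on a regex error; fall back to the input.
    return $result ?? $value;
}

/**
 * Sketch of a `regexextract` step: return only the captured text.
 * Assumption: group 1 is returned when the pattern defines one,
 * otherwise the full match. No match yields an empty string.
 */
function applyRegexExtract(string $value, string $pattern): string
{
    if (preg_match($pattern, $value, $m) !== 1) {
        return '';
    }
    return $m[1] ?? $m[0];
}

// Extraction keeps only the captured part.
echo applyRegexExtract('Coop Pronto Chur, 7007 Chur', '/(\d{4,} [^;]+)/u'), PHP_EOL; // 7007 Chur

// No match: regexextract returns an empty string ...
var_dump(applyRegexExtract('Coop Pronto', '/(\d{4,} [^;]+)/u')); // string(0) ""

// ... while regex passes the value through unchanged.
echo applyRegex('Coop Pronto', '/^(.*?);.*$/u', '$1'), PHP_EOL; // Coop Pronto
```

In a pipeline this means a `regexextract` step placed after other steps can wipe out their result, whereas a `regex` step cannot.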
```json
{ "type": "regexextract", "pattern": "(\\d{4,} [^;]+)" }
```

- Input: `Coop Pronto Chur, 7007 Chur` → Output: `7007 Chur`
- No match → empty string

> **⚠ Not pipeline-safe:** A no-match discards all previous pipeline results. Use `regex` instead if you want to preserve the current value on no-match.

---

### 9. **dateformat**: Date reformatting

Converts between date formats.

```json
{ "type": "dateformat", "fromFormat": "d.m.Y", "toFormat": "Y-m-d" }
```

- Input: `10.12.2025` → Output: `2025-12-10`

Supports all PHP `DateTime` format characters.

---

### 10. **truncate**: Truncate string

Truncates a string to a maximum length.

```json
{ "type": "truncate", "maxLength": 100 }
```

---

### 11. **constantvalue**: Constant value from metadata

Injects an extracted metadata value as a constant for every row.

```json
{
  "sourceColumn": "_constant_",
  "transformations": [
    { "type": "constantvalue", "metadataKey": "account_iban" }
  ],
  "outputColumn": "account_iban",
  "outputAction": "create"
}
```

- Every row receives the extracted `account_iban` value (e.g. `CH9300777222666888999`) in a new column.

---

### 12. **map**: Copy / rename column

Copies a column value as-is (optionally to a new name).

```json
{ "type": "map" }
```

---

### 13. **pipeline**: Nested pipeline

Runs a sub-pipeline as a single transformation step.

```json
{
  "type": "pipeline",
  "steps": [
    { "type": "trim" },
    { "type": "lowercase" },
    { "type": "ucwordsfirst" }
  ]
}
```

Useful for grouping steps as a logical unit within a `transformations` array.

---

### 14. **timeperiod**: Map time to a period label

Parses a time string and returns the label of the matching period range. Supports midnight-spanning ranges (e.g. 22:00–03:59). Returns `default` (empty string by default) when no range matches or the input is invalid.
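The matching rule, including the midnight wrap-around, can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: `matchTimePeriod` is a hypothetical helper, each period is assumed to be an array with `from`, `to`, and `label` keys, and both bounds are treated as inclusive.

```php
<?php
declare(strict_types=1);

/**
 * Sketch of period matching with support for midnight-spanning ranges.
 * Hypothetical helper; times are compared as seconds since midnight.
 */
function matchTimePeriod(string $time, array $periods, string $format = 'H:i:s', string $default = ''): string
{
    $toSeconds = static function (string $t) use ($format): ?int {
        // "!" resets all unparsed fields so only the time of day matters.
        $dt = DateTime::createFromFormat('!' . $format, $t);
        if ($dt === false) {
            return null; // empty or unparseable input
        }
        return (int) $dt->format('H') * 3600 + (int) $dt->format('i') * 60 + (int) $dt->format('s');
    };

    $value = $toSeconds($time);
    if ($value === null) {
        return $default;
    }

    foreach ($periods as $p) {
        $from = $toSeconds($p['from']);
        $to = $toSeconds($p['to']);
        if ($from === null || $to === null) {
            continue; // skip malformed period definitions
        }
        $hit = $from <= $to
            ? ($value >= $from && $value <= $to)   // ordinary range
            : ($value >= $from || $value <= $to);  // range wraps past midnight
        if ($hit) {
            return $p['label'];
        }
    }
    return $default;
}

$periods = [
    ['from' => '09:00:00', 'to' => '10:59:59', 'label' => 'Vormittag'],
    ['from' => '22:00:00', 'to' => '03:59:59', 'label' => 'Nacht'],
];

echo matchTimePeriod('09:30:00', $periods), PHP_EOL; // Vormittag
echo matchTimePeriod('02:00:00', $periods), PHP_EOL; // Nacht
```

A range whose `from` is later than its `to` is read as wrapping past midnight, so the check switches from "between both bounds" to "after the start or before the end".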
```json
{
  "type": "timeperiod",
  "timeFormat": "H:i:s",
  "periods": [
    { "from": "04:00:00", "to": "08:59:59", "label": "Morgen" },
    { "from": "09:00:00", "to": "10:59:59", "label": "Vormittag" },
    { "from": "11:00:00", "to": "13:59:59", "label": "Mittag" },
    { "from": "14:00:00", "to": "17:59:59", "label": "Nachmittag" },
    { "from": "18:00:00", "to": "21:59:59", "label": "Abend" },
    { "from": "22:00:00", "to": "03:59:59", "label": "Nacht" }
  ],
  "default": ""
}
```

- `"09:30:00"` → `"Vormittag"`
- `"23:00:00"` → `"Nacht"` (midnight-spanning range)
- `"02:00:00"` → `"Nacht"` (midnight-spanning range)
- `""` or unparseable input → `""`

`timeFormat` follows PHP's `DateTime::createFromFormat` syntax (default `H:i:s`).

---

### Row filtering: `skipIf`

Rows can be excluded from the output by adding a top-level `skipIf` key to the configuration. The value is a filter node: either a bare condition or a nested `and`/`or` group.

**Bare condition:**

```json
"skipIf": { "column": "Buchungstext", "operator": "equals", "value": "Saldovortrag" }
```

**AND group:**

```json
"skipIf": {
  "and": [
    { "column": "Beschreibung1", "operator": "empty" },
    { "column": "Beschreibung2", "operator": "empty" }
  ]
}
```

**Nested AND / OR:**

```json
"skipIf": {
  "or": [
    { "column": "Amount", "operator": "gt", "value": "10000" },
    {
      "and": [
        { "column": "Type", "operator": "equals", "value": "Saldo" },
        { "column": "Notes", "operator": "empty" }
      ]
    }
  ]
}
```

**Supported operators:**

| Operator | Matches when… |
|---|---|
| `empty` | column value is empty string |
| `not-empty` | column value is not empty |
| `equals` | column value equals `"value"` |
| `not-equals` | column value does not equal `"value"` |
| `contains` | column value contains `"value"` |
| `not-contains` | column value does not contain `"value"` |
| `matches` | column value matches regex `"pattern"` |
| `not-matches` | column value does not match regex `"pattern"` |
| `gt` | `(float) column > (float) value` |
| `gte` | `(float) column >= (float) value` |
| `lt` | `(float) column < (float) value` |
| `lte` | `(float) column <= (float) value` |

---

### Pipeline example

Multiple transformations chained:

```json
{
  "sourceColumn": "BookingText",
  "transformations": [
    { "type": "trim" },
    { "type": "replace", "search": "  ", "replace": " " },
    { "type": "lowercase" },
    { "type": "ucwordsfirst" }
  ],
  "outputColumn": "description",
  "outputAction": "overwrite"
}
```

**Processing:**

1. `"  COOP  PRONTO  "` → trim → `"COOP  PRONTO"`
2. `"COOP  PRONTO"` → replace → `"COOP PRONTO"`
3. `"COOP PRONTO"` → lowercase → `"coop pronto"`
4. `"coop pronto"` → ucwordsfirst → `"Coop Pronto"`

---

## CLI Reference

```bash
php bin/transformer.php <command> [input] [config] [options]
```

### Commands

| Command | Description |
| ------- | ----------- |
| `test` | Test run (up to 10 rows) |
| `transform` | Full transformation |
| `validate` | Validate configuration |
| `auto-import` | Directory monitoring |
| `help` | Show help |

### Options

| Option | Description |
| ------ | ----------- |
| `--debug`, `-d` | Enable debug mode |
| `--rows=N` | Max. N rows (`test` command) |
| `--output=FILE`, `-o` | Output path |
| `--do-import` | Import into Firefly III after transformation (`transform` only) |
| `--strict` | Strict validation |
| `--watch` | Continuous monitoring |
| `--interval=SEC` | Check interval in seconds (default: 60) |
| `--dry-run` | Simulation mode, no real operations |

---

## Debug Mode

```bash
php bin/transformer.php test input.csv config/config.json --debug
```

### Log categories

| Category | When |
| -------- | ---- |
| `transformer` | Start/end of transformation |
| `csv_reader` | While reading CSV |
| `metadata` | During metadata extraction |
| `metadata_warning` | On extraction problems |
| `transformation` | For each transformation step |
| `csv_writer` | While writing output CSV |

### Debug log output (JSON)

```json
{
  "success": true,
  "debug_logs": [
    {
      "timestamp": 1702200120.5432,
      "category": "transformer",
      "message": "Transformation started",
      "data": { "inputFile": "input.csv", "maxRows": 0 }
    },
    {
      "timestamp": 1702200120.5445,
      "category": "metadata",
      "message": "Extraction rule applied",
      "data": { "rule_name": "account_iban", "value": "CH93..." }
    }
  ]
}
```

---

## Firefly III Integration

The transformer can automatically import transformed files into Firefly III. Three operating modes cover all typical deployment scenarios.

### Prerequisites (all modes)

**1. Create a Firefly III Data Importer JSON configuration file**

This file maps transformed CSV columns to Firefly III transaction fields (format v3). Recommended approach: upload a sample CSV once in the Firefly III Data Importer Web UI, configure the column mapping there, then download the finished configuration.

Alternatively, use `config/firefly-import-config.example.json` as a template and adjust `default_account` to your asset account ID.

**2. Choose an operating mode**: see the sections below.
---

### `fireflyImport` field reference

| Field | Type | Description |
| --- | --- | --- |
| `mode` | string | Operating mode: `cli` \| `docker` \| `http` (default: `cli`) |
| `jsonConfig` | string | Path to the Firefly III Data Importer JSON config file (format v3). For `cli` and `http` modes the file must exist locally; relative paths are resolved from the **working directory** where `php bin/transformer.php` is invoked (typically the project root). For `docker` mode the path is **inside the container**; local existence is not checked. |
| `importerCommand` | string | Full CLI command *(modes: cli, docker)* |
| `importerUrl` | string | URL of the Data Importer *(mode: http)* |
| `personalSecret` | string | The `AUTO_IMPORT_SECRET` set on the importer server (min. 16 chars). Sent as `?secret=` URL query parameter. *(mode: http)* |
| `accessToken` | string | Firefly III Personal Access Token. Sent as `Authorization: Bearer` header. Required if not already set as `FIREFLY_III_ACCESS_TOKEN` in the importer environment. *(mode: http)* |
| `deleteAfterImport` | boolean | Delete transformed CSV after successful import |
| `chunkSize` | integer | Split the CSV into chunks of at most N data rows and import each chunk as a separate request. Prevents server-side timeouts on large files (rule of thumb: ~3–4 s/transaction for HTTP mode). `0` or absent = no chunking (default). Applies to all modes. |
| `chunkRetries` | integer | Number of additional import attempts per chunk after the first. On failure the importer retries up to this many times before aborting. `0` or absent = no retry (default). Only effective when `chunkSize > 0`. |
| `chunkRetryDelay` | integer | Pause in seconds before each chunk request after the first, and between retry attempts for the same failed chunk. Addresses both inter-chunk cooldown and retry back-off. `0` or absent = no pause (default). Only effective when `chunkSize > 0`. |
| `connectionTimeout` | integer | Maximum seconds to wait for the TCP connection to the importer to be established. Distinct from `timeout` (full transfer duration). Default: `10`. *(mode: http only)* |
| `timeout` | integer | Timeout in seconds per request (default: 300). For chunked imports this applies per chunk, not for the total run. |
| `environment` | object | Additional environment variables *(modes: cli, docker)* |

---

### Mode `cli`: Transformer and Firefly on the same server

Both the transformer and the Firefly III Data Importer run on the same server. The transformer calls the importer directly as a local command.

```json
"fireflyImport": {
  "mode": "cli",
  "jsonConfig": "/opt/firefly-data-importer/storage/configurations/ubs-import.json",
  "importerCommand": "php /opt/firefly-data-importer/artisan importer:import",
  "deleteAfterImport": false,
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "timeout": 300,
  "environment": {
    "FIREFLY_III_URL": "https://localhost",
    "FIREFLY_III_ACCESS_TOKEN": "your-token-here"
  }
}
```

---

### Mode `docker`: Transformer local, Firefly in Docker

The transformer runs locally or in its own container; the Firefly III Data Importer runs in a Docker container. The transformer calls the importer via `docker exec`.

**Important:** The transformer's output directory must be mounted as a volume in the importer container. `jsonConfig` is the path **inside the container** (not a local path). Do not use the `-it` flag (no TTY).
Example `docker-compose.yml` for the importer:

```yaml
services:
  firefly-importer:
    image: fireflyiii/data-importer:latest
    volumes:
      - /opt/ff-imp-preprocessor/import:/import
    environment:
      - FIREFLY_III_URL=https://your-firefly.com
      - FIREFLY_III_ACCESS_TOKEN=your-token-here
      - CAN_POST_FILES=false
```

```json
"fireflyImport": {
  "mode": "docker",
  "jsonConfig": "/import/configs/ubs-import.json",
  "importerCommand": "docker exec firefly-importer php artisan importer:import",
  "deleteAfterImport": false,
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "timeout": 300
}
```

The JSON config file must be available inside the container, either via a volume mount or `docker cp`:

```bash
docker cp ubs-import.json firefly-importer:/import/configs/ubs-import.json
```

---

### Mode `http`: Transformer local, Firefly importer on a remote server

The transformer runs locally; the Firefly III Data Importer is reachable over HTTP(S). The CSV and JSON configuration are sent as a multipart HTTP upload to the importer.

**Requirements on the importer server:**

```text
CAN_POST_FILES=true
AUTO_IMPORT_SECRET=   # at least 16 characters; set this as personalSecret in your config
```

**Local requirement:** PHP extension `ext-curl`

```json
"fireflyImport": {
  "mode": "http",
  "importerUrl": "https://importer.your-server.com",
  "personalSecret": "your-auto-import-secret-min-16-chars",
  "accessToken": "your-firefly-iii-personal-access-token",
  "jsonConfig": "config/ubs-import.json",
  "deleteAfterImport": false,
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "connectionTimeout": 10,
  "timeout": 300
}
```

The transformer sends a `POST` request to `{importerUrl}/autoupload?secret={personalSecret}` with the CSV and JSON config as multipart form fields. The `accessToken` is sent as `Authorization: Bearer`. If `FIREFLY_III_ACCESS_TOKEN` is already set in the importer's environment, `accessToken` can be omitted.
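For troubleshooting it can help to reproduce this request outside the transformer. The following is a minimal sketch, not the transformer's actual code: `buildAutoUploadUrl` and `uploadToImporter` are hypothetical names, and the multipart field names `importable` and `json` are an assumption based on the Data Importer's auto-upload endpoint, so check them against the importer version you run.

```php
<?php
declare(strict_types=1);

/** Build the auto-upload URL from importerUrl and personalSecret. */
function buildAutoUploadUrl(string $importerUrl, string $personalSecret): string
{
    return rtrim($importerUrl, '/') . '/autoupload?secret=' . rawurlencode($personalSecret);
}

/**
 * Send the CSV and JSON config as a multipart POST (requires ext-curl).
 * Assumption: the form fields are named `importable` and `json`.
 * $cfg mirrors the `fireflyImport` block of config.json.
 */
function uploadToImporter(array $cfg, string $csvPath): array
{
    $ch = curl_init(buildAutoUploadUrl($cfg['importerUrl'], $cfg['personalSecret']));

    $headers = ['Accept: application/json'];
    if (!empty($cfg['accessToken'])) {
        // Optional when FIREFLY_III_ACCESS_TOKEN is set on the importer side.
        $headers[] = 'Authorization: Bearer ' . $cfg['accessToken'];
    }

    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => [
            'importable' => new CURLFile($csvPath, 'text/csv'),
            'json'       => new CURLFile($cfg['jsonConfig'], 'application/json'),
        ],
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $cfg['connectionTimeout'] ?? 10,
        CURLOPT_TIMEOUT        => $cfg['timeout'] ?? 300,
    ]);

    $body = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);

    return [
        'ok'     => $body !== false && $status >= 200 && $status < 300,
        'status' => $status,
        'error'  => curl_error($ch),
    ];
}

// The URL builder can be checked without any network access.
echo buildAutoUploadUrl('https://importer.your-server.com/', 'secret with spaces 123'), PHP_EOL;
// https://importer.your-server.com/autoupload?secret=secret%20with%20spaces%20123
```

The secret is URL-encoded because it travels as a query parameter; a secret containing `&`, `#`, or spaces would otherwise be cut off or misread by the server.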
---

### Server-side tuning

For large imports the bottleneck is usually the Firefly III Data Importer server, not the transformer. The settings below belong in the importer's environment (`.env` or `docker-compose.yml`):

| Setting | Recommended value | Notes |
|---|---|---|
| `PHP_MEMORY_LIMIT` | `512M` – `2048M` | Docker env var. Raise when PHP crashes with "Allowed memory size exhausted". |
| `CONNECTION_TIMEOUT` | `60` | Seconds to wait for TCP connect to Firefly III API. Default is ~31 s (π × 10). |
| `IGNORE_DUPLICATE_ERRORS` | `true` | Suppress duplicate-transaction warnings on repeated imports. |

**nginx reverse proxy** (if applicable):

```nginx
proxy_read_timeout 600s;     # must exceed the longest single-chunk import time
client_max_body_size 64M;    # must accommodate your largest chunk CSV
```

**Docker Compose** example:

```yaml
services:
  firefly-importer:
    environment:
      - PHP_MEMORY_LIMIT=1024M
      - CONNECTION_TIMEOUT=60
      - IGNORE_DUPLICATE_ERRORS=true
```

---

### Usage

```bash
# Transform only (no import)
php bin/transformer.php transform input.csv config/config.json

# Transform and import into Firefly III
php bin/transformer.php transform input.csv config/config.json --do-import

# Watch mode: transform and import automatically for each new CSV in source directory
php bin/transformer.php auto-import config/config.json --watch
```

---

## Architecture

### Components

```text
bin/transformer.php (CLI entry point)
    ↓
TransformerEngine (orchestration)
    ├─ ConfigurationLoader (load / validate config)
    ├─ CsvReader (read CSV)
    ├─ MetadataExtractor (metadata via regex)
    ├─ ColumnTransformer (apply transformations)
    ├─ CsvWriter (write CSV)
    ├─ FireflyImporter (Firefly III integration)
    └─ DebugLogger (debug logs)
```

### Data flow

```text
Input CSV
    ↓
CsvReader::readMetadataLines() → array of lines
    ↓
MetadataExtractor::extract() → {iban: "...", currency: "..."}
    ↓
CsvReader::readCsvData() → array of rows
    ↓
ColumnTransformer::transformRow() → transformed row (pipeline)
    ↓
CsvWriter::write() → output CSV
```

### Classes

| Class | Responsibility |
| ----- | -------------- |
| `TransformerEngine` | Orchestrates the entire workflow |
| `ConfigurationLoader` | Loads and validates JSON configuration |
| `CsvReader` | Reads CSV with metadata support |
| `MetadataExtractor` | Extracts metadata via regex |
| `ColumnTransformer` | Transforms columns (pipeline) |
| `CsvWriter` | Writes output CSV |
| `FireflyImporter` | Imports into Firefly III |
| `DebugLogger` | Static logger for debug output |

---

## Error Handling

### Common errors

#### "Input file not found"

```bash
# Check the file path
ls -la input.csv

# Use an absolute path if relative paths do not work
php bin/transformer.php transform /absolute/path/input.csv config.json
```

---

#### "Missing metadata: account_iban"

The IBAN could not be extracted: wrong regex or wrong line number.

```bash
# Inspect the first lines of the CSV
head -5 input.csv

# Validate with debug output
php bin/transformer.php validate config.json input.csv --debug
```

---

#### "Invalid JSON: …"

Syntax error in `config.json`.

```bash
php -r "json_decode(file_get_contents('config/config.json'), true) or die('JSON invalid');"
```

---

#### "Configuration: 'csvStructure.headerLine' required"

A required configuration field is missing.

```bash
diff config/config.json config/config.example.json
```

---

### Exception handling

```php
try {
    $result = $engine->transform($inputFile);
    if (!$result['success']) {
        echo "Error: " . $result['error'];
    }
} catch (Exception $e) {
    echo "Fatal error: " . $e->getMessage();
}
```

---

## Tips

### UTF-8 handling

The transformer uses UTF-8 safe functions throughout:

- `mb_strtolower()` instead of `strtolower()`
- `mb_strtoupper()` instead of `strtoupper()`
- `mb_strlen()` for correct character counting

Supported encodings: UTF-8, ISO-8859-1, CP1252.
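Why the multibyte functions matter can be seen in a short standalone example. It is a sketch only: `toUtf8` is a hypothetical helper (not a function of this project) that shows one way to normalise ISO-8859-1 or CP1252 input and strip a UTF-8 BOM, and it assumes `ext-mbstring` is installed.

```php
<?php
declare(strict_types=1);

/**
 * Sketch: normalise raw CSV bytes to UTF-8 before transformation.
 * Hypothetical helper; requires ext-mbstring.
 */
function toUtf8(string $raw, string $encoding = 'UTF-8'): string
{
    // Strip a UTF-8 byte order mark if present.
    if (str_starts_with($raw, "\xEF\xBB\xBF")) {
        $raw = substr($raw, 3);
    }
    // Use mbstring's canonical name for CP1252.
    $source = strtoupper($encoding) === 'CP1252' ? 'Windows-1252' : $encoding;

    return strtoupper($source) === 'UTF-8'
        ? $raw
        : mb_convert_encoding($raw, 'UTF-8', $source);
}

$latin1 = "Z\xDCRICH";                  // "ZÜRICH" encoded as ISO-8859-1
$utf8 = toUtf8($latin1, 'ISO-8859-1');

echo mb_strtolower($utf8, 'UTF-8'), PHP_EOL; // zürich
echo mb_strlen($utf8, 'UTF-8'), PHP_EOL;     // 6 characters
echo strlen($utf8), PHP_EOL;                 // 7 bytes
```

The byte-oriented `strlen()` counts `Ü` as two bytes once the text is UTF-8, which is why length-based steps such as `truncate` need the `mb_*` variants to avoid cutting a character in half.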
### Regex tips

**Pattern without delimiters (auto-wrapped):**

```json
"pattern": "IBAN:\\s*([A-Z0-9 ]+)"
// becomes: /IBAN:\s*([A-Z0-9 ]+)/u
```

**With explicit flags:**

```json
"pattern": "/IBAN:\\s*([A-Z0-9 ]+)/iu"
// case-insensitive
```

### Performance

- **Optimised for:** up to 1 million rows
- **Typical file size:** 10–100 k rows

### Batch processing

```bash
#!/bin/bash
for file in import/source/*.csv; do
    # Archive the file on success, move it to the error directory otherwise
    if php bin/transformer.php transform "$file" config/config.json; then
        mv "$file" import/archive/
    else
        mv "$file" import/error/
    fi
done
```

---

## Version History

**v1.0.0 (03 May 2026)**

- ✅ Initial release
- ✅ 14 transformation types
- ✅ Metadata extraction via regex
- ✅ Debug mode
- ✅ Firefly III integration (cli / docker / http)
- ✅ Full documentation

---

**License:** GPL-3.0
**Author:** PHP CSV Transformer Project
**Repository:** [git.andare.ch/david.reindl/ff-imp-preprocessor](https://git.andare.ch/david.reindl/ff-imp-preprocessor)