Firefly Import Preprocessor — Agent Instructions
PHP 8.1+ CLI ETL tool that transforms bank CSV exports (UBS E-Banking) into Firefly III-compatible format. See README.md for full documentation.
Build & Test
composer test # PHPUnit tests
composer lint # phpcs PSR-12 check (src/ bin/)
composer lint-fix # phpcbf auto-fix
composer analyze # phpstan level 8
composer psalm # Psalm static analysis
Test Suite Overview
85 tests across 5 test classes:
| File | Tests | Scope |
|---|---|---|
| tests/ColumnTransformerTest.php | 37 | All 13 transformation types, edge cases |
| tests/ConfigurationLoaderTest.php | 18 | JSON loading, dot-notation access, validation |
| tests/CsvReaderTest.php | 15 | CSV parsing, BOM handling, delimiter, encoding |
| tests/MetadataExtractorTest.php | 14 | Pre-header regex extraction, edge cases |
| tests/ConfigIntegrationTest.php | 1× per fixture | Golden-file integration tests (see below) |
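The dot-notation access covered by ConfigurationLoaderTest can be pictured with a minimal sketch. The function name getByDotPath and its signature are assumptions made for illustration; the actual ConfigurationLoader API may differ.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not the project's API): resolves a dot-separated
 * path such as "csvStructure.delimiter" against a decoded JSON config.
 *
 * @param array<string, mixed> $config
 */
function getByDotPath(array $config, string $path, mixed $default = null): mixed
{
    $current = $config;
    foreach (explode('.', $path) as $segment) {
        // Stop as soon as the path leaves the array structure.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }
    return $current;
}
```

Returning a default instead of throwing keeps optional keys cheap to read; a strict variant could throw on a missing segment instead.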
Integration Tests (Golden-File Pattern)
ConfigIntegrationTest auto-discovers every subdirectory in tests/fixtures/ and runs a full transform pipeline against it. For each fixture directory tests/fixtures/<name>/:
- input.csv: minimal representative CSV input
- expected.csv: exact expected output after transformation
- config/<name>.json must exist in the project root config dir
Currently active fixtures: config-ubs-account
Adding a new fixture: create the directory, add input.csv and expected.csv, ensure the matching config/<name>.json exists. No code changes required — the provider discovers it automatically.
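How such auto-discovery might work is sketched below as a PHPUnit-style data provider. The function name discoverFixtures and the returned array shape are assumptions; the real provider in ConfigIntegrationTest may be written differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical provider helper: returns one data set per fixture directory
 * that contains both input.csv and expected.csv.
 *
 * @return array<string, array{0: string}> keyed by fixture name
 */
function discoverFixtures(string $fixturesRoot): array
{
    $fixtures = [];
    // GLOB_ONLYDIR limits the result to subdirectories of the root.
    foreach (glob($fixturesRoot . '/*', GLOB_ONLYDIR) ?: [] as $dir) {
        if (is_file($dir . '/input.csv') && is_file($dir . '/expected.csv')) {
            $fixtures[basename($dir)] = [$dir];
        }
    }
    ksort($fixtures); // stable order across filesystems
    return $fixtures;
}
```

Keying the data sets by directory name makes a failing fixture identifiable in the PHPUnit output.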
Regenerating expected.csv after a config change (replace <name> accordingly):
php -r "
require 'vendor/autoload.php';
use UbsCsvTransformer\ConfigurationLoader;
use UbsCsvTransformer\TransformerEngine;
\$tmpConfig = sys_get_temp_dir() . '/gen.json';
\$cfg = json_decode(file_get_contents('config/<name>.json'), true);
\$cfg['directories']['output'] = 'tests/fixtures/<name>';
\$cfg['csvStructure']['outputFilename'] = 'expected.csv';
file_put_contents(\$tmpConfig, json_encode(\$cfg, JSON_UNESCAPED_UNICODE));
\$loader = new ConfigurationLoader(\$tmpConfig); \$loader->load();
\$engine = new TransformerEngine(\$loader);
\$result = \$engine->transform('tests/fixtures/<name>/input.csv');
unlink(\$tmpConfig);
echo \$result['success'] ? 'OK' . PHP_EOL : 'ERROR: ' . \$result['error'] . PHP_EOL;
"
Run the tool:
php bin/transformer.php test input.csv config/config.json --rows=5
php bin/transformer.php transform input.csv config/config.json --output=output.csv
php bin/transformer.php validate config/config.json --strict
php bin/transformer.php auto-import config/config.json --watch
# Add --debug / -d for verbose output
Architecture
bin/transformer.php → TransformerEngine
├── ConfigurationLoader (JSON config)
├── CsvReader (reads + BOM handling)
├── MetadataExtractor (regex on pre-header lines)
├── ColumnTransformer (transformation pipeline)
├── CsvWriter (output CSV)
└── FireflyImporter (optional, shells to Firefly CLI)
DebugLogger is a static helper used across all components; activated by the --debug flag.
TransformerEngine instantiates CsvReader per call (in transform()/validate()), not in the constructor.
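The "reads + BOM handling" step of CsvReader can be illustrated with a small sketch. The function parseCsvString is hypothetical and is not the project's CsvReader; it assumes UTF-8 input and fields that contain no embedded line breaks.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: strips a leading UTF-8 BOM and splits the content
 * into rows. Quoted fields that span several lines are not handled.
 *
 * @return list<array<int, string|null>>
 */
function parseCsvString(string $content, string $delimiter = ';'): array
{
    $bom = "\xEF\xBB\xBF";
    if (str_starts_with($content, $bom)) {
        $content = substr($content, strlen($bom));
    }

    $rows = [];
    foreach (preg_split('/\r\n|\r|\n/', $content) ?: [] as $line) {
        if ($line === '') {
            continue; // skip blank lines, including the trailing one
        }
        // Enclosure and escape are passed explicitly rather than relying on defaults.
        $rows[] = str_getcsv($line, $delimiter, '"', '\\');
    }
    return $rows;
}
```

Without the BOM check, the three BOM bytes would end up glued to the first header name, and any lookup of that column by name would fail.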
Conventions
- PSR-12 enforced via phpcs using phpcs.xml (auto-discovered at root). Line length: soft 120, hard 150 chars.
- PHPStan level 8 with checkMissingCallableSignature: true. phpstan-baseline.neon is empty; do not add suppressions without good reason.
- All source comments and docblocks are written in German.
- Namespace UbsCsvTransformer\ (PSR-4 → src/); tests use UbsCsvTransformer\Tests\ (→ tests/).
- No runtime package dependencies; only ext-json and ext-mbstring are required.
Config Format
See config/config.example.json for a full reference. Three top-level sections:
- metadata.extractionRules: regex rules against 1-based pre-header line numbers
- csvStructure: headerLine, delimiter, encoding, hasBom
- columnTransformations: array of per-column transformation pipelines
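To make the 1-based line numbering of the extraction rules concrete, here is a minimal sketch. Only lineNumber is taken from the config description; the function name extractMetadata and the rule keys name and pattern are assumptions, so check config/config.example.json for the real field names.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical extractor: applies regex rules to the lines above the CSV header.
 *
 * @param list<string> $preHeaderLines lines in file order
 * @param list<array{name: string, lineNumber: int, pattern: string}> $rules assumed rule shape
 * @return array<string, string>
 */
function extractMetadata(array $preHeaderLines, array $rules): array
{
    $metadata = [];
    foreach ($rules as $rule) {
        $index = $rule['lineNumber'] - 1; // 1-based config value to 0-based array index
        if (!isset($preHeaderLines[$index])) {
            continue; // rule points past the available lines
        }
        // The first capture group is taken as the extracted value.
        if (preg_match($rule['pattern'], $preHeaderLines[$index], $m) === 1 && isset($m[1])) {
            $metadata[$rule['name']] = trim($m[1]);
        }
    }
    return $metadata;
}
```

A rule with lineNumber 1 therefore reads the very first line of the file, which matches what a person sees when opening the CSV in an editor.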
Key patterns in config
- "sourceColumn": "_constant_" injects an extracted metadata value (e.g. IBAN) as a new output column without reading a CSV column
- "outputAction": "create" vs "overwrite" controls whether the result is a new column or replaces an existing one
- MetadataExtractor uses 1-based lineNumber in config; it converts to 0-based array index internally
Supported transformation types: map, replace, regex, regexextract, dateformat, split, trim, uppercase, lowercase, ucwordsfirst, truncate, constantvalue, pipeline
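As an illustration of the pipeline type, the sketch below chains a few of the listed transformations in order. The function applyPipeline and the step keys (type, length, search, replace) are assumptions for this example and do not reflect ColumnTransformer's actual interface; only a subset of the 13 types is shown.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical pipeline runner: feeds the output of each step into the next.
 *
 * @param list<array<string, mixed>> $steps assumed step shape, e.g. ['type' => 'trim']
 */
function applyPipeline(string $value, array $steps): string
{
    foreach ($steps as $step) {
        $value = match ($step['type']) {
            'trim' => trim($value),
            'uppercase' => mb_strtoupper($value),
            'lowercase' => mb_strtolower($value),
            'truncate' => mb_substr($value, 0, $step['length']),
            'replace' => str_replace($step['search'], $step['replace'], $value),
            default => throw new InvalidArgumentException('Unsupported type: ' . $step['type']),
        };
    }
    return $value;
}

// Trim, uppercase, then keep the first six characters.
echo applyPipeline('  Migros Zürich  ', [
    ['type' => 'trim'],
    ['type' => 'uppercase'],
    ['type' => 'truncate', 'length' => 6],
]), PHP_EOL; // prints "MIGROS"
```

The mb_* functions are used so that multibyte characters such as umlauts are counted and cased as single characters, which fits the ext-mbstring requirement stated under Conventions.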