firefly-import-preprocessor/AGENTS.md
2026-05-06 23:17:54 +02:00


Firefly Import Preprocessor — Agent Instructions

PHP 8.1+ CLI ETL tool that transforms bank CSV exports (UBS E-Banking) into a Firefly III-compatible format. See README.md for full documentation.

Build & Test

composer test           # PHPUnit tests
composer lint           # phpcs PSR-12 check (src/ bin/)
composer lint-fix       # phpcbf auto-fix
composer analyze        # phpstan level 8
composer psalm          # Psalm static analysis

Test Suite Overview

129 tests across 7 test classes:

File                                      Tests            Scope
tests/ColumnTransformerTest.php           51               All 14 transformation types, edge cases
tests/ConfigurationLoaderTest.php         18               JSON loading, dot-notation access, validation
tests/CsvReaderTest.php                   15               CSV parsing, BOM handling, delimiter, encoding
tests/MetadataExtractorTest.php           14               Pre-header regex extraction, edge cases
tests/ConfigIntegrationTest.php           1× per fixture   Golden-file integration tests (see below)
tests/RowFilterTest.php                   19               skipIf conditions, all operators, nested AND/OR groups
tests/FireflyImporterChunkStateTest.php   11               Chunk state persistence, resume, reset

Integration Tests (Golden-File Pattern)

ConfigIntegrationTest auto-discovers every subdirectory in tests/fixtures/ and runs a full transform pipeline against it. Each fixture directory tests/fixtures/<name>/ needs the following:

  • input.csv — minimal representative CSV input
  • expected.csv — exact expected output after transformation
  • config/<name>.json must exist in the project root config dir

Currently active fixtures: config-ubs-account

Adding a new fixture: create the directory, add input.csv and expected.csv, ensure the matching config/<name>.json exists. No code changes required — the provider discovers it automatically.
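
The discovery step can be pictured with a small standalone sketch. It is not the project's actual data provider: the function names discoverFixtures() and missingFixtureFiles() are hypothetical, and the only facts taken from this document are the directory layout and the three required files.

```php
<?php
/**
 * Hypothetical helper: return the name of every subdirectory of the
 * fixtures directory, sorted alphabetically.
 *
 * @return list<string>
 */
function discoverFixtures(string $fixturesDir): array
{
    $dirs = glob(rtrim($fixturesDir, '/') . '/*', GLOB_ONLYDIR) ?: [];
    $names = array_map('basename', $dirs);
    sort($names);
    return $names;
}

/**
 * Hypothetical helper: list the required files that a fixture is missing
 * (input.csv, expected.csv and the matching config/<name>.json).
 *
 * @return list<string>
 */
function missingFixtureFiles(string $fixturesDir, string $configDir, string $name): array
{
    $required = [
        "$fixturesDir/$name/input.csv",
        "$fixturesDir/$name/expected.csv",
        "$configDir/$name.json",
    ];
    // Keep only the paths that do not exist as regular files.
    return array_values(array_filter($required, static fn (string $p): bool => !is_file($p)));
}
```

A PHPUnit data provider could return one data set per name from discoverFixtures(), which is why adding a directory would be enough to register a new test case.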

Regenerating expected.csv after a config change (replace <name> accordingly):

php -r "
require 'vendor/autoload.php';
use UbsCsvTransformer\ConfigurationLoader;
use UbsCsvTransformer\TransformerEngine;
\$tmpConfig = sys_get_temp_dir() . '/gen.json';
\$cfg = json_decode(file_get_contents('config/<name>.json'), true);
\$cfg['directories']['output'] = 'tests/fixtures/<name>';
\$cfg['csvStructure']['outputFilename'] = 'expected.csv';
file_put_contents(\$tmpConfig, json_encode(\$cfg, JSON_UNESCAPED_UNICODE));
\$loader = new ConfigurationLoader(\$tmpConfig); \$loader->load();
\$engine = new TransformerEngine(\$loader);
\$result = \$engine->transform('tests/fixtures/<name>/input.csv');
unlink(\$tmpConfig);
echo \$result['success'] ? 'OK' . PHP_EOL : 'ERROR: ' . \$result['error'] . PHP_EOL;
"

Run the tool:

php bin/transformer.php test input.csv config/config.json --rows=5
php bin/transformer.php transform input.csv config/config.json --output=output.csv
php bin/transformer.php validate config/config.json --strict
php bin/transformer.php auto-import config/config.json --watch
# Add --debug / -d for verbose output

Architecture

bin/transformer.php → TransformerEngine
                           ├── ConfigurationLoader   (JSON config)
                           ├── CsvReader             (reads + BOM handling)
                           ├── MetadataExtractor     (regex on pre-header lines)
                           ├── ColumnTransformer     (transformation pipeline)
                           ├── CsvWriter             (output CSV)
                           └── FireflyImporter       (optional, shells to Firefly CLI)

DebugLogger is a static helper used across all components; activated by the --debug flag.
TransformerEngine instantiates CsvReader per call (in transform()/validate()), not in the constructor.
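
To illustrate the BOM handling attributed to CsvReader in the diagram above, here is a minimal sketch that strips a leading UTF-8 byte order mark. The function name stripUtf8Bom() is an assumption made for this example; the real CsvReader may handle the BOM differently (for example, driven by the hasBom config key).

```php
<?php
/**
 * Hypothetical helper: remove a leading UTF-8 byte order mark, if present.
 * Without this step the BOM bytes would end up glued to the first header name.
 */
function stripUtf8Bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
    return str_starts_with($text, $bom) ? substr($text, strlen($bom)) : $text;
}

// Usage: clean the first line of a file before splitting it into columns.
$firstLine = stripUtf8Bom("\xEF\xBB\xBFDate;Amount");
```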

Conventions

  • PSR-12 enforced via phpcs using phpcs.xml (auto-discovered at root). Line length: soft 120, hard 150 chars.
  • PHPStan level 8 with checkMissingCallableSignature: true. phpstan-baseline.neon is empty — do not add suppressions without good reason.
  • All source comments and docblocks are written in English.
  • Documentation language: README.md is the primary documentation in English. README.de.md is the German translation. Both cross-link to each other at the top.
  • showHelp() in bin/transformer.php is locale-aware: English is the default; German is shown when isGermanLocale() returns true (checks LANG, LC_ALL, LC_MESSAGES, LANGUAGE env vars for a de prefix).
  • License: GPL-3.0.
  • Namespace UbsCsvTransformer\ (PSR-4 → src/); tests use UbsCsvTransformer\Tests\ (→ tests/).
  • No runtime package dependencies — only ext-json and ext-mbstring.
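
The locale check described for showHelp() might look roughly like the sketch below. Only the four variable names and the "de" prefix rule come from this document; the function name isGermanLocaleSketch(), its array parameter, and the case-insensitive comparison are assumptions chosen to keep the example testable.

```php
<?php
/**
 * Sketch of a locale check: true when any of the given environment
 * values starts with "de" (for example "de_CH.UTF-8").
 *
 * @param array<string, string|false> $env variable name => value (false when unset)
 */
function isGermanLocaleSketch(array $env): bool
{
    foreach (['LANG', 'LC_ALL', 'LC_MESSAGES', 'LANGUAGE'] as $name) {
        $value = $env[$name] ?? false;
        // stripos() returns 0 only when the value begins with "de" (any case).
        if (is_string($value) && stripos($value, 'de') === 0) {
            return true;
        }
    }
    return false;
}

// Usage: collect the variables from the real environment.
$env = [];
foreach (['LANG', 'LC_ALL', 'LC_MESSAGES', 'LANGUAGE'] as $name) {
    $env[$name] = getenv($name);
}
$useGerman = isGermanLocaleSketch($env);
```

Passing the environment in as an array, rather than calling getenv() inside the function, makes the check easy to unit test without touching the process environment.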

Config Format

See config/config.example.json for a full reference. Three top-level sections:

  • metadata.extractionRules: regex rules against 1-based pre-header line numbers
  • csvStructure: headerLine, delimiter, encoding, hasBom
  • columnTransformations: array of per-column transformation pipelines

Key patterns in config

  • "sourceColumn": "_constant_" — injects an extracted metadata value (e.g. IBAN) as a new output column without reading a CSV column
  • "outputAction": "create" vs "overwrite" — controls whether the result is a new column or replaces an existing one
  • MetadataExtractor uses 1-based lineNumber in config; it converts to 0-based array index internally
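
The 1-based to 0-based conversion mentioned above is easy to get wrong, so here is a minimal sketch of it. This is not the MetadataExtractor source: the function extractMetadata(), the rule keys name and pattern, and the sample lines are all invented for illustration. Only lineNumber and its 1-based meaning come from this document.

```php
<?php
/**
 * Sketch: apply regex rules to pre-header lines.
 * Hypothetical rule shape: name, lineNumber (1-based), pattern.
 *
 * @param list<string> $preHeaderLines
 * @param list<array{name: string, lineNumber: int, pattern: string}> $rules
 * @return array<string, string>
 */
function extractMetadata(array $preHeaderLines, array $rules): array
{
    $metadata = [];
    foreach ($rules as $rule) {
        $index = $rule['lineNumber'] - 1; // 1-based config value to 0-based array index
        if (!isset($preHeaderLines[$index])) {
            continue; // rule points past the available lines
        }
        if (preg_match($rule['pattern'], $preHeaderLines[$index], $m) === 1) {
            // Prefer the first capture group, fall back to the whole match.
            $metadata[$rule['name']] = $m[1] ?? $m[0];
        }
    }
    return $metadata;
}

// Made-up pre-header lines and one rule targeting line 2.
$lines = ['Account number:;0000-00000000.00X', 'IBAN:;CH00 0000 0000 0000 0000 0'];
$rules = [['name' => 'iban', 'lineNumber' => 2, 'pattern' => '/^IBAN:;(.+)$/']];
$metadata = extractMetadata($lines, $rules);
```

A value extracted this way is the kind of metadata that a "_constant_" source column could then inject into every output row.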

Supported transformation types: map, replace, regex, regexextract, dateformat, split, trim, uppercase, lowercase, ucwordsfirst, truncate, constantvalue, pipeline, timeperiod
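
As a rough picture of how a chain of these types could be applied to one cell, the sketch below implements four of them (trim, uppercase, lowercase, truncate). The function applyPipeline() and the step keys type and length are assumptions made for this example; see config/config.example.json for the real config schema.

```php
<?php
/**
 * Sketch: run a value through an ordered list of transformation steps.
 * Hypothetical step shape: type, plus an optional length for truncate.
 *
 * @param list<array{type: string, length?: int}> $steps
 */
function applyPipeline(string $value, array $steps): string
{
    foreach ($steps as $step) {
        // Each step receives the output of the previous one.
        $value = match ($step['type']) {
            'trim' => trim($value),
            'uppercase' => mb_strtoupper($value),
            'lowercase' => mb_strtolower($value),
            // A missing length leaves the value untruncated.
            'truncate' => mb_substr($value, 0, $step['length'] ?? null),
            default => throw new InvalidArgumentException('Unsupported type: ' . $step['type']),
        };
    }
    return $value;
}

// Usage: trim, uppercase, then keep the first four characters.
$result = applyPipeline('  Coop Pronto  ', [
    ['type' => 'trim'],
    ['type' => 'uppercase'],
    ['type' => 'truncate', 'length' => 4],
]);
```

The mb_* functions are used because ext-mbstring is already a stated requirement and bank exports commonly contain umlauts.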