# Firefly Import Preprocessor — Agent Instructions

PHP 8.1+ CLI ETL tool that transforms bank CSV exports (UBS E-Banking) into Firefly III-compatible format. See [README.md](README.md) for full documentation.

## Build & Test

```bash
composer test      # PHPUnit tests
composer lint      # phpcs PSR-12 check (src/ bin/)
composer lint-fix  # phpcbf auto-fix
composer analyze   # phpstan level 8
composer psalm     # Psalm static analysis
```

### Test Suite Overview

85 tests across 5 test classes:

| File | Tests | Scope |
| ------ | -------: | ------- |
| `tests/ColumnTransformerTest.php` | 37 | All 13 transformation types, edge cases |
| `tests/ConfigurationLoaderTest.php` | 18 | JSON loading, dot-notation access, validation |
| `tests/CsvReaderTest.php` | 15 | CSV parsing, BOM handling, delimiter, encoding |
| `tests/MetadataExtractorTest.php` | 14 | Pre-header regex extraction, edge cases |
| `tests/ConfigIntegrationTest.php` | 1× per fixture | Golden-file integration tests (see below) |

### Integration Tests (Golden-File Pattern)

`ConfigIntegrationTest` auto-discovers every subdirectory in `tests/fixtures/` and runs a full transform pipeline against it. Each fixture directory `tests/fixtures/<name>/` needs:

- `input.csv` — minimal representative CSV input
- `expected.csv` — exact expected output after transformation
- a matching `config/<name>.json` in the project-root `config/` directory (not inside the fixture directory)

**Currently active fixtures:** `config-ubs-account`

**Adding a new fixture:** create the directory, add `input.csv` and `expected.csv`, and ensure the matching `config/<name>.json` exists. No code changes are required — the provider discovers it automatically.

**Regenerating `expected.csv`** after a config change (replace `<name>` accordingly):

```bash
php -r "
require 'vendor/autoload.php';
use UbsCsvTransformer\ConfigurationLoader;
use UbsCsvTransformer\TransformerEngine;
\$tmpConfig = sys_get_temp_dir() . '/gen.json';
\$cfg = json_decode(file_get_contents('config/<name>.json'), true);
\$cfg['directories']['output'] = 'tests/fixtures/<name>';
\$cfg['csvStructure']['outputFilename'] = 'expected.csv';
file_put_contents(\$tmpConfig, json_encode(\$cfg, JSON_UNESCAPED_UNICODE));
\$loader = new ConfigurationLoader(\$tmpConfig); \$loader->load();
\$engine = new TransformerEngine(\$loader);
\$result = \$engine->transform('tests/fixtures/<name>/input.csv');
unlink(\$tmpConfig);
echo \$result['success'] ? 'OK' . PHP_EOL : 'ERROR: ' . \$result['error'] . PHP_EOL;
"
```

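The fixture discovery and golden-file comparison described in this section can be sketched roughly as follows. This is a minimal illustration, not the code in `ConfigIntegrationTest`: the function names `discoverFixtures()` and `matchesGoldenFile()` are hypothetical, and the sketch assumes a fixture counts only if both CSV files are present and that the comparison is byte-exact.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: lists fixture names found under a fixtures directory.
 * Assumption: a directory counts only if it holds input.csv and expected.csv.
 *
 * @return array<string, array{string}> keyed by fixture name (data-provider shape)
 */
function discoverFixtures(string $fixturesDir): array
{
    $fixtures = [];
    foreach (glob($fixturesDir . '/*', GLOB_ONLYDIR) ?: [] as $dir) {
        if (is_file($dir . '/input.csv') && is_file($dir . '/expected.csv')) {
            $name = basename($dir);
            $fixtures[$name] = [$name];
        }
    }
    ksort($fixtures);

    return $fixtures;
}

/**
 * Hypothetical comparison step: the produced output must equal the
 * golden file byte for byte.
 */
function matchesGoldenFile(string $actualOutput, string $expectedPath): bool
{
    return $actualOutput === file_get_contents($expectedPath);
}
```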
Run the tool:

```bash
php bin/transformer.php test input.csv config/config.json --rows=5
php bin/transformer.php transform input.csv config/config.json --output=output.csv
php bin/transformer.php validate config/config.json --strict
php bin/transformer.php auto-import config/config.json --watch
# Add --debug / -d for verbose output
```

## Architecture

```text
bin/transformer.php → TransformerEngine
├── ConfigurationLoader (JSON config)
├── CsvReader           (reads + BOM handling)
├── MetadataExtractor   (regex on pre-header lines)
├── ColumnTransformer   (transformation pipeline)
├── CsvWriter           (output CSV)
└── FireflyImporter     (optional, shells to Firefly CLI)
```

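To illustrate what "BOM handling" in the reader stage can mean, here is a minimal sketch. It is not the actual `CsvReader`: the helper names `stripUtf8Bom()` and `parseCsv()` are hypothetical, the `;` default delimiter is an assumption, and the line-based split deliberately ignores quoted fields that span several lines.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: removes a leading UTF-8 byte order mark, if present.
 */
function stripUtf8Bom(string $content): string
{
    $bom = "\xEF\xBB\xBF";

    return str_starts_with($content, $bom) ? substr($content, strlen($bom)) : $content;
}

/**
 * Hypothetical helper: splits CSV text into rows of fields.
 * Assumption: no quoted field contains a line break.
 *
 * @return list<list<string|null>>
 */
function parseCsv(string $content, string $delimiter = ';'): array
{
    $rows = [];
    foreach (preg_split('/\R/', stripUtf8Bom($content)) ?: [] as $line) {
        if ($line !== '') {
            // Enclosure and escape are passed explicitly to avoid relying on defaults.
            $rows[] = str_getcsv($line, $delimiter, '"', '\\');
        }
    }

    return $rows;
}
```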
`DebugLogger` is a static helper used across all components; activated by the `--debug` flag.
`TransformerEngine` instantiates `CsvReader` per call (in `transform()`/`validate()`), not in the constructor.

## Conventions

- **PSR-12** enforced via phpcs using `phpcs.xml` (auto-discovered at root). Line length: soft 120, hard 150 chars.
- **PHPStan level 8** with `checkMissingCallableSignature: true`. `phpstan-baseline.neon` is empty — do not add suppressions without good reason.
- **All source comments and docblocks are written in English.**
- **Documentation language:** `README.md` is the primary documentation in **English**. `README.de.md` is the German translation. Both cross-link to each other at the top.
- **`showHelp()` in `bin/transformer.php`** is locale-aware: English is the default; German is shown when `isGermanLocale()` returns `true` (checks `LANG`, `LC_ALL`, `LC_MESSAGES`, `LANGUAGE` env vars for a `de` prefix).
- **License:** GPL-3.0.
- Namespace `UbsCsvTransformer\` (PSR-4 → `src/`); tests use `UbsCsvTransformer\Tests\` (→ `tests/`).
- No runtime package dependencies — only `ext-json` and `ext-mbstring`.

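The locale check mentioned in the list above could look roughly like the sketch below. It is not the real `isGermanLocale()`: the function name `looksGerman()` is hypothetical, and treating any single matching variable as sufficient (rather than applying a precedence order) and ignoring case are assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical locale check: true if any of the four variables
 * starts with "de" (for example "de_CH.UTF-8").
 * Assumption: one match is enough; no precedence between variables.
 *
 * @param array<string, string> $env environment variables, e.g. from getenv()
 */
function looksGerman(array $env): bool
{
    foreach (['LANG', 'LC_ALL', 'LC_MESSAGES', 'LANGUAGE'] as $name) {
        $value = strtolower($env[$name] ?? '');
        if (str_starts_with($value, 'de')) {
            return true;
        }
    }

    return false;
}
```

In a CLI script this could be called as `looksGerman(getenv())`, since `getenv()` without arguments returns all environment variables as an array.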
## Config Format

See [config/config.example.json](config/config.example.json) for a full reference. Three top-level sections:

- **`metadata.extractionRules`** — regex rules against 1-based pre-header line numbers
- **`csvStructure`** — `headerLine`, `delimiter`, `encoding`, `hasBom`
- **`columnTransformations`** — array of per-column transformation pipelines

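As a rough illustration of how regex rules with 1-based line numbers can be applied to pre-header lines, consider the sketch below. It is not the actual `MetadataExtractor`: the function `extractMetadata()` and the rule keys `name` and `pattern` are hypothetical (only `lineNumber` appears in this document), and the sample lines are made up.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical extractor: applies regex rules to pre-header lines.
 *
 * @param list<string> $lines file lines as a 0-based array
 * @param list<array{name: string, lineNumber: int, pattern: string}> $rules
 * @return array<string, string>
 */
function extractMetadata(array $lines, array $rules): array
{
    $metadata = [];
    foreach ($rules as $rule) {
        $index = $rule['lineNumber'] - 1; // 1-based config value to 0-based index
        if (!isset($lines[$index])) {
            continue; // rule points outside the file
        }
        if (preg_match($rule['pattern'], $lines[$index], $m) === 1) {
            // Prefer the first capture group, fall back to the whole match.
            $metadata[$rule['name']] = $m[1] ?? $m[0];
        }
    }

    return $metadata;
}

// Made-up sample input, not a real bank export.
$lines = ['Account:;12345', 'IBAN:;CH00 0000 0000 0000 0000 0', '', 'Date;Amount'];
$rules = [['name' => 'iban', 'lineNumber' => 2, 'pattern' => '/^IBAN:;(.+)$/']];
$metadata = extractMetadata($lines, $rules);
```

With the sample above, `$metadata['iban']` holds the text captured from the second line of the file.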
### Key patterns in config

- `"sourceColumn": "_constant_"` — injects an extracted metadata value (e.g. IBAN) as a new output column without reading a CSV column
- `"outputAction": "create"` vs `"overwrite"` — controls whether the result is a new column or replaces an existing one
- `MetadataExtractor` uses 1-based `lineNumber` in config; it converts to 0-based array index internally

Supported transformation types: `map`, `replace`, `regex`, `regexextract`, `dateformat`, `split`, `trim`, `uppercase`, `lowercase`, `ucwordsfirst`, `truncate`, `constantvalue`, `pipeline`.
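
The interplay of `sourceColumn`, `outputAction`, and a chain of transformation steps can be sketched as follows. This is an illustration under stated assumptions, not the real `ColumnTransformer`: the functions `applyStep()` and `applyRule()` and the rule keys `targetColumn`, `metadataKey`, `steps`, and `length` are hypothetical, and only three of the thirteen types are covered.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical step runner covering three of the listed types.
 *
 * @param array{type: string, length?: int} $step
 */
function applyStep(string $value, array $step): string
{
    return match ($step['type']) {
        'trim' => trim($value),
        'uppercase' => mb_strtoupper($value),
        'truncate' => mb_substr($value, 0, $step['length'] ?? mb_strlen($value)),
        default => throw new InvalidArgumentException('Unsupported type: ' . $step['type']),
    };
}

/**
 * Hypothetical rule runner.
 * Assumption: "create" writes to targetColumn, "overwrite" writes back
 * to sourceColumn.
 *
 * @param array<string, string> $row
 * @param array<string, string> $metadata
 * @param array<string, mixed> $rule
 * @return array<string, string>
 */
function applyRule(array $row, array $metadata, array $rule): array
{
    // "_constant_" takes the value from extracted metadata instead of the CSV row.
    $value = $rule['sourceColumn'] === '_constant_'
        ? ($metadata[$rule['metadataKey']] ?? '')
        : ($row[$rule['sourceColumn']] ?? '');

    foreach ($rule['steps'] ?? [] as $step) {
        $value = applyStep($value, $step);
    }

    $target = $rule['outputAction'] === 'create' ? $rule['targetColumn'] : $rule['sourceColumn'];
    $row[$target] = $value;

    return $row;
}

$row = ['Description' => '  coffee shop  '];
$metadata = ['iban' => 'CH00 0000 0000 0000 0000 0'];

// Clean up an existing column in place.
$row = applyRule($row, $metadata, [
    'sourceColumn' => 'Description',
    'outputAction' => 'overwrite',
    'steps' => [['type' => 'trim'], ['type' => 'uppercase']],
]);

// Add a new column filled from metadata.
$row = applyRule($row, $metadata, [
    'sourceColumn' => '_constant_',
    'metadataKey' => 'iban',
    'outputAction' => 'create',
    'targetColumn' => 'account_iban',
]);
```

After both rules run, `$row` contains the cleaned `Description` value plus a new `account_iban` column taken from the metadata.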