extended Firefly import options, cleaned up language usage in code, new README in English, changed license references

Co-authored-by: Copilot <copilot@github.com>
Reindl David (IT-PTR-CEN2-SL10) 2026-05-04 00:23:02 +02:00
parent 90dc37cafc
commit 0a9bd1cc2d
15 changed files with 2064 additions and 1208 deletions


@ -17,7 +17,7 @@ composer psalm # Psalm static analysis
85 tests across 5 test classes:
| File | Tests | Scope |
|------|-------|-------|
| ------ | -------: | ------- |
| `tests/ColumnTransformerTest.php` | 37 | All 13 transformation types, edge cases |
| `tests/ConfigurationLoaderTest.php` | 18 | JSON loading, dot-notation access, validation |
| `tests/CsvReaderTest.php` | 15 | CSV parsing, BOM handling, delimiter, encoding |
@ -85,7 +85,10 @@ bin/transformer.php → TransformerEngine
- **PSR-12** enforced via phpcs using `phpcs.xml` (auto-discovered at root). Line length: soft 120, hard 150 chars.
- **PHPStan level 8** with `checkMissingCallableSignature: true`. `phpstan-baseline.neon` is empty — do not add suppressions without good reason.
- **All source comments and docblocks are written in German.**
- **All source comments and docblocks are written in English.**
- **Documentation language:** `README.md` is the primary documentation in **English**. `README.de.md` is the German translation. Both cross-link to each other at the top.
- **`showHelp()` in `bin/transformer.php`** is locale-aware: English is the default; German is shown when `isGermanLocale()` returns `true` (checks `LANG`, `LC_ALL`, `LC_MESSAGES`, `LANGUAGE` env vars for a `de` prefix).
- **License:** GPL-3.0.
- Namespace `UbsCsvTransformer\` (PSR-4 → `src/`); tests use `UbsCsvTransformer\Tests\` (→ `tests/`).
- No runtime package dependencies — only `ext-json` and `ext-mbstring`.

632
README.de.md Normal file

@ -0,0 +1,632 @@
# Firefly Import Preprocessor: Documentation
**Version:** 1.0.0
**Date:** 3 May 2026
**Status:** Production Ready
🌐 [English](README.md)
---
## 📋 Table of Contents
1. [Overview](#overview)
2. [Installation & Setup](#installation--setup)
3. [Quick Start](#quick-start)
4. [Configuration](#configuration)
5. [Transformation Types](#transformation-types)
6. [CLI Reference](#cli-reference)
7. [Debug Mode](#debug-mode)
8. [Firefly III Integration](#firefly-iii-integration)
9. [Architecture](#architecture)
10. [Error Handling](#error-handling)
---
## Overview
The **Firefly Import Preprocessor** is a production-ready PHP preprocessor for CSV export files from banks. It transforms bank data into a standardised format and can optionally import it into Firefly III.
### Key Features
- **Complete CSV transformation** with complex pipelines
- **Metadata extraction** with regex (IBAN, currency, account name)
- **13 transformation types** for flexible data processing
- **Firefly III integration** via CLI, Docker, and HTTP upload
- **Debug mode** for transparency during processing
- **Production ready** with full error handling
- **Zero dependencies** for core functionality
### Workflow
```text
Input CSV
Extract metadata (regex)
Transform data rows (pipeline)
Write output CSV
[Optional] Import into Firefly III
```
---
## Installation & Setup
### Prerequisites
- PHP 8.1+
- Composer (recommended)
- [Optional] Docker for the Firefly III integration
### Installation
```bash
# 1. Clone or copy the repository
cd ff-imp-preprocessor
# 2. Install dependencies (optional)
composer install
# 3. Create the configuration
cp config/config.example.json config/config.json
# Edit config/config.json with your settings
# 4. Create the directories
mkdir -p config/import/{source,output,archive,error}
chmod 755 config/import/{source,output,archive,error}
# 5. Run a test
php bin/transformer.php validate config/config.json input.csv
```
---
## Quick Start
### 1. Adjust the configuration
Edit `config/config.json` and make sure the extraction rules match your CSV format:
```json
{
"metadata": {
"extractionRules": [
{
"name": "account_iban",
"lineNumber": 2,
"regex": "IBAN:\\s*([A-Z0-9 ]+)",
"captureGroup": 1
}
]
},
"csvStructure": {
"headerLine": 5,
"delimiter": ";",
"encoding": "UTF-8"
}
}
```
### 2. Validate the CSV
```bash
php bin/transformer.php validate config/config.json input.csv
```
### 3. Run the transformation
```bash
php bin/transformer.php transform input.csv config/config.json
# With debug mode for troubleshooting
php bin/transformer.php transform input.csv config/config.json --debug
```
### 4. Check the output
```bash
php bin/transformer.php test input.csv config/config.json --debug
# Shows up to 10 transformed rows and the debug logs
```
---
## Configuration
### config.json structure
#### `metadata` - Metadata extraction
```json
{
"metadata": {
"extractionRules": [
{
"name": "account_iban",
"lineNumber": 2,
"regex": "IBAN:\\s*([A-Z0-9 ]+)",
"captureGroup": 1
},
{
"name": "currency_code",
"lineNumber": 3,
"regex": "Währung:\\s*([A-Z]{3})",
"captureGroup": 1
}
]
}
}
```
| Field | Type | Description |
| ------ | ----- | ------------- |
| `name` | string | Name of the metadata variable (used by `constantvalue`) |
| `lineNumber` | int | Line number in the CSV (1-based, human-readable) |
| `regex` | string | Regex pattern for the extraction (without delimiters) |
| `captureGroup` | int | Number of the capture group (0 = whole match, 1 = first group, etc.) |
**Regex example:**
- Pattern: `IBAN:\s*([A-Z0-9 ]+)`
- Input: `IBAN: CH93 0077 2020 6262 5252 7`
- Capture Group 1: `CH93 0077 2020 6262 5252 7`
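
The extraction rule above can be sketched in a few lines of PHP. This is a minimal illustration and not the project's `MetadataExtractor`: the function name `extractMetadata`, the `~` regex delimiter, and the `null` return on failure are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: applies one extraction rule to the raw lines of a CSV file.
 * Returns the captured text, or null when the line is missing or the regex does not match.
 */
function extractMetadata(array $lines, array $rule): ?string
{
    // lineNumber is 1-based in the configuration, PHP arrays are 0-based
    $line = $lines[$rule['lineNumber'] - 1] ?? null;
    if ($line === null) {
        return null;
    }

    // The configured pattern carries no delimiters, so they are added here.
    // '~' is an assumed delimiter; a pattern containing '~' would need escaping.
    $pattern = '~' . $rule['regex'] . '~u';

    if (preg_match($pattern, $line, $matches) !== 1) {
        return null;
    }

    // captureGroup 0 is the whole match, 1 the first group, and so on
    $value = $matches[$rule['captureGroup']] ?? null;

    return $value === null ? null : trim($value);
}

$lines = [
    'Kontoauszug',
    'IBAN: CH93 0077 2020 6262 5252 7',
];
$rule = [
    'name' => 'account_iban',
    'lineNumber' => 2,
    'regex' => 'IBAN:\s*([A-Z0-9 ]+)',
    'captureGroup' => 1,
];

echo extractMetadata($lines, $rule), PHP_EOL; // CH93 0077 2020 6262 5252 7
```

Note that the JSON configuration needs `\\s` because of JSON string escaping, while the decoded pattern contains a single backslash.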
#### `csvStructure` - CSV format
```json
{
"csvStructure": {
"headerLine": 5,
"delimiter": ";",
"encoding": "UTF-8",
"hasBom": false
}
}
```
| Field | Type | Default | Description |
| ------ | ----- | --------- | ------------- |
| `headerLine` | int | 5 | Line number of the header row (1-based) |
| `delimiter` | string | `;` | CSV delimiter |
| `encoding` | string | `UTF-8` | Character encoding (UTF-8, ISO-8859-1, CP1252) |
| `hasBom` | bool | false | Whether the file starts with a BOM (byte order mark) |
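
To show how these four settings interact, here is a rough sketch of a reader that honours them. It is not the project's `CsvReader`; the function name `parseCsv` and the choice to skip rows whose field count differs from the header are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: parses raw CSV text according to a csvStructure block
 * and returns the data rows keyed by header name.
 */
function parseCsv(string $raw, array $structure): array
{
    // Strip a UTF-8 BOM if the configuration announces one
    if (($structure['hasBom'] ?? false) && str_starts_with($raw, "\xEF\xBB\xBF")) {
        $raw = substr($raw, 3);
    }

    // Normalise everything to UTF-8 before parsing
    $encoding = $structure['encoding'] ?? 'UTF-8';
    if ($encoding !== 'UTF-8') {
        $raw = mb_convert_encoding($raw, 'UTF-8', $encoding);
    }

    $lines = preg_split('/\r\n|\r|\n/', rtrim($raw)) ?: [];
    $delimiter = $structure['delimiter'] ?? ';';
    $headerIndex = ($structure['headerLine'] ?? 5) - 1; // 1-based in the config

    if (!isset($lines[$headerIndex])) {
        return [];
    }

    $header = str_getcsv($lines[$headerIndex], $delimiter, '"', '');
    $rows = [];
    foreach (array_slice($lines, $headerIndex + 1) as $line) {
        $fields = str_getcsv($line, $delimiter, '"', '');
        // Assumption: rows that do not match the header width are skipped
        if (count($fields) === count($header)) {
            $rows[] = array_combine($header, $fields);
        }
    }

    return $rows;
}

$raw = "\xEF\xBB\xBF" . implode("\n", [
    'Kontoauszug',
    'IBAN: CH93 0077 2020 6262 5252 7',
    'Währung: CHF',
    '',
    'Buchungsdatum;Buchungstext;Betrag',
    '10.12.2025;COOP PRONTO CHUR;-12.50',
]);

$rows = parseCsv($raw, ['headerLine' => 5, 'delimiter' => ';', 'encoding' => 'UTF-8', 'hasBom' => true]);
print_r($rows[0]);
```

The lines before `headerLine` are never parsed as data here; they are the ones the metadata extraction rules read.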
#### `columnTransformations` - Column transformations
```json
{
"columnTransformations": [
{
"sourceColumn": "Buchungsdatum",
"transformations": [
{
"type": "dateformat",
"fromFormat": "d.m.Y",
"toFormat": "Y-m-d"
}
],
"outputColumn": "date",
"outputAction": "overwrite"
}
]
}
```
**outputAction:**
- `overwrite`: overwrite `sourceColumn`
- `create`: create a new column (for regex extract, split, etc.)
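
The `ColumnTransformer` docblock in this commit lists further `outputAction` values (`append`, `append-line`, `overwrite-if-empty`, `overwrite-if-not-empty`). The following sketch shows one way these could behave. The function name `applyOutputAction` is hypothetical, and the single-space separator used for `append` is an assumption, since the source does not state it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes $value into $row[$column] according to $action.
 */
function applyOutputAction(array $row, string $column, string $value, string $action = 'overwrite'): array
{
    $current = (string) ($row[$column] ?? '');

    switch ($action) {
        case 'append':
            // Assumption: existing and new value are separated by one space
            $row[$column] = $current === '' ? $value : $current . ' ' . $value;
            break;
        case 'append-line':
            // No leading newline when the target is still empty
            $row[$column] = $current === '' ? $value : $current . "\n" . $value;
            break;
        case 'overwrite-if-empty':
            if ($current === '') {
                $row[$column] = $value;
            }
            break;
        case 'overwrite-if-not-empty':
            // Only set when the transformation result is not empty
            if ($value !== '') {
                $row[$column] = $value;
            }
            break;
        default: // create / overwrite
            $row[$column] = $value;
    }

    return $row;
}

$row = ['note' => 'first'];
$row = applyOutputAction($row, 'note', 'second', 'append-line');
echo $row['note'], PHP_EOL;
```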
#### `directories` - File system
```json
{
"directories": {
"source": "/opt/ff-imp-preprocessor/import/source",
"output": "/opt/ff-imp-preprocessor/import/output",
"archive": "/opt/ff-imp-preprocessor/import/archive",
"error": "/opt/ff-imp-preprocessor/import/error"
}
}
```
| Field | Description |
| ------ | ------------- |
| `source` | Input directory |
| `output` | Output directory |
| `archive` | Archive for processed files |
| `error` | Error directory for invalid files |
#### `fireflyImport` - Firefly III integration
The operating mode is controlled by the `mode` field. Possible values: `cli`, `docker`, `http`.
Details and complete configuration examples: [Firefly III Integration](#firefly-iii-integration).
```json
{
"fireflyImport": {
"mode": "docker",
"jsonConfig": "/import/configs/ubs-import.json",
"importerCommand": "docker exec firefly-importer php artisan importer:import",
"autoImport": false,
"deleteAfterImport": false,
"timeout": 300
}
}
```
| Field | Type | Description |
| --- | --- | --- |
| `mode` | String | Operating mode: `cli` \| `docker` \| `http` (default: `cli`) |
| `jsonConfig` | String | Path to the Firefly III Data Importer JSON configuration file (format v3) |
| `importerCommand` | String | Full CLI command *(modes: cli, docker)* |
| `importerUrl` | String | URL of the Data Importer *(mode: http)* |
| `importerSecret` | String | `AUTO_IMPORT_SECRET` of the importer (at least 16 characters) *(mode: http)* |
| `autoImport` | Boolean | Run the import directly after the transformation |
| `deleteAfterImport` | Boolean | Delete the transformed CSV after a successful import |
| `timeout` | Integer | Timeout in seconds (default: 300) |
| `environment` | Object | Additional environment variables *(modes: cli, docker)* |
---
## Transformation Types
There are **13 supported transformation types**, which can be combined into a pipeline:
### 1. **trim** - Remove whitespace
```json
{ "type": "trim" }
```
- Input: ` Coop Pronto ` → Output: `Coop Pronto`
---
### 2. **lowercase** - Convert to lowercase
```json
{ "type": "lowercase" }
```
- Input: `COOP PRONTO CHUR` → Output: `coop pronto chur`
---
### 3. **uppercase** - Convert to uppercase
```json
{ "type": "uppercase" }
```
- Input: `Coop Pronto Chur` → Output: `COOP PRONTO CHUR`
---
### 4. **ucwordsfirst** - Capitalise after separators
```json
{ "type": "ucwordsfirst" }
```
- `COOP PRONTO CHUR` → `Coop Pronto Chur`
- `migros-rail city` → `Migros-Rail City`
- `O'NEILL STORE` → `O'Neill Store`
Separators: space, hyphen, apostrophe, slash, period, comma, semicolon, colon, parentheses.
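
The behaviour shown in these examples can be approximated with PHP's built-in `ucwords()` and a custom delimiter list. This is only a sketch: the function name `ucwordsFirst` is made up for the example, the global exceptions list mentioned in the source is left out, and `ucwords()` only capitalises ASCII letters, so a word starting with `ä` would stay lowercase.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: lowercases the value, then capitalises the first
 * letter after each separator listed in the documentation.
 */
function ucwordsFirst(string $value): string
{
    // space, hyphen, apostrophe, slash, period, comma, semicolon, colon, parentheses
    $delimiters = " -'/.,;:()";

    return ucwords(mb_strtolower($value, 'UTF-8'), $delimiters);
}

foreach (['COOP PRONTO CHUR', 'migros-rail city', "O'NEILL STORE"] as $input) {
    echo $input, ' => ', ucwordsFirst($input), PHP_EOL;
}
```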
---
### 5. **replace** - String replacement
```json
{ "type": "replace", "search": "  ", "replace": " " }
```
- Input: `Coop  Pronto` → Output: `Coop Pronto`
---
### 6. **split** - Split column
```json
{ "type": "split", "delimiter": ";", "part": 0 }
```
- Input: `Coop Pronto Chur;7007 Chur` → Output: `Coop Pronto Chur`
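
A minimal sketch of the `split` step, assuming a zero-based `part` index as in the example above. The function name `splitPart` and the empty-string result for a missing part are assumptions, not documented behaviour.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the requested part of a delimited value.
 */
function splitPart(string $value, string $delimiter, int $part): string
{
    // explode() throws a ValueError for an empty delimiter
    $parts = explode($delimiter, $value);

    // Assumption: a missing part yields an empty string
    return $parts[$part] ?? '';
}

echo splitPart('Coop Pronto Chur;7007 Chur', ';', 0), PHP_EOL; // Coop Pronto Chur
echo splitPart('Coop Pronto Chur;7007 Chur', ';', 1), PHP_EOL; // 7007 Chur
```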
---
### 7. **regex** - Regex replacement
```json
{ "type": "regex", "pattern": "^(.*?);.*$", "replace": "$1" }
```
- No match → the original value stays **unchanged** (pipeline-safe)
---
### 8. **regexextract** - Regex extraction
```json
{ "type": "regexextract", "pattern": "(\\d{4,} [^;]+)" }
```
- No match → empty string (**not** pipeline-safe)
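
The difference between the two regex types matters when they are chained, so here is a small side-by-side sketch. Both function names are hypothetical, the `~` delimiter is an assumption, and returning the first capture group (falling back to the whole match) for `regexextract` is a guess at the intended behaviour.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper for "regex": a failed match leaves the value untouched. */
function regexReplace(string $value, string $pattern, string $replace): string
{
    $result = preg_replace('~' . $pattern . '~u', $replace, $value);

    // preg_replace() returns null on error; keep the original value in that case
    return $result ?? $value;
}

/** Hypothetical helper for "regexextract": a failed match yields an empty string. */
function regexExtract(string $value, string $pattern): string
{
    if (preg_match('~' . $pattern . '~u', $value, $matches) !== 1) {
        return '';
    }

    // Assumption: prefer the first capture group, else the whole match
    return $matches[1] ?? $matches[0];
}

$text = 'Coop Pronto Chur;7007 Chur';

echo regexReplace($text, '^(.*?);.*$', '$1'), PHP_EOL;   // Coop Pronto Chur
echo regexExtract($text, '(\d{4,} [^;]+)'), PHP_EOL;     // 7007 Chur
var_dump(regexExtract('Coop Pronto Chur', '(\d{4,} [^;]+)')); // string(0) ""
```

Because `regexextract` empties the value on a miss, any step placed after it in a pipeline receives an empty string, whereas `regex` passes the original value through.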
---
### 9. **dateformat** - Date reformatting
```json
{ "type": "dateformat", "fromFormat": "d.m.Y", "toFormat": "Y-m-d" }
```
- Input: `10.12.2025` → Output: `2025-12-10`
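
The format strings follow PHP's date format characters, so the conversion can be sketched with `DateTimeImmutable::createFromFormat()`. The function name `convertDate` is hypothetical, and throwing on an unparsable or impossible date is an assumption; the source does not say how the tool reacts to invalid input.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: converts a date string from one format to another.
 */
function convertDate(string $value, string $fromFormat, string $toFormat): string
{
    // '!' resets all fields that are not part of the format (e.g. the time)
    $date = DateTimeImmutable::createFromFormat('!' . $fromFormat, $value);

    // Warnings cover impossible dates such as 31.02.2025, which PHP would roll over
    $errors = DateTimeImmutable::getLastErrors();
    $hasIssues = is_array($errors) && ($errors['warning_count'] > 0 || $errors['error_count'] > 0);

    if ($date === false || $hasIssues) {
        throw new InvalidArgumentException("Cannot parse '$value' as '$fromFormat'");
    }

    return $date->format($toFormat);
}

echo convertDate('10.12.2025', 'd.m.Y', 'Y-m-d'), PHP_EOL; // 2025-12-10
```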
---
### 10. **truncate** - Truncate string
```json
{ "type": "truncate", "maxLength": 100 }
```
---
### 11. **constantvalue** - Constant value from metadata
```json
{
"sourceColumn": "_constant_",
"transformations": [{ "type": "constantvalue", "metadataKey": "account_iban" }],
"outputColumn": "account_iban",
"outputAction": "create"
}
```
---
### 12. **map** - Copy column
```json
{ "type": "map" }
```
---
### 13. **pipeline** - Nested pipeline
```json
{
"type": "pipeline",
"steps": [
{ "type": "trim" },
{ "type": "lowercase" },
{ "type": "ucwordsfirst" }
]
}
```
---
### Pipeline example
```json
{
"sourceColumn": "Buchungstext",
"transformations": [
{ "type": "trim" },
{ "type": "replace", "search": "  ", "replace": " " },
{ "type": "lowercase" },
{ "type": "ucwordsfirst" }
],
"outputColumn": "description",
"outputAction": "overwrite"
}
```
**Processing:**
1. `"  COOP  PRONTO  "` → trim → `"COOP  PRONTO"`
2. `"COOP  PRONTO"` → replace → `"COOP PRONTO"`
3. `"COOP PRONTO"` → lowercase → `"coop pronto"`
4. `"coop pronto"` → ucwordsfirst → `"Coop Pronto"`
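
The step-by-step processing can be sketched as a loop that feeds each step's output into the next one. This is an illustration under assumptions and not the project's `ColumnTransformer`: the function name `runPipeline` is invented, only a handful of types are covered, and an unknown type raises an exception.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs a list of transformation steps over one value.
 */
function runPipeline(string $value, array $steps): string
{
    foreach ($steps as $step) {
        $value = match ($step['type']) {
            'trim' => trim($value),
            'replace' => str_replace($step['search'], $step['replace'], $value),
            'lowercase' => mb_strtolower($value, 'UTF-8'),
            'uppercase' => mb_strtoupper($value, 'UTF-8'),
            'ucwordsfirst' => ucwords(mb_strtolower($value, 'UTF-8'), " -'/.,;:()"),
            // A nested pipeline simply recurses over its own steps
            'pipeline' => runPipeline($value, $step['steps']),
            default => throw new InvalidArgumentException("Unknown type: {$step['type']}"),
        };
    }

    return $value;
}

$steps = [
    ['type' => 'trim'],
    ['type' => 'replace', 'search' => '  ', 'replace' => ' '],
    ['type' => 'lowercase'],
    ['type' => 'ucwordsfirst'],
];

echo runPipeline('  COOP  PRONTO  ', $steps), PHP_EOL; // Coop Pronto
```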
---
## CLI Reference
```bash
php bin/transformer.php <command> [input] [config] [options]
```
### Commands
| Command | Description |
| -------- | ------------- |
| `test` | Test run (max. 10 rows) |
| `transform` | Full transformation |
| `validate` | Validate the configuration |
| `auto-import` | Directory monitoring |
| `help` | Show help |
### Options
| Option | Description |
| -------- | ------------- |
| `--debug`, `-d` | Enable debug mode |
| `--rows=N` | Max. N rows (`test` command) |
| `--output=FILE`, `-o` | Output path |
| `--strict` | Strict validation |
| `--watch` | Continuous monitoring |
| `--interval=SEC` | Check interval in seconds |
| `--dry-run` | Simulation mode |
---
## Debug Mode
```bash
php bin/transformer.php test input.csv config/config.json --debug
```
Debug mode logs events in the following categories:
| Category | When |
| ----------- | ------ |
| `transformer` | Start/end of the transformation |
| `csv_reader` | While reading the CSV |
| `metadata` | During metadata extraction |
| `metadata_warning` | When problems occur |
| `transformation` | For every transformation |
| `csv_writer` | While writing the CSV |
---
## Firefly III Integration
Three operating modes cover all typical deployment scenarios.
### Mode `cli`
Transformer and importer run on the same server.
```json
"fireflyImport": {
"mode": "cli",
"jsonConfig": "/opt/firefly-data-importer/storage/configurations/ubs-import.json",
"importerCommand": "php /opt/firefly-data-importer/artisan importer:import",
"autoImport": true,
"timeout": 300,
"environment": {
"FIREFLY_III_URL": "https://localhost",
"FIREFLY_III_ACCESS_TOKEN": "your-token-here"
}
}
```
### Mode `docker`
The transformer runs locally, the importer in Docker. The output directory must be mounted as a volume. `jsonConfig` is the path **inside the container**.
```json
"fireflyImport": {
"mode": "docker",
"jsonConfig": "/import/configs/ubs-import.json",
"importerCommand": "docker exec firefly-importer php artisan importer:import",
"autoImport": true,
"timeout": 300
}
```
### Mode `http`
The transformer runs locally, the importer is reachable over HTTP(S). Requires `ext-curl`.
**Prerequisites on the importer server:**
```text
CAN_POST_FILES=true
AUTO_IMPORT_SECRET=<secret> # at least 16 characters
```
```json
"fireflyImport": {
"mode": "http",
"importerUrl": "https://importer.your-server.com",
"importerSecret": "your-auto-import-secret-min-16-chars",
"jsonConfig": "/local/path/to/ubs-import.json",
"autoImport": true,
"timeout": 300
}
```
---
## Architecture
```text
bin/transformer.php (CLI Entry Point)
TransformerEngine (orchestration)
├─ ConfigurationLoader (load/validate config)
├─ CsvReader (read CSV)
├─ MetadataExtractor (metadata via regex)
├─ ColumnTransformer (apply transformations)
├─ CsvWriter (write CSV)
├─ FireflyImporter (Firefly III integration)
└─ DebugLogger (debug logs)
```
| Class | Responsibility |
| -------- | --------------- |
| `TransformerEngine` | Orchestrates the whole workflow |
| `ConfigurationLoader` | Loads and validates the JSON configuration |
| `CsvReader` | Reads the CSV including metadata |
| `MetadataExtractor` | Extracts metadata with regex |
| `ColumnTransformer` | Transforms columns (pipeline) |
| `CsvWriter` | Writes the CSV |
| `FireflyImporter` | Imports into Firefly III |
| `DebugLogger` | Static logger for debug output |
---
## Error Handling
### Common errors
#### "Input file not found"
```bash
# Check the file path
ls -la input.csv
# Use an absolute path if the relative one does not work
php bin/transformer.php transform /absolute/path/input.csv config.json
```
#### "Missing metadata: account_iban"
```bash
# Check the first lines of the CSV
head -5 input.csv
# Check lineNumber and regex in config.json
php bin/transformer.php validate config.json input.csv --debug
```
#### "Invalid JSON"
```bash
php -r "json_decode(file_get_contents('config/config.json'), true) or die('JSON invalid');"
```
#### "Configuration: 'csvStructure.headerLine' required"
```bash
diff config/config.json config/config.example.json
```
---
## Version & Changes
**v1.0.0 (3 May 2026)**
- ✅ Initial release
- ✅ 13 transformation types
- ✅ Metadata extraction with regex
- ✅ Debug mode
- ✅ Firefly III integration (cli / docker / http)
- ✅ Complete documentation
---
**License:** GPL-3.0
**Author:** PHP CSV Transformer Project
**Repository:** [git.andare.ch/david.reindl/ff-imp-preprocessor](https://git.andare.ch/david.reindl/ff-imp-preprocessor)

986
README.md

File diff suppressed because it is too large


@ -18,7 +18,7 @@ use UbsCsvTransformer\ConfigurationLoader;
use UbsCsvTransformer\FireflyImporter;
// ============================================================================
// CLI-Argument-Verarbeitung
// CLI argument processing
// ============================================================================
$argc = $_SERVER['argc'] ?? 0;
@ -29,10 +29,10 @@ if ($argc < 2) {
exit(0);
}
// Debug-Modus aktivierbar
// Debug mode can be enabled
$debug = in_array('--debug', $argv) || in_array('-d', $argv);
// Extrahiere Kommando
// Extract command
$command = $argv[1];
try {
@ -54,11 +54,26 @@ try {
// ============================================================================
/**
* Zeige Hilfe und Verwendungsanleitung
* Returns true when the active shell locale is German (de_*)
*/
function isGermanLocale(): bool
{
foreach (['LANG', 'LC_ALL', 'LC_MESSAGES', 'LANGUAGE'] as $var) {
$val = getenv($var);
if ($val !== false && $val !== '') {
return str_starts_with(strtolower($val), 'de');
}
}
return false;
}
/**
* Show help and usage instructions
*/
function showHelp(): void
{
echo <<<'HELP'
if (isGermanLocale()) {
echo <<<'HELP_DE'
╔════════════════════════════════════════════════════════════════════════════╗
Firefly Import Preprocessor - Kommandozeilen-Tool
@ -170,26 +185,150 @@ KONFIGURATION:
DOKUMENTATION:
Siehe README.md und UBS_Transformer_Guide.md für vollständige Dokumentation
Siehe README.md für vollständige Dokumentation
LIZENZ:
MIT License
GPL 3
HELP;
HELP_DE;
return;
}
echo <<<'HELP_EN'
╔════════════════════════════════════════════════════════════════════════════╗
Firefly Import Preprocessor - Command Line Tool
A lightweight PHP 8 tool for transforming UBS E-Banking exports
into a Firefly III compatible format.
╚════════════════════════════════════════════════════════════════════════════╝
USAGE:
transformer [command] [options]
COMMANDS:
test [input] [config] [options]
Tests the transformation with a limited number of rows
Options:
--rows=N Process only N rows (default: 10)
--output=FILE, -o Also write result to file
Example:
transformer test ubs-export.csv config.json --rows=5
transformer test ubs-export.csv config.json -o test-output.csv
transform [input] [config] [options]
Transforms a complete CSV file
Options:
--output=FILE, -o Output path (default: input-transformed.csv)
--no-import Do not automatically import into Firefly III
Example:
transformer transform ubs-export.csv config.json
transformer transform ubs-export.csv config.json -o import.csv
validate [config] [options]
Validates the configuration file
Options:
--strict Strict validation (recommended)
Example:
transformer validate config.json
transformer validate config.json --strict
auto-import [config] [options]
Monitors source directory and processes new files
Options:
--watch Continuous monitoring (daemon mode)
--interval=SEC Check interval in seconds (default: 60)
--dry-run Show what would be done (no actual processing)
Example:
transformer auto-import config.json
transformer auto-import config.json --watch --interval=30
help, -h, --help
Show this help
GLOBAL OPTIONS:
--debug, -d Enable debug mode (detailed output)
INSTALLATION:
1. PHP 8.1+ must be installed
php --version
2. Autoloader setup (choose one):
Option A: With Composer (recommended)
composer install
Option B: Manual - files in directory structure:
ff-imp-preprocessor/
├── bin/transformer.php
├── src/*.php
└── config/config.json
3. Make executable:
chmod +x bin/transformer.php
4. Adjust configuration:
cp config/config.example.json config/config.json
nano config/config.json
EXAMPLES:
# Test transformation (first 5 rows)
./bin/transformer test data/ubs-export.csv config/config.json --rows=5
# Full transformation
./bin/transformer transform data/ubs-export.csv config/config.json \
--output=output/firefly-import.csv
# Validate configuration
./bin/transformer validate config/config.json --strict
# Start auto-import with monitoring
./bin/transformer auto-import config/config.json --watch
# Process only next file
./bin/transformer auto-import config/config.json
CONFIGURATION:
The config.json must have the following structure:
{
"metadata": { "extractionRules": {...} },
"csvStructure": { "delimiter": ";", ... },
"columnTransformations": { ... },
"fireflyImport": { "apiUrl": "...", "apiKey": "..." },
"directories": {
"source": "./import/source",
"output": "./import/output",
"archive": "./import/archive",
"error": "./import/error"
}
}
DOCUMENTATION:
See README.md for full documentation
LICENSE:
GPL 3
HELP_EN;
}
/**
* Expandiert ~ zu absolutem Home-Verzeichnis und löst relative Pfade auf
* Expands ~ to absolute home directory and resolves relative paths
*/
function expandPath(string $path): string
{
if (str_starts_with($path, '~/') || $path === '~') {
$home = getenv('HOME') ?: posix_getpwuid(posix_getuid())['dir'];
$homeEnv = getenv('HOME');
$pwInfo = posix_getpwuid(posix_getuid());
$home = $homeEnv !== false && $homeEnv !== '' ? $homeEnv : ($pwInfo !== false ? $pwInfo['dir'] : '~');
$path = $home . substr($path, 1);
}
// Relative Pfade gegen cwd auflösen (ohne realpath, damit nicht-existierende Dirs erlaubt sind)
// Resolve relative paths against cwd (without realpath, so non-existent dirs are allowed)
if (!str_starts_with($path, '/')) {
$path = getcwd() . '/' . $path;
}
@ -198,7 +337,7 @@ function expandPath(string $path): string
}
/**
* Parse CLI-Optionen in assoziatives Array
* Parses CLI options into an associative array
*/
function parseOptions(array $argv, int $startIndex = 0): array
{
@ -217,9 +356,9 @@ function parseOptions(array $argv, int $startIndex = 0): array
}
/**
* Teste Transformation mit begrenzter Zeilenzahl
* Tests transformation with a limited number of rows
*/
function handleTest($argc, $argv): void
function handleTest(int $argc, array $argv): void
{
if ($argc < 4) {
throw new Exception("Usage: transformer test [input-file] [config-file] [options]");
@ -234,10 +373,10 @@ function handleTest($argc, $argv): void
$outputFile = $options['output'] ?? $options['o'] ?? null;
if (!file_exists($inputFile)) {
throw new Exception("Input-Datei nicht gefunden: $inputFile");
throw new Exception("Input file not found: $inputFile");
}
if (!file_exists($configFile)) {
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
throw new Exception("Configuration file not found: $configFile");
}
echo "\n📊 TEST-MODUS: Verarbeite max. $maxRows Zeilen\n";
@ -292,9 +431,9 @@ function handleTest($argc, $argv): void
}
/**
* Transformiere komplette CSV-Datei
* Transforms a complete CSV file
*/
function handleTransform($argc, $argv): void
function handleTransform(int $argc, array $argv): void
{
if ($argc < 4) {
throw new Exception("Usage: transformer transform [input-file] [config-file] [options]");
@ -308,10 +447,10 @@ function handleTransform($argc, $argv): void
$outputFile = $options['output'] ?? $options['o'] ?? null;
if (!file_exists($inputFile)) {
throw new Exception("Input-Datei nicht gefunden: $inputFile");
throw new Exception("Input file not found: $inputFile");
}
if (!file_exists($configFile)) {
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
throw new Exception("Configuration file not found: $configFile");
}
echo "\n🚀 TRANSFORMATION STARTEN\n";
@ -320,7 +459,7 @@ function handleTransform($argc, $argv): void
$configLoader = new ConfigurationLoader($configFile);
$configLoader->load();
// --output überschreibt Zielverzeichnis und Dateiname aus der Konfiguration
// --output overrides target directory and filename from configuration
if ($outputFile !== null) {
$outputFile = expandPath($outputFile);
$configLoader->set('directories.output', dirname($outputFile));
@ -338,9 +477,9 @@ function handleTransform($argc, $argv): void
}
/**
* Validiere Konfigurationsdatei
* Validates the configuration file
*/
function handleValidate($argc, $argv): void
function handleValidate(int $argc, array $argv): void
{
if ($argc < 3) {
throw new Exception("Usage: transformer validate [config-file] [options]");
@ -351,7 +490,7 @@ function handleValidate($argc, $argv): void
$strict = isset($options['strict']);
if (!file_exists($configFile)) {
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
throw new Exception("Configuration file not found: $configFile");
}
echo "\n✔️ KONFIGURATION VALIDIEREN\n";
@ -362,7 +501,7 @@ function handleValidate($argc, $argv): void
try {
$config = $configLoader->load();
// Basis-Validierung
// Basic validation
echo "✅ JSON-Format valide\n";
$required = ['metadata', 'csvStructure', 'columnTransformations'];
@ -377,7 +516,7 @@ function handleValidate($argc, $argv): void
}
}
// Firefly-Validierung
// Firefly validation
if (isset($config['fireflyImport'])) {
echo "✅ Firefly III Konfiguration vorhanden\n";
if (empty($config['fireflyImport']['apiUrl'])) {
@ -396,7 +535,7 @@ function handleValidate($argc, $argv): void
echo "⚠️ Firefly III Konfiguration nicht vorhanden (optional)\n";
}
// Verzeichnisse-Validierung
// Directory validation
if (isset($config['directories'])) {
echo "✅ Verzeichnisse konfiguriert\n";
$dirs = ['source', 'output', 'archive', 'error'];
@ -416,14 +555,14 @@ function handleValidate($argc, $argv): void
echo "\n⚠️ Konfiguration hat Warnungen aber ist funktional\n\n";
}
} catch (Exception $e) {
throw new Exception("Validierungsfehler: " . $e->getMessage());
throw new Exception("Validation error: " . $e->getMessage());
}
}
/**
* Auto-Import mit Verzeichnis-Überwachung
* Auto-import with directory monitoring
*/
function handleAutoImport($argc, $argv): void
function handleAutoImport(int $argc, array $argv): void
{
if ($argc < 3) {
throw new Exception("Usage: transformer auto-import [config-file] [options]");
@ -434,7 +573,7 @@ function handleAutoImport($argc, $argv): void
$debug = isset($options['debug']) || isset($options['d']);
if (!file_exists($configFile)) {
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
throw new Exception("Configuration file not found: $configFile");
}
$configLoader = new ConfigurationLoader($configFile);
@ -448,7 +587,7 @@ function handleAutoImport($argc, $argv): void
$watch = isset($options['watch']);
$interval = isset($options['interval']) ? (int)$options['interval'] : 60;
// Verzeichnisse erstellen
// Create directories
foreach ([$sourceDir, $outputDir, $archiveDir, $errorDir] as $dir) {
if (!is_dir($dir)) {
mkdir($dir, 0755, true);
@ -476,7 +615,9 @@ function handleAutoImport($argc, $argv): void
if ($watch) {
echo "⏳ Drücke Ctrl+C zum Beenden.\n\n";
while (true) {
$running = true;
/** @phpstan-ignore while.alwaysTrue (intentional infinite loop — terminated only via Ctrl+C / SIGINT) */
while ($running) {
processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir, $config, $configFile, $dryRun, $debug);
sleep($interval);
}
@ -486,9 +627,9 @@ function handleAutoImport($argc, $argv): void
}
/**
* Verarbeite Verzeichnis mit CSV-Dateien
* Processes directory containing CSV files
*/
function processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir, $config, $configFile, $dryRun = false, $debug = false): void
function processImportDirectory(string $sourceDir, string $outputDir, string $archiveDir, string $errorDir, array $config, string $configFile, bool $dryRun = false, bool $debug = false): void
{
if (!is_dir($sourceDir)) {
return;
@ -516,10 +657,10 @@ function processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir,
$result = $engine->transform($file);
$outputFile = $result['outputFile'] ?? $outputFile;
// Archiviere Original-Datei
// Archive original file
$archiveFile = $archiveDir . '/' . $basename;
if (!rename($file, $archiveFile)) {
throw new Exception("Konnte nicht archivieren");
throw new Exception("Could not archive file");
}
// Firefly Import
@ -534,7 +675,7 @@ function processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir,
echo "" . $e->getMessage() . "\n";
if (!$dryRun) {
// Verschiebe zu Error-Verzeichnis
// Move to error directory
$errorFile = $errorDir . '/' . $basename;
@rename($file, $errorFile);
}


@ -40,6 +40,7 @@
"vimeo/psalm": "^5.0"
},
"suggest": {
"ext-curl": "Benötigt für Modus fireflyImport.mode=http (HTTP-Upload an den Data Importer)",
"monolog/monolog": "For advanced logging capabilities (optional)",
"guzzlehttp/guzzle": "For Firefly III HTTP client integration (optional)"
},


@ -199,11 +199,16 @@
],
"fireflyImport": {
"jsonConfig": "/opt/firefly/import-config.json",
"importerCommand": "docker exec -it firefly-importer php artisan importer:import",
"mode": "docker",
"jsonConfig": "/import/configs/ubs-import.json",
"importerCommand": "docker exec firefly-importer php artisan importer:import",
"autoImport": false,
"deleteAfterImport": false,
"timeout": 300,
"environment": {
"FIREFLY_III_URL": "https://your-firefly.com",
"FIREFLY_III_ACCESS_TOKEN": "your-token-here"


@ -0,0 +1,53 @@
{
"_comment_1": "Firefly III Data Importer configuration file (format version 3)",
"_comment_2": "Created for the output of config-ubs-account.json (11 columns, comma-delimited)",
"_comment_3": "Adjust: set 'default_account' to your Firefly III asset account ID (number, not name)",
"_comment_4": "Docs: https://docs.firefly-iii.org/references/data-importer/json/",
"version": 3,
"flow": "csv",
"date": "Y-m-d",
"delimiter": "comma",
"headers": true,
"conversion": false,
"default_account": 1,
"rules": true,
"skip_form": true,
"add_import_tag": true,
"duplicate_detection_method": "classic",
"ignore_duplicate_lines": true,
"ignore_duplicate_transactions": true,
"roles": [
"amount_debit",
"amount_credit",
"date_transaction",
"date_process",
"opposing-name",
"tags-comma",
"description",
"opposing-iban",
"note",
"account-iban",
"currency-code"
],
"do_mapping": {
"0": false,
"1": false,
"2": false,
"3": false,
"4": false,
"5": false,
"6": false,
"7": false,
"8": false,
"9": false,
"10": false
},
"mapping": {}
}


@ -3,30 +3,30 @@
namespace UbsCsvTransformer;
/**
* Transformiert Spalten gemäß Konfiguration
* Transforms columns according to configuration
*
* Unterstützte Transformationstypen (canonical names):
* - map: Spalte kopieren/umbenennen (Standard)
* - replace: String-Replacement (str_replace)
* - regex: Regex-Replace mit preg_replace (Backreferenzen: $1, $2 )
* - dateformat: Datum-Formatierung
* - split: Spalte bei Delimiter teilen
* - regexextract: Mit Regex extrahieren
* - trim: Whitespace entfernen
* - uppercase: In Grossbuchstaben umwandeln
* - lowercase: In Kleinbuchstaben umwandeln
* - ucwordsfirst: Ersten Buchstaben nach Worttrennern gross
* - truncate: String auf maximale Länge kürzen
* - constantvalue: Konstanten-Wert aus Metadaten
* - pipeline: Mehrere Transformationen hintereinander (via steps[])
* - custom: Custom PHP-Callback
* Supported transformation types (canonical names):
* - map: Copy/rename column (default)
* - replace: String replacement (str_replace)
* - regex: Regex replace via preg_replace (backreferences: $1, $2 )
* - dateformat: Date formatting
* - split: Split column at delimiter
* - regexextract: Extract using regex
* - trim: Remove whitespace
* - uppercase: Convert to uppercase
* - lowercase: Convert to lowercase
* - ucwordsfirst: Capitalise first letter after word boundaries
* - truncate: Truncate string to maximum length
* - constantvalue: Constant value from metadata
* - pipeline: Chain multiple transformations (via steps[])
* - custom: Custom PHP callback
*
* Unterstützte outputAction-Werte:
* - create / overwrite: Ziel-Spalte setzen (Standard)
* - append: Wert anhängen
* - append-line: Wert auf neuer Zeile anhängen (kein Leerzeichen wenn Ziel leer)
* - overwrite-if-empty: Nur setzen wenn Ziel-Spalte leer
* - overwrite-if-not-empty: Nur setzen wenn Ergebnis nicht leer
* Supported outputAction values:
* - create / overwrite: Set target column (default)
* - append: Append value
* - append-line: Append value on new line (no leading newline if target is empty)
* - overwrite-if-empty: Only set if target column is empty
* - overwrite-if-not-empty: Only set if transformation result is not empty
*/
class ColumnTransformer
{
@ -36,11 +36,11 @@ class ColumnTransformer
private array $globalExceptions;
/**
* Initialisiert ColumnTransformer mit Transformationsregeln
* Initialises ColumnTransformer with transformation rules
*
* @param array $transformations Transformationskonfiguration aus config.json
* @param array $metadata Extrahierte Metadaten aus CSV-Header
* @param array $globalExceptions Globale Ausnahmeliste für ucwordsfirst
* @param array $transformations Transformation configuration from config.json
* @param array $metadata Extracted metadata from CSV header
* @param array $globalExceptions Global exceptions list for ucwordsfirst
*/
public function __construct(array $transformations, array $metadata = [], array $globalExceptions = [])
{
@ -51,36 +51,36 @@ class ColumnTransformer
}
/**
* Transformiert eine einzelne Datenzeile
* Transforms a single data row
*
* Wendet alle definierten Transformationen auf die Zeile an.
* Kann neue Spalten generieren (z.B. bei regex_extract).
* Applies all defined transformations to the row.
* Can generate new columns (e.g. for regex_extract).
*
* @param array $row Datenzeile mit Header-Keys als Array-Keys
* @param array $row Data row with header keys as array keys
*
* @return array Transformierte Datenzeile
* @return array Transformed data row
*/
public function transformRow(array $row): array
{
$transformedRow = $row;
foreach ($this->transformations as $config) {
// Multi-Output Detection (für split)
// Multi-output detection (for split)
if (isset($config['outputs']) && is_array($config['outputs'])) {
// Multi-Output Transformation (z.B. split in mehrere Spalten)
// Multi-output transformation (e.g. split into multiple columns)
$multiOutputResult = $this->handleMultiOutputTransformation($transformedRow, $config);
// Merge Ergebnisse in transformedRow
// Merge results into transformedRow
foreach ($multiOutputResult as $columnName => $value) {
$transformedRow[$columnName] = $value;
// Registriere neue Spalten
// Register new columns
if (!in_array($columnName, $this->outputColumns)) {
$this->outputColumns[] = $columnName;
}
}
// Fahre mit nächster Transformation fort
// Continue with next transformation
continue;
}
@ -90,7 +90,7 @@ class ColumnTransformer
if (empty($targetColumn)) {
throw new \RuntimeException(
"Transformation fehlt 'outputColumn' Feld: " . json_encode($config)
"Transformation missing 'outputColumn' field: " . json_encode($config)
);
}
@ -127,20 +127,20 @@ class ColumnTransformer
$transformedRow[$targetColumn] = ($transformedRow[$targetColumn] ?? '') . $resultValue;
break;
case 'append-line':
// Wert auf neuer Zeile anhängen; kein führender Zeilenumbruch wenn Ziel leer
// Append value on new line; no leading newline if target is empty
if ($resultValue !== '') {
$existing = $transformedRow[$targetColumn] ?? '';
$transformedRow[$targetColumn] = $existing !== '' ? $existing . "\n" . $resultValue : $resultValue;
}
break;
case 'overwrite-if-empty':
// Nur überschreiben wenn Ziel-Spalte leer ist
// Only overwrite if target column is empty
if (($transformedRow[$targetColumn] ?? '') === '') {
$transformedRow[$targetColumn] = $resultValue;
}
break;
case 'overwrite-if-not-empty':
// Nur überschreiben wenn das Transformations-Ergebnis nicht leer ist
// Only overwrite if the transformation result is not empty
if ($resultValue !== '') {
$transformedRow[$targetColumn] = $resultValue;
}
@ -157,14 +157,14 @@ class ColumnTransformer
}
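The `outputAction` branches in the hunk above can be summarised in a small standalone function. This is a minimal sketch, not the class's actual code: `applyOutputAction` is a hypothetical name, the line preceding `case 'append-line'` is assumed to be the `append` branch, and treating every other value as `create` / `overwrite` is an assumption.

```php
<?php

/**
 * Hypothetical helper mirroring the outputAction branches shown in the diff.
 *
 * @param array<string, string> $row
 * @return array<string, string>
 */
function applyOutputAction(array $row, string $target, string $result, string $action = 'create'): array
{
    $existing = $row[$target] ?? '';

    switch ($action) {
        case 'append':
            // Plain concatenation, no separator.
            $row[$target] = $existing . $result;
            break;
        case 'append-line':
            // Empty results are skipped; no leading newline if the target is empty.
            if ($result !== '') {
                $row[$target] = $existing !== '' ? $existing . "\n" . $result : $result;
            }
            break;
        case 'overwrite-if-empty':
            if ($existing === '') {
                $row[$target] = $result;
            }
            break;
        case 'overwrite-if-not-empty':
            if ($result !== '') {
                $row[$target] = $result;
            }
            break;
        default:
            // Assumption: 'create', 'overwrite' and unknown values simply set the column.
            $row[$target] = $result;
    }

    return $row;
}
```

For example, `applyOutputAction(['Notes' => 'a'], 'Notes', 'b', 'append-line')` yields a `Notes` value of `"a\nb"`, while the same call with an empty result leaves the row unchanged.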
/**
* Wendet eine einzelne Transformation auf einen Stringwert an
* Applies a single transformation to a string value
*
* Normalisiert den Typ-Namen (snake_case, PascalCase, no-separator alle akzeptiert)
* und delegiert an die jeweilige transformXxx()-Methode.
* Normalises the type name (snake_case, PascalCase, no-separator all accepted)
* and delegates to the respective transformXxx() method.
*
* @param string $value Eingabewert
* @param array $config Transformationskonfiguration
* @return string Transformierter Wert
* @param string $value Input value
* @param array $config Transformation configuration
* @return string Transformed value
*/
private function applySingleTransformation(string $value, array $config): string
{
@ -219,8 +219,8 @@ class ColumnTransformer
}
/**
* Normalisiert Transformationstyp-Namen: lowercase, Trennzeichen entfernt.
* Erlaubt z.B. dass 'dateformat' und 'dateFormat' beide funktionieren.
* Normalises transformation type names: lowercase, separators removed.
* Allows e.g. 'dateformat' and 'dateFormat' to both work.
*/
private function normalizeTransformType(string $type): string
{
@ -228,19 +228,19 @@ class ColumnTransformer
}
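The body of `normalizeTransformType()` is not part of this diff. A minimal sketch of the behaviour the docblock describes (lowercase, separators removed) might look as follows; the function name and the exact set of separators (`_`, `-`, space) are assumptions.

```php
<?php

// Hypothetical stand-in for normalizeTransformType(): lowercases the type
// name and strips separators so snake_case, camelCase and PascalCase agree.
function normalizeTransformTypeSketch(string $type): string
{
    // Assumption: underscore, hyphen and space count as separators.
    return strtolower(str_replace(['_', '-', ' '], '', $type));
}
```

With this, `regex_extract`, `regexExtract` and `RegexExtract` all map to the canonical name `regexextract` listed in the class docblock.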
/**
* String-Replacement Transformation
* String replacement transformation
*
* Konfiguration:
* Configuration:
* ```
* "type": "replace",
* "search": "Alt",
* "replace": "Neu"
* "search": "old",
* "replace": "new"
* ```
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration
* @param string $value Source value
* @param array $config Transformation configuration
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformReplace(string $value, array $config): string
{
@ -255,22 +255,22 @@ class ColumnTransformer
}
/**
* Regex-Replace Transformation
* Regex replace transformation
*
* Wendet einen regulären Ausdruck auf den Wert an und ersetzt den Treffer.
* Backreferenz-Syntax: $1, $2 usw. im replace-String.
* Applies a regular expression to the value and replaces the match.
* Backreference syntax: $1, $2 etc. in the replace string.
*
* Konfiguration:
* Configuration:
* ```
* "type": "regex",
* "pattern": "SumUp \\*+(.*)",
* "replace": "[$1]"
* ```
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration
* @param string $value Source value
* @param array $config Transformation configuration
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformRegex(string $value, array $config): string
{
@ -288,19 +288,19 @@ class ColumnTransformer
}
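To illustrate the `regex` configuration from the docblock above, here is a minimal sketch built on `preg_replace`. The function name is hypothetical, and two points are assumptions: that the configured pattern carries no delimiters (it is wrapped in `~` here) and that the original value is returned when the regex engine reports an error.

```php
<?php

// Hypothetical sketch of a regex replace step.
function regexReplaceSketch(string $value, string $pattern, string $replace): string
{
    // Assumption: the config pattern has no delimiters, so wrap it in '~'
    // and escape any literal '~' it contains.
    $regex = '~' . str_replace('~', '\~', $pattern) . '~u';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($regex, $replace, $value) ?? $value;
}
```

The JSON value `"SumUp \\*+(.*)"` decodes to the pattern `SumUp \*+(.*)`, so `regexReplaceSketch('SumUp *Cafe Bar', 'SumUp \*+(.*)', '[$1]')` gives `[Cafe Bar]`.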
/**
* Datum-Format Transformation
* Date format transformation
*
* Konfiguration:
* Configuration:
* ```
* "type": "date_format",
* "fromFormat": "d.m.Y",
* "toFormat": "Y-m-d"
* ```
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration
* @param string $value Source value
* @param array $config Transformation configuration
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformDate(string $value, array $config): string
{
@ -323,26 +323,26 @@ class ColumnTransformer
}
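The `fromFormat` / `toFormat` pair in the docblock above maps naturally onto `DateTimeImmutable::createFromFormat()`. The following is a sketch under assumptions: the function name is hypothetical, and returning the unchanged input for unparseable values is a choice made here, not confirmed by the diff.

```php
<?php

// Hypothetical sketch of the date format step.
function convertDateSketch(string $value, string $fromFormat, string $toFormat): string
{
    // The leading '!' resets all fields not present in the format
    // (e.g. the time of day) instead of filling them from "now".
    $date = DateTimeImmutable::createFromFormat('!' . $fromFormat, trim($value));

    // Assumption: values that cannot be parsed are passed through unchanged.
    return $date === false ? $value : $date->format($toFormat);
}
```

Note that `createFromFormat()` rolls over out-of-range days (for example the 31st of February) rather than failing, so stricter validation would need an extra check.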
/**
* Split Transformation
* Split transformation
*
* Teilt einen Wert bei einem Delimiter und behaelt einen definierten Teil
* Splits a value at a delimiter and keeps a defined part
*
* Beispiel:
* Example:
* Input: "Coop Pronto Chur;7007 Chur"
* Config: delimiter=";", part=0
* Output: "Coop Pronto Chur"
*
* Konfiguration:
* Configuration:
* ```
* "type": "split",
* "delimiter": ";",
* "part": 0
* ```
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration
* @param string $value Source value
* @param array $config Transformation configuration
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformSplit(string $value, array $config): string
{
@ -370,16 +370,16 @@ class ColumnTransformer
}
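The single-output split from the docblock example can be sketched with `explode()`. The function name is hypothetical; returning an empty string for a missing part and passing the value through for an empty delimiter are assumptions, since the method body is not shown in the diff.

```php
<?php

// Hypothetical sketch of the single-output split step.
function splitPartSketch(string $value, string $delimiter, int $part): string
{
    if ($delimiter === '') {
        // explode() throws a ValueError for an empty delimiter in PHP 8.
        return $value;
    }

    $parts = explode($delimiter, $value);

    // Assumption: a missing part yields an empty string.
    return $parts[$part] ?? '';
}
```

For the docblock example, `splitPartSketch('Coop Pronto Chur;7007 Chur', ';', 0)` returns `Coop Pronto Chur`.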
/**
* Regex Extract Transformation
* Regex extract transformation
*
* Extrahiert einen Teil mit Regex und erstellt neue Spalte
* Extracts a portion using regex and creates a new column
*
* Beispiel:
* Example:
* Input: "Coop Pronto Chur;7007 Chur"
* Config: pattern="(\d{4,} .*)"
* Output: "7007 Chur" (in neuer Spalte "Location")
* Output: "7007 Chur" (in new column "Location")
*
* Konfiguration:
* Configuration:
* ```
* "Location": {
* "type": "regex_extract",
@ -388,10 +388,10 @@ class ColumnTransformer
* }
* ```
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration
* @param string $value Source value
* @param array $config Transformation configuration
*
* @return string|null Extrahierter Wert oder null
* @return string|null Extracted value or null
*/
private function transformRegexExtract(string $value, array $config): ?string
{
@ -421,22 +421,22 @@ class ColumnTransformer
}
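The `regex_extract` example above can be reproduced with `preg_match()`. This is only a sketch: the function name and the `$group` parameter are hypothetical (the remaining config fields are elided in the diff), and the pattern is assumed to be stored without delimiters.

```php
<?php

// Hypothetical sketch of the regex extract step.
function regexExtractSketch(string $value, string $pattern, int $group = 1): ?string
{
    // Assumption: the config pattern has no delimiters.
    $regex = '~' . str_replace('~', '\~', $pattern) . '~u';

    if (preg_match($regex, $value, $matches) !== 1) {
        return null; // no match (or regex error)
    }

    // Fall back to the whole match if the requested group does not exist.
    return $matches[$group] ?? $matches[0];
}
```

Applied to the docblock input, `regexExtractSketch('Coop Pronto Chur;7007 Chur', '(\d{4,} .*)')` returns `7007 Chur`, and a value without a four-digit number returns `null`.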
/**
* Trim Transformation
* Trim transformation
*
* Entfernt Leerzeichen am Anfang und Ende eines Strings
* Removes whitespace from the beginning and end of a string
*
* Konfiguration:
* Configuration:
* ```
* "type": "trim"
* ```
*
* Beispiel:
* Example:
* Input: " Coop Pronto "
* Output: "Coop Pronto"
*
* @param string $value Ursprungswert
* @param string $value Source value
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformTrim(string $value): string
{
@ -444,22 +444,22 @@ class ColumnTransformer
}
/**
* Lowercase Transformation
* Lowercase transformation
*
* Wandelt einen String in Kleinbuchstaben um (UTF-8 safe)
* Converts a string to lowercase (UTF-8 safe)
*
* Konfiguration:
* Configuration:
* ```
* "type": "lowercase"
* ```
*
* Beispiel:
* Example:
* Input: "COOP PRONTO CHUR"
* Output: "coop pronto chur"
*
* @param string $value Ursprungswert
* @param string $value Source value
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformLowercase(string $value): string
{
@ -467,22 +467,22 @@ class ColumnTransformer
}
/**
* Uppercase Transformation
* Uppercase transformation
*
* Wandelt einen String in Grossbuchstaben um (UTF-8 safe)
* Converts a string to uppercase (UTF-8 safe)
*
* Konfiguration:
* Configuration:
* ```
* "type": "uppercase"
* ```
*
* Beispiel:
* Example:
* Input: "Coop Pronto Chur"
* Output: "COOP PRONTO CHUR"
*
* @param string $value Ursprungswert
* @param string $value Source value
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformUppercase(string $value): string
{
@ -490,65 +490,65 @@ class ColumnTransformer
}
/**
* Ucwords First Transformation
* Ucwords first transformation
*
* Grossschreibung nur des ersten Buchstabens nach Worttrennern.
* Alle anderen Buchstaben werden zu Kleinbuchstaben.
* Funktioniert auch, wenn Input komplett in Grossbuchstaben vorliegt.
* Capitalises only the first letter after word boundaries.
* All other letters are converted to lowercase.
* Works even when input is entirely in uppercase.
*
* Konfiguration:
* Configuration:
* ```
* "type": "ucwords_first"
* ```
*
* Mit Ausnahmeliste (Wörter, die exakt erhalten bleiben):
* With exceptions list (words that are preserved exactly):
* ```
* "type": "ucwords_first",
* "exceptions": ["SBB", "UBS", "AG", "GmbH"]
* ```
*
* Beispiele:
* Examples:
* "COOP PRONTO CHUR" → "Coop Pronto Chur"
* "migros-rail city zuerich" → "Migros-Rail City Zuerich"
* "O'NEILL STORE" → "O'Neill Store"
* "SAINT-JEAN-DE-MAURIENNE" → "Saint-Jean-De-Maurienne"
*
* Wortgrenzen definiert durch: Leerzeichen, Bindestrich, Apostroph,
* Slash, Punkt, Komma, Semikolon, Doppelpunkt, Klammern, Anführungszeichen
* Word boundaries defined by: space, hyphen, apostrophe,
* slash, period, comma, semicolon, colon, brackets, quotation marks
*
* @param string $value Ursprungswert
* @param string $value Source value
*
* @return string Transformierter Wert
* @return string Transformed value
*/
private function transformUcwordsFirst(string $value, array $config = []): string
{
// Schritt 1: Alles zu Kleinbuchstaben
// Step 1: Convert everything to lowercase
$value = mb_strtolower($value, 'UTF-8');
// Schritt 2: Definiere Wortgrenzen (Trennzeichen)
// Diese Zeichen markieren Grenzen, nach denen grossgeschrieben wird
// Step 2: Define word boundaries (delimiters)
// These characters mark boundaries after which capitalisation is applied
$delimiters = [
' ', // Leerzeichen
'-', // Bindestrich
'\'', // Apostroph
'/', // Slash
'.', // Punkt
',', // Komma
';', // Semikolon
':', // Doppelpunkt
'(', // Oeffnende Klammer
')', // Schliessende Klammer
'[', // Oeffnende eckige Klammer
']', // Schliessende eckige Klammer
'{', // Oeffnende geschweifte Klammer
'}', // Schliessende geschweifte Klammer
'"', // Anführungszeichen
'&', // Ampersand
'+' // Plus
' ', // space
'-', // hyphen
'\'', // apostrophe
'/', // slash
'.', // period
',', // comma
';', // semicolon
':', // colon
'(', // opening parenthesis
')', // closing parenthesis
'[', // opening square bracket
']', // closing square bracket
'{', // opening curly bracket
'}', // closing curly bracket
'"', // quotation mark
'&', // ampersand
'+' // plus
];
// Schritt 3: Regex-Pattern fuer "Stringanfang ODER Delimiter, gefolgt von Buchstabe"
// Die u-Flag ermoeglicht Unicode-Unterstaetzung (\p{L})
// Step 3: Regex pattern for "start of string OR delimiter, followed by letter"
// The u-flag enables Unicode support (\p{L})
$escapedDelimiters = array_map(function ($char) {
return preg_quote($char, '/');
}, $delimiters);
@ -556,18 +556,18 @@ class ColumnTransformer
$pattern = '/(^|[' . $delimiterPattern . '])(\p{L})/u';
// Schritt 4: Callback fuer preg_replace_callback
// Grossschreibe den gefangenen Buchstaben (Capture Group 2)
// Step 4: Callback for preg_replace_callback
// Capitalise the captured letter (capture group 2)
$callback = function (array $matches): string {
// $matches[1] = Stringanfang oder Trennzeichen
// $matches[2] = Buchstabe, der grossgeschrieben werden soll
// $matches[1] = start of string or delimiter
// $matches[2] = letter to be capitalised
return $matches[1] . mb_strtoupper($matches[2], 'UTF-8');
};
// Schritt 5: Anwende Transformation
// Step 5: Apply transformation
$result = preg_replace_callback($pattern, $callback, $value) ?? $value;
// Schritt 6: Ausnahmeliste anwenden (Wörter die exakt erhalten bleiben sollen, z.B. SBB, UBS, GmbH)
// Step 6: Apply exceptions list (words to be preserved exactly, e.g. SBB, UBS, GmbH)
$exceptions = $config['exceptions'] ?? $this->globalExceptions;
foreach ($exceptions as $exception) {
if (!is_string($exception) || $exception === '') {
@ -581,12 +581,12 @@ class ColumnTransformer
}
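Steps 1 to 5 of `transformUcwordsFirst()` are visible in the diff, but the body of the exceptions loop (step 6) is cut off by the hunk boundary. The sketch below condenses the visible steps and fills in step 6 with one plausible approach, a case-insensitive whole-word replacement. That part, like the function name, is an assumption and may differ from the real implementation.

```php
<?php

// Hypothetical condensed version of the ucwords_first logic.
function ucwordsFirstSketch(string $value, array $exceptions = []): string
{
    // Step 1: lowercase everything (UTF-8 safe).
    $value = mb_strtolower($value, 'UTF-8');

    // Steps 2-3: build a character class from the delimiter list.
    $delimiters = [' ', '-', '\'', '/', '.', ',', ';', ':', '(', ')', '[', ']', '{', '}', '"', '&', '+'];
    $class = implode('', array_map(static fn(string $c): string => preg_quote($c, '/'), $delimiters));
    $pattern = '/(^|[' . $class . '])(\p{L})/u';

    // Steps 4-5: uppercase the letter that follows the start or a delimiter.
    $result = preg_replace_callback(
        $pattern,
        static fn(array $m): string => $m[1] . mb_strtoupper($m[2], 'UTF-8'),
        $value
    ) ?? $value;

    // Step 6 (assumption): restore the exact spelling of each exception
    // wherever it appears as a whole word, ignoring case.
    foreach ($exceptions as $exception) {
        if (!is_string($exception) || $exception === '') {
            continue;
        }
        $word = '/(?<![\p{L}\p{N}])' . preg_quote($exception, '/') . '(?![\p{L}\p{N}])/iu';
        $result = preg_replace_callback($word, static fn(array $m): string => $exception, $result) ?? $result;
    }

    return $result;
}
```

A callback is used for the exception replacement so that `$` or `\` inside an exception word is never interpreted as a backreference.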
/**
* Pipeline Transformation
* Pipeline transformation
*
* Wendet mehrere Transformationen nacheinander auf einen Wert an.
* Jeder Schritt benutzt das Ergebnis des vorherigen Schrittes.
* Applies multiple transformations sequentially to a value.
* Each step uses the result of the previous step.
*
* Konfiguration:
* Configuration:
* ```
* "Merchant": {
* "type": "pipeline",
@ -599,17 +599,17 @@ class ColumnTransformer
* }
* ```
*
* Beispiel:
* Example:
* Input: " COOP PRONTO CHUR "
* Step 1 (trim): "COOP PRONTO CHUR"
* Step 2 (lowercase): "coop pronto chur"
* Step 3 (ucwords_first): "Coop Pronto Chur"
* Output: "Coop Pronto Chur"
*
* @param string $value Ursprungswert
* @param array $config Transformationskonfiguration mit 'steps' Array
* @param string $value Source value
* @param array $config Transformation configuration with 'steps' array
*
* @return string Transformierter Wert nach allen Schritten
* @return string Transformed value after all steps
*/
private function transformPipeline(string $value, array $config): string
{
@ -619,7 +619,7 @@ class ColumnTransformer
return $value;
}
// Wende jeden Schritt nacheinander an
// Apply each step sequentially
foreach ($steps as $step) {
if (!empty($step['type'] ?? $step['transform'] ?? null)) {
$value = $this->applySingleTransformation($value, $step);
@ -630,23 +630,23 @@ class ColumnTransformer
}
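The pipeline idea, where each step consumes the previous step's result, can be shown in isolation. This is a sketch under assumptions: `runPipelineSketch` and its handler table are hypothetical, `mb_convert_case()` is only a stand-in for the real `ucwords_first` logic, and silently skipping unknown step types is a choice made here.

```php
<?php

// Hypothetical sketch of sequential step execution.
function runPipelineSketch(string $value, array $steps): string
{
    $handlers = [
        'trim' => static fn(string $v): string => trim($v),
        'lowercase' => static fn(string $v): string => mb_strtolower($v, 'UTF-8'),
        'uppercase' => static fn(string $v): string => mb_strtoupper($v, 'UTF-8'),
        // Stand-in only: the real transformer uses its own delimiter-based logic.
        'ucwordsfirst' => static fn(string $v): string => mb_convert_case($v, MB_CASE_TITLE, 'UTF-8'),
    ];

    foreach ($steps as $step) {
        // Accept 'type' or 'transform' as the step key, as in the diff.
        $raw = (string) ($step['type'] ?? $step['transform'] ?? '');
        $type = strtolower(str_replace(['_', '-'], '', $raw));

        // Assumption: steps without a known type are skipped.
        if ($type === '' || !isset($handlers[$type])) {
            continue;
        }

        $value = $handlers[$type]($value);
    }

    return $value;
}
```

Running the docblock example, the steps `trim`, `lowercase`, `ucwords_first` turn `" COOP PRONTO CHUR "` into `Coop Pronto Chur`.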
/**
* Custom Callback Transformation
* Custom callback transformation
*
* Ruft eine Custom-Funktion auf, die komplexe Logik implementiert
* Calls a custom function implementing complex logic
*
* Konfiguration:
* Configuration:
* ```
* "type": "custom",
* "callback": "myCustomFunction"
* ```
*
* Die Callback-Funktion erhaelt die gesamte Zeile und gibt die
* modifizierte Zeile zurueck.
* The callback function receives the entire row and returns the
* modified row.
*
* @param array $row Gesamte Datenzeile
* @param array $config Transformationskonfiguration
* @param array $row Complete data row
* @param array $config Transformation configuration
*
* @return array Transformierte Datenzeile
* @return array Transformed data row
*/
private function transformCustom(array $row, array $config): array
{
@ -664,10 +664,10 @@ class ColumnTransformer
}
/**
* Behandelt Multi-Output Transformationen
* Aktuell nur für 'split' implementiert.
* Handles multi-output transformations.
* Currently only implemented for 'split'.
*
* Config-Beispiel:
* Config example:
* {
* "outputs": ["FirstName", "LastName"],
* "sourceColumn": "FullName",
@ -675,10 +675,10 @@ class ColumnTransformer
* "delimiter": " "
* }
*
* @param array $row Input-Zeile
* @param array $config Transformations-Konfiguration
* @return array Assoziatives Array: columnName => value
* @throws \RuntimeException wenn Transformation-Type nicht unterstützt
* @param array $row Input row
* @param array $config Transformation configuration
* @return array Associative array: columnName => value
* @throws \RuntimeException if transformation type is not supported
*/
private function handleMultiOutputTransformation(array $row, array $config): array
{
@ -687,39 +687,39 @@ class ColumnTransformer
$transformType = $this->normalizeTransformType($config['type'] ?? '');
if (empty($outputs) || empty($sourceColumn) || empty($transformType)) {
throw new \RuntimeException("Multi-Output Transformation benötigt 'outputs', 'sourceColumn' und 'type'");
throw new \RuntimeException("Multi-output transformation requires 'outputs', 'sourceColumn' and 'type'");
}
$sourceValue = $row[$sourceColumn] ?? '';
if ($transformType !== 'split') {
throw new \RuntimeException("Multi-Output nur für 'split' unterstützt, gegeben: {$transformType}");
throw new \RuntimeException("Multi-output only supported for 'split', given: {$transformType}");
}
return $this->handleMultiOutputSplit($sourceValue, $outputs, $config);
}
/**
* Split-Transformation mit Multi-Output
* Teilt einen String und verteilt die Teile auf mehrere Spalten
* Split transformation with multi-output
* Splits a string and distributes the parts across multiple columns
*
* @param string $value Zu teilender String
* @param array $outputs Liste der Ziel-Spaltennamen
* @param array $config Transformation-Config
* @return array Assoziatives Array: columnName => value
* @param string $value String to split
* @param array $outputs List of target column names
* @param array $config Transformation configuration
* @return array Associative array: columnName => value
*/
private function handleMultiOutputSplit(string $value, array $outputs, array $config): array
{
$delimiter = $config['delimiter'] ?? ';';
// Führe Split durch
// Perform split
$parts = explode($delimiter, $value);
// Mappe Parts zu Output-Spalten
// Map parts to output columns
$result = [];
foreach ($outputs as $index => $columnName) {
// Wenn Teil existiert: verwenden (getrimmt) // Wenn nicht: leerer String
// If part exists: use it (trimmed) // If not: empty string
$result[$columnName] = isset($parts[$index]) ? trim($parts[$index]) : '';
}
@ -730,11 +730,11 @@ class ColumnTransformer
}
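The multi-output split shown above maps exploded parts onto the configured output columns by position. A standalone sketch follows; the function name is hypothetical, and the explicit guard against an empty delimiter is an addition not present in the diff.

```php
<?php

/**
 * Hypothetical standalone version of the multi-output split.
 *
 * @param list<string> $outputs Target column names, in part order
 * @return array<string, string>
 */
function multiOutputSplitSketch(string $value, array $outputs, string $delimiter = ';'): array
{
    if ($delimiter === '') {
        // Added guard: explode() throws a ValueError for an empty delimiter.
        throw new InvalidArgumentException('Delimiter must not be empty');
    }

    $parts = explode($delimiter, $value);

    $result = [];
    foreach (array_values($outputs) as $index => $columnName) {
        // Existing parts are trimmed; missing parts become empty strings.
        $result[$columnName] = isset($parts[$index]) ? trim($parts[$index]) : '';
    }

    return $result;
}
```

With the config example from the docblock, `multiOutputSplitSketch('John Doe', ['FirstName', 'LastName'], ' ')` produces `FirstName => John` and `LastName => Doe`; surplus parts beyond the listed outputs are dropped.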
/**
* Gibt die Anzahl der Output-Spalten zurueck
* Returns the number of output columns
*
* Zaehlt Original-Spalten plus neu generierte Spalten (z.B. bei regex_extract)
* Counts original columns plus newly generated columns (e.g. from regex_extract)
*
* @return int Anzahl Output-Spalten
* @return int Number of output columns
*/
public function getOutputColumns(): int
{

@ -3,7 +3,7 @@
namespace UbsCsvTransformer;
/**
* Lädt und validiert JSON-Konfigurationsdateien
* Loads and validates JSON configuration files
*/
class ConfigurationLoader
{
@ -16,19 +16,19 @@ class ConfigurationLoader
}
/**
* Lädt die Konfigurationsdatei
* Loads the configuration file
*
* @return array Die geladene und validierte Konfiguration
* @throws \RuntimeException wenn Datei nicht gefunden oder ungültig
* @return array The loaded and validated configuration
* @throws \RuntimeException if file not found or invalid
*/
public function load(): array
{
if (!file_exists($this->configFile)) {
throw new \RuntimeException("Konfigurationsdatei nicht gefunden: {$this->configFile}");
throw new \RuntimeException("Configuration file not found: {$this->configFile}");
}
if (pathinfo($this->configFile, PATHINFO_EXTENSION) !== 'json') {
throw new \RuntimeException("Konfigurationsdatei muss eine JSON-Datei sein: {$this->configFile}");
throw new \RuntimeException("Configuration file must be a JSON file: {$this->configFile}");
}
$this->config = $this->loadJson($this->configFile);
@ -38,96 +38,96 @@ class ConfigurationLoader
}
/**
* Lädt eine JSON-Datei
* Loads a JSON file
*
* @param string $file Pfad zur JSON-Datei
* @return array Geparste Konfiguration
* @param string $file Path to JSON file
* @return array Parsed configuration
*/
private function loadJson(string $file): array
{
$json = file_get_contents($file);
if ($json === false) {
throw new \RuntimeException("Konnte JSON-Datei nicht lesen: {$file}");
throw new \RuntimeException("Could not read JSON file: {$file}");
}
$config = json_decode($json, true);
if ($config === null && json_last_error() !== JSON_ERROR_NONE) {
throw new \RuntimeException("Ungültiges JSON: " . json_last_error_msg());
throw new \RuntimeException("Invalid JSON: " . json_last_error_msg());
}
return $config;
}
/**
* Validiert die geladene Konfiguration auf erforderliche Felder
* Validates the loaded configuration for required fields
*
* @throws \RuntimeException wenn erforderliche Felder fehlen
* @throws \RuntimeException if required fields are missing
*/
private function validate(): void
{
// Metadata erforderlich
// Metadata required
if (empty($this->config['metadata'])) {
throw new \RuntimeException("Konfiguration: 'metadata' Section erforderlich");
throw new \RuntimeException("Configuration: 'metadata' section required");
}
if (!isset($this->config['metadata']['extractionRules']) || !is_array($this->config['metadata']['extractionRules'])) {
throw new \RuntimeException("Konfiguration: 'metadata.extractionRules' erforderlich (kann leer sein: [])");
throw new \RuntimeException("Configuration: 'metadata.extractionRules' required (may be empty: [])");
}
// CSV-Struktur erforderlich
// CSV structure required
if (empty($this->config['csvStructure'])) {
throw new \RuntimeException("Konfiguration: 'csvStructure' Section erforderlich");
throw new \RuntimeException("Configuration: 'csvStructure' section required");
}
if (!isset($this->config['csvStructure']['headerLine'])) {
throw new \RuntimeException("Konfiguration: 'csvStructure.headerLine' erforderlich");
throw new \RuntimeException("Configuration: 'csvStructure.headerLine' required");
}
// Column Transformations erforderlich
// Column transformations required
if (empty($this->config['columnTransformations'])) {
throw new \RuntimeException("Konfiguration: 'columnTransformations' erforderlich");
throw new \RuntimeException("Configuration: 'columnTransformations' required");
}
// Directories validieren (wenn auto-import genutzt wird)
// Validate directories (if auto-import is used)
if (!empty($this->config['directories'])) {
foreach (['source', 'output', 'archive', 'error'] as $dir) {
if (empty($this->config['directories'][$dir])) {
throw new \RuntimeException("Konfiguration: 'directories.{$dir}' erforderlich für Auto-Import");
throw new \RuntimeException("Configuration: 'directories.{$dir}' required for auto-import");
}
}
}
// Validiere CSV-Struktur Werte
// Validate CSV structure values
$headerLine = $this->config['csvStructure']['headerLine'] ?? 1;
if (!is_int($headerLine) || $headerLine < 1) {
throw new \Exception(
'Konfiguration csvStructure.headerLine muss eine positive Ganzzahl sein'
'Configuration: csvStructure.headerLine must be a positive integer'
);
}
$delimiter = $this->config['csvStructure']['inputDelimiter'] ?? '';
if (strlen($delimiter) === 0) {
throw new \Exception(
'Konfiguration csvStructure.inputDelimiter darf nicht leer sein'
'Configuration: csvStructure.inputDelimiter must not be empty'
);
}
// Validiere Encoding
// Validate encoding
$encoding = $this->config['csvStructure']['encoding'] ?? 'UTF-8';
if (!in_array($encoding, ['UTF-8', 'ISO-8859-1', 'CP1252'])) {
throw new \Exception(
'Konfiguration csvStructure.encoding: ' . $encoding . ' nicht unterstützt'
'Configuration: csvStructure.encoding: ' . $encoding . ' not supported'
);
}
}
/**
* Gibt eine einzelne Konfigurationsoption zurück
* Returns a single configuration option
*
* @param string $key Dot-Notation Key (z.B. 'metadata.extractionRules')
* @param mixed $default Standardwert wenn Key nicht existiert
* @return mixed Der Konfigurationswert
* @param string $key Dot-notation key (e.g. 'metadata.extractionRules')
* @param mixed $default Default value if key does not exist
* @return mixed The configuration value
*/
public function get(string $key, mixed $default = null): mixed
{
@ -145,9 +145,9 @@ class ConfigurationLoader
}
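The body of `get()` is not included in this diff. As an illustration of dot-notation access, here is a minimal sketch that walks the nested array one key segment at a time; `configGetSketch` is a hypothetical name and the real method may be implemented differently.

```php
<?php

// Hypothetical sketch of dot-notation lookup in a nested config array.
function configGetSketch(array $config, string $key, mixed $default = null): mixed
{
    $current = $config;

    foreach (explode('.', $key) as $segment) {
        // Stop as soon as a segment is missing or the path leaves array territory.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }

    return $current;
}
```

`array_key_exists()` is used rather than `isset()` so that a key whose value is `null` still counts as present.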
/**
* Gibt die vollständige Konfiguration zurück
* Returns the complete configuration
*
* @return array Die komplette Konfiguration
* @return array The full configuration
*/
public function getAll(): array
{
@ -155,10 +155,10 @@ class ConfigurationLoader
}
/**
* Setzt einen Konfigurationswert (überschreibt bestehenden Wert)
* Sets a configuration value (overwrites existing value)
*
* @param string $key Dot-Notation Key (z.B. 'directories.output')
* @param mixed $value Neuer Wert
* @param string $key Dot-notation key (e.g. 'directories.output')
* @param mixed $value New value
* @return void
*/
public function set(string $key, mixed $value): void
@ -179,9 +179,9 @@ class ConfigurationLoader
}
/**
* Prüft ob ein Konfigurationsschlüssel existiert
* Checks whether a configuration key exists
*
* @param string $key Dot-Notation Key
* @param string $key Dot-notation key
* @return bool
*/
public function has(string $key): bool

@ -3,10 +3,10 @@
namespace UbsCsvTransformer;
/**
* Liest und parst CSV-Dateien
* Reads and parses CSV files
*
* Diese Klasse liest CSV-Dateien mit konfigurierbarem Delimiter
* und separiert Metadaten-Zeilen von den eigentlichen Datenzeilen.
* Reads CSV files with a configurable delimiter and separates
* metadata lines from the actual data rows.
*/
class CsvReader
{
@ -16,8 +16,8 @@ class CsvReader
private bool $hasBom;
/**
* @param string $filePath Pfad zur CSV-Datei
* @param array $csvStructure CSV-Struktur aus Konfiguration
* @param string $filePath Path to the CSV file
* @param array $csvStructure CSV structure from configuration
*/
public function __construct(string $filePath, array $csvStructure)
{
@ -28,25 +28,25 @@ class CsvReader
}
/**
* Liest alle Zeilen aus der Datei
* Reads all lines from the file
*
* @param int $maxLines Maximale Anzahl Zeilen (0 = alle)
* @return array Array mit Zeilen (ohne Newlines)
* @throws \RuntimeException wenn Datei nicht gelesen werden kann
* @param int $maxLines Maximum number of lines (0 = all)
* @return array Array of lines (without newlines)
* @throws \RuntimeException if file cannot be read
*/
public function readLines(int $maxLines = 0): array
{
if (!file_exists($this->filePath) || !is_readable($this->filePath)) {
throw new \RuntimeException("Konnte Datei nicht lesen: {$this->filePath}");
throw new \RuntimeException("Could not read file: {$this->filePath}");
}
$lines = file($this->filePath, FILE_IGNORE_NEW_LINES);
if ($lines === false) {
throw new \RuntimeException("Konnte Datei nicht lesen: {$this->filePath}");
throw new \RuntimeException("Could not read file: {$this->filePath}");
}
// BOM entfernen falls vorhanden
// Remove BOM if present
if ($this->hasBom && !empty($lines)) {
$lines[0] = $this->removeBom($lines[0]);
}
@ -59,9 +59,9 @@ class CsvReader
}
/**
* Liest die Metadaten-Zeilen (vor der Header-Zeile)
* Reads the metadata lines (before the header line)
*
* @return array Array mit Metadaten-Zeilen
* @return array Array of metadata lines
*/
public function readMetadataLines(): array
{
@ -75,28 +75,28 @@ class CsvReader
}
/**
* Liest die CSV-Daten mit Headers
* Reads CSV data with headers
*
* @param int $maxDataRows Maximale Anzahl Datenzeilen (0 = alle)
* @return array Array von assoziativen Arrays (mit Spalten-Namen als Keys)
* @throws \RuntimeException wenn Header-Zeile nicht gefunden
* @param int $maxDataRows Maximum number of data rows (0 = all)
* @return array Array of associative arrays (with column names as keys)
* @throws \RuntimeException if header line is not found
*/
public function readCsvData(int $maxDataRows = 0): array
{
$lines = $this->readLines();
if ($this->headerLine > count($lines)) {
throw new \RuntimeException("Header-Zeile {$this->headerLine} nicht gefunden in Datei mit " . count($lines) . " Zeilen");
throw new \RuntimeException("Header line {$this->headerLine} not found in file with " . count($lines) . " lines");
}
// Header parsen
// Parse header
$headerLineContent = $lines[$this->headerLine - 1];
$headers = str_getcsv($headerLineContent, $this->delimiter, '"', '\\');
$headers = array_map(static fn(?string $v): string => trim($v ?? ''), $headers);
// Datenzeilen parsen
// Parse data rows
$data = [];
$dataStartLine = $this->headerLine; // 0-basiert
$dataStartLine = $this->headerLine; // 0-based index of the first data row (headerLine is 1-based)
$lineCount = 0;
for ($i = $dataStartLine; $i < count($lines); $i++) {
@ -106,7 +106,7 @@ class CsvReader
$lineContent = $lines[$i];
// Leere Zeilen überspringen
// Skip empty lines
if (trim($lineContent) === '') {
continue;
}
@ -114,7 +114,7 @@ class CsvReader
$row = str_getcsv($lineContent, $this->delimiter, '"', '\\');
$row = array_map(static fn(?string $v): string => trim($v ?? ''), $row);
// Zeile mit Header-Keys kombinieren
// Combine row with header keys
$rowData = [];
foreach ($headers as $index => $header) {
$rowData[$header] = $row[$index] ?? '';
@ -128,17 +128,17 @@ class CsvReader
}
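The parsing flow of `readCsvData()` (1-based header line, blank lines skipped, missing cells padded with empty strings) can be tried out on an in-memory array of lines. This sketch follows the visible parts of the method; the function name is hypothetical and the `maxDataRows` limit is left out.

```php
<?php

/**
 * Hypothetical in-memory variant of the header/data parsing.
 *
 * @param list<string> $lines      File content, one entry per line
 * @param int          $headerLine 1-based line number of the header
 * @return list<array<string, string>>
 */
function parseCsvLinesSketch(array $lines, int $headerLine, string $delimiter): array
{
    if ($headerLine < 1 || $headerLine > count($lines)) {
        throw new RuntimeException("Header line {$headerLine} not found");
    }

    $clean = static fn(?string $v): string => trim($v ?? '');
    $headers = array_map($clean, str_getcsv($lines[$headerLine - 1], $delimiter, '"', '\\'));

    $data = [];
    // The 1-based header line number equals the 0-based index of the first data row.
    for ($i = $headerLine; $i < count($lines); $i++) {
        if (trim($lines[$i]) === '') {
            continue; // skip empty lines
        }

        $row = array_map($clean, str_getcsv($lines[$i], $delimiter, '"', '\\'));

        $rowData = [];
        foreach ($headers as $index => $header) {
            $rowData[$header] = $row[$index] ?? '';
        }
        $data[] = $rowData;
    }

    return $data;
}
```

Lines before the header (metadata) are ignored here, which matches the separation between `readMetadataLines()` and `readCsvData()` described in the class docblock.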
/**
* Gibt die Spalten-Header zurück
* Returns the column headers
*
* @return array Array mit Spalten-Namen
* @throws \RuntimeException wenn Header-Zeile nicht gefunden
* @return array Array of column names
* @throws \RuntimeException if header line is not found
*/
public function getHeaders(): array
{
$lines = $this->readLines();
if ($this->headerLine > count($lines)) {
throw new \RuntimeException("Header-Zeile {$this->headerLine} nicht gefunden");
throw new \RuntimeException("Header line {$this->headerLine} not found");
}
$headerLineContent = $lines[$this->headerLine - 1];
@ -148,10 +148,10 @@ class CsvReader
}
/**
* Entfernt UTF-8 BOM (Byte Order Mark) von String
* Removes UTF-8 BOM (Byte Order Mark) from string
*
* @param string $text String mit potenziellem BOM
* @return string String ohne BOM
* @param string $text String with potential BOM
* @return string String without BOM
*/
private function removeBom(string $text): string
{
@ -162,9 +162,9 @@ class CsvReader
}
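The body of `removeBom()` is not shown in the diff. A UTF-8 BOM is the three-byte sequence `EF BB BF`, so a minimal sketch (hypothetical function name, requires PHP 8 for `str_starts_with()`) could be:

```php
<?php

// Hypothetical sketch: strip a leading UTF-8 byte order mark, if present.
function removeBomSketch(string $text): string
{
    $bom = "\xEF\xBB\xBF";

    return str_starts_with($text, $bom) ? substr($text, strlen($bom)) : $text;
}
```

Only the first line of a file needs this treatment, which is why `readLines()` applies it to `$lines[0]`.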
/**
* Gibt die Gesamtzahl der Zeilen in der Datei zurück
* Returns the total number of lines in the file
*
* @return int Anzahl Zeilen
* @return int Number of lines
*/
public function countLines(): int
{
@ -172,9 +172,9 @@ class CsvReader
}
/**
* Gibt die Anzahl der Datenzeilen zurück (ohne Header und Metadaten)
* Returns the number of data rows (excluding header and metadata)
*
* @return int Anzahl Datenzeilen
* @return int Number of data rows
*/
public function countDataRows(): int
{

@ -3,10 +3,9 @@
namespace UbsCsvTransformer;
/**
* Schreibt transformierte Daten in CSV-Datei
* Writes transformed data to a CSV file
*
* Diese Klasse schreibt die transformierten Daten in eine
* Firefly III-kompatible CSV-Datei.
* Writes transformed data into a Firefly III-compatible CSV file.
*/
class CsvWriter
{
@ -14,8 +13,8 @@ class CsvWriter
private string $delimiter;
/**
* @param string $outputFile Pfad zur Output-Datei
* @param array $csvStructure CSV-Struktur aus Konfiguration
* @param string $outputFile Path to the output file
* @param array $csvStructure CSV structure from configuration
*/
public function __construct(string $outputFile, array $csvStructure = [])
{
@ -24,39 +23,39 @@ class CsvWriter
}
/**
* Schreibt Daten in CSV-Datei
* Writes data to a CSV file
*
* @param array $data Array von assoziativen Arrays (Zeilen)
* @throws \RuntimeException wenn Datei nicht geschrieben werden kann
* @param array $data Array of associative arrays (rows)
* @throws \RuntimeException if file cannot be written
*/
public function write(array $data): void
{
if (empty($data)) {
throw new \RuntimeException("Keine Daten zum Schreiben");
throw new \RuntimeException("No data to write");
}
// Output-Verzeichnis erstellen falls nicht vorhanden
// Create output directory if it does not exist
$dir = dirname($this->outputFile);
if (!is_dir($dir)) {
if (!mkdir($dir, 0755, true)) {
throw new \RuntimeException("Konnte Output-Verzeichnis nicht erstellen: {$dir}");
throw new \RuntimeException("Could not create output directory: {$dir}");
}
}
$fp = fopen($this->outputFile, 'w');
if ($fp === false) {
throw new \RuntimeException("Konnte Output-Datei nicht erstellen: {$this->outputFile}");
throw new \RuntimeException("Could not create output file: {$this->outputFile}");
}
try {
// Headers schreiben (Spalten-Namen aus erster Zeile)
// Write headers (column names from first row)
$headers = array_keys($data[0]);
$this->writeCsvLine($fp, $headers);
// Datenzeilen schreiben
// Write data rows
foreach ($data as $row) {
// Sicherstellen dass alle Spalten vorhanden sind
// Ensure all columns are present
$values = [];
foreach ($headers as $header) {
$values[] = $row[$header] ?? '';
@ -70,25 +69,25 @@ class CsvWriter
}
/**
* Schreibt eine CSV-Zeile mit fputcsv
* Writes a CSV line using fputcsv
*
* @param resource $fp File-Handle
* @param array $values Array mit Werten
* @throws \RuntimeException wenn Schreiben fehlschlägt
* @param resource $fp File handle
* @param array $values Array of values
* @throws \RuntimeException if writing fails
*/
private function writeCsvLine($fp, array $values): void
{
$result = fputcsv($fp, $values, $this->delimiter, '"', '\\');
if ($result === false) {
throw new \RuntimeException("Fehler beim Schreiben der CSV-Zeile");
throw new \RuntimeException("Error writing CSV row");
}
}
/**
* Gibt den Pfad zur Output-Datei zurück
* Returns the path to the output file
*
* @return string Output-Dateipfad
* @return string Output file path
*/
public function getOutputFile(): string
{
@ -96,9 +95,9 @@ class CsvWriter
}
/**
* Prüft ob Output-Datei erstellt wurde
* Checks whether the output file was created
*
* @return bool True wenn Datei existiert
* @return bool True if file exists
*/
public function fileExists(): bool
{
@ -106,9 +105,9 @@ class CsvWriter
}
/**
* Gibt die Größe der Output-Datei zurück
* Returns the size of the output file
*
* @return int|false Dateigröße in Bytes oder false bei Fehler
* @return int|false File size in bytes or false on error
*/
public function getFileSize(): int|false
{

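The header/row handling in `write()` can be tried in isolation with an in-memory stream. The sketch below assumes the same approach as the hunk above (header taken from the first row, missing columns filled with an empty string). `rowsToCsv` is a hypothetical helper that returns the CSV text instead of writing a file.

```php
<?php

/**
 * Hypothetical helper: renders rows as CSV text. The header comes from the
 * first row; columns missing in later rows are written as empty strings.
 *
 * @param array<int, array<string, string>> $rows
 */
function rowsToCsv(array $rows, string $delimiter = ','): string
{
    if ($rows === []) {
        throw new RuntimeException('No data to write');
    }

    $fp = fopen('php://memory', 'w+');
    if ($fp === false) {
        throw new RuntimeException('Could not open in-memory stream');
    }

    $headers = array_keys($rows[0]);
    fputcsv($fp, $headers, $delimiter, '"', '\\');

    foreach ($rows as $row) {
        $values = [];
        foreach ($headers as $header) {
            // Keep the column order of the header row
            $values[] = $row[$header] ?? '';
        }
        fputcsv($fp, $values, $delimiter, '"', '\\');
    }

    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);

    return $csv === false ? '' : $csv;
}
```

Keys that appear only in later rows are dropped silently, because the header is fixed by the first row.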
View File

@ -3,42 +3,42 @@
namespace UbsCsvTransformer;
/**
* Zentraler Debug-Logger für Transparenz
* Central debug logger for transparency
*
* Sammelt Debug-Informationen aus allen Komponenten und macht die
* Verarbeitung nachvollziehbar. Ermöglicht Transparenz über alle
* Verarbeitungsschritte: Metadaten-Extraktion, Transformationen,
* CSV-Lesevorgänge etc.
* Collects debug information from all components so that
* processing is traceable across every step:
* metadata extraction, transformations,
* CSV reads, etc.
*
* Verwendung:
* - DebugLogger::enable() Debug-Modus aktivieren
* - DebugLogger::log('category', 'message', $data) Nachricht loggen
* - DebugLogger::getLogs() Alle Logs abrufen
* - DebugLogger::reset() Logs zurücksetzen
* Usage:
* - DebugLogger::enable(): activate debug mode
* - DebugLogger::log('category', 'message', $data): log a message
* - DebugLogger::getLogs(): retrieve all logs
* - DebugLogger::reset(): reset logs
*
* Beispiel:
* Example:
* ```php
* DebugLogger::enable();
* DebugLogger::log('metadata', 'IBAN extrahiert', ['iban' => 'CH9300762011623852957']);
* DebugLogger::log('metadata', 'IBAN extracted', ['iban' => 'CH9300762011623852957']);
* $logs = DebugLogger::getLogs();
* ```
*/
class DebugLogger
{
/**
* @var bool Ist Debug-Modus aktiviert?
* @var bool Whether debug mode is enabled
*/
private static bool $enabled = false;
/**
* @var array Gesammelte Logs mit Timestamp, Kategorie, Nachricht und Daten
* @var array Collected logs with timestamp, category, message and data
*/
private static array $logs = [];
/**
* Aktiviert den Debug-Modus
* Enables debug mode
*
* Nach Aktivierung werden alle DebugLogger::log() Aufrufe protokolliert.
* Once enabled, all DebugLogger::log() calls are recorded.
*
* @return void
*/
@ -48,9 +48,9 @@ class DebugLogger
}
/**
* Deaktiviert den Debug-Modus
* Disables debug mode
*
* Nach Deaktivierung werden DebugLogger::log() Aufrufe ignoriert.
* Once disabled, DebugLogger::log() calls are ignored.
*
* @return void
*/
@ -60,16 +60,16 @@ class DebugLogger
}
/**
* Protokolliert eine Debug-Nachricht
* Records a debug message
*
* Sammelt Informationen über jeden Verarbeitungsschritt mit Timestamp,
* Kategorie, Nachricht und optionalen Daten. Die Logs werden nur
* gesammelt, wenn der Debug-Modus aktiviert ist.
* Collects information about each processing step with timestamp,
* category, message and optional data. Logs are only collected
* when debug mode is enabled.
*
* @param string $category Kategorie der Log-Nachricht
* z.B. 'metadata', 'transformation', 'csv_reader', 'config'
* @param string $message Beschreibung der Aktion oder des Ereignisses
* @param mixed $data Zusätzliche Kontextdaten (Array oder beliebiger Wert)
* @param string $category Log message category
* e.g. 'metadata', 'transformation', 'csv_reader', 'config'
* @param string $message Description of the action or event
* @param mixed $data Additional context data (array or any value)
*
* @return void
*/
@ -88,16 +88,16 @@ class DebugLogger
}
/**
* Gibt alle gesammelten Logs zurück
* Returns all collected logs
*
* Liefert ein Array aller protokollierten Ereignisse mit vollständigen
* Informationen für Analyse und Debugging.
* Returns an array of all recorded events with their full
* details for analysis and debugging.
*
* @return array Array von Log-Einträgen, jeder mit:
* - timestamp: Mikrosekunden-Zeitstempel
* - category: Log-Kategorie
* - message: Beschreibung
* - data: Zusätzliche Daten
* @return array Array of log entries, each with:
* - timestamp: microsecond timestamp
* - category: log category
* - message: description
* - data: additional data
*/
public static function getLogs(): array
{
@ -105,10 +105,10 @@ class DebugLogger
}
/**
* Setzt alle Logs zurück
* Resets all logs
*
* Löscht den gesamten Log-Buffer. Nützlich um zwischen mehreren
* Transformationen einen sauberen State zu haben.
* Clears the entire log buffer. Useful for maintaining a clean
* state between multiple transformations.
*
* @return void
*/
@ -118,9 +118,9 @@ class DebugLogger
}
/**
* Gibt die Anzahl der gesammelten Log-Einträge zurück
* Returns the number of collected log entries
*
* @return int Anzahl protokollierter Ereignisse
* @return int Number of recorded events
*/
public static function count(): int
{
@ -128,9 +128,9 @@ class DebugLogger
}
/**
* Prüft ob Debug-Modus aktiviert ist
* Checks whether debug mode is enabled
*
* @return bool true wenn aktiviert, false sonst
* @return bool true if enabled, false otherwise
*/
public static function isEnabled(): bool
{
@ -138,18 +138,18 @@ class DebugLogger
}
/**
* Gibt einen formattierten String aller Logs zurück
* Returns a formatted string of all logs
*
* Konvertiert den Log-Buffer in ein lesbares Format für Konsolen-Ausgabe.
* Converts the log buffer into a readable format for console output.
*
* @param bool $includeData true = auch Daten ausgeben, false = nur Messages
* @param bool $includeData true = also output data, false = messages only
*
* @return string Formatierte Log-Ausgabe
* @return string Formatted log output
*/
public static function format(bool $includeData = true): string
{
if (empty(self::$logs)) {
return "Keine Debug-Logs vorhanden.\n";
return "No debug logs available.\n";
}
$output = "\n=== DEBUG LOGS ===\n";

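The enable-gated buffering described in the docblocks above can be sketched as follows. `MiniDebugLog` is a hypothetical stand-in that assumes the entry layout named in the `getLogs()` docblock (timestamp, category, message, data). It is not the project's `DebugLogger`.

```php
<?php

/**
 * Hypothetical stand-in for the logger: collects entries only while enabled.
 */
final class MiniDebugLog
{
    private static bool $enabled = false;

    /** @var array<int, array<string, mixed>> */
    private static array $logs = [];

    public static function enable(): void
    {
        self::$enabled = true;
    }

    public static function log(string $category, string $message, mixed $data = null): void
    {
        if (!self::$enabled) {
            return; // calls are ignored while debug mode is off
        }

        self::$logs[] = [
            'timestamp' => microtime(true),
            'category' => $category,
            'message' => $message,
            'data' => $data,
        ];
    }

    /** @return array<int, array<string, mixed>> */
    public static function getLogs(): array
    {
        return self::$logs;
    }
}
```

Static state keeps call sites short, but it is shared across the whole process. That is presumably why the real class offers `reset()` between transformations.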
View File

@ -5,171 +5,247 @@ namespace UbsCsvTransformer;
/**
* Firefly III Data Importer Integration
*
* Diese Klasse integriert den Firefly III Data Importer.
* Der Import erfolgt über die offizielle Firefly III Data Importer CLI.
* This class integrates the Firefly III Data Importer (v2.x / configuration format v3).
* Three operating modes are supported (field "mode" in the configuration):
*
* SETUP-VORAUSSETZUNGEN:
* ----------------------
* ═══════════════════════════════════════════════════════════════════════════════
* MODE 1: "cli" - Transformer and Firefly instance on the same server
* ═══════════════════════════════════════════════════════════════════════════════
* The transformer calls the Firefly III Data Importer directly via the command line.
* The importer must be installed locally (e.g. as a standalone installation).
*
* 1. Firefly III Data Importer installiert und konfiguriert
* - Docker: firefly/data-importer:latest
* - Oder: Standalone Installation
* "fireflyImport": {
* "mode": "cli",
* "jsonConfig": "/opt/firefly-data-importer/storage/configurations/ubs-import.json",
* "importerCommand": "php /opt/firefly-data-importer/artisan importer:import",
* "autoImport": true,
* "deleteAfterImport": false,
* "timeout": 300,
* "environment": {
* "FIREFLY_III_URL": "https://localhost",
* "FIREFLY_III_ACCESS_TOKEN": "your-token-here"
* }
* }
*
* 2. Import-Konfiguration erstellt (config.json):
* - In Firefly III Web-UI: Import Configure
* - CSV-Format konfigurieren
* - JSON-Konfiguration herunterladen
* - Speichern als z.B.: /opt/firefly/configs/ubs-import.json
* ═══════════════════════════════════════════════════════════════════════════════
* MODE 2: "docker" - Transformer local or in Docker, Firefly in Docker container
* ═══════════════════════════════════════════════════════════════════════════════
* The transformer calls the importer via "docker exec" in the running container.
* The importer container must have the transformer's output directory mounted as a volume
* so the importer can read the CSV file.
*
* 3. Umgebungsvariablen für Firefly Data Importer:
* - FIREFLY_III_URL=https://your-firefly-instance.com
* - FIREFLY_III_ACCESS_TOKEN=<personal_access_token>
* - VANITY_URL (optional)
* IMPORTANT: "jsonConfig" is the path inside the container (not a local path).
* The file must be placed in the container via volume mount or "docker cp".
* "-it" flags in "docker exec" must be omitted (no TTY).
*
* INTEGRATION IN config.yaml:
* ---------------------------
* Example docker-compose.yml:
* volumes:
* - /opt/ubs-csv-transformer/import:/import # visible as /import inside the container
*
* fireflyImport:
* # Pfad zur JSON-Konfiguration (aus Firefly III exportiert)
* jsonConfig: '/opt/firefly/configs/ubs-import.json'
* "fireflyImport": {
* "mode": "docker",
* "jsonConfig": "/import/configs/ubs-import.json",
* "importerCommand": "docker exec firefly-importer php artisan importer:import",
* "autoImport": true,
* "deleteAfterImport": false,
* "timeout": 300
* }
*
* # Firefly Data Importer Kommando
* # Option 1: Docker
* importerCommand: 'docker exec -it firefly-importer php artisan importer:import'
* ═══════════════════════════════════════════════════════════════════════════════
* MODE 3: "http" - Transformer local, Firefly importer on remote server (HTTP/S)
* ═══════════════════════════════════════════════════════════════════════════════
* The transformer uploads CSV and JSON configuration via HTTP multipart upload to the
* Firefly III Data Importer. The importer must have these environment variables set:
* CAN_POST_FILES=true (allows file upload via API)
* AUTO_IMPORT_SECRET=<secret> (at least 16 characters, must match "importerSecret")
*
* # Option 2: Standalone
* # importerCommand: 'cd /opt/firefly-data-importer && php artisan importer:import'
* HTTP endpoint: POST {importerUrl}/autoupload
* Fields: secret (string), json (JSON config file), importable (CSV file)
*
* # Automatisch nach Transformation importieren?
* autoImport: true
* Local requirement: PHP extension ext-curl must be available.
*
* # Output-Datei nach erfolgreichem Import löschen?
* deleteAfterImport: true
* "fireflyImport": {
* "mode": "http",
* "importerUrl": "https://importer.your-server.com",
* "importerSecret": "your-auto-import-secret-min-16-chars",
* "jsonConfig": "/local/path/to/ubs-import.json",
* "autoImport": true,
* "deleteAfterImport": false,
* "timeout": 300
* }
*
* # Timeout für Import (Sekunden)
* timeout: 300
* ═══════════════════════════════════════════════════════════════════════════════
* COMMON SETUP REQUIREMENTS (all modes)
* ═══════════════════════════════════════════════════════════════════════════════
*
* # Environment-Variablen für Firefly Data Importer
* environment:
* FIREFLY_III_URL: 'https://your-firefly.com'
* FIREFLY_III_ACCESS_TOKEN: 'your-token-here'
* Firefly III Data Importer JSON configuration file:
* - Describes the column mapping of the transformed CSV to Firefly III transaction fields
* (configuration format version 3).
* - Creation: configure a CSV once in the Firefly III Data Importer web UI,
* then download the configuration.
* - Alternative: use config/firefly-import-config.example.json as a template and
* adjust "default_account" to the desired Firefly III asset account ID.
*
* VERWENDUNG:
* -----------
*
* // Automatisch beim Auto-Import
* ./bin/transformer auto-import config/config.yaml
*
* // Oder manuell nach Transformation
* USAGE in code:
* $importer = new FireflyImporter($config['fireflyImport']);
* $result = $importer->import('/path/to/transformed.csv');
*/
class FireflyImporter
{
/** @var array<string, mixed> */
private array $config;
private string $mode;
private string $jsonConfigPath;
private string $importerCommand;
private string $importerUrl;
private string $importerSecret;
private bool $deleteAfterImport;
private int $timeout;
/** @var array<string, string> */
private array $environment;
/**
* @param array $config Firefly Import-Konfiguration aus config.yaml
* @throws \RuntimeException wenn Konfiguration ungültig
* @param array<string, mixed> $config Firefly import configuration
* @throws \RuntimeException if configuration is invalid
*/
public function __construct(array $config)
{
$this->config = $config;
// JSON-Konfigurationspfad validieren
$this->jsonConfigPath = $config['jsonConfig'] ?? '';
// Determine operating mode (default: "cli")
$this->mode = (string) ($config['mode'] ?? 'cli');
if (!in_array($this->mode, ['cli', 'docker', 'http'], true)) {
throw new \RuntimeException(
"Firefly Import: Invalid mode '{$this->mode}'. Allowed: cli, docker, http"
);
}
// JSON config path (a local path in cli and http modes; a path inside the container in docker mode)
$this->jsonConfigPath = (string) ($config['jsonConfig'] ?? '');
if (empty($this->jsonConfigPath)) {
throw new \RuntimeException("Firefly Import: 'jsonConfig' nicht konfiguriert");
throw new \RuntimeException("Firefly Import: 'jsonConfig' not configured");
}
if (!file_exists($this->jsonConfigPath)) {
throw new \RuntimeException("Firefly JSON-Konfiguration nicht gefunden: {$this->jsonConfigPath}");
// For cli and http: local file must exist
// For docker: path is inside the container (no local file_exists() check)
if ($this->mode !== 'docker' && !file_exists($this->jsonConfigPath)) {
throw new \RuntimeException("Firefly JSON configuration not found: {$this->jsonConfigPath}");
}
// Importer-Kommando
$this->importerCommand = $config['importerCommand'] ?? '';
// Validate mode-specific fields
if ($this->mode === 'http') {
if (!extension_loaded('curl')) {
throw new \RuntimeException(
"Firefly Import (mode 'http'): PHP extension ext-curl required"
);
}
$this->importerUrl = (string) ($config['importerUrl'] ?? '');
if (empty($this->importerUrl)) {
throw new \RuntimeException("Firefly Import: 'importerUrl' not configured (mode: http)");
}
$this->importerSecret = (string) ($config['importerSecret'] ?? '');
if (empty($this->importerSecret)) {
throw new \RuntimeException("Firefly Import: 'importerSecret' not configured (mode: http)");
}
$this->importerCommand = '';
} else {
$this->importerCommand = (string) ($config['importerCommand'] ?? '');
if (empty($this->importerCommand)) {
throw new \RuntimeException("Firefly Import: 'importerCommand' nicht konfiguriert");
throw new \RuntimeException(
"Firefly Import: 'importerCommand' not configured (mode: {$this->mode})"
);
}
$this->importerUrl = '';
$this->importerSecret = '';
}
// Optionale Einstellungen
$this->deleteAfterImport = $config['deleteAfterImport'] ?? false;
$this->environment = $config['environment'] ?? [];
// Common optional fields
$this->deleteAfterImport = (bool) ($config['deleteAfterImport'] ?? false);
$this->timeout = (int) ($config['timeout'] ?? 300);
/** @var array<string, string> $env */
$env = $config['environment'] ?? [];
$this->environment = $env;
}
/**
* Importiert eine transformierte CSV-Datei in Firefly III
* Imports a transformed CSV file into Firefly III
*
* Der Import erfolgt über den Firefly III Data Importer CLI:
* php artisan importer:import <csv_file> <config_file>
* Automatically selects the import method based on the configured mode:
* - cli/docker: calls the importer via command line (proc_open)
* - http: HTTP multipart upload to the importer endpoint (/autoupload)
*
* @param string $csvFile Pfad zur transformierten CSV-Datei
* @return array Import-Ergebnis mit Status und Ausgabe
* @param string $csvFile Path to the transformed CSV file
* @return array<string, mixed> Import result with status and output
*/
public function import(string $csvFile): array
{
if (!file_exists($csvFile)) {
return [
'success' => false,
'error' => "CSV-Datei nicht gefunden: {$csvFile}",
'output' => '',
'exit_code' => -1
'error' => "CSV file not found: {$csvFile}",
'output' => ['stdout' => '', 'stderr' => ''],
'exit_code' => -1,
];
}
// Kommando zusammenbauen
$command = $this->buildImportCommand($csvFile);
if ($this->mode === 'http') {
return $this->importViaHttp($csvFile);
}
// Environment-Variablen setzen
return $this->importViaCli($csvFile);
}
/**
* Import via command line (modes: cli, docker)
*
* Builds the command and executes it via proc_open.
* For docker, the CSV output directory must be mounted as a volume in the container.
*
* @param string $csvFile Path to the CSV file
* @return array<string, mixed> Import result
*/
private function importViaCli(string $csvFile): array
{
$command = $this->buildImportCommand($csvFile);
$env = $this->buildEnvironment();
// Import ausführen
/** @var array<string, string> $output */
$output = [];
$exitCode = 0;
$startTime = microtime(true);
try {
// Kommando ausführen mit Timeout
$descriptors = [
0 => ["pipe", "r"], // stdin
1 => ["pipe", "w"], // stdout
2 => ["pipe", "w"] // stderr
0 => ['pipe', 'r'],
1 => ['pipe', 'w'],
2 => ['pipe', 'w'],
];
$process = proc_open($command, $descriptors, $pipes, null, $env);
if (!is_resource($process)) {
throw new \RuntimeException("Konnte Import-Prozess nicht starten");
throw new \RuntimeException('Could not start import process');
}
// stdin schließen
fclose($pipes[0]);
// stdout und stderr lesen
$stdout = stream_get_contents($pipes[1]);
$stderr = stream_get_contents($pipes[2]);
fclose($pipes[1]);
fclose($pipes[2]);
// Auf Prozess-Ende warten
$exitCode = proc_close($process);
$output = [
'stdout' => $stdout,
'stderr' => $stderr
'stdout' => is_string($stdout) ? $stdout : '',
'stderr' => is_string($stderr) ? $stderr : '',
];
$duration = microtime(true) - $startTime;
$success = ($exitCode === 0);
// Bei Erfolg: Optional CSV-Datei löschen
if ($success && $this->deleteAfterImport) {
@unlink($csvFile);
}
@ -181,29 +257,100 @@ class FireflyImporter
'duration' => round($duration, 2),
'csv_file' => $csvFile,
'config_file' => $this->jsonConfigPath,
'deleted' => ($success && $this->deleteAfterImport)
'deleted' => ($success && $this->deleteAfterImport),
];
} catch (\Exception $e) {
return [
'success' => false,
'error' => $e->getMessage(),
'output' => $output,
'exit_code' => $exitCode
'exit_code' => $exitCode,
];
}
}
/**
* Baut das Import-Kommando zusammen
* Import via HTTP multipart upload (mode: http)
*
* @param string $csvFile Pfad zur CSV-Datei
* @return string Vollständiges Kommando
* Sends CSV file and JSON configuration to POST {importerUrl}/autoupload.
* The importer must have CAN_POST_FILES=true and AUTO_IMPORT_SECRET set.
*
* @param string $csvFile Path to the CSV file
* @return array<string, mixed> Import result
*/
private function importViaHttp(string $csvFile): array
{
$url = rtrim($this->importerUrl, '/') . '/autoupload';
$ch = curl_init();
if ($ch === false) {
return [
'success' => false,
'error' => 'Could not initialise cURL',
'output' => ['stdout' => '', 'stderr' => ''],
'exit_code' => -1,
];
}
$postFields = [
'secret' => $this->importerSecret,
'json' => new \CURLFile($this->jsonConfigPath),
'importable' => new \CURLFile($csvFile),
];
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
$startTime = microtime(true);
$response = curl_exec($ch);
$httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
$curlError = curl_error($ch);
curl_close($ch);
$duration = microtime(true) - $startTime;
$responseBody = is_string($response) ? $response : '';
if ($curlError !== '') {
return [
'success' => false,
'error' => "cURL error: {$curlError}",
'output' => ['stdout' => '', 'stderr' => $curlError],
'exit_code' => -1,
'duration' => round($duration, 2),
];
}
$success = ($httpCode === 200);
if ($success && $this->deleteAfterImport) {
@unlink($csvFile);
}
return [
'success' => $success,
'exit_code' => $httpCode,
'output' => ['stdout' => $responseBody, 'stderr' => ''],
'duration' => round($duration, 2),
'csv_file' => $csvFile,
'config_file' => $this->jsonConfigPath,
'deleted' => ($success && $this->deleteAfterImport),
];
}
/**
* Builds the CLI import command (modes: cli, docker)
*
* Firefly Data Importer CLI format:
* <importerCommand> <csv_file> <config_file>
*
* @param string $csvFile Path to the CSV file
* @return string Complete command
*/
private function buildImportCommand(string $csvFile): string
{
// Firefly Data Importer CLI-Format:
// php artisan importer:import <csv_file> <config_file>
return sprintf(
'%s %s %s',
$this->importerCommand,
@ -213,9 +360,12 @@ class FireflyImporter
}
/**
* Baut Environment-Variablen zusammen
* Builds environment variables (modes: cli, docker)
*
* @return array|null Environment-Variablen oder null
* Takes the current process environment and extends it with the
* variables defined in the configuration (e.g. FIREFLY_III_URL).
*
* @return array<string, string>|null Environment variables or null (no changes)
*/
private function buildEnvironment(): ?array
{
@ -223,7 +373,7 @@ class FireflyImporter
return null;
}
// Aktuelle Environment übernehmen und mit Custom-Vars erweitern
/** @var array<string, string> $env */
$env = $_ENV;
foreach ($this->environment as $key => $value) {
@ -234,35 +384,94 @@ class FireflyImporter
}
/**
* Testet die Firefly-Verbindung
* Tests the connection to the Firefly III Data Importer
*
* @return array Test-Ergebnis
* - cli/docker: checks whether the importer command is reachable (--version)
* - http: sends a GET request to {importerUrl}/health
*
* @return array<string, mixed> Test result
*/
public function testConnection(): array
{
// Test ob Importer-Kommando verfügbar ist
if ($this->mode === 'http') {
return $this->testConnectionHttp();
}
return $this->testConnectionCli();
}
/**
* Connection test for CLI/Docker mode
*
* @return array<string, mixed>
*/
private function testConnectionCli(): array
{
$testCommand = str_replace('importer:import', '--version', $this->importerCommand);
/** @var array<int, string> $output */
$output = [];
$exitCode = 0;
exec($testCommand . ' 2>&1', $output, $exitCode);
return [
'available' => ($exitCode === 0),
'output' => implode("\n", $output),
'exit_code' => $exitCode
'exit_code' => $exitCode,
];
}
/**
* Validiert die JSON-Konfiguration
* Connection test for HTTP mode (GET {importerUrl}/health)
*
* @return array Validierungsergebnis
* @return array<string, mixed>
*/
private function testConnectionHttp(): array
{
$url = rtrim($this->importerUrl, '/') . '/health';
$ch = curl_init();
if ($ch === false) {
return ['available' => false, 'output' => 'Could not initialise cURL', 'exit_code' => -1];
}
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$response = curl_exec($ch);
$httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
$curlError = curl_error($ch);
curl_close($ch);
return [
'available' => ($curlError === '' && $httpCode === 200),
'output' => $curlError !== '' ? $curlError : (is_string($response) ? $response : ''),
'exit_code' => $httpCode,
];
}
/**
* Validates the Firefly III Data Importer JSON configuration file
*
* In docker mode, no local file_exists() check is performed
* because the path is located inside the container.
*
* @return array<string, mixed> Validation result
*/
public function validateConfig(): array
{
if (!file_exists($this->jsonConfigPath)) {
if ($this->mode !== 'docker' && !file_exists($this->jsonConfigPath)) {
return [
'valid' => false,
'error' => 'JSON-Konfiguration nicht gefunden'
'error' => 'JSON configuration not found (local path: ' . $this->jsonConfigPath . ')'
];
}
if ($this->mode === 'docker') {
return [
'valid' => true,
'notice' => 'Docker mode: the configuration file is inside the container; local check skipped.',
];
}
@ -270,7 +479,7 @@ class FireflyImporter
if ($json === false) {
return [
'valid' => false,
'error' => 'Konfigurationsdatei nicht lesbar'
'error' => 'Configuration file not readable'
];
}
$config = json_decode($json, true);
@ -278,12 +487,23 @@ class FireflyImporter
if ($config === null) {
return [
'valid' => false,
'error' => 'Ungültiges JSON: ' . json_last_error_msg()
'error' => 'Invalid JSON: ' . json_last_error_msg()
];
}
// Prüfe erforderliche Felder in Firefly-Config
$requiredFields = ['file_type', 'import_account'];
// Check configuration format v3
if (($config['version'] ?? null) !== 3) {
return [
'valid' => false,
'error' => "Invalid configuration format: 'version' must be 3 " .
"(currently: " . ($config['version'] ?? 'not set') . "). " .
"Create the configuration file using the Firefly III Data Importer web UI " .
"or use config/firefly-import-config.example.json as a template."
];
}
// Check required fields for CSV import
$requiredFields = ['flow', 'roles', 'default_account'];
$missingFields = [];
foreach ($requiredFields as $field) {
@ -295,7 +515,21 @@ class FireflyImporter
if (!empty($missingFields)) {
return [
'valid' => false,
'error' => 'Fehlende Felder: ' . implode(', ', $missingFields)
'error' => 'Missing required fields: ' . implode(', ', $missingFields)
];
}
if ($config['flow'] !== 'csv') {
return [
'valid' => false,
'error' => "'flow' must be 'csv' for CSV import (currently: '{$config['flow']}')"
];
}
if (!is_array($config['roles']) || empty($config['roles'])) {
return [
'valid' => false,
'error' => "'roles' must be a non-empty array of column mappings"
];
}
@ -306,9 +540,19 @@ class FireflyImporter
}
/**
* Gibt die Konfiguration zurück
* Returns the active operating mode
*
* @return array Firefly Import-Konfiguration
* @return string cli|docker|http
*/
public function getMode(): string
{
return $this->mode;
}
/**
* Returns the configuration
*
* @return array<string, mixed> Firefly import configuration
*/
public function getConfig(): array
{

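`buildEnvironment()` starts from `$_ENV`, which PHP only populates when the `variables_order` ini setting contains `E`. The sketch below shows the same merge with the base environment passed in explicitly, so that `getenv()` can supply it. `mergeProcessEnvironment` is a hypothetical helper and the sample URL is a placeholder.

```php
<?php

/**
 * Hypothetical helper: merges configured variables over a base environment.
 * Returns null when there is nothing to add, so that proc_open() inherits
 * the parent process environment unchanged.
 *
 * @param array<string, string> $base
 * @param array<string, string> $overrides
 * @return array<string, string>|null
 */
function mergeProcessEnvironment(array $base, array $overrides): ?array
{
    if ($overrides === []) {
        return null;
    }

    foreach ($overrides as $key => $value) {
        // Configured values win over inherited ones
        $base[$key] = (string) $value;
    }

    return $base;
}

// getenv() without arguments returns all environment variables as an array.
$env = mergeProcessEnvironment(getenv(), ['FIREFLY_III_URL' => 'https://localhost']);
```

Passing the result as the environment argument of `proc_open()` replaces the child's environment entirely, so an empty base would also drop variables such as `PATH`.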
View File

@ -3,10 +3,10 @@
namespace UbsCsvTransformer;
/**
* Extrahiert Metadaten aus Header-Zeilen mit Regex
* Extracts metadata from header lines using regex
*
* Diese Klasse extrahiert konstante Werte aus den Metadatenzeilen
* (Header-Zeilen vor der eigentlichen CSV-Tabelle) mittels Regex-Regeln.
* Extracts constant values from metadata lines
* (header lines before the actual CSV table) using regex rules.
*/
class MetadataExtractor
{
@ -18,17 +18,17 @@ class MetadataExtractor
}
/**
* Extrahiert Metadaten aus den übergebenen Zeilen
* Extracts metadata from the provided lines
*
* @param array $lines Array von Zeilen aus dem CSV-Header
* @return array Extrahierte Metadaten
* @param array $lines Array of lines from the CSV header
* @return array Extracted metadata
*/
public function extract(array $lines): array
{
$metadata = [];
foreach ($this->rules as $rule) {
// Validiere erforderliche Felder
// Validate required fields
if (empty($rule['name']) || empty($rule['regex'])) {
continue;
}
@ -37,15 +37,15 @@ class MetadataExtractor
$lineNumber = $rule['lineNumber'] ?? 1;
$regex = $rule['regex'];
// ✅ KORRIGIERT: Off-by-One Fix
// config.json: "lineNumber": 1, 2, 3 (1-basiert, für Menschen lesbar)
// PHP Arrays: $lines[0], $lines[1], $lines[2] (0-basiert)
// Konvertierung: arrayIndex = lineNumber - 1
// Off-by-one fix
// config.json: "lineNumber": 1, 2, 3 (1-based, human-readable)
// PHP arrays: $lines[0], $lines[1], $lines[2] (0-based)
// Conversion: arrayIndex = lineNumber - 1
$arrayIndex = $lineNumber - 1;
// Prüfe ob Zeile existiert
// Check if line exists
if (!isset($lines[$arrayIndex])) {
// Zeile existiert nicht - Debug-Info für Support
// Line does not exist - debug info for support
DebugLogger::log('metadata_warning', "Line for extraction rule not found", [
'rule_name' => $ruleName,
'expected_lineNumber' => $lineNumber,
@ -57,7 +57,7 @@ class MetadataExtractor
$line = $lines[$arrayIndex];
// Regex mit '#' als Delimiter (erlaubt '/' in User-Patterns); '#' im Pattern escapen
// Regex with '#' as delimiter (allows '/' in user patterns); escape '#' in pattern
$pattern = '#' . str_replace('#', '\#', $regex) . '#u';
$matchResult = @preg_match_all($pattern, $line, $matches);
if ($matchResult === false) {
@ -68,7 +68,7 @@ class MetadataExtractor
continue;
}
if ($matchResult === 0) {
// Regex matched nicht auf dieser Zeile
// Regex did not match on this line
DebugLogger::log('metadata_warning', "Regex did not match", [
'rule_name' => $ruleName,
'lineNumber' => $lineNumber,
@ -78,20 +78,20 @@ class MetadataExtractor
continue;
}
// ✅ KORRIGIERT: captureGroup benutzen
// captureGroup definiert welche Klammer-Gruppe extrahiert wird
// 0 = komplette Match
// 1 = erste Klammer-Gruppe (...)
// 2 = zweite Klammer-Gruppe, etc.
// captureGroup selects which regex capture group is extracted:
// 0 = complete match
// 1 = first capture group (...)
// 2 = second capture group, etc.
$captureGroup = isset($rule['captureGroup']) ? intval($rule['captureGroup']) : 1;
// Sicherstellen dass die Capture Group existiert
// Ensure the capture group exists
if (!isset($matches[$captureGroup]) || empty($matches[$captureGroup])) {
// Fallback: Nutze komplette Match wenn Gruppe nicht existiert
// Fallback: use complete match if group does not exist
$metadata[$ruleName] = $matches[0][0] ?? '';
// echo "DEBUG: extraction_rule '{$ruleName}' - captureGroup {$captureGroup} not found, falling back to complete match\n";
} else {
// Nutze die spezifische Capture Group
// Use the specific capture group
$metadata[$ruleName] = $matches[$captureGroup][0] ?? '';
}
@ -105,9 +105,9 @@ class MetadataExtractor
}
/**
* Gibt die Anzahl der definierten Extraction-Rules zurück
* Returns the number of defined extraction rules
*
* @return int Anzahl Rules
* @return int Number of rules
*/
public function getRuleCount(): int
{
@ -115,9 +115,9 @@ class MetadataExtractor
}
/**
* Gibt alle definierten Extraction-Rules zurück
* Returns all defined extraction rules
*
* @return array Die Rules
* @return array The rules
*/
public function getRules(): array
{

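The delimiter wrapping and capture-group fallback in `extract()` can be reduced to a single-line sketch. It assumes one match per line and uses `preg_match` rather than the `preg_match_all` call in the hunk above. `extractWithRule` is a hypothetical helper and the sample lines in the comments are made up.

```php
<?php

/**
 * Hypothetical helper mirroring the rule logic: wraps a user pattern in '#'
 * delimiters and returns the requested capture group, or the complete match
 * when that group does not exist.
 */
function extractWithRule(string $line, string $regex, int $captureGroup = 1): ?string
{
    // '#' as delimiter allows '/' in user patterns; '#' in the pattern is escaped
    $pattern = '#' . str_replace('#', '\#', $regex) . '#u';

    $matchResult = @preg_match($pattern, $line, $matches);
    if ($matchResult === false || $matchResult === 0) {
        return null; // invalid pattern or no match on this line
    }

    return $matches[$captureGroup] ?? $matches[0];
}

// Made-up sample input: a rule with one capture group
$iban = extractWithRule('IBAN: CH9300762011623852957', 'IBAN:\s*(\S+)');
```

Unlike the original method, this returns `null` for a non-matching line instead of skipping the rule, which keeps it testable on its own.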
View File

@ -10,16 +10,16 @@ use UbsCsvTransformer\ColumnTransformer;
use UbsCsvTransformer\FireflyImporter;
/**
* Orchestriert die gesamte CSV-Transformations-Pipeline
* Orchestrates the complete CSV transformation pipeline
*
* Koordiniert alle Schritte von CSV-Einlesen über Metadaten-Extraktion
* und Spalten-Transformation bis zur Ausgabe und optional zum Import in Firefly III.
* Coordinates all steps from reading the CSV through metadata extraction
* and column transformation to output and optional import into Firefly III.
*
* @property ConfigurationLoader $configLoader Verwaltet Konfiguration
* @property CsvWriter $csvWriter Schreibt Output-CSV
* @property MetadataExtractor $metadataExtractor Extrahiert Metadaten aus Header
* @property ColumnTransformer $columnTransformer Transformiert Spalten
* @property array $csvStructure CSV-Struktur-Konfiguration
* @property ConfigurationLoader $configLoader Manages configuration
* @property CsvWriter $csvWriter Writes output CSV
* @property MetadataExtractor $metadataExtractor Extracts metadata from header
* @property ColumnTransformer $columnTransformer Transforms columns
* @property array $csvStructure CSV structure configuration
*/
class TransformerEngine
{
@ -33,16 +33,16 @@ class TransformerEngine
private bool $debugMode = false;
/**
* Initialisiert TransformerEngine mit Konfiguration
* Initialises TransformerEngine with configuration
*
* Lädt alle erforderlichen Konfigurationen und initialisiert
* die Komponenten (MetadataExtractor, ColumnTransformer, CsvWriter).
* CsvReader wird später in transform() und validate() initialisiert mit dem Dateipfad.
* Loads all required configurations and initialises
* the components (MetadataExtractor, ColumnTransformer, CsvWriter).
* CsvReader is instantiated later in transform() and validate() with the file path.
*
* @param ConfigurationLoader $configLoader Lädt Konfigurationsdateien
* @param bool $debugMode true = Debug-Modus aktivieren
* @param ConfigurationLoader $configLoader Loads configuration files
* @param bool $debugMode true = enable debug mode
*
* @throws \RuntimeException wenn erforderliche Konfigurationen fehlen
* @throws \RuntimeException if required configurations are missing
*/
public function __construct(ConfigurationLoader $configLoader, bool $debugMode = false)
{
@ -63,7 +63,7 @@ class TransformerEngine
$config['capitalizationExceptions'] ?? []
);
// Bestimme Output-Dateiname aus Konfiguration
// Determine output file name from configuration
$outputDir = $config['directories']['output'] ?? './output';
$outputFileName = $config['csvStructure']['outputFilename'] ?? 'transformed.csv';
$outputFile = rtrim($outputDir, '/') . '/' . $outputFileName;
@ -75,9 +75,9 @@ class TransformerEngine
}
/**
* Aktiviert oder deaktiviert den Debug-Modus
* Enables or disables debug mode
*
* @param bool $enabled true = Debug-Modus aktiviert
* @param bool $enabled true = debug mode enabled
* @return void
*/
public function setDebugMode(bool $enabled): void
@ -91,30 +91,30 @@ class TransformerEngine
}
/**
* Transformiert eine CSV-Datei
* Transforms a CSV file
*
* Führt folgende Schritte durch:
* 1. CSV-Datei einlesen mit CsvReader
* 2. Metadaten aus Header extrahieren
* 3. Spalten gemäß Konfiguration transformieren
* 4. Daten in Output-CSV schreiben
* 5. Beispiel-Daten sammeln (maximal 3 Zeilen oder maxRows)
* Performs the following steps:
* 1. Read CSV file with CsvReader
* 2. Extract metadata from header
* 3. Transform columns according to configuration
* 4. Write data to output CSV
* 5. Collect sample data (at most 3 rows, fewer if maxRows is smaller)
*
* Der Output-Dateipfad wird aus der Konfiguration bestimmt und kann nicht überschrieben werden.
* The output file path is determined from the configuration and cannot be overridden.
*
* @param string $inputFile Pfad zur Input-CSV-Datei
* @param int $maxRows Maximale Anzahl Datenzeilen zu transformieren (0 = alle).
* Beispiel-Daten werden begrenzt auf min(3, maxRows)
* @param string $inputFile Path to the input CSV file
* @param int $maxRows Maximum number of data rows to transform (0 = all).
* Sample data is limited to min(3, maxRows)
*
* @return array Transformations-Ergebnis mit:
* - success: bool (true = erfolgreich, false = Fehler)
* - inputFile: string (Input-Dateipfad, nur bei Erfolg)
* - outputFile: string (Output-Dateipfad, nur bei Erfolg)
* - rowsProcessed: int (tatsächlich verarbeitete Datenzeilen)
* - sampleData: array (Erste Beispiel-Zeilen, max 3 oder maxRows)
* - metadata: array (Extrahierte Metadaten, nur bei Erfolg)
* - outputColumns: int (Anzahl Output-Spalten)
* - error: string (Fehlermeldung, nur bei Fehler)
* @return array Transformation result with:
* - success: bool (true = successful, false = error)
* - inputFile: string (input file path, on success only)
* - outputFile: string (output file path, on success only)
* - rowsProcessed: int (number of data rows actually processed)
* - sampleData: array (first sample rows, max 3 or maxRows)
* - metadata: array (extracted metadata, on success only)
* - outputColumns: int (number of output columns)
* - error: string (error message, on failure only)
*/
public function transform(string $inputFile, int $maxRows = 0): array
{
@ -130,50 +130,50 @@ class TransformerEngine
]);
}
// Validiere Input-Datei
// Validate input file
if (!file_exists($inputFile)) {
throw new \RuntimeException("Input-Datei nicht gefunden: {$inputFile}");
throw new \RuntimeException("Input file not found: {$inputFile}");
}
// Initialisiere CsvReader mit Dateipfad und Konfiguration
// Initialise CsvReader with file path and configuration
$csvReader = new CsvReader($inputFile, $this->csvStructure);
// Lese Metadaten-Zeilen (vor der Header-Zeile)
// Read metadata lines (before the header line)
$metadataLines = $csvReader->readMetadataLines();
// Extrahiere Metadaten aus den Metadaten-Zeilen
// Extract metadata from the metadata lines
$metadata = $this->metadataExtractor->extract($metadataLines);
// Initialisiere ColumnTransformer mit extrahierten Metadaten
// Initialise ColumnTransformer with extracted metadata
$this->columnTransformer = new ColumnTransformer(
$this->configLoader->get('columnTransformations', []),
$metadata,
$this->configLoader->get('capitalizationExceptions', [])
);
// Lese CSV-Daten mit Header-Keys als Array-Keys
// Read CSV data with header keys as array keys
$dataRows = $csvReader->readCsvData($maxRows);
if (empty($dataRows)) {
throw new \RuntimeException("Keine Datenzeilen in CSV-Datei");
throw new \RuntimeException("No data rows in CSV file");
}
// Berechne Limit für Beispiel-Daten
// Calculate limit for sample data
$sampleLimit = $maxRows === 0 ? 3 : min(3, $maxRows);
// Transformiere Zeilen und sammle sie
// Transform rows and collect them
$transformedData = [];
foreach ($dataRows as $row) {
// Prüfe ob maxRows erreicht
// Check if maxRows reached
if ($maxRows > 0 && $this->rowsProcessed >= $maxRows) {
break;
}
// Transformiere Zeile
// Transform row
$transformedRow = $this->columnTransformer->transformRow($row);
$transformedData[] = $transformedRow;
// Speichere Beispiel-Daten
// Save sample data
if (count($this->sampleData) < $sampleLimit) {
$this->sampleData[] = $transformedRow;
}
@ -181,7 +181,7 @@ class TransformerEngine
$this->rowsProcessed++;
}
// Entferne Spalten die aus dem Output ausgeschlossen werden sollen
// Remove columns to be excluded from the output
$excludeColumns = $this->csvStructure['excludeOutputColumns'] ?? [];
if (!empty($excludeColumns)) {
$excludeMap = array_flip($excludeColumns);
@ -195,7 +195,7 @@ class TransformerEngine
);
}
// Schreibe alle transformierten Daten in Output-CSV
// Write all transformed data to output CSV
$this->csvWriter->write($transformedData);
$result = [
@ -225,43 +225,43 @@ class TransformerEngine
}
/**
* Transformiert und importiert CSV in Firefly III
* Transforms and imports CSV into Firefly III
*
* Führt Transformation durch und importiert die Ausgabe-Datei
* in Firefly III wenn in der Konfiguration aktiviert.
* Performs transformation and imports the output file
* into Firefly III if enabled in the configuration.
*
* Rückwärts-kompatibel mit legacy Signatur.
* Backward-compatible with the legacy signature.
*
* @param string $inputFile Pfad zur Input-CSV-Datei
* @param int $maxRows Maximale Anzahl Datenzeilen zu verarbeiten (0 = alle)
* @param string $inputFile Path to the input CSV file
* @param int $maxRows Maximum number of data rows to process (0 = all)
*
* @return array Transformations- und Import-Ergebnis mit:
* - success: bool (true = transformation erfolgreich)
* @return array Transformation and import result with:
* - success: bool (true = transformation successful)
* - inputFile: string
* - outputFile: string
* - rowsProcessed: int
* - sampleData: array
* - metadata: array
* - outputColumns: int
* - import: array (Firefly Import-Ergebnis, wenn autoImport aktiv)
* - error: string (falls Fehler)
* - import: array (Firefly import result, only if autoImport is enabled)
* - error: string (error message, on failure only)
*/
public function transformAndImport(string $inputFile, int $maxRows = 0): array
{
// Zuerst transformieren
// Transform first
$transformResult = $this->transform($inputFile, $maxRows);
if (!$transformResult['success']) {
return $transformResult;
}
// Prüfe ob Auto-Import in Konfiguration aktiviert ist
// Check whether auto-import is enabled in configuration
$fireflyConfig = $this->configLoader->get('fireflyImport', []);
if (empty($fireflyConfig['autoImport'])) {
return $transformResult;
}
// Führe Firefly-Import durch
// Perform Firefly import
try {
$importer = new FireflyImporter($fireflyConfig);
$importResult = $importer->import($transformResult['outputFile']);
@ -278,19 +278,19 @@ class TransformerEngine
}
/**
* Validiert eine CSV-Datei gegen die Konfiguration
* Validates a CSV file against the configuration
*
* Prüft ob erforderliche Metadaten vorhanden sind
* und ob die CSV-Struktur der Konfiguration entspricht.
* Checks whether required metadata is present
* and whether the CSV structure matches the configuration.
*
* @param string $inputFile Pfad zur zu validierenden CSV-Datei
* @param string $inputFile Path to the CSV file to validate
*
* @return array Validierungs-Ergebnis mit:
* - valid: bool (true = Validierung erfolgreich)
* - metadata: array (Extrahierte Metadaten, wenn valid)
* - line_count: int (Gesamtzahl Zeilen, wenn valid)
* - error: string (Fehlermeldung, wenn nicht valid)
* - metadata_found: array (Gefundene Metadaten trotz Fehler)
* @return array Validation result with:
* - valid: bool (true = validation successful)
* - metadata: array (extracted metadata, when valid)
* - line_count: int (total number of lines, when valid)
* - error: string (error message, when not valid)
* - metadata_found: array (metadata found despite the error)
*/
public function validate(string $inputFile): array
{
@ -298,18 +298,18 @@ class TransformerEngine
if (!file_exists($inputFile)) {
return [
'valid' => false,
'error' => "Datei nicht gefunden: {$inputFile}",
'error' => "File not found: {$inputFile}",
];
}
// Initialisiere CsvReader mit Dateipfad
// Initialise CsvReader with file path
$csvReader = new CsvReader($inputFile, $this->csvStructure);
// Extrahiere Metadaten-Zeilen (vor der Header-Zeile)
// Extract metadata lines (before the header line)
$metadataLines = $csvReader->readMetadataLines();
$metadata = $this->metadataExtractor->extract($metadataLines);
// Prüfe auf erforderliche Metadaten
// Check for required metadata
$requiredMetadata = [
'account_iban',
'currency_code',
@ -325,12 +325,12 @@ class TransformerEngine
if (!empty($missingMetadata)) {
return [
'valid' => false,
'error' => 'Fehlende Metadaten: ' . implode(', ', $missingMetadata),
'error' => 'Missing metadata: ' . implode(', ', $missingMetadata),
'metadata_found' => $metadata,
];
}
// Zähle Gesamtzahl Zeilen
// Count total number of lines
$lineCount = $csvReader->countLines();
return [
@ -341,15 +341,15 @@ class TransformerEngine
} catch (\Exception $e) {
return [
'valid' => false,
'error' => 'Validierungs-Fehler: ' . $e->getMessage(),
'error' => 'Validation error: ' . $e->getMessage(),
];
}
}
/**
* Gibt die gesammelten Beispiel-Daten zurück
* Returns the collected sample data
*
* @return array Beispiel-Daten (maximal 3 oder maxRows Zeilen)
* @return array Sample data (at most 3 rows, fewer if maxRows is smaller)
*/
public function getSampleData(): array
{
@ -357,9 +357,9 @@ class TransformerEngine
}
/**
* Gibt die Anzahl verarbeiteter Datenzeilen zurück
* Returns the number of processed data rows
*
* @return int Anzahl transformierter Zeilen
* @return int Number of transformed rows
*/
public function getRowsProcessed(): int
{