Compare commits
No commits in common. "main" and "release_1.0" have entirely different histories.
main
...
release_1.
1
.gitignore
vendored
@@ -61,4 +61,3 @@ docker-compose.override.yml
*.backup
/tmp/
/~archive/
firefly-import-preprocessor.code-workspace

15
AGENTS.md
@@ -14,17 +14,15 @@ composer psalm # Psalm static analysis

### Test Suite Overview

129 tests across 7 test classes:
85 tests across 5 test classes:

| File | Tests | Scope |
| ------ | -------: | ------- |
| `tests/ColumnTransformerTest.php` | 51 | All 14 transformation types, edge cases |
|------|-------|-------|
| `tests/ColumnTransformerTest.php` | 37 | All 13 transformation types, edge cases |
| `tests/ConfigurationLoaderTest.php` | 18 | JSON loading, dot-notation access, validation |
| `tests/CsvReaderTest.php` | 15 | CSV parsing, BOM handling, delimiter, encoding |
| `tests/MetadataExtractorTest.php` | 14 | Pre-header regex extraction, edge cases |
| `tests/ConfigIntegrationTest.php` | 1× per fixture | Golden-file integration tests (see below) |
| `tests/RowFilterTest.php` | 19 | skipIf conditions, all operators, nested AND/OR groups |
| `tests/FireflyImporterChunkStateTest.php` | 11 | Chunk state persistence, resume, reset |

### Integration Tests (Golden-File Pattern)

@@ -87,10 +85,7 @@ bin/transformer.php → TransformerEngine

- **PSR-12** enforced via phpcs using `phpcs.xml` (auto-discovered at root). Line length: soft 120, hard 150 chars.
- **PHPStan level 8** with `checkMissingCallableSignature: true`. `phpstan-baseline.neon` is empty — do not add suppressions without good reason.
- **All source comments and docblocks are written in English.**
- **Documentation language:** `README.md` is the primary documentation in **English**. `README.de.md` is the German translation. Both cross-link to each other at the top.
- **`showHelp()` in `bin/transformer.php`** is locale-aware: English is the default; German is shown when `isGermanLocale()` returns `true` (checks `LANG`, `LC_ALL`, `LC_MESSAGES`, `LANGUAGE` env vars for a `de` prefix).
- **License:** GPL-3.0.
- **All source comments and docblocks are written in German.**
- Namespace `UbsCsvTransformer\` (PSR-4 → `src/`); tests use `UbsCsvTransformer\Tests\` (→ `tests/`).
- No runtime package dependencies — only `ext-json` and `ext-mbstring`.

@@ -108,4 +103,4 @@ See [config/config.example.json](config/config.example.json) for a full referenc
- `"outputAction": "create"` vs `"overwrite"` — controls whether the result is a new column or replaces an existing one
- `MetadataExtractor` uses 1-based `lineNumber` in config; it converts to 0-based array index internally

Supported transformation types: `map`, `replace`, `regex`, `regexextract`, `dateformat`, `split`, `trim`, `uppercase`, `lowercase`, `ucwordsfirst`, `truncate`, `constantvalue`, `pipeline`, `timeperiod`
Supported transformation types: `map`, `replace`, `regex`, `regexextract`, `dateformat`, `split`, `trim`, `uppercase`, `lowercase`, `ucwordsfirst`, `truncate`, `constantvalue`, `pipeline`

767
README.de.md
@@ -1,767 +0,0 @@
# Firefly Import Preprocessor - Documentation

**Version:** 1.0.0
**Date:** 3 May 2026
**Status:** Production Ready

🌐 [English](README.md)

---

## 📋 Table of Contents

1. [Overview](#überblick)
2. [Installation & Setup](#installation--setup)
3. [Quick Start](#schnellstart)
4. [Configuration](#konfiguration)
5. [Transformation Types](#transformationstypen)
6. [CLI Reference](#cli-referenz)
7. [Debug Mode](#debug-modus)
8. [Firefly III Integration](#firefly-iii-integration)
9. [Architecture](#architektur)
10. [Error Handling](#fehlerbehandlung)

---

## Overview

The **Firefly Import Preprocessor** is a production-ready PHP preprocessor for bank CSV export files. It transforms bank data into a standardized format and can optionally import it into Firefly III.

### Core Features

✅ **Full CSV transformation** with complex pipelines
✅ **Metadata extraction** with regex (IBAN, currency, account name)
✅ **14 transformation types** for flexible data processing
✅ **Firefly III integration**: CLI, Docker, and HTTP upload
✅ **Debug mode** for transparency during processing
✅ **Production ready** with full error handling
✅ **Zero dependencies** for core functionality

### Workflow

```text
Input CSV
↓
Extract metadata (regex)
↓
Transform data rows (pipeline)
↓
Write output CSV
↓
[Optional] Import into Firefly III
```

---

## Installation & Setup

### Requirements

- PHP 8.1+
- Composer (recommended)
- [Optional] Docker for the Firefly III integration

### Installation

```bash
# 1. Clone or copy the repository
cd ff-imp-preprocessor

# 2. Install dependencies (optional)
composer install

# 3. Create the configuration
cp config/config.example.json config/config.json
# Edit config/config.json with your settings

# 4. Create the directories
mkdir -p config/import/{source,output,archive,error}
chmod 755 config/import/{source,output,archive,error}

# 5. Run a test
php bin/transformer.php validate config/config.json input.csv
```

---

## Quick Start

### 1. Adjust the configuration

Edit `config/config.json` and make sure the extraction rules match your CSV format:

```json
{
  "metadata": {
    "extractionRules": [
      {
        "name": "account_iban",
        "lineNumber": 2,
        "regex": "IBAN:\\s*([A-Z0-9 ]+)",
        "captureGroup": 1
      }
    ]
  },
  "csvStructure": {
    "headerLine": 5,
    "delimiter": ";",
    "encoding": "UTF-8"
  }
}
```

### 2. Validate the CSV

```bash
php bin/transformer.php validate config/config.json input.csv
```

### 3. Run the transformation

```bash
php bin/transformer.php transform input.csv config/config.json

# With debug mode for troubleshooting
php bin/transformer.php transform input.csv config/config.json --debug
```

### 4. Check the output

```bash
php bin/transformer.php test input.csv config/config.json --debug
# Shows up to 10 transformed rows and the debug logs
```

---

## Configuration

### config.json structure

#### `metadata` - Metadata extraction

```json
{
  "metadata": {
    "extractionRules": [
      {
        "name": "account_iban",
        "lineNumber": 2,
        "regex": "IBAN:\\s*([A-Z0-9 ]+)",
        "captureGroup": 1
      },
      {
        "name": "currency_code",
        "lineNumber": 3,
        "regex": "Währung:\\s*([A-Z]{3})",
        "captureGroup": 1
      }
    ]
  }
}
```

| Field | Type | Description |
| ------ | ----- | ------------- |
| `name` | string | Name of the metadata variable (used by `constantvalue`) |
| `lineNumber` | int | Line number in the CSV (1-based, human-readable) |
| `regex` | string | Regex pattern for the extraction (without delimiters) |
| `captureGroup` | int | Capture group number (0 = whole match, 1 = first group, and so on) |

**Regex example:**

- Pattern: `IBAN:\s*([A-Z0-9 ]+)`
- Input: `IBAN: CH93 0077 2020 6262 5252 7`
- Capture Group 1: `CH93 0077 2020 6262 5252 7`

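The rule fields above can be read as a small lookup procedure. The following PHP sketch is not the project's `MetadataExtractor`: the function name `extractMetadata`, the `#` regex delimiter, and the behaviour of silently skipping a rule that does not match are assumptions made for illustration. It shows the 1-based `lineNumber` being shifted to a 0-based array index and `captureGroup` selecting the returned value.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: applies extraction rules to the raw lines of a file.
 *
 * @param list<string> $lines raw file lines, index 0 holds line 1
 * @param list<array{name: string, lineNumber: int, regex: string, captureGroup: int}> $rules
 * @return array<string, string> metadata keyed by rule name
 */
function extractMetadata(array $lines, array $rules): array
{
    $metadata = [];
    foreach ($rules as $rule) {
        // lineNumber is 1-based in the config; PHP arrays are 0-based.
        $line = $lines[$rule['lineNumber'] - 1] ?? null;
        if ($line === null) {
            continue; // assumption: a missing line yields no value
        }
        // The config stores patterns without delimiters, so add them here.
        // Assumption: patterns do not contain the "#" character.
        $pattern = '#' . $rule['regex'] . '#u';
        if (preg_match($pattern, $line, $matches) === 1 && isset($matches[$rule['captureGroup']])) {
            $metadata[$rule['name']] = $matches[$rule['captureGroup']];
        }
    }
    return $metadata;
}

$lines = [
    'Account statement',
    'IBAN: CH93 0077 2020 6262 5252 7',
    'Währung: CHF',
];
$rules = [
    ['name' => 'account_iban', 'lineNumber' => 2, 'regex' => 'IBAN:\s*([A-Z0-9 ]+)', 'captureGroup' => 1],
    ['name' => 'currency_code', 'lineNumber' => 3, 'regex' => 'Währung:\s*([A-Z]{3})', 'captureGroup' => 1],
];
$metadata = extractMetadata($lines, $rules);
```

Note that the JSON config needs `\\s` where the PHP single-quoted string needs only `\s`, because JSON treats the backslash as an escape character.
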
#### `csvStructure` - CSV format

```json
{
  "csvStructure": {
    "headerLine": 5,
    "delimiter": ";",
    "encoding": "UTF-8",
    "hasBom": false
  }
}
```

| Field | Type | Default | Description |
| ------ | ----- | --------- | ------------- |
| `headerLine` | int | 5 | Line number of the header row (1-based) |
| `delimiter` | string | `;` | CSV delimiter |
| `encoding` | string | `UTF-8` | Character encoding (UTF-8, ISO-8859-1, CP1252) |
| `hasBom` | bool | false | Whether the file starts with a BOM (byte order mark) |

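To make these four settings concrete, here is a hedged PHP sketch of how a reader could honour them. It is not the project's `CsvReader`; the helper names `normaliseCsvContent` and `readHeader` are hypothetical, and the sketch detects the BOM itself instead of consulting `hasBom`, which is an assumption about one possible approach.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: returns the file content as UTF-8 without a BOM.
 */
function normaliseCsvContent(string $raw, string $encoding = 'UTF-8'): string
{
    // A UTF-8 byte order mark is the byte sequence EF BB BF at the start.
    if (str_starts_with($raw, "\xEF\xBB\xBF")) {
        $raw = substr($raw, 3);
    }
    // Convert legacy encodings such as ISO-8859-1 to UTF-8.
    if (strtoupper($encoding) !== 'UTF-8') {
        $raw = mb_convert_encoding($raw, 'UTF-8', $encoding);
    }
    return $raw;
}

/**
 * Illustrative only: parses the header row.
 * $headerLine is 1-based, as in the config.
 *
 * @return array<int, string|null>
 */
function readHeader(string $content, int $headerLine, string $delimiter): array
{
    // \R matches any line ending (LF, CRLF, CR).
    $lines = preg_split('/\R/', $content);
    return str_getcsv($lines[$headerLine - 1], $delimiter, '"', '');
}

$raw = "\xEF\xBB\xBFline 1\nline 2\nDatum;Betrag;Text\n01.01.2026;-5.00;Coop";
$header = readHeader(normaliseCsvContent($raw), 3, ';');
```
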
#### `columnTransformations` - Column transformations

```json
{
  "columnTransformations": [
    {
      "sourceColumn": "Buchungsdatum",
      "transformations": [
        {
          "type": "dateformat",
          "fromFormat": "d.m.Y",
          "toFormat": "Y-m-d"
        }
      ],
      "outputColumn": "date",
      "outputAction": "overwrite"
    }
  ]
}
```

**outputAction:**

| Value | Behavior |
|---|---|
| `overwrite` | Overwrite the target column with the transformation result (default) |
| `create` | Write the result to a new output column |
| `append` | Append the result to the end of the existing column value. With `"appendDelimiter": " "` (any string), a separator is inserted between the existing and the new value; the separator is omitted while the target column is still empty |
| `append-if-not-empty` | Like `append` (including `appendDelimiter`), but skips the operation entirely when the transformation result is empty; suited to optional values such as tags or note lines |
| `append-line` | Like `append`, but always uses a line break `\n` as the separator; no leading line break when the target column is empty |
| `overwrite-if-empty` | Write the result only if the target column is currently empty |
| `overwrite-if-not-empty` | Write the result only if the transformation result is not empty |

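The differences between the `outputAction` values are easiest to see side by side. The following PHP sketch encodes the table as a single function; the name `applyOutputAction` and its signature are hypothetical, and it is a reading of the table rather than the project's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: returns the new value of the target column.
 *
 * @param string $current existing value of the target column ('' if empty or new)
 * @param string $result  result of the transformation pipeline
 */
function applyOutputAction(string $action, string $current, string $result, string $appendDelimiter = ''): string
{
    switch ($action) {
        case 'overwrite':
        case 'create':
            return $result;
        case 'append':
            // The delimiter is dropped while the target column is still empty.
            return $current === '' ? $result : $current . $appendDelimiter . $result;
        case 'append-if-not-empty':
            if ($result === '') {
                return $current; // empty result: skip the operation entirely
            }
            return $current === '' ? $result : $current . $appendDelimiter . $result;
        case 'append-line':
            // Always a line break, but never a leading one.
            return $current === '' ? $result : $current . "\n" . $result;
        case 'overwrite-if-empty':
            return $current === '' ? $result : $current;
        case 'overwrite-if-not-empty':
            return $result !== '' ? $result : $current;
        default:
            throw new InvalidArgumentException("Unknown outputAction: $action");
    }
}

// Building a tag list from two optional source columns:
$tags = applyOutputAction('append-if-not-empty', '', 'groceries', ', ');
$tags = applyOutputAction('append-if-not-empty', $tags, '', ', ');
$tags = applyOutputAction('append-if-not-empty', $tags, 'card', ', ');
```

With plain `append`, the empty second value would have left a dangling delimiter in the result, which is the case `append-if-not-empty` is described as avoiding.
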
#### `directories` - File system

```json
{
  "directories": {
    "source": "/opt/ff-imp-preprocessor/import/source",
    "output": "/opt/ff-imp-preprocessor/import/output",
    "archive": "/opt/ff-imp-preprocessor/import/archive",
    "error": "/opt/ff-imp-preprocessor/import/error"
  }
}
```

| Field | Description |
| ------ | ------------- |
| `source` | Input directory |
| `output` | Output directory |
| `archive` | Archive for processed files |
| `error` | Error directory for invalid files |

#### `fireflyImport` - Firefly III integration

Optional. With the `--do-import` flag on the `transform` command (or via `auto-import`), the Firefly III Data Importer is invoked after the output CSV has been written.

Details and complete examples: [Firefly III Integration](#firefly-iii-integration).

---

## Transformation Types

There are **14 supported transformation types**, which can be combined into a pipeline:

### 1. **trim** - Remove surrounding whitespace

```json
{ "type": "trim" }
```

- Input: ` Coop Pronto ` → Output: `Coop Pronto`

---

### 2. **lowercase** - Convert to lowercase

```json
{ "type": "lowercase" }
```

- Input: `COOP PRONTO CHUR` → Output: `coop pronto chur`

---

### 3. **uppercase** - Convert to uppercase

```json
{ "type": "uppercase" }
```

- Input: `Coop Pronto Chur` → Output: `COOP PRONTO CHUR`

---

### 4. **ucwordsfirst** - Capitalize after delimiters

```json
{ "type": "ucwordsfirst" }
```

- `COOP PRONTO CHUR` → `Coop Pronto Chur`
- `migros-rail city` → `Migros-Rail City`
- `O'NEILL STORE` → `O'Neill Store`

Delimiters: space, hyphen, apostrophe, slash, period, comma, semicolon, colon, parentheses.

> **Guard:** If the input string already contains both uppercase and lowercase letters (mixed case), it is returned unchanged. This leaves already correctly formatted strings such as `"Coop pronto chur"` untouched. Strings that are entirely uppercase or entirely lowercase are still processed.

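The guard and the delimiter list together determine the result of `ucwordsfirst`. The PHP sketch below is one way to implement the described behaviour; it is not taken from the project's `ColumnTransformer`, and the function name `ucwordsFirst` and the use of Unicode letter classes (`\p{Lu}`, `\p{Ll}`) are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: capitalizes the first letter of the string and every
 * letter that follows one of the documented delimiters.
 */
function ucwordsFirst(string $value): string
{
    // Guard: mixed-case input is treated as already formatted.
    $hasUpper = preg_match('/\p{Lu}/u', $value) === 1;
    $hasLower = preg_match('/\p{Ll}/u', $value) === 1;
    if ($hasUpper && $hasLower) {
        return $value;
    }

    $lower = mb_strtolower($value, 'UTF-8');

    // Delimiters: space, hyphen, apostrophe, slash, period, comma,
    // semicolon, colon, parentheses.
    $result = preg_replace_callback(
        '/(^|[ \-\'\/.,;:()])(\p{Ll})/u',
        static fn (array $m): string => $m[1] . mb_strtoupper($m[2], 'UTF-8'),
        $lower
    );

    // preg_replace_callback returns null only on a regex engine error.
    return $result ?? $lower;
}

echo ucwordsFirst("O'NEILL STORE"), PHP_EOL;
```
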
---

### 5. **replace** - String replacement

```json
{ "type": "replace", "search": " ", "replace": " " }
```

- Input: `Coop Pronto` → Output: `Coop Pronto`

---

### 6. **split** - Split a column

```json
{ "type": "split", "delimiter": ";", "part": 0 }
```

- Input: `Coop Pronto Chur;7007 Chur` → Output: `Coop Pronto Chur`

---

### 7. **regex** - Regex replacement

```json
{ "type": "regex", "pattern": "^(.*?);.*$", "replace": "$1" }
```

- No match → the original value stays **unchanged** (pipeline-safe)

---

### 8. **regexextract** - Regex extraction

```json
{ "type": "regexextract", "pattern": "(\\d{4,} [^;]+)" }
```

- No match → empty string (**not** pipeline-safe)

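The difference in no-match behaviour between `regex` and `regexextract` matters when the two are chained. A minimal PHP sketch of the two semantics follows; the function names are hypothetical, the `#` delimiter is an assumption, and returning the first capture group from `regexExtract` is a guess at the intended behaviour rather than something the text above states.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: regex replacement. preg_replace returns the subject
 * unchanged when nothing matches, so later pipeline steps still see the value.
 */
function regexReplace(string $value, string $pattern, string $replace): string
{
    return preg_replace('#' . $pattern . '#u', $replace, $value) ?? $value;
}

/**
 * Illustrative only: regex extraction. Without a match the value is lost
 * and later pipeline steps receive an empty string.
 */
function regexExtract(string $value, string $pattern): string
{
    if (preg_match('#' . $pattern . '#u', $value, $matches) !== 1) {
        return '';
    }
    // Assumption: prefer the first capture group, fall back to the whole match.
    return $matches[1] ?? $matches[0];
}

$withPlace = 'Coop Pronto Chur;7007 Chur';
$withoutPlace = 'Coop Pronto Chur';

$name = regexReplace($withPlace, '^(.*?);.*$', '$1');
$place = regexExtract($withPlace, '(\d{4,} [^;]+)');
$missing = regexExtract($withoutPlace, '(\d{4,} [^;]+)');
```
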
---

### 9. **dateformat** - Reformat a date

```json
{ "type": "dateformat", "fromFormat": "d.m.Y", "toFormat": "Y-m-d" }
```

- Input: `10.12.2025` → Output: `2025-12-10`

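The format strings use PHP's date format characters. As a hedged sketch, the conversion could look like this; the function name `convertDate` is hypothetical, and returning the input unchanged when parsing fails is an assumption, since the text above does not say what happens to invalid dates.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: converts a date string between two PHP date formats.
 */
function convertDate(string $value, string $fromFormat, string $toFormat): string
{
    // The leading "!" resets all fields that the format does not parse,
    // so the time of day does not default to the current time.
    $date = DateTimeImmutable::createFromFormat('!' . $fromFormat, $value);
    $errors = DateTimeImmutable::getLastErrors();

    // Overflowing dates such as 31.02.2025 parse "successfully" with a warning.
    $hasWarnings = $errors !== false && $errors['warning_count'] > 0;
    if ($date === false || $hasWarnings) {
        return $value; // assumption: unparseable input passes through unchanged
    }
    return $date->format($toFormat);
}

echo convertDate('10.12.2025', 'd.m.Y', 'Y-m-d'), PHP_EOL;
```
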
---

### 10. **truncate** - Shorten a string

```json
{ "type": "truncate", "maxLength": 100 }
```

---

### 11. **constantvalue** - Constant value from metadata

```json
{
  "sourceColumn": "_constant_",
  "transformations": [{ "type": "constantvalue", "metadataKey": "account_iban" }],
  "outputColumn": "account_iban",
  "outputAction": "create"
}
```

---

### 12. **map** - Copy a column

```json
{ "type": "map" }
```

---

### 13. **pipeline** - Nested pipeline

```json
{
  "type": "pipeline",
  "steps": [
    { "type": "trim" },
    { "type": "lowercase" },
    { "type": "ucwordsfirst" }
  ]
}
```

---

### 14. **timeperiod** - Map a time to a period of the day

Parses a time value and returns the label of the matching period range.
Supports ranges that span midnight (e.g. 22:00–03:59).
Returns `default` (empty unless configured) when no period matches or the input is invalid.

```json
{
  "type": "timeperiod",
  "timeFormat": "H:i:s",
  "periods": [
    { "from": "04:00:00", "to": "08:59:59", "label": "Morgen" },
    { "from": "09:00:00", "to": "10:59:59", "label": "Vormittag" },
    { "from": "11:00:00", "to": "13:59:59", "label": "Mittag" },
    { "from": "14:00:00", "to": "17:59:59", "label": "Nachmittag" },
    { "from": "18:00:00", "to": "21:59:59", "label": "Abend" },
    { "from": "22:00:00", "to": "03:59:59", "label": "Nacht" }
  ],
  "default": ""
}
```

- `"09:30:00"` → `"Vormittag"`
- `"23:00:00"` → `"Nacht"` (range spanning midnight)
- `"02:00:00"` → `"Nacht"` (range spanning midnight)
- `""` or unparseable input → `""`

`timeFormat` follows the PHP `DateTime::createFromFormat` syntax (default: `H:i:s`).

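The only non-obvious part of `timeperiod` is the range that wraps past midnight. The PHP sketch below shows one way to handle it by comparing seconds since midnight; the function name `timePeriod` and the first-match-wins order are assumptions, not the project's confirmed behaviour.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: returns the label of the first period containing $value.
 *
 * @param list<array{from: string, to: string, label: string}> $periods
 */
function timePeriod(string $value, array $periods, string $timeFormat = 'H:i:s', string $default = ''): string
{
    // Converts a time string to seconds since midnight, or null if unparseable.
    $toSeconds = static function (string $time) use ($timeFormat): ?int {
        $parsed = DateTimeImmutable::createFromFormat('!' . $timeFormat, $time);
        if ($parsed === false) {
            return null;
        }
        return (int) $parsed->format('G') * 3600
            + (int) $parsed->format('i') * 60
            + (int) $parsed->format('s');
    };

    $t = $toSeconds($value);
    if ($t === null) {
        return $default;
    }

    foreach ($periods as $period) {
        $from = $toSeconds($period['from']);
        $to = $toSeconds($period['to']);
        if ($from === null || $to === null) {
            continue;
        }
        $inRange = $from <= $to
            ? ($t >= $from && $t <= $to)  // ordinary range within one day
            : ($t >= $from || $t <= $to); // range wraps past midnight
        if ($inRange) {
            return $period['label'];
        }
    }
    return $default;
}

$periods = [
    ['from' => '04:00:00', 'to' => '08:59:59', 'label' => 'Morgen'],
    ['from' => '09:00:00', 'to' => '10:59:59', 'label' => 'Vormittag'],
    ['from' => '22:00:00', 'to' => '03:59:59', 'label' => 'Nacht'],
];
echo timePeriod('23:00:00', $periods), PHP_EOL;
```
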
---

### Row filtering: `skipIf`

Rows can be excluded through a top-level `skipIf` key in the configuration.
The value is a filter node: either a single condition or a nested `and`/`or` group.

**Single condition:**

```json
"skipIf": { "column": "Buchungstext", "operator": "equals", "value": "Saldovortrag" }
```

**AND group:**

```json
"skipIf": {
  "and": [
    { "column": "Beschreibung1", "operator": "empty" },
    { "column": "Beschreibung2", "operator": "empty" }
  ]
}
```

**Nested AND/OR groups:**

```json
"skipIf": {
  "or": [
    { "column": "Amount", "operator": "gt", "value": "10000" },
    {
      "and": [
        { "column": "Type", "operator": "equals", "value": "Saldo" },
        { "column": "Notes", "operator": "empty" }
      ]
    }
  ]
}
```

**Supported operators:**

| Operator | Matches when… |
|---|---|
| `empty` | Column value is empty |
| `not-empty` | Column value is not empty |
| `equals` | Column value equals `"value"` |
| `not-equals` | Column value does not equal `"value"` |
| `contains` | Column value contains `"value"` |
| `not-contains` | Column value does not contain `"value"` |
| `matches` | Column value matches the regex `"pattern"` |
| `not-matches` | Column value does not match the regex `"pattern"` |
| `gt` | `(float) column > (float) value` |
| `gte` | `(float) column >= (float) value` |
| `lt` | `(float) column < (float) value` |
| `lte` | `(float) column <= (float) value` |

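Because a filter node is either a condition or a group of further nodes, it can be evaluated with a short recursive function. The sketch below is not the project's row filter; the name `matchesFilter`, the treatment of missing columns as empty strings, and the `#` regex delimiter are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: returns true when the row matches the filter node,
 * meaning the row would be skipped.
 *
 * @param array<string, mixed>  $node a condition or an and/or group
 * @param array<string, string> $row  column name => value
 */
function matchesFilter(array $node, array $row): bool
{
    if (isset($node['and'])) {
        foreach ($node['and'] as $child) {
            if (!matchesFilter($child, $row)) {
                return false; // one failing child fails the AND group
            }
        }
        return true;
    }
    if (isset($node['or'])) {
        foreach ($node['or'] as $child) {
            if (matchesFilter($child, $row)) {
                return true; // one matching child satisfies the OR group
            }
        }
        return false;
    }

    // Assumption: a column that is missing from the row counts as empty.
    $value = (string) ($row[$node['column']] ?? '');
    $expected = (string) ($node['value'] ?? '');
    $regex = '#' . ($node['pattern'] ?? '') . '#u';

    return match ($node['operator']) {
        'empty' => $value === '',
        'not-empty' => $value !== '',
        'equals' => $value === $expected,
        'not-equals' => $value !== $expected,
        'contains' => str_contains($value, $expected),
        'not-contains' => !str_contains($value, $expected),
        'matches' => preg_match($regex, $value) === 1,
        'not-matches' => preg_match($regex, $value) !== 1,
        'gt' => (float) $value > (float) $expected,
        'gte' => (float) $value >= (float) $expected,
        'lt' => (float) $value < (float) $expected,
        'lte' => (float) $value <= (float) $expected,
        default => throw new InvalidArgumentException("Unknown operator: {$node['operator']}"),
    };
}

$skipIf = [
    'or' => [
        ['column' => 'Amount', 'operator' => 'gt', 'value' => '10000'],
        ['and' => [
            ['column' => 'Type', 'operator' => 'equals', 'value' => 'Saldo'],
            ['column' => 'Notes', 'operator' => 'empty'],
        ]],
    ],
];
$skipLarge = matchesFilter($skipIf, ['Amount' => '12000', 'Type' => 'Payment', 'Notes' => 'x']);
```
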
---

### Pipeline example

```json
{
  "sourceColumn": "Buchungstext",
  "transformations": [
    { "type": "trim" },
    { "type": "replace", "search": " ", "replace": " " },
    { "type": "lowercase" },
    { "type": "ucwordsfirst" }
  ],
  "outputColumn": "description",
  "outputAction": "overwrite"
}
```

**Processing:**

1. `" COOP PRONTO "` → trim → `"COOP PRONTO"`
2. `"COOP PRONTO"` → replace → `"COOP PRONTO"`
3. `"COOP PRONTO"` → lowercase → `"coop pronto"`
4. `"coop pronto"` → ucwordsfirst → `"Coop Pronto"`

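A pipeline is a fold over its steps: each step receives the output of the previous one. The PHP sketch below illustrates that idea for a handful of types only. It is not the project's `ColumnTransformer`; the function name `runPipeline` is hypothetical, `ucwordsfirst` is replaced by a simplified stand-in without the guard, and the replace step is assumed to collapse a double space into a single one.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: applies the steps in order, feeding each result
 * into the next step.
 *
 * @param list<array<string, mixed>> $steps
 */
function runPipeline(string $value, array $steps): string
{
    foreach ($steps as $step) {
        $value = match ($step['type']) {
            'trim' => trim($value),
            'lowercase' => mb_strtolower($value, 'UTF-8'),
            'uppercase' => mb_strtoupper($value, 'UTF-8'),
            'replace' => str_replace($step['search'], $step['replace'], $value),
            'truncate' => mb_substr($value, 0, $step['maxLength'], 'UTF-8'),
            // Simplified stand-in: title case without the mixed-case guard.
            'ucwordsfirst' => mb_convert_case($value, MB_CASE_TITLE, 'UTF-8'),
            // A nested pipeline is just a recursive call.
            'pipeline' => runPipeline($value, $step['steps']),
            default => throw new InvalidArgumentException("Unsupported type: {$step['type']}"),
        };
    }
    return $value;
}

$twoSpaces = str_repeat(' ', 2);
$steps = [
    ['type' => 'trim'],
    ['type' => 'replace', 'search' => $twoSpaces, 'replace' => ' '],
    ['type' => 'lowercase'],
    ['type' => 'ucwordsfirst'],
];
$description = runPipeline(' COOP' . $twoSpaces . 'PRONTO ', $steps);
```
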
---

## CLI Reference

```bash
php bin/transformer.php <command> [input] [config] [options]
```

### Commands

| Command | Description |
| -------- | ------------- |
| `test` | Test run (max. 10 rows) |
| `transform` | Full transformation |
| `validate` | Validate the configuration |
| `auto-import` | Directory monitoring |
| `help` | Show help |

### Options

| Option | Description |
| -------- | ------------- |
| `--debug`, `-d` | Enable debug mode |
| `--rows=N` | At most N rows (`test` command) |
| `--output=FILE`, `-o` | Output path |
| `--do-import` | Import into Firefly III after the transformation (`transform`) |
| `--strict` | Strict validation |
| `--watch` | Continuous monitoring |
| `--interval=SEC` | Check interval in seconds |
| `--dry-run` | Simulation mode |

---

## Debug Mode

```bash
php bin/transformer.php test input.csv config/config.json --debug
```

Debug mode logs events in the following categories:

| Category | When |
| ----------- | ------ |
| `transformer` | Start and end of the transformation |
| `csv_reader` | While reading the CSV |
| `metadata` | During metadata extraction |
| `metadata_warning` | When problems occur |
| `transformation` | For every transformation |
| `csv_writer` | While writing the CSV |

---

## Firefly III Integration

Three operating modes cover all typical deployment scenarios.

**`chunkSize`** (optional, default: 0 = disabled): Before the import, the output CSV is split into blocks of at most N data rows. Each block is sent as a separate request. This prevents server-side timeouts with large files (rule of thumb: ~3–4 s per transaction in HTTP mode). The `timeout` value applies per block, not to the entire run.

**`chunkRetries`** (optional, default: 0 = no retry): Number of additional import attempts per block after the first. On failure, the importer repeats the upload up to this many times before aborting. Only effective when `chunkSize > 0`.

**`chunkRetryDelay`** (optional, default: 0 = no pause): Pause in seconds before each block request from the second block onward, and between retries of the same failed block. A single value serves as both cooldown and retry back-off. Only effective when `chunkSize > 0`.

**`connectionTimeout`** (optional, default: 10): Maximum wait in seconds for establishing the TCP connection to the importer server. Independent of `timeout` (which limits the total transfer duration). `http` mode only.

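The interplay of `chunkSize`, `chunkRetries`, and `chunkRetryDelay` can be summarized as a loop. The PHP sketch below is a reading of the three paragraphs above, not the project's `FireflyImporter`: the function name `importInChunks`, the `$upload` callback, and the choice to repeat the header row in every block are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: sends the data rows in blocks with retries.
 *
 * @param list<string>           $dataRows CSV data rows without the header
 * @param callable(string): bool $upload   hypothetical uploader, true on success
 */
function importInChunks(
    string $header,
    array $dataRows,
    int $chunkSize,
    int $chunkRetries,
    int $chunkRetryDelay,
    callable $upload
): void {
    $chunking = $chunkSize > 0;
    // chunkSize 0 disables chunking: everything goes out as one block.
    $chunks = $chunking ? array_chunk($dataRows, $chunkSize) : [$dataRows];
    // Retries and delays only take effect when chunking is enabled.
    $attempts = 1 + ($chunking ? $chunkRetries : 0);

    foreach ($chunks as $index => $rows) {
        // Assumption: each block repeats the header so it is a valid CSV by itself.
        $csv = $header . "\n" . implode("\n", $rows) . "\n";

        for ($attempt = 1; $attempt <= $attempts; $attempt++) {
            // Pause before every request except the very first one of the run.
            if ($chunking && $chunkRetryDelay > 0 && ($index > 0 || $attempt > 1)) {
                sleep($chunkRetryDelay);
            }
            if ($upload($csv)) {
                continue 2; // block imported, move on to the next block
            }
        }
        throw new RuntimeException(
            sprintf('Block %d failed after %d attempt(s)', $index + 1, $attempts)
        );
    }
}

// Demo uploader that fails once and then succeeds, recording every request.
$requests = [];
$failNext = true;
$upload = function (string $csv) use (&$requests, &$failNext): bool {
    $requests[] = $csv;
    if ($failNext) {
        $failNext = false;
        return false;
    }
    return true;
};
importInChunks('a;b', ['1;1', '2;2', '3;3', '4;4', '5;5'], 2, 1, 0, $upload);
```

In the demo, five rows with `chunkSize` 2 produce three blocks, and the single simulated failure adds one retry, so the uploader is called four times.
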
### `cli` mode

Transformer and importer run on the same server.

```json
"fireflyImport": {
  "mode": "cli",
  "jsonConfig": "/opt/firefly-data-importer/storage/configurations/ubs-import.json",
  "importerCommand": "php /opt/firefly-data-importer/artisan importer:import",
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "timeout": 300,
  "environment": {
    "FIREFLY_III_URL": "https://localhost",
    "FIREFLY_III_ACCESS_TOKEN": "your-token-here"
  }
}
```

### `docker` mode

The transformer runs locally, the importer in Docker. The output directory must be mounted as a volume. `jsonConfig` is the path **inside the container**.

```json
"fireflyImport": {
  "mode": "docker",
  "jsonConfig": "/import/configs/ubs-import.json",
  "importerCommand": "docker exec firefly-importer php artisan importer:import",
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "timeout": 300
}
```

### `http` mode

The transformer runs locally, the importer is reachable over HTTP(S). Requires `ext-curl`.

**Requirements on the importer server:**

```text
CAN_POST_FILES=true
AUTO_IMPORT_SECRET=<secret> # at least 16 characters
```

```json
"fireflyImport": {
  "mode": "http",
  "importerUrl": "https://importer.your-server.com",
  "personalSecret": "your-auto-import-secret-min-16-chars",
  "accessToken": "your-firefly-iii-personal-access-token",
  "jsonConfig": "config/ubs-import.json",
  "chunkSize": 50,
  "chunkRetries": 3,
  "chunkRetryDelay": 10,
  "connectionTimeout": 10,
  "timeout": 300
}
```

The request goes to `POST {importerUrl}/autoupload?secret={personalSecret}` with the CSV and the JSON config as multipart fields. `accessToken` is sent as `Authorization: Bearer`. If `FIREFLY_III_ACCESS_TOKEN` is already set in the importer environment, `accessToken` can be omitted.

---

### Server-side configuration

For large imports, the bottleneck is usually the Firefly III Data Importer server, not the transformer. The following settings belong in the importer's environment (`.env` or `docker-compose.yml`):

| Setting | Recommended value | Note |
|---|---|---|
| `PHP_MEMORY_LIMIT` | `512M` – `2048M` | Docker environment variable. Increase it when PHP aborts with "Allowed memory size exhausted". |
| `CONNECTION_TIMEOUT` | `60` | Seconds for establishing the TCP connection to Firefly III. Default is ~31 s (π × 10). |
| `IGNORE_DUPLICATE_ERRORS` | `true` | Suppress duplicate-transaction warnings on repeated imports. |

**nginx reverse proxy** (if present):
```nginx
proxy_read_timeout 600s; # must exceed the longest single-block import
client_max_body_size 64M; # must cover the largest chunk CSV
```

**Docker Compose** example:
```yaml
services:
  firefly-importer:
    environment:
      - PHP_MEMORY_LIMIT=1024M
      - CONNECTION_TIMEOUT=60
      - IGNORE_DUPLICATE_ERRORS=true
```

---

### Usage

```bash
# Transform only (no import)
php bin/transformer.php transform input.csv config/config.json

# Transform and import into Firefly III
php bin/transformer.php transform input.csv config/config.json --do-import

# Watch mode: transform and import automatically when a new CSV arrives
php bin/transformer.php auto-import config/config.json --watch
```

---

## Architecture

```text
bin/transformer.php (CLI entry point)
↓
TransformerEngine (orchestration)
├─ ConfigurationLoader (load and validate config)
├─ CsvReader (read CSV)
├─ MetadataExtractor (metadata via regex)
├─ ColumnTransformer (apply transformations)
├─ CsvWriter (write CSV)
├─ FireflyImporter (Firefly III integration)
└─ DebugLogger (debug logs)
```

| Class | Responsibility |
| -------- | --------------- |
| `TransformerEngine` | Orchestrates the entire workflow |
| `ConfigurationLoader` | Loads and validates the JSON configuration |
| `CsvReader` | Reads the CSV including metadata |
| `MetadataExtractor` | Extracts metadata with regex |
| `ColumnTransformer` | Transforms columns (pipeline) |
| `CsvWriter` | Writes the CSV |
| `FireflyImporter` | Imports into Firefly III |
| `DebugLogger` | Static logger for debugging |

---

## Error Handling

### Common errors

#### "Input file not found"

```bash
# Check the file path
ls -la input.csv

# Use an absolute path if the relative one does not work
php bin/transformer.php transform /absolute/path/input.csv config.json
```

#### "Missing metadata: account_iban"

```bash
# Inspect the first lines of the CSV
head -5 input.csv

# Check lineNumber and regex in config.json
php bin/transformer.php validate config.json input.csv --debug
```

#### "Invalid JSON"

```bash
php -r "json_decode(file_get_contents('config/config.json'), true) or die('JSON invalid');"
```

#### "Configuration: 'csvStructure.headerLine' required"

```bash
diff config/config.json config/config.example.json
```

---

## Version & Changes

**v1.0.0 (3 May 2026)**

- ✅ Initial release
- ✅ 14 transformation types
- ✅ Metadata extraction with regex
- ✅ Debug mode
- ✅ Firefly III integration (cli / docker / http)
- ✅ Complete documentation

---

**License:** GPL-3.0
**Author:** PHP CSV Transformer Project
**Repository:** [git.andare.ch/david.reindl/ff-imp-preprocessor](https://git.andare.ch/david.reindl/ff-imp-preprocessor)
@@ -16,10 +16,9 @@ require_once __DIR__ . '/../vendor/autoload.php';
use UbsCsvTransformer\TransformerEngine;
use UbsCsvTransformer\ConfigurationLoader;
use UbsCsvTransformer\FireflyImporter;
use UbsCsvTransformer\DebugLogger;

// ============================================================================
// CLI argument processing
// CLI-Argument-Verarbeitung
// ============================================================================

$argc = $_SERVER['argc'] ?? 0;
@@ -30,10 +29,10 @@ if ($argc < 2) {
    exit(0);
}

// Debug mode can be enabled
// Debug-Modus aktivierbar
$debug = in_array('--debug', $argv) || in_array('-d', $argv);

// Extract command
// Extrahiere Kommando
$command = $argv[1];

try {
@@ -55,26 +54,11 @@ try {
// ============================================================================

/**
 * Returns true when the active shell locale is German (de_*)
 */
function isGermanLocale(): bool
{
    foreach (['LANG', 'LC_ALL', 'LC_MESSAGES', 'LANGUAGE'] as $var) {
        $val = getenv($var);
        if ($val !== false && $val !== '') {
            return str_starts_with(strtolower($val), 'de');
        }
    }
    return false;
}

/**
 * Show help and usage instructions
 * Zeige Hilfe und Verwendungsanleitung
 */
function showHelp(): void
{
    if (isGermanLocale()) {
        echo <<<'HELP_DE'
    echo <<<'HELP'
╔════════════════════════════════════════════════════════════════════════════╗
║ Firefly Import Preprocessor - Kommandozeilen-Tool ║
║ ║
@@ -100,10 +84,10 @@ KOMMANDOS:
Transformiert eine komplette CSV-Datei
Optionen:
--output=FILE, -o Output-Pfad (Standard: input-transformed.csv)
--do-import Nach der Transformation in Firefly III importieren
--no-import Nicht automatisch in Firefly III importieren
Beispiel:
transformer transform ubs-export.csv config.json
transformer transform ubs-export.csv config.json --do-import
transformer transform ubs-export.csv config.json -o import.csv

validate [config] [options]
Validiert die Konfigurationsdatei
@@ -186,150 +170,26 @@ KONFIGURATION:

DOKUMENTATION:

Siehe README.md für vollständige Dokumentation
Siehe README.md und UBS_Transformer_Guide.md für vollständige Dokumentation

LIZENZ:

GPL 3
MIT License

HELP_DE;
        return;
    }

    echo <<<'HELP_EN'
╔════════════════════════════════════════════════════════════════════════════╗
║ Firefly Import Preprocessor - Command Line Tool ║
║ ║
║ A lightweight PHP 8 tool for transforming UBS E-Banking exports ║
║ into a Firefly III compatible format. ║
╚════════════════════════════════════════════════════════════════════════════╝

USAGE:
transformer [command] [options]

COMMANDS:

test [input] [config] [options]
Tests the transformation with a limited number of rows
Options:
--rows=N Process only N rows (default: 10)
--output=FILE, -o Also write result to file
Example:
transformer test ubs-export.csv config.json --rows=5
transformer test ubs-export.csv config.json -o test-output.csv

transform [input] [config] [options]
Transforms a complete CSV file
Options:
--output=FILE, -o Output path (default: input-transformed.csv)
--do-import Import into Firefly III after transformation
Example:
transformer transform ubs-export.csv config.json
transformer transform ubs-export.csv config.json --do-import

validate [config] [options]
Validates the configuration file
Options:
--strict Strict validation (recommended)
Example:
transformer validate config.json
transformer validate config.json --strict

auto-import [config] [options]
Monitors source directory and processes new files
Options:
--watch Continuous monitoring (daemon mode)
--interval=SEC Check interval in seconds (default: 60)
--dry-run Show what would be done (no actual processing)
Example:
transformer auto-import config.json
transformer auto-import config.json --watch --interval=30

help, -h, --help
Show this help

GLOBAL OPTIONS:
--debug, -d Enable debug mode (detailed output)

INSTALLATION:

1. PHP 8.1+ must be installed
php --version

2. Autoloader setup (choose one):
Option A: With Composer (recommended)
composer install
Option B: Manual - files in directory structure:
ff-imp-preprocessor/
├── bin/transformer.php
├── src/*.php
└── config/config.json

3. Make executable:
chmod +x bin/transformer.php

4. Adjust configuration:
cp config/config.example.json config/config.json
nano config/config.json

EXAMPLES:

# Test transformation (first 5 rows)
./bin/transformer test data/ubs-export.csv config/config.json --rows=5

# Full transformation
./bin/transformer transform data/ubs-export.csv config/config.json \
--output=output/firefly-import.csv

# Validate configuration
./bin/transformer validate config/config.json --strict
||||
# Start auto-import with monitoring
|
||||
./bin/transformer auto-import config/config.json --watch
|
||||
|
||||
# Process only next file
|
||||
./bin/transformer auto-import config/config.json
|
||||
|
||||
CONFIGURATION:
|
||||
|
||||
The config.json must have the following structure:
|
||||
{
|
||||
"metadata": { "extractionRules": {...} },
|
||||
"csvStructure": { "delimiter": ";", ... },
|
||||
"columnTransformations": { ... },
|
||||
"fireflyImport": { "apiUrl": "...", "apiKey": "..." },
|
||||
"directories": {
|
||||
"source": "./import/source",
|
||||
"output": "./import/output",
|
||||
"archive": "./import/archive",
|
||||
"error": "./import/error"
|
||||
}
|
||||
}
|
||||
|
||||
DOCUMENTATION:
|
||||
|
||||
See README.md for full documentation
|
||||
|
||||
LICENSE:
|
||||
|
||||
GPL 3
|
||||
|
||||
HELP_EN;
|
||||
HELP;
|
||||
}
|
||||
|
||||
/**
|
||||
* Expands ~ to absolute home directory and resolves relative paths
|
||||
* Expandiert ~ zu absolutem Home-Verzeichnis und löst relative Pfade auf
|
||||
*/
|
||||
function expandPath(string $path): string
|
||||
{
|
||||
if (str_starts_with($path, '~/') || $path === '~') {
|
||||
$homeEnv = getenv('HOME');
|
||||
$pwInfo = posix_getpwuid(posix_getuid());
|
||||
$home = $homeEnv !== false && $homeEnv !== '' ? $homeEnv : ($pwInfo !== false ? $pwInfo['dir'] : '~');
|
||||
$home = getenv('HOME') ?: posix_getpwuid(posix_getuid())['dir'];
|
||||
$path = $home . substr($path, 1);
|
||||
}
|
||||
|
||||
// Resolve relative paths against cwd (without realpath, so non-existent dirs are allowed)
|
||||
// Relative Pfade gegen cwd auflösen (ohne realpath, damit nicht-existierende Dirs erlaubt sind)
|
||||
if (!str_starts_with($path, '/')) {
|
||||
$path = getcwd() . '/' . $path;
|
||||
}
|
||||
@ -338,7 +198,7 @@ function expandPath(string $path): string
|
||||
}
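The two rules visible in the `expandPath()` hunk above (tilde expansion first, then resolution against the working directory) can be sketched as a pure function. This is an illustration under stated assumptions, not the repository's code: `expandPathWith` and its `$home` / `$cwd` parameters are hypothetical, introduced so the rules can be exercised without reading `HOME` or calling `getcwd()`. The part of the real function that lies between the two hunks is not reproduced.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, NOT part of the repository: applies the same two rules
// as expandPath(), but takes the home and working directory as arguments.
function expandPathWith(string $path, string $home, string $cwd): string
{
    // "~" or "~/..." is replaced by the home directory.
    // In PHP 8, substr('~', 1) returns '' so a bare "~" becomes $home.
    if ($path === '~' || str_starts_with($path, '~/')) {
        $path = $home . substr($path, 1);
    }

    // Anything that is still not absolute is resolved against the working
    // directory. No realpath() here, so the target does not have to exist.
    // The rtrim() is an addition of this sketch to avoid a double slash.
    if (!str_starts_with($path, '/')) {
        $path = rtrim($cwd, '/') . '/' . $path;
    }

    return $path;
}

echo expandPathWith('~/import/out.csv', '/home/alice', '/srv/app'), "\n"; // /home/alice/import/out.csv
echo expandPathWith('data/in.csv', '/home/alice', '/srv/app'), "\n";      // /srv/app/data/in.csv
echo expandPathWith('/tmp/x.csv', '/home/alice', '/srv/app'), "\n";       // /tmp/x.csv
```

Passing the home directory in explicitly also sidesteps the difference between the two variants in the diff: one guards against `getenv('HOME')` and `posix_getpwuid()` returning `false`, the other indexes the result directly.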
|
||||
|
||||
/**
|
||||
* Parses CLI options into an associative array
|
||||
* Parse CLI-Optionen in assoziatives Array
|
||||
*/
|
||||
function parseOptions(array $argv, int $startIndex = 0): array
|
||||
{
|
||||
@ -357,9 +217,9 @@ function parseOptions(array $argv, int $startIndex = 0): array
|
||||
}
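The body of `parseOptions()` is not part of this diff, so the following is only a guess at how such a parser could behave, based on how the handlers read the result (`$options['output'] ?? $options['o']`, `isset($options['debug'])`, `$options['interval']`) and on the help text (`--output=FILE`, `-o import.csv`). The name `parseOptionsSketch` and every detail of the parsing rules are assumptions.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch, NOT the repository's parseOptions().
// Rules assumed here:
//   --key=value  => ['key' => 'value']
//   --flag       => ['flag' => true]
//   -x VALUE     => ['x' => 'VALUE'] when the next argument is not an option
//   -x           => ['x' => true] otherwise
// Positional arguments are ignored.
function parseOptionsSketch(array $argv, int $startIndex = 0): array
{
    $options = [];
    $count = count($argv);

    for ($i = $startIndex; $i < $count; $i++) {
        $arg = (string) $argv[$i];

        if (str_starts_with($arg, '--')) {
            // Split at the first "=" only, so values may contain "=".
            $parts = explode('=', substr($arg, 2), 2);
            $options[$parts[0]] = $parts[1] ?? true;
        } elseif (strlen($arg) === 2 && $arg[0] === '-') {
            $next = $argv[$i + 1] ?? null;
            if (is_string($next) && !str_starts_with($next, '-')) {
                $options[$arg[1]] = $next;
                $i++; // the value has been consumed
            } else {
                $options[$arg[1]] = true;
            }
        }
    }

    return $options;
}

var_export(parseOptionsSketch(
    ['transformer', 'transform', 'in.csv', 'config.json', '--do-import', '-o', 'import.csv'],
    4
));
```

One consequence of these assumed rules: a short flag such as `-d` swallows a following positional argument, so flags would have to come after the input and config paths, as they do in the help examples.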
|
||||
|
||||
/**
|
||||
* Tests transformation with a limited number of rows
|
||||
* Teste Transformation mit begrenzter Zeilenzahl
|
||||
*/
|
||||
function handleTest(int $argc, array $argv): void
|
||||
function handleTest($argc, $argv): void
|
||||
{
|
||||
if ($argc < 4) {
|
||||
throw new Exception("Usage: transformer test [input-file] [config-file] [options]");
|
||||
@ -374,10 +234,10 @@ function handleTest(int $argc, array $argv): void
|
||||
$outputFile = $options['output'] ?? $options['o'] ?? null;
|
||||
|
||||
if (!file_exists($inputFile)) {
|
||||
throw new Exception("Input file not found: $inputFile");
|
||||
throw new Exception("Input-Datei nicht gefunden: $inputFile");
|
||||
}
|
||||
if (!file_exists($configFile)) {
|
||||
throw new Exception("Configuration file not found: $configFile");
|
||||
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
|
||||
}
|
||||
|
||||
echo "\n📊 TEST-MODUS: Verarbeite max. $maxRows Zeilen\n";
|
||||
@ -428,17 +288,13 @@ function handleTest(int $argc, array $argv): void
|
||||
echo "\n💾 Output-Datei: $outputFile\n";
|
||||
}
|
||||
|
||||
if ($debug) {
|
||||
echo DebugLogger::format(true);
|
||||
}
|
||||
|
||||
echo "\n✅ Test erfolgreich!\n\n";
|
||||
}
|
||||
|
||||
/**
|
||||
* Transforms a complete CSV file
|
||||
* Transformiere komplette CSV-Datei
|
||||
*/
|
||||
function handleTransform(int $argc, array $argv): void
|
||||
function handleTransform($argc, $argv): void
|
||||
{
|
||||
if ($argc < 4) {
|
||||
throw new Exception("Usage: transformer transform [input-file] [config-file] [options]");
|
||||
@ -450,23 +306,21 @@ function handleTransform(int $argc, array $argv): void
|
||||
$debug = isset($options['debug']) || isset($options['d']);
|
||||
|
||||
$outputFile = $options['output'] ?? $options['o'] ?? null;
|
||||
$doImport = isset($options['do-import']);
|
||||
$resetImport = isset($options['reset-import']);
|
||||
|
||||
if (!file_exists($inputFile)) {
|
||||
throw new Exception("Input file not found: $inputFile");
|
||||
throw new Exception("Input-Datei nicht gefunden: $inputFile");
|
||||
}
|
||||
if (!file_exists($configFile)) {
|
||||
throw new Exception("Configuration file not found: $configFile");
|
||||
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
|
||||
}
|
||||
|
||||
echo "\n🚀 TRANSFORMATION\n";
|
||||
echo "\n🚀 TRANSFORMATION STARTEN\n";
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n";
|
||||
|
||||
$configLoader = new ConfigurationLoader($configFile);
|
||||
$config = $configLoader->load();
|
||||
$configLoader->load();
|
||||
|
||||
// --output overrides target directory and filename from configuration
|
||||
// --output überschreibt Zielverzeichnis und Dateiname aus der Konfiguration
|
||||
if ($outputFile !== null) {
|
||||
$outputFile = expandPath($outputFile);
|
||||
$configLoader->set('directories.output', dirname($outputFile));
|
||||
@ -476,164 +330,17 @@ function handleTransform(int $argc, array $argv): void
|
||||
$engine = new TransformerEngine($configLoader, $debug);
|
||||
$result = $engine->transform($inputFile);
|
||||
|
||||
echo "✅ Transformation complete!\n";
|
||||
echo " Output file: " . ($result['outputFile'] ?? 'N/A') . "\n";
|
||||
echo " Rows transformed: " . ($result['rowsProcessed'] ?? 0) . "\n";
|
||||
echo "✅ Transformation erfolgreich!\n";
|
||||
echo " Output-Datei: " . ($result['outputFile'] ?? 'N/A') . "\n";
|
||||
echo " Zeilen transformiert: " . ($result['rowsProcessed'] ?? 0) . "\n";
|
||||
|
||||
if ($doImport) {
|
||||
if (!empty($config['fireflyImport'])) {
|
||||
echo "\n🚀 FIREFLY III IMPORT\n";
|
||||
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n";
|
||||
|
||||
$fireflyConfig = $config['fireflyImport'];
|
||||
$importer = new FireflyImporter($fireflyConfig);
|
||||
|
||||
$outputCsv = $result['outputFile'] ?? '';
|
||||
|
||||
if ($resetImport) {
|
||||
$importer->resetImportState($outputCsv);
|
||||
echo " ℹ️ Import state cleared — starting fresh.\n";
|
||||
} elseif ($importer->hasResumeState($outputCsv)) {
|
||||
$stateRaw = @file_get_contents($outputCsv . '.ffi-state.json');
|
||||
$stateData = is_string($stateRaw) ? json_decode($stateRaw, true) : null;
|
||||
if (is_array($stateData)) {
|
||||
$doneSoFar = count((array) ($stateData['completed_chunks'] ?? []));
|
||||
$totalSoFar = (int) ($stateData['total_chunks'] ?? 0);
|
||||
echo " ℹ️ Resuming previous import: {$doneSoFar}/{$totalSoFar} chunks already completed.\n";
|
||||
echo " Add --reset-import to start over from scratch.\n";
|
||||
}
|
||||
}
|
||||
|
||||
$inChunkedMode = false;
|
||||
|
||||
// Detect the system timezone: PHP CLI often defaults to UTC even when the OS
|
||||
// is configured otherwise. Read /etc/localtime symlink to get the real zone.
|
||||
$localTzName = date_default_timezone_get();
|
||||
if (is_link('/etc/localtime')) {
|
||||
$link = (string) readlink('/etc/localtime');
|
||||
if (preg_match('#zoneinfo/(.+)$#', $link, $tzMatch) === 1) {
|
||||
$localTzName = $tzMatch[1];
|
||||
}
|
||||
}
|
||||
$localTz = new \DateTimeZone($localTzName);
|
||||
|
||||
$importer->setProgressCallback(
|
||||
function (string $event, array $data) use (&$inChunkedMode, $localTz): void {
|
||||
static $chunkHadRetry = false;
|
||||
$ts = '[' . (new \DateTimeImmutable('now', $localTz))->format('H:i:s') . ']';
|
||||
if ($event === 'chunk_start') {
|
||||
$inChunkedMode = true;
|
||||
$chunkHadRetry = false;
|
||||
echo " ⏳ {$ts} Chunk {$data['chunk']}/{$data['total']} ({$data['rows']} rows)...";
|
||||
flush();
|
||||
} elseif ($event === 'chunk_done') {
|
||||
$d = round((float) ($data['result']['duration'] ?? 0), 1);
|
||||
$status = $data['result']['success'] ? 'done' : 'failed';
|
||||
if ($chunkHadRetry) {
|
||||
// After retries the line is already terminated — print a full new line
|
||||
echo " ✅ {$ts} Chunk {$data['chunk']}/{$data['total']}: {$status} ({$d}s)\n";
|
||||
} else {
|
||||
echo " {$status} ({$d}s)\n";
|
||||
}
|
||||
flush();
|
||||
} elseif ($event === 'chunk_retry') {
|
||||
$chunkHadRetry = true;
|
||||
$err = (string) ($data['error'] ?? '');
|
||||
$msg = $err !== '' ? " — {$err}" : '';
|
||||
echo "\n 🔄 {$ts} Chunk {$data['chunk']}/{$data['total']}: attempt {$data['attempt']}/{$data['max_attempts']} failed{$msg}\n";
|
||||
flush();
|
||||
} elseif ($event === 'chunk_delay') {
|
||||
$ctx = ($data['context'] ?? '') === 'retry' ? 'retry' : 'next chunk';
|
||||
echo " ⏸ {$ts} Waiting {$data['seconds']}s before {$ctx}...\n";
|
||||
flush();
|
||||
} elseif ($event === 'chunk_skip') {
|
||||
echo " ⏭ {$ts} Chunk {$data['chunk']}/{$data['total']} already completed — skipping\n";
|
||||
flush();
|
||||
} elseif ($event === 'request_start' && !$inChunkedMode) {
|
||||
echo " ⏳ {$ts} Sending to importer...\n";
|
||||
flush();
|
||||
}
|
||||
}
|
||||
);
|
||||
|
||||
$outputDelimiter = (string) ($config['csvStructure']['outputDelimiter'] ?? ',');
|
||||
$importResult = $importer->importChunked($outputCsv, $outputDelimiter);
|
||||
|
||||
$duration = $importResult['duration'] ?? null;
|
||||
$chunks = $importResult['chunks'] ?? null;
|
||||
$summary = $importResult['summary'] ?? null;
|
||||
|
||||
if ($importResult['success']) {
|
||||
if (is_array($summary)) {
|
||||
$created = $summary['created'] ?? 0;
|
||||
$byType = $summary['by_type'] ?? [];
|
||||
$completed = $summary['completed'] ?? false;
|
||||
$duplicates = $summary['duplicates'] ?? 0;
|
||||
$errors = $summary['errors'] ?? [];
|
||||
|
||||
$status = $completed ? '✅ Import complete!' : '⚠️ Import finished (no "Done!" marker received)';
|
||||
echo $status . ($duration !== null ? " ({$duration}s)" : '') . "\n";
|
||||
echo " Transactions created: {$created}\n";
|
||||
|
||||
$typeLabels = ['deposit' => 'Deposits', 'withdrawal' => 'Withdrawals', 'transfer' => 'Transfers'];
|
||||
foreach ($byType as $type => $count) {
|
||||
$label = $typeLabels[$type] ?? ucfirst($type);
|
||||
echo " {$label}: {$count}\n";
|
||||
}
|
||||
|
||||
if ($duplicates > 0) {
|
||||
echo " ⚠️ Duplicates skipped: {$duplicates}\n";
|
||||
}
|
||||
|
||||
if (!empty($errors)) {
|
||||
$errorCount = count($errors);
|
||||
echo " ❌ Errors ({$errorCount}):\n";
|
||||
foreach ($errors as $err) {
|
||||
echo " - {$err}\n";
|
||||
}
|
||||
}
|
||||
} else {
|
||||
echo "✅ Import complete!" . ($duration !== null ? " ({$duration}s)" : '') . "\n";
|
||||
if (!empty($importResult['output']['stdout'])) {
|
||||
echo $importResult['output']['stdout'] . "\n";
|
||||
}
|
||||
}
|
||||
} else {
|
||||
$errorMsg = $importResult['error']
|
||||
?? ('HTTP ' . ($importResult['exit_code'] ?? '?'));
|
||||
$chunksData = $importResult['chunks'] ?? null;
|
||||
if (is_array($chunksData) && $chunksData['total'] > 1) {
|
||||
$failedChunk = $chunksData['done'] + 1;
|
||||
echo "❌ Import failed at chunk {$failedChunk}/{$chunksData['total']}: {$errorMsg}\n";
|
||||
echo " Run the same command again to resume from where it stopped.\n";
|
||||
echo " Add --reset-import to start over from scratch.\n";
|
||||
} else {
|
||||
echo "❌ Import failed: {$errorMsg}\n";
|
||||
}
|
||||
// Only dump the raw response body in debug mode
|
||||
if ($debug && !empty($importResult['output']['stdout'])) {
|
||||
echo $importResult['output']['stdout'] . "\n";
|
||||
}
|
||||
if (!empty($importResult['output']['stderr'])) {
|
||||
echo $importResult['output']['stderr'] . "\n";
|
||||
}
|
||||
}
|
||||
} else {
|
||||
echo "\n⚠️ --do-import specified but no fireflyImport section found in config.\n";
|
||||
}
|
||||
}
|
||||
|
||||
if ($debug) {
|
||||
echo DebugLogger::format(true);
|
||||
}
|
||||
|
||||
echo "\n✅ Done!\n\n";
|
||||
echo "\n✅ Fertig!\n\n";
|
||||
}
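The timezone detection inside `handleTransform()` (main side of the diff) reads the `/etc/localtime` symlink and takes everything after `zoneinfo/` as the zone name. A small sketch of that step in isolation follows. The function name `zoneNameFromLink` and the validation step are assumptions of this sketch. The code in the diff passes the extracted name straight to `new \DateTimeZone()`, which throws for an unknown identifier.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, NOT part of the repository: extracts a timezone name
// from a symlink target such as "/usr/share/zoneinfo/Europe/Zurich" and
// falls back to $fallback when nothing usable is found.
function zoneNameFromLink(string $linkTarget, string $fallback): string
{
    // Same pattern as in the diff: capture everything after "zoneinfo/".
    if (preg_match('#zoneinfo/(.+)$#', $linkTarget, $m) !== 1) {
        return $fallback;
    }

    // Added in this sketch: make sure PHP actually knows the identifier.
    // DateTimeZone's constructor throws an \Exception (or a subclass of it)
    // for unknown names.
    try {
        new \DateTimeZone($m[1]);
    } catch (\Exception $e) {
        return $fallback;
    }

    return $m[1];
}

$link = is_link('/etc/localtime') ? (string) readlink('/etc/localtime') : '';
echo zoneNameFromLink($link, date_default_timezone_get()), "\n";
```

The validation matters on systems where the link target carries an extra path segment such as `zoneinfo/posix/...`, which the regex would capture as part of the name.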
|
||||
|
||||
/**
|
||||
* Validates the configuration file
|
||||
* Validiere Konfigurationsdatei
|
||||
*/
|
||||
function handleValidate(int $argc, array $argv): void
|
||||
function handleValidate($argc, $argv): void
|
||||
{
|
||||
if ($argc < 3) {
|
||||
throw new Exception("Usage: transformer validate [config-file] [options]");
|
||||
@ -644,7 +351,7 @@ function handleValidate(int $argc, array $argv): void
|
||||
$strict = isset($options['strict']);
|
||||
|
||||
if (!file_exists($configFile)) {
|
||||
throw new Exception("Configuration file not found: $configFile");
|
||||
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
|
||||
}
|
||||
|
||||
echo "\n✔️ KONFIGURATION VALIDIEREN\n";
|
||||
@ -655,7 +362,7 @@ function handleValidate(int $argc, array $argv): void
|
||||
try {
|
||||
$config = $configLoader->load();
|
||||
|
||||
// Basic validation
|
||||
// Basis-Validierung
|
||||
echo "✅ JSON-Format valide\n";
|
||||
|
||||
$required = ['metadata', 'csvStructure', 'columnTransformations'];
|
||||
@ -670,7 +377,7 @@ function handleValidate(int $argc, array $argv): void
|
||||
}
|
||||
}
|
||||
|
||||
// Firefly validation
|
||||
// Firefly-Validierung
|
||||
if (isset($config['fireflyImport'])) {
|
||||
echo "✅ Firefly III Konfiguration vorhanden\n";
|
||||
if (empty($config['fireflyImport']['apiUrl'])) {
|
||||
@ -689,7 +396,7 @@ function handleValidate(int $argc, array $argv): void
|
||||
echo "⚠️ Firefly III Konfiguration nicht vorhanden (optional)\n";
|
||||
}
|
||||
|
||||
// Directory validation
|
||||
// Verzeichnisse-Validierung
|
||||
if (isset($config['directories'])) {
|
||||
echo "✅ Verzeichnisse konfiguriert\n";
|
||||
$dirs = ['source', 'output', 'archive', 'error'];
|
||||
@ -709,14 +416,14 @@ function handleValidate(int $argc, array $argv): void
|
||||
echo "\n⚠️ Konfiguration hat Warnungen aber ist funktional\n\n";
|
||||
}
|
||||
} catch (Exception $e) {
|
||||
throw new Exception("Validation error: " . $e->getMessage());
|
||||
throw new Exception("Validierungsfehler: " . $e->getMessage());
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Auto-import with directory monitoring
|
||||
* Auto-Import mit Verzeichnis-Überwachung
|
||||
*/
|
||||
function handleAutoImport(int $argc, array $argv): void
|
||||
function handleAutoImport($argc, $argv): void
|
||||
{
|
||||
if ($argc < 3) {
|
||||
throw new Exception("Usage: transformer auto-import [config-file] [options]");
|
||||
@ -727,7 +434,7 @@ function handleAutoImport(int $argc, array $argv): void
|
||||
$debug = isset($options['debug']) || isset($options['d']);
|
||||
|
||||
if (!file_exists($configFile)) {
|
||||
throw new Exception("Configuration file not found: $configFile");
|
||||
throw new Exception("Konfigurationsdatei nicht gefunden: $configFile");
|
||||
}
|
||||
|
||||
$configLoader = new ConfigurationLoader($configFile);
|
||||
@ -741,7 +448,7 @@ function handleAutoImport(int $argc, array $argv): void
|
||||
$watch = isset($options['watch']);
|
||||
$interval = isset($options['interval']) ? (int)$options['interval'] : 60;
|
||||
|
||||
// Create directories
|
||||
// Verzeichnisse erstellen
|
||||
foreach ([$sourceDir, $outputDir, $archiveDir, $errorDir] as $dir) {
|
||||
if (!is_dir($dir)) {
|
||||
mkdir($dir, 0755, true);
|
||||
@ -769,9 +476,7 @@ function handleAutoImport(int $argc, array $argv): void
|
||||
|
||||
if ($watch) {
|
||||
echo "⏳ Drücke Ctrl+C zum Beenden.\n\n";
|
||||
$running = true;
|
||||
/** @phpstan-ignore while.alwaysTrue (intentional infinite loop — terminated only via Ctrl+C / SIGINT) */
|
||||
while ($running) {
|
||||
while (true) {
|
||||
processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir, $config, $configFile, $dryRun, $debug);
|
||||
sleep($interval);
|
||||
}
|
||||
@ -781,9 +486,9 @@ function handleAutoImport(int $argc, array $argv): void
|
||||
}
|
||||
|
||||
/**
|
||||
* Processes directory containing CSV files
|
||||
* Verarbeite Verzeichnis mit CSV-Dateien
|
||||
*/
|
||||
function processImportDirectory(string $sourceDir, string $outputDir, string $archiveDir, string $errorDir, array $config, string $configFile, bool $dryRun = false, bool $debug = false): void
|
||||
function processImportDirectory($sourceDir, $outputDir, $archiveDir, $errorDir, $config, $configFile, $dryRun = false, $debug = false): void
|
||||
{
|
||||
if (!is_dir($sourceDir)) {
|
||||
return;
|
||||
@ -811,10 +516,10 @@ function processImportDirectory(string $sourceDir, string $outputDir, string $ar
|
||||
$result = $engine->transform($file);
|
||||
$outputFile = $result['outputFile'] ?? $outputFile;
|
||||
|
||||
// Archive original file
|
||||
// Archiviere Original-Datei
|
||||
$archiveFile = $archiveDir . '/' . $basename;
|
||||
if (!rename($file, $archiveFile)) {
|
||||
throw new Exception("Could not archive file");
|
||||
throw new Exception("Konnte nicht archivieren");
|
||||
}
|
||||
|
||||
// Firefly Import
|
||||
@ -829,7 +534,7 @@ function processImportDirectory(string $sourceDir, string $outputDir, string $ar
|
||||
echo "❌ " . $e->getMessage() . "\n";
|
||||
|
||||
if (!$dryRun) {
|
||||
// Move to error directory
|
||||
// Verschiebe zu Error-Verzeichnis
|
||||
$errorFile = $errorDir . '/' . $basename;
|
||||
@rename($file, $errorFile);
|
||||
}
|
||||
|
||||
@ -40,7 +40,6 @@
|
||||
"vimeo/psalm": "^5.0"
|
||||
},
|
||||
"suggest": {
|
||||
"ext-curl": "Benötigt für Modus fireflyImport.mode=http (HTTP-Upload an den Data Importer)",
|
||||
"monolog/monolog": "For advanced logging capabilities (optional)",
|
||||
"guzzlehttp/guzzle": "For Firefly III HTTP client integration (optional)"
|
||||
},
|
||||
|
||||
@ -199,20 +199,11 @@
|
||||
],
|
||||
|
||||
"fireflyImport": {
|
||||
"mode": "docker",
|
||||
|
||||
"jsonConfig": "/import/configs/ubs-import.json",
|
||||
|
||||
"importerCommand": "docker exec firefly-importer php artisan importer:import",
|
||||
|
||||
"jsonConfig": "/opt/firefly/import-config.json",
|
||||
"importerCommand": "docker exec -it firefly-importer php artisan importer:import",
|
||||
"autoImport": false,
|
||||
"deleteAfterImport": false,
|
||||
"chunkSize": 0,
|
||||
"chunkRetries": 0,
|
||||
"chunkRetryDelay": 0,
|
||||
"connectionTimeout": 10,
|
||||
"timeout": 300,
|
||||
|
||||
"environment": {
|
||||
"FIREFLY_III_URL": "https://your-firefly.com",
|
||||
"FIREFLY_III_ACCESS_TOKEN": "your-token-here"
|
||||
|
||||
@ -1,79 +0,0 @@
|
||||
{
|
||||
"version": 3,
|
||||
"source": "ff3-importer-2.1.1",
|
||||
"created_at": "2026-05-04T22:22:39+02:00",
|
||||
"date": "Y-m-d",
|
||||
"default_account": 1,
|
||||
"delimiter": "comma",
|
||||
"headers": true,
|
||||
"rules": true,
|
||||
"webhooks": true,
|
||||
"skip_form": false,
|
||||
"add_import_tag": true,
|
||||
"roles": [
|
||||
"amount_debit",
|
||||
"amount_credit",
|
||||
"date_transaction",
|
||||
"date_process",
|
||||
"opposing-name",
|
||||
"tags-comma",
|
||||
"description",
|
||||
"opposing-iban",
|
||||
"opposing-number",
|
||||
"note",
|
||||
"account-iban",
|
||||
"currency-code"
|
||||
],
|
||||
"do_mapping": [
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false,
|
||||
false
|
||||
],
|
||||
"mapping": {},
|
||||
"duplicate_detection_method": "classic",
|
||||
"ignore_duplicate_lines": false,
|
||||
"unique_column_index": 0,
|
||||
"unique_column_type": "note",
|
||||
"pseudo_identifier": [],
|
||||
"flow": "file",
|
||||
"content_type": "csv",
|
||||
"camt_type": "",
|
||||
"custom_tag": "test001",
|
||||
"identifier": "0",
|
||||
"connection": "0",
|
||||
"ignore_spectre_categories": false,
|
||||
"grouped_transaction_handling": "",
|
||||
"use_entire_opposing_address": false,
|
||||
"map_all_data": true,
|
||||
"pending_transactions": false,
|
||||
"access_token": "",
|
||||
"accounts": {},
|
||||
"new_accounts": [],
|
||||
"date_range": "",
|
||||
"date_range_number": 30,
|
||||
"date_range_unit": "d",
|
||||
"date_range_not_after_unit": "",
|
||||
"date_range_not_after_number": 0,
|
||||
"date_not_before": "",
|
||||
"date_not_after": "",
|
||||
"nordigen_country": "",
|
||||
"nordigen_bank": "",
|
||||
"nordigen_requisitions": {},
|
||||
"nordigen_max_days": "90",
|
||||
"lunch_flow_api_key": "",
|
||||
"enable_banking_country": "",
|
||||
"enable_banking_bank": "",
|
||||
"enable_banking_auth_id": "",
|
||||
"enable_banking_sessions": [],
|
||||
"conversion": false,
|
||||
"ignore_duplicate_transactions": true
|
||||
}
|
||||
@ -1,53 +0,0 @@
|
||||
{
|
||||
"_comment_1": "Firefly III Data Importer – configuration file (format version 3)",
|
||||
"_comment_2": "Created for the output of config-ubs-account.json (11 columns, comma-delimited)",
|
||||
"_comment_3": "Adjust: set 'default_account' to your Firefly III asset account ID (number, not name)",
|
||||
"_comment_4": "Docs: https://docs.firefly-iii.org/references/data-importer/json/",
|
||||
|
||||
"version": 3,
|
||||
"flow": "csv",
|
||||
|
||||
"date": "Y-m-d",
|
||||
"delimiter": "comma",
|
||||
"headers": true,
|
||||
"conversion": false,
|
||||
|
||||
"default_account": 1,
|
||||
|
||||
"rules": true,
|
||||
"skip_form": true,
|
||||
"add_import_tag": true,
|
||||
"duplicate_detection_method": "classic",
|
||||
"ignore_duplicate_lines": true,
|
||||
"ignore_duplicate_transactions": true,
|
||||
|
||||
"roles": [
|
||||
"amount_debit",
|
||||
"amount_credit",
|
||||
"date_transaction",
|
||||
"date_process",
|
||||
"opposing-name",
|
||||
"tags-comma",
|
||||
"description",
|
||||
"opposing-iban",
|
||||
"note",
|
||||
"account-iban",
|
||||
"currency-code"
|
||||
],
|
||||
|
||||
"do_mapping": {
|
||||
"0": false,
|
||||
"1": false,
|
||||
"2": false,
|
||||
"3": false,
|
||||
"4": false,
|
||||
"5": false,
|
||||
"6": false,
|
||||
"7": false,
|
||||
"8": false,
|
||||
"9": false,
|
||||
"10": false
|
||||
},
|
||||
|
||||
"mapping": {}
|
||||
}
|
||||
@ -3,36 +3,30 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Transforms columns according to configuration
|
||||
* Transformiert Spalten gemäß Konfiguration
|
||||
*
|
||||
* Supported transformation types (canonical names):
|
||||
* - map: Copy/rename column (default)
|
||||
* - replace: String replacement (str_replace)
|
||||
* - regex: Regex replace via preg_replace (backreferences: $1, $2 …)
|
||||
* - dateformat: Date formatting (toFormat: 'l' yields English weekday name)
|
||||
* - split: Split column at delimiter
|
||||
* - regexextract: Extract using regex
|
||||
* - trim: Remove whitespace
|
||||
* - uppercase: Convert to uppercase
|
||||
* - lowercase: Convert to lowercase
|
||||
* - ucwordsfirst: Capitalise first letter after word boundaries (only when input
|
||||
* has no lowercase letters; strings already mixed-case are returned
|
||||
* unchanged)
|
||||
* - truncate: Truncate string to maximum length
|
||||
* - constantvalue: Constant value from metadata
|
||||
* - pipeline: Chain multiple transformations (via steps[])
|
||||
* - custom: Custom PHP callback
|
||||
* - timeperiod: Map a time string to a period label (morning, evening, …)
|
||||
* Unterstützte Transformationstypen (canonical names):
|
||||
* - map: Spalte kopieren/umbenennen (Standard)
|
||||
* - replace: String-Replacement (str_replace)
|
||||
* - regex: Regex-Replace mit preg_replace (Backreferenzen: $1, $2 …)
|
||||
* - dateformat: Datum-Formatierung
|
||||
* - split: Spalte bei Delimiter teilen
|
||||
* - regexextract: Mit Regex extrahieren
|
||||
* - trim: Whitespace entfernen
|
||||
* - uppercase: In Grossbuchstaben umwandeln
|
||||
* - lowercase: In Kleinbuchstaben umwandeln
|
||||
* - ucwordsfirst: Ersten Buchstaben nach Worttrennern gross
|
||||
* - truncate: String auf maximale Länge kürzen
|
||||
* - constantvalue: Konstanten-Wert aus Metadaten
|
||||
* - pipeline: Mehrere Transformationen hintereinander (via steps[])
|
||||
* - custom: Custom PHP-Callback
|
||||
*
|
||||
* Supported outputAction values:
|
||||
* - create / overwrite: Set target column (default)
|
||||
* - append: Append value directly; optional "appendDelimiter" inserts a separator
|
||||
* between existing and new value (skipped when target is still empty)
|
||||
* - append-if-not-empty: Like append, but skips entirely when the transformation result is
|
||||
* empty (safe for optional values like tags and notes lines)
|
||||
* - append-line: Append value on new line (no leading newline if target is empty)
|
||||
* - overwrite-if-empty: Only set if target column is empty
|
||||
* - overwrite-if-not-empty: Only set if transformation result is not empty
|
||||
* Unterstützte outputAction-Werte:
|
||||
* - create / overwrite: Ziel-Spalte setzen (Standard)
|
||||
* - append: Wert anhängen
|
||||
* - append-line: Wert auf neuer Zeile anhängen (kein Leerzeichen wenn Ziel leer)
|
||||
* - overwrite-if-empty: Nur setzen wenn Ziel-Spalte leer
|
||||
* - overwrite-if-not-empty: Nur setzen wenn Ergebnis nicht leer
|
||||
*/
|
||||
class ColumnTransformer
|
||||
{
|
||||
@ -42,11 +36,11 @@ class ColumnTransformer
|
||||
private array $globalExceptions;
|
||||
|
||||
/**
|
||||
* Initialises ColumnTransformer with transformation rules
|
||||
* Initialisiert ColumnTransformer mit Transformationsregeln
|
||||
*
|
||||
* @param array $transformations Transformation configuration from config.json
|
||||
* @param array $metadata Extracted metadata from CSV header
|
||||
* @param array $globalExceptions Global exceptions list for ucwordsfirst
|
||||
* @param array $transformations Transformationskonfiguration aus config.json
|
||||
* @param array $metadata Extrahierte Metadaten aus CSV-Header
|
||||
* @param array $globalExceptions Globale Ausnahmeliste für ucwordsfirst
|
||||
*/
|
||||
public function __construct(array $transformations, array $metadata = [], array $globalExceptions = [])
|
||||
{
|
||||
@ -57,36 +51,36 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Transforms a single data row
|
||||
* Transformiert eine einzelne Datenzeile
|
||||
*
|
||||
* Applies all defined transformations to the row.
|
||||
* Can generate new columns (e.g. for regex_extract).
|
||||
* Wendet alle definierten Transformationen auf die Zeile an.
|
||||
* Kann neue Spalten generieren (z.B. bei regex_extract).
|
||||
*
|
||||
* @param array $row Data row with header keys as array keys
|
||||
* @param array $row Datenzeile mit Header-Keys als Array-Keys
|
||||
*
|
||||
* @return array Transformed data row
|
||||
* @return array Transformierte Datenzeile
|
||||
*/
|
||||
public function transformRow(array $row): array
|
||||
{
|
||||
$transformedRow = $row;
|
||||
|
||||
foreach ($this->transformations as $config) {
|
||||
// Multi-output detection (for split)
|
||||
// Multi-Output Detection (für split)
|
||||
if (isset($config['outputs']) && is_array($config['outputs'])) {
|
||||
// Multi-output transformation (e.g. split into multiple columns)
|
||||
// Multi-Output Transformation (z.B. split in mehrere Spalten)
|
||||
$multiOutputResult = $this->handleMultiOutputTransformation($transformedRow, $config);
|
||||
|
||||
// Merge results into transformedRow
|
||||
// Merge Ergebnisse in transformedRow
|
||||
foreach ($multiOutputResult as $columnName => $value) {
|
||||
$transformedRow[$columnName] = $value;
|
||||
|
||||
// Register new columns
|
||||
// Registriere neue Spalten
|
||||
if (!in_array($columnName, $this->outputColumns)) {
|
||||
$this->outputColumns[] = $columnName;
|
||||
}
|
||||
}
|
||||
|
||||
// Continue with next transformation
|
||||
// Fahre mit nächster Transformation fort
|
||||
continue;
|
||||
}
|
||||
|
||||
@ -96,7 +90,7 @@ class ColumnTransformer
|
||||
|
||||
if (empty($targetColumn)) {
|
||||
throw new \RuntimeException(
|
||||
"Transformation missing 'outputColumn' field: " . json_encode($config)
|
||||
"Transformation fehlt 'outputColumn' Feld: " . json_encode($config)
|
||||
);
|
||||
}
|
||||
|
||||
@ -130,38 +124,23 @@ class ColumnTransformer
|
||||
// Apply output action
|
||||
switch ($outputAction) {
|
||||
case 'append':
|
||||
$existing = $transformedRow[$targetColumn] ?? '';
|
||||
if (isset($config['appendDelimiter']) && $existing !== '') {
|
||||
$transformedRow[$targetColumn] = $existing . (string) $config['appendDelimiter'] . $resultValue;
|
||||
} else {
|
||||
$transformedRow[$targetColumn] = $existing . $resultValue;
|
||||
}
|
||||
break;
|
||||
case 'append-if-not-empty':
|
||||
if ($resultValue !== '') {
|
||||
$existing = $transformedRow[$targetColumn] ?? '';
|
||||
if (isset($config['appendDelimiter']) && $existing !== '') {
|
||||
$transformedRow[$targetColumn] = $existing . (string) $config['appendDelimiter'] . $resultValue;
|
||||
} else {
|
||||
$transformedRow[$targetColumn] = $existing . $resultValue;
|
||||
}
|
||||
}
|
||||
$transformedRow[$targetColumn] = ($transformedRow[$targetColumn] ?? '') . $resultValue;
|
||||
break;
|
||||
case 'append-line':
|
||||
// Append value on new line; no leading newline if target is empty
|
||||
// Wert auf neuer Zeile anhängen; kein führender Zeilenumbruch wenn Ziel leer
|
||||
if ($resultValue !== '') {
|
||||
$existing = $transformedRow[$targetColumn] ?? '';
|
||||
$transformedRow[$targetColumn] = $existing !== '' ? $existing . "\n" . $resultValue : $resultValue;
|
||||
}
|
||||
break;
|
||||
case 'overwrite-if-empty':
|
||||
// Only overwrite if target column is empty
|
||||
// Nur überschreiben wenn Ziel-Spalte leer ist
|
||||
if (($transformedRow[$targetColumn] ?? '') === '') {
|
||||
$transformedRow[$targetColumn] = $resultValue;
|
||||
}
|
||||
break;
|
||||
case 'overwrite-if-not-empty':
|
||||
// Only overwrite if the transformation result is not empty
|
||||
// Nur überschreiben wenn das Transformations-Ergebnis nicht leer ist
|
||||
if ($resultValue !== '') {
|
||||
$transformedRow[$targetColumn] = $resultValue;
|
||||
}
|
||||
@ -178,14 +157,14 @@ class ColumnTransformer
|
||||
}
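The `outputAction` branches shown in the hunk above (main side, including `appendDelimiter` and `append-if-not-empty`) can be condensed into one function over plain strings. This is a sketch for reading the semantics, not the class's actual code: `applyOutputActionSketch` is a hypothetical name, and a missing target column is modelled as the empty string, which is how the original reads it via `?? ''`.

```php
<?php
declare(strict_types=1);

// Hypothetical condensation of the switch in transformRow(); NOT repository code.
// $existing is the current value of the target column ('' if unset),
// $result is the value produced by the transformation.
function applyOutputActionSketch(
    string $existing,
    string $result,
    string $action = 'create',
    ?string $appendDelimiter = null
): string {
    // Shared by both append variants: the delimiter is only inserted when the
    // target already holds something.
    $append = static fn (): string => ($appendDelimiter !== null && $existing !== '')
        ? $existing . $appendDelimiter . $result
        : $existing . $result;

    switch ($action) {
        case 'append':
            return $append();
        case 'append-if-not-empty':
            return $result !== '' ? $append() : $existing;
        case 'append-line':
            if ($result === '') {
                return $existing;
            }
            return $existing !== '' ? $existing . "\n" . $result : $result;
        case 'overwrite-if-empty':
            return $existing === '' ? $result : $existing;
        case 'overwrite-if-not-empty':
            return $result !== '' ? $result : $existing;
        default: // create / overwrite
            return $result;
    }
}

// The difference between the two append variants shows up with an empty result:
var_export(applyOutputActionSketch('groceries', '', 'append', ', '));              // 'groceries, '
echo "\n";
var_export(applyOutputActionSketch('groceries', '', 'append-if-not-empty', ', ')); // 'groceries'
echo "\n";
```

On the release_1.0 side of the diff, `append` is the single concatenation line without delimiter support, which corresponds to calling the sketch with `$appendDelimiter = null`.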
|
||||
|
||||
/**
|
||||
* Applies a single transformation to a string value
|
||||
* Wendet eine einzelne Transformation auf einen Stringwert an
|
||||
*
|
||||
* Normalises the type name (snake_case, PascalCase, no-separator all accepted)
|
||||
* and delegates to the respective transformXxx() method.
|
||||
* Normalisiert den Typ-Namen (snake_case, PascalCase, no-separator alle akzeptiert)
|
||||
* und delegiert an die jeweilige transformXxx()-Methode.
|
||||
*
|
||||
* @param string $value Input value
|
||||
* @param array $config Transformation configuration
|
||||
* @return string Transformed value
|
||||
* @param string $value Eingabewert
|
||||
* @param array $config Transformationskonfiguration
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function applySingleTransformation(string $value, array $config): string
|
||||
{
|
||||
@ -226,9 +205,6 @@ class ColumnTransformer
|
||||
case 'pipeline':
|
||||
return $this->transformPipeline($value, $config);
|
||||
|
||||
case 'timeperiod':
|
||||
return $this->transformTimePeriod($value, $config);
|
||||
|
||||
case 'truncate':
|
||||
$maxLength = (int)($config['maxLength'] ?? 255);
|
||||
return mb_substr($value, 0, $maxLength, 'UTF-8');
|
||||
@ -243,8 +219,8 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Normalises transformation type names: lowercase, separators removed.
|
||||
* Allows e.g. 'dateformat' and 'dateFormat' to both work.
|
||||
* Normalisiert Transformationstyp-Namen: lowercase, Trennzeichen entfernt.
|
||||
* Erlaubt z.B. dass 'dateformat' und 'dateFormat' beide funktionieren.
|
||||
*/
|
||||
private function normalizeTransformType(string $type): string
|
||||
{
|
||||
@ -252,19 +228,19 @@ class ColumnTransformer
|
||||
}
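Review note: the body of `normalizeTransformType()` is not part of this hunk. A minimal sketch of what "lowercase, separators removed" could look like follows. The function name and the exact separator set (`_`, `-`, space) are assumptions, not taken from the source.

```php
<?php
// Hypothetical normaliser: strips assumed separators, then lowercases.
function normalizeTypeName(string $type): string
{
    return strtolower(str_replace(['_', '-', ' '], '', $type));
}
```

Under these assumptions `date_format`, `dateFormat` and `DateFormat` all normalise to `dateformat`, which is the behaviour the docblock describes.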
|
||||
|
||||
/**
|
||||
* String replacement transformation
|
||||
* String-Replacement Transformation
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "replace",
|
||||
* "search": "old",
|
||||
* "replace": "new"
|
||||
* "search": "Alt",
|
||||
* "replace": "Neu"
|
||||
* ```
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformReplace(string $value, array $config): string
|
||||
{
|
||||
@ -279,22 +255,22 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Regex replace transformation
|
||||
* Regex-Replace Transformation
|
||||
*
|
||||
* Applies a regular expression to the value and replaces the match.
|
||||
* Backreference syntax: $1, $2 etc. in the replace string.
|
||||
* Wendet einen regulären Ausdruck auf den Wert an und ersetzt den Treffer.
|
||||
* Backreferenz-Syntax: $1, $2 usw. im replace-String.
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "regex",
|
||||
* "pattern": "SumUp \\*+(.*)",
|
||||
* "replace": "[$1]"
|
||||
* ```
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformRegex(string $value, array $config): string
|
||||
{
|
||||
@ -312,19 +288,19 @@ class ColumnTransformer
|
||||
}
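Review note: the docblock's `regex` example can be tried in isolation. The sketch below assumes the configured pattern carries no delimiters and wraps it in `#...#u`; how the real `transformRegex()` builds its pattern is not visible in this diff, and `regexReplaceSketch()` is a hypothetical name.

```php
<?php
// Hypothetical helper; a pattern containing '#' would need escaping first.
function regexReplaceSketch(string $value, string $pattern, string $replace): string
{
    $wrapped = '#' . $pattern . '#u';

    // preg_replace() returns null on error; fall back to the untouched value.
    return preg_replace($wrapped, $replace, $value) ?? $value;
}
```

With the pattern `SumUp \*+(.*)` and the replacement `[$1]`, the input `SumUp *Cafe Bar` becomes `[Cafe Bar]`.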
|
||||
|
||||
/**
|
||||
* Date format transformation
|
||||
* Datum-Format Transformation
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "date_format",
|
||||
* "fromFormat": "d.m.Y",
|
||||
* "toFormat": "Y-m-d"
|
||||
* ```
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformDate(string $value, array $config): string
|
||||
{
|
||||
@ -347,26 +323,26 @@ class ColumnTransformer
|
||||
}
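Review note: the `date_format` configuration maps directly onto PHP's `createFromFormat()`. The sketch below is a standalone illustration with a hypothetical function name. Returning the input unchanged on a parse failure is an assumption, since the body of `transformDate()` is outside this hunk.

```php
<?php
// Hypothetical helper: convert between two date formats.
function reformatDateSketch(string $value, string $fromFormat, string $toFormat): string
{
    $date = DateTimeImmutable::createFromFormat($fromFormat, $value);

    // Assumption: unparseable input is passed through unchanged.
    return $date === false ? $value : $date->format($toFormat);
}
```

With `fromFormat` set to `d.m.Y` and `toFormat` set to `Y-m-d`, the value `31.12.2024` becomes `2024-12-31`.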
|
||||
|
||||
/**
|
||||
* Split transformation
|
||||
* Split Transformation
|
||||
*
|
||||
* Splits a value at a delimiter and keeps a defined part
|
||||
* Teilt einen Wert bei einem Delimiter und behaelt einen definierten Teil
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: "Coop Pronto Chur;7007 Chur"
|
||||
* Config: delimiter=";", part=0
|
||||
* Output: "Coop Pronto Chur"
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "split",
|
||||
* "delimiter": ";",
|
||||
* "part": 0
|
||||
* ```
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformSplit(string $value, array $config): string
|
||||
{
|
||||
@ -394,16 +370,16 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Regex extract transformation
|
||||
* Regex Extract Transformation
|
||||
*
|
||||
* Extracts a portion using regex and creates a new column
|
||||
* Extrahiert einen Teil mit Regex und erstellt neue Spalte
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: "Coop Pronto Chur;7007 Chur"
|
||||
* Config: pattern="(\d{4,} .*)"
|
||||
* Output: "7007 Chur" (in new column "Location")
|
||||
* Output: "7007 Chur" (in neuer Spalte "Location")
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "Location": {
|
||||
* "type": "regex_extract",
|
||||
@ -412,10 +388,10 @@ class ColumnTransformer
|
||||
* }
|
||||
* ```
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return string|null Extracted value or null
|
||||
* @return string|null Extrahierter Wert oder null
|
||||
*/
|
||||
private function transformRegexExtract(string $value, array $config): ?string
|
||||
{
|
||||
@ -445,22 +421,22 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Trim transformation
|
||||
* Trim Transformation
|
||||
*
|
||||
* Removes whitespace from the beginning and end of a string
|
||||
* Entfernt Leerzeichen am Anfang und Ende eines Strings
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "trim"
|
||||
* ```
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: " Coop Pronto "
|
||||
* Output: "Coop Pronto"
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param string $value Ursprungswert
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformTrim(string $value): string
|
||||
{
|
||||
@ -468,22 +444,22 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Lowercase transformation
|
||||
* Lowercase Transformation
|
||||
*
|
||||
* Converts a string to lowercase (UTF-8 safe)
|
||||
* Wandelt einen String in Kleinbuchstaben um (UTF-8 safe)
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "lowercase"
|
||||
* ```
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: "COOP PRONTO CHUR"
|
||||
* Output: "coop pronto chur"
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param string $value Ursprungswert
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformLowercase(string $value): string
|
||||
{
|
||||
@ -491,22 +467,22 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Uppercase transformation
|
||||
* Uppercase Transformation
|
||||
*
|
||||
* Converts a string to uppercase (UTF-8 safe)
|
||||
* Wandelt einen String in Grossbuchstaben um (UTF-8 safe)
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "uppercase"
|
||||
* ```
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: "Coop Pronto Chur"
|
||||
* Output: "COOP PRONTO CHUR"
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param string $value Ursprungswert
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformUppercase(string $value): string
|
||||
{
|
||||
@ -514,73 +490,65 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Ucwords first transformation
|
||||
* Ucwords First Transformation
|
||||
*
|
||||
* Capitalises only the first letter after word boundaries.
|
||||
* All other letters are converted to lowercase.
|
||||
* Works even when input is entirely in uppercase.
|
||||
* Grossschreibung nur des ersten Buchstabens nach Worttrennern.
|
||||
* Alle anderen Buchstaben werden zu Kleinbuchstaben.
|
||||
* Funktioniert auch, wenn Input komplett in Grossbuchstaben vorliegt.
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "ucwords_first"
|
||||
* ```
|
||||
*
|
||||
* With exceptions list (words that are preserved exactly):
|
||||
* Mit Ausnahmeliste (Wörter, die exakt erhalten bleiben):
|
||||
* ```
|
||||
* "type": "ucwords_first",
|
||||
* "exceptions": ["SBB", "UBS", "AG", "GmbH"]
|
||||
* ```
|
||||
*
|
||||
* Examples:
|
||||
* Beispiele:
|
||||
* "COOP PRONTO CHUR" → "Coop Pronto Chur"
|
||||
* "migros-rail city zuerich" → "Migros-Rail City Zuerich"
|
||||
* "O'NEILL STORE" → "O'Neill Store"
|
||||
* "SAINT-JEAN-DE-MAURIENNE" → "Saint-Jean-De-Maurienne"
|
||||
*
|
||||
* Word boundaries defined by: space, hyphen, apostrophe,
|
||||
* slash, period, comma, semicolon, colon, brackets, quotation marks
|
||||
* Wortgrenzen definiert durch: Leerzeichen, Bindestrich, Apostroph,
|
||||
* Slash, Punkt, Komma, Semikolon, Doppelpunkt, Klammern, Anführungszeichen
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param string $value Ursprungswert
|
||||
*
|
||||
* @return string Transformed value
|
||||
* @return string Transformierter Wert
|
||||
*/
|
||||
private function transformUcwordsFirst(string $value, array $config = []): string
|
||||
{
|
||||
// Guard: if the string already contains both uppercase and lowercase letters
|
||||
// (i.e. mixed-case), it has already been intentionally formatted — leave it alone.
|
||||
// Fully-uppercase or fully-lowercase strings are still processed so that patterns
|
||||
// like "lowercase → ucwordsfirst" continue to work as expected.
|
||||
if (preg_match('/\p{Lu}/u', $value) && preg_match('/\p{Ll}/u', $value)) {
|
||||
return $value;
|
||||
}
|
||||
|
||||
// Step 1: Convert everything to lowercase
|
||||
// Schritt 1: Alles zu Kleinbuchstaben
|
||||
$value = mb_strtolower($value, 'UTF-8');
|
||||
|
||||
// Step 2: Define word boundaries (delimiters)
|
||||
// These characters mark boundaries after which capitalisation is applied
|
||||
// Schritt 2: Definiere Wortgrenzen (Trennzeichen)
|
||||
// Diese Zeichen markieren Grenzen, nach denen grossgeschrieben wird
|
||||
$delimiters = [
|
||||
' ', // space
|
||||
'-', // hyphen
|
||||
'\'', // apostrophe
|
||||
'/', // slash
|
||||
'.', // period
|
||||
',', // comma
|
||||
';', // semicolon
|
||||
':', // colon
|
||||
'(', // opening parenthesis
|
||||
')', // closing parenthesis
|
||||
'[', // opening square bracket
|
||||
']', // closing square bracket
|
||||
'{', // opening curly bracket
|
||||
'}', // closing curly bracket
|
||||
'"', // quotation mark
|
||||
'&', // ampersand
|
||||
'+' // plus
|
||||
' ', // Leerzeichen
|
||||
'-', // Bindestrich
|
||||
'\'', // Apostroph
|
||||
'/', // Slash
|
||||
'.', // Punkt
|
||||
',', // Komma
|
||||
';', // Semikolon
|
||||
':', // Doppelpunkt
|
||||
'(', // Oeffnende Klammer
|
||||
')', // Schliessende Klammer
|
||||
'[', // Oeffnende eckige Klammer
|
||||
']', // Schliessende eckige Klammer
|
||||
'{', // Oeffnende geschweifte Klammer
|
||||
'}', // Schliessende geschweifte Klammer
|
||||
'"', // Anführungszeichen
|
||||
'&', // Ampersand
|
||||
'+' // Plus
|
||||
];
|
||||
|
||||
// Step 3: Regex pattern for "start of string OR delimiter, followed by letter"
|
||||
// The u-flag enables Unicode support (\p{L})
|
||||
// Schritt 3: Regex-Pattern fuer "Stringanfang ODER Delimiter, gefolgt von Buchstabe"
|
||||
// Die u-Flag ermoeglicht Unicode-Unterstaetzung (\p{L})
|
||||
$escapedDelimiters = array_map(function ($char) {
|
||||
return preg_quote($char, '/');
|
||||
}, $delimiters);
|
||||
@ -588,18 +556,18 @@ class ColumnTransformer
|
||||
|
||||
$pattern = '/(^|[' . $delimiterPattern . '])(\p{L})/u';
|
||||
|
||||
// Step 4: Callback for preg_replace_callback
|
||||
// Capitalise the captured letter (capture group 2)
|
||||
// Schritt 4: Callback fuer preg_replace_callback
|
||||
// Grossschreibe den gefangenen Buchstaben (Capture Group 2)
|
||||
$callback = function (array $matches): string {
|
||||
// $matches[1] = start of string or delimiter
|
||||
// $matches[2] = letter to be capitalised
|
||||
// $matches[1] = Stringanfang oder Trennzeichen
|
||||
// $matches[2] = Buchstabe, der grossgeschrieben werden soll
|
||||
return $matches[1] . mb_strtoupper($matches[2], 'UTF-8');
|
||||
};
|
||||
|
||||
// Step 5: Apply transformation
|
||||
// Schritt 5: Anwende Transformation
|
||||
$result = preg_replace_callback($pattern, $callback, $value) ?? $value;
|
||||
|
||||
// Step 6: Apply exceptions list (words to be preserved exactly, e.g. SBB, UBS, GmbH)
|
||||
// Schritt 6: Ausnahmeliste anwenden (Wörter die exakt erhalten bleiben sollen, z.B. SBB, UBS, GmbH)
|
||||
$exceptions = $config['exceptions'] ?? $this->globalExceptions;
|
||||
foreach ($exceptions as $exception) {
|
||||
if (!is_string($exception) || $exception === '') {
|
||||
@ -613,12 +581,12 @@ class ColumnTransformer
|
||||
}
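Review note: the steps of `transformUcwordsFirst()` shown above (mixed-case guard, lowercase, delimiter-based capitalisation) can be condensed into a short standalone sketch. It uses only a subset of the delimiters and leaves out the exceptions list, whose code is not fully visible in this diff. `ucwordsFirstSketch()` is a hypothetical name.

```php
<?php
// Hypothetical condensed version; requires the mbstring extension.
function ucwordsFirstSketch(string $value): string
{
    // Guard: mixed-case input is treated as already formatted.
    if (preg_match('/\p{Lu}/u', $value) && preg_match('/\p{Ll}/u', $value)) {
        return $value;
    }

    $value = mb_strtolower($value, 'UTF-8');

    // Subset of the delimiter list, escaped for use inside a character class.
    $delimiters = [' ', '-', '\'', '/', '.', ',', '&'];
    $class = implode('', array_map(
        fn (string $char): string => preg_quote($char, '/'),
        $delimiters
    ));

    // Start of string OR delimiter, followed by a Unicode letter.
    $pattern = '/(^|[' . $class . '])(\p{L})/u';

    return preg_replace_callback(
        $pattern,
        fn (array $m): string => $m[1] . mb_strtoupper($m[2], 'UTF-8'),
        $value
    ) ?? $value;
}
```

The guard matters for pipelines: a value such as `McDonald's` passes through untouched, while `O'NEILL STORE` is rewritten to `O'Neill Store`.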
|
||||
|
||||
/**
|
||||
* Pipeline transformation
|
||||
* Pipeline Transformation
|
||||
*
|
||||
* Applies multiple transformations sequentially to a value.
|
||||
* Each step uses the result of the previous step.
|
||||
* Wendet mehrere Transformationen nacheinander auf einen Wert an.
|
||||
* Jeder Schritt benutzt das Ergebnis des vorherigen Schrittes.
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "Merchant": {
|
||||
* "type": "pipeline",
|
||||
@ -631,17 +599,17 @@ class ColumnTransformer
|
||||
* }
|
||||
* ```
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* Input: " COOP PRONTO CHUR "
|
||||
* Step 1 (trim): "COOP PRONTO CHUR"
|
||||
* Step 2 (lowercase): "coop pronto chur"
|
||||
* Step 3 (ucwords_first): "Coop Pronto Chur"
|
||||
* Output: "Coop Pronto Chur"
|
||||
*
|
||||
* @param string $value Source value
|
||||
* @param array $config Transformation configuration with 'steps' array
|
||||
* @param string $value Ursprungswert
|
||||
* @param array $config Transformationskonfiguration mit 'steps' Array
|
||||
*
|
||||
* @return string Transformed value after all steps
|
||||
* @return string Transformierter Wert nach allen Schritten
|
||||
*/
|
||||
private function transformPipeline(string $value, array $config): string
|
||||
{
|
||||
@ -651,7 +619,7 @@ class ColumnTransformer
|
||||
return $value;
|
||||
}
|
||||
|
||||
// Apply each step sequentially
|
||||
// Wende jeden Schritt nacheinander an
|
||||
foreach ($steps as $step) {
|
||||
if (!empty($step['type'] ?? $step['transform'] ?? null)) {
|
||||
$value = $this->applySingleTransformation($value, $step);
|
||||
@ -662,23 +630,23 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Custom callback transformation
|
||||
* Custom Callback Transformation
|
||||
*
|
||||
* Calls a custom function implementing complex logic
|
||||
* Ruft eine Custom-Funktion auf, die komplexe Logik implementiert
|
||||
*
|
||||
* Configuration:
|
||||
* Konfiguration:
|
||||
* ```
|
||||
* "type": "custom",
|
||||
* "callback": "myCustomFunction"
|
||||
* ```
|
||||
*
|
||||
* The callback function receives the entire row and returns the
|
||||
* modified row.
|
||||
* Die Callback-Funktion erhaelt die gesamte Zeile und gibt die
|
||||
* modifizierte Zeile zurueck.
|
||||
*
|
||||
* @param array $row Complete data row
|
||||
* @param array $config Transformation configuration
|
||||
* @param array $row Gesamte Datenzeile
|
||||
* @param array $config Transformationskonfiguration
|
||||
*
|
||||
* @return array Transformed data row
|
||||
* @return array Transformierte Datenzeile
|
||||
*/
|
||||
private function transformCustom(array $row, array $config): array
|
||||
{
|
||||
@ -696,10 +664,10 @@ class ColumnTransformer
|
||||
}
|
||||
|
||||
/**
|
||||
* Handles multi-output transformations
|
||||
* Currently only implemented for 'split'.
|
||||
* Behandelt Multi-Output Transformationen
|
||||
* Aktuell nur für 'split' implementiert.
|
||||
*
|
||||
* Config example:
|
||||
* Config-Beispiel:
|
||||
* {
|
||||
* "outputs": ["FirstName", "LastName"],
|
||||
* "sourceColumn": "FullName",
|
||||
@ -707,10 +675,10 @@ class ColumnTransformer
|
||||
* "delimiter": " "
|
||||
* }
|
||||
*
|
||||
* @param array $row Input row
|
||||
* @param array $config Transformation configuration
|
||||
* @return array Associative array: columnName => value
|
||||
* @throws \RuntimeException if transformation type is not supported
|
||||
* @param array $row Input-Zeile
|
||||
* @param array $config Transformations-Konfiguration
|
||||
* @return array Assoziatives Array: columnName => value
|
||||
* @throws \RuntimeException wenn Transformation-Type nicht unterstützt
|
||||
*/
|
||||
private function handleMultiOutputTransformation(array $row, array $config): array
|
||||
{
|
||||
@ -719,39 +687,39 @@ class ColumnTransformer
|
||||
$transformType = $this->normalizeTransformType($config['type'] ?? '');
|
||||
|
||||
if (empty($outputs) || empty($sourceColumn) || empty($transformType)) {
|
||||
throw new \RuntimeException("Multi-output transformation requires 'outputs', 'sourceColumn' and 'type'");
|
||||
throw new \RuntimeException("Multi-Output Transformation benötigt 'outputs', 'sourceColumn' und 'type'");
|
||||
}
|
||||
|
||||
$sourceValue = $row[$sourceColumn] ?? '';
|
||||
|
||||
if ($transformType !== 'split') {
|
||||
throw new \RuntimeException("Multi-output only supported for 'split', given: {$transformType}");
|
||||
throw new \RuntimeException("Multi-Output nur für 'split' unterstützt, gegeben: {$transformType}");
|
||||
}
|
||||
|
||||
return $this->handleMultiOutputSplit($sourceValue, $outputs, $config);
|
||||
}
|
||||
|
||||
/**
|
||||
* Split transformation with multi-output
|
||||
* Splits a string and distributes the parts across multiple columns
|
||||
* Split-Transformation mit Multi-Output
|
||||
* Teilt einen String und verteilt die Teile auf mehrere Spalten
|
||||
*
|
||||
* @param string $value String to split
|
||||
* @param array $outputs List of target column names
|
||||
* @param array $config Transformation configuration
|
||||
* @return array Associative array: columnName => value
|
||||
* @param string $value Zu teilender String
|
||||
* @param array $outputs Liste der Ziel-Spaltennamen
|
||||
* @param array $config Transformation-Config
|
||||
* @return array Assoziatives Array: columnName => value
|
||||
*/
|
||||
|
||||
private function handleMultiOutputSplit(string $value, array $outputs, array $config): array
|
||||
{
|
||||
$delimiter = $config['delimiter'] ?? ';';
|
||||
|
||||
// Perform split
|
||||
// Führe Split durch
|
||||
$parts = explode($delimiter, $value);
|
||||
|
||||
// Map parts to output columns
|
||||
// Mappe Parts zu Output-Spalten
|
||||
$result = [];
|
||||
foreach ($outputs as $index => $columnName) {
|
||||
// If part exists: use it (trimmed) // If not: empty string
|
||||
// Wenn Teil existiert: verwenden (getrimmt) // Wenn nicht: leerer String
|
||||
$result[$columnName] = isset($parts[$index]) ? trim($parts[$index]) : '';
|
||||
}
|
||||
|
||||
@ -762,89 +730,14 @@ class ColumnTransformer
|
||||
}
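Review note: the multi-output split distributes the parts of one value across several named columns and pads missing parts with empty strings. A minimal sketch of that mapping, using a hypothetical function name, might look like this:

```php
<?php
// Hypothetical helper mirroring the part-to-column mapping in the hunk above.
function splitToColumns(string $value, array $outputs, string $delimiter = ';'): array
{
    $parts = explode($delimiter, $value);

    $result = [];
    foreach (array_values($outputs) as $index => $columnName) {
        // Use the trimmed part if it exists, otherwise an empty string.
        $result[$columnName] = isset($parts[$index]) ? trim($parts[$index]) : '';
    }

    return $result;
}
```

Parts beyond the number of output columns are dropped in this sketch. Note that `explode()` rejects an empty delimiter, so a configuration with `"delimiter": ""` would need to be caught earlier.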
|
||||
|
||||
/**
|
||||
* Returns the number of output columns
|
||||
* Gibt die Anzahl der Output-Spalten zurueck
|
||||
*
|
||||
* Counts original columns plus newly generated columns (e.g. from regex_extract)
|
||||
* Zaehlt Original-Spalten plus neu generierte Spalten (z.B. bei regex_extract)
|
||||
*
|
||||
* @return int Number of output columns
|
||||
* @return int Anzahl Output-Spalten
|
||||
*/
|
||||
public function getOutputColumns(): int
|
||||
{
|
||||
return count(array_unique($this->outputColumns));
|
||||
}
|
||||
|
||||
/**
|
||||
* Time-period transformer
|
||||
*
|
||||
* Maps a time string to a period label via a configurable list of ranges.
|
||||
* Supports midnight-spanning ranges (e.g. "22:00:00" to "03:59:59").
|
||||
* Returns the configured default (empty string by default) when no range matches
|
||||
* or the input cannot be parsed.
|
||||
*
|
||||
* Configuration:
|
||||
* ```json
|
||||
* {
|
||||
* "type": "timeperiod",
|
||||
* "timeFormat": "H:i:s",
|
||||
* "periods": [
|
||||
* {"from": "04:00:00", "to": "08:59:59", "label": "Morgen"},
|
||||
* {"from": "22:00:00", "to": "03:59:59", "label": "Nacht"}
|
||||
* ],
|
||||
* "default": ""
|
||||
* }
|
||||
* ```
|
||||
*
|
||||
* @param string $value Time string to evaluate
|
||||
* @param array<string, mixed> $config Transformation configuration
|
||||
* @return string Period label or default
|
||||
*/
|
||||
private function transformTimePeriod(string $value, array $config): string
|
||||
{
|
||||
$default = (string) ($config['default'] ?? '');
|
||||
$timeFormat = (string) ($config['timeFormat'] ?? 'H:i:s');
|
||||
/** @var array<int, array<string, string>> $periods */
|
||||
$periods = $config['periods'] ?? [];
|
||||
|
||||
if ($value === '' || empty($periods)) {
|
||||
return $default;
|
||||
}
|
||||
|
||||
$parsed = \DateTime::createFromFormat($timeFormat, $value);
|
||||
if ($parsed === false) {
|
||||
return $default;
|
||||
}
|
||||
|
||||
// Represent time as total minutes from midnight for easy comparison
|
||||
$minutes = (int) $parsed->format('H') * 60 + (int) $parsed->format('i');
|
||||
|
||||
foreach ($periods as $period) {
|
||||
$fromStr = (string) ($period['from'] ?? '');
|
||||
$toStr = (string) ($period['to'] ?? '');
|
||||
$label = (string) ($period['label'] ?? '');
|
||||
|
||||
$fromParsed = \DateTime::createFromFormat($timeFormat, $fromStr);
|
||||
$toParsed = \DateTime::createFromFormat($timeFormat, $toStr);
|
||||
|
||||
if ($fromParsed === false || $toParsed === false) {
|
||||
continue;
|
||||
}
|
||||
|
||||
$fromMin = (int) $fromParsed->format('H') * 60 + (int) $fromParsed->format('i');
|
||||
$toMin = (int) $toParsed->format('H') * 60 + (int) $toParsed->format('i');
|
||||
|
||||
if ($fromMin <= $toMin) {
|
||||
// Normal range (e.g. 04:00 – 08:59)
|
||||
if ($minutes >= $fromMin && $minutes <= $toMin) {
|
||||
return $label;
|
||||
}
|
||||
} else {
|
||||
// Midnight-spanning range (e.g. 22:00 – 03:59)
|
||||
if ($minutes >= $fromMin || $minutes <= $toMin) {
|
||||
return $label;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return $default;
|
||||
}
|
||||
}
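Review note: the midnight-spanning comparison in `transformTimePeriod()` is the subtle part of this addition. The sketch below isolates it in two hypothetical helpers (`toMinutes()` and `periodLabel()`) with a fixed `H:i:s` format. As in the diffed code, times are compared in whole minutes, so seconds are ignored.

```php
<?php
// Hypothetical helper: minutes since midnight, or null when parsing fails.
function toMinutes(string $time, string $format = 'H:i:s'): ?int
{
    $parsed = DateTime::createFromFormat($format, $time);
    if ($parsed === false) {
        return null;
    }

    return (int) $parsed->format('H') * 60 + (int) $parsed->format('i');
}

// Hypothetical helper: first matching period label, or the default.
function periodLabel(string $time, array $periods, string $default = ''): string
{
    $minutes = toMinutes($time);
    if ($minutes === null) {
        return $default;
    }

    foreach ($periods as $period) {
        $from = toMinutes((string) ($period['from'] ?? ''));
        $to = toMinutes((string) ($period['to'] ?? ''));
        if ($from === null || $to === null) {
            continue;
        }

        // from <= to: normal range (AND); from > to: range wraps past midnight (OR).
        $matches = $from <= $to
            ? ($minutes >= $from && $minutes <= $to)
            : ($minutes >= $from || $minutes <= $to);

        if ($matches) {
            return (string) ($period['label'] ?? '');
        }
    }

    return $default;
}
```

With a morning period of `04:00:00` to `08:59:59` and a night period of `22:00:00` to `03:59:59`, both `23:15:00` and `02:00:00` fall into the night period, and `12:00:00` falls through to the default.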
|
||||
|
||||
@ -3,7 +3,7 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Loads and validates JSON configuration files
|
||||
* Lädt und validiert JSON-Konfigurationsdateien
|
||||
*/
|
||||
class ConfigurationLoader
|
||||
{
|
||||
@ -16,19 +16,19 @@ class ConfigurationLoader
|
||||
}
|
||||
|
||||
/**
|
||||
* Loads the configuration file
|
||||
* Lädt die Konfigurationsdatei
|
||||
*
|
||||
* @return array The loaded and validated configuration
|
||||
* @throws \RuntimeException if file not found or invalid
|
||||
* @return array Die geladene und validierte Konfiguration
|
||||
* @throws \RuntimeException wenn Datei nicht gefunden oder ungültig
|
||||
*/
|
||||
public function load(): array
|
||||
{
|
||||
if (!file_exists($this->configFile)) {
|
||||
throw new \RuntimeException("Configuration file not found: {$this->configFile}");
|
||||
throw new \RuntimeException("Konfigurationsdatei nicht gefunden: {$this->configFile}");
|
||||
}
|
||||
|
||||
if (pathinfo($this->configFile, PATHINFO_EXTENSION) !== 'json') {
|
||||
throw new \RuntimeException("Configuration file must be a JSON file: {$this->configFile}");
|
||||
throw new \RuntimeException("Konfigurationsdatei muss eine JSON-Datei sein: {$this->configFile}");
|
||||
}
|
||||
|
||||
$this->config = $this->loadJson($this->configFile);
|
||||
@ -38,96 +38,96 @@ class ConfigurationLoader
|
||||
}
|
||||
|
||||
/**
|
||||
* Loads a JSON file
|
||||
* Lädt eine JSON-Datei
|
||||
*
|
||||
* @param string $file Path to JSON file
|
||||
* @return array Parsed configuration
|
||||
* @param string $file Pfad zur JSON-Datei
|
||||
* @return array Geparste Konfiguration
|
||||
*/
|
||||
private function loadJson(string $file): array
|
||||
{
|
||||
$json = file_get_contents($file);
|
||||
if ($json === false) {
|
||||
throw new \RuntimeException("Could not read JSON file: {$file}");
|
||||
throw new \RuntimeException("Konnte JSON-Datei nicht lesen: {$file}");
|
||||
}
|
||||
|
||||
$config = json_decode($json, true);
|
||||
|
||||
if ($config === null && json_last_error() !== JSON_ERROR_NONE) {
|
||||
throw new \RuntimeException("Invalid JSON: " . json_last_error_msg());
|
||||
throw new \RuntimeException("Ungültiges JSON: " . json_last_error_msg());
|
||||
}
|
||||
|
||||
return $config;
|
||||
}
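Review note: `loadJson()` declares an `array` return type, but a syntactically valid top-level scalar such as `"x"` decodes without error and would then fail the return type check. The sketch below shows one possible hardening. It swaps in `JSON_THROW_ON_ERROR` (PHP 7.3+) instead of `json_last_error()` and adds an explicit array check. Both changes, and the function name, are suggestions and not part of the diffed code.

```php
<?php
// Hypothetical stricter decoder for a configuration string.
function decodeJsonConfig(string $json): array
{
    try {
        $config = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        throw new RuntimeException('Invalid JSON: ' . $e->getMessage(), 0, $e);
    }

    // A top-level scalar or null is valid JSON but not a usable configuration.
    if (!is_array($config)) {
        throw new RuntimeException('Invalid JSON: top-level value must be an object or array');
    }

    return $config;
}
```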
|
||||
|
||||
/**
|
||||
* Validates the loaded configuration for required fields
|
||||
* Validiert die geladene Konfiguration auf erforderliche Felder
|
||||
*
|
||||
* @throws \RuntimeException if required fields are missing
|
||||
* @throws \RuntimeException wenn erforderliche Felder fehlen
|
||||
*/
|
||||
private function validate(): void
|
||||
{
|
||||
// Metadata required
|
||||
// Metadata erforderlich
|
||||
if (empty($this->config['metadata'])) {
|
||||
throw new \RuntimeException("Configuration: 'metadata' section required");
|
||||
throw new \RuntimeException("Konfiguration: 'metadata' Section erforderlich");
|
||||
}
|
||||
|
||||
if (!isset($this->config['metadata']['extractionRules']) || !is_array($this->config['metadata']['extractionRules'])) {
|
||||
throw new \RuntimeException("Configuration: 'metadata.extractionRules' required (may be empty: [])");
|
||||
throw new \RuntimeException("Konfiguration: 'metadata.extractionRules' erforderlich (kann leer sein: [])");
|
||||
}
|
||||
|
||||
// CSV structure required
|
||||
// CSV-Struktur erforderlich
|
||||
if (empty($this->config['csvStructure'])) {
|
||||
throw new \RuntimeException("Configuration: 'csvStructure' section required");
|
||||
throw new \RuntimeException("Konfiguration: 'csvStructure' Section erforderlich");
|
||||
}
|
||||
|
||||
if (!isset($this->config['csvStructure']['headerLine'])) {
|
||||
throw new \RuntimeException("Configuration: 'csvStructure.headerLine' required");
|
||||
throw new \RuntimeException("Konfiguration: 'csvStructure.headerLine' erforderlich");
|
||||
}
|
||||
|
||||
// Column transformations required
|
||||
// Column Transformations erforderlich
|
||||
if (empty($this->config['columnTransformations'])) {
|
||||
throw new \RuntimeException("Configuration: 'columnTransformations' required");
|
||||
throw new \RuntimeException("Konfiguration: 'columnTransformations' erforderlich");
|
||||
}
|
||||
|
||||
// Validate directories (if auto-import is used)
|
||||
// Directories validieren (wenn auto-import genutzt wird)
|
||||
if (!empty($this->config['directories'])) {
|
||||
foreach (['source', 'output', 'archive', 'error'] as $dir) {
|
||||
if (empty($this->config['directories'][$dir])) {
|
||||
throw new \RuntimeException("Configuration: 'directories.{$dir}' required for auto-import");
|
||||
throw new \RuntimeException("Konfiguration: 'directories.{$dir}' erforderlich für Auto-Import");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Validate CSV structure values
|
||||
// Validiere CSV-Struktur Werte
|
||||
$headerLine = $this->config['csvStructure']['headerLine'] ?? 1;
|
||||
if (!is_int($headerLine) || $headerLine < 1) {
|
||||
throw new \Exception(
|
||||
'Configuration: csvStructure.headerLine must be a positive integer'
|
||||
'Konfiguration csvStructure.headerLine muss eine positive Ganzzahl sein'
|
||||
);
|
||||
}
|
||||
|
||||
$delimiter = $this->config['csvStructure']['inputDelimiter'] ?? '';
|
||||
if (strlen($delimiter) === 0) {
|
||||
throw new \Exception(
|
||||
'Configuration: csvStructure.inputDelimiter must not be empty'
|
||||
'Konfiguration csvStructure.inputDelimiter darf nicht leer sein'
|
||||
);
|
||||
}
|
||||
|
||||
// Validate encoding
|
||||
// Validiere Encoding
|
||||
$encoding = $this->config['csvStructure']['encoding'] ?? 'UTF-8';
|
||||
if (!in_array($encoding, ['UTF-8', 'ISO-8859-1', 'CP1252'])) {
|
||||
throw new \Exception(
|
||||
'Configuration: csvStructure.encoding: ' . $encoding . ' not supported'
|
||||
'Konfiguration csvStructure.encoding: ' . $encoding . ' nicht unterstützt'
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns a single configuration option
|
||||
* Gibt eine einzelne Konfigurationsoption zurück
|
||||
*
|
||||
* @param string $key Dot-notation key (e.g. 'metadata.extractionRules')
|
||||
* @param mixed $default Default value if key does not exist
|
||||
* @return mixed The configuration value
|
||||
* @param string $key Dot-Notation Key (z.B. 'metadata.extractionRules')
|
||||
* @param mixed $default Standardwert wenn Key nicht existiert
|
||||
* @return mixed Der Konfigurationswert
|
||||
*/
|
||||
public function get(string $key, mixed $default = null): mixed
|
||||
{
|
||||
@ -145,9 +145,9 @@ class ConfigurationLoader
|
||||
}
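Review note: the body of `get()` is outside this hunk. For readers unfamiliar with dot-notation access, a minimal sketch of such a lookup is shown below. `getByDotPath()` is a hypothetical standalone function, and the real method may handle edge cases differently.

```php
<?php
// Hypothetical dot-notation lookup over a nested array.
function getByDotPath(array $config, string $key, mixed $default = null): mixed
{
    $current = $config;

    foreach (explode('.', $key) as $segment) {
        // Stop as soon as a segment is missing or the path runs into a non-array value.
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }

    return $current;
}
```

Using `array_key_exists()` rather than `isset()` lets the sketch return stored `null` values instead of the default.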
|
||||
|
||||
/**
|
||||
* Returns the complete configuration
|
||||
* Gibt die vollständige Konfiguration zurück
|
||||
*
|
||||
* @return array The full configuration
|
||||
* @return array Die komplette Konfiguration
|
||||
*/
|
||||
public function getAll(): array
|
||||
{
|
||||
@ -155,10 +155,10 @@ class ConfigurationLoader
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets a configuration value (overwrites existing value)
|
||||
* Setzt einen Konfigurationswert (überschreibt bestehenden Wert)
|
||||
*
|
||||
* @param string $key Dot-notation key (e.g. 'directories.output')
|
||||
* @param mixed $value New value
|
||||
* @param string $key Dot-Notation Key (z.B. 'directories.output')
|
||||
* @param mixed $value Neuer Wert
|
||||
* @return void
|
||||
*/
|
||||
public function set(string $key, mixed $value): void
|
||||
@ -179,9 +179,9 @@ class ConfigurationLoader
|
||||
}
|
||||
|
||||
/**
|
||||
* Checks whether a configuration key exists
|
||||
* Prüft ob ein Konfigurationsschlüssel existiert
|
||||
*
|
||||
* @param string $key Dot-notation key
|
||||
* @param string $key Dot-Notation Key
|
||||
* @return bool
|
||||
*/
|
||||
public function has(string $key): bool
|
||||
|
||||
@ -3,10 +3,10 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Reads and parses CSV files
|
||||
* Liest und parst CSV-Dateien
|
||||
*
|
||||
* Reads CSV files with a configurable delimiter and separates
|
||||
* metadata lines from the actual data rows.
|
||||
* Diese Klasse liest CSV-Dateien mit konfigurierbarem Delimiter
|
||||
* und separiert Metadaten-Zeilen von den eigentlichen Datenzeilen.
|
||||
*/
|
||||
class CsvReader
|
||||
{
|
||||
@ -16,8 +16,8 @@ class CsvReader
|
||||
private bool $hasBom;
|
||||
|
||||
/**
|
||||
* @param string $filePath Path to the CSV file
|
||||
* @param array $csvStructure CSV structure from configuration
|
||||
* @param string $filePath Pfad zur CSV-Datei
|
||||
* @param array $csvStructure CSV-Struktur aus Konfiguration
|
||||
*/
|
||||
public function __construct(string $filePath, array $csvStructure)
|
||||
{
|
||||
@ -28,25 +28,25 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Reads all lines from the file
|
||||
* Liest alle Zeilen aus der Datei
|
||||
*
|
||||
* @param int $maxLines Maximum number of lines (0 = all)
|
||||
* @return array Array of lines (without newlines)
|
||||
* @throws \RuntimeException if file cannot be read
|
||||
* @param int $maxLines Maximale Anzahl Zeilen (0 = alle)
|
||||
* @return array Array mit Zeilen (ohne Newlines)
|
||||
* @throws \RuntimeException wenn Datei nicht gelesen werden kann
|
||||
*/
|
||||
public function readLines(int $maxLines = 0): array
|
||||
{
|
||||
if (!file_exists($this->filePath) || !is_readable($this->filePath)) {
|
||||
throw new \RuntimeException("Could not read file: {$this->filePath}");
|
||||
throw new \RuntimeException("Konnte Datei nicht lesen: {$this->filePath}");
|
||||
}
|
||||
|
||||
$lines = file($this->filePath, FILE_IGNORE_NEW_LINES);
|
||||
|
||||
if ($lines === false) {
|
||||
throw new \RuntimeException("Could not read file: {$this->filePath}");
|
||||
throw new \RuntimeException("Konnte Datei nicht lesen: {$this->filePath}");
|
||||
}
|
||||
|
||||
// Remove BOM if present
|
||||
// BOM entfernen falls vorhanden
|
||||
if ($this->hasBom && !empty($lines)) {
|
||||
$lines[0] = $this->removeBom($lines[0]);
|
||||
}
|
||||
@ -59,9 +59,9 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Reads the metadata lines (before the header line)
|
||||
* Liest die Metadaten-Zeilen (vor der Header-Zeile)
|
||||
*
|
||||
* @return array Array of metadata lines
|
||||
* @return array Array mit Metadaten-Zeilen
|
||||
*/
|
||||
public function readMetadataLines(): array
|
||||
{
|
||||
@ -75,28 +75,28 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Reads CSV data with headers
|
||||
* Liest die CSV-Daten mit Headers
|
||||
*
|
||||
* @param int $maxDataRows Maximum number of data rows (0 = all)
|
||||
* @return array Array of associative arrays (with column names as keys)
|
||||
* @throws \RuntimeException if header line is not found
|
||||
* @param int $maxDataRows Maximale Anzahl Datenzeilen (0 = alle)
|
||||
* @return array Array von assoziativen Arrays (mit Spalten-Namen als Keys)
|
||||
* @throws \RuntimeException wenn Header-Zeile nicht gefunden
|
||||
*/
|
||||
public function readCsvData(int $maxDataRows = 0): array
|
||||
{
|
||||
$lines = $this->readLines();
|
||||
|
||||
if ($this->headerLine > count($lines)) {
|
||||
throw new \RuntimeException("Header line {$this->headerLine} not found in file with " . count($lines) . " lines");
|
||||
throw new \RuntimeException("Header-Zeile {$this->headerLine} nicht gefunden in Datei mit " . count($lines) . " Zeilen");
|
||||
}
|
||||
|
||||
// Parse header
|
||||
// Header parsen
|
||||
$headerLineContent = $lines[$this->headerLine - 1];
|
||||
$headers = str_getcsv($headerLineContent, $this->delimiter, '"', '\\');
|
||||
$headers = array_map(static fn(?string $v): string => trim($v ?? ''), $headers);
|
||||
|
||||
// Parse data rows
|
||||
// Datenzeilen parsen
|
||||
$data = [];
|
||||
$dataStartLine = $this->headerLine; // 0-based
|
||||
$dataStartLine = $this->headerLine; // 0-basiert
|
||||
$lineCount = 0;
|
||||
|
||||
for ($i = $dataStartLine; $i < count($lines); $i++) {
|
||||
@ -106,7 +106,7 @@ class CsvReader
|
||||
|
||||
$lineContent = $lines[$i];
|
||||
|
||||
// Skip empty lines
|
||||
// Leere Zeilen überspringen
|
||||
if (trim($lineContent) === '') {
|
||||
continue;
|
||||
}
|
||||
@ -114,7 +114,7 @@ class CsvReader
|
||||
$row = str_getcsv($lineContent, $this->delimiter, '"', '\\');
|
||||
$row = array_map(static fn(?string $v): string => trim($v ?? ''), $row);
|
||||
|
||||
// Combine row with header keys
|
||||
// Zeile mit Header-Keys kombinieren
|
||||
$rowData = [];
|
||||
foreach ($headers as $index => $header) {
|
||||
$rowData[$header] = $row[$index] ?? '';
|
||||
@ -128,17 +128,17 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the column headers
|
||||
* Gibt die Spalten-Header zurück
|
||||
*
|
||||
* @return array Array of column names
|
||||
* @throws \RuntimeException if header line is not found
|
||||
* @return array Array mit Spalten-Namen
|
||||
* @throws \RuntimeException wenn Header-Zeile nicht gefunden
|
||||
*/
|
||||
public function getHeaders(): array
|
||||
{
|
||||
$lines = $this->readLines();
|
||||
|
||||
if ($this->headerLine > count($lines)) {
|
||||
throw new \RuntimeException("Header line {$this->headerLine} not found");
|
||||
throw new \RuntimeException("Header-Zeile {$this->headerLine} nicht gefunden");
|
||||
}
|
||||
|
||||
$headerLineContent = $lines[$this->headerLine - 1];
|
||||
@ -148,10 +148,10 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Removes UTF-8 BOM (Byte Order Mark) from string
|
||||
* Entfernt UTF-8 BOM (Byte Order Mark) von String
|
||||
*
|
||||
* @param string $text String with potential BOM
|
||||
* @return string String without BOM
|
||||
* @param string $text String mit potenziellem BOM
|
||||
* @return string String ohne BOM
|
||||
*/
|
||||
private function removeBom(string $text): string
|
||||
{
|
||||
@ -162,9 +162,9 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the total number of lines in the file
|
||||
* Gibt die Gesamtzahl der Zeilen in der Datei zurück
|
||||
*
|
||||
* @return int Number of lines
|
||||
* @return int Anzahl Zeilen
|
||||
*/
|
||||
public function countLines(): int
|
||||
{
|
||||
@ -172,9 +172,9 @@ class CsvReader
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the number of data rows (excluding header and metadata)
|
||||
* Gibt die Anzahl der Datenzeilen zurück (ohne Header und Metadaten)
|
||||
*
|
||||
* @return int Number of data rows
|
||||
* @return int Anzahl Datenzeilen
|
||||
*/
|
||||
public function countDataRows(): int
|
||||
{
|
||||
|
||||
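The `readCsvData()` hunk above treats `headerLine` as a 1-based line number: the header sits at index `headerLine - 1` and the data rows start at index `headerLine`. A minimal sketch of that indexing, assuming the lines are already in memory, might look like this. The function name `readRowsWithHeaders` and the default `;` delimiter are assumptions made for illustration.

```php
<?php

// Hypothetical stand-alone version of the header/row combination shown in the diff.
function readRowsWithHeaders(array $lines, int $headerLine, string $delimiter = ';'): array
{
    if ($headerLine < 1 || $headerLine > count($lines)) {
        throw new RuntimeException("Header line {$headerLine} not found");
    }

    $trim = static fn(?string $v): string => trim($v ?? '');

    // headerLine is 1-based, so the header is at index headerLine - 1.
    $headers = array_map($trim, str_getcsv($lines[$headerLine - 1], $delimiter, '"', '\\'));

    $rows = [];
    // The first data row is the line directly after the header.
    for ($i = $headerLine; $i < count($lines); $i++) {
        if (trim($lines[$i]) === '') {
            continue; // skip empty lines
        }
        $cells = array_map($trim, str_getcsv($lines[$i], $delimiter, '"', '\\'));

        $row = [];
        foreach ($headers as $index => $header) {
            $row[$header] = $cells[$index] ?? ''; // pad short rows with ''
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = readRowsWithHeaders(
    ['Account: X', 'Date;Amount', '2024-01-01; 12.50 ', '', '2024-01-02'],
    2
);
```

Rows with fewer cells than headers are padded with empty strings, which matches the `$row[$index] ?? ''` fallback in the diff.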
@ -3,9 +3,10 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Writes transformed data to a CSV file
|
||||
* Schreibt transformierte Daten in CSV-Datei
|
||||
*
|
||||
* Writes transformed data into a Firefly III-compatible CSV file.
|
||||
* Diese Klasse schreibt die transformierten Daten in eine
|
||||
* Firefly III-kompatible CSV-Datei.
|
||||
*/
|
||||
class CsvWriter
|
||||
{
|
||||
@ -13,8 +14,8 @@ class CsvWriter
|
||||
private string $delimiter;
|
||||
|
||||
/**
|
||||
* @param string $outputFile Path to the output file
|
||||
* @param array $csvStructure CSV structure from configuration
|
||||
* @param string $outputFile Pfad zur Output-Datei
|
||||
* @param array $csvStructure CSV-Struktur aus Konfiguration
|
||||
*/
|
||||
public function __construct(string $outputFile, array $csvStructure = [])
|
||||
{
|
||||
@ -23,39 +24,39 @@ class CsvWriter
|
||||
}
|
||||
|
||||
/**
|
||||
* Writes data to a CSV file
|
||||
* Schreibt Daten in CSV-Datei
|
||||
*
|
||||
* @param array $data Array of associative arrays (rows)
|
||||
* @throws \RuntimeException if file cannot be written
|
||||
* @param array $data Array von assoziativen Arrays (Zeilen)
|
||||
* @throws \RuntimeException wenn Datei nicht geschrieben werden kann
|
||||
*/
|
||||
public function write(array $data): void
|
||||
{
|
||||
if (empty($data)) {
|
||||
throw new \RuntimeException("No data to write");
|
||||
throw new \RuntimeException("Keine Daten zum Schreiben");
|
||||
}
|
||||
|
||||
// Create output directory if it does not exist
|
||||
// Output-Verzeichnis erstellen falls nicht vorhanden
|
||||
$dir = dirname($this->outputFile);
|
||||
if (!is_dir($dir)) {
|
||||
if (!mkdir($dir, 0755, true)) {
|
||||
throw new \RuntimeException("Could not create output directory: {$dir}");
|
||||
throw new \RuntimeException("Konnte Output-Verzeichnis nicht erstellen: {$dir}");
|
||||
}
|
||||
}
|
||||
|
||||
$fp = fopen($this->outputFile, 'w');
|
||||
|
||||
if ($fp === false) {
|
||||
throw new \RuntimeException("Could not create output file: {$this->outputFile}");
|
||||
throw new \RuntimeException("Konnte Output-Datei nicht erstellen: {$this->outputFile}");
|
||||
}
|
||||
|
||||
try {
|
||||
// Write headers (column names from first row)
|
||||
// Headers schreiben (Spalten-Namen aus erster Zeile)
|
||||
$headers = array_keys($data[0]);
|
||||
$this->writeCsvLine($fp, $headers);
|
||||
|
||||
// Write data rows
|
||||
// Datenzeilen schreiben
|
||||
foreach ($data as $row) {
|
||||
// Ensure all columns are present
|
||||
// Sicherstellen dass alle Spalten vorhanden sind
|
||||
$values = [];
|
||||
foreach ($headers as $header) {
|
||||
$values[] = $row[$header] ?? '';
|
||||
@ -69,25 +70,25 @@ class CsvWriter
|
||||
}
|
||||
|
||||
/**
|
||||
* Writes a CSV line using fputcsv
|
||||
* Schreibt eine CSV-Zeile mit fputcsv
|
||||
*
|
||||
* @param resource $fp File handle
|
||||
* @param array $values Array of values
|
||||
* @throws \RuntimeException if writing fails
|
||||
* @param resource $fp File-Handle
|
||||
* @param array $values Array mit Werten
|
||||
* @throws \RuntimeException wenn Schreiben fehlschlägt
|
||||
*/
|
||||
private function writeCsvLine($fp, array $values): void
|
||||
{
|
||||
$result = fputcsv($fp, $values, $this->delimiter, '"', '\\');
|
||||
|
||||
if ($result === false) {
|
||||
throw new \RuntimeException("Error writing CSV row");
|
||||
throw new \RuntimeException("Fehler beim Schreiben der CSV-Zeile");
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the path to the output file
|
||||
* Gibt den Pfad zur Output-Datei zurück
|
||||
*
|
||||
* @return string Output file path
|
||||
* @return string Output-Dateipfad
|
||||
*/
|
||||
public function getOutputFile(): string
|
||||
{
|
||||
@ -95,9 +96,9 @@ class CsvWriter
|
||||
}
|
||||
|
||||
/**
|
||||
* Checks whether the output file was created
|
||||
* Prüft ob Output-Datei erstellt wurde
|
||||
*
|
||||
* @return bool True if file exists
|
||||
* @return bool True wenn Datei existiert
|
||||
*/
|
||||
public function fileExists(): bool
|
||||
{
|
||||
@ -105,9 +106,9 @@ class CsvWriter
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the size of the output file
|
||||
* Gibt die Größe der Output-Datei zurück
|
||||
*
|
||||
* @return int|false File size in bytes or false on error
|
||||
* @return int|false Dateigröße in Bytes oder false bei Fehler
|
||||
*/
|
||||
public function getFileSize(): int|false
|
||||
{
|
||||
|
||||
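The `write()` hunk above takes the column names from the first row and fills missing columns with empty strings before calling `fputcsv`. The sketch below reproduces that behaviour against an in-memory stream so it can be tried without touching the file system. The function name `rowsToCsv` and the use of `php://memory` are assumptions, not part of `CsvWriter`.

```php
<?php

// Hypothetical helper: renders rows as CSV text, headers taken from the first row.
function rowsToCsv(array $data, string $delimiter = ','): string
{
    if ($data === []) {
        throw new RuntimeException('No data to write');
    }

    $fp = fopen('php://memory', 'w+');
    if ($fp === false) {
        throw new RuntimeException('Could not open memory stream');
    }

    try {
        $headers = array_keys($data[0]);
        fputcsv($fp, $headers, $delimiter, '"', '\\');

        foreach ($data as $row) {
            $values = [];
            foreach ($headers as $header) {
                $values[] = $row[$header] ?? ''; // ensure every column is present
            }
            fputcsv($fp, $values, $delimiter, '"', '\\');
        }

        rewind($fp);
        return (string) stream_get_contents($fp);
    } finally {
        fclose($fp);
    }
}

$csv = rowsToCsv([['a' => '1', 'b' => 'x'], ['a' => '2']], ';');
```

Keys that appear only in later rows are dropped in this approach, because the header list is fixed by the first row.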
@ -3,42 +3,42 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Central debug logger for transparency
|
||||
* Zentraler Debug-Logger für Transparenz
|
||||
*
|
||||
* Collects debug information from all components and makes the
|
||||
* processing traceable. Provides transparency over all
|
||||
* processing steps: metadata extraction, transformations,
|
||||
* CSV reads etc.
|
||||
* Sammelt Debug-Informationen aus allen Komponenten und macht die
|
||||
* Verarbeitung nachvollziehbar. Ermöglicht Transparenz über alle
|
||||
* Verarbeitungsschritte: Metadaten-Extraktion, Transformationen,
|
||||
* CSV-Lesevorgänge etc.
|
||||
*
|
||||
* Usage:
|
||||
* - DebugLogger::enable() → activate debug mode
|
||||
* - DebugLogger::log('category', 'message', $data) → log message
|
||||
* - DebugLogger::getLogs() → retrieve all logs
|
||||
* - DebugLogger::reset() → reset logs
|
||||
* Verwendung:
|
||||
* - DebugLogger::enable() → Debug-Modus aktivieren
|
||||
* - DebugLogger::log('category', 'message', $data) → Nachricht loggen
|
||||
* - DebugLogger::getLogs() → Alle Logs abrufen
|
||||
* - DebugLogger::reset() → Logs zurücksetzen
|
||||
*
|
||||
* Example:
|
||||
* Beispiel:
|
||||
* ```php
|
||||
* DebugLogger::enable();
|
||||
* DebugLogger::log('metadata', 'IBAN extracted', ['iban' => 'CH9300762011623852957']);
|
||||
* DebugLogger::log('metadata', 'IBAN extrahiert', ['iban' => 'CH9300762011623852957']);
|
||||
* $logs = DebugLogger::getLogs();
|
||||
* ```
|
||||
*/
|
||||
class DebugLogger
|
||||
{
|
||||
/**
|
||||
* @var bool Whether debug mode is enabled
|
||||
* @var bool Ist Debug-Modus aktiviert?
|
||||
*/
|
||||
private static bool $enabled = false;
|
||||
|
||||
/**
|
||||
* @var array Collected logs with timestamp, category, message and data
|
||||
* @var array Gesammelte Logs mit Timestamp, Kategorie, Nachricht und Daten
|
||||
*/
|
||||
private static array $logs = [];
|
||||
|
||||
/**
|
||||
* Enables debug mode
|
||||
* Aktiviert den Debug-Modus
|
||||
*
|
||||
* Once enabled, all DebugLogger::log() calls are recorded.
|
||||
* Nach Aktivierung werden alle DebugLogger::log() Aufrufe protokolliert.
|
||||
*
|
||||
* @return void
|
||||
*/
|
||||
@ -48,9 +48,9 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Disables debug mode
|
||||
* Deaktiviert den Debug-Modus
|
||||
*
|
||||
* Once disabled, DebugLogger::log() calls are ignored.
|
||||
* Nach Deaktivierung werden DebugLogger::log() Aufrufe ignoriert.
|
||||
*
|
||||
* @return void
|
||||
*/
|
||||
@ -60,16 +60,16 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Records a debug message
|
||||
* Protokolliert eine Debug-Nachricht
|
||||
*
|
||||
* Collects information about each processing step with timestamp,
|
||||
* category, message and optional data. Logs are only collected
|
||||
* when debug mode is enabled.
|
||||
* Sammelt Informationen über jeden Verarbeitungsschritt mit Timestamp,
|
||||
* Kategorie, Nachricht und optionalen Daten. Die Logs werden nur
|
||||
* gesammelt, wenn der Debug-Modus aktiviert ist.
|
||||
*
|
||||
* @param string $category Log message category
|
||||
* e.g. 'metadata', 'transformation', 'csv_reader', 'config'
|
||||
* @param string $message Description of the action or event
|
||||
* @param mixed $data Additional context data (array or any value)
|
||||
* @param string $category Kategorie der Log-Nachricht
|
||||
* z.B. 'metadata', 'transformation', 'csv_reader', 'config'
|
||||
* @param string $message Beschreibung der Aktion oder des Ereignisses
|
||||
* @param mixed $data Zusätzliche Kontextdaten (Array oder beliebiger Wert)
|
||||
*
|
||||
* @return void
|
||||
*/
|
||||
@ -88,16 +88,16 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns all collected logs
|
||||
* Gibt alle gesammelten Logs zurück
|
||||
*
|
||||
* Delivers an array of all recorded events with complete
|
||||
* information for analysis and debugging.
|
||||
* Liefert ein Array aller protokollierten Ereignisse mit vollständigen
|
||||
* Informationen für Analyse und Debugging.
|
||||
*
|
||||
* @return array Array of log entries, each with:
|
||||
* - timestamp: microsecond timestamp
|
||||
* - category: log category
|
||||
* - message: description
|
||||
* - data: additional data
|
||||
* @return array Array von Log-Einträgen, jeder mit:
|
||||
* - timestamp: Mikrosekunden-Zeitstempel
|
||||
* - category: Log-Kategorie
|
||||
* - message: Beschreibung
|
||||
* - data: Zusätzliche Daten
|
||||
*/
|
||||
public static function getLogs(): array
|
||||
{
|
||||
@ -105,10 +105,10 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Resets all logs
|
||||
* Setzt alle Logs zurück
|
||||
*
|
||||
* Clears the entire log buffer. Useful for maintaining a clean
|
||||
* state between multiple transformations.
|
||||
* Löscht den gesamten Log-Buffer. Nützlich um zwischen mehreren
|
||||
* Transformationen einen sauberen State zu haben.
|
||||
*
|
||||
* @return void
|
||||
*/
|
||||
@ -118,9 +118,9 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the number of collected log entries
|
||||
* Gibt die Anzahl der gesammelten Log-Einträge zurück
|
||||
*
|
||||
* @return int Number of recorded events
|
||||
* @return int Anzahl protokollierter Ereignisse
|
||||
*/
|
||||
public static function count(): int
|
||||
{
|
||||
@ -128,9 +128,9 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Checks whether debug mode is enabled
|
||||
* Prüft ob Debug-Modus aktiviert ist
|
||||
*
|
||||
* @return bool true if enabled, false otherwise
|
||||
* @return bool true wenn aktiviert, false sonst
|
||||
*/
|
||||
public static function isEnabled(): bool
|
||||
{
|
||||
@ -138,18 +138,18 @@ class DebugLogger
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns a formatted string of all logs
|
||||
* Gibt einen formattierten String aller Logs zurück
|
||||
*
|
||||
* Converts the log buffer into a readable format for console output.
|
||||
* Konvertiert den Log-Buffer in ein lesbares Format für Konsolen-Ausgabe.
|
||||
*
|
||||
* @param bool $includeData true = also output data, false = messages only
|
||||
* @param bool $includeData true = auch Daten ausgeben, false = nur Messages
|
||||
*
|
||||
* @return string Formatted log output
|
||||
* @return string Formatierte Log-Ausgabe
|
||||
*/
|
||||
public static function format(bool $includeData = true): string
|
||||
{
|
||||
if (empty(self::$logs)) {
|
||||
return "No debug logs available.\n";
|
||||
return "Keine Debug-Logs vorhanden.\n";
|
||||
}
|
||||
|
||||
$output = "\n=== DEBUG LOGS ===\n";
|
||||
|
||||
File diff suppressed because it is too large
@ -3,10 +3,10 @@
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Extracts metadata from header lines using regex
|
||||
* Extrahiert Metadaten aus Header-Zeilen mit Regex
|
||||
*
|
||||
* Extracts constant values from metadata lines
|
||||
* (header lines before the actual CSV table) using regex rules.
|
||||
* Diese Klasse extrahiert konstante Werte aus den Metadatenzeilen
|
||||
* (Header-Zeilen vor der eigentlichen CSV-Tabelle) mittels Regex-Regeln.
|
||||
*/
|
||||
class MetadataExtractor
|
||||
{
|
||||
@ -18,17 +18,17 @@ class MetadataExtractor
|
||||
}
|
||||
|
||||
/**
|
||||
* Extracts metadata from the provided lines
|
||||
* Extrahiert Metadaten aus den übergebenen Zeilen
|
||||
*
|
||||
* @param array $lines Array of lines from the CSV header
|
||||
* @return array Extracted metadata
|
||||
* @param array $lines Array von Zeilen aus dem CSV-Header
|
||||
* @return array Extrahierte Metadaten
|
||||
*/
|
||||
public function extract(array $lines): array
|
||||
{
|
||||
$metadata = [];
|
||||
|
||||
foreach ($this->rules as $rule) {
|
||||
// Validate required fields
|
||||
// Validiere erforderliche Felder
|
||||
if (empty($rule['name']) || empty($rule['regex'])) {
|
||||
continue;
|
||||
}
|
||||
@ -37,15 +37,15 @@ class MetadataExtractor
|
||||
$lineNumber = $rule['lineNumber'] ?? 1;
|
||||
$regex = $rule['regex'];
|
||||
|
||||
// Off-by-one fix
|
||||
// config.json: "lineNumber": 1, 2, 3 (1-based, human-readable)
|
||||
// PHP arrays: $lines[0], $lines[1], $lines[2] (0-based)
|
||||
// Conversion: arrayIndex = lineNumber - 1
|
||||
// ✅ KORRIGIERT: Off-by-One Fix
|
||||
// config.json: "lineNumber": 1, 2, 3 (1-basiert, für Menschen lesbar)
|
||||
// PHP Arrays: $lines[0], $lines[1], $lines[2] (0-basiert)
|
||||
// Konvertierung: arrayIndex = lineNumber - 1
|
||||
$arrayIndex = $lineNumber - 1;
|
||||
|
||||
// Check if line exists
|
||||
// Prüfe ob Zeile existiert
|
||||
if (!isset($lines[$arrayIndex])) {
|
||||
// Line does not exist - debug info for support
|
||||
// Zeile existiert nicht - Debug-Info für Support
|
||||
DebugLogger::log('metadata_warning', "Extraction rule not found", [
|
||||
'rule_name' => $ruleName,
|
||||
'expected_lineNumber' => $lineNumber,
|
||||
@ -57,7 +57,7 @@ class MetadataExtractor
|
||||
|
||||
$line = $lines[$arrayIndex];
|
||||
|
||||
// Regex with '#' as delimiter (allows '/' in user patterns); escape '#' in pattern
|
||||
// Regex mit '#' als Delimiter (erlaubt '/' in User-Patterns); '#' im Pattern escapen
|
||||
$pattern = '#' . str_replace('#', '\#', $regex) . '#u';
|
||||
$matchResult = @preg_match_all($pattern, $line, $matches);
|
||||
if ($matchResult === false) {
|
||||
@ -68,7 +68,7 @@ class MetadataExtractor
|
||||
continue;
|
||||
}
|
||||
if ($matchResult === 0) {
|
||||
// Regex did not match on this line
|
||||
// Regex matched nicht auf dieser Zeile
|
||||
DebugLogger::log('metadata_warning', "Regex did not match", [
|
||||
'rule_name' => $ruleName,
|
||||
'lineNumber' => $lineNumber,
|
||||
@ -78,20 +78,20 @@ class MetadataExtractor
|
||||
continue;
|
||||
}
|
||||
|
||||
// Use captureGroup to select the extraction group
|
||||
// captureGroup defines which capture group is extracted
|
||||
// 0 = complete match
|
||||
// 1 = first capture group (...)
|
||||
// 2 = second capture group, etc.
|
||||
// ✅ KORRIGIERT: captureGroup benutzen
|
||||
// captureGroup definiert welche Klammer-Gruppe extrahiert wird
|
||||
// 0 = komplette Match
|
||||
// 1 = erste Klammer-Gruppe (...)
|
||||
// 2 = zweite Klammer-Gruppe, etc.
|
||||
$captureGroup = isset($rule['captureGroup']) ? intval($rule['captureGroup']) : 1;
|
||||
|
||||
// Ensure the capture group exists
|
||||
// Sicherstellen dass die Capture Group existiert
|
||||
if (!isset($matches[$captureGroup]) || empty($matches[$captureGroup])) {
|
||||
// Fallback: use complete match if group does not exist
|
||||
// Fallback: Nutze komplette Match wenn Gruppe nicht existiert
|
||||
$metadata[$ruleName] = $matches[0][0] ?? '';
|
||||
// echo "DEBUG: extraction_rule '{$ruleName}' - captureGroup {$captureGroup} not found, falling back to complete match\n";
|
||||
} else {
|
||||
// Use the specific capture group
|
||||
// Nutze die spezifische Capture Group
|
||||
$metadata[$ruleName] = $matches[$captureGroup][0] ?? '';
|
||||
}
|
||||
|
||||
@ -105,9 +105,9 @@ class MetadataExtractor
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the number of defined extraction rules
|
||||
* Gibt die Anzahl der definierten Extraction-Rules zurück
|
||||
*
|
||||
* @return int Number of rules
|
||||
* @return int Anzahl Rules
|
||||
*/
|
||||
public function getRuleCount(): int
|
||||
{
|
||||
@ -115,9 +115,9 @@ class MetadataExtractor
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns all defined extraction rules
|
||||
* Gibt alle definierten Extraction-Rules zurück
|
||||
*
|
||||
* @return array The rules
|
||||
* @return array Die Rules
|
||||
*/
|
||||
public function getRules(): array
|
||||
{
|
||||
|
||||
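The `extract()` hunks above combine three ideas: converting the 1-based `lineNumber` to an array index, wrapping the user pattern in `#` delimiters, and selecting a capture group with a fallback to the complete match. The following sketch shows those steps for a single rule. It is a simplification: the function name `extractValue` is hypothetical, and it uses `preg_match` where the diff uses `preg_match_all`.

```php
<?php

// Hypothetical single-rule version of the extraction logic shown in the diff.
function extractValue(array $lines, int $lineNumber, string $regex, int $captureGroup = 1): ?string
{
    // Config uses 1-based line numbers, PHP arrays are 0-based.
    $arrayIndex = $lineNumber - 1;
    if (!isset($lines[$arrayIndex])) {
        return null; // line does not exist
    }

    // '#' as delimiter allows '/' in user patterns; '#' inside the pattern is escaped.
    $pattern = '#' . str_replace('#', '\#', $regex) . '#u';

    $result = @preg_match($pattern, $lines[$arrayIndex], $matches);
    if ($result !== 1) {
        return null; // invalid pattern or no match
    }

    // Group 0 is the complete match; fall back to it if the group is missing.
    return $matches[$captureGroup] ?? $matches[0];
}

$lines = ['Kontonummer: 0123 4567', 'IBAN: CH93 0076 2011 6238 5295 7'];
$iban = extractValue($lines, 2, 'IBAN:\s*(.+)$');
```

Returning `null` for a missing line or a non-matching pattern is a choice made for this sketch; the code in the diff logs a warning through `DebugLogger` and continues with the next rule instead.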
@ -1,161 +0,0 @@
|
||||
<?php
|
||||
|
||||
namespace UbsCsvTransformer;
|
||||
|
||||
/**
|
||||
* Evaluates row-filter conditions defined in the "skipIf" config key.
|
||||
*
|
||||
* A node is either:
|
||||
*
|
||||
* - A bare condition:
|
||||
* { "column": "A", "operator": "empty" }
|
||||
*
|
||||
* - An AND group:
|
||||
* { "and": [ <node>, <node>, ... ] }
|
||||
*
|
||||
* - An OR group:
|
||||
* { "or": [ <node>, <node>, ... ] }
|
||||
*
|
||||
* Groups may be nested arbitrarily.
|
||||
*
|
||||
* Supported operators for conditions:
|
||||
*
|
||||
* | Operator | Matches when … |
|
||||
* |----------------|------------------------------------------------------|
|
||||
* | empty | column value is empty string |
|
||||
* | not-empty | column value is not empty |
|
||||
* | equals | value === "value" (string compare) |
|
||||
* | not-equals | value !== "value" |
|
||||
* | contains | str_contains(value, "value") |
|
||||
* | not-contains | !str_contains(value, "value") |
|
||||
* | matches | preg_match("pattern", value) === 1 |
|
||||
* | not-matches | preg_match("pattern", value) !== 1 |
|
||||
* | gt | (float) value > (float) "value" |
|
||||
* | gte | (float) value >= (float) "value" |
|
||||
* | lt | (float) value < (float) "value" |
|
||||
* | lte | (float) value <= (float) "value" |
|
||||
*
|
||||
* Usage in config:
|
||||
* ```json
|
||||
* "skipIf": {
|
||||
* "and": [
|
||||
* { "column": "Beschreibung1", "operator": "empty" },
|
||||
* { "column": "Beschreibung2", "operator": "empty" }
|
||||
* ]
|
||||
* }
|
||||
* ```
|
||||
*
|
||||
* ```json
|
||||
* "skipIf": {
|
||||
* "or": [
|
||||
* { "column": "Amount", "operator": "gt", "value": "10000" },
|
||||
* { "and": [
|
||||
* { "column": "Type", "operator": "equals", "value": "Saldo" },
|
||||
* { "column": "Notes", "operator": "empty" }
|
||||
* ]}
|
||||
* ]
|
||||
* }
|
||||
* ```
|
||||
*/
|
||||
class RowFilter
|
||||
{
|
||||
/**
|
||||
* Evaluates a filter node against a data row.
|
||||
*
|
||||
* Returns true when the row should be skipped.
|
||||
*
|
||||
* @param array<string, mixed> $node Filter node (condition or group)
|
||||
* @param array<string, string> $row Data row with column values
|
||||
*
|
||||
* @throws \InvalidArgumentException on unknown operator
|
||||
*/
|
||||
public static function evaluate(array $node, array $row): bool
|
||||
{
|
||||
// AND group
|
||||
if (isset($node['and'])) {
|
||||
/** @var array<int, array<string, mixed>> $children */
|
||||
$children = $node['and'];
|
||||
foreach ($children as $child) {
|
||||
if (!self::evaluate($child, $row)) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
// OR group
|
||||
if (isset($node['or'])) {
|
||||
/** @var array<int, array<string, mixed>> $children */
|
||||
$children = $node['or'];
|
||||
foreach ($children as $child) {
|
||||
if (self::evaluate($child, $row)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
// Bare condition
|
||||
return self::evaluateCondition($node, $row);
|
||||
}
|
||||
|
||||
/**
|
||||
* Evaluates a single leaf condition.
|
||||
*
|
||||
* @param array<string, mixed> $condition
|
||||
* @param array<string, string> $row
|
||||
*
|
||||
* @throws \InvalidArgumentException on unknown operator
|
||||
*/
|
||||
private static function evaluateCondition(array $condition, array $row): bool
|
||||
{
|
||||
$column = (string) ($condition['column'] ?? '');
|
||||
$operator = strtolower((string) ($condition['operator'] ?? ''));
|
||||
$colValue = (string) ($row[$column] ?? '');
|
||||
$cmpValue = (string) ($condition['value'] ?? '');
|
||||
$pattern = (string) ($condition['pattern'] ?? '');
|
||||
|
||||
switch ($operator) {
|
||||
case 'empty':
|
||||
return $colValue === '';
|
||||
|
||||
case 'not-empty':
|
||||
return $colValue !== '';
|
||||
|
||||
case 'equals':
|
||||
return $colValue === $cmpValue;
|
||||
|
||||
case 'not-equals':
|
||||
return $colValue !== $cmpValue;
|
||||
|
||||
case 'contains':
|
||||
return str_contains($colValue, $cmpValue);
|
||||
|
||||
case 'not-contains':
|
||||
return !str_contains($colValue, $cmpValue);
|
||||
|
||||
case 'matches':
|
||||
$delimited = '#' . str_replace('#', '\#', $pattern) . '#u';
|
||||
return preg_match($delimited, $colValue) === 1;
|
||||
|
||||
case 'not-matches':
|
||||
$delimited = '#' . str_replace('#', '\#', $pattern) . '#u';
|
||||
return preg_match($delimited, $colValue) !== 1;
|
||||
|
||||
case 'gt':
|
||||
return (float) $colValue > (float) $cmpValue;
|
||||
|
||||
case 'gte':
|
||||
return (float) $colValue >= (float) $cmpValue;
|
||||
|
||||
case 'lt':
|
||||
return (float) $colValue < (float) $cmpValue;
|
||||
|
||||
case 'lte':
|
||||
return (float) $colValue <= (float) $cmpValue;
|
||||
|
||||
default:
|
||||
throw new \InvalidArgumentException("Unknown RowFilter operator: '{$operator}'");
|
||||
}
|
||||
}
|
||||
}
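The `RowFilter` class above evaluates nested `and`/`or` groups recursively and returns `true` when a row should be skipped. As a usage-oriented sketch, the reduced function below keeps the same recursion but supports only three of the twelve operators, applied to the second `skipIf` example from the docblock. The name `shouldSkip` is hypothetical and stands in for `RowFilter::evaluate()`.

```php
<?php

// Reduced stand-in for RowFilter::evaluate(); supports 'empty', 'equals' and 'gt' only.
function shouldSkip(array $node, array $row): bool
{
    // AND group: every child must match.
    if (isset($node['and'])) {
        foreach ($node['and'] as $child) {
            if (!shouldSkip($child, $row)) {
                return false;
            }
        }
        return true;
    }

    // OR group: one matching child is enough.
    if (isset($node['or'])) {
        foreach ($node['or'] as $child) {
            if (shouldSkip($child, $row)) {
                return true;
            }
        }
        return false;
    }

    // Bare condition: missing columns are treated as empty strings.
    $value = (string) ($row[$node['column'] ?? ''] ?? '');
    $cmp = (string) ($node['value'] ?? '');

    return match (strtolower((string) ($node['operator'] ?? ''))) {
        'empty' => $value === '',
        'equals' => $value === $cmp,
        'gt' => (float) $value > (float) $cmp,
        default => throw new InvalidArgumentException('Unknown operator'),
    };
}

$skipIf = json_decode(
    '{"or":[{"column":"Amount","operator":"gt","value":"10000"},'
    . '{"and":[{"column":"Type","operator":"equals","value":"Saldo"},'
    . '{"column":"Notes","operator":"empty"}]}]}',
    true
);
```

One consequence of this recursion, which also applies to the class as shown, is that an empty `and` list matches every row and an empty `or` list matches none.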
|
||||
@ -8,19 +8,18 @@ use UbsCsvTransformer\ConfigurationLoader;
|
||||
use UbsCsvTransformer\MetadataExtractor;
|
||||
use UbsCsvTransformer\ColumnTransformer;
|
||||
use UbsCsvTransformer\FireflyImporter;
|
||||
use UbsCsvTransformer\RowFilter;
|
||||
|
||||
/**
|
||||
* Orchestrates the complete CSV transformation pipeline
|
||||
* Orchestriert die gesamte CSV-Transformations-Pipeline
|
||||
*
|
||||
* Coordinates all steps from reading the CSV through metadata extraction
|
||||
* and column transformation to output and optional import into Firefly III.
|
||||
* Koordiniert alle Schritte von CSV-Einlesen über Metadaten-Extraktion
|
||||
* und Spalten-Transformation bis zur Ausgabe und optional zum Import in Firefly III.
|
||||
*
|
||||
* @property ConfigurationLoader $configLoader Manages configuration
|
||||
* @property CsvWriter $csvWriter Writes output CSV
|
||||
* @property MetadataExtractor $metadataExtractor Extracts metadata from header
|
||||
* @property ColumnTransformer $columnTransformer Transforms columns
|
||||
* @property array $csvStructure CSV structure configuration
|
||||
* @property ConfigurationLoader $configLoader Verwaltet Konfiguration
|
||||
* @property CsvWriter $csvWriter Schreibt Output-CSV
|
||||
* @property MetadataExtractor $metadataExtractor Extrahiert Metadaten aus Header
|
||||
* @property ColumnTransformer $columnTransformer Transformiert Spalten
|
||||
* @property array $csvStructure CSV-Struktur-Konfiguration
|
||||
*/
|
||||
class TransformerEngine
|
||||
{
|
||||
@ -34,24 +33,21 @@ class TransformerEngine
|
||||
private bool $debugMode = false;
|
||||
|
||||
/**
|
||||
* Initialises TransformerEngine with configuration
|
||||
* Initialisiert TransformerEngine mit Konfiguration
|
||||
*
|
||||
* Loads all required configurations and initialises
|
||||
* the components (MetadataExtractor, ColumnTransformer, CsvWriter).
|
||||
* CsvReader is instantiated later in transform() and validate() with the file path.
|
||||
* Lädt alle erforderlichen Konfigurationen und initialisiert
|
||||
* die Komponenten (MetadataExtractor, ColumnTransformer, CsvWriter).
|
||||
* CsvReader wird später in transform() und validate() initialisiert mit dem Dateipfad.
|
||||
*
|
||||
* @param ConfigurationLoader $configLoader Loads configuration files
|
||||
* @param bool $debugMode true = enable debug mode
|
||||
* @param ConfigurationLoader $configLoader Lädt Konfigurationsdateien
|
||||
* @param bool $debugMode true = Debug-Modus aktivieren
|
||||
*
|
||||
* @throws \RuntimeException if required configurations are missing
|
||||
* @throws \RuntimeException wenn erforderliche Konfigurationen fehlen
|
||||
*/
|
||||
public function __construct(ConfigurationLoader $configLoader, bool $debugMode = false)
|
||||
{
|
||||
$this->configLoader = $configLoader;
|
||||
$this->debugMode = $debugMode;
|
||||
if ($debugMode) {
|
||||
DebugLogger::enable();
|
||||
}
|
||||
|
||||
$config = $configLoader->getAll();
|
||||
|
||||
@ -67,7 +63,7 @@ class TransformerEngine
|
||||
$config['capitalizationExceptions'] ?? []
|
||||
);
|
||||
|
||||
// Determine output file name from configuration
|
||||
// Bestimme Output-Dateiname aus Konfiguration
|
||||
$outputDir = $config['directories']['output'] ?? './output';
|
||||
$outputFileName = $config['csvStructure']['outputFilename'] ?? 'transformed.csv';
|
||||
$outputFile = rtrim($outputDir, '/') . '/' . $outputFileName;
|
||||
@ -79,9 +75,9 @@ class TransformerEngine
|
||||
}
|
||||
|
||||
/**
|
||||
* Enables or disables debug mode
|
||||
* Aktiviert oder deaktiviert den Debug-Modus
|
||||
*
|
||||
* @param bool $enabled true = debug mode enabled
|
||||
* @param bool $enabled true = Debug-Modus aktiviert
|
||||
* @return void
|
||||
*/
|
||||
public function setDebugMode(bool $enabled): void
|
||||
@ -95,30 +91,30 @@ class TransformerEngine
|
||||
}
|
||||
|
||||
/**
|
||||
* Transforms a CSV file
|
||||
* Transformiert eine CSV-Datei
|
||||
*
|
||||
* Performs the following steps:
|
||||
* 1. Read CSV file with CsvReader
|
||||
* 2. Extract metadata from header
|
||||
* 3. Transform columns according to configuration
|
||||
* 4. Write data to output CSV
|
||||
* 5. Collect sample data (maximum 3 rows or maxRows)
|
||||
* Führt folgende Schritte durch:
|
||||
* 1. CSV-Datei einlesen mit CsvReader
|
||||
* 2. Metadaten aus Header extrahieren
|
||||
* 3. Spalten gemäß Konfiguration transformieren
|
||||
* 4. Daten in Output-CSV schreiben
|
||||
* 5. Beispiel-Daten sammeln (maximal 3 Zeilen oder maxRows)
|
||||
*
|
||||
* The output file path is determined from the configuration and cannot be overridden.
|
||||
* Der Output-Dateipfad wird aus der Konfiguration bestimmt und kann nicht überschrieben werden.
|
||||
*
|
||||
* @param string $inputFile Path to the input CSV file
|
||||
* @param int $maxRows Maximum number of data rows to transform (0 = all).
|
||||
* Sample data is limited to min(3, maxRows)
|
||||
* @param string $inputFile Pfad zur Input-CSV-Datei
|
||||
* @param int $maxRows Maximale Anzahl Datenzeilen zu transformieren (0 = alle).
|
||||
* Beispiel-Daten werden begrenzt auf min(3, maxRows)
|
||||
*
|
||||
* @return array Transformation result with:
|
||||
* - success: bool (true = successful, false = error)
|
||||
* - inputFile: string (input file path, on success only)
|
||||
* - outputFile: string (output file path, on success only)
|
||||
* - rowsProcessed: int (actually processed data rows)
|
||||
* - sampleData: array (first sample rows, max 3 or maxRows)
|
||||
* - metadata: array (extracted metadata, on success only)
|
||||
* - outputColumns: int (number of output columns)
|
||||
* - error: string (error message, on failure only)
|
||||
* @return array Transformations-Ergebnis mit:
|
||||
* - success: bool (true = erfolgreich, false = Fehler)
|
||||
* - inputFile: string (Input-Dateipfad, nur bei Erfolg)
|
||||
* - outputFile: string (Output-Dateipfad, nur bei Erfolg)
|
||||
* - rowsProcessed: int (tatsächlich verarbeitete Datenzeilen)
|
||||
* - sampleData: array (Erste Beispiel-Zeilen, max 3 oder maxRows)
|
||||
* - metadata: array (Extrahierte Metadaten, nur bei Erfolg)
|
||||
* - outputColumns: int (Anzahl Output-Spalten)
|
||||
* - error: string (Fehlermeldung, nur bei Fehler)
|
||||
*/
|
||||
public function transform(string $inputFile, int $maxRows = 0): array
|
||||
{
|
||||
@ -134,59 +130,50 @@ class TransformerEngine
|
||||
]);
|
||||
}
|
||||
|
||||
// Validate input file
|
||||
// Validiere Input-Datei
|
||||
if (!file_exists($inputFile)) {
|
||||
throw new \RuntimeException("Input file not found: {$inputFile}");
|
||||
throw new \RuntimeException("Input-Datei nicht gefunden: {$inputFile}");
|
||||
}
|
||||
|
||||
// Initialise CsvReader with file path and configuration
|
||||
// Initialisiere CsvReader mit Dateipfad und Konfiguration
|
||||
$csvReader = new CsvReader($inputFile, $this->csvStructure);
|
||||
|
||||
// Read metadata lines (before the header line)
|
||||
// Lese Metadaten-Zeilen (vor der Header-Zeile)
|
||||
$metadataLines = $csvReader->readMetadataLines();
|
||||
|
||||
// Extract metadata from the metadata lines
|
||||
// Extrahiere Metadaten aus den Metadaten-Zeilen
|
||||
$metadata = $this->metadataExtractor->extract($metadataLines);
|
||||
|
||||
// Initialise ColumnTransformer with extracted metadata
|
||||
// Initialisiere ColumnTransformer mit extrahierten Metadaten
|
||||
$this->columnTransformer = new ColumnTransformer(
|
||||
$this->configLoader->get('columnTransformations', []),
|
||||
$metadata,
|
||||
$this->configLoader->get('capitalizationExceptions', [])
|
||||
);
|
||||
|
||||
// Read CSV data with header keys as array keys
|
||||
// Lese CSV-Daten mit Header-Keys als Array-Keys
|
||||
$dataRows = $csvReader->readCsvData($maxRows);
|
||||
if (empty($dataRows)) {
|
||||
throw new \RuntimeException("No data rows in CSV file");
|
||||
throw new \RuntimeException("Keine Datenzeilen in CSV-Datei");
|
||||
}
|
||||
|
||||
// Calculate limit for sample data
|
||||
// Berechne Limit für Beispiel-Daten
|
||||
$sampleLimit = $maxRows == 0 ? 3 : $maxRows;
|
||||
|
||||
// Transform rows and collect them
|
||||
// Transformiere Zeilen und sammle sie
|
||||
$transformedData = [];
|
||||
|
||||
/** @var array<string, mixed>|null $skipIfNode */
|
||||
$skipIfNode = $this->configLoader->get('skipIf', null);
|
||||
|
||||
foreach ($dataRows as $row) {
|
||||
// Check if maxRows reached
|
||||
// Prüfe ob maxRows erreicht
|
||||
if ($maxRows > 0 && $this->rowsProcessed >= $maxRows) {
|
||||
break;
|
||||
}
|
||||
|
||||
// Skip row if filter condition matches
|
||||
if ($skipIfNode !== null && RowFilter::evaluate($skipIfNode, $row)) {
|
||||
DebugLogger::log('transformer', 'Row skipped by skipIf filter', ['row' => $row]);
|
||||
continue;
|
||||
}
|
||||
|
||||
// Transform row
|
||||
// Transformiere Zeile
|
||||
$transformedRow = $this->columnTransformer->transformRow($row);
|
||||
$transformedData[] = $transformedRow;
|
||||
|
||||
// Save sample data
|
||||
// Speichere Beispiel-Daten
|
||||
if (count($this->sampleData) < $sampleLimit) {
|
||||
$this->sampleData[] = $transformedRow;
|
||||
}
|
||||
@ -194,7 +181,7 @@ class TransformerEngine
|
||||
$this->rowsProcessed++;
|
||||
}
|
||||
|
||||
// Remove columns to be excluded from the output
|
||||
// Entferne Spalten die aus dem Output ausgeschlossen werden sollen
|
||||
$excludeColumns = $this->csvStructure['excludeOutputColumns'] ?? [];
|
||||
if (!empty($excludeColumns)) {
|
||||
$excludeMap = array_flip($excludeColumns);
|
||||
@ -208,7 +195,7 @@ class TransformerEngine
|
||||
);
|
||||
}
|
||||
|
||||
// Write all transformed data to output CSV
|
||||
// Schreibe alle transformierten Daten in Output-CSV
|
||||
$this->csvWriter->write($transformedData);
|
||||
|
||||
$result = [
|
||||
@ -238,43 +225,43 @@ class TransformerEngine
|
||||
}
|
||||
|
||||
/**
|
||||
* Transforms and imports CSV into Firefly III
|
||||
* Transformiert und importiert CSV in Firefly III
|
||||
*
|
||||
* Performs transformation and imports the output file
|
||||
* into Firefly III if enabled in the configuration.
|
||||
* Führt Transformation durch und importiert die Ausgabe-Datei
|
||||
* in Firefly III wenn in der Konfiguration aktiviert.
|
||||
*
|
||||
* Backwards-compatible with legacy signature.
|
||||
* Rückwärts-kompatibel mit legacy Signatur.
|
||||
*
|
||||
* @param string $inputFile Path to the input CSV file
|
||||
* @param int $maxRows Maximum number of data rows to process (0 = all)
|
||||
* @param string $inputFile Pfad zur Input-CSV-Datei
|
||||
* @param int $maxRows Maximale Anzahl Datenzeilen zu verarbeiten (0 = alle)
|
||||
*
|
||||
* @return array Transformation and import result with:
|
||||
* - success: bool (true = transformation successful)
|
||||
* @return array Transformations- und Import-Ergebnis mit:
|
||||
* - success: bool (true = transformation erfolgreich)
|
||||
* - inputFile: string
|
||||
* - outputFile: string
|
||||
* - rowsProcessed: int
|
||||
* - sampleData: array
|
||||
* - metadata: array
|
||||
* - outputColumns: int
|
||||
* - import: array (Firefly import result, if autoImport active)
|
||||
* - error: string (if error)
|
||||
* - import: array (Firefly Import-Ergebnis, wenn autoImport aktiv)
|
||||
* - error: string (falls Fehler)
|
||||
*/
|
||||
public function transformAndImport(string $inputFile, int $maxRows = 0): array
|
||||
{
|
||||
// Transform first
|
||||
// Zuerst transformieren
|
||||
$transformResult = $this->transform($inputFile, $maxRows);
|
||||
|
||||
if (!$transformResult['success']) {
|
||||
return $transformResult;
|
||||
}
|
||||
|
||||
// Check whether auto-import is enabled in configuration
|
||||
// Prüfe ob Auto-Import in Konfiguration aktiviert ist
|
||||
$fireflyConfig = $this->configLoader->get('fireflyImport', []);
|
||||
if (empty($fireflyConfig['autoImport'])) {
|
||||
return $transformResult;
|
||||
}
|
||||
|
||||
// Perform Firefly import
|
||||
// Führe Firefly-Import durch
|
||||
try {
|
||||
$importer = new FireflyImporter($fireflyConfig);
|
||||
$importResult = $importer->import($transformResult['outputFile']);
|
||||
@ -291,19 +278,19 @@ class TransformerEngine
|
||||
}
|
||||
|
||||
/**
|
||||
* Validates a CSV file against the configuration
|
||||
* Validiert eine CSV-Datei gegen die Konfiguration
|
||||
*
|
||||
* Checks whether required metadata is present
|
||||
* and whether the CSV structure matches the configuration.
|
||||
* Prüft ob erforderliche Metadaten vorhanden sind
|
||||
* und ob die CSV-Struktur der Konfiguration entspricht.
|
||||
*
|
||||
* @param string $inputFile Path to the CSV file to validate
|
||||
* @param string $inputFile Pfad zur zu validierenden CSV-Datei
|
||||
*
|
||||
* @return array Validation result with:
|
||||
* - valid: bool (true = validation successful)
|
||||
* - metadata: array (extracted metadata, when valid)
|
||||
* - line_count: int (total number of lines, when valid)
|
||||
* - error: string (error message, when not valid)
|
||||
* - metadata_found: array (found metadata despite error)
|
||||
* @return array Validierungs-Ergebnis mit:
|
||||
* - valid: bool (true = Validierung erfolgreich)
|
||||
* - metadata: array (Extrahierte Metadaten, wenn valid)
|
||||
* - line_count: int (Gesamtzahl Zeilen, wenn valid)
|
||||
* - error: string (Fehlermeldung, wenn nicht valid)
|
||||
* - metadata_found: array (Gefundene Metadaten trotz Fehler)
|
||||
*/
|
||||
public function validate(string $inputFile): array
|
||||
{
|
||||
@ -311,18 +298,18 @@ class TransformerEngine
|
||||
if (!file_exists($inputFile)) {
|
||||
return [
|
||||
'valid' => false,
|
||||
'error' => "File not found: {$inputFile}",
|
||||
'error' => "Datei nicht gefunden: {$inputFile}",
|
||||
];
|
||||
}
|
||||
|
||||
// Initialise CsvReader with file path
|
||||
// Initialisiere CsvReader mit Dateipfad
|
||||
$csvReader = new CsvReader($inputFile, $this->csvStructure);
|
||||
|
||||
// Extract metadata lines (before the header line)
|
||||
// Extrahiere Metadaten-Zeilen (vor der Header-Zeile)
|
||||
$metadataLines = $csvReader->readMetadataLines();
|
||||
$metadata = $this->metadataExtractor->extract($metadataLines);
|
||||
|
||||
// Check for required metadata
|
||||
// Prüfe auf erforderliche Metadaten
|
||||
$requiredMetadata = [
|
||||
'account_iban',
|
||||
'currency_code',
|
||||
@ -338,12 +325,12 @@ class TransformerEngine
|
||||
if (!empty($missingMetadata)) {
|
||||
return [
|
||||
'valid' => false,
|
||||
'error' => 'Missing metadata: ' . implode(', ', $missingMetadata),
|
||||
'error' => 'Fehlende Metadaten: ' . implode(', ', $missingMetadata),
|
||||
'metadata_found' => $metadata,
|
||||
];
|
||||
}
|
||||
|
||||
// Count total number of lines
|
||||
// Zähle Gesamtzahl Zeilen
|
||||
$lineCount = $csvReader->countLines();
|
||||
|
||||
return [
|
||||
@ -354,15 +341,15 @@ class TransformerEngine
|
||||
} catch (\Exception $e) {
|
||||
return [
|
||||
'valid' => false,
|
||||
'error' => 'Validation error: ' . $e->getMessage(),
|
||||
'error' => 'Validierungs-Fehler: ' . $e->getMessage(),
|
||||
];
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the collected sample data
|
||||
* Gibt die gesammelten Beispiel-Daten zurück
|
||||
*
|
||||
* @return array Sample data (maximum 3 or maxRows rows)
|
||||
* @return array Beispiel-Daten (maximal 3 oder maxRows Zeilen)
|
||||
*/
|
||||
public function getSampleData(): array
|
||||
{
|
||||
@ -370,9 +357,9 @@ class TransformerEngine
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the number of processed data rows
|
||||
* Gibt die Anzahl verarbeiteter Datenzeilen zurück
|
||||
*
|
||||
* @return int Number of transformed rows
|
||||
* @return int Anzahl transformierter Zeilen
|
||||
*/
|
||||
public function getRowsProcessed(): int
|
||||
{
|
||||
|
||||
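The row loop in `transform()` above combines three rules: stop once `maxRows` rows are processed, skip rows matched by the `skipIf` filter, and keep only the first few transformed rows as sample data. The following is a minimal, self-contained sketch of that control flow, not the project's implementation. `processRows` and the `$shouldSkip` / `$transformRow` callables are hypothetical stand-ins for `RowFilter::evaluate()` and `ColumnTransformer::transformRow()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mirroring the loop in TransformerEngine::transform().
 *
 * @param array<int, array<string, string>> $rows
 * @param callable $shouldSkip   stand-in for the skipIf filter (row => bool)
 * @param callable $transformRow stand-in for transformRow() (row => row)
 * @return array<string, mixed> keys: rows, sample, processed
 */
function processRows(array $rows, int $maxRows, callable $shouldSkip, callable $transformRow): array
{
    // Same rule as in the diff: 3 samples when unlimited, otherwise maxRows.
    $sampleLimit = $maxRows === 0 ? 3 : $maxRows;

    $out = [];
    $sample = [];
    $processed = 0;

    foreach ($rows as $row) {
        // Stop as soon as the requested number of rows has been processed.
        if ($maxRows > 0 && $processed >= $maxRows) {
            break;
        }
        // In this sketch, skipped rows do not count towards $processed.
        if ($shouldSkip($row)) {
            continue;
        }
        $transformed = $transformRow($row);
        $out[] = $transformed;
        if (count($sample) < $sampleLimit) {
            $sample[] = $transformed;
        }
        $processed++;
    }

    return ['rows' => $out, 'sample' => $sample, 'processed' => $processed];
}
```

Passing the filter and the transformer as callables keeps the sketch independent of the real classes, so the loop can be exercised with plain closures.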
@ -435,90 +435,6 @@ class ColumnTransformerTest extends TestCase
|
||||
$this->assertSame('Hello World', $result['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionAppendWithDelimiter(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append',
|
||||
'appendDelimiter' => ', ',
|
||||
], ['A' => 'World', 'B' => 'Hello']);
|
||||
$this->assertSame('Hello, World', $result['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionAppendWithDelimiterSkippedWhenTargetEmpty(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append',
|
||||
'appendDelimiter' => ', ',
|
||||
], ['A' => 'Hello', 'B' => '']);
|
||||
$this->assertSame('Hello', $result['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionAppendLine(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append-line',
|
||||
], ['A' => 'Line2', 'B' => 'Line1']);
|
||||
$this->assertSame("Line1\nLine2", $result['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionAppendLineNoLeadingNewlineWhenEmpty(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append-line',
|
||||
], ['A' => 'Line1', 'B' => '']);
|
||||
$this->assertSame('Line1', $result['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionOverwriteIfEmpty(): void
|
||||
{
|
||||
$resultEmpty = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'overwrite-if-empty',
|
||||
], ['A' => 'new', 'B' => '']);
|
||||
$this->assertSame('new', $resultEmpty['B']);
|
||||
|
||||
$resultFilled = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'overwrite-if-empty',
|
||||
], ['A' => 'new', 'B' => 'existing']);
|
||||
$this->assertSame('existing', $resultFilled['B']);
|
||||
}
|
||||
|
||||
public function testOutputActionOverwriteIfNotEmpty(): void
|
||||
{
|
||||
$resultNotEmpty = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'overwrite-if-not-empty',
|
||||
], ['A' => 'new', 'B' => 'old']);
|
||||
$this->assertSame('new', $resultNotEmpty['B']);
|
||||
|
||||
$resultEmpty = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'overwrite-if-not-empty',
|
||||
], ['A' => '', 'B' => 'old']);
|
||||
$this->assertSame('old', $resultEmpty['B']);
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// multi-output split
|
||||
// -------------------------------------------------------------------------
|
||||
@ -588,139 +504,4 @@ class ColumnTransformerTest extends TestCase
|
||||
$transformer->transformRow(['A' => '1', 'B' => '2', 'C' => '3']);
|
||||
$this->assertSame(2, $transformer->getOutputColumns());
|
||||
}
|
||||
|
||||
    // -------------------------------------------------------------------------
    // timeperiod
    // -------------------------------------------------------------------------

    /** @var array<int, array<string, string>> */
    private array $testPeriods = [
        ['from' => '04:00:00', 'to' => '08:59:59', 'label' => 'Morgen'],
        ['from' => '09:00:00', 'to' => '10:59:59', 'label' => 'Vormittag'],
        ['from' => '11:00:00', 'to' => '13:59:59', 'label' => 'Mittag'],
        ['from' => '14:00:00', 'to' => '17:59:59', 'label' => 'Nachmittag'],
        ['from' => '18:00:00', 'to' => '21:59:59', 'label' => 'Abend'],
        ['from' => '22:00:00', 'to' => '03:59:59', 'label' => 'Nacht'],
    ];
|
||||
|
||||
public function testTimePeriodBasicMapping(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'Time',
|
||||
'outputColumn' => 'Period',
|
||||
'type' => 'timeperiod',
|
||||
'timeFormat' => 'H:i:s',
|
||||
'periods' => $this->testPeriods,
|
||||
'default' => '',
|
||||
], ['Time' => '09:30:00', 'Period' => '']);
|
||||
$this->assertSame('Vormittag', $result['Period']);
|
||||
}
|
||||
|
||||
public function testTimePeriodMidnightSpanning(): void
|
||||
{
|
||||
$result1 = $this->applyOne([
|
||||
'sourceColumn' => 'Time',
|
||||
'outputColumn' => 'Period',
|
||||
'type' => 'timeperiod',
|
||||
'timeFormat' => 'H:i:s',
|
||||
'periods' => $this->testPeriods,
|
||||
'default' => '',
|
||||
], ['Time' => '23:00:00', 'Period' => '']);
|
||||
$this->assertSame('Nacht', $result1['Period']);
|
||||
|
||||
$result2 = $this->applyOne([
|
||||
'sourceColumn' => 'Time',
|
||||
'outputColumn' => 'Period',
|
||||
'type' => 'timeperiod',
|
||||
'timeFormat' => 'H:i:s',
|
||||
'periods' => $this->testPeriods,
|
||||
'default' => '',
|
||||
], ['Time' => '02:00:00', 'Period' => '']);
|
||||
$this->assertSame('Nacht', $result2['Period']);
|
||||
}
|
||||
|
||||
    public function testTimePeriodNoMatch(): void
    {
        // 03:45 falls outside the only configured range (Day, 09:00-17:59),
        // so the configured default must be returned
        $result = $this->applyOne([
            'sourceColumn' => 'Time',
            'outputColumn' => 'Period',
            'type' => 'timeperiod',
            'timeFormat' => 'H:i:s',
            'periods' => [
                ['from' => '09:00:00', 'to' => '17:59:59', 'label' => 'Day'],
            ],
            'default' => 'Unknown',
        ], ['Time' => '03:45:00', 'Period' => '']);
        $this->assertSame('Unknown', $result['Period']);
    }
|
||||
|
||||
public function testTimePeriodInvalidInput(): void
|
||||
{
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'Time',
|
||||
'outputColumn' => 'Period',
|
||||
'type' => 'timeperiod',
|
||||
'timeFormat' => 'H:i:s',
|
||||
'periods' => $this->testPeriods,
|
||||
'default' => 'N/A',
|
||||
], ['Time' => '', 'Period' => '']);
|
||||
$this->assertSame('N/A', $result['Period']);
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// ucwordsfirst guard
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
public function testUcwordsFirstSkipsLowercase(): void
|
||||
{
|
||||
// Input already contains lowercase letters → must be returned unchanged
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'A',
|
||||
'type' => 'ucwordsfirst',
|
||||
], ['A' => 'Coop pronto chur']);
|
||||
$this->assertSame('Coop pronto chur', $result['A']);
|
||||
}
|
||||
|
||||
public function testUcwordsFirstAppliesAllCaps(): void
|
||||
{
|
||||
// Fully uppercase input → capitalise first letter of each word
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'A',
|
||||
'type' => 'ucwordsfirst',
|
||||
], ['A' => 'COOP PRONTO']);
|
||||
$this->assertSame('Coop Pronto', $result['A']);
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// append-if-not-empty
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
public function testAppendIfNotEmptySkipsEmpty(): void
|
||||
{
|
||||
// Result is empty → target column must remain unchanged
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append-if-not-empty',
|
||||
'appendDelimiter' => ' ',
|
||||
], ['A' => '', 'B' => 'existing']);
|
||||
$this->assertSame('existing', $result['B']);
|
||||
}
|
||||
|
||||
public function testAppendIfNotEmptyAppendsNonEmpty(): void
|
||||
{
|
||||
// Non-empty result → appended with delimiter
|
||||
$result = $this->applyOne([
|
||||
'sourceColumn' => 'A',
|
||||
'outputColumn' => 'B',
|
||||
'type' => 'map',
|
||||
'outputAction' => 'append-if-not-empty',
|
||||
'appendDelimiter' => ' ',
|
||||
], ['A' => 'new', 'B' => 'existing']);
|
||||
$this->assertSame('existing new', $result['B']);
|
||||
}
|
||||
}
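The `timeperiod` tests above rely on one non-obvious rule: a period whose `from` is later than its `to` (such as `22:00:00` to `03:59:59`) wraps around midnight. A minimal sketch of how such matching could work is shown below. `matchTimePeriod` is a hypothetical function, not the `ColumnTransformer` API, and it assumes zero-padded `H:i:s` strings rather than honouring a configurable `timeFormat`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical matcher for labelled time ranges.
 * Assumes zero-padded H:i:s strings, so strcmp() orders them chronologically.
 *
 * @param array<int, array<string, string>> $periods each with from, to, label
 */
function matchTimePeriod(string $time, array $periods, string $default): string
{
    // Empty or malformed input falls back to the default label.
    if (preg_match('/^\d{2}:\d{2}:\d{2}$/', $time) !== 1) {
        return $default;
    }

    foreach ($periods as $period) {
        $afterFrom = strcmp($time, $period['from']) >= 0;
        $beforeTo = strcmp($time, $period['to']) <= 0;
        // from > to means the range spans midnight.
        $wraps = strcmp($period['from'], $period['to']) > 0;

        // Wrapping range: late evening OR early morning. Normal range: both bounds hold.
        $hit = $wraps ? ($afterFrom || $beforeTo) : ($afterFrom && $beforeTo);
        if ($hit) {
            return $period['label'];
        }
    }

    return $default;
}
```

The first matching period wins, so overlapping ranges resolve in configuration order in this sketch.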
|
||||
|
||||
@ -1,414 +0,0 @@
|
||||
<?php
|
||||
|
||||
namespace UbsCsvTransformer\Tests;
|
||||
|
||||
use PHPUnit\Framework\TestCase;
|
||||
use UbsCsvTransformer\FireflyImporter;
|
||||
use UbsCsvTransformer\DebugLogger;
|
||||
|
||||
/**
|
||||
* Tests for the chunked-import state file / resume feature.
|
||||
*
|
||||
* Strategy: subclass FireflyImporter and override import() so no real HTTP or
|
||||
* CLI call is made. The override is configured per test via a callable queue.
|
||||
*/
|
||||
class FireflyImporterChunkStateTest extends TestCase
|
||||
{
|
||||
/** @var string Temporary directory for CSV and state files */
|
||||
private string $tmpDir;
|
||||
|
||||
/** @var string Path to a throwaway JSON config file */
|
||||
private string $jsonConfig;
|
||||
|
||||
protected function setUp(): void
|
||||
{
|
||||
DebugLogger::reset();
|
||||
|
||||
$this->tmpDir = sys_get_temp_dir() . '/ffi_state_test_' . uniqid('', true);
|
||||
mkdir($this->tmpDir, 0700, true);
|
||||
|
||||
// Minimal Firefly importer config (format v3)
|
||||
$configData = [
|
||||
'version' => 3,
|
||||
'flow' => 'csv',
|
||||
'roles' => ['amount'],
|
||||
'default_account' => 1,
|
||||
];
|
||||
$this->jsonConfig = $this->tmpDir . '/ff-config.json';
|
||||
file_put_contents($this->jsonConfig, json_encode($configData));
|
||||
}
|
||||
|
||||
protected function tearDown(): void
|
||||
{
|
||||
// Remove all temp files
|
||||
foreach (glob($this->tmpDir . '/*') ?: [] as $f) {
|
||||
@unlink($f);
|
||||
}
|
||||
@rmdir($this->tmpDir);
|
||||
}
|
||||
|
||||
// ─── Helpers ─────────────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Creates an importer stub whose import() calls return results from $queue
|
||||
* in order. Each element of the queue is either true (success) or false (failure).
|
||||
*
|
||||
* @param array<bool> $importResultQueue
|
||||
* @param int $chunkSize
|
||||
*/
|
||||
private function makeImporter(array $importResultQueue, int $chunkSize): FireflyImporter
|
||||
{
|
||||
$config = [
|
||||
'mode' => 'http',
|
||||
'importerUrl' => 'https://example.com',
|
||||
'accessToken' => 'test-secret-1234567',
|
||||
'personalSecret' => 'test-pat',
|
||||
'jsonConfig' => $this->jsonConfig,
|
||||
'chunkSize' => $chunkSize,
|
||||
];
|
||||
|
||||
$queue = $importResultQueue;
|
||||
|
||||
return new class ($config, $queue) extends FireflyImporter {
|
||||
/** @var array<bool> */
|
||||
private array $queue;
|
||||
|
||||
/** @param array<bool> $queue */
|
||||
public function __construct(array $config, array $queue)
|
||||
{
|
||||
parent::__construct($config);
|
||||
$this->queue = $queue;
|
||||
}
|
||||
|
||||
public function import(string $csvFile): array
|
||||
{
|
||||
$success = array_shift($this->queue) ?? true;
|
||||
if ($success) {
|
||||
return [
|
||||
'success' => true,
|
||||
'exit_code' => 200,
|
||||
'output' => ['stdout' => '', 'stderr' => ''],
|
||||
'duration' => 1.0,
|
||||
'csv_file' => $csvFile,
|
||||
'summary' => [
|
||||
'completed' => true,
|
||||
'created' => 1,
|
||||
'by_type' => ['withdrawal' => 1],
|
||||
'duplicates' => 0,
|
||||
'errors' => [],
|
||||
],
|
||||
];
|
||||
}
|
||||
return [
|
||||
'success' => false,
|
||||
'error' => 'Simulated failure',
|
||||
'output' => ['stdout' => '', 'stderr' => ''],
|
||||
'exit_code' => 500,
|
||||
];
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
    /**
     * Writes a CSV with $dataRows data rows (each row has two columns).
     */
    private function writeCsv(string $path, int $dataRows): void
    {
        $fp = fopen($path, 'w');
        assert($fp !== false);
        fputcsv($fp, ['col_a', 'col_b'], ',', '"', '\\');
        for ($i = 1; $i <= $dataRows; $i++) {
            fputcsv($fp, ["val_a_{$i}", "val_b_{$i}"], ',', '"', '\\');
        }
        fclose($fp);
    }

    private function stateFile(string $csvPath): string
    {
        return $csvPath . '.ffi-state.json';
    }
|
||||
|
||||
// ─── Tests ───────────────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* When chunkSize is 0, import() is used directly — no state file should appear.
|
||||
*/
|
||||
public function testNoStateFileWhenChunkingNotUsed(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 5);
|
||||
|
||||
$importer = $this->makeImporter([true], 0);
|
||||
$result = $importer->importChunked($csv);
|
||||
|
||||
$this->assertTrue($result['success']);
|
||||
$this->assertFileDoesNotExist($this->stateFile($csv));
|
||||
}
|
||||
|
||||
/**
|
||||
* When the file has fewer rows than chunkSize, no chunking occurs — no state file.
|
||||
*/
|
||||
public function testNoStateFileWhenRowsBelowChunkSize(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 3);
|
||||
|
||||
$importer = $this->makeImporter([true], 10);
|
||||
$importer->importChunked($csv);
|
||||
|
||||
$this->assertFileDoesNotExist($this->stateFile($csv));
|
||||
}
|
||||
|
||||
/**
|
||||
* After chunk 1 of 3 fails, the state file must exist and record 0 completed chunks.
|
||||
*/
|
||||
public function testStateFileCreatedOnFirstChunkFailure(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 9); // 3 chunks of 3
|
||||
|
||||
// Chunk 1 fails immediately
|
||||
$importer = $this->makeImporter([false], 3);
|
||||
$result = $importer->importChunked($csv);
|
||||
|
||||
$this->assertFalse($result['success']);
|
||||
$this->assertFileExists($this->stateFile($csv));
|
||||
|
||||
/** @var array<string, mixed> $state */
|
||||
$state = json_decode((string) file_get_contents($this->stateFile($csv)), true);
|
||||
$this->assertSame([], $state['completed_chunks']);
|
||||
}
|
||||
|
||||
    /**
     * After the first two chunks (indices 0 and 1) succeed but the third fails,
     * the state file records [0, 1].
     */
    public function testStateFileRecordsCompletedChunksOnPartialFailure(): void
    {
        $csv = $this->tmpDir . '/test.csv';
        $this->writeCsv($csv, 9); // 3 chunks of 3

        // Chunks 0, 1 succeed; chunk 2 fails
        $importer = $this->makeImporter([true, true, false], 3);
        $result = $importer->importChunked($csv);

        $this->assertFalse($result['success']);
        $this->assertFileExists($this->stateFile($csv));

        /** @var array<string, mixed> $state */
        $state = json_decode((string) file_get_contents($this->stateFile($csv)), true);
        $this->assertSame([0, 1], $state['completed_chunks']);
        $this->assertArrayHasKey('0', $state['chunk_results']);
        $this->assertArrayHasKey('1', $state['chunk_results']);
    }
|
||||
|
||||
/**
|
||||
* After full success the state file is deleted automatically.
|
||||
*/
|
||||
public function testStateFileDeletedAfterFullSuccess(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 6); // 2 chunks of 3
|
||||
|
||||
$importer = $this->makeImporter([true, true], 3);
|
||||
$result = $importer->importChunked($csv);
|
||||
|
||||
$this->assertTrue($result['success']);
|
||||
$this->assertFileDoesNotExist($this->stateFile($csv));
|
||||
}
|
||||
|
||||
    /**
     * On a second run with an existing state showing [0, 1] done, only the third
     * chunk (index 2) should call import(), i.e. exactly one call is made.
     */
|
||||
public function testResumeSkipsAlreadyCompletedChunks(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 9); // 3 chunks of 3
|
||||
|
||||
// ── First run: chunks 0+1 succeed, chunk 2 fails ────────────────────
|
||||
$run1 = $this->makeImporter([true, true, false], 3);
|
||||
$run1->importChunked($csv);
|
||||
|
||||
$this->assertFileExists($this->stateFile($csv));
|
||||
|
||||
// ── Second run: only chunk 2 should be attempted ────────────────────
|
||||
// We record how many times import() is actually called via a counting wrapper
|
||||
$counter = new \stdClass();
|
||||
$counter->value = 0;
|
||||
|
||||
$config = [
|
||||
'mode' => 'http',
|
||||
'importerUrl' => 'https://example.com',
|
||||
'accessToken' => 'test-secret-1234567',
|
||||
'personalSecret' => 'test-pat',
|
||||
'jsonConfig' => $this->jsonConfig,
|
||||
'chunkSize' => 3,
|
||||
];
|
||||
|
||||
$run2 = new class ($config, $counter) extends FireflyImporter {
|
||||
private \stdClass $counter;
|
||||
|
||||
public function __construct(array $config, \stdClass $counter)
|
||||
{
|
||||
parent::__construct($config);
|
||||
$this->counter = $counter;
|
||||
}
|
||||
|
||||
public function import(string $csvFile): array
|
||||
{
|
||||
$this->counter->value++;
|
||||
return [
|
||||
'success' => true,
|
||||
'exit_code' => 200,
|
||||
'output' => ['stdout' => '', 'stderr' => ''],
|
||||
'duration' => 1.0,
|
||||
'csv_file' => $csvFile,
|
||||
'summary' => [
|
||||
'completed' => true,
|
||||
'created' => 1,
|
||||
'by_type' => ['withdrawal' => 1],
|
||||
'duplicates' => 0,
|
||||
'errors' => [],
|
||||
],
|
||||
];
|
||||
}
|
||||
};
|
||||
|
||||
$result2 = $run2->importChunked($csv);
|
||||
|
||||
$this->assertTrue($result2['success']);
|
||||
$this->assertSame(1, $counter->value, 'Only the 1 remaining chunk (index 2) should be imported');
|
||||
$this->assertFileDoesNotExist($this->stateFile($csv), 'State file must be deleted after full success');
|
||||
}
|
||||
|
||||
/**
|
||||
* A state file whose total_rows does not match the current CSV is silently
|
||||
* discarded and a fresh import is started.
|
||||
*/
|
||||
public function testStaleMismatchedStateIsIgnored(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 9); // 3 chunks of 3
|
||||
|
||||
// Plant a stale state file with a wrong total_rows
|
||||
$staleState = [
|
||||
'csv_file' => realpath($csv) ?: $csv,
|
||||
'total_rows' => 99, // wrong
|
||||
'chunk_size' => 3,
|
||||
'total_chunks' => 3,
|
||||
'completed_chunks' => [0, 1],
|
||||
'chunk_results' => [],
|
||||
'created_at' => '2020-01-01T00:00:00+00:00',
|
||||
'updated_at' => '2020-01-01T00:00:00+00:00',
|
||||
];
|
||||
file_put_contents($this->stateFile($csv), json_encode($staleState));
|
||||
|
||||
// All 3 chunks should be called (fresh start despite stale state)
|
||||
$counter = new \stdClass();
|
||||
$counter->value = 0;
|
||||
|
||||
$config = [
|
||||
'mode' => 'http',
|
||||
'importerUrl' => 'https://example.com',
|
||||
'accessToken' => 'test-secret-1234567',
|
||||
'personalSecret' => 'test-pat',
|
||||
'jsonConfig' => $this->jsonConfig,
|
||||
'chunkSize' => 3,
|
||||
];
|
||||
|
||||
$importer = new class ($config, $counter) extends FireflyImporter {
|
||||
private \stdClass $counter;
|
||||
|
||||
public function __construct(array $config, \stdClass $counter)
|
||||
{
|
||||
parent::__construct($config);
|
||||
$this->counter = $counter;
|
||||
}
|
||||
|
||||
public function import(string $csvFile): array
|
||||
{
|
||||
$this->counter->value++;
|
||||
return [
|
||||
'success' => true,
|
||||
'exit_code' => 200,
|
||||
'output' => ['stdout' => '', 'stderr' => ''],
|
||||
'duration' => 1.0,
|
||||
'csv_file' => $csvFile,
|
||||
'summary' => [
|
||||
'completed' => true,
|
||||
'created' => 1,
|
||||
'by_type' => ['withdrawal' => 1],
|
||||
'duplicates' => 0,
|
||||
'errors' => [],
|
||||
],
|
||||
];
|
||||
}
|
||||
};
|
||||
|
||||
$result = $importer->importChunked($csv);
|
||||
|
||||
$this->assertTrue($result['success']);
|
||||
$this->assertSame(3, $counter->value, 'All 3 chunks must be imported when stale state is discarded');
|
||||
}
|
||||
|
||||
/**
|
||||
* A corrupt (non-JSON) state file is silently discarded; no exception is thrown.
|
||||
*/
|
||||
public function testCorruptStateFileIsIgnored(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 6); // 2 chunks of 3
|
||||
|
||||
file_put_contents($this->stateFile($csv), '{this is not valid json!!!}');
|
||||
|
||||
$importer = $this->makeImporter([true, true], 3);
|
||||
$result = $importer->importChunked($csv);
|
||||
|
||||
$this->assertTrue($result['success']);
|
||||
}
|
||||
|
||||
/**
|
||||
* resetImportState() deletes an existing state file.
|
||||
*/
|
||||
public function testResetImportStateClearsStateFile(): void
|
||||
{
|
||||
$csv = $this->tmpDir . '/test.csv';
|
||||
$this->writeCsv($csv, 9);
|
||||
|
||||
// Plant a state file
|
||||
file_put_contents($this->stateFile($csv), '{}');
|
||||
$this->assertFileExists($this->stateFile($csv));
|
||||
|
||||
$importer = $this->makeImporter([], 3);
|
||||
$importer->resetImportState($csv);
|
||||
|
||||
$this->assertFileDoesNotExist($this->stateFile($csv));
|
||||
}
|
||||
|
||||
    /**
     * hasResumeState() returns false when no state file is present.
     */
    public function testHasResumeStateReturnsFalseWithoutStateFile(): void
    {
        $csv = $this->tmpDir . '/test.csv';
        $this->writeCsv($csv, 9);

        $importer = $this->makeImporter([], 3);
        $this->assertFalse($importer->hasResumeState($csv));
    }

    /**
     * hasResumeState() returns true after a partial failure creates a valid state file.
     */
    public function testHasResumeStateReturnsTrueAfterPartialFailure(): void
    {
        $csv = $this->tmpDir . '/test.csv';
        $this->writeCsv($csv, 9); // 3 chunks of 3

        $importer = $this->makeImporter([true, false], 3); // chunk 2 (index 1) fails
        $importer->importChunked($csv);

        $importer2 = $this->makeImporter([], 3);
        $this->assertTrue($importer2->hasResumeState($csv));
    }
}
@@ -1,255 +0,0 @@
<?php

namespace UbsCsvTransformer\Tests;

use PHPUnit\Framework\TestCase;
use UbsCsvTransformer\RowFilter;

class RowFilterTest extends TestCase
{
    // -------------------------------------------------------------------------
    // Leaf-condition operators
    // -------------------------------------------------------------------------

    public function testEmptyOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'empty'],
            ['A' => '']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'empty'],
            ['A' => 'something']
        ));
    }

    public function testNotEmptyOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-empty'],
            ['A' => 'value']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-empty'],
            ['A' => '']
        ));
    }

    public function testEqualsOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'equals', 'value' => 'hello'],
            ['A' => 'hello']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'equals', 'value' => 'hello'],
            ['A' => 'world']
        ));
    }

    public function testNotEqualsOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-equals', 'value' => 'hello'],
            ['A' => 'world']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-equals', 'value' => 'hello'],
            ['A' => 'hello']
        ));
    }

    public function testContainsOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'contains', 'value' => 'foo'],
            ['A' => 'foobar']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'contains', 'value' => 'baz'],
            ['A' => 'foobar']
        ));
    }

    public function testNotContainsOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-contains', 'value' => 'baz'],
            ['A' => 'foobar']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-contains', 'value' => 'foo'],
            ['A' => 'foobar']
        ));
    }

    public function testMatchesOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'matches', 'pattern' => '^\d{4}$'],
            ['A' => '1234']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'matches', 'pattern' => '^\d{4}$'],
            ['A' => 'abcd']
        ));
    }

    public function testNotMatchesOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-matches', 'pattern' => '^\d{4}$'],
            ['A' => 'abcd']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'not-matches', 'pattern' => '^\d{4}$'],
            ['A' => '1234']
        ));
    }

    public function testGtOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'gt', 'value' => '100'],
            ['Amount' => '150.50']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'gt', 'value' => '100'],
            ['Amount' => '50']
        ));
    }

    public function testGteOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'gte', 'value' => '100'],
            ['Amount' => '100']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'gte', 'value' => '100'],
            ['Amount' => '99.99']
        ));
    }

    public function testLtOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'lt', 'value' => '100'],
            ['Amount' => '50']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'lt', 'value' => '100'],
            ['Amount' => '200']
        ));
    }

    public function testLteOperator(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'lte', 'value' => '100'],
            ['Amount' => '100']
        ));
        $this->assertFalse(RowFilter::evaluate(
            ['column' => 'Amount', 'operator' => 'lte', 'value' => '100'],
            ['Amount' => '100.01']
        ));
    }

    // -------------------------------------------------------------------------
    // Groups
    // -------------------------------------------------------------------------

    public function testAndGroupBothTrue(): void
    {
        $this->assertTrue(RowFilter::evaluate([
            'and' => [
                ['column' => 'A', 'operator' => 'empty'],
                ['column' => 'B', 'operator' => 'empty'],
            ],
        ], ['A' => '', 'B' => '']));
    }

    public function testAndGroupOneFalse(): void
    {
        $this->assertFalse(RowFilter::evaluate([
            'and' => [
                ['column' => 'A', 'operator' => 'empty'],
                ['column' => 'B', 'operator' => 'empty'],
            ],
        ], ['A' => '', 'B' => 'not-empty']));
    }

    public function testOrGroupOneTrue(): void
    {
        $this->assertTrue(RowFilter::evaluate([
            'or' => [
                ['column' => 'A', 'operator' => 'equals', 'value' => 'yes'],
                ['column' => 'B', 'operator' => 'equals', 'value' => 'yes'],
            ],
        ], ['A' => 'no', 'B' => 'yes']));
    }

    public function testOrGroupBothFalse(): void
    {
        $this->assertFalse(RowFilter::evaluate([
            'or' => [
                ['column' => 'A', 'operator' => 'equals', 'value' => 'yes'],
                ['column' => 'B', 'operator' => 'equals', 'value' => 'yes'],
            ],
        ], ['A' => 'no', 'B' => 'no']));
    }

    // -------------------------------------------------------------------------
    // Nested groups
    // -------------------------------------------------------------------------

    public function testNestedAndOrGroup(): void
    {
        // (A is empty) AND (B equals "foo" OR C not-empty)
        $node = [
            'and' => [
                ['column' => 'A', 'operator' => 'empty'],
                [
                    'or' => [
                        ['column' => 'B', 'operator' => 'equals', 'value' => 'foo'],
                        ['column' => 'C', 'operator' => 'not-empty'],
                    ],
                ],
            ],
        ];

        // A empty, B matches → true
        $this->assertTrue(RowFilter::evaluate($node, ['A' => '', 'B' => 'foo', 'C' => '']));
        // A empty, C not-empty → true
        $this->assertTrue(RowFilter::evaluate($node, ['A' => '', 'B' => 'bar', 'C' => 'value']));
        // A empty, but neither B nor C match → false
        $this->assertFalse(RowFilter::evaluate($node, ['A' => '', 'B' => 'bar', 'C' => '']));
        // A not empty → false
        $this->assertFalse(RowFilter::evaluate($node, ['A' => 'x', 'B' => 'foo', 'C' => '']));
    }

    // -------------------------------------------------------------------------
    // Unknown operator
    // -------------------------------------------------------------------------

    public function testUnknownOperatorThrows(): void
    {
        $this->expectException(\InvalidArgumentException::class);
        RowFilter::evaluate(
            ['column' => 'A', 'operator' => 'nonexistent'],
            ['A' => 'value']
        );
    }

    // -------------------------------------------------------------------------
    // Missing column (treats as empty string)
    // -------------------------------------------------------------------------

    public function testMissingColumnTreatedAsEmpty(): void
    {
        $this->assertTrue(RowFilter::evaluate(
            ['column' => 'NonExistent', 'operator' => 'empty'],
            ['A' => 'something']
        ));
    }
}
31
tests/fixtures/config-ubs-account/expected.csv
vendored
@@ -1,14 +1,17 @@
Belastung,Gutschrift,date,process_date,tags,opposing_iban,opposing_account,opposing_name,notes,description,account_iban,account_currency
-600.00,,2022-12-30,2022-12-30,Dauerauftrag,"CH37 0026 7267 9314 35M2 P",,"David Peter Reindl","8906 Bonstetten
9967864LK2659211","David Peter Reindl;8906 Bonstetten; STEUERRUECKSTELLUNG; Dauerauftrag","CH18 0026 7267 9314 3540 D",CHF
-46.35,,2022-12-30,2022-12-31,,,,"UBS AG",9900365AP6356307,"Saldo Zinsabschluss; Periode: 2022-10-01 - 2022-12-30","CH18 0026 7267 9314 3540 D",CHF
-39.90,,2022-12-30,2022-12-30,TWINT,,,"Swisscom Grossunternehme","Muellerstrasse 16 8004 Zuerich TWINT-Acc.:+41796305690
9967364GK5707142","SWISSCOM GROSSUNTERNEHME; Zahlung UBS TWINT; Muellerstrasse 16 na, 8004 Zuerich TWINT-Acc.:+41796305690","CH18 0026 7267 9314 3540 D",CHF
-8.75,,2022-12-28,2022-12-27,"Abend Debitkarte",,,"Coop Pronto Chur","18279748-0 08/24
7007 Chur
9930862BN7826808","Coop Pronto Chur;7007 Chur; Zahlung Debitkarte","CH18 0026 7267 9314 3540 D",CHF
-1800.00,,2022-12-27,2022-12-27,e-banking,"CH63 0023 2232 5560 5988 0",,"Janine Geigele","8049 Zuerich
9967361TI3188436","Janine Geigele;Am Wasser 36; 8049 Zuerich; CH; SKIFERIEN DOLOMITEN; e-banking-Vergütungsauftrag; Wohnung Dolomiten, 2 Personen","CH18 0026 7267 9314 3540 D",CHF
,9.00,2022-12-22,2022-12-22,TWINT,,,"Friis, Daniela Silvia",9930356GK0440989,"Friis, Daniela Silvia; Gutschrift UBS TWINT; +41796741245; TWINT-Acc.:+41796305690","CH18 0026 7267 9314 3540 D",CHF
,19764.80,2022-11-25,2022-11-25,Gutschrift,,,SBB,9901820E67741531,"SBB;Corporate Treasury; Gutschrift; Lohn/Gehalt 00229537/202211","CH18 0026 7267 9314 3540 D",CHF
-14.00,,2022-08-22,2022-08-21,TWINT,,,"Friis-Loop, Daniela",9967233GK1553933,"FRIIS-LOOP, DANIELA; Belastung UBS TWINT; +41796741245; TWINT-Acc.:+41796305690","CH18 0026 7267 9314 3540 D",CHF
Belastung,Gutschrift,date,process_date,opposing_name,tags,description,opposing_account,notes,account_iban,account_currency
-600.00,,2022-12-30,2022-12-30,"David Peter Reindl",Dauerauftrag,"Steuerrueckstellung
David Peter Reindl;8906 Bonstetten","CH37 0026 7267 9314 35M2 P","9967864LK2659211
8906 Bonstetten","CH18 0026 7267 9314 3540 D",CHF
-46.35,,2022-12-30,2022-12-31,"UBS AG",,"Periode: 2022-10-01 - 2022-12-30
Zinsabschluss",,9900365AP6356307,"CH18 0026 7267 9314 3540 D",CHF
-39.90,,2022-12-30,2022-12-30,"Swisscom Grossunternehme",TWINT,"Swisscom Grossunternehme; Zahlung UBS TWINT",,"9967364GK5707142
8004 Zuerich","CH18 0026 7267 9314 3540 D",CHF
-8.75,,2022-12-28,2022-12-27,"Coop Pronto Chur",Debitkarte,"18279748-0 08/24
Coop Pronto Chur;7007 Chur",,"9930862BN7826808
7007 Chur","CH18 0026 7267 9314 3540 D",CHF
-1800.00,,2022-12-27,2022-12-27,"Janine Geigele",e-banking,"Skiferien Dolomiten
Janine Geigele;Am Wasser 36; 8049 Zuerich; CH","CH63 0023 2232 5560 5988 0","9967361TI3188436
8049 Zuerich","CH18 0026 7267 9314 3540 D",CHF
,9.00,2022-12-22,2022-12-22,"Friis, Daniela Silvia",TWINT,"Friis, Daniela Silvia",,9930356GK0440989,"CH18 0026 7267 9314 3540 D",CHF
,19764.80,2022-11-25,2022-11-25,SBB,Gutschrift,"SBB;Corporate Treasury",,9901820E67741531,"CH18 0026 7267 9314 3540 D",CHF
-14.00,,2022-08-22,2022-08-21,"Friis-Loop, Daniela",TWINT,"Friis-Loop, Daniela; Belastung UBS TWINT",,9967233GK1553933,"CH18 0026 7267 9314 3540 D",CHF