Simple pipelines are… simple. But enterprise reality is complex: monorepos with dozens of services, multi-team collaboration, hundreds of parallel jobs, cross-project dependencies, and compliance requirements. This chapter shows patterns and strategies for production-scale pipelines.

We combine features from previous chapters (includes, rules, needs, parallel, etc.) into real-world solutions for:

- Monorepo CI/CD (test only the affected services)
- Template hierarchies (Organization → Team → Project)
- Performance at scale (100+ jobs, 10+ minutes → 3 minutes)
- Cross-project dependencies (service A needs service B)
- Dynamic pipeline generation (generating YAML on the fly)
- Multi-environment strategies (Dev → Staging → Production with gates)
Problem: a monorepo with 20 services. Every push triggers 200 jobs (10 per service) and takes 30 minutes. 95% of the jobs are irrelevant because their service was not changed.

Solution: conditional execution based on changed paths.
Monorepo structure:

```
repo/
├── services/
│   ├── api/
│   ├── frontend/
│   ├── auth/
│   ├── payment/
│   └── ... (20 services)
├── shared/
│   └── common-lib/
└── .gitlab-ci.yml
```
.gitlab-ci.yml:

```yaml
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

stages:
  - build
  - test
  - deploy

# Template for service jobs
.service-job:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - services/${SERVICE_NAME}/**/*
        - shared/**/*  # Shared lib affects all services
    - if: $CI_COMMIT_BRANCH == "main"

# API service
build:api:
  extends: .service-job
  variables:
    SERVICE_NAME: api
  script:
    - cd services/api
    - npm run build
  artifacts:
    paths:
      - services/api/dist/

test:api:
  extends: .service-job
  needs: [build:api]
  variables:
    SERVICE_NAME: api
  script:
    - cd services/api
    - npm test

# Frontend service
build:frontend:
  extends: .service-job
  variables:
    SERVICE_NAME: frontend
  script:
    - cd services/frontend
    - npm run build
  artifacts:
    paths:
      - services/frontend/dist/

test:frontend:
  extends: .service-job
  needs: [build:frontend]
  variables:
    SERVICE_NAME: frontend
  script:
    - cd services/frontend
    - npm test

# ... repeat for all 20 services
```

The problem with this approach: a lot of repetition (20 services × 3 jobs = 60 job definitions).
Solution: generate the pipeline configuration dynamically.
.gitlab-ci.yml:

```yaml
workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"

stages:
  - prepare
  - trigger

# Generate dynamic config
generate-pipeline:
  stage: prepare
  image: python:3.11-slim
  script:
    - python scripts/generate_pipeline.py > generated-pipeline.yml
  artifacts:
    paths:
      - generated-pipeline.yml

# Run the generated config as a child pipeline
trigger-pipeline:
  stage: trigger
  trigger:
    include:
      - artifact: generated-pipeline.yml
        job: generate-pipeline
    strategy: depend
```

Note that `include:artifact:` is only valid inside `trigger:` — a generated config cannot be included at the top level; it always runs as a child pipeline.

scripts/generate_pipeline.py:
```python
#!/usr/bin/env python3
import yaml
from pathlib import Path

services_dir = Path("services")
# sorted() keeps the generated YAML deterministic across runs
services = sorted(d.name for d in services_dir.iterdir() if d.is_dir())

pipeline = {
    "stages": ["build", "test", "deploy"],
    ".service-template": {
        "rules": [
            {
                "if": "$CI_PIPELINE_SOURCE == 'merge_request_event'",
                "changes": ["services/${SERVICE_NAME}/**/*", "shared/**/*"]
            },
            {"if": "$CI_COMMIT_BRANCH == 'main'"}
        ]
    }
}

for service in services:
    # Build job
    pipeline[f"build:{service}"] = {
        "extends": ".service-template",
        "stage": "build",
        "variables": {"SERVICE_NAME": service},
        "script": [
            f"cd services/{service}",
            "npm ci",
            "npm run build"
        ],
        "artifacts": {
            "paths": [f"services/{service}/dist/"]
        }
    }

    # Test job
    pipeline[f"test:{service}"] = {
        "extends": ".service-template",
        "stage": "test",
        "needs": [f"build:{service}"],
        "variables": {"SERVICE_NAME": service},
        "script": [
            f"cd services/{service}",
            "npm test"
        ]
    }

print(yaml.dump(pipeline, default_flow_style=False))
```

Effect:

- A developer changes only services/api/ → only build:api + test:api run
- 200 jobs → 2 jobs
- 30 minutes → 2 minutes
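Change detection can also be done explicitly inside the generator instead of via rules:changes, for example by diffing against the target branch. A minimal sketch of that idea; the helper names `changed_services` and `services_from_paths` are assumptions, not part of the configuration above:

```python
import subprocess

def changed_services(base_ref: str = "origin/main") -> set[str]:
    """Return the set of service names touched since base_ref.

    Mirrors the rules:changes entries above: a change under shared/
    affects every service.
    """
    diff = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return services_from_paths(diff)

def services_from_paths(paths: list[str]) -> set[str]:
    # Pure helper so the mapping logic is testable without git.
    services = set()
    for path in paths:
        parts = path.split("/")
        if parts[0] == "services" and len(parts) > 1:
            services.add(parts[1])
        elif parts[0] == "shared":
            return {"ALL"}  # sentinel: regenerate jobs for everything
    return services
```

The generator would then emit jobs only for the returned services, which avoids rules:changes entirely.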
Problem: frontend needs api (API specs for code generation).

Solution: dependency tracking with needs.
services/dependencies.yml:

```yaml
frontend:
  depends_on:
    - api

payment:
  depends_on:
    - auth
    - api
```

generate_pipeline.py (extended):
```python
import yaml

# Load dependencies
with open("services/dependencies.yml") as f:
    dependencies = yaml.safe_load(f)

# Generate test jobs with cross-service needs
# (services and pipeline come from the base script above)
for service in services:
    deps = dependencies.get(service, {}).get("depends_on", [])
    needs_list = [f"build:{dep}" for dep in deps]
    pipeline[f"test:{service}"] = {
        "stage": "test",
        "needs": [f"build:{service}"] + needs_list,
        "script": [f"cd services/{service}", "npm test"]
    }
```

Effect: test:frontend waits for build:api (the API specs are available).
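One thing the generator should guard against is a cycle in dependencies.yml, which GitLab would reject as a needs loop. A minimal sketch, assuming a helper named `detect_cycle` that is not part of the chapter's scripts:

```python
def detect_cycle(dependencies: dict) -> bool:
    """Return True if the depends_on graph in dependencies.yml
    contains a cycle (which would make the needs DAG invalid)."""
    graph = {s: cfg.get("depends_on", []) for s, cfg in dependencies.items()}
    visiting, done = set(), set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # back edge → cycle
        visiting.add(node)
        for dep in graph.get(node, []):
            if visit(dep):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(s) for s in graph)
```

Running this before yaml.dump() lets the prepare stage fail fast with a clear error instead of producing a child pipeline GitLab cannot schedule.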
Problem: 100 projects in the organization, each with its own .gitlab-ci.yml. A security update (a new SAST rule) has to be changed in 100 files.

Solution: a 3-tier template hierarchy.
Repository: org/ci-templates

/templates/base.yml:

```yaml
# Organization-wide defaults
default:
  image: registry.company.com/ci-images/base:latest
  retry:
    max: 2
    when:
      - runner_system_failure

workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

# Security scanning (mandatory)
.security-scan:
  stage: security
  image: registry.company.com/security-scanner:latest
  script:
    - security-scan .
  artifacts:
    reports:
      sast: sast-report.json
  allow_failure: false  # Mandatory pass

# Code quality (advisory)
.code-quality:
  stage: quality
  image: registry.company.com/code-quality:latest
  script:
    - code-quality-scan .
  artifacts:
    reports:
      codequality: codequality-report.json
  allow_failure: true
```

Repository: org/backend-team-templates
/templates/backend.yml:

```yaml
# Include org base
include:
  - project: org/ci-templates
    file: /templates/base.yml

# Backend-specific defaults
.backend-base:
  image: registry.company.com/ci-images/python:3.11
  before_script:
    - pip install -r requirements.txt
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - .venv/

# Backend test template
.backend-test:
  extends: .backend-base
  stage: test
  script:
    - pytest tests/
  coverage: '/TOTAL.*\s+(\d+)%/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml
```

Project: org/api-service
.gitlab-ci.yml:

```yaml
# Include team templates (which in turn include the org base)
include:
  - project: org/backend-team-templates
    file: /templates/backend.yml

stages:
  - security
  - quality
  - test
  - deploy

# Use org security template
security:
  extends: .security-scan

# Use org quality template
quality:
  extends: .code-quality

# Use team test template
test:
  extends: .backend-test

# Project-specific deployment
deploy:staging:
  stage: deploy
  script:
    - kubectl apply -f k8s/staging/
  environment:
    name: staging
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```

Effect:

- Security update: change only org/ci-templates/base.yml → all 100 projects pick up the update automatically
- Zero-touch for projects (automatic rollout)
Problem: a template change breaks 10 projects. How do you keep that under control?

Solution: versioned templates with opt-in upgrades.
org/ci-templates/templates/base.yml:

```yaml
# Version 2.0 (breaking changes)
```

org/ci-templates/templates/base-v1.yml:

```yaml
# Version 1.0 (stable, deprecated)
```

Project .gitlab-ci.yml:

```yaml
include:
  - project: org/ci-templates
    ref: v1.0.0  # Pin to tag
    file: /templates/base.yml
```

Upgrade process:

1. The platform team releases v2.0.0 with breaking changes
2. Communication: "v2.0 available, v1.0 deprecated"
3. Projects upgrade individually: ref: v1.0.0 → ref: v2.0.0
4. After 3 months: end of v1.0 support
Problem: a pipeline with 150 jobs (10 services × 15 jobs) takes 45 minutes.

Analysis:

```
Stage: build  (5 min)  ← OK
Stage: test   (35 min) ← BOTTLENECK (100 test jobs)
Stage: deploy (5 min)  ← OK
```
Before (sequential):

```yaml
test:
  script:
    - npm test  # Runs all 5000 tests sequentially
```

After (parallel):

```yaml
test:
  parallel: 10  # Split into 10 jobs
  script:
    - npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```

Effect: 35 minutes → 4 minutes.
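Under the hood, sharding is just a deterministic assignment of test files to node indices. A minimal sketch of the idea (this mirrors what a test runner like Jest does with --shard; the helper name `assign_shard` is an assumption):

```python
def assign_shard(test_files: list[str], index: int, total: int) -> list[str]:
    """Return the test files shard `index` should run, where index is
    1-based like $CI_NODE_INDEX and total is $CI_NODE_TOTAL.

    Sorting first makes the split deterministic across all parallel
    jobs, so every file runs exactly once across the shard set.
    """
    ordered = sorted(test_files)
    return [f for i, f in enumerate(ordered) if i % total == index - 1]
```

Because each job computes the same ordering independently, no coordination between the 10 parallel jobs is needed.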
Before (stages):

```yaml
stages:
  - build
  - test
  - integration
  - deploy

build:backend:
  stage: build
  # ...

test:backend:
  stage: test  # Waits for ALL build jobs
  # ...

deploy:backend:
  stage: deploy  # Waits for ALL test jobs
  # ...
```

Total: build (5 min) → test (10 min) → integration (8 min) → deploy (3 min) = 26 min

After (DAG with needs):

```yaml
build:backend:
  stage: build
  # ...

test:backend:
  needs: [build:backend]  # Starts as soon as build:backend finishes
  # ...

deploy:backend:
  needs: [test:backend]  # Starts as soon as test:backend finishes
  # ...
```

Total: build (5 min) + test (10 min, parallel) + deploy (3 min) = 18 min

Speedup: 26 min → 18 min (31% faster).
Before:

```yaml
test:
  script:
    - npm ci  # Downloads 500 MB every run
    - npm test
```

After:

```yaml
test:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
  script:
    - npm ci --cache .npm --prefer-offline  # Uses cache
    - npm test
```

Effect:

- First run: 3 minutes (download)
- Subsequent runs: 30 seconds (cache hit)
Problem: a developer pushes 5 commits in quick succession. 5 pipelines run in parallel and block all runners.

Solution:

```yaml
test:
  interruptible: true  # Cancel when a new push arrives
  script:
    - npm test
```

Effect: stale pipelines are cancelled and only the newest one runs, saving runner resources.
Before:

```yaml
test:
  parallel: 20
  script:
    - npm test
```

Problem: shard #1 fails after 1 minute, but shards #2-20 keep running for another 10 minutes.

After:

```yaml
workflow:
  auto_cancel:
    on_new_commit: interruptible
    on_job_failure: all  # Cancel all jobs as soon as one fails

test:
  parallel: 20
  script:
    - npm test
```

Effect: shard #1 fails → the pipeline is cancelled immediately → 10 minutes saved.
Problem: service A needs library B (a separate repository).

Library B (.gitlab-ci.yml):

```yaml
build:
  script:
    - make build
  artifacts:
    paths:
      - dist/lib.so

# Publish to Package Registry
publish:
  script:
    - mvn deploy
  rules:
    - if: $CI_COMMIT_TAG
```

Service A (.gitlab-ci.yml):
```yaml
# Trigger Library B build (nightly only)
trigger:lib-b:
  stage: prepare
  trigger:
    project: org/library-b
    branch: main
    strategy: depend  # Wait for completion
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"  # Nightly

# Use Library B
build:
  needs:
    - job: trigger:lib-b
      optional: true  # The trigger job only exists in scheduled pipelines
  script:
    - mvn install  # Pulls latest lib-b from the registry
    - make build
```

Note the `optional: true`: without it, non-scheduled pipelines would fail because `needs` would reference a job that does not exist.

Parent Pipeline (.gitlab-ci.yml):
```yaml
stages:
  - trigger
  - deploy

trigger:service-a:
  stage: trigger
  trigger:
    project: org/service-a
    strategy: depend

trigger:service-b:
  stage: trigger
  trigger:
    project: org/service-b
    strategy: depend

deploy:all:
  stage: deploy
  needs:
    - trigger:service-a
    - trigger:service-b
  script:
    - kubectl apply -f k8s/
```

Use case: an orchestration repo that deploys multiple services together.
Problem: the code flow is Dev → Staging → Production, and every environment needs its own approvals or tests.
```yaml
stages:
  - build
  - test
  - deploy:dev
  - verify:dev
  - deploy:staging
  - verify:staging
  - deploy:production

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  script:
    - npm test

# Dev (auto)
deploy:dev:
  stage: deploy:dev
  script:
    - kubectl set image deployment/app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA -n dev
  environment:
    name: dev
    url: https://dev.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

verify:dev:
  stage: verify:dev
  script:
    - curl https://dev.example.com/health
    - run_smoke_tests.sh dev
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

# Staging (auto after dev success)
deploy:staging:
  stage: deploy:staging
  needs: [verify:dev]
  script:
    - kubectl set image deployment/app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA -n staging
  environment:
    name: staging
    url: https://staging.example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

verify:staging:
  stage: verify:staging
  script:
    - curl https://staging.example.com/health
    - run_smoke_tests.sh staging
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

# Production (manual approval)
deploy:production:
  stage: deploy:production
  needs: [verify:staging]
  script:
    - kubectl set image deployment/app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA -n production
  environment:
    name: production
    url: https://example.com
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual  # Requires approval
```

Flow:

```
Build → Test → Deploy Dev → Verify Dev → Deploy Staging → Verify Staging
                                                               ↓
                                                      [Manual Approval]
                                                               ↓
                                                      Deploy Production
```
```yaml
deploy:production:
  script:
    # Determine the currently live color (blue/green)
    - CURRENT=$(kubectl get service app -n production -o jsonpath='{.spec.selector.version}')
    - NEW=$([ "$CURRENT" == "blue" ] && echo "green" || echo "blue")
    # Deploy the new version alongside the old one
    - kubectl apply -f k8s/production/deployment-${NEW}.yaml
    - kubectl wait --for=condition=ready pod -l version=${NEW} -n production
    # Smoke test the idle color before it takes traffic
    - run_smoke_tests.sh production-${NEW}
    # Switch traffic
    - kubectl patch service app -n production -p '{"spec":{"selector":{"version":"'${NEW}'"}}}'
  environment:
    name: production
    url: https://example.com
  when: manual
```
```yaml
# Organization template
.security-mandatory:
  stage: security
  allow_failure: false  # Must pass
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

sast:
  extends: .security-mandatory
  script:
    - security-scanner sast .
  artifacts:
    reports:
      sast: sast-report.json

dependency-scan:
  extends: .security-mandatory
  script:
    - security-scanner dependencies .
  artifacts:
    reports:
      dependency_scanning: dependency-report.json

container-scan:
  extends: .security-mandatory
  script:
    - trivy image $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  artifacts:
    reports:
      container_scanning: container-report.json
```

Audit trail for production deployments:

```yaml
deploy:production:
  before_script:
    # Log to audit system
    - |
      curl -X POST https://audit.company.com/api/events \
        -H "Authorization: Bearer $AUDIT_TOKEN" \
        -d "{
          \"event\": \"production_deployment\",
          \"user\": \"$GITLAB_USER_LOGIN\",
          \"commit\": \"$CI_COMMIT_SHA\",
          \"pipeline\": \"$CI_PIPELINE_URL\",
          \"timestamp\": \"$(date -Iseconds)\"
        }"
  script:
    - ./deploy.sh production
  environment:
    name: production
```

Compliance checks:

```yaml
compliance:
  stage: compliance
  script:
    # Check all required approvals
    - python scripts/check_approvals.py
    # Check there are no direct commits to main
    - python scripts/check_no_direct_commits.py
    # Check that security scans passed
    - python scripts/check_security_reports.py
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  allow_failure: false
```

Problem: the GitLab.com free tier includes 400 CI/CD minutes per month; the team uses 2000.
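As a sketch of what such a check could look like: check_security_reports.py is named in the compliance job above, but its contents are not shown in the chapter, so the following is an assumption. It fails the job when the SAST report contains Critical findings, reading the `vulnerabilities`/`severity` fields of the GitLab SAST report format:

```python
import json

def critical_findings(report: dict) -> list[str]:
    """Return the names of Critical-severity vulnerabilities from a
    SAST report's top-level `vulnerabilities` array."""
    return [
        v.get("name", "<unnamed>")
        for v in report.get("vulnerabilities", [])
        if v.get("severity") == "Critical"
    ]

def main(path: str = "sast-report.json") -> int:
    """Exit status for the CI job: 1 if any Critical finding exists."""
    with open(path) as f:
        findings = critical_findings(json.load(f))
    if findings:
        print("Critical vulnerabilities found:", ", ".join(findings))
        return 1
    print("SAST report clean")
    return 0

# In the CI job: raise SystemExit(main())
# A non-zero exit fails the compliance job (allow_failure: false).
```

The same pattern applies to the other two scripts: query the GitLab API or a report artifact, print what failed, and exit non-zero.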
Strategy 1: specific runners for heavy jobs

```yaml
build:heavy:
  tags:
    - self-hosted  # Own runner, no minute limit
  script:
    - ./heavy_build.sh

test:light:
  # No tags → uses shared runners (counts against the quota)
  script:
    - npm test
```

Strategy 2: conditional pipelines

```yaml
expensive:job:
  script:
    - run_expensive_tests.sh
  rules:
    - if: $CI_COMMIT_BRANCH == "main"         # Only on main
    - if: $CI_PIPELINE_SOURCE == "schedule"   # Or nightly
  # Not on every feature branch
```

Strategy 3: interruptible for branches

```yaml
test:
  interruptible: true  # Cancel stale pipelines
  script:
    - npm test
  rules:
    - if: $CI_COMMIT_BRANCH != "main"  # Branches only
```

Summary:

- Monorepo: change detection + dynamic generation → 200 jobs → 2 jobs
- Templates: 3-tier hierarchy (Org → Team → Project) → zero-touch updates
- Performance: parallel + DAG + cache + interruptible → 45 min → 8 min
- Cross-project: trigger + multi-project pipelines → service orchestration
- Multi-environment: Dev → Staging → Prod with gates → controlled rollout
- Security: mandatory gates + audit trail + compliance checks → regulations covered
- Cost: self-hosted runners + conditional pipelines + interruptible → optimized minutes
These patterns combine features from previous chapters into production-ready solutions for real-world complexity.

Enterprise CI/CD is not "more jobs" – it is smart orchestration: the right tool for the right job.