
Cloud & Platform Engineering

Infrastructure as code. Platform as a product.

We build cloud infrastructure and internal platforms that give developers superpowers — and operations peace of mind.

Cloud migration

Assessment, risk mapping, dependency analysis. Zero-downtime migration with hybrid bridge — not 'we'll move it over the weekend and hope'. Iterative approach with rollback plan for every step.

Lift & shift is a trap. Moving on-prem VMs to the cloud without redesign means triple the costs with the same problems. We migrate strategically — assessment, workload prioritization, hybrid bridge, gradual switching.

5R Assessment Framework: For each workload we decide: Rehost (lift & shift — only for legacy with short lifespan), Replatform (containerization, managed services), Refactor (redesign for cloud-native), Replace (SaaS alternative), Retire (shut down). Most workloads are a mix of replatform + refactor.
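The 5R decision can be sketched as a simple rule chain. This is an illustration only — the attribute names and thresholds below are hypothetical, not our actual assessment tooling:

```python
# Illustrative 5R classifier. Attributes and thresholds are made up
# for the example; a real assessment weighs many more signals.
def classify_5r(workload: dict) -> str:
    """Map workload attributes to a 5R migration strategy."""
    if workload.get("planned_retirement"):
        return "Retire"                  # shut it down instead of moving it
    if workload.get("saas_alternative"):
        return "Replace"                 # buy, don't migrate
    if workload.get("remaining_lifespan_years", 99) < 2:
        return "Rehost"                  # lift & shift only for short-lived legacy
    if workload.get("containerizable") and not workload.get("needs_redesign"):
        return "Replatform"              # containers / managed services
    return "Refactor"                    # redesign for cloud-native
```

In practice most workloads land in the last two branches, which is why the mix is usually replatform + refactor.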

Hybrid bridge: Old and new worlds run in parallel. VPN/ExpressRoute between on-prem and cloud. Gradual service switching with traffic splitting and automatic rollback. No big bang cutover.

Dependency mapping: Before migrating anything, you need to know what depends on what. Automatic discovery (Azure Migrate, AWS Migration Hub) + manual validation. Output: dependency graph with risk scoring for each workload.
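A toy version of the risk scoring over a dependency graph — the idea being that a workload many others depend on carries more migration risk. The scoring formula here is a hypothetical simplification:

```python
# Toy risk scoring: score = number of dependents + number of own dependencies.
# Real scoring would also weigh criticality, data gravity, compliance, etc.
def risk_scores(deps: dict[str, list[str]]) -> dict[str, int]:
    """deps maps each workload to the workloads it depends on."""
    dependents: dict[str, int] = {w: 0 for w in deps}
    for w, targets in deps.items():
        for t in targets:
            dependents[t] = dependents.get(t, 0) + 1
    return {w: dependents.get(w, 0) + len(deps.get(w, []))
            for w in set(deps) | set(dependents)}
```

Sorting by score then gives a first cut of the migration wave ordering: low-risk leaves first, shared databases last.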

Typical timeline: Assessment (2-4 weeks) → Pilot (4-6 weeks, 2-3 services) → Wave migration (2-4 services/month) → Consolidation (4-6 weeks). Overall 3-12 months depending on size.


Infrastructure as Code

Terraform, Pulumi, GitOps. Infrastructure versioned, tested, reproducible. Never again 'who changed those firewall rules' — everything is in git with code review.

Manual infrastructure is technical debt. A server configured via SSH console is a snowflake — nobody knows exactly how to reproduce it, documentation is outdated, disaster recovery is a guessing game. IaC eliminates this.

Terraform vs. Pulumi: Terraform (HCL) is the industry standard — huge ecosystem of providers, mature tooling, large community. Pulumi allows writing infrastructure in TypeScript/Python — better for teams that don’t want another language. We choose based on team context.

GitOps workflow: All infrastructure changes go through Pull Request. Code review, automated tests (terraform validate, tflint, checkov for security), plan preview in PR comments. Merge = apply. Audit trail in git history.

Modularization: Terraform modules for standard patterns — VPC/VNet, Kubernetes cluster, database, monitoring stack. Internal module registry. New team gets production-ready infrastructure in hours, not weeks.

State management: Remote state in encrypted storage (S3 + DynamoDB lock, Azure Blob + lock). State locking for team collaboration. Drift detection — automatic detection of manual changes.


Kubernetes & containers

AKS, EKS, GKE — managed Kubernetes with Helm charts, ArgoCD for GitOps and progressive delivery. From dev environments to production with consistent configuration.

Kubernetes isn’t for everyone — but when you need it, we do it right. K8s makes sense with 5+ microservices, multi-cloud needs, or specific operational requirements (auto-scaling, service mesh, progressive delivery).

Managed Kubernetes: AKS (Azure), EKS (AWS), GKE (Google). We don’t build custom control planes — managed service eliminates 80% of operational overhead. Focus on workloads, not etcd backup.

GitOps with ArgoCD: Declarative deployment. Desired state in git, ArgoCD synchronizes cluster. Drift detection — if someone changes something manually, ArgoCD fixes it. Self-healing cluster.

Helm + Kustomize: Helm charts for standard components (nginx, cert-manager, monitoring). Kustomize for environment-specific overlays (dev/staging/prod). Templating without template hell.

Progressive delivery: Argo Rollouts for canary and blue-green releases. Automated analysis (Prometheus metrics) decides on rollout/rollback. Istio/Linkerd service mesh for traffic splitting and mTLS.
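The automated analysis step in a canary rollout boils down to a decision like this. A sketch only — thresholds are illustrative, and in a real setup Argo Rollouts evaluates them against Prometheus queries:

```python
# Canary decision sketch: roll back on an absolute SLO breach or a large
# relative regression against the stable baseline. Thresholds are examples.
def canary_decision(baseline_error_rate: float,
                    canary_error_rate: float,
                    max_error_rate: float = 0.01,
                    max_regression: float = 2.0) -> str:
    if canary_error_rate > max_error_rate:
        return "rollback"    # absolute threshold breached
    if baseline_error_rate > 0 and canary_error_rate / baseline_error_rate > max_regression:
        return "rollback"    # canary is much worse than stable
    return "promote"
```

The point of automating this is that no human has to watch dashboards during a release: the metrics decide.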


CI/CD Pipeline

GitHub Actions, GitLab CI, Azure DevOps. From commit to production in minutes with automated quality gates, security scans and progressive delivery.

CI/CD isn’t just build and deploy. It’s the entire delivery pipeline — from commit through tests, security scans, quality gates, staging validation to progressive rollout to production. Every step automated, measurable, auditable.

Pipeline architecture: Build → Unit tests → SAST (security) → Container build → Integration tests → Deploy to staging → E2E tests → Deploy to prod (canary) → Automated analysis → Full rollout. Entire flow < 15 minutes for typical service.

Quality gates: Automated checks that stop deployment on failure. Test coverage < threshold? Stop. Security vulnerability (critical/high)? Stop. Performance regression > 10%? Stop. No manual approval for standard changes.
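A minimal sketch of such a gate check — the thresholds mirror the examples above, and the function shape is ours, not any particular CI product's API:

```python
# Quality gate sketch: collect all failures so the pipeline can report
# everything at once instead of stopping at the first problem.
def quality_gate(coverage: float,
                 vulns: dict[str, int],
                 perf_regression: float) -> tuple[bool, list[str]]:
    failures: list[str] = []
    if coverage < 0.80:
        failures.append(f"coverage {coverage:.0%} below 80%")
    if vulns.get("critical", 0) or vulns.get("high", 0):
        failures.append("critical/high vulnerabilities present")
    if perf_regression > 0.10:
        failures.append(f"performance regression {perf_regression:.0%} > 10%")
    return (not failures, failures)
```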

Monorepo vs. Polyrepo: We support both. For monorepo: affected detection (only changed services build/deploy). For polyrepo: standardized pipeline templates shared across all repos.

Metrics: DORA metrics as feedback loop. Deployment frequency, lead time, change failure rate, MTTR. Dashboard for engineering leadership. Trends, not snapshots.
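Two of the DORA metrics can be computed directly from deployment records. A simplified schema for illustration:

```python
from datetime import datetime, timedelta

# Simplified DORA computation over deployment records.
# Each record: {'committed': datetime, 'deployed': datetime, 'failed': bool}
def dora(deploys: list[dict]) -> dict:
    lead_times = sorted(d["deployed"] - d["committed"] for d in deploys)
    failures = sum(d["failed"] for d in deploys)
    return {
        "deployments": len(deploys),
        "median_lead_time": lead_times[len(lead_times) // 2],
        "change_failure_rate": failures / len(deploys),
    }
```

Fed from the CI/CD system's event stream, this is enough for the trend dashboard — the trends matter more than any single number.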


Observability & SRE

Grafana, Prometheus, Loki, Jaeger, OpenTelemetry. SLO/SLI, error budgets, runbooks. You know WHY there's a problem, not just THAT there's a problem — and you have a process to solve it.

Monitoring tells you THAT. Observability tells you WHY. Three pillars: metrics (Prometheus), logs (Loki), traces (Jaeger/Tempo). OpenTelemetry as standard instrumentation — vendor-agnostic, instrument once, export anywhere.

SLO/SLI Framework: We define Service Level Objectives for every critical service. SLIs (metrics) measure reality; the SLO (target) defines acceptable quality. Error budget = how much unreliability you can afford. When the error budget runs out, feature work stops and reliability gets fixed.
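The error budget arithmetic itself is simple. For a 99.9% availability SLO over a 30-day window:

```python
# Error budget math: a 99.9% monthly SLO allows 0.1% downtime,
# i.e. 0.001 * 30 * 24 * 60 = 43.2 minutes per 30-day window.
def error_budget_minutes(slo: float, days: int = 30) -> float:
    """Allowed downtime per window for a given availability SLO."""
    return (1 - slo) * days * 24 * 60

def budget_remaining(slo: float, downtime_min: float, days: int = 30) -> float:
    """Fraction of the error budget still unspent (floored at zero)."""
    return max(0.0, 1 - downtime_min / error_budget_minutes(slo, days))
```

When `budget_remaining` approaches zero, that is the signal to pause feature work.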

Alerting philosophy: We alert on symptoms, not causes. “API error rate > 1%” is a good alert. “CPU > 80%” is a bad alert (CPU can be 95% and everything works). Page only for actionable alerts — if on-call can’t do anything, it’s not a page.

SRE processes: On-call rotation, incident management (severity classification, communication protocol, escalation), blameless post-mortems. Runbooks for top 10 incidents. Toil tracking and elimination.

Dashboards: Executive dashboard (SLO compliance, availability, cost), engineering dashboard (latency, error rate, throughput per service), on-call dashboard (active incidents, recent deployments, anomaly detection).


FinOps

Cloud cost optimization as a continuous process. You know how much you pay per unit of work, not for idle resources. Typically 30-50% savings compared to unoptimized state.

Cloud bill isn’t a weather report — it’s a controllable process. Most companies pay 30-50% more for cloud than they need to. Unused reserved instances, oversized VMs, forgotten resources, unoptimized storage tiering.

Cost visibility: Tagging strategy (team, environment, project, cost center). Cost allocation per team/project/service. Showback/chargeback model — teams see how much their services cost. When you see the price, you behave differently.

Optimization techniques: Reserved Instances/Savings Plans (commitment = 30-60% discount), right-sizing (most VMs are oversized 2-4×), spot/preemptible instances for non-critical workloads, autoscaling (scale to zero in dev/staging), storage tiering (hot → cool → archive).

Continuous optimization: Monthly cost review with recommendations. Automatic alerting on cost anomalies (unexpected spike). Waste detection (unused disks, unattached IPs, idle load balancers). FinOps dashboard with trends and forecasting.
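Cost anomaly alerting can start as simply as a trailing-statistics check. This is a deliberately naive sketch — real FinOps tooling uses more robust detection — and the `k` and floor values are illustrative:

```python
from statistics import mean, stdev

# Naive cost-anomaly check: flag a day whose spend exceeds the trailing
# mean by more than k standard deviations. The 5% floor on sigma avoids
# false alarms when historical spend is almost perfectly flat.
def is_cost_anomaly(history: list[float], today: float, k: float = 3.0) -> bool:
    mu, sigma = mean(history), stdev(history)
    return today > mu + k * max(sigma, 0.05 * mu)
```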

Unit economics: Cost per transaction, cost per user, cost per API call. When you know unit cost, you can optimize meaningfully. “We cost 0.003 CZK per API call” is actionable. “Azure cost 500K last month” isn’t.
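Showback plus unit economics in miniature — the figures and function names below are made up for illustration:

```python
# Showback sketch: allocate a shared monthly bill to teams in proportion
# to their usage, then derive cost per unit of work.
def showback(total_cost: float, usage_by_team: dict[str, int]) -> dict[str, float]:
    total = sum(usage_by_team.values())
    return {team: total_cost * u / total for team, u in usage_by_team.items()}

def cost_per_unit(total_cost: float, units: int) -> float:
    """E.g. monthly spend divided by monthly API calls."""
    return total_cost / units
```

Once each team sees its allocated share and its unit cost, optimization conversations become concrete instead of abstract.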

Platform Engineering

Building an internal platform that provides developers with standard service templates, unified logging, metrics, tracing, self-service environments and guardrails for security and costs.

Real-world example: A company with 8 teams, each deploying differently. After introducing the platform: one self-service portal, standard CI/CD, a 10-minute deploy, zero-touch observability.
  • Self-service for developers (deploy without ops ticket)
  • Golden paths — standard service templates
  • Guardrails for security and cost
  • DORA metrics as feedback loop
  • 99.95% platform availability
  • <15 min deployment pipeline
  • 40% cloud cost savings
  • <5 min MTTR

How we do it

  1. Cloud Assessment — We evaluate current infrastructure, applications and readiness for cloud migration.
  2. Migration plan — We design the target architecture, roadmap and transition strategy with minimal risk.
  3. Pilot migration — We move the first workloads and verify performance, security and operational processes.
  4. Full migration & automation — Complete migration of the remaining systems with IaC, CI/CD and auto-scaling.
  5. Optimization & FinOps — Continuous optimization of costs, performance and governance over the cloud environment.

When you need platform engineering

Typical situations

  1. “We want to go to cloud” without a strategy — lift & shift delivers triple the costs with the same problems.
  2. Releases hurt — manual deploys, fear of Friday releases, rollbacks via SSH.
  3. Snowflake servers — servers configured by hand that nobody knows how to reproduce.
  4. Cloud costs without control — surprise bills at the end of the month, no visibility.
  5. Every team deploys differently — 8 teams, 8 pipeline variants, no standard.

Internal Developer Platform

Platform engineering isn’t just infrastructure — it’s a product for your developers. Self-service portal where a team creates new environment in minutes, deploys service, sets up monitoring — without operations ticket.

What the platform provides

Capability | Without platform | With platform
New environment | Ticket, 2 weeks | Self-service, 10 minutes
Deployment | Manual, scary | CI/CD, automatic
Monitoring | Each team different | Standard, zero-touch
Security | Audit at the end | Guardrails from the start
Cost visibility | Monthly invoice | Real-time per team

Golden Paths

Standard templates for typical workloads:

  • Web API — Container, Kubernetes deployment, ingress, TLS, monitoring, CI/CD
  • Event consumer — Kafka consumer, dead letter queue, retry logic, monitoring
  • Scheduled job — CronJob/Azure Function, monitoring, alerting
  • Static site — CDN, TLS, CI/CD from git

Team selects golden path, fills parameters, platform creates everything needed. Guardrails built-in — security best practices, cost limits, naming conventions.

Migration process

From on-prem to cloud without downtime — 5 steps:

  1. Assessment & Planning — 5R analysis (Rehost, Replatform, Refactor, Replace, Retire). Dependency mapping. Risk scoring. Migration roadmap with prioritization by business value.
  2. Foundation — Landing zone setup. Networking (VPN/ExpressRoute), IAM, security baseline, monitoring. Terraform modules for standard patterns.
  3. Pilot Migration — 2-3 workloads with different risk profiles. Process validation, tooling, rollback. Lessons learned for next waves.
  4. Wave Migration — Systematic migration in waves (2-4 workloads/month). Hybrid bridge, traffic shifting, automated validation.
  5. Optimization & Decommission — FinOps optimization, decommission on-prem, SRE processes, knowledge transfer.

DORA metrics

We measure what really matters:

  • Deployment frequency — How many times per day you deploy. Elite: multiple per day.
  • Lead time for changes — From commit to production. Elite: < 1 hour.
  • Change failure rate — Share of deployments that cause a production failure. Elite: < 5%; guardrails keep it there.
  • MTTR — From hours to minutes thanks to observability. Elite: < 1 hour.

Dashboard with trends, not snapshots. DORA metrics retrospective every 2 weeks.

Stack

Category | Technologies
Cloud | Azure, AWS, GCP
IaC | Terraform, Pulumi, Crossplane
Container | Docker, Kubernetes (AKS/EKS/GKE), Helm
GitOps | ArgoCD, Flux
CI/CD | GitHub Actions, GitLab CI, Azure DevOps
Observability | Grafana, Prometheus, Loki, Jaeger, OpenTelemetry
Service Mesh | Istio, Linkerd
Security | Vault, cert-manager, Falco, Trivy
FinOps | Kubecost, AWS Cost Explorer, Azure Cost Management

Frequently asked questions

Azure, AWS, or GCP?
Depends on context. Azure is strong in the enterprise and Microsoft ecosystem. AWS has the broadest offering. GCP excels in data and ML. We help you choose and minimize vendor lock-in.

How long does a migration take?
Simple migration: 4–8 weeks. Complex enterprise with compliance: 6–12 months. We migrate iteratively — the first service runs in the cloud within weeks.

Do we need Kubernetes?
Not always. For simple applications, App Service or Lambda suffices. Kubernetes makes sense with 5+ microservices, multi-cloud needs or specific operational requirements.

How much can we save on cloud costs?
Typically 30–50% compared to an unoptimized state. Reserved instances, right-sizing, spot instances, automatic scaling. FinOps as a continuous process.

How do we avoid vendor lock-in?
Infrastructure as Code (Terraform) for portability, containerization (Docker/K8s) for runtime agnosticism, abstraction over managed services. 100% vendor neutrality is an illusion — but 80% portability is achievable and worth it.

Can we run hybrid (on-prem + cloud)?
Azure Arc, AWS Outposts, or Anthos for consistent management. VPN/ExpressRoute for connectivity. Unified monitoring and deployment pipeline across both environments.

Have a project?

Let's talk about it.

Schedule a meeting