
Cloud & Platform Engineering

Infrastructure as code. Platform as a product.

We build cloud infrastructure and internal platforms that give developers superpowers — and operations peace of mind.

Cloud migration

Assessment, risk mapping, dependency analysis. Zero-downtime migration with hybrid bridge — not 'we'll move it over the weekend and hope'. Iterative approach with rollback plan for every step.

Lift & shift is a trap. Moving on-prem VMs to the cloud without redesign means triple the costs with the same problems. We migrate strategically — assessment, workload prioritization, hybrid bridge, gradual switching.

5R Assessment Framework: For each workload we decide: Rehost (lift & shift — only for legacy with short lifespan), Replatform (containerization, managed services), Refactor (redesign for cloud-native), Replace (SaaS alternative), Retire (shut down). Most workloads are a mix of replatform + refactor.
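The 5R decision can be sketched as a simple rule chain. This is an illustration only — the attribute names and thresholds below are hypothetical, not our actual assessment tooling:

```python
# Illustrative 5R classifier. Attributes and thresholds are made up
# for the example; a real assessment weighs many more signals.
def classify_5r(workload: dict) -> str:
    """Map workload attributes to a 5R migration strategy."""
    if workload.get("planned_retirement"):
        return "Retire"                  # shut it down instead of moving it
    if workload.get("saas_alternative"):
        return "Replace"                 # buy, don't migrate
    if workload.get("remaining_lifespan_years", 99) < 2:
        return "Rehost"                  # lift & shift only for short-lived legacy
    if workload.get("containerizable") and not workload.get("needs_redesign"):
        return "Replatform"              # containers / managed services
    return "Refactor"                    # redesign for cloud-native
```

In practice most workloads land in the last two branches, which is why the mix is usually replatform + refactor.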

Hybrid bridge: Old and new worlds run in parallel. VPN/ExpressRoute between on-prem and cloud. Gradual service switching with traffic splitting and automatic rollback. No big bang cutover.

Dependency mapping: Before migrating anything, you need to know what depends on what. Automatic discovery (Azure Migrate, AWS Migration Hub) + manual validation. Output: dependency graph with risk scoring for each workload.
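A toy version of the risk scoring over a dependency graph — the idea being that a workload many others depend on carries more migration risk. The scoring formula here is a hypothetical simplification:

```python
# Toy risk scoring: score = number of dependents + number of own dependencies.
# Real scoring would also weigh criticality, data gravity, compliance, etc.
def risk_scores(deps: dict[str, list[str]]) -> dict[str, int]:
    """deps maps each workload to the workloads it depends on."""
    dependents: dict[str, int] = {w: 0 for w in deps}
    for w, targets in deps.items():
        for t in targets:
            dependents[t] = dependents.get(t, 0) + 1
    return {w: dependents.get(w, 0) + len(deps.get(w, []))
            for w in set(deps) | set(dependents)}
```

Sorting by score then gives a first cut of the migration wave ordering: low-risk leaves first, shared databases last.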

Typical timeline: Assessment (2-4 weeks) → Pilot (4-6 weeks, 2-3 services) → Wave migration (2-4 services/month) → Consolidation (4-6 weeks). Overall 3-12 months depending on size.


Infrastructure as Code

Terraform, Pulumi, GitOps. Infrastructure versioned, tested, reproducible. Never again 'who changed those firewall rules' — everything is in git with code review.

Manual infrastructure is technical debt. A server configured via SSH console is a snowflake — nobody knows exactly how to reproduce it, documentation is outdated, disaster recovery is a guessing game. IaC eliminates this.

Terraform vs. Pulumi: Terraform (HCL) is the industry standard — huge ecosystem of providers, mature tooling, large community. Pulumi allows writing infrastructure in TypeScript/Python — better for teams that don’t want another language. We choose based on team context.

GitOps workflow: All infrastructure changes go through Pull Request. Code review, automated tests (terraform validate, tflint, checkov for security), plan preview in PR comments. Merge = apply. Audit trail in git history.

Modularization: Terraform modules for standard patterns — VPC/VNet, Kubernetes cluster, database, monitoring stack. Internal module registry. New team gets production-ready infrastructure in hours, not weeks.

State management: Remote state in encrypted storage (S3 + DynamoDB lock, Azure Blob + lock). State locking for team collaboration. Drift detection — automatic detection of manual changes.


Kubernetes & containers

AKS, EKS, GKE — managed Kubernetes with Helm charts, ArgoCD for GitOps and progressive delivery. From dev environments to production with consistent configuration.

Kubernetes isn’t for everyone — but when you need it, we do it right. K8s makes sense with 5+ microservices, multi-cloud needs, or specific operational requirements (auto-scaling, service mesh, progressive delivery).

Managed Kubernetes: AKS (Azure), EKS (AWS), GKE (Google). We don’t build custom control planes — managed service eliminates 80% of operational overhead. Focus on workloads, not etcd backup.

GitOps with ArgoCD: Declarative deployment. Desired state in git, ArgoCD synchronizes cluster. Drift detection — if someone changes something manually, ArgoCD fixes it. Self-healing cluster.

Helm + Kustomize: Helm charts for standard components (nginx, cert-manager, monitoring). Kustomize for environment-specific overlays (dev/staging/prod). Templating without template hell.

Progressive delivery: Argo Rollouts for canary and blue-green releases. Automated analysis (Prometheus metrics) decides on rollout/rollback. Istio/Linkerd service mesh for traffic splitting and mTLS.
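The automated analysis step in a canary rollout boils down to a decision like this. A sketch only — thresholds are illustrative, and in a real setup Argo Rollouts evaluates them against Prometheus queries:

```python
# Canary decision sketch: roll back on an absolute SLO breach or a large
# relative regression against the stable baseline. Thresholds are examples.
def canary_decision(baseline_error_rate: float,
                    canary_error_rate: float,
                    max_error_rate: float = 0.01,
                    max_regression: float = 2.0) -> str:
    if canary_error_rate > max_error_rate:
        return "rollback"    # absolute threshold breached
    if baseline_error_rate > 0 and canary_error_rate / baseline_error_rate > max_regression:
        return "rollback"    # canary is much worse than stable
    return "promote"
```

The point of automating this is that no human has to watch dashboards during a release: the metrics decide.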


CI/CD Pipeline

GitHub Actions, GitLab CI, Azure DevOps. From commit to production in minutes with automated quality gates, security scans and progressive delivery.

CI/CD isn’t just build and deploy. It’s the entire delivery pipeline — from commit through tests, security scans, quality gates, staging validation to progressive rollout to production. Every step automated, measurable, auditable.

Pipeline architecture: Build → Unit tests → SAST (security) → Container build → Integration tests → Deploy to staging → E2E tests → Deploy to prod (canary) → Automated analysis → Full rollout. Entire flow < 15 minutes for typical service.

Quality gates: Automated checks that stop deployment on failure. Test coverage < threshold? Stop. Security vulnerability (critical/high)? Stop. Performance regression > 10%? Stop. No manual approval for standard changes.
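A minimal sketch of such a gate check — the thresholds mirror the examples above, and the function shape is ours, not any particular CI product's API:

```python
# Quality gate sketch: collect all failures so the pipeline can report
# everything at once instead of stopping at the first problem.
def quality_gate(coverage: float,
                 vulns: dict[str, int],
                 perf_regression: float) -> tuple[bool, list[str]]:
    failures: list[str] = []
    if coverage < 0.80:
        failures.append(f"coverage {coverage:.0%} below 80%")
    if vulns.get("critical", 0) or vulns.get("high", 0):
        failures.append("critical/high vulnerabilities present")
    if perf_regression > 0.10:
        failures.append(f"performance regression {perf_regression:.0%} > 10%")
    return (not failures, failures)
```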

Monorepo vs. Polyrepo: We support both. For monorepo: affected detection (only changed services build/deploy). For polyrepo: standardized pipeline templates shared across all repos.

Metrics: DORA metrics as feedback loop. Deployment frequency, lead time, change failure rate, MTTR. Dashboard for engineering leadership. Trends, not snapshots.
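Two of the DORA metrics can be computed directly from deployment records. A simplified schema for illustration:

```python
from datetime import datetime, timedelta

# Simplified DORA computation over deployment records.
# Each record: {'committed': datetime, 'deployed': datetime, 'failed': bool}
def dora(deploys: list[dict]) -> dict:
    lead_times = sorted(d["deployed"] - d["committed"] for d in deploys)
    failures = sum(d["failed"] for d in deploys)
    return {
        "deployments": len(deploys),
        "median_lead_time": lead_times[len(lead_times) // 2],
        "change_failure_rate": failures / len(deploys),
    }
```

Fed from the CI/CD system's event stream, this is enough for the trend dashboard — the trends matter more than any single number.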


Observability & SRE

Grafana, Prometheus, Loki, Jaeger, OpenTelemetry. SLO/SLI, error budgets, runbooks. You know WHY there's a problem, not just THAT there's a problem — and you have a process to solve it.

Monitoring tells you THAT. Observability tells you WHY. Three pillars: metrics (Prometheus), logs (Loki), traces (Jaeger/Tempo). OpenTelemetry as standard instrumentation — vendor-agnostic, instrument once, export anywhere.

SLO/SLI Framework: We define Service Level Objectives for every critical service. SLIs (metrics) measure reality; the SLO (target) defines acceptable quality. Error budget = how much unreliability you can afford. When the error budget runs out, feature work stops and reliability gets fixed.
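The error budget arithmetic itself is simple. For a 99.9% availability SLO over a 30-day window:

```python
# Error budget math: a 99.9% monthly SLO allows 0.1% downtime,
# i.e. 0.001 * 30 * 24 * 60 = 43.2 minutes per 30-day window.
def error_budget_minutes(slo: float, days: int = 30) -> float:
    """Allowed downtime per window for a given availability SLO."""
    return (1 - slo) * days * 24 * 60

def budget_remaining(slo: float, downtime_min: float, days: int = 30) -> float:
    """Fraction of the error budget still unspent (floored at zero)."""
    return max(0.0, 1 - downtime_min / error_budget_minutes(slo, days))
```

When `budget_remaining` approaches zero, that is the signal to pause feature work.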

Alerting philosophy: We alert on symptoms, not causes. “API error rate > 1%” is a good alert. “CPU > 80%” is a bad alert (CPU can be 95% and everything works). Page only for actionable alerts — if on-call can’t do anything, it’s not a page.

SRE processes: On-call rotation, incident management (severity classification, communication protocol, escalation), blameless post-mortems. Runbooks for top 10 incidents. Toil tracking and elimination.

Dashboards: Executive dashboard (SLO compliance, availability, cost), engineering dashboard (latency, error rate, throughput per service), on-call dashboard (active incidents, recent deployments, anomaly detection).


FinOps

Cloud cost optimization as a continuous process. You know how much you pay per unit of work, not for idle resources. Typically 30-50% savings compared to unoptimized state.

Cloud bill isn’t a weather report — it’s a controllable process. Most companies pay 30-50% more for cloud than they need to. Unused reserved instances, oversized VMs, forgotten resources, unoptimized storage tiering.

Cost visibility: Tagging strategy (team, environment, project, cost center). Cost allocation per team/project/service. Showback/chargeback model — teams see how much their services cost. When you see the price, you behave differently.

Optimization techniques: Reserved Instances/Savings Plans (commitment = 30-60% discount), right-sizing (most VMs are oversized 2-4×), spot/preemptible instances for non-critical workloads, autoscaling (scale to zero in dev/staging), storage tiering (hot → cool → archive).

Continuous optimization: Monthly cost review with recommendations. Automatic alerting on cost anomalies (unexpected spike). Waste detection (unused disks, unattached IPs, idle load balancers). FinOps dashboard with trends and forecasting.
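Cost anomaly alerting can start as simply as a trailing-statistics check. This is a deliberately naive sketch — real FinOps tooling uses more robust detection — and the `k` and floor values are illustrative:

```python
from statistics import mean, stdev

# Naive cost-anomaly check: flag a day whose spend exceeds the trailing
# mean by more than k standard deviations. The 5% floor on sigma avoids
# false alarms when historical spend is almost perfectly flat.
def is_cost_anomaly(history: list[float], today: float, k: float = 3.0) -> bool:
    mu, sigma = mean(history), stdev(history)
    return today > mu + k * max(sigma, 0.05 * mu)
```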

Unit economics: Cost per transaction, cost per user, cost per API call. When you know unit cost, you can optimize meaningfully. “We cost 0.003 CZK per API call” is actionable. “Azure cost 500K last month” isn’t.
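Showback plus unit economics in miniature — the figures and function names below are made up for illustration:

```python
# Showback sketch: allocate a shared monthly bill to teams in proportion
# to their usage, then derive cost per unit of work.
def showback(total_cost: float, usage_by_team: dict[str, int]) -> dict[str, float]:
    total = sum(usage_by_team.values())
    return {team: total_cost * u / total for team, u in usage_by_team.items()}

def cost_per_unit(total_cost: float, units: int) -> float:
    """E.g. monthly spend divided by monthly API calls."""
    return total_cost / units
```

Once each team sees its allocated share and its unit cost, optimization conversations become concrete instead of abstract.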

Platform Engineering

Building an internal platform that provides developers with standard service templates, unified logging, metrics, tracing, self-service environments and guardrails for security and costs.

Real-world example: A company with 8 teams, each deploying differently. After introducing the platform: one self-service portal, standard CI/CD, a 10-minute deploy, zero-touch observability.
  • Self-service for developers (deploy without ops ticket)
  • Golden paths — standard service templates
  • Guardrails for security and cost
  • DORA metrics as feedback loop
  • 99.95% platform availability
  • <15 min deployment pipeline
  • 40% cloud cost savings
  • <5 min MTTR

How we do it

  1. Cloud Assessment — We evaluate current infrastructure, applications and readiness for cloud migration.
  2. Migration plan — We design the target architecture, roadmap and transition strategy with minimal risk.
  3. Pilot migration — We move the first workloads and verify performance, security and operational processes.
  4. Full migration & automation — Complete migration of the remaining systems with IaC, CI/CD and auto-scaling.
  5. Optimization & FinOps — Continuous optimization of costs, performance and governance over the cloud environment.

When you need platform engineering

Typical situations

  1. “We want to go to cloud” without a strategy — lift & shift delivers triple the costs with the same problems.
  2. Releases hurt — manual deploys, fear of Friday releases, rollbacks via SSH.
  3. Snowflake servers — servers configured by hand that nobody knows how to reproduce.
  4. Cloud costs without control — surprise bills at the end of the month, no visibility.
  5. Every team deploys differently — 8 teams, 8 pipeline variants, no standard.

Internal Developer Platform

Platform engineering isn’t just infrastructure — it’s a product for your developers. Self-service portal where a team creates new environment in minutes, deploys service, sets up monitoring — without operations ticket.

What the platform provides

Capability | Without platform | With platform
New environment | Ticket, 2 weeks | Self-service, 10 minutes
Deployment | Manual, scary | CI/CD, automatic
Monitoring | Each team different | Standard, zero-touch
Security | Audit at the end | Guardrails from the start
Cost visibility | Monthly invoice | Real-time per team

Golden Paths

Standard templates for typical workloads:

  • Web API — Container, Kubernetes deployment, ingress, TLS, monitoring, CI/CD
  • Event consumer — Kafka consumer, dead letter queue, retry logic, monitoring
  • Scheduled job — CronJob/Azure Function, monitoring, alerting
  • Static site — CDN, TLS, CI/CD from git

Team selects golden path, fills parameters, platform creates everything needed. Guardrails built-in — security best practices, cost limits, naming conventions.

Migration process

From on-prem to cloud without downtime — 5 steps:

  1. Assessment & Planning — 5R analysis (Rehost, Replatform, Refactor, Replace, Retire). Dependency mapping. Risk scoring. Migration roadmap with prioritization by business value.
  2. Foundation — Landing zone setup. Networking (VPN/ExpressRoute), IAM, security baseline, monitoring. Terraform modules for standard patterns.
  3. Pilot Migration — 2-3 workloads with different risk profiles. Process validation, tooling, rollback. Lessons learned for next waves.
  4. Wave Migration — Systematic migration in waves (2-4 workloads/month). Hybrid bridge, traffic shifting, automated validation.
  5. Optimization & Decommission — FinOps optimization, decommission on-prem, SRE processes, knowledge transfer.

DORA metrics

We measure what really matters:

  • Deployment frequency — How many times per day you deploy. Elite: multiple per day.
  • Lead time for changes — From commit to production. Elite: < 1 hour.
  • Change failure rate — Share of deployments that cause a production failure. Elite: < 5%; guardrails keep it there.
  • MTTR — From hours to minutes thanks to observability. Elite: < 1 hour.

Dashboard with trends, not snapshots. DORA metrics retrospective every 2 weeks.

Stack

Category | Technologies
Cloud | Azure, AWS, GCP
IaC | Terraform, Pulumi, Crossplane
Container | Docker, Kubernetes (AKS/EKS/GKE), Helm
GitOps | ArgoCD, Flux
CI/CD | GitHub Actions, GitLab CI, Azure DevOps
Observability | Grafana, Prometheus, Loki, Jaeger, OpenTelemetry
Service Mesh | Istio, Linkerd
Security | Vault, cert-manager, Falco, Trivy
FinOps | Kubecost, AWS Cost Explorer, Azure Cost Management

Frequently asked questions

Azure, AWS, or GCP?
Depends on context. Azure is strong in the enterprise and Microsoft ecosystem. AWS has the broadest offering. GCP excels in data and ML. We help you choose and minimize vendor lock-in.

How long does a migration take?
Simple migration: 4–8 weeks. Complex enterprise with compliance: 6–12 months. We migrate iteratively — the first service runs in the cloud within weeks.

Do we need Kubernetes?
Not always. For simple applications, App Service or Lambda suffices. Kubernetes makes sense with 5+ microservices, multi-cloud needs or specific operational requirements.

How much can we save on cloud costs?
Typically 30–50% compared to an unoptimized state. Reserved instances, right-sizing, spot instances, automatic scaling. FinOps as a continuous process.

How do we avoid vendor lock-in?
Infrastructure as Code (Terraform) for portability, containerization (Docker/K8s) for runtime agnosticism, abstraction over managed services. 100% vendor neutrality is an illusion — but 80% portability is achievable and worth it.

Can we run hybrid (on-prem + cloud)?
Azure Arc, AWS Outposts, or Anthos for consistent management. VPN/ExpressRoute for connectivity. Unified monitoring and deployment pipeline across both environments.

Have a project?

Let's talk about it.

Schedule a meeting