A library of articles on DevOps and SRE. Advertising: @ostinostin Content: @mxssl RKN: https://knd.gov.ru/license?id=67704b536aa9672b963777b3&registryType=bloggersPermission
How is block storage built in a public cloud?
We'll cover it in episode 7 of Building the Cloud, a reality project for engineers.
On the broadcast:
⏺ where and how a VM writes its data
⏺ a look under the hood of block storage in MWS Cloud Platform, the new cloud from MWS
⏺ what else block storage offers besides disks
⏺ how we plan to develop storage systems in the new cloud platform
⏺ answers to your questions
Branded merch giveaway for the best question in the chat.
Do you work with infrastructure or backend, or are you interested in clouds? Join the broadcast.
📆 July 16 at 14:00 (MSK)
⏩ Register
tigrisfs
We're proud to announce the immediate availability of tigrisfs, the native filesystem interface for Tigris. This lets you mount Tigris buckets to your laptops, desktops, and servers so you can use data in your buckets as if it was local. This bridges the gap between the cloud and your machine.
How do you move seamlessly from MS Exchange to a Russian corporate mail and calendar service hosted on your company's own servers?
One option is VK WorkSpace, a communication platform from the Russian developer VK Tech that can be deployed in on-premises infrastructure.
On July 17 at 12:00, the VK WorkSpace team will hold a free webinar on migrating from MS Exchange when the service is deployed in the customer's data center.
Experts will explain how to plan the move, avoid outages, and quickly synchronize with the directory service.
📍 Register here: link
On the agenda:
🔹 the main reasons to migrate mail and calendar
🔹 four steps to planning a successful migration
🔹 synchronization with directory services: ADLoader and the LDAP connector
🔹 how to disable authorization and change the configuration to speed up upload processing
🔹 migration from MS Exchange: how to create collectors, run a test migration, fix errors, and transfer the data
Registration is required to attend: link
terrafetch
Terrafetch is the Neofetch of Terraform—because your infrastructure deserves a little flair. It scans your Terraform repository and displays key statistics like the number of variables, resources, modules, outputs, and more—all in a stylish, terminal-friendly format. Perfect for CLI screenshots, repo intros, or just flexing your infra hygiene.
unregistry
Unregistry is a lightweight container image registry that stores and serves images directly from your Docker daemon's storage.
Load Testing with Impulse at Airbnb
Comprehensive Load Testing with Load Generator, Dependency Mocker, Traffic Collector, and More
kpatch
kpatch is a Linux dynamic kernel patching infrastructure which allows you to patch a running kernel without rebooting or restarting any processes. It enables sysadmins to apply critical security patches to the kernel immediately, without having to wait for long-running tasks to complete, for users to log off, or for scheduled reboot windows. It gives more control over uptime without sacrificing security or stability.
Demystifying Swap in Kubernetes: A Handbook for DevOps Engineers
https://medium.com/@robertbotez/demystifying-swap-in-kubernetes-a-handbook-for-devops-engineers-e5ef934593e3
Load Balancing gRPC traffic with Istio
https://dev.to/visepol/load-balancing-grpc-traffic-with-istio-1k49
GKE Cost Cutting — Three Key Lookout Points to view your Potential Savings
https://medium.com/google-cloud/gke-cost-cutting-three-key-lookout-points-to-view-your-potential-savings-10f271dc4fa9
Operational Considerations for Managing Stateful Workloads
When managing stateful workloads, whether in Kubernetes or traditional infrastructure, operational concerns like isolation, lifecycle management, security, disaster recovery, scalability, and observability take center stage. While the examples focus on AWS, PostgreSQL, and Kubernetes, the principles and best practices discussed here are broadly applicable to any environment. This article approaches these topics from an operations perspective, prioritizing reliability, maintainability, and resilience. The goal is not just to run a database, but to ensure it operates efficiently, scales properly, and remains secure in real-world conditions. We’ll explore key aspects of running stateful workloads, from managing failure domains to ensuring observability, and how these impact both operations teams and developers. Whether you’re running a database in a cloud-native setup or on bare metal, these strategies will help you build a robust, well-managed system.
Why Our Pods Were Breaking Bad (and How We Fixed Them)
In this article, we’ll walk through the process of diagnosing a memory leak, analyzing the root cause, and implementing effective solutions to mitigate its impact. We’ll explore practical steps that any application, regardless of the underlying stack or architecture, can follow to troubleshoot and optimize performance.
How We Migrated 30+ Kubernetes Clusters to Terraform
https://medium.com/learnings-from-the-paas/how-we-migrated-30-kubernetes-clusters-to-terraform-cd2b1cef8b84
lstr
A blazingly fast, minimalist directory tree viewer, written in Rust. Inspired by the command line program tree, with a powerful interactive mode.
Moving on from Nix
After using nix in my dotfiles for over 2 years, I’m now moving away from it.
Here’s why.
octelium
Octelium is a free and open source, self-hosted, unified platform for zero trust resource access that is primarily meant to be a modern alternative to remote access VPNs and similar tools.
How Ahrefs Saved US$400M in 3 Years by NOT Going to the Cloud
https://tech.ahrefs.com/how-ahrefs-saved-us-400m-in-3-years-by-not-going-to-the-cloud-8939dd930af8
How Kubernetes Runs Containers: A Practical Deep Dive
Taking a deep dive into how Kubernetes runs containers as Linux processes
Terraform: Working with the State File in Memory
https://medium.com/@pilitsyn/terraform-working-with-the-state-file-in-memory-930a262dd154
🤔 How do you keep a system of 1,500 microservices from falling apart under peak load? And what do you do during a DDoS attack of 1 million RPS?
The Yandex Market team has published a detailed breakdown of its reliability engineering. Inside is a candid account of how Graceful Degradation works in practice, why war rooms are needed, and how the team runs load tests directly in production.
✅ The philosophy of Graceful Degradation.
✅ Must-have architectural patterns.
✅ How work is divided during incidents.
✅ Load testing in production.
The article will be useful to anyone who builds and maintains high-load, distributed systems. It is a good chance to look under the hood of an e-commerce giant and compare its approaches with your own.
Advertisement. Advertiser: ООО «Яндекс.Такси» (Yandex.Taxi LLC). INN 7704340310
Understanding the Circuit Breaker: A Key Design Pattern for Resilient Systems
The Circuit Breaker Pattern is a key design pattern for building resilient systems by preventing cascading failures and ensuring graceful degradation.
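As a rough illustration of the idea (not taken from the linked article), here is a minimal sketch of a circuit breaker in Python. The names `CircuitBreaker`, `CircuitOpenError`, and `call_with_fallback`, as well as the threshold and timeout defaults, are assumptions made for this example. After a number of consecutive failures the breaker rejects calls outright, and once the timeout passes it lets one trial call through.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            # Fail fast instead of piling more load onto a failing dependency.
            raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call in half-open state, or too many consecutive
            # failures in closed state, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_fallback(breaker, func, fallback):
    """Degrade gracefully: return a fallback value when the call fails or is rejected."""
    try:
        return breaker.call(func)
    except Exception:
        return fallback
```

In this sketch a caller would wrap each remote call in `breaker.call(...)` and serve a cached or default value through `call_with_fallback` while the circuit is open. Production libraries usually add per-exception filtering, thread safety, and metrics, which are left out here.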
Argo Rollouts — Canary Deployment with Istio
https://medium.chuklee.com/argo-rollouts-canary-deployment-with-istio-b432bc141ba9
Why Every Platform Engineer Should Care About Kubernetes Operators
https://www.pulumi.com/blog/why-every-platform-engineer-should-care-about-kubernetes-operators
How Kubernetes HPA Decides Which Pod to Terminate When Scaling Down
https://medium.com/@AlexanderObregon/how-kubernetes-hpa-decides-which-pod-to-terminate-when-scaling-down-6675ebbdf56f
Can Configuration Languages (config DSLs) solve configuration complexity?
https://itnext.io/can-configuration-languages-dsls-solve-configuration-complexity-eee8f124e13a
FacetController: How we made infrastructure changes at Lyft simple
https://eng.lyft.com/facetcontroller-how-we-made-infrastructure-changes-at-lyft-simple-dab49f5b27c7
How We Integrated Native macOS Workloads with Kubernetes
https://medium.com/agoda-engineering/how-we-integrated-native-macos-workloads-with-kubernetes-b4d3c14881a0
canine
Canine is an easy-to-use, intuitive deployment platform for Kubernetes clusters.
Staying on Nix
I have been using Nix regularly since roughly 2019, when I set up my primary build server to use Nix to manage the various toolchains, though it wasn't until 2022 that I really invested heavily, and I'm now using Nix in combination with other more traditional DevOps tools to provision and manage more than 10 physical machines and 50 VMs in my homelab.
pgrwl
pgrwl is a PostgreSQL write-ahead log (WAL) receiver written in Go. It’s a drop-in, container-friendly alternative to pg_receivewal, supporting streaming replication, encryption, compression, and remote storage (S3, SFTP).
Designed for disaster recovery and PITR (Point-in-Time Recovery), pgrwl ensures zero data loss (RPO=0) and seamless integration with Kubernetes environments.